Help needed when processing my own dataset for testing. #122

@namln2k

Hi all, I finished training, validation, and testing with the preprocessed dataset a few days ago. However, my mentor has asked me to test this model on the DUC-2004 dataset. I'm not sure how to process that dataset for testing only, so I followed the guide in README.md and got stuck at this step:
Step 4. Format to Simpler Json Files
python preprocess.py -mode format_to_lines -raw_path RAW_PATH -save_path JSON_PATH -map_path MAP_PATH -lower
Below is part of the output of Step 3 and the command I used in Step 4:
[screenshot: Step 3 output and the Step 4 command]
As you can see, Step 4 printed no output. I suspect that's because of the /urls folder: I don't know how to process the files in it so that they match my dataset.
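To test that suspicion, here is a minimal diagnostic sketch. It rests on an assumption I have not confirmed: that format_to_lines assigns a file in RAW_PATH to a split only when the part of its file name before the first dot equals the SHA-1 hex digest of a line in mapping_train.txt, mapping_valid.txt, or mapping_test.txt under MAP_PATH. The helper names hashhex and split_coverage are my own for illustration, not part of preprocess.py.

```python
import hashlib
from pathlib import Path


def hashhex(text: str) -> str:
    """SHA-1 hex digest of a string (ASSUMED to be how file names are derived)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def split_coverage(raw_path: str, map_path: str) -> dict:
    """Count how many *.json files in raw_path each mapping file would claim.

    ASSUMPTION: mapping files are named mapping_<split>.txt and hold one
    entry per line; a file matches when its name up to the first dot
    equals the SHA-1 hex digest of a mapping line.
    """
    stems = {p.name.split(".")[0] for p in Path(raw_path).glob("*.json")}
    coverage = {}
    matched = set()
    for split in ("train", "valid", "test"):
        mapping = Path(map_path) / f"mapping_{split}.txt"
        if not mapping.exists():
            # A missing mapping file claims nothing.
            coverage[split] = 0
            continue
        lines = mapping.read_text(encoding="utf-8").splitlines()
        hashes = {hashhex(line.strip()) for line in lines if line.strip()}
        hits = stems & hashes
        coverage[split] = len(hits)
        matched |= hits
    # Files that no mapping file claims would be silently skipped.
    coverage["unmatched"] = len(stems - matched)
    return coverage
```

Calling split_coverage(RAW_PATH, MAP_PATH) with the same paths as in Step 4 returns a count per split. If every file lands in "unmatched", that would fit the empty output I am seeing, provided the assumption above holds.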
Moreover, my merged_story_tokenized files look like this:
[screenshot: contents of the merged_story_tokenized files]
Can someone please help me, or show me how to process the data just for testing?
