Amazon

Note that this series of datasets is divided into many sub-datasets according to Amazon product categories. Download the dataset files you need. Below we demonstrate the steps for the 'Amazon_Video_Games' dataset; sub-datasets of other categories can be processed in the same way.

1. Clone the repository and install the requirements. (If you have already done this, skip to step 2.)

git clone https://github.com/RUCAIBox/RecDatasets

cd RecDatasets/conversion_tools

pip install -r requirements.txt

2. Download the Amazon dataset and extract the dataset file. (If you have already done this, skip to step 3.)

You can download the ratings-only data and the metadata from http://jmcauley.ucsd.edu/data/amazon/.

Unfortunately, the pre-2014 metadata that we used can no longer be downloaded from the website. The steps below assume that you already have the original metadata. If you do not, you can download the processed data from Baidu Yun or Google Drive.

Here we download ratings_Video_Games.csv and meta_Video_Games.json.gz, save them in the current folder (Amazon_Video_Games/), and run:

gunzip meta_Video_Games.json.gz
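If `gunzip` is not available on your system, the same decompression can be sketched in Python with the standard library. The helper name `gunzip_keep_original` is hypothetical and is not part of the repository. Unlike `gunzip`, this sketch leaves the compressed file in place, and it assumes the input name ends in `.gz`.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_keep_original(src, dst=None):
    """Decompress a .gz file, leaving the compressed original in place.

    Hypothetical helper for illustration; assumes `src` ends in ".gz".
    """
    src = Path(src)
    # Default output name: drop the trailing ".gz" suffix.
    dst = Path(dst) if dst is not None else src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
        # Stream in chunks so a large metadata file need not fit in memory.
        shutil.copyfileobj(fin, fout)
    return dst
```

For example, `gunzip_keep_original("meta_Video_Games.json.gz")` would write `meta_Video_Games.json` next to the compressed file.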

3. Go to the conversion_tools/ directory and run the following command to generate the atomic files of the Amazon_Video_Games dataset.

python run.py --dataset amazon_video_games \
--input_path Amazon_Video_Games --output_path training_data/Amazon_Video_Games \
--convert_inter --convert_item

input_path is the path to the folder containing the decompressed Amazon_Video_Games files

output_path is the path where the converted atomic files are stored

convert_inter converts Amazon_Video_Games to a '*.inter' atomic file

convert_item converts Amazon_Video_Games to a '*.item' atomic file
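To give a rough idea of what an interaction atomic file looks like, here is a minimal sketch that writes a tab-separated '*.inter' file from a ratings CSV. It is not the code that run.py executes. The function name `ratings_csv_to_inter` is hypothetical, the `name:type` header fields are assumed, and the sketch assumes the ratings CSV has no header row and that each row holds user, item, rating, and timestamp in that order.

```python
import csv

# Assumed header for the interaction file: tab-separated "name:type" fields.
INTER_HEADER = ["user_id:token", "item_id:token", "rating:float", "timestamp:float"]


def ratings_csv_to_inter(csv_path, inter_path):
    """Write a tab-separated '.inter' file from a headerless ratings CSV.

    Assumes each CSV row is (user, item, rating, timestamp).
    Returns the number of interaction rows written.
    """
    count = 0
    with open(csv_path, newline="") as fin, open(inter_path, "w", newline="") as fout:
        reader = csv.reader(fin)
        writer = csv.writer(fout, delimiter="\t", lineterminator="\n")
        writer.writerow(INTER_HEADER)
        for row in reader:
            if len(row) != 4:
                continue  # skip malformed lines rather than guess at their meaning
            writer.writerow(row)
            count += 1
    return count
```

Calling `ratings_csv_to_inter("Amazon_Video_Games/ratings_Video_Games.csv", "Amazon_Video_Games.inter")` would produce a file whose first line is the header and whose remaining lines are the interactions. For real conversions, prefer the run.py command above, which also handles the item metadata.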
