This project is a Python scraper that collects information about trade statistics between South Korea and foreign countries from the TradeData website. It is designed to handle multiple pages and uses a hidden API, discovered through page inspection, to perform asynchronous requests efficiently.
- Data Scraping: Collects information on import and export between South Korea and foreign countries.
- Asynchronous Requests: Uses `aiohttp` and `asyncio` to improve scraper performance, especially when dealing with multiple pages.
- DataFrame Generation: The collected data is organized into a `pandas` DataFrame.
- CSV Storage: The data is saved in a structured CSV file for later analysis.
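The concurrent-fetch idea behind the scraper can be sketched as follows. This is a minimal, offline-runnable illustration, not the project's actual code: the endpoint URL and the `fetch_page` body are hypothetical stand-ins (the real scraper would issue `aiohttp` GET requests against the hidden API discovered via page inspection).

```python
import asyncio

# Hypothetical endpoint; the real hidden API URL is found by inspecting the page.
API_URL = "https://example.com/api/trade?page={page}"

async def fetch_page(page: int) -> dict:
    """Placeholder for an aiohttp request against the hidden API.

    With aiohttp this would be roughly:
        async with session.get(API_URL.format(page=page)) as resp:
            return await resp.json()
    Here the response is simulated so the sketch runs without network access.
    """
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"page": page, "rows": [f"row-{page}-{i}" for i in range(2)]}

async def scrape(pages: int) -> list[dict]:
    # Fire all page requests concurrently instead of awaiting them one by one.
    return await asyncio.gather(*(fetch_page(p) for p in range(1, pages + 1)))

if __name__ == "__main__":
    results = asyncio.run(scrape(3))
    print(len(results))  # → 3
```

The gain over a sequential loop is that all page requests are in flight at once, so total time is bounded by the slowest page rather than the sum of all pages.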
The scraper collects the following information from the website:
- Year: Year in which the trade took place.
- Country: Country with which Korea traded.
- Goods: Goods that were traded.
- Export Weight: Weight of exports in tons.
- Export Value: Monetary value of exports in dollars.
- Import Weight: Weight of imports in tons.
- Import Value: Monetary value of imports in dollars.
- Balance of Trade: Difference between the value of exports and imports.
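The fields above map naturally onto `pandas` columns. The sketch below uses made-up values (the real rows come from the API) to show how the records could be framed and how the Balance of Trade follows from the other columns:

```python
import pandas as pd

# Illustrative values only; real rows come from the TradeData hidden API.
records = [
    {"Year": 2023, "Country": "Brazil", "Goods": "Steel",
     "Export Weight": 120.0, "Export Value": 90_000,
     "Import Weight": 300.0, "Import Value": 150_000},
    {"Year": 2023, "Country": "Japan", "Goods": "Semiconductors",
     "Export Weight": 15.0, "Export Value": 500_000,
     "Import Weight": 10.0, "Import Value": 420_000},
]

df = pd.DataFrame(records)
# Balance of Trade = value of exports minus value of imports.
df["Balance of Trade"] = df["Export Value"] - df["Import Value"]
print(df[["Country", "Balance of Trade"]])
```

A negative balance (as in the Brazil row here) means imports exceeded exports for that year/country/goods combination.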
- Python >= 3.13
- aiohttp: For asynchronous HTTP requests.
- asyncio: To manage asynchronous execution.
- pandas: For data manipulation and analysis.
- CSV: For data storage.
Ensure that you have Python >= 3.13 installed. Use pyenv to manage Python versions if necessary:
- Install pyenv:

  ```bash
  curl https://pyenv.run | bash
  ```

- Install Python 3.13:

  ```bash
  pyenv install 3.13
  pyenv local 3.13
  ```
- Clone the repository:

  ```bash
  git clone https://github.com/pablomendesfaria/korean-international-trade-scraper.git
  cd korean-international-trade-scraper
  ```

- Install Poetry: Poetry is used to manage dependencies and the virtual environment.

  ```bash
  curl -sSL https://install.python-poetry.org | python3 -
  ```

- Install dependencies: Use Poetry to install project dependencies in an isolated environment:

  ```bash
  poetry install
  ```
- Activate the virtual environment:

  ```bash
  poetry shell
  ```

- Execute the scraper with the desired output file name:

  ```bash
  poetry run python app/scraper.py output_file_name
  ```
- The collected data will be saved in the `output_file_name.csv` file inside the `data` folder.
- When you finish using the scraper, exit the Poetry virtual environment:

  ```bash
  exit
  ```

- `app`: Module that stores the project script.
  - `scraper.py`: Main script that performs scraping and saves the data.
- `data`: Folder with the output file.
  - `output_file_name.csv`: File generated with the collected data.
- `.python-version`: Specifies the Python version used in the project.
- `pyproject.toml`: Configuration file for Poetry, specifying dependencies and project metadata.
- `.venv/`: Virtual environment directory managed by Poetry (not included in the repository).
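The saving step (taking the output name from the command line and writing `data/<name>.csv`) could look roughly like this. The `save_to_csv` helper is a hypothetical illustration, not the scraper's actual function:

```python
from pathlib import Path

import pandas as pd

def save_to_csv(df: pd.DataFrame, name: str, out_dir: str = "data") -> Path:
    """Save the DataFrame as <out_dir>/<name>.csv, creating the folder if needed.

    In the real script, `name` would come from the command line, e.g.:
        poetry run python app/scraper.py output_file_name
    which maps to sys.argv[1].
    """
    folder = Path(out_dir)
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{name}.csv"
    df.to_csv(path, index=False)  # index=False keeps the CSV clean for later analysis
    return path
```

`mkdir(parents=True, exist_ok=True)` makes the helper safe to call on a fresh clone, where the `data` folder may not exist yet.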
If you encounter any issues, verify that:
- Your Python version is correctly set to 3.13 using pyenv.
- Poetry has successfully installed all dependencies.
Feel free to open an issue or contribute to the repository!