IUT de Lannion
BUT Informatique 3
2024-2025
SAE 5.C.01 : Datamining
BigBookSociety is a dynamic book recommendation website that uses user data to suggest books matching each user's preferences and profile information.
The project includes:
- ETL Service: Processes input CSV files and generates transformed CSVs in data/populate/.
- Database Service: A PostgreSQL instance that is initialized with SQL scripts and imports data from the ETL output.
- API Service: A FastAPI backend that connects to the PostgreSQL database.
- Web Service: An NGINX-based dynamic web server.
- Cleanup Service: An optional one-off service to remove temporary CSV files after the database has been populated.
The project uses a .env file (located in the project root) to manage environment-specific variables. For example, your .env file might look like:
DATABASE_NAME=db_sae
DB_USERNAME=postgres
PASSWORD=password
HOST=db
PORT=5432
These variables are referenced in your docker-compose.yml and used by the API and DB services to ensure consistency.
Do not share this file with anyone, as it contains your database password.
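As an illustration of how a service could consume these variables, here is a minimal Python sketch that parses .env-style text and assembles a PostgreSQL connection URL. The helper names (parse_env, build_database_url) are hypothetical and the URL layout is an assumption; the actual API service may read its settings differently, for example straight from the container environment.

```python
from urllib.parse import quote


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def build_database_url(env: dict[str, str]) -> str:
    """Assemble a postgresql:// URL from the variables shown above."""
    # Percent-encode the credentials so special characters stay valid in a URL.
    user = quote(env["DB_USERNAME"], safe="")
    password = quote(env["PASSWORD"], safe="")
    return (
        f"postgresql://{user}:{password}"
        f"@{env['HOST']}:{env['PORT']}/{env['DATABASE_NAME']}"
    )


# Same example values as the .env file above.
EXAMPLE_ENV = """\
DATABASE_NAME=db_sae
DB_USERNAME=postgres
PASSWORD=password
HOST=db
PORT=5432
"""
```

Calling build_database_url(parse_env(EXAMPLE_ENV)) with the example values yields postgresql://postgres:password@db:5432/db_sae.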
- Clone this repository:
    git clone https://github.com/MxPerrot/BigBookSociety.git
    cd ./BigBookSociety/
- Prepare Input Data:
  Place your input CSV files (Big_Boss_authors.csv, bigboss_book.csv, formulaire.csv) into the data/ directory. You might need to create this directory.
- Run the ETL Process:
  The ETL service will process these CSV files and output transformed files to data/populate/.
  To run the ETL service, execute:
    docker-compose up --build etl
  NOTE: This might take a few minutes. If something goes wrong, you will get an error message.
- Set up Environment:
  Create the .env file in your project root (BigBookSociety/).
- Initialize the Database:
  The PostgreSQL container will automatically run the SQL scripts located in the database/ folder on its first initialization.
  The scripts import data from the CSV files in data/populate/.
  Note: If you change any credentials, you may need to remove the persistent volume (using docker-compose down -v) so the DB reinitializes.
- Start API and Web Services:
  To start the remaining services, run:
    docker-compose up --build db api web
  - API Service:
    Accessible at http://localhost:8000 (try http://localhost:8000/docs for the interactive docs).
  - Web Service:
    Accessible at http://localhost.
- Cleanup Temporary CSV Files (Optional):
  Once the database is populated and the system is running, you can clean up the CSV files by running the cleanup service:
    docker-compose run cleanup
  WARNING: This will erase all files in the data folder, including the three input CSV files from the Prepare Input Data step.
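Before launching the ETL step, it can help to confirm that the three input files are actually in place. The following Python sketch is one way to do that; the function name missing_input_files is hypothetical and not part of the project, and the file names are taken from the Prepare Input Data step.

```python
from pathlib import Path

# Input files named in the Prepare Input Data step.
EXPECTED_INPUT_FILES = (
    "Big_Boss_authors.csv",
    "bigboss_book.csv",
    "formulaire.csv",
)


def missing_input_files(data_dir: Path) -> list[str]:
    """Return the expected input files that are absent from data_dir."""
    return [
        name
        for name in EXPECTED_INPUT_FILES
        if not (data_dir / name).is_file()
    ]
```

Run from the project root, missing_input_files(Path("data")) returns an empty list when everything is ready, and otherwise lists the files still to be copied into data/.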
- Database Connection Issues:
  Ensure that your API connects using HOST=db (as defined in the .env file) and that the DB service is fully initialized before the API attempts to connect. Consider using a wait script if needed.
- ETL Not Producing Expected Output:
  Verify that the ETL process writes output to the data/populate/ folder and that the file paths in your SQL scripts correctly point to these files (taking into account the volume mounts in Docker Compose).
- Web Server Not Serving Files:
  Check that your static files are correctly located in the web/ directory and that the Dockerfile for the web service properly copies them to NGINX’s default directory.
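For the database connection issue, a wait script could look roughly like the Python sketch below. It is an assumption about one possible approach, not the project's own script: the function name wait_for_port is hypothetical, and the check only confirms that the TCP port accepts connections, which does not guarantee that PostgreSQL has finished running its initialization scripts.

```python
import socket
import time


def wait_for_port(host: str, port: int,
                  timeout: float = 30.0, interval: float = 1.0) -> bool:
    """Retry a TCP connection until it succeeds or the timeout elapses.

    Returns True as soon as a connection is accepted, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt is itself bounded by `interval` seconds.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers refused connections, timeouts and unresolved host names.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Inside the API container, a call such as wait_for_port("db", 5432) before opening the database connection would match the HOST and PORT values from the example .env file.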
Wizards of the West Coast
- Nathan Bracquart
- Miliaw Chesné
- Asaïah Cosson
- Damien Goupil
- Ewan Lansonneur
- Florian Normand
- Maxime Perrot