This project presents an end-to-end data analytics workflow using PostgreSQL and Python.
The goal is to extract meaningful insights from an e-commerce dataset through structured queries, data processing, and visualization.
The project demonstrates how raw transactional data can be transformed into actionable business insights.
- Analyze customer behavior and purchasing patterns
- Identify top-performing product categories
- Evaluate delivery performance across regions
- Understand revenue trends over time
- Build visual reports for decision-making
- Database: PostgreSQL (pgAdmin)
- Programming: Python
- Libraries:
  - pandas
  - matplotlib
  - plotly
  - sqlalchemy
  - psycopg2
  - openpyxl
project-root/
│
├── sql/
│   ├── schema.sql
│   ├── queries.sql
│
├── src/
│   ├── main.py
│   ├── analytics.py
│
├── charts/
├── exports/
│   └── mercadoinsights_report.xlsx
│
├── requirements.txt
└── README.md
The dataset is based on a real-world e-commerce dataset (Brazilian Olist dataset), containing:
- Orders
- Customers
- Products
- Payments
- Delivery information
- Identifies the most common payment methods used by customers
- Shows the highest revenue-generating product categories
- Measures average delivery time across different states
- Analyzes monthly revenue over time
- Examines the relationship between product price and shipping cost
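Two of the analyses listed above can be sketched as SQL aggregations. This is a minimal illustration, not the contents of sql/queries.sql: it uses an in-memory SQLite database as a stand-in for PostgreSQL so that it runs without a server, and the table names (`orders`, `order_payments`), column names, and sample rows are assumptions modeled on the public Olist files.

```python
import sqlite3

# Stand-in for the PostgreSQL database: an in-memory SQLite database with
# hypothetical table and column names modeled on the public Olist files.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id TEXT PRIMARY KEY, order_purchase_timestamp TEXT);
CREATE TABLE order_payments (order_id TEXT, payment_type TEXT, payment_value REAL);
""")
# Sample rows are illustrative only, not taken from the dataset.
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    ("o1", "2018-01-05 10:00:00"),
    ("o2", "2018-01-20 12:30:00"),
    ("o3", "2018-02-02 09:15:00"),
])
conn.executemany("INSERT INTO order_payments VALUES (?, ?, ?)", [
    ("o1", "credit_card", 100.0),
    ("o2", "boleto", 50.0),
    ("o3", "credit_card", 75.0),
])

# Payment type distribution: number of payments per method, most common first.
payment_counts = conn.execute("""
    SELECT payment_type, COUNT(*) AS n_payments
    FROM order_payments
    GROUP BY payment_type
    ORDER BY n_payments DESC, payment_type
""").fetchall()

# Monthly revenue: sum of payment values grouped by purchase month.
# substr(..., 1, 7) extracts 'YYYY-MM' from the text timestamp; in PostgreSQL
# one would more likely use to_char(order_purchase_timestamp, 'YYYY-MM').
monthly_revenue = conn.execute("""
    SELECT substr(o.order_purchase_timestamp, 1, 7) AS month,
           SUM(p.payment_value) AS revenue
    FROM orders AS o
    JOIN order_payments AS p ON p.order_id = o.order_id
    GROUP BY month
    ORDER BY month
""").fetchall()

print(payment_counts)
print(monthly_revenue)
```

The same `GROUP BY` pattern would extend to the category and delivery-time analyses, with the grouping column swapped for product category or customer state.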
The project generates multiple types of charts:
- Pie chart → Payment type distribution
- Bar chart → Top product categories
- Horizontal bar → Delivery time by state
- Line chart → Monthly revenue trend
- Histogram → Price distribution
- Scatter plot → Price vs freight value
Interactive visualizations are also generated using Plotly.
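As a rough sketch of how one of these charts might be produced, the example below computes payment type shares and saves them as a pie chart with matplotlib. The function names `payment_type_shares` and `save_payment_pie` and the sample values are hypothetical; the project's actual plotting code in src/ may be organized differently.

```python
from collections import Counter


def payment_type_shares(payment_types):
    """Return (label, share) pairs sorted by share, largest first."""
    counts = Counter(payment_types)
    total = sum(counts.values())
    return [(label, n / total) for label, n in counts.most_common()]


def save_payment_pie(shares, path):
    """Save a pie chart of payment type shares as a PNG file."""
    # Imported lazily so the aggregation above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, suitable for scripts
    import matplotlib.pyplot as plt

    labels = [label for label, _ in shares]
    sizes = [share for _, share in shares]
    fig, ax = plt.subplots()
    ax.pie(sizes, labels=labels, autopct="%1.1f%%")
    ax.set_title("Payment type distribution")
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)


# Sample values are illustrative only, not taken from the dataset.
sample = ["credit_card", "boleto", "credit_card", "voucher"]
shares = payment_type_shares(sample)
```

Calling `save_payment_pie(shares, "charts/payment_types.png")` would write the PNG; the file name is an assumption, and the charts/ directory must already exist.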
- Create database: mercadoinsights_db
- Import dataset tables using pgAdmin
- Execute: sql/schema.sql, sql/queries.sql
- Install dependencies: pip install -r requirements.txt
- Run the analysis: python src/main.py
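Once the database is populated, the Python side needs a connection to it. The sketch below shows one way this could look with SQLAlchemy and psycopg2, both listed among the libraries. The helper names `build_db_url` and `run_query`, the default host and port, and the use of the `PGUSER` and `PGPASSWORD` environment variables are assumptions, not the project's confirmed configuration.

```python
import os
from urllib.parse import quote_plus


def build_db_url(user, password, host="localhost", port=5432,
                 database="mercadoinsights_db"):
    """Build a SQLAlchemy URL for PostgreSQL using the psycopg2 driver."""
    # quote_plus escapes characters such as '@' or '/' in credentials.
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


def run_query(sql, db_url):
    """Run a SQL query and return the result as a pandas DataFrame."""
    # Imported lazily so build_db_url can be used without these packages.
    import pandas as pd
    from sqlalchemy import create_engine, text

    engine = create_engine(db_url)
    with engine.connect() as conn:
        return pd.read_sql(text(sql), conn)


# PGUSER and PGPASSWORD are assumed environment variable names.
url = build_db_url(os.environ.get("PGUSER", "postgres"),
                   os.environ.get("PGPASSWORD", ""))
```

Reading credentials from the environment keeps them out of the repository; `run_query("SELECT 1", url)` would then serve as a quick connectivity check against a running server.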
- Excel report with multiple sheets
- Saved charts (PNG)
- Interactive Plotly HTML visualizations
- Identified top-performing product categories by revenue
- Observed seasonal trends in monthly sales
- Detected variations in delivery time across regions
- Found a correlation between product price and freight cost
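The price and freight finding can be quantified with a Pearson correlation coefficient. The sketch below computes it in plain Python; whether the project used this measure or another one is not stated, and the sample values are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Illustrative values only, not taken from the dataset.
prices = [10.0, 20.0, 30.0, 40.0]
freight = [5.0, 7.0, 9.0, 11.0]
r = pearson_r(prices, freight)
```

With pandas, `df["price"].corr(df["freight_value"])` computes the same coefficient, assuming the DataFrame uses those column names.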
Academic Project (Data Visualization & Analytics)
- Add dashboard (Streamlit or Power BI)
- Real-time data integration
- Advanced predictive analytics
MIT License