This project analyzes a Brazilian e-commerce dataset to explore sales trends, regional demand distribution, and delivery performance.
The objective is to identify logistical inefficiencies and uncover data-driven opportunities to improve delivery efficiency and customer experience.
Brazilian E-commerce Public Dataset (Olist)
- Python (pandas) — data cleaning and feature engineering
- Google Colab — analysis environment
- Tableau Public — interactive dashboard visualization
- ecommerce_analysis.ipynb — data cleaning and analysis
- Dashboard.png — dashboard preview
- README.md — project documentation
View Interactive Dashboard on Tableau: https://public.tableau.com/...
- Orders steadily increased throughout 2017
- Demand peaked between late 2017 and early 2018
- Decline at the end of the timeline is due to incomplete data
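
The monthly trend above can be reproduced with a short pandas aggregation. This is a minimal sketch, not the notebook's actual code: the sample rows are made up, and the column names `order_id` and `order_purchase_timestamp` are assumed to match the Olist orders table.

```python
import pandas as pd

# Hypothetical sample rows; the real analysis would load the Olist orders table.
orders = pd.DataFrame({
    "order_id": ["a", "b", "c", "d", "e"],
    "order_purchase_timestamp": [
        "2017-01-05 10:00:00", "2017-01-20 12:30:00",
        "2017-02-03 09:15:00", "2017-02-14 18:45:00",
        "2017-02-28 23:59:00",
    ],
})
orders["order_purchase_timestamp"] = pd.to_datetime(orders["order_purchase_timestamp"])

# Bucket each order into its calendar month and count orders per month.
monthly_orders = (
    orders.groupby(orders["order_purchase_timestamp"].dt.to_period("M"))["order_id"]
    .count()
    .rename("order_count")
)
print(monthly_orders)
```

When reading such a series, the final month is worth checking against the last purchase timestamp in the data, since a partially covered month will look like a drop in demand.
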
- São Paulo (SP) dominates in order volume (more than 40K orders)
- Other high-demand states include RJ and MG
- Demand is highly concentrated in a few regions
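
Order volume per state can be derived by joining orders to customers and counting. The sketch below uses invented rows and assumes the Olist column names `customer_id` and `customer_state`; the notebook may structure this step differently.

```python
import pandas as pd

# Hypothetical sample data standing in for the Olist orders and customers tables.
orders = pd.DataFrame({
    "order_id": ["o1", "o2", "o3", "o4", "o5", "o6"],
    "customer_id": ["c1", "c2", "c3", "c4", "c5", "c6"],
})
customers = pd.DataFrame({
    "customer_id": ["c1", "c2", "c3", "c4", "c5", "c6"],
    "customer_state": ["SP", "SP", "SP", "RJ", "RJ", "MG"],
})

# Attach the customer's state to every order.
merged = orders.merge(customers, on="customer_id", how="left")

# Count orders per state (sorted descending) and express each as a share of the total.
state_counts = merged["customer_state"].value_counts()
state_share = (state_counts / state_counts.sum()).round(2)
print(state_share)
```
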
- Average delivery time: ~12 days
- Median delivery time: 10 days
- Significant variability across regions
- Extreme outliers observed (up to 200+ days)
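
A gap between the average and the median is what a few extreme deliveries would produce. The following sketch shows one way to compute both, assuming delivery time is measured from purchase to customer delivery; the rows and the 60-day outlier cutoff are illustrative assumptions, and the column names are assumed to match the Olist orders table.

```python
import pandas as pd

# Hypothetical sample rows; the last order was never delivered.
orders = pd.DataFrame({
    "order_purchase_timestamp": ["2017-01-01"] * 6,
    "order_delivered_customer_date": [
        "2017-01-06", "2017-01-11", "2017-01-13",
        "2017-01-14", "2017-07-30", None,
    ],
})
for col in ["order_purchase_timestamp", "order_delivered_customer_date"]:
    orders[col] = pd.to_datetime(orders[col])

# Keep delivered orders only, then compute delivery time in whole days.
delivered = orders.dropna(subset=["order_delivered_customer_date"]).copy()
delivered["delivery_days"] = (
    delivered["order_delivered_customer_date"] - delivered["order_purchase_timestamp"]
).dt.days

mean_days = delivered["delivery_days"].mean()
median_days = delivered["delivery_days"].median()

# Assumed cutoff for flagging extreme deliveries; not taken from the analysis.
OUTLIER_THRESHOLD_DAYS = 60
outliers = delivered[delivered["delivery_days"] > OUTLIER_THRESHOLD_DAYS]

print(f"mean={mean_days:.1f} days, median={median_days:.1f} days, outliers={len(outliers)}")
```

In this toy sample a single 210-day delivery pulls the mean far above the median, which is why the median is often the more representative figure for typical delivery time.
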
Order volume is highly concentrated in a few states, with São Paulo (SP) dominating significantly.
High-demand regions (SP, RJ, MG) show faster delivery times, suggesting more efficient logistics and better infrastructure.
Remote states (RR, AP, AM) experience significantly longer delivery times, highlighting geographic and logistical constraints.
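
The comparison between high-demand and remote states can be checked by grouping delivery times by state. This is a sketch under assumptions: the values are invented, `delivery_days` is a hypothetical derived column, and the median is used here because it is less sensitive to outliers than the mean.

```python
import pandas as pd

# Hypothetical per-order delivery times with the customer's state attached.
deliveries = pd.DataFrame({
    "customer_state": ["SP", "SP", "SP", "RJ", "RJ", "RR", "RR"],
    "delivery_days": [6, 8, 10, 12, 14, 25, 31],
})

# Order volume and median delivery time per state, fastest states first.
by_state = (
    deliveries.groupby("customer_state")
    .agg(
        order_count=("delivery_days", "size"),
        median_days=("delivery_days", "median"),
    )
    .sort_values("median_days")
)
print(by_state)
```

Showing `order_count` next to `median_days` makes it easier to see whether states with more orders also tend to have shorter delivery times, and to spot states whose figures rest on very few orders.
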
Improving delivery infrastructure in low-demand regions could:
- reduce delivery delays
- increase customer satisfaction
- expand market reach
- Identified regions with inefficient delivery performance
- Revealed strong demand concentration in key states
- Highlighted opportunities for logistics optimization and expansion
If delivery time in high-delay regions (RR, AP, AM) is reduced by 20%:
- Customer satisfaction would be expected to improve
- Repeat purchases could increase
- Delivery-related complaints could decrease
This highlights the business value of optimizing logistics in remote areas.
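
To make the 20% scenario concrete, the arithmetic can be sketched as follows. The baseline delivery times below are placeholder values chosen for illustration, not results from the analysis; only the 20% reduction and the three states come from the scenario above.

```python
import pandas as pd

# Placeholder baseline delivery times in days (assumed values, not measured results).
baseline_days = pd.Series({"RR": 30.0, "AP": 25.0, "AM": 20.0}, name="baseline_days")

REDUCTION = 0.20  # the 20% reduction from the scenario

# Apply the reduction and compute how many days each state would save.
scenario = pd.DataFrame({
    "baseline_days": baseline_days,
    "target_days": (baseline_days * (1 - REDUCTION)).round(1),
})
scenario["days_saved"] = (scenario["baseline_days"] - scenario["target_days"]).round(1)
print(scenario)
```

Substituting the measured per-state delivery times for the placeholders gives the target delivery time each region would need to reach.
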
Delivery performance varies significantly across regions. While high-demand states benefit from efficient logistics, remote areas face delays. Targeted improvements in these regions could provide measurable business value.
