MRF Digital Suite is an end-to-end operational analytics system designed to ingest, validate, analyze, and report municipal waste-load data from Material Recovery Facilities (MRFs).
The system processes structured CSV data, stores it in a relational database, computes key operational metrics, and generates automated analytical reports (CSV & PDF) along with an interactive dashboard for real-time insights.
This project demonstrates practical data engineering, analytics, and automation workflows in a real-world operations setting.
- CSV import with schema normalization
- Append or overwrite import mode
- Validation for missing/invalid fields
- Duplicate record detection
- Error handling with skipped row tracking
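The validation features above can be sketched as follows. This is a minimal illustration, not the project's actual import code; the function name and return shape are assumptions.

```python
import csv
import io

REQUIRED_FIELDS = ("date", "vehicle_id", "weight", "material", "destination")

def validate_rows(csv_text):
    """Return (valid_rows, skipped); skipped tracks line number and reason."""
    valid, skipped = [], []
    reader = csv.DictReader(io.StringIO(csv_text))
    for lineno, row in enumerate(reader, start=2):  # header is line 1
        missing = [f for f in REQUIRED_FIELDS if not (row.get(f) or "").strip()]
        if missing:
            skipped.append((lineno, "missing: " + ", ".join(missing)))
            continue
        try:
            if float(row["weight"]) <= 0:
                raise ValueError
        except ValueError:
            skipped.append((lineno, "invalid weight"))
            continue
        valid.append(row)
    return valid, skipped
```

Tracking skipped rows alongside valid ones lets the UI report exactly which lines were rejected and why, rather than failing the whole import.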
- SQLite relational database
- Structured schema for waste_records
- Parameterized queries for safe data insertion
- Modular data access functions
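A minimal sketch of the `waste_records` schema and a parameterized insert; column names follow the expected CSV fields, but the project's actual DDL may differ.

```python
import sqlite3

DDL = """
CREATE TABLE IF NOT EXISTS waste_records (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    date TEXT NOT NULL,
    vehicle_id TEXT NOT NULL,
    weight REAL NOT NULL,
    material TEXT NOT NULL,
    destination TEXT NOT NULL
);
"""

def insert_record(conn, record):
    # Parameterized query: values are bound by the driver,
    # never string-formatted into the SQL text.
    conn.execute(
        "INSERT INTO waste_records (date, vehicle_id, weight, material, destination) "
        "VALUES (:date, :vehicle_id, :weight, :material, :destination)",
        record,
    )

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
insert_record(conn, {"date": "2024-01-01", "vehicle_id": "V1",
                     "weight": 120.0, "material": "PET",
                     "destination": "Recycler"})
conn.commit()
```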
Computes operational metrics including:
- Total waste processed
- Revenue potential (material-based pricing model)
- Weekly waste trends
- Material distribution breakdown
- Destination analysis
- Average daily waste
- Today’s operational summary
Revenue is dynamically calculated using configurable material rate mappings.
- CSV export for raw data reporting
- Styled PDF report generation using ReportLab
- Summary statistics tables
- Material-wise revenue breakdown
- Destination percentage analysis
- Time-range filtering support
- Dynamic report timestamps
- KPI metric cards
- Weekly trend visualization
- Material composition pie chart
- Revenue by material bar chart
- Destination breakdown
- Real-time dashboard refresh
- Downloadable data exports
```
CSV Upload → Validation & Cleaning → SQLite DB → Aggregation Logic →
KPI Engine → Dashboard API → Streamlit UI
                     ↓
             CSV & PDF Reports
```
- Python
- SQLite
- Pandas
- Streamlit
- Plotly
- ReportLab
- SQL (aggregation queries, filtering, grouping)
- Standard Python libraries (datetime, csv, os, uuid)
```
MRFDigital-Suite/
│
├── app.py                 # Streamlit frontend & routing logic
├── dashboard.py           # Aggregation & KPI computation
├── report_generator.py    # Data import, validation & PDF/CSV reporting
├── mrf_data.db            # SQLite database file
├── uploads/               # Uploaded CSV files
├── reports/               # Generated reports
├── templates/             # HTML templates (if applicable)
├── static/                # Static assets
└── README.md
```
- Total Waste Processed (kg)
- Revenue Potential (₹)
- Average Daily Waste
- Weekly Waste Trend (Last 30 Days)
- Top Material Category
- Main Destination
- Revenue by Material Type
The dashboard computes operational KPIs directly from the waste_records table using SQL-style aggregation logic:
- **Total Waste Processed**: sum of `weight` across all records within the selected time range.
- **Average Daily Waste**: mean of daily aggregated totals over the last 30 days.
- **Today's Loads**: derived from records where `date = current_date`.
- **Material Distribution**: aggregated using `GROUP BY material` to compute total weight per material type.
- **Destination Breakdown**: aggregated using `GROUP BY destination` to analyze waste routing patterns.
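The aggregations above can be sketched directly against SQLite; the sample rows and variable names here are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE waste_records (date TEXT, weight REAL, material TEXT, destination TEXT)"
)
conn.executemany(
    "INSERT INTO waste_records VALUES (?, ?, ?, ?)",
    [("2024-01-01", 100, "PET", "Recycler"),
     ("2024-01-01", 50, "Paper", "Landfill"),
     ("2024-01-02", 80, "PET", "Recycler")],
)

# Total waste processed
total = conn.execute("SELECT SUM(weight) FROM waste_records").fetchone()[0]

# Material distribution: total weight per material type
by_material = dict(conn.execute(
    "SELECT material, SUM(weight) FROM waste_records GROUP BY material"))

# Average daily waste: mean over daily aggregated totals
avg_daily = conn.execute(
    "SELECT AVG(daily) FROM "
    "(SELECT SUM(weight) AS daily FROM waste_records GROUP BY date)"
).fetchone()[0]
```

Because each KPI is a single aggregation query, recomputing them on every dashboard load stays cheap even as the table grows.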
All KPIs are calculated dynamically based on real-time database queries.
Revenue is calculated using a predefined material rate mapping:
| Material | Rate (₹/kg) |
|---|---|
| PET | 28 |
| HDPE | 28 |
| Paper | 11 |
| Glass | 2 |
| MLP | 4 |
| Rubber | 4 |
For each material:
Revenue = Total Weight (kg) × Rate (₹/kg)
Material-wise revenue is computed after aggregation and then summed to derive total revenue potential.
This approach demonstrates how structured operational data can be transformed into business-impact metrics.
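The per-material calculation above can be sketched as a small function over the rate table; the function name and dict shapes are assumptions for illustration.

```python
# Material rate mapping from the table above (₹/kg)
RATES = {"PET": 28, "HDPE": 28, "Paper": 11, "Glass": 2, "MLP": 4, "Rubber": 4}

def revenue_by_material(weights, rates=RATES):
    """weights: {material: total_kg} produced by aggregation.
    Unknown materials default to a rate of 0."""
    return {m: kg * rates.get(m, 0) for m, kg in weights.items()}

weights = {"PET": 100.0, "Paper": 200.0}
per_material = revenue_by_material(weights)
total_revenue = sum(per_material.values())
```

Keeping the rates in a plain mapping is what makes the pricing model configurable: updating a rate changes every downstream revenue figure without touching the aggregation code.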
Weekly trends are generated by:
- Filtering records within the last 30 days.
- Grouping by `date`.
- Aggregating daily total waste.
- Ordering results chronologically.
This allows visualization of:
- Operational consistency
- Peak processing days
- Volume fluctuations over time
The trend is recalculated dynamically for each dashboard load.
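The four steps above map naturally onto a Pandas pipeline, since the project already uses Pandas. A minimal sketch with synthetic records:

```python
import pandas as pd

# Illustrative sample records, not real data
records = pd.DataFrame({
    "date": pd.to_datetime(["2024-06-01", "2024-06-01",
                            "2024-06-02", "2024-06-03"]),
    "weight": [100.0, 50.0, 80.0, 60.0],
})

# 1) Filter to the last 30 days, 2) group by date,
# 3) aggregate daily totals, 4) order chronologically
cutoff = records["date"].max() - pd.Timedelta(days=30)
trend = (records[records["date"] >= cutoff]
         .groupby("date", as_index=False)["weight"].sum()
         .sort_values("date"))
```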
To ensure KPI accuracy:
- Duplicate records are detected before insertion.
- Invalid or malformed data rows are skipped.
- Append and overwrite modes are explicitly controlled.
- All aggregations are performed after validation.
This ensures reported metrics are reliable and reproducible.
```shell
pip install streamlit pandas plotly reportlab
streamlit run app.py
```
- Load sample data (100 synthetic records)
- Upload custom CSV data
- Generate reports
- Download exports
Expected fields:
date, vehicle_id, weight, material, destination
Alternate header variations are supported and normalized automatically.
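Header normalization might look like the sketch below. The alias table is hypothetical; the real mapping lives in the import code and may support different variants.

```python
# Hypothetical alias map: raw header (lowercased, underscored) → canonical field
HEADER_ALIASES = {
    "date": "date", "load_date": "date",
    "vehicle": "vehicle_id", "vehicle_id": "vehicle_id", "truck_id": "vehicle_id",
    "weight": "weight", "weight_kg": "weight", "net_weight": "weight",
    "material": "material", "material_type": "material",
    "destination": "destination", "dest": "destination",
}

def normalize_headers(headers):
    """Map raw CSV headers to canonical field names, case/space insensitive.
    Unrecognized headers pass through unchanged."""
    return [HEADER_ALIASES.get(h.strip().lower().replace(" ", "_"), h)
            for h in headers]
```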
Generated reports include:
- Total waste processed
- Number of loads
- Average load weight
- Top material & destination
- Material-wise revenue table
- Destination percentage distribution
- Time-stamped generation details
Reports are saved in the /reports directory.
- Duplicate detection before insert
- Validation for invalid weight values
- Missing essential field checks
- Optional append mode for safe incremental imports
- End-to-end data pipeline thinking
- Structured relational data modeling
- SQL-style aggregation logic
- KPI computation aligned to business needs
- Data validation & integrity controls
- Automation of reporting workflows
- Interactive analytics visualization
- Clean modular code organization
- Migration to PostgreSQL for production-scale deployment
- Role-based access control
- Logging system for import & report tracking
- Scheduled automated report generation
- REST API deployment option