hinmbo/Food-Delivery-Estimation

Food Delivery Time Estimation

Overview

The Food Delivery Time Estimation project utilizes machine learning models to predict the delivery time of food orders based on various factors such as geographical distance, delivery driver attributes, and vehicle type. This project leverages Python and libraries like Pandas, Scikit-learn, and Streamlit to provide a user-friendly web interface for predictions and data analysis.

Features

  • Interactive Web Application: Built with Streamlit to provide a seamless user interface.
  • Multiple ML Algorithms: Supports Random Forest, Gradient Boosting, Linear Regression, Decision Tree, Extra Trees, and K-Neighbors regression models.
  • Data Visualization: Includes detailed exploratory data analysis with visualizations using Matplotlib and Seaborn.
  • Distance Calculation: Implements the Haversine formula to compute the great-circle distance between two locations.
  • Customizable Inputs: Allows users to input data dynamically for real-time predictions.

Technologies Used

  • Programming Language: Python
  • Data Processing: Pandas, NumPy
  • Machine Learning: Scikit-learn
  • Visualization: Matplotlib, Seaborn
  • Web Interface: Streamlit
  • Big Data Processing: PySpark (optional for larger datasets)

Dataset

The project uses a food delivery dataset sourced from Kaggle, which contains details such as:

  • Restaurant and delivery location coordinates (latitude and longitude).
  • Delivery driver attributes (e.g., age, vehicle type).
  • Delivery time taken (in minutes).

Installation

  1. Clone the repository:

    git clone https://github.com/hinmbo/Food-Delivery-Estimation.git
  2. Navigate to the project directory:

    cd Food-Delivery-Estimation
  3. Install dependencies:

    pip install -r requirements.txt
  4. Run the application:

    streamlit run app.py

Usage

  1. Open the application in your browser.

  2. Navigate between the following tabs:

    • Application: Input details to estimate delivery time.
    • Data Analysis: Explore and visualize the dataset.
    • README: View project documentation directly within the app.
  3. Choose a machine learning algorithm and provide necessary inputs like restaurant and customer coordinates, vehicle type, and driver age.

  4. View estimated delivery time and additional model accuracy metrics.

Project Workflow

1. Data Processing

  • Clean and preprocess the dataset.
  • Compute the distance between restaurant and delivery location using the Haversine formula.
  • Convert categorical variables (e.g., vehicle type) to numerical using one-hot encoding.
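The one-hot encoding step can be illustrated with a minimal sketch. In the project this would typically be done with a library call such as `pandas.get_dummies`; the plain-Python version below only shows what the encoding produces. The function name `one_hot` and the sample vehicle types are assumptions made for illustration, not values taken from the dataset.

```python
def one_hot(values):
    """Return (categories, rows): one 0/1 column per distinct category."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column that matches this value
        rows.append(row)
    return categories, rows


# Hypothetical sample of the vehicle-type column.
vehicle_types = ["motorcycle", "scooter", "bicycle", "scooter"]
categories, encoded = one_hot(vehicle_types)

for vehicle, row in zip(vehicle_types, encoded):
    print(vehicle, dict(zip(categories, row)))
```

Each row contains exactly one 1, so a model sees vehicle type as a set of independent indicator columns rather than as an arbitrary numeric ordering.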

2. Model Training

  • Train multiple machine learning models using Scikit-learn.
  • Evaluate models based on metrics like Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and R-squared.
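The four evaluation metrics follow directly from their definitions. The sketch below computes them in plain Python so the arithmetic is visible; in practice the equivalent functions in `sklearn.metrics` would likely be used. The helper name `regression_metrics` and the sample values are hypothetical and are not results from this project.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R-squared for paired actual/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e ** 2 for e in errors)               # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }


# Hypothetical delivery times in minutes (not project results).
actual = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 43.0, 49.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

RMSE and MAE are expressed in minutes, which makes them easier to interpret than MSE; R-squared is unitless and measures how much of the variance in delivery time the model explains.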

3. Real-Time Predictions

  • Accept user inputs via the Streamlit interface.
  • Perform predictions using the selected machine learning model.
  • Display results along with optional performance metrics.

Exploratory Data Analysis (EDA)

The project includes detailed EDA with:

  • Distribution analysis of key variables.
  • Correlation heatmaps.
  • Summary statistics for numeric data.

Correlational Analysis

We calculated the correlation between each numerical column and the 'Time_taken(min)' column to identify the variables that most affect delivery time.
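As a rough illustration of this step, the sketch below computes the Pearson correlation of each numeric column against the delivery time. It assumes Pearson correlation is the measure used, which is the default of pandas' `DataFrame.corr`. The `pearson` helper, the `age` column name, and all sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical numeric columns and target (not taken from the dataset).
columns = {
    "distance": [2.0, 4.0, 6.0, 8.0],
    "age": [30.0, 25.0, 35.0, 28.0],
}
time_taken = [15.0, 25.0, 35.0, 45.0]  # stands in for 'Time_taken(min)'

correlations = {name: pearson(values, time_taken) for name, values in columns.items()}

# List the columns from the strongest to the weakest relationship.
for name, r in sorted(correlations.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {r:.3f}")
```

Values near 1 or -1 indicate a strong linear relationship with delivery time, while values near 0 suggest the column adds little linear signal on its own.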

Geospatial Visualisation

A scatter plot was created to show the geographic locations of restaurants and delivery points using their latitude and longitude. Using the Haversine function defined below, a new 'distance' column was added to the DataFrame for further analysis.

Haversine Method

import math as mt


def haversine(lat1, lon1, lat2, lon2):
    """Return the great-circle distance in km between two points given in degrees."""
    R = 6371  # Mean radius of the Earth in km.

    # Convert latitude and longitude from degrees to radians.
    lat1 = mt.radians(lat1)
    lon1 = mt.radians(lon1)
    lat2 = mt.radians(lat2)
    lon2 = mt.radians(lon2)

    # Differences between the coordinates of the two positions.
    Δlat = lat2 - lat1
    Δlon = lon2 - lon1

    # Haversine formula.
    a = mt.sin(Δlat/2)**2 + mt.cos(lat1) * mt.cos(lat2) * mt.sin(Δlon/2)**2
    c = 2 * mt.atan2(mt.sqrt(a), mt.sqrt(1-a))
    d = R * c

    return d
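A short usage sketch may help show how the function feeds the 'distance' column. The function is repeated here in compact form only so that the snippet runs on its own. The record layout (`restaurant` and `delivery` keys) and the coordinates are assumptions made for illustration.

```python
import math


def haversine(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * math.atan2(math.sqrt(a), math.sqrt(1 - a))


# Hypothetical orders; the key names are assumptions, not the dataset's columns.
orders = [
    {"restaurant": (37.9838, 23.7275), "delivery": (37.9938, 23.7375)},
    {"restaurant": (40.6401, 22.9444), "delivery": (40.6301, 22.9544)},
]

# Add a 'distance' value to every order, mirroring the new DataFrame column.
for order in orders:
    order["distance"] = haversine(*order["restaurant"], *order["delivery"])
    print(f"{order['distance']:.2f} km")

# Sanity check: one degree of longitude at the equator is about 111.19 km.
equator_degree_km = haversine(0.0, 0.0, 0.0, 1.0)
```

The result is a straight-line distance over the Earth's surface, so it will usually underestimate the distance a driver actually travels by road.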

Temporal Analysis

To analyse how delivery times are distributed for each type of vehicle, we plotted a kernel density estimation (KDE) plot.
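For readers unfamiliar with KDE, the sketch below shows the idea behind such a plot: each observation contributes a small Gaussian bump, and the bumps are averaged into a smooth density curve. This is a plain-Python illustration, not the plotting code used in the project, which presumably relies on Seaborn. The bandwidth, the helper name `gaussian_kde`, and the sample delivery times are assumptions.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian kernel per sample, centred on that sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical delivery times in minutes per vehicle type.
times_by_vehicle = {
    "motorcycle": [25.0, 30.0, 28.0, 35.0],
    "scooter": [20.0, 22.0, 26.0, 24.0],
}

for vehicle, times in times_by_vehicle.items():
    density = gaussian_kde(times, bandwidth=2.0)
    # Evaluate the curve on a one-minute grid and report where it peaks.
    peak = max(range(10, 51), key=density)
    print(vehicle, "density peaks near", peak, "minutes")
```

Plotting one such curve per vehicle type on shared axes makes it easy to compare where each distribution is centred and how widely it spreads.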

Age Distribution

We visualised the age distribution of delivery drivers using a histogram.

Results

The models were evaluated based on:

  • Mean Squared Error (MSE)
  • Root Mean Squared Error (RMSE)
  • Mean Absolute Error (MAE)
  • R-squared (R²)

Random Forest Regression provided the best results with the highest R² score and lowest error metrics.

Streamlit Web Interface

The interface offers a range of machine learning models: Random Forest Regression, Gradient Boosting Regression, Linear Regression, Decision Tree Regression, Extra Trees Regression, and K-Neighbors Regression.

The web interface also presents the exploratory data analysis, so users can work through the data step by step, as we did, and build their own understanding of the analysis.

Future Enhancements

  • Incorporate additional features like weather and traffic data.
  • Support larger datasets using distributed computing with PySpark.
  • Optimize models for better performance.

Contributors

  • Vasileios Katotomichelakis (Π2020132)
  • Charalampos Makrylakis (Π2019214)

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Special thanks to Kaggle for the dataset and open-source libraries used in this project.
