Read this in other languages: Français
This project analyzes the French road traffic accident dataset (BAAC) to identify key risk factors and to build a machine learning model that predicts accident severity.
The final objective is to develop a tool capable of:
- Identifying the most influential features (weather, light conditions, driver profile, etc.) on accident severity.
- Predicting the probability of an accident being severe (resulting in death or serious injury) based on its characteristics.
- Presenting these findings interactively through a simple web application.
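The "severe" target described above can be sketched as a binary label derived from the injury severity of each user involved. This is a minimal illustration, not the project's actual implementation: the numeric codes (2 for killed, 3 for hospitalized) and the `grav` field name are assumptions about the BAAC coding that should be checked against the official data dictionary, and `is_severe` is a hypothetical helper name.

```python
# Assumed BAAC injury codes for the per-user `grav` field.
# Verify these against the official data dictionary for the years you use.
KILLED = 2
HOSPITALIZED = 3
SEVERE_CODES = {KILLED, HOSPITALIZED}


def is_severe(user_grav_codes):
    """Return 1 if any involved user was killed or seriously injured, else 0."""
    return int(any(code in SEVERE_CODES for code in user_grav_codes))
```

Labeling at the accident level (rather than per user) matches the stated goal of predicting whether an accident as a whole is severe; a per-user target would be an equally valid alternative design.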
The data comes from the public dataset "Bases de données annuelles des accidents corporels de la circulation routière" (annual databases of road traffic injury accidents), available on data.gouv.fr.
- Source: Link to the dataset
- Scope: Annual data from 2018 to 2023.
The dataset is split into four main files for each year:
- caracteristiques.csv: General characteristics of the accident (date, time, weather conditions, light).
- lieux.csv: Information about the accident location (road category, surface condition, infrastructure).
- usagers.csv: Information about the users involved (age, gender, injury severity, safety equipment).
- vehicules.csv: Information about the vehicles involved (category, direction of impact).
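Because one accident can involve several users and vehicles, the four files have to be linked before modeling. The sketch below shows one way to group the rows per accident using only the standard library. It assumes that every file shares an accident identifier column named `Num_Acc` and that the files are semicolon-delimited; both should be confirmed for each year. The helper names `read_rows` and `group_by_accident` are hypothetical.

```python
import csv

ACCIDENT_KEY = "Num_Acc"  # assumed name of the shared accident identifier column


def read_rows(handle, delimiter=";"):
    """Parse an open CSV file (or any text stream) into a list of dict rows."""
    return list(csv.DictReader(handle, delimiter=delimiter))


def group_by_accident(characteristics, locations, users, vehicles, key=ACCIDENT_KEY):
    """Attach location, user, and vehicle rows to their accident record."""
    accidents = {
        row[key]: {"caracteristiques": row, "lieux": [], "usagers": [], "vehicules": []}
        for row in characteristics
    }
    related = (("lieux", locations), ("usagers", users), ("vehicules", vehicles))
    for name, rows in related:
        for row in rows:
            record = accidents.get(row[key])
            if record is not None:  # skip rows whose accident is not listed
                record[name].append(row)
    return accidents
```

Each table would be read with something like `read_rows(open(path, newline="", encoding="utf-8"))`, where `path` points to one year's copy of the file; the exact file names and encoding vary between years and are not assumed here.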