[Project Description] [Project Planning] [Key Findings] [Data Dictionary] [Data Acquire and Prep] [Data Exploration] [Modeling] [Conclusion]
This project will be added to LinkedIn: http://linkedin.com/jonathantware
This project uses Kaggle's Chicago crime dataset to predict the counts of specific crime types for the coming year.
To achieve accurate predictions, I will develop a predictive model using machine learning techniques. I will explore different algorithms suited to time series forecasting. Regression models such as linear regression may also be used to estimate crime counts from the selected features.
This project has three targets: 'ASSAULT', 'BATTERY', and 'CRIMINAL DAMAGE'.
- Explore the data.
- Run autocorrelation visualizations.
- Select features for modeling.
- Run the features through at least 5 different algorithms.
Explore features further to see whether model predictions can improve. Look for data to add to the time series analysis (TSA), such as population and weather.
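As a rough illustration of running one series through several algorithms and comparing them, here is a minimal sketch. The five baselines (last value, simple average, moving average, seasonal naive, linear trend) are illustrative choices, not necessarily the algorithms used in this project. The names `rmse` and `baseline_forecasts`, the 7-day season, the 30-day window, and the synthetic data are all assumptions.

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

def baseline_forecasts(train, horizon, season=7, window=30):
    """Forecast `horizon` steps ahead with five simple baselines (hypothetical helper)."""
    train = np.asarray(train, dtype=float)
    history_steps = np.arange(len(train))
    future_steps = np.arange(len(train), len(train) + horizon)
    # Ordinary least-squares line through the training data.
    slope, intercept = np.polyfit(history_steps, train, deg=1)
    return {
        "last_value": np.full(horizon, train[-1]),
        "simple_average": np.full(horizon, train.mean()),
        "moving_average": np.full(horizon, train[-window:].mean()),
        # Repeat the last full season (assumed weekly) across the horizon.
        "seasonal_naive": train[-season:][np.arange(horizon) % season],
        "linear_trend": intercept + slope * future_steps,
    }

# Synthetic daily counts with a weekly cycle, for illustration only (not project data).
rng = np.random.default_rng(0)
days = np.arange(228)
counts = 20 + 5 * np.sin(2 * np.pi * days / 7) + rng.normal(0, 1, size=days.size)
train, validate = counts[:200], counts[200:]

forecasts = baseline_forecasts(train, horizon=len(validate))
scores = {name: rmse(validate, pred) for name, pred in forecasts.items()}
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:>15}: RMSE {score:.2f}")
```

Scoring every candidate on the same held-out window with a single metric keeps the comparison fair. The best performer on validate can then be checked once against test.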
- Assault had a weak correlation with time.
- Battery had a weak correlation with time.
- Criminal damage also had a correlation with time, although a weak one.
- Of the 10 crime types considered, only 3 made the cut for modeling.
| Attribute | Definition | Data Type |
|---|---|---|
| Date | Date of the observation, ordered by day | int |
| THEFT | Number of thefts that occurred on a given day | int |
| BATTERY | Number of batteries that occurred on a given day | int |
| ASSAULT | Number of assaults that occurred on a given day | int |
| CRIMINAL DAMAGE | Number of criminal damage incidents that occurred on a given day | int |
| MOTOR VEHICLE THEFT | Number of motor vehicle thefts that occurred on a given day | int |
| NARCOTICS | Number of narcotics-related crimes that occurred on a given day | int |
| HOMICIDE | Number of homicides that occurred on a given day | int |
| HUMAN TRAFFICKING | Number of human trafficking incidents that occurred on a given day | int |
| OFFENSE INVOLVING CHILDREN | Number of offenses involving children that occurred on a given day | int |
| KIDNAPPING | Number of kidnappings that occurred on a given day | int |
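The daily-count columns in the table above could be derived from incident-level records roughly as follows. This is a minimal sketch, assuming the raw data has one row per incident with a `Date` timestamp column and a `Primary Type` crime-type column. Those column names are an assumption and should be checked against the downloaded CSV. The helper name `daily_counts` is hypothetical.

```python
import pandas as pd

def daily_counts(records: pd.DataFrame) -> pd.DataFrame:
    """Count incidents per calendar day and crime type (hypothetical helper)."""
    # Assumed column names: 'Date' (timestamp) and 'Primary Type' (crime type).
    days = pd.to_datetime(records["Date"]).dt.normalize()
    counts = pd.crosstab(days, records["Primary Type"])
    # Add days with no incidents so the index is a regular daily series.
    full_range = pd.date_range(counts.index.min(), counts.index.max(), freq="D")
    return counts.reindex(full_range, fill_value=0)

# Tiny made-up sample, for illustration only.
sample = pd.DataFrame({
    "Date": ["2020-01-01 10:00", "2020-01-01 23:30", "2020-01-03 08:15"],
    "Primary Type": ["ASSAULT", "BATTERY", "ASSAULT"],
})
print(daily_counts(sample))
```

Reindexing onto a complete daily range matters for time series work: a day with zero incidents should appear as a 0 rather than as a missing row.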
- Install the necessary Python packages, including the Kaggle package (`pip install kaggle`).
- Clone the individual_project repository.
- Use the wrangle function to download the zip file from Kaggle (https://www.kaggle.com/datasets/adelanseur/crimes-2001-to-present-chicago) and extract the data into the current directory.
- Ensure the wrangle.py and model.py files are in the same folder as the crime_final_report.ipynb notebook.
- Dropped duplicate rows.
- Dropped nulls.
- Created a function to acquire and prep the data.
- Created a function to split the data into train, validate, and test sets.
- Python files used for exploration:
- wrangle.py
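The preparation steps listed above (dropping duplicates and nulls, then splitting into train, validate, and test) might look like the sketch below. It is not the code in wrangle.py. The function names `clean` and `split_by_time`, the 60/20/20 proportions, and the sample column names are assumptions made for illustration. The split is chronological rather than random, which is the usual choice for time series so that later data never leaks into training.

```python
import pandas as pd

def clean(records: pd.DataFrame) -> pd.DataFrame:
    """Drop exact duplicate rows, then rows containing any null value."""
    return records.drop_duplicates().dropna()

def split_by_time(df: pd.DataFrame, train_frac: float = 0.6, validate_frac: float = 0.2):
    """Split a chronologically sorted frame into train, validate, and test.

    The fractions are illustrative assumptions; the remainder goes to test.
    """
    n = len(df)
    train_end = int(n * train_frac)
    validate_end = train_end + int(n * validate_frac)
    return df.iloc[:train_end], df.iloc[train_end:validate_end], df.iloc[validate_end:]

# Made-up incident records with one duplicate and one null, for illustration only.
raw = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-01", "2020-01-02", None],
    "Primary Type": ["ASSAULT", "ASSAULT", "BATTERY", "THEFT"],
})
print(len(clean(raw)), "rows remain after cleaning")

# Made-up daily counts indexed by date.
daily = pd.DataFrame(
    {"ASSAULT": range(10)},
    index=pd.date_range("2020-01-01", periods=10, freq="D"),
)
train, validate, test = split_by_time(daily)
print(len(train), len(validate), len(test))
```

Duplicates are dropped on the incident-level records rather than on the daily counts, since two different days can legitimately share the same counts.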
- Three of the ten crime types were chosen, using autocorrelation, as separate targets for modeling.
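To show how autocorrelation can separate series with usable temporal structure from series without it, here is a minimal sketch using `pandas.Series.autocorr`. The lags (1, 7, and 30 days), the helper name `lag_autocorrelations`, and the synthetic columns are assumptions. The selection criteria this project actually applied are not reproduced here.

```python
import numpy as np
import pandas as pd

def lag_autocorrelations(series: pd.Series, lags=(1, 7, 30)) -> dict:
    """Pearson correlation of a series with itself shifted by each lag (in days)."""
    return {lag: float(series.autocorr(lag=lag)) for lag in lags}

# Synthetic daily series, for illustration only (not project data):
# one with a weekly cycle, one that is pure noise.
rng = np.random.default_rng(42)
index = pd.date_range("2020-01-01", periods=364, freq="D")
weekly_cycle = 5 * np.sin(2 * np.pi * np.arange(364) / 7)
demo = pd.DataFrame(
    {
        "WEEKLY_PATTERN": 20 + weekly_cycle + rng.normal(0, 1, 364),
        "PURE_NOISE": 20 + rng.normal(0, 1, 364),
    },
    index=index,
)

for column in demo.columns:
    print(column, lag_autocorrelations(demo[column]))
```

A series whose autocorrelation stays near zero at every lag gives a time-based model little to work with. For a visual check, `pandas.plotting.autocorrelation_plot` draws the same quantity across all lags (it requires matplotlib).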