Fitbit Time Series Project

Author: Corey Solitaire

Project Goals:

Review six months of Fitbit data to characterize the person who wore the fitness tracker (employee or test subject, status unknown) and predict the following two weeks of missing data.

Deliverables:

A Jupyter notebook containing your analysis.
A clean data set. The source data is a little messy, and some work is needed before you can use it effectively. Your notebook should include the code that goes from the original source data to a data set you can work with.
A summary of the data, i.e., what can you say about the individual who was wearing this fitness tracker?
Predictions for the missing two weeks' worth of data in a separate CSV file.

Executive Summary:

Project Summary:
The purpose of this project was to clean and explore an unlabeled fitness tracker dataset, to provide evidence supporting a characterization of the individual who wore it, and to use time series analysis to predict two missing weeks of tracker data.

Background:
Here at Big Research Co.®, we love data so much that everyone wears Fitbits, even our employees! We believe these watches are the next step in the Big Data industry and help enhance our current research. Our research spans from fitness equipment and drug trials to very ethical human experimentation. However, someone in a research and development lab mixed up the labels for our Fitbit datasets, and one was left out.

Process:
The data set was combined from eight CSV files in Excel. Because 97% of the food log data was missing, those values were dropped. The result was a data frame of 10 columns by 225 rows, with each row representing one day. Exploration of the data revealed weekly trends. Several time series models were then fit to the data, and a seven-day moving average produced the lowest RMSE, a 29% improvement over the baseline (last observed value).
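The comparison described above can be sketched as follows. This is a minimal illustration, not the notebook's code: the function names (`rmse`, `last_value_forecast`, `moving_average_forecast`) and the sample step counts are hypothetical, and holding each forecast flat across the validation window is an assumption.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def last_value_forecast(train, horizon):
    """Baseline: repeat the last observed value for every future day."""
    return [train[-1]] * horizon


def moving_average_forecast(train, horizon, window=7):
    """Repeat the mean of the last `window` observations for every future day."""
    recent = train[-window:]
    return [sum(recent) / len(recent)] * horizon


# Hypothetical daily step counts, for illustration only.
train = [4000, 5200, 4800, 5100, 4700, 9000, 8500,
         4200, 5000, 4900, 5300, 4600, 9200, 2000]
validate = [4500, 5100, 4800, 5000, 4900, 8800, 8600]

baseline_rmse = rmse(validate, last_value_forecast(train, len(validate)))
ma_rmse = rmse(validate, moving_average_forecast(train, len(validate)))
improvement = (baseline_rmse - ma_rmse) / baseline_rmse * 100

print(f"baseline RMSE: {baseline_rmse:.1f}")
print(f"7-day moving average RMSE: {ma_rmse:.1f}")
print(f"improvement over baseline: {improvement:.0f}%")
```

The improvement figure is the relative drop in RMSE against the last-observed-value baseline, which is one plausible reading of how the 29% figure was computed.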

Results and Conclusions:
The data show a slight upward trend in calories and activity over time, with most of the higher activity levels occurring on weekends. The lack of regular exercise, the length of time spent sedentary, and the missing food data suggest that this fitness tracker belongs to an employee. While the model improved on the baseline, I was not able to fit a model to the weekly seasonality. Alternate modeling techniques (time series analysis with Prophet) might return better results; however, they were not included in this study.
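One simple way to check a weekend effect like the one described above is to average a metric separately for weekdays and weekends. The sketch below assumes a mapping of dates to daily step counts; the function name `mean_by_day_type` and the sample values are hypothetical and are not taken from the project data.

```python
from datetime import date
from statistics import mean


def mean_by_day_type(daily_values):
    """Average a {date: value} mapping separately for weekdays and weekends."""
    groups = {"weekday": [], "weekend": []}
    for day, value in daily_values.items():
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        key = "weekend" if day.weekday() >= 5 else "weekday"
        groups[key].append(value)
    # Skip a group entirely if it has no observations.
    return {key: mean(values) for key, values in groups.items() if values}


# Hypothetical step counts for one week, for illustration only.
steps = {
    date(2018, 7, 2): 4000,  # Monday
    date(2018, 7, 3): 5200,
    date(2018, 7, 4): 4800,
    date(2018, 7, 5): 5100,
    date(2018, 7, 6): 4700,  # Friday
    date(2018, 7, 7): 9000,  # Saturday
    date(2018, 7, 8): 8500,  # Sunday
}

print(mean_by_day_type(steps))
```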

Recommendations:

Not Enough Data: With only seven months of data, it was difficult to pick up long-term seasonal trends. More data would likely have yielded better results.
New Model: Spending time refining the previous-cycle model and exploring Prophet would likely improve model forecasting.
Cross Validation: Time-based splitting can provide statistically robust model evaluation and best simulate real-life scenarios. Exploring the use of time-based cross validation to form a type of “sliding window” average would most likely lead to better model results.
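A time-based split of the kind recommended above could look like the sketch below. It is not part of the project code: the function name `sliding_window_splits` and the window sizes (eight weeks of training, two weeks of testing) are assumptions chosen for illustration.

```python
def sliding_window_splits(n_obs, train_size, test_size, step=None):
    """Yield (train_indices, test_indices) pairs that move forward in time.

    Each test window starts right after its training window, so a model is
    never evaluated on observations that precede its training data.
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    step = step or test_size
    start = 0
    while start + train_size + test_size <= n_obs:
        train_end = start + train_size
        train_idx = list(range(start, train_end))
        test_idx = list(range(train_end, train_end + test_size))
        yield train_idx, test_idx
        start += step


# 225 daily rows, 56 days of training and 14 days of testing per fold.
for fold, (train_idx, test_idx) in enumerate(sliding_window_splits(225, 56, 14), start=1):
    print(f"fold {fold}: train {train_idx[0]}-{train_idx[-1]}, "
          f"test {test_idx[0]}-{test_idx[-1]}")
```

Averaging a metric such as RMSE across the folds gives a steadier estimate than a single train/validate split. scikit-learn's `TimeSeriesSplit` offers a related approach that uses an expanding training window by default.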

Instructions for Replication:

Files are located in the Git repo here. The user will need an env.py file with access to the Codeup database.

Data Dictionary:

Column Name	Description
date	yyyy-mm-dd, df index
cals_burned	calories burned for the day
steps	steps taken in the day
dist	distance walked, possibly in miles
floors	uncertain, possibly floors walked up or down
mins_sedentary	minutes of the day sedentary
mins_lightly_active	minutes of the day lightly active
mins_fairly_active	minutes of the day fairly active
mins_very_active	minutes of the day very active
activity_cals	uncertain, possibly calories burned due to active minutes
month	month of the observation
weekday	weekday of the observation
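The `month` and `weekday` columns can be derived from the `date` value. The sketch below shows one way to do that with the standard library. Storing them as English names rather than numbers is an assumption, and the function name `add_calendar_columns` and the sample record are hypothetical.

```python
from datetime import datetime


def add_calendar_columns(row):
    """Return a copy of `row` with `month` and `weekday` derived from `date`."""
    parsed = datetime.strptime(row["date"], "%Y-%m-%d")
    return {
        **row,
        "month": parsed.strftime("%B"),    # full month name
        "weekday": parsed.strftime("%A"),  # full weekday name
    }


# Hypothetical record, for illustration only.
row = {"date": "2018-07-07", "steps": 9000}
print(add_calendar_columns(row))
```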

