Welcome to the Data Science 101 repository.
The welcoming remarks presentation can be found here.
Here we have recorded the presentations for most of the SLUs that were presented during the bootcamp weekend. Enjoy!
Here you'll find all the information needed to set up your environment and the workflow you'll use during the course.
This is a one-time setup. Get it right the first time and you won't have to worry about it again!
Here you can find videos of the presentations from the Batch3 LDSSA course, which is the original source of the material.
- Install GitHub Desktop.
- Sign up for a GitHub account if you don't already have one.
The work you will be doing during the course makes use of packages to provide extra functionality. Installing and managing different package versions (which may differ in subtle ways) on different operating systems, while ensuring everyone gets the same results, is a challenging task. Anaconda is a widely used solution for this problem, and it is the one this course relies on.
Go to Anaconda for installation instructions. BE SURE to choose "Python 3.7 version".
If you are on Windows:
Make sure that there are no non-English characters in your username or in the path, and that you have Qt installed. Here is a reference on how to address these issues.
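A quick way to check the first condition is to look for non-ASCII characters in your home directory path. The snippet below is a minimal sketch and not part of the course material: the helper name `non_ascii_chars` is hypothetical, it treats "non-English" as "non-ASCII", and it does not check whether Qt is installed.

```python
from pathlib import Path


def non_ascii_chars(path):
    """Return the characters in `path` that fall outside the ASCII range."""
    return [ch for ch in str(path) if ord(ch) > 127]


if __name__ == "__main__":
    # The home directory usually contains the username on Windows.
    home = Path.home()
    offending = non_ascii_chars(home)
    if offending:
        print(f"Non-ASCII characters found in {home}: {offending}")
    else:
        print(f"{home} contains only ASCII characters")
```

If the script reports offending characters, follow the reference above before continuing with the setup.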
It's good practice to store your work with version control, but it is not required for this course.
- Select "File > Clone repository" in the menu bar
- Select "URL" and enter "DareData/data-science-101"
- Select and press "Clone"
In the data-science-101 repository that you just cloned there is a sample
learning unit.
It's used to give instructors guidelines to produce the learning units.
We are also using it to ensure that you are able to run and submit a learning
unit.
So go to the sample/SLU00 - LU Tutorial directory.

With each learning unit you will be provided with an environment.yml file.
It tells Anaconda all the packages the learning unit depends on and it
will be used to create an Anaconda environment.
An environment is simply an isolated set of packages. When you run your code inside an environment you will have access to the same version of the packages the instructor used to create the notebooks.
- Select "Environments"
- Select "Import"

- Set slu00 for the name and select the environment.yml file in the learning unit directory (the one in your batch3-workspace).
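After the import finishes, you can confirm from Python which environment your code is running in. This is a minimal sketch under one assumption: conda sets the `CONDA_DEFAULT_ENV` environment variable when an environment is activated, which may not hold for every way of launching Python. The helper name `active_conda_env` is hypothetical.

```python
import os
import sys


def active_conda_env(environ=None):
    """Return the active conda environment name, or None if it is not set."""
    if environ is None:
        environ = os.environ
    # Assumption: conda exports CONDA_DEFAULT_ENV on activation.
    return environ.get("CONDA_DEFAULT_ENV")


if __name__ == "__main__":
    print("Environment:", active_conda_env())
    print("Interpreter:", sys.executable)
```

If the reported name is not slu00, the code is probably running outside the environment you just created.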
All learning units come as a set of Jupyter Notebooks (and some links to presentations). Notebooks are documents that can contain text, images and live code that you can run interactively.
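On disk, a notebook is a JSON file that holds a list of cells. The sketch below builds a tiny in-memory notebook in the nbformat 4 layout and counts its cells by type. The cell contents are made up for illustration, and a real `.ipynb` file carries more metadata than shown here.

```python
import json
from collections import Counter

# A hand-written, minimal notebook in the nbformat 4 layout (illustrative only).
notebook_json = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Exercise 1"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["x = 1 + 1"]},
    ],
})


def count_cell_types(raw):
    """Count the cells of each type in a notebook given as a JSON string."""
    notebook = json.loads(raw)
    return Counter(cell["cell_type"] for cell in notebook["cells"])


cell_counts = count_cell_types(notebook_json)
print(dict(cell_counts))
```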
In this section we will launch the Jupyter Notebook application. The application is accessed through the web browser.
Once you have the application open, feel free to explore the sample learning unit structure. It will give you a handle on what to expect and on the rules the instructors follow (and the effort they put in) when creating a learning unit.
So let's start the Jupyter Notebook app.
Every learning unit contains an exercise notebook with exercises you will work on. So let's have a look at the sample Learning Unit.
- In the Jupyter Notebook UI in the browser, open the exercise notebook
- Follow the instructions provided in the notebook
Besides the exercises and the cells for you to write solutions you will see
other cells with a series of assert statements.
This is how we (and you) will determine if a solution is correct.
If all assert statements pass, meaning you don't get an AssertionError or
any other kind of exception, the solution is correct.
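As an illustration, a solution cell followed by a checking cell might look like the sketch below. The function `add_numbers` and its checks are hypothetical and are not taken from the sample learning unit.

```python
# Solution cell: you write the body of the function.
def add_numbers(a, b):
    """Return the sum of a and b."""
    return a + b


# Checking cell: each assert raises AssertionError if its condition is False.
assert add_numbers(2, 3) == 5, "2 + 3 should be 5"
assert add_numbers(-1, 1) == 0, "-1 + 1 should be 0"
print("All assertions passed")
```

If `add_numbers` returned the wrong value, the first failing assert would stop the cell with an AssertionError and its message.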
Once you've solved all of the notebook, we recommend following this simple checklist to avoid surprises.
- Save the notebook (again)
- Run "Restart & Run All"

- At this point the notebook should have run without any failing assertions
If you want to check your intermediate progress, feel free to submit your notebook before it is finished.
If you are able to go through the entire process and get a passing grade on the sample LU you'll have a good understanding of the same flow that you'll use for all LUs throughout the course.
