Data Science 101

Welcome to The Data Science 101 repository.

The welcoming remarks presentation can be found here.

Here we have recorded the presentations for most of the SLUs that were presented during the bootcamp weekend. Enjoy!

Here you'll find all the information needed to set up your environment, along with the workflow you'll use during the course.

  1. Initial Setup
    1. Setup Git/GitHub
    2. Install Anaconda
    3. Setup your Repository

Initial Setup

This is a one-time setup. Get it right the first time and you won't have to worry about it again!

Presentation Videos

Here you can find videos of the presentations from the Batch3 LDSSA course, which is the original source of the material.

Setup Git/GitHub

  1. Install GitHub Desktop.
  2. Sign up for a GitHub account if you don't already have one.

Install Anaconda

The work you will be doing during the course relies on packages that provide extra functionality. Installing and managing different package versions (which may differ in subtle ways) across operating systems, while ensuring everyone gets the same results, is a challenging task. Anaconda is currently the best solution for this problem.

Go to Anaconda for installation instructions. BE SURE to choose "Python 3.7 version".

If you are on Windows

Make sure that there are no non-English characters in your username or anywhere in the installation path, and that you have Qt installed. Here is a reference on how to address these issues.

Setup your Workspace Repository

It's good practice to keep your work under version control, but it is not required for this course.

Using GitHub Desktop
  1. Select "File > Clone repository" on the menubar
  2. Select the "URL" option and enter "DareData/data-science-101"
  3. Choose where to store the repository on your machine and press "Clone"

Running a Learning Unit

In the data-science-101 repository that you just cloned there is a sample learning unit. It's used to give instructors guidelines to produce the learning units. We are also using it to ensure that you are able to run and submit a learning unit.

So go to the sample/SLU00 - LU Tutorial directory.
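If you want to confirm that the clone landed where you expect, a small script can list the directories that contain an environment.yml file. This is only a minimal sketch and not part of the course material: the function name find_learning_units is hypothetical, and the path you pass in depends on where you cloned the repository.

```python
from pathlib import Path


def find_learning_units(repo_root):
    """Return the directories under repo_root that contain an environment.yml file.

    Hypothetical helper for illustration; it is not shipped with the repository.
    """
    root = Path(repo_root)
    # rglob walks the tree recursively; the parent of each match is a
    # directory that holds an environment.yml file.
    return sorted(path.parent for path in root.rglob("environment.yml"))
```

Calling find_learning_units with the path of your local clone should list the sample learning unit directory, assuming the clone completed and the directory contains its environment.yml file.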

Creating a Conda Environment

With each learning unit you will be provided with an environment.yml file. It tells Anaconda all the packages the learning unit depends on and it will be used to create an Anaconda environment.

An environment is simply an isolated set of packages. When you run your code inside an environment you will have access to the same version of the packages the instructor used to create the notebooks.

Using the Graphical Interface
  1. Select "Environments"
  2. Select "Import"
  3. Set slu00 for the name and select the environment.yml file in the learning unit directory (the one in your batch3-workspace).
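Once the environment exists, you can check from Python which interpreter and environment your code is actually running in. The sketch below is an illustration under assumptions: describe_runtime is a hypothetical helper, and CONDA_DEFAULT_ENV is an environment variable that conda sets on activation, so it may be missing depending on how Python was launched.

```python
import os
import sys


def describe_runtime():
    """Report the interpreter version and the active conda environment, if any."""
    return {
        "python": "{}.{}".format(sys.version_info.major, sys.version_info.minor),
        # conda sets CONDA_DEFAULT_ENV when an environment is activated;
        # the value is None when the variable is not set.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # The interpreter path is a second hint: for a named conda
        # environment it usually contains the environment name.
        "executable": sys.executable,
    }
```

Running describe_runtime() in a notebook cell and seeing slu00 in either conda_env or executable suggests the notebook is using the environment you created. If neither mentions it, the notebook was probably started from a different environment.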

Working on the Learning Unit

All learning units come as a set of Jupyter Notebooks (and some links to presentations). Notebooks are documents that can contain text, images and live code that you can run interactively.

In this section we will launch the Jupyter Notebook application. The application is accessed through the web browser.

Once you have the application open, feel free to explore the structure of the sample learning unit. It will give you an idea of what to expect, what rules the instructors follow, and the effort they put in when creating a learning unit.

So let's start the Jupyter Notebook app.

Using the Graphical Interface
  1. Click the play button next to the newly created environment and select "Open with Jupyter Notebook"

The Exercise Notebook

Every learning unit contains an exercise notebook with exercises you will work on. So let's have a look at the sample Learning Unit.

  1. In the Jupyter Notebook UI in the browser, open the exercise notebook
  2. Follow the instructions provided in the notebook

Besides the exercises and the cells where you write your solutions, you will see other cells containing a series of assert statements. This is how we (and you) will determine whether a solution is correct. If all assert statements pass, meaning you don't get an AssertionError or any other kind of exception, the solution is correct.
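To make the idea concrete, here is a small illustration of how a solution cell and a checking cell fit together. The exercise is invented for this example and does not come from the sample learning unit.

```python
def mean(values):
    """Solution cell: return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Checking cell: each assert raises AssertionError if its condition is false,
# so reaching the end of the cell without an exception means the checks passed.
assert mean([1, 2, 3]) == 2
assert mean([10]) == 10
# Floating-point results are compared with a tolerance rather than with ==.
assert abs(mean([0.1, 0.2]) - 0.15) < 1e-9
```

If mean returned the wrong value, the first failing assert would stop the cell with an AssertionError and a traceback pointing at that line.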

Once you've solved the whole notebook, we recommend following this simple checklist to avoid surprises.

  1. Save the notebook (again)
  2. Run "Restart & Run All"
  3. At this point the notebook should have run without any failing assertions
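The reason for the checklist is that cells run out of order can leave state behind that hides a broken solution. The sketch below imitates "Restart & Run All" with only the standard library, to show what the check amounts to. It is an assumption-laden illustration, not how Jupyter itself executes notebooks: run_all_code_cells is a hypothetical helper, and it will fail on cells that use IPython magics (lines starting with % or !), since those are not plain Python.

```python
import json


def run_all_code_cells(notebook_path):
    """Execute every code cell of a notebook in order, in one fresh namespace.

    Returns the number of code cells executed. A failing assert or any other
    exception propagates to the caller, much like a failed "Restart & Run All".
    """
    with open(notebook_path, encoding="utf-8") as handle:
        notebook = json.load(handle)

    # A new dictionary plays the role of a freshly restarted kernel.
    namespace = {"__name__": "__main__"}
    executed = 0
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue  # skip markdown and raw cells
        source = cell.get("source", "")
        # The notebook format stores source as one string or a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        exec(compile(source, str(notebook_path), "exec"), namespace)
        executed += 1
    return executed
```

Because every cell shares one namespace that starts empty, a cell that depends on a variable you defined and later deleted will fail here, just as it would after a real restart.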

If you want to submit your notebook before it is all the way done to check intermediate progress, feel free to.

If you are able to go through the entire process and get a passing grade on the sample LU you'll have a good understanding of the same flow that you'll use for all LUs throughout the course.
