
ga_hw

Homework goes here something something end

DAT10 Course Repository

Course materials for General Assembly's Data Science course in Washington, DC (11/30/15 - 02/22/16).

Instructor: Keegan Hines (blog, [github](https://github.com/keeganhines), twitter)

| Monday | Wednesday |
| --- | --- |
| 11/30: Introduction to Data Science | 12/02: Command Line, Version Control |
| 12/07: Data Reading and Cleaning | 12/09: Exploratory Data Analysis |
| 12/14: Visualization | 12/16: Machine Learning |
| 12/21: Getting Data | 12/23: No Class |
| 12/28: K-Nearest Neighbors | 12/30: Basic Model Evaluation |
| 01/04: Linear Regression | 01/06: First Project Presentation |
| 01/11: Logistic Regression | 01/13: Advanced Model Evaluation |
| 01/18: No Class | 01/20: Naive Bayes and Text Data |
| 01/25: Natural Language Processing | 01/27: Kaggle Competition |
| 02/01: Decision Trees | 02/03: Ensembling |
| 02/08: Advanced scikit-learn, Clustering | 02/10: Regularization, Regex |
| 02/15: No Class | 02/17: Course Review |
| 02/22: Final Project Presentations | |

Python Resources

Local Data Science Resources

| Resource | Web | Cost | Notes |
| --- | --- | --- | --- |
| Data Community DC (DC2) | http://www.datacommunitydc.org/ | $ | DC2 is an umbrella organization of several local Meetup groups, each focused on a different aspect of data science. These Meetups are almost always free to attend. |
| DataSociety | http://datasociety.co/ | $$ | Introductory online data science courses with a focus on R. |
| District Data Labs | http://www.districtdatalabs.com/#!workshops/cwef | $$ | Weekend workshops and online courses, each focused on an advanced data science concept. They also have a part-time incubator program where participants collaborate on a final project. |
| General Assembly | https://generalassemb.ly/education/data-science | $$$ | Part-time classes. You're already here! Good job! |
| Academic | https://gradanalytics.georgetown.edu, http://datasci.columbian.gwu.edu/, http://volgenau.gmu.edu/data-analytics-engineering | $$$$ | Many universities now offer master's degrees and certificate programs in data science. These are obviously quite expensive. |

Class 1: Introduction to Data Science

Homework:

  • Find and bring in a dataset that is professionally relevant to you.
  • Work through GA's friendly command line tutorial using Terminal (Linux/Mac) or Git Bash (Windows).
  • Read through this command line reference, and complete the pre-class exercise at the bottom. (There's nothing you need to submit once you're done.)
  • Watch videos 1 through 8 (21 minutes) of Introduction to Git and GitHub, or read sections 1.1 through 2.2 of Pro Git.
  • If your laptop has any setup issues, please work with us to resolve them by Wednesday. If your laptop has not yet been checked, you should come early on Wednesday, or just walk through the setup checklist yourself (and let us know you have done so).

Resources:


Class 2: Command Line, Git, and Python Review

  • Command line concepts and exercises (code)
  • Git and GitHub (slides)
  • Python Fundamentals
    • Python interfaces (shell, script, IDEs, notebooks)
    • Fundamental programming concepts (notebook)
    • Extra Python review

Homework:

Git and Markdown Resources:

  • Pro Git is an excellent book for learning Git. Read the first two chapters to gain a deeper understanding of version control and basic commands.
  • If you want to practice a lot of Git (and learn many more commands), Git Immersion looks promising.
  • If you want to understand how to contribute on GitHub, you first have to understand forks and pull requests.
  • GitRef is my favorite reference guide for Git commands, and Git quick reference for beginners is a shorter guide with commands grouped by workflow.
  • Cracking the Code to GitHub's Growth explains why GitHub is so popular among developers.
  • Markdown Cheatsheet provides a thorough set of Markdown examples with concise explanations. GitHub's Mastering Markdown is a simpler and more attractive guide, but is less comprehensive.

Command Line Resources:

  • If you want to go much deeper into the command line, Data Science at the Command Line is a great book. The companion website provides installation instructions for a "data science toolbox" (a virtual machine with many more command line tools), as well as a long reference guide to popular command line tools.
  • If you want to do more at the command line with CSV files, try out csvkit, which can be installed via pip.
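For comparison with the csvkit suggestion above, here is a minimal Python sketch of two operations that csvkit tools such as csvcut and csvgrep handle at the command line: choosing columns and filtering rows. It uses only Python's standard `csv` module. The sample data and the helper names `select_columns` and `filter_rows` are invented for illustration and are not part of csvkit or the course materials.

```python
import csv
import io

# Hypothetical sample data, invented for illustration only.
SAMPLE = """name,city,score
Ana,DC,90
Ben,NYC,85
Cy,DC,70
"""


def select_columns(rows, columns):
    """Keep only the named columns from each row (similar in spirit to csvcut)."""
    return [{col: row[col] for col in columns} for row in rows]


def filter_rows(rows, column, value):
    """Keep rows whose column exactly equals value (a simplified take on csvgrep)."""
    return [row for row in rows if row[column] == value]


# DictReader yields one dict per data row, keyed by the header line.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
dc_rows = filter_rows(rows, "city", "DC")
result = select_columns(dc_rows, ["name", "score"])

for row in result:
    print(row["name"], row["score"])
```

To run this against a real file, replace `io.StringIO(SAMPLE)` with an open file handle, for example `open("data.csv", newline="")`, where `data.csv` is a placeholder name.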
