Regression Example.txt
The moment you sit in a cab, you see that there's a fixed amount on the meter. It says $2.50. Whether the cab moves or you get off right away, this is what you owe the driver the moment you step in. That's a constant: you have to pay that amount if you have stepped into a cab. Then, as the cab starts moving, the fare increases by a certain amount for every meter or hundred meters. So there's a relationship between distance and the amount you pay above and beyond that constant. And if you're not moving and you're stuck in traffic, then you pay more for every additional minute. So as the minutes increase, your fare increases. As the distance increases, your fare increases. And while all this is happening, you've already paid a base fare, which is the constant. This is what regression is. Regression tells you what the base fare is, what the relationship is between time and the fare you have paid, and what the relationship is between the distance you have traveled and the fare you have paid. Without knowing those relationships, and knowing only how far and how long people traveled and how much they paid, regression lets you compute the constant you didn't know (that it's $2.50), and it computes the relationship between the fare and the distance and between the fare and the time. That is regression.
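The cab example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the $2.50 base fare comes from the notes, but the per-km and per-minute rates and the trip data are made up, and the helper names (solve, fit_linear) are hypothetical. The fit uses ordinary least squares via the normal equations, which is one common way to do linear regression, not necessarily the method the course uses.

```python
# Assumed rates used only to generate example fares. The 2.50 base fare is
# from the notes; the per-km and per-minute rates are invented for this sketch.
BASE_FARE, PER_KM, PER_MIN = 2.50, 1.20, 0.40

# Hypothetical trips: (distance in km, minutes stuck in traffic).
trips = [(1.0, 0.0), (2.0, 3.0), (5.0, 1.0), (3.0, 10.0), (8.0, 4.0), (0.5, 6.0)]
fares = [BASE_FARE + PER_KM * d + PER_MIN * t for d, t in trips]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_linear(rows, targets):
    """Least-squares fit of targets = w0 + w1*x1 + w2*x2 + ... via (X^T X) w = X^T y."""
    X = [[1.0, *row] for row in rows]  # leading 1.0 column models the constant
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(k)]
    return solve(xtx, xty)


# Regression sees only the trips and the fares, and recovers the three numbers.
base, per_km, per_min = fit_linear(trips, fares)
print(f"base fare: {base:.2f}, per km: {per_km:.2f}, per minute: {per_min:.2f}")
```

Because the example fares contain no noise, the fit recovers the assumed rates almost exactly. With real meter data the estimates would only be approximate.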
The typical work day for a Data Scientist varies depending on what type of project they are working on.
Many algorithms are used to bring out insights from data.
Accessing algorithms, tools, and data through the Cloud enables Data Scientists to stay up-to-date and collaborate easily.
"Data science and business analytics have always been very hot subjects": false. They became hot subjects only after 2012.
ts, who are mostly aspiring data scientists, need to learn many tools such as Python, UNIX commands, pandas, and Jupyter notebook.
Big data was started by Google when Google tried to figure out how to solve their PageRank algorithm.
Big data is data that is large enough, and has enough volume and velocity, that you cannot handle it with traditional database systems.
The output of a data mining exercise largely depends on the quality of the data.
Use cases of deep learning: speech recognition, classifying images at a large scale.
Linear algebra
ML in prediction: Netflix, shopping, and in banks on the basis of transaction history.
Created a free account in IBM Watson and used the image recognition software.
Data Science helps physicians provide the best treatment for their patients, and helps meteorologists predict the extent of local weather events, and can even help predict natural disasters like earthquakes and tornadoes.
Companies can start on their data science journey by capturing data. Once they have data, they can begin analysing it.
Some ways that data is generated by consumers.
How businesses like Netflix, Amazon, UPS, Google, and Apple use the data generated by their consumers and employees.
The purpose of the final deliverable of a Data Science project is to communicate new information and insights from the data analysis to key decision-makers.
A brilliant data scientist who is passionate about the field of IT won't necessarily excel in the field of healthcare if they are not passionate about it.
Career tips and Motivation Tips
Data Scientists need programming, mathematics, and database skills, many of which can be gained through self-learning.
Companies recruiting for a Data Science team need to understand the variety of different roles Data Scientists can play, and look for soft skills like storytelling and relationship building as well as technical skills.
High school students considering a career in Data Science should learn programming, math, databases, and, most importantly, practice their skills.
summary, detailed contents, acknowledgements, references and appendices.
The report should present a thorough analysis of the data and communicate the project findings.
Cover page
Table of contents
Introductory section
Methodology section
Results section
Discussion section
Conclusion section
References
Acknowledgment
Appendices
Languages to use in Course 2
Heavy users of Python include IBM, Wikipedia, Google, Yahoo!, CERN, NASA, Facebook, Amazon, Instagram, Spotify, and Reddit. Python is a powerful general-purpose programming language that can do a lot of things. It is widely supported by a global community and shepherded by the Python Software Foundation.
1. Python is a high-level general-purpose programming language that can be applied to many different classes of problems.
2. It has a large standard library that provides tools suited to many different tasks, including but not limited to databases, automation, web scraping, text processing, image processing, machine learning, and data analytics.
3. For data science, you can use Python's scientific computing libraries such as Pandas, NumPy, SciPy, and Matplotlib.
4. For artificial intelligence, it has TensorFlow, PyTorch, Keras, and Scikit-learn.
5. Python can also be used for Natural Language Processing (NLP) using the Natural Language Toolkit (NLTK).
So if Python is open source and R is free software, what’s the difference? Well, both open source and free software commonly refer to the same set of licenses. Many open source projects use the GNU General Public License, for example. Both open source and free software support collaboration. In many cases (but not all), these terms can be used interchangeably. The Open Source Initiative (OSI) champions open source while the Free Software Foundation (FSF) defines free software.
Who is R for? It's most often used by statisticians, mathematicians, and data miners for developing statistical software, graphing, and data analysis. The language’s array-oriented syntax makes it easier to translate from math to code.
R is popular in academia but companies that use R include IBM, Google, Facebook, Microsoft, Bank of America, Ford, TechCrunch, Uber, and Trulia. R has become the world’s largest repository of statistical knowledge.
Data Visualization is part of an initial data exploration process, as well as being part of a final deliverable. Model Building is the process of creating a machine learning or deep learning model using an appropriate algorithm with a lot of data. Model deployment makes such a machine learning or deep learning model available to third-party applications. Model monitoring and assessment ensures continuous performance quality checks on the deployed models.
Watson Studio and Watson OpenScale together cover the complete life cycle, with all tools in a single hub.
In video #5 we’ll look at packages, APIs, datasets, and models for data science.
Libraries: collections of functions and code
Python visualization libraries: Matplotlib (graphs and plots) and Seaborn (heat maps)
Python scientific libraries: pandas and NumPy for math operations
ML library: scikit-learn (e.g. regression)
Deep learning libraries: Keras, a high-level interface to build models (a low-level environment is required, e.g. TensorFlow), good for experimentation; PyTorch to test ideas
Apache Spark: compute clusters that process data in parallel
Can be used from Python, R, or Scala. Scala libraries used in DS include e.g. Vegas (statistics) and BigDL
R libraries
ggplot2 for data viz, plus interfaces to Keras and TensorFlow. R was the de facto standard but is now superseded by Python.
API
REST API: request/response
An API lets two pieces of software talk to each other.
With a REST API, software communicates over the internet and takes advantage of great data access, AI algorithms, and many other resources.
REST stands for Representational State Transfer. API information is written in JSON.
Watson Text to Speech API
Watson Language Translator API: GET/POST.
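The request/response idea in the API notes above can be sketched with Python's standard library. This is a minimal illustration and not the real Watson API: the URL, the payload fields, and the response shape are all hypothetical, and nothing is actually sent over the network. It only shows how a client packs its request as JSON for a POST and unpacks a JSON response.

```python
import json
import urllib.request

# Hypothetical endpoint, for illustration only (not a real Watson URL).
API_URL = "https://api.example.com/v1/translate"


def build_translation_request(text, source, target):
    """Build (but do not send) a POST request whose body is JSON."""
    payload = {"text": [text], "source": source, "target": target}  # assumed fields
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_translation_response(raw_bytes):
    """Decode a JSON response body into a Python dictionary."""
    return json.loads(raw_bytes.decode("utf-8"))


# Client side: the request that would go to the service.
request = build_translation_request("Hello", "en", "es")

# A made-up response body standing in for what a service might return.
sample_response = b'{"translations": [{"translation": "Hola"}]}'
result = parse_translation_response(sample_response)
print(request.get_method(), request.full_url)
print(result["translations"][0]["translation"])
```

To call a real service, the request would be passed to urllib.request.urlopen together with whatever authentication that service requires. That step is left out here because it depends on the specific API.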
##Data Sets Powering DS
Data set: a structured collection of data, e.g. image, audio, or video files
Tabular data: rows and columns, e.g. CSV
Hierarchical or network data represent relationships, e.g. social networking nodes
Private data ownership
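The tabular and network shapes of data noted above can be sketched with Python's standard library. The names, values, and the degree helper below are invented for illustration. The tabular part reads CSV text into rows keyed by column name, and the network part stores relationships as a dictionary mapping each node to its connections.

```python
import csv
import io

# Tabular data: rows and columns. This CSV text is made up for the example.
csv_text = "name,age,city\nAna,31,Lisbon\nBen,27,Toronto\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))  # one dict per row

# Network data: each node maps to the nodes it is connected to,
# as in a small (hypothetical) social network.
friends = {
    "Ana": ["Ben", "Cy"],
    "Ben": ["Ana"],
    "Cy": ["Ana"],
}


def degree(graph, node):
    """Number of direct connections a node has."""
    return len(graph[node])


for row in rows:
    print(row["name"], row["city"], "connections:", degree(friends, row["name"]))
```

In the tabular form every record has the same columns. In the network form the interesting information is in the links between records.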