GitHub - krithvi/iris-dataset: A Machine Learning Project done to analyse the accuracy of ML algorithms with Iris dataset

KNOW I PROJECT:

IRIS DATASET:

Team name: Incredibles
Team Members:

Shruthakeerthy.S
Nihil
Madhumita
Kavya

The objective of this project is to classify the iris dataset by discovering patterns in the petal and sepal sizes of the flowers. The dataset was taken from Kaggle.

ABOUT THE DATASET:
The dataset contains 3 classes of 50 instances each, where each class refers to a type of iris plant. The dataset contains the following attributes:
1). sepal length in cm
2). sepal width in cm
3). petal length in cm
4). petal width in cm
5). class:
- Iris Setosa
- Iris Versicolour
- Iris Virginica

The species is predicted, from among the three classes, based on the first four features.
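The structure described above can be checked with a short sketch. It assumes the copy of the iris data bundled with scikit-learn (`load_iris`) rather than the Kaggle `Iris.csv` used in the project, so column names and ordering may differ from the notebook.

```python
from collections import Counter

from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled iris data stands in for the Kaggle Iris.csv.
iris = load_iris()
X, y = iris.data, iris.target

# Count how many instances belong to each class.
class_counts = Counter(str(iris.target_names[label]) for label in y)

print(iris.feature_names)   # the four measurements, in cm
print(X.shape)              # (number of instances, number of features)
print(dict(class_counts))   # instances per species
```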

OBSERVATIONS FROM DATA VISUALIZATIONS:
• After graphing the features in a pair plot, we can clearly see that iris setosa (pink) is well separated from the other two species.
• However, there is an overlap in the pairwise relationships of iris versicolour (dark blue) and iris virginica (light blue).
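The same pattern shows up numerically, without a plot. The sketch below compares the petal length range of each species. It is an illustration using scikit-learn's bundled copy of the data (an assumption, not the project's notebook code), and it picks petal length as one example feature.

```python
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled iris data stands in for the Kaggle Iris.csv.
iris = load_iris()
petal_length = iris.data[:, 2]  # column 2 is petal length in cm

# Record the (min, max) petal length for each species.
ranges = {}
for label, name in enumerate(iris.target_names):
    values = petal_length[iris.target == label]
    ranges[str(name)] = (float(values.min()), float(values.max()))
    print(f"{name}: {values.min():.1f} to {values.max():.1f} cm")

# Setosa's longest petal is shorter than versicolor's shortest petal,
# while the versicolor and virginica ranges overlap.
setosa_separated = ranges["setosa"][1] < ranges["versicolor"][0]
others_overlap = ranges["versicolor"][1] > ranges["virginica"][0]
print(setosa_separated, others_overlap)
```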

IMPLEMENTATION:
This project is implemented in Python using a Jupyter notebook. The models were trained using the Logistic Regression, K-Nearest Neighbors, Random Forest and SVC algorithms. Of these, SVC gave the highest accuracy, 97%.
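A minimal sketch of such a comparison is shown here. The split ratio, random seed, hyperparameters and the use of scikit-learn's bundled data are all assumptions, since the notebook's settings are not stated in this README. The accuracies it prints will therefore not necessarily match the 97% reported above.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Assumption: bundled iris data and a 70/30 stratified split with a fixed seed.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# The four algorithms named above, with mostly default hyperparameters.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbors": KNeighborsClassifier(),
    "Random Forest": RandomForestClassifier(random_state=0),
    "SVC": SVC(),
}

# Fit each model on the training split and score it on the held-out split.
accuracies = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: {accuracies[name]:.2f}")
```

Because the test split holds only 45 samples, one misclassified flower changes the accuracy by about two percentage points, so small differences between models should be read with caution.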

JUSTIFICATION:
• SVC is better than logistic regression because SVC tries to maximize the margin between the classes, as defined by the closest points (the support vectors), while logistic regression maximizes the posterior class probability.
• SVC is better than KNN because KNN classifies each sample directly from a distance metric over the stored training points, whereas SVM has a proper training phase. SVM training solves a convex optimization problem, so when the classes are separable it finds the maximum-margin boundary between them.
• SVC is better than random forest because random forest gives the probability of a sample being in each class, whereas SVC can give a clear margin of separation between classes.
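The distinction between margin-based and probability-based outputs drawn in the points above can be made concrete. In the sketch below, which uses scikit-learn's bundled data and default settings as assumptions, `SVC.decision_function` returns signed per-class scores, while `LogisticRegression.predict_proba` returns class probabilities that sum to 1.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Assumption: bundled iris data, default hyperparameters.
X, y = load_iris(return_X_y=True)

svc = SVC().fit(X, y)
log_reg = LogisticRegression(max_iter=1000).fit(X, y)

# SVC: one score per class for each sample (not probabilities).
margins = svc.decision_function(X[:3])

# Logistic regression: one probability per class for each sample.
probabilities = log_reg.predict_proba(X[:3])

print(np.round(margins, 2))
print(np.round(probabilities, 3))
```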

ADVANTAGES OF SVM:

SVM works relatively well when there is a clear margin of separation between classes.
SVM generalizes well to out-of-sample data. Prediction can also be fast: to classify one sample, the kernel function is evaluated only against the support vectors, not against every training sample.
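To see how many kernel evaluations a prediction needs, the number of support vectors in a fitted model can be inspected. This is a sketch using scikit-learn's bundled data and an RBF kernel, both assumptions, since the kernel used in the notebook is not stated here.

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Assumption: bundled iris data and an RBF kernel.
X, y = load_iris(return_X_y=True)
svc = SVC(kernel="rbf").fit(X, y)

# Each prediction evaluates the kernel once per support vector,
# so this count bounds the per-sample prediction cost.
n_support_vectors = svc.support_vectors_.shape[0]

print(f"training samples: {len(X)}")
print(f"support vectors:  {n_support_vectors}")
print(f"per class:        {svc.n_support_.tolist()}")
```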

