This repository is a comprehensive collection of Data Science and Machine Learning projects.
It demonstrates:
✨ Data Analysis
🤖 Machine Learning
📊 Statistical Testing (ANOVA)
📈 Advanced Visualization
🎬 Animated Graphs
🌍 Real-world datasets
📦 Project Root
- 📂 codes → Jupyter notebooks (ML, EDA, visualization)
- 📂 data → Real-world datasets (CSV, Excel)
- 📜 README.md
✔ ANOVA testing
✔ Survival prediction
✔ Confusion matrix + ROC curve
✔ Advanced visualization
✔ Engagement analysis
✔ Trend detection
✔ Histogram & pie charts
✔ Player performance
✔ Career graph
✔ Team prediction model
✔ Monthly trending
✔ Genre popularity
✔ Viewer insights
✔ Magnitude distribution
✔ Time trend analysis
✔ Geographic insights
✔ Future trend modeling
✔ Dataset generation
✔ Visualization forecasting
✔ Logistic Regression
✔ Linear Regression
✔ Classification & Prediction
✔ Model evaluation
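The classification and evaluation items above can be sketched roughly as follows. This is a minimal illustration, not code taken from the notebooks: the synthetic data from `make_classification` is an assumption that stands in for a real dataset, and the split size and random seed are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# The confusion matrix uses hard predictions; ROC AUC uses predicted probabilities.
cm = confusion_matrix(y_test, model.predict(X_test))
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(cm)
print(f"ROC AUC: {auc:.3f}")
```

Rows of the confusion matrix are true classes and columns are predicted classes, so the diagonal counts the correct predictions.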
- Pairplots
- Heatmaps
- Violin plots
- KDE plots
- 3D Visualizations
- Animated graphs
Dataset loaded using:
import seaborn as sns
df = sns.load_dataset("titanic")
- Removed missing values
- Selected relevant columns
- Prepared dataset for statistical testing
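The preparation steps above could look roughly like the sketch below. The column choice (`survived`, `sex`, `class`) is an assumption based on the ANOVA formula used in this README, and `prepare_for_anova` is a hypothetical helper name, not a function from the notebooks. A tiny hand-made frame stands in for the real dataset so the snippet runs without a download.

```python
import pandas as pd

def prepare_for_anova(df: pd.DataFrame) -> pd.DataFrame:
    """Keep the columns needed for the test and drop incomplete rows."""
    cols = ["survived", "sex", "class"]  # assumed columns, taken from the ANOVA formula
    out = df[cols].dropna().copy()
    # Treat the grouping variables as categorical factors.
    out["sex"] = out["sex"].astype("category")
    out["class"] = out["class"].astype("category")
    return out.reset_index(drop=True)

# Hypothetical stand-in rows; in a notebook this would be sns.load_dataset("titanic").
raw = pd.DataFrame({
    "survived": [1, 0, 1, 0, 1],
    "sex": ["female", "male", None, "male", "female"],
    "class": ["First", "Third", "Second", None, "Third"],
    "age": [29.0, None, 41.0, 35.0, 22.0],
})
clean = prepare_for_anova(raw)
print(clean)
```

Selecting the columns before calling `dropna()` matters: a missing `age` no longer discards a row that is otherwise complete for the test.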
The project includes multiple visualizations:
- 📊 Bar plots
- 📉 Histograms
- 📦 Boxplots
- 🎬 Animated survival visualization
Example:
sns.barplot(x="class", y="survived", hue="sex", data=df)
3️⃣ Statistical Analysis (Two-Way ANOVA)
This analysis tests:
- Effect of Gender
- Effect of Passenger Class
- Interaction between Gender × Class
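Before fitting a model, a table of cell means gives a feel for what these three effects look like. The sketch below uses made-up rows (not results from the Titanic data), and `survival_table` is a hypothetical helper name.

```python
import pandas as pd

def survival_table(df: pd.DataFrame) -> pd.DataFrame:
    """Mean of `survived` for every sex x class cell, plus marginal means."""
    return pd.pivot_table(
        df, values="survived", index="sex", columns="class",
        aggfunc="mean", margins=True, margins_name="all",
    )

# Made-up rows for illustration only.
demo = pd.DataFrame({
    "survived": [1, 1, 0, 1, 0, 0, 1, 0],
    "sex": ["female", "female", "female", "female", "male", "male", "male", "male"],
    "class": ["First", "First", "Third", "Third", "First", "First", "Third", "Third"],
})
table = survival_table(demo)
print(table)
```

The `all` column compares genders, the `all` row compares classes, and an interaction shows up when the gap between the gender rows changes from one class column to the next.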
Example model:
from statsmodels.formula.api import ols
model = ols('survived ~ C(sex) * C(Q("class"))', data=df).fit()
🎬 Animated Visualization
This project also includes animated plots using Matplotlib.
import matplotlib.animation as animation
Animated plots dynamically display survival rates across classes.
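One way such an animation might be built is with `FuncAnimation`, growing one bar per class up to its survival rate. This is a sketch under assumptions, not the notebook's code: the rates are approximate, rounded values used for illustration and would normally be computed from `df`, and the frame count and interval are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit in a notebook
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

# Approximate survival rates per class, for illustration; recompute from df in practice.
classes = ["First", "Second", "Third"]
rates = [0.63, 0.47, 0.24]
n_frames = 20

fig, ax = plt.subplots()
bars = ax.bar(classes, [0.0] * len(classes))
ax.set_ylim(0, 1)
ax.set_ylabel("Survival rate")

def update(frame: int):
    """Grow each bar linearly until it reaches its final height."""
    fraction = (frame + 1) / n_frames
    for bar, rate in zip(bars, rates):
        bar.set_height(rate * fraction)
    return list(bars)

anim = FuncAnimation(fig, update, frames=n_frames, interval=100, repeat=False)
# anim.save("survival.gif", writer="pillow")  # optional; needs Pillow installed
```

In a notebook, `anim.to_jshtml()` wrapped in `IPython.display.HTML` is one option for showing the result inline.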
Python | Pandas | NumPy | Seaborn | Matplotlib | Plotly | Scikit-learn
- Python
- Pandas
- Seaborn
- Matplotlib
- Statsmodels
- Jupyter Notebook
📷 File Structure
├── 📁 .ipynb_checkpoints
│ ├── 📄 accident_predict-checkpoint.ipynb
│ ├── 📄 gender-checkpoint.ipynb
│ ├── 📄 ios_android-checkpoint.ipynb
│ ├── 📄 kolkata-checkpoint.ipynb
│ ├── 📄 mock_test-checkpoint.ipynb
│ ├── 📄 testing-checkpoint.ipynb
│ └── 📄 train-checkpoint.ipynb
├── 📁 codes
│ ├── 📁 .ipynb_checkpoints
│ │ ├── 📄 advanced.ipynb
│ │ ├── 📄 climate.ipynb
│ │ ├── 📄 earth_quake.ipynb
│ │ ├── 📄 phone-pay_razar_paypal-checkpoint.ipynb
│ │ ├── 📄 test-checkpoint.ipynb
│ │ ├── 📄 titanic-checkpoint.ipynb
│ │ ├── 📄 train-checkpoint.ipynb
│ │ └── 📄 visual.ipynb
│ ├── 📄 JIS.ipynb
│ ├── 📄 accident_predict.ipynb
│ ├── 📄 adavance_pd.ipynb
│ ├── 📄 assigment1.ipynb
│ ├── 📄 breast_cancer.ipynb
│ ├── 📄 earth_quake_2.ipynb
│ ├── 📄 ecomic_gwrth.ipynb
│ ├── 📄 finalproject.ipynb
│ ├── 📄 gender.ipynb
│ ├── 📄 ios_android.ipynb
│ ├── 📄 ipl.ipynb
│ ├── 📄 jis_university.db
│ ├── 📄 jis_university_students.xlsx
│ ├── 📄 kolkata.ipynb
│ ├── 📄 match.ipynb
│ ├── 📄 mock_test.ipynb
│ ├── 📄 netflix.ipynb
│ ├── 📄 new_titanic.ipynb
│ ├── 📄 personl_data.ipynb
│ ├── 📄 phone-pay_razar_paypal.ipynb
│ ├── 📄 socia1l.ipynb
│ ├── 📄 social.ipynb
│ ├── 📄 social_media_usage.xlsx
│ ├── 📄 social_use.ipynb
│ ├── 📄 student_fruit.ipynb
│ ├── 📄 testing.ipynb
│ ├── 📄 testing3.ipynb
│ ├── 📄 titanic.ipynb
│ ├── 📄 titanic2.ipynb
│ ├── 📄 titanic3.ipynb
│ ├── 📄 titanic4ANOVA.ipynb
│ └── 📄 train.ipynb
├── 📁 data
│ ├── (all datasets used by the .ipynb notebooks)
└── 📝 README.md
🚀 How to Run the Project
Clone the repository:
git clone https://github.com/yourusername/titanic-analysis.git
Install the required libraries:
pip install pandas numpy seaborn matplotlib plotly scikit-learn statsmodels notebook
Launch Jupyter Notebook:
jupyter notebook
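If a notebook fails on an import, a quick check like the one below can show which libraries are missing from the current environment. It is a small sketch that uses only the standard library; `missing_packages` is a hypothetical helper, and the `REQUIRED` list is an assumption based on the libraries named in this README.

```python
from importlib.util import find_spec

# Import names of the libraries named in this README (sklearn is scikit-learn).
REQUIRED = ["pandas", "numpy", "seaborn", "matplotlib", "statsmodels", "plotly", "sklearn"]

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]

missing = missing_packages(REQUIRED)
print("Missing:", ", ".join(missing) if missing else "none")
```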
👨‍💻 Author
Amit Paul
💻 Data Science | Python | Machine Learning
⭐ Support
If you like this project:
- ⭐ Star the repository
- 🍴 Fork the project
- 🚀 Share it with others