Skip to content

Latest commit

 

History

History
161 lines (106 loc) · 3.7 KB

File metadata and controls

161 lines (106 loc) · 3.7 KB

Student Performance Predictor


Project Overview

Student Performance Predictor is intended as an educational project to demonstrate the end-to-end workflow of a supervised learning task:

  • Data collection / cleaning
  • Feature engineering
  • Model training and hyperparameter tuning
  • Evaluation and visualization

Use this repository to reproduce results, extend the model, or integrate additional data and features.


Features

  • Data preprocessing utilities
  • Exploratory Data Analysis (EDA) notebooks
  • Baseline models (e.g., logistic regression, decision tree, random forest)
  • Model evaluation scripts and visualization
  • (Optional) demo / app to try predictions interactively

Tech Stack

  • Language: Python 3.8+
  • Data: pandas, numpy
  • Modeling: scikit-learn
  • Visualization: matplotlib, seaborn
  • Notebooks: Jupyter
  • (Optional) Web demo: Streamlit or Flask (if included)
  • Dev tools: pip / virtualenv

Add or adjust versions in requirements.txt as needed.


Repository Structure

This is a suggested/typical layout — adapt if actual repo differs:

  • data/ - datasets (gitignored if large)
  • notebooks/ - analysis & experiments (Jupyter notebooks)
  • src/ - data processing and model training code
  • models/ - saved model artifacts
  • requirements.txt - Python dependencies
  • README.md - this file

Getting Started

Prerequisites

  • Python 3.8 or newer
  • git
  • (Optional) virtualenv or conda

Clone the repository

Open a terminal and run:

git clone https://github.com/NihalDR/Student-Performance-Predictor.git
cd Student-Performance-Predictor

Install dependencies

Create and activate a virtual environment, then install:

# using venv
python -m venv .venv
# macOS / Linux
source .venv/bin/activate
# Windows (PowerShell)
.venv\Scripts\Activate.ps1

pip install --upgrade pip
pip install -r requirements.txt

If there is no requirements.txt, install the minimum tools:

pip install pandas numpy scikit-learn matplotlib seaborn jupyter

Run notebooks / scripts

Start Jupyter to view notebooks:

jupyter notebook
# or
jupyter lab

Run a training script (example):

python src/train.py --config config/train.yaml

If there is a demo app (Streamlit example):

streamlit run app.py
# or for Flask
python app.py

Adjust commands to match scripts present in the repository.


Dataset

Place your dataset file(s) inside the data/ folder. Example expected location:

  • data/students.csv

If using a public dataset (e.g., UCI Student Performance dataset), include a copy or a link in data/README.md. Ensure large datasets are not committed to git — prefer instructions to download or a script to fetch them.


Training & Evaluation

  1. Preprocess the data (scripts in src/preprocess.py or notebooks).
  2. Train models (src/train.py) and save best models to models/.
  3. Evaluate performance using cross-validation and holdout test set, and visualize metrics in notebooks/ or src/evaluate.py.

Common evaluation metrics:

  • For regression: RMSE, MAE, R^2
  • For classification: Accuracy, Precision, Recall, F1, ROC-AUC


License

This project does not include a license by default. To make contributions and reuse clearer, add a LICENSE file (e.g., MIT License). Example:

MIT License
See the LICENSE file for details.

Contact

For questions, issues, or feature requests, please open a GitHub Issue in this repository or contact the owner: @NihalDR.

Happy modeling!