Merrick1307/matching-ml-agent

Teacher-Student Matching Automation

This project implements an automated scheduling system that matches students with teachers for 1:1 and group lessons, using machine learning (XGBoost) and an optimization algorithm (the Hungarian algorithm). It learns continuously: after each cycle it collects feedback, retrains the model, and uses the new model for the next schedule.


Features

  • Match students with teachers based on:

    • Subject overlap
    • Time availability
    • Teacher capacity limits
    • Student learning style ↔ teacher teaching style compatibility
  • Hybrid system:

    • Machine Learning (XGBoost regressor) predicts success scores.
    • Hungarian algorithm finds the optimal assignments.
  • Continuous learning pipeline:

    • Collects previous cycle feedback.
    • Appends to training dataset.
    • Retrains model and improves performance.
  • Automated weekly workflow using the schedule library.

  • Generates reports:

    • Schedules (schedule.csv)
    • Model versions & metrics (saved in models/)

Project Structure

.
├── app.py                     # Orchestrates workflow (matching + feedback + retraining)
├── main.py                    # Entry point with scheduling loop
├── schedule.csv               # Generated schedule after each run
├── models/                    # Trained models (saved as joblib files)
│   ├── model_v1.joblib
│   ├── model_v2.joblib
│   └── ...
├── data/
│   ├── students_large.csv     # Synthetic student dataset
│   ├── teachers_large.csv     # Synthetic teacher dataset
│   └── training_data_large.csv # Historical training data
├── utils/
│   ├── __init__.py
│   ├── matching_agent.py      # Core ML + matching logic
│   └── continuous_pipeline.py # Continuous learning wrapper
├── pyproject.toml             # Poetry dependencies
└── poetry.lock

⚙️ How It Works

1. Data Preprocessing

  • Students and teachers are loaded from CSV.
  • Comma-separated fields (subjects, time_slots) are normalized into lists.
  • Missing fields are filled with defaults (e.g., rating=4.5, learning_style="Visual").
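
As a rough illustration of the preprocessing step, the sketch below loads a small inline CSV with pandas, fills a missing learning_style with the default named above, and splits the comma-separated columns into lists. The sample rows and the helper names (split_field, preprocess) are hypothetical and are not taken from matching_agent.py.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real data lives in data/students_large.csv.
RAW_STUDENTS = """student_id,name,subjects,time_slots,learning_style
1,Ada,"Math, English","Morning,Afternoon",Visual
2,Funke,Math,Evening,
"""

DEFAULTS = {"learning_style": "Visual"}   # default named in the README
LIST_COLUMNS = ["subjects", "time_slots"]  # comma-separated fields


def split_field(value):
    """Turn 'Math, English' into ['Math', 'English']; missing values become []."""
    if pd.isna(value):
        return []
    return [part.strip() for part in str(value).split(",") if part.strip()]


def preprocess(df, list_columns=LIST_COLUMNS, defaults=DEFAULTS):
    """Return a copy with defaults filled in and list columns normalized."""
    df = df.copy()
    for column, default in defaults.items():
        if column in df.columns:
            df[column] = df[column].fillna(default)
        else:
            df[column] = default
    for column in list_columns:
        df[column] = df[column].apply(split_field)
    return df


students = preprocess(pd.read_csv(io.StringIO(RAW_STUDENTS)))
print(students[["student_id", "subjects", "time_slots", "learning_style"]])
```

The same function could be applied to the teacher file with a different defaults dictionary, for example {"rating": 4.5}.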

2. Feature Engineering

For each (student, teacher, subject, time_slot) candidate:

  • Grade level (elementary/middle/high)
  • Subject overlap ratio
  • Time preference rank
  • Learning style ↔ teaching style compatibility
  • Teacher experience & rating
  • One-hot encodings for subject & time slots
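
One possible shape for a feature row is sketched below. The README does not give the exact definitions, so everything here is an assumption: the overlap ratio is taken as shared subjects divided by the student's subjects, the time preference rank is the position of the slot in the student's list, and the vocabularies, grade cut-offs, field names, and style table are invented for illustration.

```python
SUBJECTS = ["Math", "English", "Science"]          # assumed vocabulary
TIME_SLOTS = ["Morning", "Afternoon", "Evening"]   # assumed vocabulary

# Hypothetical table: learning style -> teaching styles that suit it.
COMPATIBLE_STYLES = {
    "Visual": {"Visual", "Mixed"},
    "Auditory": {"Lecture", "Mixed"},
    "Kinesthetic": {"Hands-on", "Mixed"},
}


def grade_level(grade):
    """Bucket a numeric grade: 0=elementary, 1=middle, 2=high (assumed cut-offs)."""
    if grade <= 5:
        return 0
    if grade <= 8:
        return 1
    return 2


def build_features(student, teacher, subject, time_slot):
    """Build one feature dict for a (student, teacher, subject, time_slot) candidate."""
    wanted = student["subjects"]
    shared = set(wanted) & set(teacher["subjects"])
    slots = student["time_slots"]
    # Rank 0 is the most preferred slot; unlisted slots rank last.
    rank = slots.index(time_slot) if time_slot in slots else len(slots)
    suited = COMPATIBLE_STYLES.get(student["learning_style"], set())
    features = {
        "grade_level": grade_level(student["grade"]),
        "subject_overlap": len(shared) / len(wanted) if wanted else 0.0,
        "time_pref_rank": rank,
        "style_match": int(teacher["teaching_style"] in suited),
        "teacher_experience": teacher["experience_years"],
        "teacher_rating": teacher["rating"],
    }
    # One-hot encodings for the candidate subject and time slot.
    for name in SUBJECTS:
        features[f"subject_{name}"] = int(name == subject)
    for name in TIME_SLOTS:
        features[f"slot_{name}"] = int(name == time_slot)
    return features


student = {"grade": 7, "subjects": ["Math", "English"],
           "time_slots": ["Morning", "Afternoon"], "learning_style": "Visual"}
teacher = {"subjects": ["Math", "Science"], "teaching_style": "Mixed",
           "experience_years": 6, "rating": 4.5}
row = build_features(student, teacher, "Math", "Afternoon")
print(row)
```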

3. Model Training

  • Uses XGBoost Regressor to predict a success score (0–1).
  • Trained on historical matches (training_data_large.csv).
  • Evaluated with RMSE (root mean squared error).
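
The evaluation step can be sketched as follows: split the historical rows, fit a regressor on one part, and report RMSE on the held-out part. The split fraction, the seed, and the function names are assumptions. MeanBaseline is only a stand-in so the sketch runs without xgboost installed; it is not the project's model.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def train_and_evaluate(model, X, y, test_fraction=0.2, seed=0):
    """Fit `model` on a random training split; return (model, hold-out RMSE)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.random.default_rng(seed).permutation(len(y))
    n_test = max(1, int(len(y) * test_fraction))
    test_idx, train_idx = order[:n_test], order[n_test:]
    model.fit(X[train_idx], y[train_idx])
    return model, rmse(y[test_idx], model.predict(X[test_idx]))


class MeanBaseline:
    """Stand-in regressor: always predicts the mean training target."""

    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)


# Synthetic data, for illustration only.
rng = np.random.default_rng(1)
X_demo = rng.random((50, 4))
y_demo = rng.random(50)            # success scores in the 0-1 range
_, score = train_and_evaluate(MeanBaseline(), X_demo, y_demo)
print(f"hold-out RMSE: {score:.4f}")
```

Any object with fit and predict methods can be passed as model. With xgboost installed, xgboost.XGBRegressor(objective="reg:squarederror") exposes that interface; the hyperparameters the project actually uses are not stated here.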

4. Matching Algorithm

  • Builds a compatibility matrix from ML scores.

  • Expands teacher slots according to max_students_per_slot.

  • Uses Hungarian algorithm to find optimal matches.

  • Produces schedule with:

    • student_id, teacher_id, subject, time_slot, lesson_type (1:1 or group).
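
The matching step might look like the sketch below. It expands each teacher slot into one column per seat and solves the resulting assignment problem with scipy.optimize.linear_sum_assignment, which is SciPy's solver for the problem the Hungarian algorithm addresses. The function name, the min_score filter, and the rule that a slot shared by several students becomes a group lesson are assumptions; the score matrix is made up.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def match(scores, capacities, min_score=0.0):
    """Assign each student to at most one teacher slot.

    scores:     (n_students, n_slots) array of predicted success scores.
    capacities: seats per slot (the max_students_per_slot values).
    Returns a list of (student_index, slot_index, score, lesson_type).
    """
    scores = np.asarray(scores, dtype=float)
    # Expand every teacher slot into one column per available seat.
    seat_to_slot = np.repeat(np.arange(scores.shape[1]), capacities)
    expanded = scores[:, seat_to_slot]
    # maximize=True maximizes total score; rectangular matrices are allowed.
    rows, cols = linear_sum_assignment(expanded, maximize=True)
    pairs = [(int(r), int(seat_to_slot[c]), float(expanded[r, c]))
             for r, c in zip(rows, cols)
             if expanded[r, c] > min_score]  # drop infeasible pairings
    # Assumption: a slot holding more than one student is a group lesson.
    counts = {}
    for _, slot, _ in pairs:
        counts[slot] = counts.get(slot, 0) + 1
    return [(student, slot, score, "group" if counts[slot] > 1 else "1:1")
            for student, slot, score in pairs]


# Three students, two teacher slots; the first slot has two seats.
scores = [[0.9, 0.2],
          [0.8, 0.3],
          [0.1, 0.7]]
assignments = match(scores, capacities=[2, 1])
for student, slot, score, lesson_type in assignments:
    print(student, slot, score, lesson_type)
```

Duplicating columns keeps the problem a plain one-to-one assignment, so teacher capacity needs no special handling inside the solver.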

5. Continuous Learning

At the end of each cycle:

  • Simulates (or collects) feedback → updates training dataset.

  • Retrains ML model and saves a new version in models/.

  • Tracks progress:

    • Model versions
    • RMSE improvements
    • Match counts
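
Two pieces of this bookkeeping are sketched below: choosing the next file name in models/ and appending feedback rows to the training data. The helper names and the success_score column are hypothetical. Saving the model itself (for example with joblib.dump) is left out.

```python
import re
from pathlib import Path

MODEL_PATTERN = re.compile(r"^model_v(\d+)\.joblib$")


def next_model_path(models_dir):
    """Return models_dir/model_v{N+1}.joblib, where N is the highest saved version."""
    models_dir = Path(models_dir)
    versions = []
    for path in models_dir.glob("model_v*.joblib"):
        found = MODEL_PATTERN.match(path.name)
        if found:
            versions.append(int(found.group(1)))
    return models_dir / f"model_v{max(versions, default=0) + 1}.joblib"


def append_feedback(training_rows, schedule_rows, feedback_scores):
    """Return the training rows plus one labelled row per scheduled match."""
    new_rows = [dict(row, success_score=score)
                for row, score in zip(schedule_rows, feedback_scores)]
    return training_rows + new_rows


# Made-up rows, for illustration only.
training = [{"student_id": 1, "teacher_id": 2, "success_score": 0.9}]
schedule_rows = [{"student_id": 3, "teacher_id": 1}]
updated = append_feedback(training, schedule_rows, [0.85])
print(len(updated), updated[-1])
```

Parsing the version number rather than sorting file names avoids model_v10 being ordered before model_v2.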

6. Automated Scheduling

  • main.py runs the workflow once per week, on Fridays at 00:00, using the schedule library.
  • The loop checks for pending jobs every 15 minutes (time.sleep(900)), so it stays lightweight.
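
To make the timing concrete, the sketch below reimplements the weekly check with the standard library only. It is a stand-in that does not use the schedule library, and the function names are hypothetical. It shows why a 15-minute polling loop starts the job within 15 minutes after Friday 00:00 and then stays idle until the following week.

```python
from datetime import datetime, timedelta

FRIDAY = 4  # datetime.weekday(): Monday is 0


def most_recent_friday_midnight(now):
    """Latest Friday 00:00 that is not after `now`."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight - timedelta(days=(now.weekday() - FRIDAY) % 7)


def is_due(now, last_run):
    """True when a Friday 00:00 has passed since `last_run`."""
    return last_run < most_recent_friday_midnight(now)


# Simulate polling every 15 minutes around a Friday midnight.
last_run = datetime(2024, 1, 4, 23, 40)   # Thursday evening (process start)
now = datetime(2024, 1, 4, 23, 55)
runs = []
for _ in range(4):
    if is_due(now, last_run):
        runs.append(now)                   # the weekly workflow would run here
        last_run = now
    now += timedelta(minutes=15)
print(runs)
```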

Installation

  1. Clone repo:

    git clone https://github.com/yourname/student-teacher-matching.git
    cd student-teacher-matching
  2. Install dependencies with Poetry:

    poetry install --no-root

Dependencies:

  • pandas, numpy, scikit-learn
  • xgboost
  • scipy
  • joblib
  • schedule

▶ Usage

Run once (manual cycle):

poetry run python app.py

Run continuously (weekly automation):

poetry run python main.py

This will:

  • Generate/update schedule.csv.
  • Save updated model in models/.
  • Retrain continuously over time.

Example Output

schedule.csv

student_id,student_name,teacher_id,teacher_name,subject,time_slot,lesson_type,compatibility_score
1,Ada,2,Mrs. Ama,English,Morning,1:1,0.92
3,Funke,1,Mr. Obi,Math,Morning,group,0.85
5,Chinedu,3,Mr. John,Math,Afternoon,1:1,0.88

Metrics Report

At the end of each cycle, the system prints:

=== LEARNING PROGRESS SUMMARY ===
Cycles completed: 3
Model versions: 1 -> 3
Total RMSE improvements: +8.20%
Average matches per cycle: 15.7
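
A summary in this format could be assembled from per-cycle records as sketched below. The record keys are hypothetical, and the improvement figure is assumed to be the relative RMSE reduction from the first cycle to the last; the README does not state the formula. The history values are made up and chosen only so that the output has the same shape as the sample above.

```python
def summarize(cycles):
    """cycles: list of dicts with 'model_version', 'rmse', and 'matches' keys."""
    first, last = cycles[0], cycles[-1]
    # Assumed definition: relative RMSE reduction, first cycle to last.
    improvement = (first["rmse"] - last["rmse"]) / first["rmse"] * 100
    avg_matches = sum(c["matches"] for c in cycles) / len(cycles)
    return "\n".join([
        "=== LEARNING PROGRESS SUMMARY ===",
        f"Cycles completed: {len(cycles)}",
        f"Model versions: {first['model_version']} -> {last['model_version']}",
        f"Total RMSE improvements: {improvement:+.2f}%",
        f"Average matches per cycle: {avg_matches:.1f}",
    ])


# Hypothetical history, not measured results.
history = [
    {"model_version": 1, "rmse": 0.2500, "matches": 15},
    {"model_version": 2, "rmse": 0.2400, "matches": 16},
    {"model_version": 3, "rmse": 0.2295, "matches": 16},
]
report = summarize(history)
print(report)
```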

Pipeline Diagram

flowchart TD
    A[Student Data students_large.csv] --> C[Preprocessing]
    B[Teacher Data teachers_large.csv] --> C
    C --> D[Feature Engineering]
    D --> E[ML Model XGBoost]
    E --> F[Hungarian Algorithm]
    F --> G[Schedule schedule.csv]
    
    G --> H[Feedback Collection]
    H --> I[Historical Training Data training_data_large.csv]
    I --> E

    style A fill:#f9f,stroke:#333,stroke-width:1px
    style B fill:#f9f,stroke:#333,stroke-width:1px
    style C fill:#bbf,stroke:#333,stroke-width:1px
    style D fill:#bbf,stroke:#333,stroke-width:1px
    style E fill:#bfb,stroke:#333,stroke-width:1px
    style F fill:#bfb,stroke:#333,stroke-width:1px
    style G fill:#ffb,stroke:#333,stroke-width:1px
    style H fill:#fdd,stroke:#333,stroke-width:1px
    style I fill:#fdd,stroke:#333,stroke-width:1px