ak-abdullah/Artificial-Intelligence

AI-Based Student Clustering and Exam Report Generator

Python scikit-learn Pandas ReportLab

Automated exam seating plan generator for universities. Takes student and faculty data, clusters students by domain and batch using K-Means, assigns rooms, maps faculty supervisors, and generates PDF seating plans for each exam slot.

Built for FAST-NUCES with 2500 students across 5 departments and 5 batch years.


⚡ How it works

flowchart TD
    A[Load student and faculty CSVs] --> B[Encode domain and batch features]
    B --> C[MinMaxScaler normalization]
    C --> D[K-Means clustering with 5 clusters]
    D --> E[Assign students to rooms\n30-35 students per room]
    E --> F[Map faculty supervisors by domain]
    F --> G[Split by exam time slot]
    G --> H[Sort students by roll number within rooms]
    H --> I[Generate student and faculty PDFs]

    style A fill:#1e293b,color:#f8fafc,stroke:#334155
    style D fill:#0f172a,color:#f8fafc,stroke:#6366f1
    style I fill:#0f172a,color:#f8fafc,stroke:#22c55e

🛠️ Stack

| Layer | Technology |
| --- | --- |
| Clustering | scikit-learn (K-Means) |
| Data Processing | pandas |
| Feature Scaling | MinMaxScaler |
| PDF Generation | ReportLab |
| Data Generation | Faker |
| Environment | Python |

📂 Dataset

Student dataset — 2500 students across 5 departments

| Department | Count |
| --- | --- |
| Software Engineering | 700 |
| Business Analytics | 600 |
| Artificial Intelligence | 500 |
| Computer Science | 400 |
| Electrical Engineering | 300 |

Batches: 19, 20, 21, 22, 23

Faculty dataset — name, department, assigned to clusters by domain


📁 Project structure

ai-clustering/
├── question_3.py          # main script — clustering, room assignment, PDF generation
├── data_generation_used.py  # synthetic student data generator
├── Faculty_data_generator.py  # faculty data generator
├── pdf.py                 # PDF generation utilities
├── student_data.csv       # student dataset
├── faculty.csv            # faculty dataset
└── delete_files.py        # cleanup utility

🚀 Running locally

pip install scikit-learn pandas reportlab faker
python question_3.py

When prompted, enter the exam time slot name. The script generates:

  • {slot}_students.csv and {slot}_students.pdf — student seating plan
  • {slot}_faculty.csv and {slot}_faculty.pdf — faculty supervisor assignment

Run once per exam slot. The script handles cleanup of intermediate files automatically.


💡 What I learned building this

K-Means on categorical data needs encoding first. Domain names like "Computer Science" and "Artificial Intelligence" mean nothing to the algorithm as strings. I used domain count and batch year as numeric features, scaled them with MinMaxScaler, and got clean cluster separation aligned with the actual department groupings.
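
A minimal sketch of that encode, scale, and cluster sequence follows. It assumes "domain count" means the number of students in each domain (a frequency encoding), and the column names `domain`, `batch`, and `domain_count` are hypothetical; the toy data only mimics the 7:6:5:4:3 department proportions and is not the real `student_data.csv`.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

# Toy stand-in for student_data.csv (assumed columns: domain, batch).
per_batch = {
    "Software Engineering": 7,
    "Business Analytics": 6,
    "Artificial Intelligence": 5,
    "Computer Science": 4,
    "Electrical Engineering": 3,
}
rows = [
    {"domain": domain, "batch": batch}
    for domain, n in per_batch.items()
    for batch in (19, 20, 21, 22, 23)
    for _ in range(n)
]
students = pd.DataFrame(rows)

# Assumed encoding: replace each domain string with its student count.
students["domain_count"] = students.groupby("domain")["domain"].transform("count")

# Scale both numeric features to the [0, 1] range so neither dominates.
features = MinMaxScaler().fit_transform(students[["domain_count", "batch"]])

# Five clusters, as in the flowchart; random_state fixed for repeatability.
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
students["cluster"] = kmeans.fit_predict(features)

# Inspect how clusters line up with departments.
print(pd.crosstab(students["domain"], students["cluster"]))
```

The crosstab is a quick way to check whether the clusters track departments, batches, or a mix of both on a given dataset.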

Room assignment was the hardest part. The logic has to handle rooms of different sizes (30 to 35 students), roll over to a new room when one fills up, and restart the room counter across multiple exam slots without mixing students from different time slots.
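
The rollover logic can be sketched in plain Python as below. This is an illustration under assumptions, not the code in `question_3.py`: the function name `assign_rooms`, the capacity list, and the roll-number strings are all made up, and reusing the capacity list when rooms run out is a simplification.

```python
from itertools import cycle


def assign_rooms(roll_numbers, capacities):
    """Fill rooms in order and move to the next room when one is full."""
    assignments = {}
    room_no = 1
    caps = cycle(capacities)  # assumption: reuse capacities if rooms run out
    remaining = next(caps)
    for roll in roll_numbers:
        if remaining == 0:
            room_no += 1
            remaining = next(caps)
        assignments[roll] = room_no
        remaining -= 1
    return assignments


# Each slot is assigned independently, so the room counter restarts at 1
# and students from different slots never share an assignment table.
slots = {
    "morning": [f"M{i:03d}" for i in range(70)],
    "evening": [f"E{i:03d}" for i in range(40)],
}
capacities = [30, 35, 32]  # hypothetical room sizes within the 30-35 range
plan = {slot: assign_rooms(rolls, capacities) for slot, rolls in slots.items()}
```

With these inputs the morning slot fills room 1 (30 seats) and room 2 (35 seats) and puts the last 5 students in room 3, while the evening slot starts again from room 1.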

Sorting students by roll number within each room before generating the PDF was a small detail that made the output actually usable. Without it, faculty would have no logical order to check students against.
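
With pandas, that ordering can be a single `sort_values` call. The sketch below assumes hypothetical `room` and `roll_no` columns and a zero-padded roll-number format, so that plain string sorting gives the expected order.

```python
import pandas as pd

# Made-up seating rows in arbitrary order.
seating = pd.DataFrame({
    "room": [2, 1, 2, 1, 1],
    "roll_no": ["21-0042", "19-0107", "20-0003", "22-0015", "19-0012"],
})

# Sort by room first, then by roll number inside each room.
ordered = seating.sort_values(["room", "roll_no"]).reset_index(drop=True)
print(ordered)
```

If roll numbers are not zero-padded, a numeric sort key would be needed instead of string ordering.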


📬 Contact

Built by Abdullah Khalid

LinkedIn Email Portfolio
