Automated exam seating plan generator for universities. Takes student and faculty data, clusters students by domain and batch using K-Means, assigns rooms, maps faculty supervisors, and generates PDF seating plans for each exam slot.
Built for FAST-NUCES with 2500 students across 5 departments and 5 batch years.
flowchart TD
A[Load student and faculty CSVs] --> B[Encode domain and batch features]
B --> C[MinMaxScaler normalization]
C --> D[K-Means clustering with 5 clusters]
D --> E[Assign students to rooms\n30-35 students per room]
E --> F[Map faculty supervisors by domain]
F --> G[Split by exam time slot]
G --> H[Sort students by roll number within rooms]
H --> I[Generate student and faculty PDFs]
style A fill:#1e293b,color:#f8fafc,stroke:#334155
style D fill:#0f172a,color:#f8fafc,stroke:#6366f1
style I fill:#0f172a,color:#f8fafc,stroke:#22c55e
| Layer | Technology |
|---|---|
| Clustering | scikit-learn (K-Means) |
| Data Processing | pandas |
| Feature Scaling | MinMaxScaler |
| PDF Generation | ReportLab |
| Data Generation | Faker |
| Environment | Python |
Student dataset — 2500 students across 5 departments
| Department | Count |
|---|---|
| Software Engineering | 700 |
| Business Analytics | 600 |
| Artificial Intelligence | 500 |
| Computer Science | 400 |
| Electrical Engineering | 300 |
Batches: 19, 20, 21, 22, 23
Faculty dataset — name, department, assigned to clusters by domain
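One way the domain-based faculty mapping could work is sketched below. This is an illustration under assumptions, not the logic in `question_3.py`: the function name `map_faculty`, the "dominant domain per cluster" rule, and the round-robin choice within a department are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import cycle

def map_faculty(cluster_domains, faculty):
    """Assign one supervisor per cluster (hypothetical helper).

    cluster_domains: {cluster_id: [domain, ...]} for the students in each cluster
    faculty: [(name, department), ...]
    """
    by_dept = defaultdict(list)
    for name, dept in faculty:
        by_dept[dept].append(name)
    # Rotate through each department's faculty so the load is spread out
    pools = {dept: cycle(names) for dept, names in by_dept.items()}

    assignment = {}
    for cluster_id, domains in cluster_domains.items():
        # Assumption: a cluster is supervised by its most common domain
        dominant = Counter(domains).most_common(1)[0][0]
        assignment[cluster_id] = next(pools[dominant]) if dominant in pools else None
    return assignment

# Made-up sample data for illustration only
clusters = {0: ["CS", "CS", "AI"], 1: ["AI", "AI"], 2: ["CS"]}
staff = [("Dr. A", "CS"), ("Dr. B", "AI"), ("Dr. C", "CS")]
mapping = map_faculty(clusters, staff)
```

A cluster whose dominant domain has no faculty gets `None` here, which would need explicit handling in a real run.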
ai-clustering/
├── question_3.py # main script — clustering, room assignment, PDF generation
├── data_generation_used.py # synthetic student data generator
├── Faculty_data_generator.py # faculty data generator
├── pdf.py # PDF generation utilities
├── student_data.csv # student dataset
├── faculty.csv # faculty dataset
└── delete_files.py # cleanup utility
pip install scikit-learn pandas reportlab faker
python question_3.py

When prompted, enter the exam time slot name. The script generates:

- `{slot}_students.csv` and `{slot}_students.pdf`: student seating plan
- `{slot}_faculty.csv` and `{slot}_faculty.pdf`: faculty supervisor assignment
Run once per exam slot. The script handles cleanup of intermediate files automatically.
K-Means on categorical data needs encoding first. Domain names like "Computer Science" and "Artificial Intelligence" mean nothing to the algorithm as strings. I used domain count and batch year as numeric features after MinMaxScaler normalization, which gave clean cluster separation aligned with actual department groupings.
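A minimal sketch of that encoding step follows, written with the standard library only so the arithmetic is visible. It assumes "domain count" means replacing each student's domain with the number of students in that domain. The sample rows, the roll number format, and the helper `min_max` are hypothetical. `min_max` applies the same per-column formula that `MinMaxScaler` uses with its default range.

```python
from collections import Counter

# Hypothetical sample rows: (roll_number, domain, batch). Not the real dataset.
students = [
    ("21-0001", "Computer Science", 21),
    ("19-0002", "Artificial Intelligence", 19),
    ("23-0003", "Computer Science", 23),
    ("20-0004", "Software Engineering", 20),
    ("21-0005", "Computer Science", 21),
]

def min_max(values):
    """Scale a column to [0, 1] using (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]

# Assumption: the domain feature is the number of students sharing that domain
domain_counts = Counter(domain for _, domain, _ in students)
count_feature = [domain_counts[domain] for _, domain, _ in students]
batch_feature = [batch for _, _, batch in students]

# One (scaled_domain_count, scaled_batch) pair per student
features = list(zip(min_max(count_feature), min_max(batch_feature)))
```

The resulting rows are what would be passed to scikit-learn's `KMeans(n_clusters=5)` on the full dataset. Scaling matters because K-Means uses Euclidean distance, so an unscaled domain count in the hundreds would swamp a batch year that only spans 19 to 23.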
Room assignment was the hardest part. The logic has to handle rooms of different sizes (30 to 35 students), roll over to a new room when one fills up, and restart the room counter across multiple exam slots without mixing students from different time slots.
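The rollover and per-slot reset described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in `question_3.py`: the names `assign_rooms` and `plan_slots` are hypothetical, and alternating between capacities of 30 and 35 is just one way to model rooms of different sizes.

```python
from itertools import cycle

def assign_rooms(roll_numbers, capacities=(30, 35)):
    """Fill rooms in order, opening a new room whenever the current one is full."""
    rooms = {}
    next_capacity = cycle(capacities)  # assumption: room sizes repeat in this order
    room_no = 0
    remaining = 0
    for roll in roll_numbers:
        if remaining == 0:
            room_no += 1
            remaining = next(next_capacity)
            rooms[room_no] = []
        rooms[room_no].append(roll)
        remaining -= 1
    return rooms

def plan_slots(students_by_slot, capacities=(30, 35)):
    """Plan each slot independently so numbering restarts at 1 and slots never mix."""
    return {
        slot: assign_rooms(rolls, capacities)
        for slot, rolls in students_by_slot.items()
    }

# Made-up roll numbers for illustration only
slots = {
    "morning": [f"M{i:03d}" for i in range(70)],
    "evening": [f"E{i:03d}" for i in range(10)],
}
plan = plan_slots(slots)
room_sizes = [len(r) for r in plan["morning"].values()]
```

Calling `assign_rooms` once per slot is what keeps the room counter from leaking across slots: each call starts with its own counter and its own capacity cycle.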
Sorting students by roll number within each room before generating the PDF was a small detail that made the output actually usable. Without it, faculty would have no logical order to check students against.
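As a small illustration of that sort, here is a sketch that orders each room's list by a parsed roll number. The `batch-serial` roll format, the sample rolls, and the helper `roll_key` are assumptions for the example. The real dataset may use a different format.

```python
# Hypothetical room contents; the "batch-serial" roll format is an assumption.
rooms = {
    1: ["21-0456", "19-0012", "21-0031"],
    2: ["23-0007", "20-0150"],
}

def roll_key(roll):
    """Split 'batch-serial' into integers so the sort is numeric, not textual."""
    batch, serial = roll.split("-")
    return int(batch), int(serial)

# Sort inside each room only; room membership is left unchanged
sorted_rooms = {room: sorted(rolls, key=roll_key) for room, rolls in rooms.items()}
```

Sorting on parsed integers avoids surprises if serial numbers are not zero-padded to the same width.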
Built by Abdullah Khalid