Helping Pakistani students navigate the complex world of international student visas using Retrieval-Augmented Generation (RAG) and Large Language Models.
Live Demo: http://13.235.238.227:8080
Thousands of Pakistani students struggle every year to find accurate, up-to-date visa information for studying abroad; most rely on outdated blogs or expensive consultants. TakeoffPK is an end-to-end RAG chatbot grounded in official government documents. It passes 100% of a custom 29-question keyword-match evaluation suite covering 6 countries.
| Country | Visa Types |
|---|---|
| 🇺🇸 USA | F-1 Student Visa (UG, Masters, PhD) |
| 🇬🇧 UK | Student Visa - CAS, Tier 4 (UG, PG, PhD) |
| 🇨🇦 Canada | Study Permit - PAL (UG, Masters, PhD) |
| 🇩🇪 Germany | Student Visa, PhD Visa, EU Blue Card, DAAD |
| 🇦🇺 Australia | Student Visa Subclass 500 (UG, PG, PhD) |
| 🇹🇷 Turkey | Student Visa, Türkiye Burslari Scholarship |
```mermaid
%%{init: {'theme': 'dark', 'flowchart': {'defaultRenderer': 'elk', 'curve': 'basis'}}}%%
flowchart LR
    subgraph SOURCES["Knowledge Base - 20 Official PDFs"]
        direction TB
        USA["🇺🇸 <b>USA</b><br/>F-1 Visa · SEVIS · Embassy"]
        UK["🇬🇧 <b>UK</b><br/>Student Visa · CAS · UKVI"]
        CA["🇨🇦 <b>Canada</b><br/>Study Permit · PAL · IRCC"]
        DE["🇩🇪 <b>Germany</b><br/>Student Visa · DAAD · PhD"]
        AU["🇦🇺 <b>Australia</b><br/>Subclass 500 · OSHC · GTE"]
        TR["🇹🇷 <b>Turkey</b><br/>Türkiye Burslari · Student Visa"]
    end
    subgraph PIPELINE["RAG Pipeline"]
        direction LR
        INGEST["📥 <b>Document Ingestion</b><br/>Load PDFs · Split into chunks"]
        EMBED["<b>Embedding</b><br/>Convert text → 384d vectors<br/>via HuggingFace Inference API"]
        SEARCH["<b>Semantic Search</b><br/>Find top-5 relevant chunks<br/>via Pinecone Vector DB"]
        GENERATE["⚡ <b>Answer Generation</b><br/>Grounds answer in context<br/>via Groq LLaMA 3.3 70b"]
        RESPOND["<b>Response Delivery</b><br/>Adds disclaimer · Returns<br/>formatted answer to user"]
        INGEST --> EMBED
        EMBED --> SEARCH
        SEARCH --> GENERATE
        GENERATE --> RESPOND
    end
    subgraph CICD["CI/CD - Automated on every push"]
        direction LR
        TEST["✅ <b>Quality Check</b><br/>Linting · 15 unit tests"]
        BUILD["🐳 <b>Containerise</b><br/>Docker image built<br/>4.4 GB → 300 MB optimised"]
        REGISTRY["📦 <b>Image Registry</b><br/>Pushed to AWS ECR<br/>Versioned and stored"]
        DEPLOY["<b>Deploy</b><br/>AWS EC2 t3.micro<br/>Free tier · port 8080"]
        TEST --> BUILD --> REGISTRY --> DEPLOY
    end
    USER(["👤 <b>Pakistani Student</b><br/>Asks visa question"])
    USA & UK & CA & DE & AU & TR --> INGEST
    RESPOND --> USER
    DEPLOY -.->|"hosts"| RESPOND
    classDef source fill:#1a2a1a,stroke:#2d5a2d,color:#7fc97f
    classDef pipeline fill:#1a1a2e,stroke:#3d3d7a,color:#9090d4
    classDef cicd fill:#2a1a1a,stroke:#6a3030,color:#d49090
    classDef user fill:#1a2a2a,stroke:#2d6a6a,color:#90d4d4
    class USA,UK,CA,DE,AU,TR source
    class INGEST,EMBED,SEARCH,GENERATE,RESPOND pipeline
    class TEST,BUILD,REGISTRY,DEPLOY cicd
    class USER user
```
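To make the ingestion and search stages concrete, here is a minimal, self-contained sketch of chunking and top-k retrieval by cosine similarity. It is not the project's code: the function names (`chunk_text`, `top_k`), the chunk size and overlap, and the toy 3-dimensional vectors are all assumptions for illustration. The real pipeline uses 384-dimensional embeddings from the HuggingFace Inference API and delegates the search to Pinecone.

```python
import math


def chunk_text(text, size=500, overlap=50):
    """Split text into fixed-size character chunks that overlap slightly.

    size and overlap are illustrative values, not the project's settings.
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunk_vecs, k=5):
    """Return indices of the k chunks most similar to the query, best first."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy 3-d vectors stand in for real 384-d embeddings.
    chunks = ["F-1 visa fees", "UK CAS letter", "Canada study permit"]
    vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.7, 0.7, 0.0]]
    query = [0.9, 0.1, 0.0]
    for i in top_k(query, vectors, k=2):
        print(chunks[i])
```

Only the chunks returned by the search step are passed on to the LLM, so the value of `k` (5 in the diagram) bounds how much context each answer can draw on.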
| Layer | Technology |
|---|---|
| LLM | Groq - llama-3.3-70b-versatile |
| Embeddings | HuggingFace Inference API - all-MiniLM-L6-v2 (384d) |
| Vector Database | Pinecone Serverless |
| Backend | Python, Flask |
| Frontend | HTML, CSS, JavaScript |
| Containerization | Docker |
| Registry | AWS ECR |
| Deployment | AWS EC2 t3.micro |
| CI/CD | GitHub Actions |
| Testing | pytest · custom batch evaluator · LangSmith |
| Linting | flake8 |
A keyword-matching evaluation script (batch_test.py) tests the live app against 28 country-specific questions. Any answer containing at least one expected keyword passes.
| Country | Questions | Passed | Accuracy |
|---|---|---|---|
| 🇬🇧 UK | 5 | 5 | 100% |
| 🇨🇦 Canada | 5 | 5 | 100% |
| 🇩🇪 Germany | 4 | 4 | 100% |
| 🇦🇺 Australia | 4 | 4 | 100% |
| 🇺🇸 USA | 5 | 5 | 100% |
| 🇹🇷 Turkey | 2 | 2 | 100% |
| Cross-Country | 2 | 2 | 100% |
| Total | 28 | 28 | 100% |
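The pass rule used for this table (an answer passes if it contains at least one expected keyword) can be sketched as follows. This is an illustration, not the contents of `batch_test.py`: the function names, the case-insensitive substring matching, and the sample cases are assumptions, and the real script queries the live app rather than using canned answers.

```python
def passes(answer, expected_keywords):
    """True if the answer contains at least one expected keyword.

    Matching is case-insensitive substring search (an assumption here).
    """
    text = answer.lower()
    return any(keyword.lower() in text for keyword in expected_keywords)


def accuracy(results):
    """Fraction of passed cases, given a list of booleans."""
    return sum(results) / len(results) if results else 0.0


if __name__ == "__main__":
    # Hypothetical cases: (answer returned by the app, expected keywords).
    cases = [
        ("You need a CAS from your university.", ["CAS", "sponsor"]),
        ("Please contact the embassy.", ["SEVIS", "I-20"]),
    ]
    results = [passes(answer, keywords) for answer, keywords in cases]
    print(f"{sum(results)}/{len(results)} passed ({accuracy(results):.0%})")
```

Because a single keyword hit is enough, a rule like this checks that an answer mentions the right topic, not that every detail in it is correct.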
langsmith_eval.py runs a deeper evaluation using a second LLM as judge (temperature=0.0) across 3 dimensions on a 5-question ground truth dataset:
| Metric | What it checks |
|---|---|
| Correctness | Is the answer factually accurate against a written ground truth? |
| Groundedness | Is the answer supported by the retrieved Pinecone chunks, or hallucinated? |
| Relevance | Does the answer actually address the question asked? |
Results across 3 experiments:
| Experiment | Correctness | Groundedness | Relevance |
|---|---|---|---|
| #1 (baseline) | 0.60 | 0.20 | 1.00 |
| #2 | 0.80 | 0.40 | 1.00 |
| #3 (latest) | 0.80 | 0.40 | 1.00 |
Between the baseline and experiment #2, correctness rose from 0.60 to 0.80 and groundedness doubled from 0.20 to 0.40. Both held steady in experiment #3, which suggests consistent system behavior. Relevance has been perfect across all runs.
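Every score in the table is a multiple of 0.20, which is what you would see if each of the 5 questions received a pass/fail verdict per metric and the reported figure were the mean. That scoring scheme is an assumption, not something confirmed by `langsmith_eval.py`. Under it, the aggregation would look like this sketch, where the per-question verdicts are made up purely to reproduce the baseline row:

```python
from statistics import mean


def aggregate(verdicts):
    """Average per-question 0/1 verdicts into one score per metric."""
    return {metric: round(mean(scores), 2) for metric, scores in verdicts.items()}


if __name__ == "__main__":
    # Hypothetical per-question verdicts (1 = pass, 0 = fail) for 5 questions.
    baseline = {
        "correctness": [1, 1, 1, 0, 0],
        "groundedness": [1, 0, 0, 0, 0],
        "relevance": [1, 1, 1, 1, 1],
    }
    print(aggregate(baseline))
```

With only 5 questions, one changed verdict moves a score by 0.20, so small differences between experiments should be read with caution.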
Note on groundedness: The score is intentionally lower than correctness because the system prompt injects verified 2025 policy facts (e.g. SDS discontinuation, AUD/CAD fund requirements) as a safety layer. The judge only evaluates against retrieved Pinecone chunks and penalizes these additions even though they are correct and deliberate.
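The effect described in this note can be illustrated with a small sketch. Assume the generation prompt is assembled from three parts: the retrieved chunks, a block of injected policy facts, and the question. The judge, by contrast, is shown only the retrieved chunks. The function names, prompt layout, and placeholder strings below are hypothetical and do not come from `src/prompt.py`.

```python
def build_prompt(question, retrieved_chunks, policy_facts):
    """Assemble the generation prompt from context, injected facts, and question."""
    context = "\n".join(retrieved_chunks)
    facts = "\n".join(f"- {fact}" for fact in policy_facts)
    return (
        f"Context:\n{context}\n\n"
        f"Verified policy facts:\n{facts}\n\n"
        f"Question: {question}"
    )


def judge_context(retrieved_chunks):
    """The groundedness judge sees only the retrieved chunks."""
    return "\n".join(retrieved_chunks)


if __name__ == "__main__":
    chunks = ["<chunk retrieved from Pinecone>"]
    facts = ["<verified 2025 policy fact>"]
    prompt = build_prompt("<student question>", chunks, facts)
    # The injected fact reaches the generator but not the judge, so an answer
    # that repeats it can be marked ungrounded even though it is correct.
    print(facts[0] in prompt)                 # True
    print(facts[0] in judge_context(chunks))  # False
```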
```
TakeoffPK/
├── src/
│   ├── __init__.py
│   ├── helper.py          ← PDF loading, text splitting, embeddings
│   └── prompt.py          ← System prompt for the LLM
├── Data/                  ← Add PDFs here locally (not tracked by Git)
│   ├── usa/
│   ├── uk/
│   ├── canada/
│   ├── germany/
│   ├── australia/
│   └── turkey/
├── templates/
│   └── chat.html          ← Frontend UI
├── tests/
│   └── test_app.py        ← 15 unit tests (pytest)
├── app.py                 ← Flask application
├── store_index.py         ← One-time PDF ingestion into Pinecone
├── batch_test.py          ← 29-question accuracy evaluation (run locally)
├── langsmith_eval.py      ← LLM-as-judge evaluation via LangSmith
├── requirements.txt       ← Dependencies
├── Dockerfile
├── .dockerignore
├── .env.example
├── .gitignore
└── .github/
    └── workflows/
        └── main.yaml      ← CI/CD pipeline
```
- Python 3.10
- Conda or virtualenv
- Free API keys: Pinecone · Groq · HuggingFace
```bash
# 1. Clone
git clone https://github.com/slaiba123/TakeoffPK.git
cd TakeoffPK

# 2. Create environment
conda create -n TakeoffPK python=3.10 -y
conda activate TakeoffPK
pip install -r requirements.txt

# 3. Configure environment variables
cp .env.example .env
# Fill in your API keys in .env

# 4. Add official PDFs to the correct Data/ subfolder (see PDF Sources below)

# 5. Index documents into Pinecone (run once)
python store_index.py

# 6. Run the app
python app.py
# Open: http://localhost:8080
```

```bash
# Unit tests
pytest tests/ -v

# Accuracy evaluation (requires app running on port 8080)
python batch_test.py

# LangSmith evaluation (requires LANGCHAIN_API_KEY in .env)
python langsmith_eval.py
```

Every push to main triggers:
```
Push to main
     │
     ▼
① CI: flake8 linting + pytest (15 unit tests)
     │
     ▼
② Build: Docker image built and pushed to AWS ECR
     │
     ▼
③ Deploy: EC2 pulls latest image, restarts container
```
The app runs on AWS EC2 t3.micro (2 vCPUs, 1 GB RAM) inside Docker, deployed automatically via GitHub Actions.
Estimated monthly cost: $0, within AWS free tier limits (EC2 + ECR + EBS).
To deploy your own instance, you need an EC2 instance running Ubuntu 22.04 with Docker installed, an ECR repository, and the following secrets added to your GitHub repo:
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_REGION
AWS_ECR_LOGIN_URI
ECR_REPOSITORY_NAME
PINECONE_API_KEY
GROQ_API_KEY
HUGGINGFACE_API_KEY
Full step-by-step setup: AWS Deployment Guide (or refer to the workflow file at .github/workflows/main.yaml)
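Before starting the container or running the app locally, it can help to detect a missing key early. The sketch below checks the three API keys from the list above. The helper name `missing_keys` and the idea of running such a check at startup are illustrative assumptions, not part of the repository.

```python
import os

REQUIRED_KEYS = ("PINECONE_API_KEY", "GROQ_API_KEY", "HUGGINGFACE_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required keys that are unset or empty in env."""
    env = os.environ if env is None else env
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys()
    print("Missing:", ", ".join(missing) if missing else "none")
```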
All data sourced from official government and embassy websites:
| Country | Source |
|---|---|
| 🇺🇸 USA | travel.state.gov · pk.usembassy.gov |
| 🇬🇧 UK | assets.publishing.service.gov.uk |
| 🇨🇦 Canada | ircc.canada.ca |
| 🇩🇪 Germany | germany.info · daad.de |
| 🇦🇺 Australia | immi.homeaffairs.gov.au |
| 🇹🇷 Turkey | islamabad-emb.mfa.gov.tr |
This tool is for informational purposes only. Visa rules change frequently; always verify with the official embassy or consulate before making any application decisions. This project is not affiliated with any government body or embassy.
Laiba Mushtaq, Computer Engineering Student. GitHub: @slaiba123
