A bilingual (English & Roman Urdu) car problem diagnosis web app powered by Machine Learning. Describe your car issue in plain language and the app will identify the problem and provide step-by-step solutions — no technical knowledge required.
- Naive Bayes Classifier for automatic language detection using character-level n-gram analysis, with a confidence score shown for every prediction.
- TF-IDF + Cosine Similarity for problem matching: ranks the problems in the database by how closely their wording overlaps with your description.
- Keyword Extraction to pull the most meaningful words from your input and display them alongside results.
- Model Caching via @st.cache_resource so models are trained once and reused across reruns, keeping the app fast.
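As a rough illustration of the TF-IDF + cosine similarity step, the sketch below ranks a few made-up problem descriptions against a query using only the standard library. The app itself relies on scikit-learn, whose weighting and normalisation details may differ; the `PROBLEMS` list, the `rank` helper, and the smoothed IDF formula are assumptions chosen for illustration, not the contents of app_final.py.

```python
import math
import re
from collections import Counter

# Hypothetical mini-dataset; the real app ships its own problem list.
PROBLEMS = [
    "engine overheating and temperature gauge high",
    "brakes making grinding noise when stopping",
    "battery dead car will not start",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_idf(docs):
    """Smoothed inverse document frequency for every term in the corpus."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(tokenize(doc)))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}

def tfidf_vector(text, idf):
    """Map text to a sparse {term: tf * idf} vector; unseen terms are dropped."""
    counts = Counter(tokenize(text))
    return {t: c * idf[t] for t, c in counts.items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, docs, top_k=5):
    """Return up to top_k (score, document) pairs, best match first."""
    idf = build_idf(docs)
    q_vec = tfidf_vector(query, idf)
    scored = [(cosine(q_vec, tfidf_vector(d, idf)), d) for d in docs]
    return sorted(scored, reverse=True)[:top_k]

results = rank("my engine is overheating", PROBLEMS)
```

Here `results` is a list of `(score, description)` pairs with the engine entry ranked first. Descriptions that share no words with the query score 0.0, which shows that this kind of matching depends on word overlap rather than meaning.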
- Accepts input in English or Roman Urdu (e.g. "Engine bohot garam hai").
- Automatically detects the language of your input and responds with solutions in the same language.
- Full dataset of problems and solutions available in both languages.
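To make the language detection idea concrete, here is a minimal sketch of a character n-gram Naive Bayes classifier written with the standard library only. It is not the app's code: the app uses scikit-learn, and the `TRAINING` sentences, the trigram size, the add-one smoothing, and the way confidence is derived (normalised class probabilities) are all assumptions made for this example.

```python
import math
from collections import Counter

# Tiny illustrative training set; labels and sentences are assumptions.
TRAINING = {
    "english": ["my engine is overheating",
                "the brakes are making a loud noise",
                "the car will not start"],
    "roman_urdu": ["engine bohot garam hai",
                   "brake se awaaz aa rahi hai",
                   "car start nahi ho rahi"],
}

def char_ngrams(text, n=3):
    """Split padded, lowercased text into overlapping character n-grams."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]

def train(samples):
    """Count n-grams per language and collect the shared vocabulary."""
    counts = {lang: Counter(g for s in texts for g in char_ngrams(s))
              for lang, texts in samples.items()}
    vocab = set().union(*counts.values())
    return counts, vocab

def predict(text, counts, vocab):
    """Return (language, confidence) from multinomial Naive Bayes with add-one smoothing."""
    log_scores = {}
    for lang, grams in counts.items():
        total = sum(grams.values()) + len(vocab)
        # Uniform prior over languages, so only the likelihood terms matter.
        log_scores[lang] = sum(math.log((grams[g] + 1) / total)
                               for g in char_ngrams(text))
    # Turn log scores into normalised probabilities to get a confidence value.
    peak = max(log_scores.values())
    exp_scores = {lang: math.exp(s - peak) for lang, s in log_scores.items()}
    best = max(exp_scores, key=exp_scores.get)
    return best, exp_scores[best] / sum(exp_scores.values())

counts, vocab = train(TRAINING)
language, confidence = predict("AC thanda nahi kar raha", counts, vocab)
```

Character n-grams suit this task because Roman Urdu has no fixed spelling: fragments such as "ahi" or "rah" recur across spelling variants even when whole words do not match.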
Covers 11 categories of common car problems:
| Category | Examples |
|---|---|
| Engine | Overheating, knocking noise, misfiring, oil leak |
| Brakes | Squeaking, grinding, soft pedal, ABS warning |
| Electrical | Dead battery, dim lights, alternator failure |
| Transmission | Slipping gears, hard shifting, fluid leak |
| Tires | Flat tire, uneven wear, vibration |
| Steering | Hard to turn, pulling to one side |
| AC & Climate | Not cooling, bad smell, heater issues |
| Fuel System | Poor mileage, fuel leak, pump failure |
| Suspension | Bumpy ride, clunking noise, sagging |
| Body | Rust, broken mirror, seat belt issues |
| Safety | Airbag warning light |
- Clean, wide-layout UI with a text input area for describing your problem.
- Results displayed in expandable cards showing the matched problem, similarity score, a progress bar, and the recommended solution.
- A sidebar showing active ML models, their descriptions, and example queries to help you get started.
- Displays detected language, confidence percentage, and extracted keywords at the top of results.
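The extracted keywords shown at the top of results could be produced in several ways, and this README does not say which one the app uses. One simple possibility, sketched here under that caveat, is to drop common function words and keep the most frequent remaining tokens. The `STOP_WORDS` set and the `extract_keywords` helper are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list mixing English and Roman Urdu function words.
STOP_WORDS = {"my", "is", "the", "a", "are",
              "hai", "se", "ho", "rahi", "raha", "aa", "kar"}

def extract_keywords(text, limit=5):
    """Return up to `limit` non-stop-word tokens, most frequent first."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOP_WORDS]
    # Counter.most_common keeps first-seen order among equally frequent tokens.
    return [word for word, _ in Counter(tokens).most_common(limit)]

keywords = extract_keywords("My engine is overheating")
```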
Python 3.8 or higher is recommended.
Install all dependencies using the command in the Setup section below. The libraries needed are:
- streamlit: web app framework
- pandas: dataset handling
- numpy: numerical operations
- scikit-learn: ML models (Naive Bayes, TF-IDF, Cosine Similarity)
No API keys are required. This app runs entirely offline using locally trained ML models. There are no external API calls.
Follow these steps in order:
Step 1: Make sure Python is installed
Open a terminal or Command Prompt and verify:
python --version
You should see Python 3.8 or above. If not, download it from python.org.
Step 2: (Optional but recommended) Create a virtual environment
python -m venv venv
Activate it:
- Windows: venv\Scripts\activate
- Mac/Linux: source venv/bin/activate
Step 3: Install the required libraries
pip install streamlit pandas numpy scikit-learn
Step 4: Navigate to the project folder
cd "path\to\your\Car-Malfunction-Detector"
Replace the path with wherever you saved the project.
Step 5: Run the app
streamlit run app_final.py
Step 6: Open in browser
Streamlit will automatically open the app in your browser. If it doesn't, copy the URL shown in the terminal (usually http://localhost:8501) and paste it into your browser.
- Type your car problem in the text box in English or Roman Urdu.
- Click the 🔍 Detect Problem button.
- The app will display:
  - The detected language and confidence level
  - Keywords extracted from your input
  - Up to 5 matching problems ranked by similarity, each with a solution
- Click on any result card to expand and read the full solution.
Example queries to try:
My engine is overheating
Brake se bohot zyada awaaz aa rahi hai
Battery kharab hai car start nahi ho rahi
AC thanda nahi kar raha
Brakes making loud grinding noise
Car-Malfunction-Detector/
│
├── app_final.py # Main application file
└── README.md # This file
- Expand the dataset — Add more problems per category and cover more car makes/models for better accuracy.
- Deep Learning model — Replace TF-IDF with a transformer-based model (e.g. BERT or a small LLM) for more accurate semantic understanding of problem descriptions.
- Hybrid detection system — Combine the current ML approach with a rule-based keyword matching system for a more robust ensemble that handles edge cases better.
- User feedback loop — Allow users to rate whether the solution was helpful, and use that data to retrain and improve the model over time.
- Confidence threshold tuning — Let users adjust the minimum similarity score threshold to control how strict or loose the matching is.
- Full Urdu script support — Add support for native Urdu script (اردو) in addition to Roman Urdu.
- Multilingual expansion — Support additional languages such as Arabic, Hindi, or Punjabi to reach a wider audience.
- Voice input — Integrate speech-to-text so users can describe their car problem verbally instead of typing.
- Mobile-friendly UI — Optimize the Streamlit layout for smaller screens and mobile browsers.
- Deploy to cloud — Host the app on Streamlit Community Cloud, Heroku, or AWS so it's accessible without local setup.
- REST API — Expose the detection logic as a REST API endpoint so it can be integrated into other apps or mobile applications.
- CSV/database backend — Move the hardcoded dataset to an external CSV or database (e.g. SQLite) so problems and solutions can be updated without touching the code.
- Car make/model filtering — Allow users to specify their car brand or model to get more targeted solutions.
- Mechanic locator — Integrate a map or link to help users find nearby mechanics based on their detected problem category.
- Problem history — Save past queries and results within the session so users can review previous diagnoses.
- Image input — Allow users to upload a photo (e.g. of a warning light or leak) and use computer vision to assist with diagnosis.
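For the CSV backend idea, a loader could be as small as the sketch below. The column names (`category`, `problem_en`, `solution_en`) and the sample rows are placeholders invented for this example; the real dataset's fields may differ.

```python
import csv
import io

# Hypothetical column layout and placeholder rows, for illustration only.
SAMPLE_CSV = """category,problem_en,solution_en
Engine,Engine overheating,Check the coolant level and radiator fan
Brakes,Grinding noise when braking,Inspect the brake pads for wear
"""

def load_problems(source):
    """Read problem rows from any file-like object containing CSV text."""
    return list(csv.DictReader(source))

# io.StringIO stands in for open("problems.csv", newline="", encoding="utf-8").
problems = load_problems(io.StringIO(SAMPLE_CSV))
```

Each row comes back as a dictionary keyed by the header names, so adding a problem would mean editing the CSV file rather than the application code.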
"Missing ScriptRunContext" warnings
This happens when you run the file with python app_final.py instead of streamlit run app_final.py. Always use the streamlit run command.
Module not found error
Run pip install streamlit pandas numpy scikit-learn to make sure all dependencies are installed.
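If it is unclear which dependency is missing, a short check script along these lines may help. It is a sketch using only the standard library; the `REQUIRED` mapping and the `missing_packages` helper are names made up for this example.

```python
import importlib.util
import sys

# Import names can differ from pip names: scikit-learn is imported as "sklearn".
REQUIRED = {"streamlit": "streamlit", "pandas": "pandas",
            "numpy": "numpy", "scikit-learn": "sklearn"}

def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import name cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if importlib.util.find_spec(module) is None]

python_ok = sys.version_info >= (3, 8)
missing = missing_packages()
print("Python 3.8+:", python_ok)
print("Missing packages:", ", ".join(missing) or "none")
```

Run it with the same interpreter (and virtual environment, if any) that you use for streamlit run, since packages installed in one environment are not visible from another.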
Port already in use
If port 8501 is busy, run on a different port:
streamlit run app_final.py --server.port 8502
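To find out which port is free before choosing one, a small standard-library check such as the following could be used. This is a sketch, not part of the app: `port_is_free` and `first_free_port` are hypothetical helpers, and the check only looks for a listener on the local loopback address.

```python
import socket

def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 when a listener accepts the connection.
        return sock.connect_ex((host, port)) != 0

def first_free_port(start=8501, attempts=10):
    """Return the first free port in [start, start + attempts), or None."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    return None
```

The value returned by `first_free_port()` can then be passed to the --server.port option shown above.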