CuraLens is a deep learning-based web platform designed to assist in the screening of oral cavity and skin images for abnormal patterns.
It functions as an AI-assisted decision-support system and does not provide medical diagnosis or replace clinical judgment.
Oral cancer and skin malignancies are highly prevalent, particularly in countries such as India.
Early-stage screening and risk flagging can help guide individuals toward timely clinical evaluation.
This project explores how computer vision, transfer learning, and clinical metadata fusion can support preliminary screening in an ethical and responsible manner.
- Dual-modality screening: Oral cavity (v1 + v2/v3) and skin lesion (v1 + v3)
- Multimodal fusion: EfficientNetB0 image branch fused with 6D clinical metadata
- Grad-CAM explainability on both v1 and v2/v3 models: visual heatmaps highlight suspicious regions
- Three-tier risk scoring: Low / Medium / High with colour codes and clinical recommendations
- Metadata schema validation with graceful degradation on missing or out-of-range fields
- Two-phase training strategy: warm-up (CNN frozen) → fine-tuning (top 20 EfficientNet layers unfrozen)
- Focal loss support for handling class imbalance
- Stratified K-fold cross-validation with saved per-fold metrics
- REST API with `/predict`, `/predict/skin`, `/predict_v2`, `/schema/<type>`, `/health`
- Flask SPA web interface with animated gradients and real-time Grad-CAM display
- Automatic prediction logging to `automation_logs/`
- System health endpoint for monitoring model load status
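The focal-loss option from the feature list can be understood with a small sketch. This NumPy version of binary focal loss is illustrative only; the repository's actual implementation and its alpha/gamma defaults may differ:

```python
import numpy as np

def binary_focal_loss(y_true, y_pred, alpha=0.25, gamma=2.0, eps=1e-7):
    """Binary focal loss: -alpha_t * (1 - p_t)^gamma * log(p_t).

    y_true: 0/1 labels; y_pred: predicted P(cancer) in (0, 1).
    With gamma=0 and alpha=0.5 this reduces to half the binary cross-entropy.
    """
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    p_t = np.where(y_true == 1, y_pred, 1.0 - y_pred)    # probability of the true class
    alpha_t = np.where(y_true == 1, alpha, 1.0 - alpha)  # class weighting factor
    return np.mean(-alpha_t * (1.0 - p_t) ** gamma * np.log(p_t))

# An easy, confident correct prediction contributes far less loss than a
# confident mistake, which is why this helps with class imbalance:
easy = binary_focal_loss(np.array([1]), np.array([0.95]))
hard = binary_focal_loss(np.array([1]), np.array([0.10]))
```

The `(1 - p_t)^gamma` factor is what down-weights well-classified examples so training focuses on the hard minority-class cases.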
| Model | Architecture | Val AUC | Input |
|---|---|---|---|
| Oral v1 | MobileNetV2 (frozen) → Dense(128) → Sigmoid | 0.993 | 224×224 RGB |
| Skin v1 | MobileNetV2 (frozen) → Dense(128) → Sigmoid | 0.943 | 224×224 RGB |
```
Image Input (224×224×3)                 Metadata Input (6D clinical)
        │                                        │
EfficientNetB0 (frozen/fine-tuned)           BatchNorm
GlobalAvgPool → Dense(512) → Dropout     Dense(64) → Dense(64)
        │                                        │
        └────────── Concatenate (576D) ──────────┘
                          │
        Dense(256) → Dense(128) → Sigmoid → P(cancer)
```
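The fusion architecture and the two-phase training strategy can be sketched in Keras as follows. Layer sizes follow the diagram above; the Dropout rate, learning rates, and optimiser choice are assumptions, not the repository's exact values:

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_multimodal_model(meta_dim: int = 6):
    """EfficientNetB0 image branch fused with a small metadata MLP."""
    # Image branch: EfficientNetB0 -> GAP -> Dense(512) -> Dropout
    img_in = layers.Input(shape=(224, 224, 3), name="image")
    backbone = tf.keras.applications.EfficientNetB0(
        include_top=False, weights=None,  # weights=None avoids a download in this sketch
        input_shape=(224, 224, 3))
    x = layers.GlobalAveragePooling2D()(backbone(img_in))
    x = layers.Dense(512, activation="relu")(x)
    x = layers.Dropout(0.3)(x)  # assumed rate

    # Metadata branch: BatchNorm -> Dense(64) -> Dense(64)
    meta_in = layers.Input(shape=(meta_dim,), name="metadata")
    m = layers.BatchNormalization()(meta_in)
    m = layers.Dense(64, activation="relu")(m)
    m = layers.Dense(64, activation="relu")(m)

    # Fusion: 512 + 64 = 576-D concatenation -> classifier head
    z = layers.Concatenate()([x, m])
    z = layers.Dense(256, activation="relu")(z)
    z = layers.Dense(128, activation="relu")(z)
    out = layers.Dense(1, activation="sigmoid", name="p_cancer")(z)
    return tf.keras.Model([img_in, meta_in], out), backbone

model, backbone = build_multimodal_model()

# Phase 1 (warm-up): CNN frozen, only the new heads train.
backbone.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="binary_crossentropy", metrics=[tf.keras.metrics.AUC()])

# Phase 2 (fine-tuning): unfreeze only the top 20 EfficientNet layers
# and recompile with a much lower learning rate.
backbone.trainable = True
for layer in backbone.layers[:-20]:
    layer.trainable = False
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
              loss="binary_crossentropy", metrics=[tf.keras.metrics.AUC()])
```

Recompiling after changing `trainable` flags matters: Keras only picks up the new set of trainable weights at compile time.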
Oral v3 metadata (6D): age, smoking_years, cigarettes_per_day, alcohol_units_per_week, chewing_tobacco, family_history
Skin v3 metadata (6D): age, skin_type (Fitzpatrick 1-6), sunburn_history, outdoor_hours_per_week, tanning_bed_use, family_history
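Schema validation with graceful degradation (a feature listed above) might look like the following sketch. The field ranges and defaults here are illustrative assumptions, not the actual values in `utils_v2/metadata_schema.py`:

```python
# Hypothetical oral-schema ranges as (min, max, default); the real values
# live in utils_v2/metadata_schema.py.
ORAL_SCHEMA = {
    "age":                    (0, 120, 40),
    "smoking_years":          (0, 100, 0),
    "cigarettes_per_day":     (0, 100, 0),
    "alcohol_units_per_week": (0, 100, 0),
    "chewing_tobacco":        (0, 1, 0),
    "family_history":         (0, 1, 0),
}

def validate_metadata(form: dict, schema: dict = ORAL_SCHEMA):
    """Coerce fields to floats; fall back to defaults on missing or
    out-of-range values instead of rejecting the request."""
    cleaned, warnings = {}, []
    for field, (lo, hi, default) in schema.items():
        try:
            value = float(form.get(field))
            if not (lo <= value <= hi):
                raise ValueError
        except (TypeError, ValueError):
            warnings.append(f"{field}: using default {default}")
            value = float(default)
        cleaned[field] = value
    return cleaned, warnings
```

A request missing `smoking_years`, for example, would still be scored with the default value, and the warning can be surfaced in the response rather than failing the whole prediction.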
```bash
curl -X POST http://localhost:5001/predict \
  -F "image=@path/to/oral.jpg" \
  -F "mode=diagnostic"
```

Response includes `cancer_probability`, `risk_level`, `recommendation`, `gradcam_png_b64`.
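The `gradcam_png_b64` field carries a base64-encoded Grad-CAM heatmap. The repository's `utils_v2/gradcam.py` is not reproduced here, but the core Grad-CAM computation is typically along these lines (the layer name and toy usage are illustrative):

```python
import tensorflow as tf

def gradcam_heatmap(model, image, conv_layer_name):
    """Grad-CAM: weight the chosen conv layer's feature maps by the gradient
    of the predicted probability, then ReLU and normalise to [0, 1]."""
    conv_layer = model.get_layer(conv_layer_name)
    grad_model = tf.keras.Model(model.inputs, [conv_layer.output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image[None, ...])  # add a batch dimension
        score = preds[:, 0]                             # P(cancer) for this image
    grads = tape.gradient(score, conv_out)
    weights = tf.reduce_mean(grads, axis=(1, 2))        # global-average-pool the gradients
    cam = tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1)[0]
    cam = tf.nn.relu(cam)                               # keep only positive evidence
    cam = cam / (tf.reduce_max(cam) + 1e-8)             # normalise to [0, 1]
    return cam.numpy()
```

The resulting low-resolution map is then upsampled to the input size, colourised, and blended over the original image before being PNG-encoded.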
```bash
curl -X POST http://localhost:5001/predict/skin \
  -F "image=@path/to/lesion.jpg"
```

```bash
# Oral v3 with clinical metadata
curl -X POST http://localhost:5001/predict_v2 \
  -F "cancer_type=oral" \
  -F "image=@oral.jpg" \
  -F "age=52" \
  -F "smoking_years=15" \
  -F "cigarettes_per_day=10" \
  -F "alcohol_units_per_week=7" \
  -F "chewing_tobacco=0" \
  -F "family_history=1"

# Skin v3 with clinical metadata
curl -X POST http://localhost:5001/predict_v2 \
  -F "cancer_type=skin" \
  -F "image=@lesion.jpg" \
  -F "age=45" \
  -F "skin_type=2" \
  -F "sunburn_history=8" \
  -F "outdoor_hours_per_week=20" \
  -F "tanning_bed_use=0" \
  -F "family_history=0"
```

Response includes `probability`, `risk_level`, `risk_label`, `confidence_band`, `recommendation`, `color_code`, `gradcam_png_b64`.
```bash
curl http://localhost:5001/schema/oral
curl http://localhost:5001/schema/skin
curl http://localhost:5001/schema/oral_legacy
```

```bash
curl http://localhost:5001/health
```

Returns the load status of all 5 model variants, the available endpoints, and an overall "ok" / "degraded" status.
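The same calls can be scripted from Python. This client sketch assumes the third-party `requests` package and a server on localhost:5001; the helper names are illustrative, not part of the repository:

```python
import base64
import requests  # third-party: pip install requests

BASE = "http://localhost:5001"

def predict_oral_v2(image_path: str, **metadata) -> dict:
    """POST an image plus clinical metadata to /predict_v2, return the JSON body."""
    data = {"cancer_type": "oral", **{k: str(v) for k, v in metadata.items()}}
    with open(image_path, "rb") as fh:
        resp = requests.post(f"{BASE}/predict_v2",
                             data=data, files={"image": fh}, timeout=60)
    resp.raise_for_status()
    return resp.json()

def save_gradcam(result: dict, out_path: str = "gradcam.png") -> None:
    """Decode the base64 Grad-CAM overlay from a response and write it to disk."""
    with open(out_path, "wb") as fh:
        fh.write(base64.b64decode(result["gradcam_png_b64"]))

# Example (requires a running server and an oral.jpg on disk):
#   result = predict_oral_v2("oral.jpg", age=52, smoking_years=15,
#                            cigarettes_per_day=10, alcohol_units_per_week=7,
#                            chewing_tobacco=0, family_history=1)
#   print(result["risk_level"], result["recommendation"])
#   save_gradcam(result)
```

Sending metadata as form fields (not JSON) mirrors the curl examples, since the endpoint reads multipart form data alongside the image file.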
```
OralCancerApp/
├── train.py                      # v1 oral training (MobileNetV2)
├── train_v2.py                   # v2/v3 multimodal training pipeline
├── train_skin.py                 # v1 skin training (MobileNetV2)
├── predict.py                    # CLI prediction (v1 oral)
├── evaluate_v2.py                # v1 vs v2 ablation evaluation
├── evaluate_skin.py              # Skin model test-set evaluation  ← NEW
├── generate_research_report.py   # Auto-generate research report
├── web_app.py                    # Flask web application + REST API
├── system_smoke_test.py          # End-to-end HTTP integration tests
├── requirements.txt              # Python dependencies
│
├── models/                       # v1 trained models
│   ├── oral_cancer_model.h5
│   ├── model_metadata.json
│   └── skin_model/
│       └── skin_screening_model.h5
│
├── models_v2/                    # v2/v3 architectures + trained weights
│   ├── multimodal_model.py       # Legacy oral 4D architecture
│   ├── oral_model.py             # Oral v3 6D architecture
│   ├── skin_model.py             # Skin v3 6D architecture
│   ├── metadata_scaler.pkl       # Fitted StandardScaler (saved after training)
│   ├── saved_model/              # oral_legacy SavedModel
│   ├── oral_saved_model/         # oral v3 SavedModel
│   └── skin_saved_model/         # skin v3 SavedModel
│
├── utils_v2/
│   ├── gradcam.py                # Grad-CAM explainability (single + multi-input)
│   ├── metadata_schema.py        # Field definitions, validation, normalisation
│   └── risk_scoring.py           # Risk tier logic (Low/Medium/High)
│
├── modules/
│   └── skin_screening.py         # v1 skin screening wrapper
│
├── data_clean/                   # Training data (oral)
│   ├── metadata.csv
│   └── train/ val/
│
├── skin_dataset_resized/         # Training data (skin)
│   └── train_set/ val_set/ test_set/
│
├── evaluation_outputs/           # Metrics, ROC curves, confusion matrices
├── research_report/              # Auto-generated publication report
├── automation_logs/              # Prediction history JSONs
└── test_assets/                  # Test images for smoke tests
    └── sample.jpg
```
```bash
pip install -r requirements.txt
```

```bash
python web_app.py
# or specify a port:
python web_app.py 8080
```

Open: http://localhost:5001
v1 oral model (MobileNetV2):

```bash
python train.py
```

v2 multimodal oral model (EfficientNetB0 + 4D metadata):

```bash
python train_v2.py --epochs-phase1 30 --epochs-phase2 20
```

v3 skin multimodal model (EfficientNetB0 + 6D metadata):

```bash
python train_v2.py --cancer-type skin --epochs-phase1 30 --epochs-phase2 20
```

With focal loss (recommended when the dataset is imbalanced):

```bash
python train_v2.py --cancer-type oral --use-focal-loss
```

With cross-validation:

```bash
python train_v2.py --cross-validate --cv-folds 5
```

v1 vs v2 oral ablation evaluation:

```bash
python evaluate_v2.py
```

Skin model on held-out test set:

```bash
python evaluate_skin.py
```

Generate research report:

```bash
python generate_research_report.py
```

```bash
# Start the server first, then in another terminal:
python system_smoke_test.py
# Or auto-start the server:
python system_smoke_test.py --autostart
```

```bash
python predict.py path/to/image.jpg
python predict.py image.jpg 0.35   # custom threshold
```

| Tier | P(cancer) Range | Color | Action |
|---|---|---|---|
| Low | 0.0 – 0.3 | 🟢 Green | Routine monitoring |
| Medium | 0.3 – 0.7 | 🟡 Amber | Further clinical evaluation |
| High | 0.7 – 1.0 | 🔴 Red | Urgent specialist referral |
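The tier boundaries above translate directly into code. A minimal sketch of the mapping (how `utils_v2/risk_scoring.py` handles values exactly at 0.3 and 0.7, and its exact return fields, are assumptions):

```python
def score_risk(probability: float) -> dict:
    """Map P(cancer) onto the three-tier scheme from the table above."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if probability < 0.3:
        return {"risk_level": "Low", "color_code": "green",
                "recommendation": "Routine monitoring"}
    if probability < 0.7:
        return {"risk_level": "Medium", "color_code": "amber",
                "recommendation": "Further clinical evaluation"}
    return {"risk_level": "High", "color_code": "red",
            "recommendation": "Urgent specialist referral"}
```

Keeping the thresholds in one place like this makes it easy to audit or recalibrate them against a validation set later.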
| Model | Val AUC | Sensitivity | Specificity | Notes |
|---|---|---|---|---|
| Oral v1 (MobileNetV2) | 0.993 | 0.986 | 0.955 | Real images, image-only |
| Skin v1 (MobileNetV2) | 0.943 | — | — | Real images, image-only |
| Oral v2 (EfficientNetB0 + 4D) | 0.784 | 0.845 | 0.612 | |
| Oral v2 training peak | (0.992) | — | — | Synthetic metadata |
Note: v2/v3 multimodal metrics reflect synthetic metadata. Replace `data_clean/metadata.csv` with real patient records before making any clinical or research claims.
This system is developed strictly for educational and research purposes.
- The model performs image-based screening only
- It does not diagnose cancer or any disease
- Results must always be reviewed by qualified medical professionals
- Clinical decisions must not be made based on this tool alone
- v2/v3 multimodal models are trained on synthetic metadata and are not validated for clinical claims
- Collect real patient metadata to unlock full multimodal accuracy potential
- Train oral v3 (6D schema): the architecture is ready; a training script is still needed
- Multi-class oral abnormality categorisation (leukoplakia, erythroplakia, etc.)
- REST API authentication and rate limiting for research integration
- Mobile application interface
- Docker containerisation for reproducible deployment
Jay Gautam
B.Tech in Computer Science (Artificial Intelligence & Machine Learning)
| Component | Status |
|---|---|
| Oral screening v1 (MobileNetV2) | ✅ Complete |
| Skin screening v1 (MobileNetV2) | ✅ Complete |
| v2 Multimodal oral (EfficientNetB0 + 4D) | ✅ Trained (synthetic metadata) |
| v3 Skin multimodal (EfficientNetB0 + 6D) | ✅ Architecture ready |
| v3 Oral multimodal (EfficientNetB0 + 6D) | 🟡 Architecture ready, training pending |
| Grad-CAM explainability (v1 + v2/v3) | ✅ Complete |
| Flask REST API + SPA UI | ✅ Complete |
| Research report pipeline | ✅ Complete |
| Real patient metadata | ❌ Pending (synthetic used for now) |
| Clinical validation | ❌ Out of scope |
CuraLens is a technical exploration of AI-assisted screening, designed with responsibility, transparency, and academic integrity at its core.