A web-based interface for running B13 thermal diffusion model predictions with real-time results display.
chromosome/
├── auth_service.py              # Flask backend with prediction endpoint
├── predict_b13_diffusion.py     # Prediction script (CLI & API)
├── exp/
│   └── best_model.pt            # Trained B13 diffusion model
├── test/                        # Sample input directory
│   ├── images/                  # Input PNG images (3+ frames)
│   └── metadata/                # Input JSON metadata files
├── predictions/                 # Output directory (created automatically)
│   ├── images/                  # Generated predictions
│   └── metadata/                # Generation metadata with context
└── frontend/
    ├── index.html               # Main UI (login + prediction dashboard)
    └── static/
        ├── script.js            # Frontend logic (auth + predictions)
        └── style.css            # Styling with results grid
- Email OTP Login - Secure authentication via email OTP
- Session Management - Redis-based session storage
- Chaos Mode - Fun UI chaos toggle (preserved from original)
Input Configuration
- Input Directory: Path to directory with `images/` and `metadata/` subdirectories
- Model Path: Path to `.pt` trained model file (default: `./exp/best_model.pt`)
- Output Directory: Where predictions will be saved (default: `./predictions`)
Prediction Execution
- Validates input directory and model existence
- Runs `predict_b13_diffusion.py` via subprocess
- Uses 3 consecutive frames as context → generates 1 future frame
- Displays real-time status updates
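The validate-then-run flow above can be sketched as follows. This is a minimal sketch: the CLI flag names passed to `predict_b13_diffusion.py` are assumptions, so check the script's own argument parser for the real ones.

```python
import os
import subprocess

def run_prediction(input_dir, model_path, output_dir="./predictions", timeout=600):
    """Validate inputs, then run the prediction script as a subprocess."""
    # Validate input directory layout and model existence before launching
    for sub in ("images", "metadata"):
        if not os.path.isdir(os.path.join(input_dir, sub)):
            raise FileNotFoundError(f"Missing {sub}/ under {input_dir}")
    if not os.path.isfile(model_path):
        raise FileNotFoundError(f"Model not found: {model_path}")

    # Hypothetical flag names -- consult predict_b13_diffusion.py for the actual CLI
    cmd = ["python", "predict_b13_diffusion.py",
           "--input-dir", input_dir,
           "--model-path", model_path,
           "--output-dir", output_dir]
    # timeout mirrors the backend's 10-minute limit
    return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
```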
Results Display
- Image gallery grid with thumbnails
- Metadata overlay showing:
  - Prediction time
  - Context frames used
  - Geographic region
  - Inference steps
- Download button for each image
- Smooth scroll to results
- `POST /send-otp` - Send OTP to email
- `POST /verify-otp` - Verify OTP and create session
- `POST /check-session` - Validate session token
- `POST /logout` - Invalidate session
- `POST /predict` - Run B13 diffusion prediction

  ```json
  {
    "session_token": "abc123...",
    "input_dir": "./test",
    "output_dir": "./predictions",
    "model_path": "./exp/best_model.pt"
  }
  ```

- `GET /predictions/<path>` - Serve generated images
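A minimal client-side sketch of calling `POST /predict` with the request body shown above (the helper names `predict_payload` and `predict` are illustrative, not part of the project):

```python
import json
import urllib.request

API = "http://localhost:5000"  # Flask dev server address used in this README

def predict_payload(session_token, input_dir="./test",
                    output_dir="./predictions",
                    model_path="./exp/best_model.pt"):
    """Request body for POST /predict, matching the example above."""
    return {"session_token": session_token, "input_dir": input_dir,
            "output_dir": output_dir, "model_path": model_path}

def predict(session_token, **kwargs):
    """POST the payload to /predict and return the parsed JSON response."""
    body = json.dumps(predict_payload(session_token, **kwargs)).encode()
    req = urllib.request.Request(f"{API}/predict", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```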
```bash
# Python dependencies (if not already installed)
pip install flask flask-cors redis torch torchvision diffusers pillow tqdm

# Start Redis server
redis-server
```

Create input directory structure:

```bash
mkdir -p test/images test/metadata
# Add at least 3 consecutive PNG images to test/images/
# Add corresponding JSON metadata files to test/metadata/
```

Example input structure:
test/
├── images/
│   ├── frame_t0.png     # t-20 minutes
│   ├── frame_t1.png     # t-10 minutes
│   └── frame_t2.png     # t-0 minutes (current)
└── metadata/
    ├── frame_t0.json
    ├── frame_t1.json
    └── frame_t2.json
```bash
python auth_service.py
```

Server will start on http://localhost:5000
Open browser to: http://localhost:5000
- Enter email → Click "Send OTP"
- Check email for 6-digit code
- Enter OTP → Click "Verify"
- Configure paths in dashboard:
  - Input Directory: `./test`
  - Model Path: `./exp/best_model.pt`
  - Output Directory: `./predictions`
- Click "Start Prediction"
- Wait for prediction to complete (shown in status)
- View results in image gallery below
- RGB format, any resolution (will be resized to 512x512)
- Consecutive time series (10-minute intervals)
- Named chronologically (sorted alphabetically)
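Since frames are ordered by alphabetical sort of their filenames, collecting a context window can be sketched as below (the helper name `load_frame_sequence` is illustrative; the real grouping lives in `load_input_data()` in `predict_b13_diffusion.py`):

```python
import os

def load_frame_sequence(images_dir, context=3):
    """Collect PNG frames sorted alphabetically (i.e. chronologically,
    per the naming convention above) and check there is enough context."""
    frames = sorted(f for f in os.listdir(images_dir) if f.endswith(".png"))
    if len(frames) < context:
        raise ValueError(f"Need at least {context} frames, found {len(frames)}")
    # The model conditions on the last `context` consecutive frames
    return frames[-context:]
```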
Each JSON file should contain:
```json
{
  "observation_time_utc": "2025-10-12T00:00:00Z",
  "min_lat": 20.0,
  "max_lat": 50.0,
  "min_lon": 120.0,
  "max_lon": 150.0,
  "segment_index": 1,
  "satellite": "Himawari-8",
  "enhanced": false,
  "composite_bands": ["B13"]
}
```

predictions/
├── images/
│   └── future_from_frame_t0_to_frame_t2.png     # Generated future frame
└── metadata/
    └── future_from_frame_t0_to_frame_t2.json    # With generation info
Output metadata includes:
- All input metadata fields
- `_generation` object with:
  - Model name and version
  - Generation timestamp
  - Inference parameters (steps, guidance)
  - Context frame references
  - Temporal context times
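A sketch of how the output metadata could be assembled; the `_generation` field names here are illustrative, so check a file under `predictions/metadata/` for the actual schema:

```python
from datetime import datetime, timezone

def build_output_metadata(input_meta, context_files, steps=50, guidance=1.0):
    """Merge all input metadata fields with a `_generation` record."""
    meta = dict(input_meta)  # input metadata fields pass through unchanged
    meta["_generation"] = {
        # Hypothetical keys -- the real schema may differ
        "model": "ConditionalLatentDiffusion",
        "generated_at_utc": datetime.now(timezone.utc).isoformat(),
        "inference_steps": steps,
        "guidance_scale": guidance,
        "context_frames": context_files,
    }
    return meta
```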
- `ConditionalLatentDiffusion` (must match training script exactly)
- VAE: `stabilityai/sd-vae-ft-mse` (frozen)
- Conditioning: 14-dim metadata → 768-dim embedding
  - Temporal: 6-dim (hour, day, month - cyclical)
  - Spatial: 4-dim (lat/lon bounds)
  - Metadata: 4-dim (segment, satellite, enhanced, composite)
- U-Net: 4-channel latent space with cross-attention
- Scheduler: DDIM (50 steps default)
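The 14-dim conditioning split (6 temporal + 4 spatial + 4 metadata) can be illustrated as below. The exact feature ordering and normalization are assumptions here; the training script is the source of truth and must be matched exactly.

```python
import math
from datetime import datetime

def conditioning_vector(meta):
    """Build a 14-dim conditioning vector from one metadata JSON record."""
    t = datetime.strptime(meta["observation_time_utc"], "%Y-%m-%dT%H:%M:%SZ")
    # Temporal: cyclical sin/cos encoding of hour, day, month (6 dims)
    temporal = []
    for value, period in ((t.hour, 24), (t.day, 31), (t.month, 12)):
        angle = 2 * math.pi * value / period
        temporal += [math.sin(angle), math.cos(angle)]
    # Spatial: lat/lon bounds (4 dims)
    spatial = [meta["min_lat"], meta["max_lat"], meta["min_lon"], meta["max_lon"]]
    # Metadata: segment, satellite, enhanced, composite flags (4 dims)
    extra = [float(meta["segment_index"]),
             1.0 if meta["satellite"] == "Himawari-8" else 0.0,
             1.0 if meta["enhanced"] else 0.0,
             1.0 if "B13" in meta["composite_bands"] else 0.0]
    return temporal + spatial + extra  # 6 + 4 + 4 = 14 dims
```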
- Check model path: `./exp/best_model.pt`
- Ensure model was trained with `exp/train_b13_single_gpu.py`
- Verify path is relative to project root
- Check directory has `images/` and `metadata/` subdirectories
- Model must be trained with same architecture as `predict_b13_diffusion.py`
- Conditioning encoder: Linear(14, 256) → GELU → Linear(256, 512) → GELU → Linear(512, 768)
- No temporal context encoder in current version
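The conditioning encoder layer sizes listed above, reproduced as a PyTorch sketch (a plain `nn.Sequential` stand-in; the real class in the training script may wrap this differently):

```python
import torch
import torch.nn as nn

# Linear(14, 256) -> GELU -> Linear(256, 512) -> GELU -> Linear(512, 768)
encoder = nn.Sequential(
    nn.Linear(14, 256), nn.GELU(),
    nn.Linear(256, 512), nn.GELU(),
    nn.Linear(512, 768),
)

# A batch of 14-dim conditioning vectors maps to 768-dim embeddings
emb = encoder(torch.zeros(2, 14))
```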
- Default timeout: 10 minutes
- Reduce `--num-inference-steps` for faster generation (lower quality)
- Check GPU availability: `torch.cuda.is_available()`
- Check browser console for CORS errors
- Verify Flask server is running
- Check `/predictions/<path>` endpoint serves images correctly
- Generation Time: ~30-60 seconds per frame (GPU) / 5-10 minutes (CPU)
- Memory Usage: ~4GB GPU VRAM for inference
- Inference Steps: 50 (default) - reduce for speed, increase for quality
- Batch Processing: Processes multiple 3-frame sequences sequentially
- Real-time Progress: WebSocket for live inference progress
- Temporal Context Encoder: Use actual pixel data from previous frames (requires retraining)
- Batch Upload: Upload images directly via web interface
- Comparison View: Side-by-side comparison of input/output
- Animation: Create GIF/video from sequence of predictions
- Export Metadata: Download metadata as CSV for analysis
- Added `inputDir`, `modelPath`, `outputDir` input fields
- Implemented `handleDirectorySubmit()` to call `/predict` endpoint
- Created `displayResults()` to render image gallery
- Added responsive grid layout for results
- Container expands to 1400px when dashboard active
- Added `POST /predict` endpoint in `auth_service.py`
- Subprocess execution of `predict_b13_diffusion.py`
- Image serving via `/predictions/<path>` route
- Session validation for prediction requests
- 10-minute timeout with error handling
- Updated `load_input_data()` to group 3-frame sequences
- Modified `predict()` to use metadata-only conditioning
- Architecture matches training script exactly (no context encoder)
- Output includes temporal context metadata
- Satellite Data: Himawari-8 B13 thermal infrared
- Model: Conditional Latent Diffusion with VAE + U-Net
- Framework: PyTorch, Hugging Face Diffusers
- Frontend: Vanilla JavaScript (no frameworks)