This project uses Meta's Segment Anything Model (SAM) to automatically generate masks and remove backgrounds from images. It was designed for batch processing large image datasets with cloud storage integration (Yandex Disk).
The notebook NewMaskSelection.ipynb implements a complete pipeline for:
- Downloading images from Yandex Disk in batches
- Generating segmentation masks using SAM
- Selecting and refining the best mask for background removal
- Saving processed images with transparent backgrounds
- Uploading results back to cloud storage
- Uses SAM (Segment Anything Model) with the vit_l (Large ViT) architecture
- Pretrained weights: sam_vit_l_0b3195.pth
- Runs on CUDA-enabled GPUs for acceleration
- Employs SamAutomaticMaskGenerator for automatic mask generation
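A setup like the one listed above is typically loaded through the `segment_anything` package. The following is a minimal sketch, assuming the package and `torch` are installed and the checkpoint file sits in the working directory. The helper name `build_mask_generator` and the CPU fallback are illustrative assumptions, not taken from the notebook.

```python
def build_mask_generator(checkpoint="sam_vit_l_0b3195.pth", device=None):
    """Load SAM ViT-L and wrap it in an automatic mask generator.

    Hypothetical helper: the notebook may construct the model differently.
    """
    # Imported lazily so this definition can be loaded without torch/SAM installed.
    import torch
    from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

    if device is None:
        # Assumption: fall back to CPU when no CUDA device is available.
        device = "cuda" if torch.cuda.is_available() else "cpu"

    sam = sam_model_registry["vit_l"](checkpoint=checkpoint)
    sam.to(device=device)
    return SamAutomaticMaskGenerator(sam)


# Usage (requires the checkpoint file and an RGB uint8 image array):
# generator = build_mask_generator()
# masks = generator.generate(image)  # list of dicts: "segmentation", "area", "bbox", ...
```

Each returned mask is a dictionary, and the `area` and `bbox` entries (in `[x, y, w, h]` form) are what size-based selection can build on.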
- checkResolution(image, maxres) - Resizes images exceeding the maximum resolution to prevent memory issues
- sortMasksByArea(masks) - Sorts masks by area (largest first)
- sortMasksByPerimeter(masks) - Sorts masks by bounding box perimeter
- selectAndDeleteMask(image, masks, ...) - Main function that:
  - Selects the primary mask based on size criteria
  - Handles edge cases where SAM separates background into layers
  - Applies edge detection to preserve image boundaries
  - Generates final transparency mask
- check_mask(mask, biggest_contour) - Validates that the mask covers sufficient image area
- checking_lines(mask, thickness, available) - Detects whether the mask has edge artifacts by checking border regions
- get_biggest_contour(contours) - Extracts largest contour for validation
- SaveAllMasks(image, masks, path, ext) - Saves top 8 masks for manual review
- make_all_calcs_and_save(...) - Complete processing pipeline for single image
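To make the resizing and sorting helpers concrete, here is a rough sketch that operates on SAM-style mask dictionaries. The function names below are hypothetical stand-ins for the notebook's own, and two details are assumptions: that `maxres` caps the longer image side, and that perimeter sorting is largest first.

```python
def scale_to_max_resolution(height, width, maxres):
    """Return (new_height, new_width) so that the longer side is at most maxres.

    Assumption: maxres limits the longer side; the aspect ratio is preserved.
    """
    longest = max(height, width)
    if longest <= maxres:
        return height, width  # already small enough, leave untouched
    scale = maxres / longest
    return max(1, round(height * scale)), max(1, round(width * scale))


def sort_masks_by_area(masks):
    """Sort SAM mask dicts by their 'area' entry, largest first."""
    return sorted(masks, key=lambda m: m["area"], reverse=True)


def sort_masks_by_perimeter(masks):
    """Sort SAM mask dicts by bounding box perimeter (assumed largest first).

    SAM reports 'bbox' as [x, y, width, height].
    """
    def perimeter(mask):
        _, _, box_w, box_h = mask["bbox"]
        return 2 * (box_w + box_h)

    return sorted(masks, key=perimeter, reverse=True)
```

The computed size would then be passed to an image resize call such as `cv2.resize`, which expects the target size as `(width, height)`.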
- getBatchAsync(batch_id, client, batch_size=100, n_parallel_requests=10) - Downloads batches of images from Yandex Disk
  - Parallel downloads with configurable concurrency
  - Creates local directory structure automatically
- recursive_upload(from_dir, to_dir, client, n_parallel_requests=5) - Recursively uploads processed results
  - Creates remote directories as needed
  - Parallel uploads for efficiency
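The "configurable concurrency" in both functions can be illustrated with a generic bounded-parallelism pattern built on `asyncio.Semaphore`. This is only a sketch of the pattern, not the notebook's code: `run_bounded` and `fake_download` are hypothetical names, and the real transfer call on the Yandex Disk client would take the place of `fake_download`.

```python
import asyncio


async def run_bounded(jobs, worker, n_parallel_requests=10):
    """Run worker(job) for every job, with at most n_parallel_requests in flight."""
    semaphore = asyncio.Semaphore(n_parallel_requests)

    async def guarded(job):
        # The semaphore blocks here once the concurrency limit is reached.
        async with semaphore:
            return await worker(job)

    # gather returns results in the same order as the input jobs.
    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fake_download(remote_path):
    """Stand-in for a real download coroutine (assumption for illustration)."""
    await asyncio.sleep(0)
    return f"downloaded {remote_path}"


if __name__ == "__main__":
    names = ["a.png", "b.png", "c.png"]
    print(asyncio.run(run_bounded(names, fake_download, n_parallel_requests=2)))
```

A lower limit is gentler on the remote API, while a higher one helps when transferring many small files.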
┌─────────────────┐
│  Yandex Disk    │
│ (Source Images) │
└────────┬────────┘
         │ Download (async batch)
         ▼
┌─────────────────┐
│  Local Storage  │
│  E:/input/      │
└────────┬────────┘
         │ Process each image
         ▼
┌─────────────────┐
│  SAM Model      │
│  Generate Masks │
└────────┬────────┘
         │ Select & Refine
         ▼
┌─────────────────┐
│  Mask Selection │
│  & Validation   │
└────────┬────────┘
         │ Save Results
         ▼
┌─────────────────┐
│  Output Folders │
│  - res/         │ → Final images with transparent BG
│  - checking/    │ → Preview collages
│  - [name](id)/  │ → Individual mask variants
└────────┬────────┘
         │ Upload (async)
         ▼
┌─────────────────┐
│  Yandex Disk    │
│  (Results)      │
└─────────────────┘
For each batch (e.g., batch_0/):
batch_0/
├── image1.png/
│   ├── image1.png_mask.png      # Primary mask
│   └── _(1).png, _(2).png...    # Alternative masks (top 8)
├── image2.png/
│   └── ...
├── checking/
│   ├── image1.png.png           # Collage: result | mask | original
│   └── ...
└── res/
    ├── image1.png               # Final transparent image
    └── ...
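The layout above can be expressed as a small path-building helper. This is a sketch under assumptions: `output_paths` is a hypothetical name, and the alternative masks are assumed to be numbered 1 through 8 to match "top 8".

```python
from pathlib import PurePosixPath


def output_paths(batch_dir, image_name, n_alternatives=8):
    """Build the per-image output paths inside one batch directory.

    Hypothetical helper mirroring the tree above; POSIX-style paths are
    used so the result looks the same on every platform.
    """
    batch = PurePosixPath(batch_dir)
    mask_dir = batch / image_name  # folder named after the image file
    return {
        "primary_mask": mask_dir / f"{image_name}_mask.png",
        "alternatives": [
            mask_dir / f"_({i}).png" for i in range(1, n_alternatives + 1)
        ],
        "collage": batch / "checking" / f"{image_name}.png",
        "result": batch / "res" / image_name,
    }


paths = output_paths("batch_0", "image1.png")
print(paths["primary_mask"])  # batch_0/image1.png/image1.png_mask.png
```

Note that the collage keeps the original extension and appends `.png`, which is why `image1.png.png` appears under `checking/`.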
The notebook selects the final mask with the following heuristic:
- Primary Check: If largest mask covers ≥90% of image dimensions → use as foreground
- Alternative: If not, combine all masks and invert (assumes background was segmented)
- Edge Refinement:
  - Apply Gaussian blur to smooth mask edges
  - Check border regions (5-12px thickness)
  - If borders contain too much transparency → restore edge pixels
- Validation: Verify mask contour spans full image height/width
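The primary check, the inverted fallback, and the border check can be sketched with NumPy as follows. This is a simplified illustration, not the notebook's `selectAndDeleteMask`. The function names are hypothetical, "covers ≥90% of image dimensions" is interpreted as the mask's bounding box spanning at least 90% of both width and height, and the Gaussian blur and edge restoration steps are left out.

```python
import numpy as np


def select_foreground(masks, image_shape, min_cover=0.9):
    """Pick a boolean foreground mask from SAM-style mask dicts.

    masks must be sorted largest first. The 90% test on the bounding box
    is an assumed reading of the primary check.
    """
    height, width = image_shape[:2]
    largest = masks[0]
    _, _, box_w, box_h = largest["bbox"]  # SAM bbox is [x, y, w, h]
    if box_w >= min_cover * width and box_h >= min_cover * height:
        return largest["segmentation"].astype(bool)
    # Fallback: assume SAM segmented the background, so everything
    # NOT covered by any mask is treated as foreground.
    combined = np.logical_or.reduce([m["segmentation"] for m in masks])
    return ~combined.astype(bool)


def border_transparency(foreground, thickness=5):
    """Fraction of background (transparent) pixels within the border frame."""
    border = np.ones(foreground.shape, dtype=bool)
    border[thickness:-thickness, thickness:-thickness] = False
    return float(1.0 - foreground[border].mean())
```

A high `border_transparency` value for an image whose subject is expected to reach the frame would signal that edge pixels need to be restored.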
From execution logs:
- ~1500+ images processed in Google Colab initially
- Full reprocessing done locally for better reliability
- Processing time: ~2-5 seconds per batch (100 images)
- Total processing: ~5 minutes for 150+ batches
torch (2.2.2+cu121)
opencv-python (cv2)
numpy
matplotlib
yadisk (Yandex Disk API)
asyncio
- GPU: CUDA-enabled (tested with torch 2.2.2+cu121)
- RAM: Sufficient for large image batches
- Storage: Local directories for input/output caching
sam_vit_l_0b3195.pth - SAM ViT-Large weights
- Install dependencies: pip install torch opencv-python numpy matplotlib yadisk
- Download SAM checkpoint: sam_vit_l_0b3195.pth
- Configure paths in notebook:
  - input_path - Local output directory
  - input_dir - Local input directory
  - download_path - Yandex Disk source path
  - upload_path - Yandex Disk result path
Run notebook cells sequentially:
- Import libraries
- Define async cloud functions
- Define mask processing functions
- Load SAM model
- Validate batches (check for errors/missing files)
- Process new images
- Upload results
The notebook tracks three error categories:
- list_of_inv_bg - Images with invalid background masks (1432 found in sample)
- list_of_missed_res_check - Masks generated but results missing (0 in sample)
- list_of_missed_masks - Mask generation failed (0 in sample)
Common issues identified:
- .psd files or 0-byte files mislabeled as images
- SAM sometimes segments background instead of foreground
- Edge artifacts in masks require post-processing
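Mislabeled or empty inputs like those in the first item can be caught before they reach OpenCV by inspecting the file size and leading bytes. The sketch below is not from the notebook, and `sniff_image_file` is a hypothetical helper. It relies on the well-known file signatures for PNG, JPEG, and PSD (`8BPS`).

```python
from pathlib import Path

# Leading "magic" bytes of the formats of interest.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"8BPS": "psd",
}


def sniff_image_file(path):
    """Return 'empty', a detected format name, or 'unknown' for a file."""
    path = Path(path)
    if path.stat().st_size == 0:
        return "empty"  # 0-byte file, nothing to decode
    with path.open("rb") as handle:
        header = handle.read(8)  # longest signature above is 8 bytes
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"
```

A file named `photo.png` for which this returns `psd` or `empty` can be skipped and logged instead of being passed to the mask pipeline.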
Preview of background removal: https://disk.yandex.ru/d/L8SmEplUzZMJpg
- Originally designed for Google Colab with dynamic Yandex Disk integration
- Moved to local execution for better stability and performance
- OpenCV errors led to discovery of corrupted input files
- Increased parallel request count for efficient small file uploads
- Multiple reprocessing runs to refine mask selection algorithm
This project uses Meta's Segment Anything Model. See SAM license for usage terms.