Paper: Pan-microscope image segmentation based on a single training set
The code was written in Python 3.9 and tested on Windows 11 and RedHat Linux server 7.7.
Minimum Requirements:
- Processor: AMD Ryzen 5 5600H or equivalent
- Memory (RAM): 16GB
- Storage: 100 GB available space
- Graphics: NVIDIA GeForce RTX 3060, 6GB VRAM
Recommended Requirements (the ones we used for training on full datasets specified in the manuscript):
- Processor: 2 Xeon-Gold processors running at 2.1 GHz, with 20 cores each
- Memory (RAM): 192 GB of DDR4 RAM
- Storage: 3.2 TB NVMe local drive
- Graphics: 2 NVIDIA V100 PCIe 32 GB GPUs (2×7TFLOPS)
Installation time is less than 10 minutes.
- Clone the repository or download it directly from the GitHub webpage:

      git clone https://github.com/rahi-lab/YeaZ-micromap

- Create a virtual environment:

      conda create -n YeaZ-micromap python=3.9

- Activate the virtual environment:

      conda activate YeaZ-micromap

- Install PyTorch:

      pip install torch==2.0.0+cu117 torchvision==0.15.1+cu117 --index-url https://download.pytorch.org/whl/cu117

- Navigate to the folder where you cloned the YeaZ-micromap repository and install the required packages:

      pip install -r requirements.txt
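After installation, it can be useful to confirm that PyTorch imports and whether it sees a CUDA device. The following is a minimal sketch, not part of the repository; the helper name `describe_torch_install` is hypothetical, and it only uses `torch.__version__` and `torch.cuda.is_available()`.

```python
def describe_torch_install():
    """Return a small report on the PyTorch installation without raising."""
    try:
        import torch  # installed in the "Install PyTorch" step
    except ImportError:
        # PyTorch is not importable in the active environment.
        return {"installed": False, "version": None, "cuda_available": False}
    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }
```

Calling `print(describe_torch_install())` inside the activated YeaZ-micromap environment shows whether training would run on the GPU or would need `--gpu_ids -1`.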
The code is run from the command line and is split into two parts: (i) training of the microscopy style transfer using CycleGAN, and (ii) evaluation of the training by segmenting the mapped images with a pre-trained YeaZ network, followed by style translation using the weights from the best epoch.
More specifically:
- train_cyclegan.py script performs:
  - style transfer training between the images in the trainA and trainB folders
- evaluate.py script performs:
  - style transfer on source dataset images in one of the specified folders (testA or testB) using the pretrained CycleGAN weights from specified training epochs
  - segmentation on the style-transferred images using the pretrained YeaZ weights
  - evaluation of segmentation quality based on the segmented images and GT masks
- predict.py script performs:
  - style transfer on source dataset images in one of the specified folders (testA or testB) using the pretrained CycleGAN
  - segmentation on the style-transferred images using the pretrained YeaZ weights
All three scripts rely on the following input data structure:

    input_data
    ├── trainA
    │   ├── A1.png
    │   └── ...
    ├── trainB
    │   ├── B1.png
    │   └── ...
    ├── testA
    │   ├── A2.png
    │   └── ...
    ├── testB
    │   ├── B2.png
    │   └── ...
    ├── testA_masks
    │   ├── A2_mask.h5
    │   └── ...
    └── testB_masks
        ├── B2_mask.h5
        └── ...

Depending on the usage, some of the folders can be empty:
- testA(_masks) and testB(_masks) can be empty during CycleGAN training
- trainA and trainB can be empty during the evaluation step
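The folder layout can be created and checked with a few lines of Python. This is a minimal sketch using only the folder names listed above; the helper names `create_layout` and `empty_or_missing` are hypothetical and not part of the repository.

```python
from pathlib import Path

# Folder names taken from the input_data layout described above.
TRAIN_DIRS = ("trainA", "trainB")
TEST_DIRS = ("testA", "testB", "testA_masks", "testB_masks")


def create_layout(root):
    """Create every expected sub-folder under `root` (existing ones are kept)."""
    root = Path(root)
    for name in TRAIN_DIRS + TEST_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


def empty_or_missing(root, names):
    """Return the folders in `names` that are missing or contain no entries."""
    root = Path(root)
    problems = []
    for name in names:
        folder = root / name
        if not folder.is_dir() or not any(folder.iterdir()):
            problems.append(name)
    return problems
```

Before training, `empty_or_missing(root, TRAIN_DIRS)` should return an empty list; before evaluation, the same check can be applied to the test folders of the domain being evaluated.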
Additionally, the helper script, preprocess.py, prepares raw input data of variable sizes and contents for style transfer training.
Training script arguments follow the established options nomenclature from the original CycleGAN repository (https://github.com/taesungp/contrastive-unpaired-translation). For more details, see the argument tables below and the comments in the code.
To preprocess the raw images into patches for style transfer training, use the following command on both source and target datasets:
$ python preprocess.py --src_path INPUT_FOLDER --dst_path OUTPUT_FOLDER

Replace INPUT_FOLDER and OUTPUT_FOLDER with the actual paths.

| Argument | Description | Default Value |
|---|---|---|
| --src_path | Path to the folder containing the images. | - |
| --dst_path | Path to the folder where patches will be saved. | - |
| --var_thr | Empirical variance threshold for detecting and filtering out empty patches. | 500000 |
| --scale_factor | Factor to scale the brightness of input patches (this only serves easier visualization in visdom if some images are too dark). | 1.0 |
| --patch_size | Size of the output square patches. | 256 |
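The roles of --patch_size and --var_thr can be illustrated with a minimal sketch. It is not the code of preprocess.py: the function name `extract_patches`, the non-overlapping tiling, the dropping of incomplete edge tiles, and the use of the population variance are all assumptions made for illustration, and plain Python lists stand in for image arrays. Brightness scaling (--scale_factor) is not modeled.

```python
from statistics import pvariance


def extract_patches(image, patch_size=256, var_thr=500000):
    """Tile a 2D image (list of rows) and keep patches with enough contrast.

    Assumptions (not confirmed by the repository): tiles do not overlap,
    incomplete tiles at the right and bottom edges are dropped, and a patch
    counts as empty when the variance of its pixels is below `var_thr`.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    patches = []
    for y in range(0, height - patch_size + 1, patch_size):
        for x in range(0, width - patch_size + 1, patch_size):
            patch = [row[x:x + patch_size] for row in image[y:y + patch_size]]
            pixels = [value for row in patch for value in row]
            if pvariance(pixels) >= var_thr:
                patches.append(patch)
    return patches
```

Under these assumptions, a uniformly dark background tile has a variance near zero and is discarded, whereas a tile containing cells has a much higher variance and is kept.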
Before training, start visdom:

$ python -m visdom.server

Visdom is a visualization tool that communicates with the CycleGAN code during training and saves one example of the mapping per epoch. This is useful for quickly checking whether the mapping makes sense qualitatively. The saved data can later be accessed through an HTML interface at Checkpoint/Experiment_Name/web/index.html. When running on a server, this step can be skipped.
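Whether the visdom server is reachable can be checked before launching a long training run. The sketch below only tests that a TCP connection can be opened; the helper name `server_is_listening` is hypothetical, and port 8097 is assumed because it is the address used in the demo (http://localhost:8097/).

```python
import socket


def server_is_listening(host="localhost", port=8097, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or the host could not be resolved.
        return False
```

A `False` result suggests that visdom has not been started or is listening on a different port.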
To initiate the training process, execute the following command:
$ python train_cyclegan.py \
--dataroot INPUT_DATA_FOLDER \
--checkpoints_dir GENERAL_CYCLE_GAN_TRAINING_FOLDER \
--name NAME_OF_SPECIFIC_CYCLEGAN_TRAINING \
--grid_lambdas_A L1 L2 \
--grid_lambdas_B L1

Replace the placeholders with the actual values for your training run.
| Argument | Description | Default Value |
|---|---|---|
| --dataroot INPUT_DATA_FOLDER | Directory containing training images. | - |
| --checkpoints_dir GENERAL_CYCLE_GAN_TRAINING_FOLDER | Directory to save trained models. Models are saved after each epoch by default. | - |
| --name NAME_OF_SPECIFIC_CYCLEGAN_TRAINING | Experiment name for future reference. | - |
| --grid_lambdas_A L1 L2 ... | Cycle consistency loss weights for A->B->A mapping. | 10 |
| --grid_lambdas_B L1 L2 ... | Cycle consistency loss weights for B->A->B mapping. | 10 |
If multiple lambda values are specified, a grid search will be performed.
If no lambda values are specified, default values (10, 10) will be used.
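The grid search can be pictured as one training run per combination of the two lambda lists. The sketch below enumerates these combinations and the experiment names they might produce. The name suffix follows the pattern seen in the demo (`demo` becomes `demo_lambda_A_10.0_lambda_B_10.0`); whether train_cyclegan.py formats other values the same way is an assumption, and `grid_experiment_names` is a hypothetical helper.

```python
from itertools import product


def grid_experiment_names(name, lambdas_A=(10.0,), lambdas_B=(10.0,)):
    """List one experiment name per (lambda_A, lambda_B) combination."""
    return [
        # Suffix pattern inferred from the demo; treat it as an assumption.
        f"{name}_lambda_A_{float(a)}_lambda_B_{float(b)}"
        for a, b in product(lambdas_A, lambdas_B)
    ]
```

For example, `grid_experiment_names("demo", [1, 10], [5])` lists two runs, one for each value of lambda_A, and the resulting names are what would be passed as --name to evaluate.py and predict.py.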
| Argument | Description | Default Value |
|---|---|---|
| --model cycle_gan | Generative model for transferring images. | - |
| --gpu_ids GPU_ID | -1 for CPU, 0 for GPU0, 0 1 for GPU0 and GPU1. | 0 |
| --batch_size BATCH_SIZE | Batch size for training. | 1 |
| --n_epochs N_EPOCHS | Number of training epochs. | 200 |
| --n_epochs_decay N_EPOCHS_DECAY | Number of epochs over which the learning rate linearly decays to zero. | 200 |
| --lr LR | Initial learning rate for Adam optimizer. | 0.0002 |
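How --lr, --n_epochs, and --n_epochs_decay interact can be sketched as a linear decay schedule. The formula below follows the linear policy of the original CycleGAN code base; that this repository keeps it unchanged is an assumption, and the function name `learning_rate` and the `epoch_count` argument (the starting epoch index, 1 in the original code) are used here for illustration only.

```python
def learning_rate(epoch, lr=0.0002, n_epochs=200, n_epochs_decay=200, epoch_count=1):
    """Learning rate after `epoch` scheduler steps (0-based).

    The rate stays at `lr` for the first `n_epochs` epochs and then
    decreases linearly over the following `n_epochs_decay` epochs.
    """
    decay_progress = max(0, epoch + epoch_count - n_epochs) / float(n_epochs_decay + 1)
    return lr * (1.0 - decay_progress)
```

With the demo settings (`n_epochs=100`, `n_epochs_decay=100`), the rate is constant for the first 100 epochs and approaches zero by epoch 200, which is why the demo evaluates epochs up to 200.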
For evaluating the segmentation accuracy, the user provides the directory with checkpoint weights from the CycleGAN training (checkpoints_dir) and the YeaZ DNN weights of a network trained on the target dataset (path_to_yeaz_weights).
The remaining arguments specify the trained CycleGAN model (dataroot, name), the range of epochs to evaluate (min_epoch, max_epoch, epoch_step), and the YeaZ segmentation parameters (threshold, min_seed_dist). The input_data folder contains the mask of a small annotated patch of the test image for only one of the domains (corresponding to the target set). Optionally, a subpart (patch) of the full mask can be used for the evaluation instead of the whole mask; in that case, metrics_patch_borders must be supplied as an additional parameter. The resulting segmentation masks are saved in results_dir and the segmentation metrics in metrics_path.
To evaluate the style-transferred images and metrics, use the following command:
$ python evaluate.py \
--dataroot INPUT_DATA_FOLDER \
--checkpoints_dir GENERAL_CYCLE_GAN_TRAINING_FOLDER \
--name NAME_OF_SPECIFIC_CYCLEGAN_TRAINING \
--path_to_yeaz_weights PATH_TO_YEAZ_WEIGHTS \
--min_epoch MIN_EPOCH \
--max_epoch MAX_EPOCH \
--epoch_step EPOCH_STEP \
--results_dir RESULTS_FOLDER \
--metrics_path PATH_TO_METRICS_CSV_FILE

Replace the placeholders with the actual values for your evaluation run.
| Argument | Description | Default Value |
|---|---|---|
| --dataroot | Directory containing test images. | - |
| --checkpoints_dir | Directory with CycleGAN training checkpoints. | - |
| --name | Experiment name from CycleGAN training. | - |
| --path_to_yeaz_weights | Path to the pretrained YeaZ weights. | - |
| --min_epoch | First CycleGAN epoch for evaluation. | 1 |
| --max_epoch | Last CycleGAN epoch for evaluation. | 201 |
| --epoch_step | Evaluate every n-th epoch. | 5 |
| --results_dir | Output folder for style-transferred images and segmentation masks. | - |
| --metrics_path | Path to save evaluation metrics (AP). | - |
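To make the AP metric more concrete, the sketch below computes one commonly used variant for instance segmentation: AP = TP / (TP + FP + FN), where a predicted cell is a true positive if its intersection over union (IoU) with a ground-truth cell exceeds a threshold. Whether evaluate.py uses exactly this definition and threshold is an assumption; the manuscript is the authoritative source. The function names are hypothetical, and masks are plain 2D lists of integer labels with 0 as background.

```python
def label_pixels(mask):
    """Map each non-zero label in a 2D mask to its set of (row, col) pixels."""
    cells = {}
    for r, row in enumerate(mask):
        for c, label in enumerate(row):
            if label != 0:
                cells.setdefault(label, set()).add((r, c))
    return cells


def average_precision(gt_mask, pred_mask, iou_thr=0.5):
    """AP = TP / (TP + FP + FN) at a single IoU threshold (assumed definition).

    For iou_thr >= 0.5 each cell can match at most one partner, so the
    simple greedy matching below is sufficient.
    """
    gt_cells = label_pixels(gt_mask)
    pred_cells = label_pixels(pred_mask)
    matched_gt = set()
    true_pos = 0
    for pred in pred_cells.values():
        for gt_label, gt in gt_cells.items():
            if gt_label in matched_gt:
                continue
            iou = len(pred & gt) / len(pred | gt)
            if iou > iou_thr:
                matched_gt.add(gt_label)
                true_pos += 1
                break
    false_pos = len(pred_cells) - true_pos
    false_neg = len(gt_cells) - true_pos
    total = true_pos + false_pos + false_neg
    # Two empty masks agree perfectly, so report 1.0 in that case.
    return true_pos / total if total else 1.0
```

In this variant, a missed cell and a spurious cell both lower the score, so an epoch whose style transfer leads YeaZ to merge or hallucinate cells is penalized.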
| Argument | Description | Default Value |
|---|---|---|
| --original_domain A or B | Target dataset to use test sets from. | A |
| --skip_style_transfer | (flag) Skip style transfer if already performed. | - |
| --skip_segmentation | (flag) Skip segmentation if already performed. | - |
| --skip_metrics | (flag) Skip metrics if already evaluated. | - |
| --threshold | Threshold used during YeaZ prediction. | 0.5 |
| --min_seed_dist | Minimal seed distance between cells for YeaZ prediction. | 5 |
| --metrics_patch_borders Y0 Y1 X0 X1 | Metrics patch borders, e.g., 480 736 620 876. | - |
| --plot_metrics | (flag) Plot evaluation metrics. | - |
Make sure that the test set and the test set masks follow the input_data structure given above.
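The effect of --metrics_patch_borders Y0 Y1 X0 X1 can be sketched as a crop of the mask before the metrics are computed. The sketch assumes that the borders are half-open pixel ranges (Y0 and X0 included, Y1 and X1 excluded), which is consistent with the example 480 736 620 876 describing a 256 x 256 patch but is not confirmed by the repository. The helper name `crop_to_patch` is hypothetical.

```python
def crop_to_patch(mask, borders):
    """Crop a 2D mask (list of rows) to borders given as (Y0, Y1, X0, X1)."""
    y0, y1, x0, x1 = borders
    if not (0 <= y0 < y1 and 0 <= x0 < x1):
        raise ValueError("expected 0 <= Y0 < Y1 and 0 <= X0 < X1")
    # Rows are selected first (Y range), then columns within each row (X range).
    return [row[x0:x1] for row in mask[y0:y1]]
```

Applying the same crop to both the ground-truth mask and the predicted mask restricts the evaluation to the annotated region.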
To perform style mapping with the weights from a selected epoch, followed by segmentation, use the following command:
$ python predict.py \
--dataroot INPUT_DATA_FOLDER \
--checkpoints_dir GENERAL_CYCLE_GAN_TRAINING_FOLDER \
--name NAME_OF_SPECIFIC_CYCLEGAN_TRAINING \
--path_to_yeaz_weights PATH_TO_YEAZ_WEIGHTS \
--epoch EPOCH \
--results_dir RESULTS_FOLDER

| Argument | Description | Default Value |
|---|---|---|
| --dataroot | Directory containing unlabeled input images. | - |
| --checkpoints_dir | Directory with CycleGAN training checkpoints. | - |
| --name | Experiment name from CycleGAN training. | - |
| --path_to_yeaz_weights | Path to pretrained YeaZ weights. | - |
| --epoch | Epoch to use for style transfer. | - |
| --results_dir | Output folder for style-transferred images and segmentation masks. | - |
| Argument | Description | Default Value |
|---|---|---|
| --original_domain A or B | Source dataset to use for prediction. | A |
| --skip_style_transfer | (flag) Skip style transfer if already performed. | - |
| --skip_segmentation | (flag) Skip segmentation if already performed. | - |
| --threshold | Threshold used during YeaZ prediction. | 0.5 |
| --min_seed_dist | Minimal seed distance between cells for prediction. | 5 |
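When predict.py has to be called for several datasets or epochs, the command line can be assembled programmatically. The sketch below builds the argument list from the flags documented above; the helper names `predict_command` and `as_shell_line` are hypothetical, and the script itself is not executed here.

```python
import shlex


def predict_command(dataroot, checkpoints_dir, name, yeaz_weights, epoch,
                    results_dir, original_domain="A", use_cpu=False):
    """Build the argument list for one predict.py call."""
    command = [
        "python", "predict.py",
        "--dataroot", str(dataroot),
        "--checkpoints_dir", str(checkpoints_dir),
        "--name", name,
        "--path_to_yeaz_weights", str(yeaz_weights),
        "--epoch", str(epoch),
        "--results_dir", str(results_dir),
        "--original_domain", original_domain,
    ]
    if use_cpu:
        # -1 selects the CPU, as described for --gpu_ids.
        command += ["--gpu_ids", "-1"]
    return command


def as_shell_line(command):
    """Render the argument list as a single, safely quoted shell line."""
    return shlex.join(command)
```

The returned list can be passed to `subprocess.run(command, check=True)` from the repository folder, or printed with `as_shell_line` to inspect it first.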
The demo showcases the YeaZ-micromap capabilities: style transfer of yeast microscopy images, their segmentation in the source domain, and the evaluation criterion (average precision, AP) used to select the best style transfer epoch for the segmentation task. Note that the demo runs on much smaller datasets so that it can be tested on ordinary desktop PCs. For larger datasets, we recommend using scientific computing infrastructure (see the hardware requirements above).
Source domain, set A: Phase contrast
Target domain, set B: Brightfield
In this case, the YeaZ neural network was trained only on phase contrast images.
Demo time (training + evaluation): ~2 h
- Install YeaZ-micromap (see installation instructions above)
- Data download
  - Download the data from the following link: Data
  - Unpack the downloaded file and place its contents into the ./data/ folder
- Data preprocessing
  - Preprocess PhaseContrast images:

        $ python preprocess.py --src_path ./data/input_data/PhaseContrast_demo/ --dst_path ./data/input_data/trainA/ --scale_factor 10

  - Preprocess BrightField images:

        $ python preprocess.py --src_path ./data/input_data/BrightField_demo/ --dst_path ./data/input_data/trainB/

  - Preprocessed PhaseContrast and BrightField images can be found in the folders trainA and trainB, respectively (within the ./data/input_data/ folder)
- Style transfer training
  - Start visdom:

        $ python -m visdom.server

  - Run CycleGAN training:

        $ python train_cyclegan.py --dataroot ./data/input_data/ --name demo --checkpoints_dir ./data/checkpoints/ --gpu_ids 0 --n_epochs 100 --n_epochs_decay 100 --batch_size 1 --display_freq 1

  - Track the training progress via visdom at http://localhost:8097/
  - All weights will be stored at ./data/checkpoints
- Evaluate domain adaptation
  - Run evaluate script:

        $ python evaluate.py --dataroot ./data/input_data/ --checkpoints_dir ./data/checkpoints/ --name demo_lambda_A_10.0_lambda_B_10.0 --path_to_yeaz_weights ./data/input_data/YeaZ_weights/weights_budding_PhC_multilab_0_1 --max_epoch 200 --results_dir ./data/results_evaluate/ --metrics_path ./data/results_evaluate/metrics_lambda_A_10.0_lambda_B_10.0.csv --metrics_patch_borders 200 456 200 456 --plot_metrics --original_domain B

  - You can find the style transfer output at ./data/results_evaluate/demo_lambda_A_10.0_lambda_B_10.0/test_[EPOCH]/images/fake_A/wt_FOV9_PhC_absent.nd2_channel_10p.png by replacing the EPOCH placeholder.
  - You can find the segmentation masks generated from the style-transferred images at ./data/results_evaluate/demo_lambda_A_10.0_lambda_B_10.0/test_[EPOCH]/images/fake_A/wt_FOV9_PhC_absent.nd2_channel_10p_mask.h5 by replacing the EPOCH placeholder. You can use YeaZ-GUI (GitHub, Win app, Mac app) to visualize the masks.
  - Average precision (AP) metrics can be found in the ./data/results_evaluate/ folder, in the files metrics_lambda_A_10.0_lambda_B_10.0.csv and metrics_lambda_A_10.0_lambda_B_10.0.png.

  The expected output of YeaZ-micromap is shown in the figure below. Note that the output of the demo run on your computer might not be identical to the one shown here due to the stochastic training of the CycleGAN.
- Predict the style transfer and segmentation on all unlabeled BrightField data
  - Select the epoch with the best average precision (AP) from the previous step. We will use the CycleGAN weights from this epoch for style transfer of the whole unlabeled dataset. Replace the EPOCH placeholder in the call below with the selected epoch.
  - Run the predict script:

        $ python predict.py --dataroot ./data/input_data_all/ --checkpoints_dir ./data/checkpoints/ --name demo_lambda_A_10.0_lambda_B_10.0 --path_to_yeaz_weights ./data/input_data/YeaZ_weights/weights_budding_PhC_multilab_0_1 --epoch EPOCH --results_dir ./data/results_predict/ --original_domain B

    If you run out of GPU memory because of the image size, add the --gpu_ids -1 argument to use the CPU. Beware that this will increase the execution time.
  - Segmentation labels with the corresponding style-transferred images can be found at ./data/results_predict/images/fake_A
  - You can now use YeaZ GUI (GitHub, Win app, Mac app) to adjust and validate the generated labels.
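Selecting the epoch with the best AP in the prediction step can also be scripted from the metrics CSV file. The sketch below is a minimal example under a loudly stated assumption: the column names `epoch` and `AP` are hypothetical, since the header written by evaluate.py is not documented here, and must be adapted to the actual file. The helper name `best_epoch` is likewise hypothetical.

```python
import csv
import io


def best_epoch(csv_text, epoch_col="epoch", ap_col="AP"):
    """Return (epoch, AP) for the row with the highest AP.

    `epoch_col` and `ap_col` are hypothetical header names; adjust them to
    the header actually present in the metrics file.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("metrics file contains no rows")
    # On ties, the earliest row with the maximal AP is returned.
    best = max(rows, key=lambda row: float(row[ap_col]))
    return int(best[epoch_col]), float(best[ap_col])
```

For the demo, the text of ./data/results_evaluate/metrics_lambda_A_10.0_lambda_B_10.0.csv would be read with `open(...).read()` and passed to `best_epoch`; the returned epoch then replaces the EPOCH placeholder in the predict.py call.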

