This repository contains the write-up of our 7th place solution (team: luddite&MT). The Kaggle competition overview is here, and a short solution summary is here.
- Put `train.csv`, downloaded from Kaggle, in `data/input`.
- Refer to here to prepare the training images at H1520 x W912, and put them in the `data/input/train_images` directory.
- [Option] To use images with sigmoid windowing applied, refer to here to obtain the windowing information in advance. If that information is added to the dataframe from `train.csv` as new columns, the resulting file can replace the `train.csv` of the first step.
- [Option] To use the external dataset, refer to the VinDr webpage to download the corresponding images and annotation file. Put them in `data/input/external_data` and `data/input`, respectively.
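Before starting a long training run, it can help to confirm that the layout described above is in place. The following is a minimal sketch, not part of this repository: the function name `missing_inputs` and its `use_external` flag are hypothetical, and only the paths named in the steps above are checked (the annotation file is skipped because its file name is not stated here).

```python
from pathlib import Path


def missing_inputs(root, use_external=False):
    """Return the expected input paths (relative to `root`) that do not exist.

    Hypothetical helper: it only checks the locations named in the
    data-preparation steps, not the contents of the files.
    """
    root = Path(root)
    expected = [
        root / "data" / "input" / "train.csv",
        root / "data" / "input" / "train_images",
    ]
    if use_external:
        # Optional external images, only needed for the external-dataset configs.
        expected.append(root / "data" / "input" / "external_data")
    return [p.relative_to(root).as_posix() for p in expected if not p.exists()]


if __name__ == "__main__":
    # Run from the repository root; prints nothing when everything is in place.
    for path in missing_inputs(".", use_external=False):
        print(f"missing: {path}")
```

An empty result means the required paths exist; it does not verify image size or CSV columns.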
- Train using only the Kaggle dataset: `python -u src/train.py configs/config0.yaml`
- Run pseudo-labeling on the external dataset: `python -u src/external_pseudolabeling.py configs/config0.yaml`
- Change the config file according to your purpose:
  - Breast-level / external dataset / no windowing: `config1.yaml`
  - Laterality-level / external dataset / no windowing: `config2.yaml`
  - Breast-level / external dataset / windowing: `config3.yaml`
  - Laterality-level / external dataset / windowing: `config4.yaml`
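The choice of config file can also be scripted. The sketch below is an illustration only: `EXTERNAL_CONFIGS` and `build_command` are hypothetical names, the `configs/` prefix is assumed from the `config0.yaml` commands, and the script to run is left as a parameter because the list above does not say which script each config is meant for.

```python
import sys

# Mapping copied from the list above: (label level, windowing) -> config file.
# Assumption: config1-4 sit in the same `configs/` directory as config0.yaml.
EXTERNAL_CONFIGS = {
    ("breast", False): "configs/config1.yaml",
    ("laterality", False): "configs/config2.yaml",
    ("breast", True): "configs/config3.yaml",
    ("laterality", True): "configs/config4.yaml",
}


def build_command(script, level, windowing):
    """Build the argument list for one run, e.g. script="src/train.py"."""
    config = EXTERNAL_CONFIGS[(level, windowing)]
    # `-u` matches the unbuffered-output flag used in the commands above.
    return [sys.executable, "-u", script, config]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root.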
- To speed up inference, we compiled the PyTorch models with Torch-TensorRT in advance. The compilation did not work out of the box for the 'tf' type of EfficientNet because of its dynamic padding function, so I edited the source code and used the edited version. See here.
- The full inference code is available as a Kaggle notebook. Please see this.
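To illustrate why the padding in the 'tf' models is dynamic, here is a minimal sketch of the TensorFlow-style 'SAME' padding arithmetic. It assumes that the 'tf' type refers to TensorFlow-ported weights that use this padding rule. The helper names are hypothetical, and this is not the edit made to the source code linked above. The padding amount depends on the input size, so it is normally computed at forward time. With a fixed input size such as 1520 x 912, the same numbers could instead be computed once and used as constants.

```python
import math


def same_padding(size, kernel, stride, dilation=1):
    """Total padding along one axis for a TensorFlow-style 'SAME' convolution."""
    out = math.ceil(size / stride)  # 'SAME' output length
    return max((out - 1) * stride + (kernel - 1) * dilation + 1 - size, 0)


def split_padding(total):
    """Split the total into (before, after); any odd pixel goes at the end."""
    before = total // 2
    return before, total - before


if __name__ == "__main__":
    # Same kernel and stride, different input sizes: the padding changes.
    for size in (1520, 912, 1521):
        print(size, split_padding(same_padding(size, kernel=3, stride=2)))
```

For a 3x3 kernel with stride 2, the even sizes 1520 and 912 need asymmetric padding `(0, 1)`, while an odd size such as 1521 needs `(1, 1)`. This is why a single fixed symmetric padding value cannot reproduce the behaviour for every input size.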