- Develop a Visual Question Answering model that looks at an image and answers a given question about it
| κΆνμ | λ₯μ¬ν¬ | λ°μ’ ν | μ μ°¬μ½ | μ‘°μ |
$ pip install -r requirements.txt
$ python3 train_v1.py # version 1
$ python3 train_v2.py # version 2
$ python3 train_v3.py # version 3
$ python3 inference_v1.py # version 1
$ python3 inference_v2.py # version 2
$ python3 inference_v3.py # version 3
$ python3 ensemble.py # ensemble
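The contents of `ensemble.py` are not shown here, so the following is only a minimal sketch of one common way to combine the three versions' predictions: a per-question majority vote. The function name `majority_vote` and the tie-breaking rule (prefer the answer from the earliest model in the list) are assumptions made for illustration, not the repository's actual method.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model answer lists into one list by majority vote.

    predictions: list of lists, one inner list per model, all the same length.
    Ties are broken in favour of the earliest model in `predictions`
    (an assumption for this sketch).
    """
    if not predictions:
        raise ValueError("need at least one list of predictions")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all prediction lists must have the same length")

    ensembled = []
    for answers in zip(*predictions):  # answers for one question, one per model
        counts = Counter(answers)
        top = max(counts.values())
        # First answer (in model order) that reaches the top count wins.
        winner = next(a for a in answers if counts[a] == top)
        ensembled.append(winner)
    return ensembled
```

With three models, any answer that two of them agree on is selected; when all three disagree, the first model's answer is kept.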
$ fashion_reader
├── config
│   └── train_config_base.yaml
├── models
│   ├── get_model.py
│   └── vqa_model.py
├── modulus
│   ├── dataset.py
│   ├── earlystoppers.py
│   ├── recorders.py
│   ├── trainer.py
│   └── utils.py
├── results
├── train_v1.py
├── inference_v1.py
├── train_v2.py
├── inference_v2.py
├── train_v3.py
├── inference_v3.py
└── ensemble.py
$ fashion_reader
└── results
    ├── train_v1
    │   ├── loss.png
    │   ├── model.pt
    │   ├── answers.csv
    │   ├── score.jpg
    │   ├── train_config_base.yaml
    │   └── train_log.log
    ├── train_v2
    │   └── ...
    └── train_v3
        └── ...
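Given the results layout above, a small helper can collect the per-version `answers.csv` files before any post-processing. This is a hedged sketch: the helper name `answer_files` is hypothetical, and it assumes every version directory follows the `train_v1` layout, which the listing elides for `train_v2` and `train_v3`.

```python
from pathlib import Path


def answer_files(results_dir, versions=("train_v1", "train_v2", "train_v3")):
    """Return the existing answers.csv paths under results_dir, in version order.

    Versions whose answers.csv is missing are skipped rather than raising.
    """
    root = Path(results_dir)
    candidates = (root / version / "answers.csv" for version in versions)
    return [path for path in candidates if path.is_file()]
```

For example, `answer_files("results")` run from the repository root would list whichever of the three files have been produced so far.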
| Version | Pre-trained Model | Config |
|---|---|---|
| V1 | xlm-roberta-base & resnet50 | Link |
| V2 | xlm-roberta-large & resnet50 | Link |
| V3 | xlm-roberta-base & resnet152 | Link |