Dear author,
Thanks for your work.
Question 1:
I am quite confused by your definition of model_old. I thought model_old should be the model from the previous task, so the number of heads of model_old in task 1 should always be 1. However, during def eval, "the number of heads of model_old in task 1" is sometimes 1 (expected) and sometimes 2 (unexpected).
I think this is caused by the arrangement of def train, which includes both train_loop and post_train_process.
When .train and .eval are called sequentially, post_train_process makes the number of heads of model_old in task 1 become 2 inside the subsequent .eval call.
For example, in def search_tradeoff (gridsearch.py) you call .train and then .eval, and in incremental_learning.py you also call .train and then .eval.
By setting num_epochs=1, I obtained the logs below, which make my confusion concrete.
Can I confirm my understanding with you:
- when def eval is called after post_train_process, model_old = model;
- when def eval is called before post_train_process, model_old is from the last task and model is for the current task.
Can you help me understand this point? What exactly does model_old refer to in these different cases?
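To make my understanding concrete, here is a minimal sketch of the flow I suspect (the class and method names mirror the logs above; this is my guess, not the actual code, and I use plain integers as a stand-in for the networks):

```python
from copy import deepcopy

class Approach:
    """Hypothetical sketch of the train/eval flow I am asking about."""

    def __init__(self):
        self.model_heads = 1          # heads of the current model
        self.model_old_heads = None   # no old model exists before the first task

    def add_head(self):
        # A new head is added for each new task before training on it.
        self.model_heads += 1

    def train(self):
        self.train_loop()             # eval inside here still sees last task's model_old
        self.post_train_process()     # afterwards model_old is replaced by a copy of model

    def train_loop(self):
        pass  # training epochs; validation eval happens in here

    def post_train_process(self):
        # model_old becomes a frozen copy of the *current* model, so from
        # this point on it has the same number of heads as model.
        self.model_old_heads = deepcopy(self.model_heads)

# Reproducing the head counts from the task-1 log below:
appr = Approach()
appr.model_old_heads = 1   # model saved after task 0 has one head
appr.add_head()            # task 1 adds a second head to the current model
appr.train_loop()
print(appr.model_old_heads, appr.model_heads)  # eval here sees 1 and 2
appr.post_train_process()
print(appr.model_old_heads, appr.model_heads)  # eval here sees 2 and 2
```

If this sketch matches your design, the two head counts I observed would both be intended, just taken at different points of the task boundary.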
Question 2:
The following values are from the Task-Aware incremental performance. Surprisingly, as the number of tasks increases, the performance on the first task increases instead of degrading (see the first column, 81.5% rising to 84.2%). Can you kindly explain the reason?
Task Incremental Acc
81.5% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% Avg.: 81.5%
81.3% 48.5% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% Avg.: 64.9%
81.6% 52.9% 75.1% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% Avg.: 69.9%
83.0% 54.6% 76.5% 68.4% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% Avg.: 70.6%
83.5% 55.9% 77.9% 68.3% 72.4% 0.0% 0.0% 0.0% 0.0% 0.0% Avg.: 71.6%
84.2% 58.3% 78.4% 70.2% 73.7% 66.2% 0.0% 0.0% 0.0% 0.0% Avg.: 71.8%
84.0% 59.9% 79.3% 70.4% 77.1% 66.4% 71.5% 0.0% 0.0% 0.0% Avg.: 72.7%
83.8% 60.1% 79.8% 70.6% 77.1% 66.3% 69.5% 69.7% 0.0% 0.0% Avg.: 72.1%
83.9% 60.0% 80.1% 70.8% 78.1% 67.1% 70.0% 69.0% 66.0% 0.0% Avg.: 71.7%
83.9% 59.6% 80.3% 71.2% 78.1% 68.3% 71.1% 70.5% 64.6% 64.2% Avg.: 71.2%
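As a sanity check, the Avg. column appears to be the mean over the tasks seen so far, with the 0.0% placeholders for unseen tasks excluded. A quick verification against the table above:

```python
# Rows of the task-aware accuracy matrix, keeping only the tasks seen so far
# (the trailing 0.0% entries are placeholders for unseen tasks).
rows = [
    [81.5],
    [81.3, 48.5],
    [81.6, 52.9, 75.1],
    [83.0, 54.6, 76.5, 68.4],
    [83.5, 55.9, 77.9, 68.3, 72.4],
    [84.2, 58.3, 78.4, 70.2, 73.7, 66.2],
    [84.0, 59.9, 79.3, 70.4, 77.1, 66.4, 71.5],
    [83.8, 60.1, 79.8, 70.6, 77.1, 66.3, 69.5, 69.7],
    [83.9, 60.0, 80.1, 70.8, 78.1, 67.1, 70.0, 69.0, 66.0],
    [83.9, 59.6, 80.3, 71.2, 78.1, 68.3, 71.1, 70.5, 64.6, 64.2],
]
reported = [81.5, 64.9, 69.9, 70.6, 71.6, 71.8, 72.7, 72.1, 71.7, 71.2]

for row, avg in zip(rows, reported):
    # The Avg. column is the mean over seen tasks only.
    assert round(sum(row) / len(row), 1) == avg
```

So the averages are internally consistent; my question is only about the first column growing over time.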
Thanks.
Mengya Xu
Task 1
LR GridSearch
| Epoch 1, time= 1.8s | Train: skip eval | Valid: time= 0.4s loss=1.969, TAw acc= 29.6% | *
Current best LR: 0.05
| Epoch 1, time= 1.9s | Train: skip eval | Valid: time= 0.4s loss=2.072, TAw acc= 25.6% | *
Current best LR: 0.05
| Epoch 1, time= 1.8s | Train: skip eval | Valid: time= 0.4s loss=2.115, TAw acc= 23.4% | *
Current best LR: 0.05
Current best acc: 29.6
Trade-off GridSearch
inside def train_epoch
num_model_old_heads 1
num_model_heads 2
| Epoch 1, time= 2.8s | Train: skip eval |
inside def eval
num_model_old_heads 1
num_model_heads 2
(this def eval is called after def train_epoch, i.e., before post_train_process.)
Valid: time= 0.4s loss=22.879, TAw acc= 19.2% | *
| Selected 2000 train exemplars, time= 11.8s
inside def eval
num_model_old_heads 2
num_model_heads 2
(this def eval is called after def train, i.e., after post_train_process.)
Current acc: 0.214 for lamb=4
inside def train_epoch
num_model_old_heads 1
num_model_heads 2
| Epoch 1, time= 2.9s | Train: skip eval |
inside def eval
num_model_old_heads 1
num_model_heads 2
Valid: time= 0.4s loss=12.227, TAw acc= 22.4% | *
| Selected 2000 train exemplars, time= 11.7s
inside def eval
num_model_old_heads 2
num_model_heads 2
Current acc: 0.236 for lamb=2.0
inside def train_epoch
num_model_old_heads 1
num_model_heads 2
| Epoch 1, time= 2.8s | Train: skip eval |
inside def eval
num_model_old_heads 1
num_model_heads 2
Valid: time= 0.4s loss=7.281, TAw acc= 23.8% | *
| Selected 2000 train exemplars, time= 11.7s
inside def eval
num_model_old_heads 2
num_model_heads 2
Current acc: 0.274 for lamb=1.0
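(For anyone reading along: the trade-off search above seems to halve lamb while validation accuracy keeps improving, 4 → 2.0 → 1.0 with acc 0.214 → 0.236 → 0.274. A hedged sketch of such a loop, which may differ from the real search_tradeoff in gridsearch.py; the lamb=0.5 accuracy below is an assumed value I added to make the search stop:)

```python
def search_tradeoff(eval_fn, lamb=4.0):
    """Hypothetical sketch: halve lamb while validation accuracy improves.
    eval_fn(lamb) stands in for one train+eval cycle at that trade-off value."""
    best_lamb, best_acc = lamb, eval_fn(lamb)
    while True:
        candidate = best_lamb / 2
        acc = eval_fn(candidate)
        if acc <= best_acc:       # accuracy stopped improving: keep the previous lamb
            break
        best_lamb, best_acc = candidate, acc
    return best_lamb, best_acc

# Accuracies reported in the log above, plus an assumed drop at lamb=0.5:
accs = {4.0: 0.214, 2.0: 0.236, 1.0: 0.274, 0.5: 0.250}
best = search_tradeoff(lambda l: accs[l], lamb=4.0)
print(best)  # (1.0, 0.274), matching the last "Current acc" line above
```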
Train
inside def train_epoch
num_model_old_heads 1
num_model_heads 2
| Epoch 1, time= 2.9s | Train: skip eval |
inside def eval
num_model_old_heads 1
num_model_heads 2
Valid: time= 0.4s loss=7.291, TAw acc= 25.0% | *
| Selected 2000 train exemplars, time= 11.7s
Test on task 0 : loss=3.747 | TAw acc= 31.0%, forg= -5.6%| TAg acc= 22.5%, forg= 2.9% <<<
inside def eval
num_model_old_heads 2
num_model_heads 2
Test on task 1 : loss=7.030 | TAw acc= 25.8%, forg= 0.0%| TAg acc= 14.1%, forg= 0.0% <<<
Save at results_test/cifar100_icarl_icarl_fixd_5
Task 2