This repository was archived by the owner on Apr 19, 2026. It is now read-only.
Hi all, thank you for the ELECTRA models. Recently I used my own data to continue pretraining a base-size ELECTRA model (this one: https://github.com/ymcui/Chinese-ELECTRA). It is a Chinese ELECTRA, so it has a different vocab.txt (the same as the BERT-base Chinese vocab, 21128 lines). Here is what I did. First, I ran build_pretraining_dataset.py with the 21128-line vocab to generate the tfrecords. Second, I added init_checkpoint support as described in #74. Third, I continued pretraining my own base-size Chinese ELECTRA model starting from Chinese-ELECTRA, with learning rate 2e-4, 1000000 training steps, and the base model size. The command line was: python3 run_pretraining.py --data-dir pretrain_chinese_model/ --model-name my_model --init_checkpoint pretrain_chinese_model/models/Chinese-Electra
The loss after 100000 steps was around 3.4. However, when I fine-tuned a classification model from the 100000-step checkpoint, the performance was much worse than with the original Chinese-ELECTRA. Why would only 100000 steps of continued pretraining from Chinese-ELECTRA degrade performance this much? Did I make a mistake somewhere?
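For reference, the settings described above can be written down as an hparams dictionary whose vocab_size is derived from the vocab file rather than typed by hand. This is only a sketch under assumptions: the key names (model_size, vocab_file, vocab_size, learning_rate, num_train_steps) are assumed to match the ones in the repository's configure_pretraining.py, the helper functions are hypothetical, and the toy vocab file stands in for the real 21128-line vocab.txt. The --init_checkpoint flag is left out because it comes from the modification discussed in #74.

```python
import json
import tempfile
from pathlib import Path


def count_vocab_entries(vocab_path):
    """Count the lines in a BERT-style vocab.txt (one token per line)."""
    with open(vocab_path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def build_hparams(vocab_path, model_size="base", learning_rate=2e-4,
                  num_train_steps=1000000):
    """Hypothetical helper: build an hparams dict with a matching vocab_size.

    The key names are assumed to follow configure_pretraining.py.
    """
    return {
        "model_size": model_size,
        "vocab_file": str(vocab_path),
        "vocab_size": count_vocab_entries(vocab_path),
        "learning_rate": learning_rate,
        "num_train_steps": num_train_steps,
    }


if __name__ == "__main__":
    # Toy vocab standing in for the real 21128-line file.
    with tempfile.TemporaryDirectory() as tmp:
        toy_vocab = Path(tmp) / "vocab.txt"
        toy_vocab.write_text("[PAD]\n[UNK]\n[CLS]\n[SEP]\n[MASK]\n",
                             encoding="utf-8")
        print(json.dumps(build_hparams(toy_vocab), indent=2))
```

The resulting JSON could then be passed through the --hparams option of run_pretraining.py. Deriving vocab_size from the file is one way to rule out a mismatch between the configured vocabulary size and the 21128-entry embedding table of the checkpoint being loaded; whether that is the cause of the degraded performance here is not established.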