This repository was archived by the owner on Apr 19, 2026. It is now read-only.

Fine-tuned models perform relatively badly when using my own base-level pretrained model #111

@652994331

Hi guys, thank you for your ELECTRA models. Recently I used my own data to continue pretraining a base-level ELECTRA model (this one: https://github.com/ymcui/Chinese-ELECTRA). That model is a Chinese ELECTRA, so it has a different vocab.txt (the same as the BERT base Chinese vocab, 21128 lines). Here is what I did. First, I used build_pretraining_dataset.py (with the 21128-line vocab) to generate the tfrecords. Second, I added init_checkpoint support following #74. Third, I pretrained my own base-level Chinese ELECTRA model starting from Chinese-ELECTRA. The parameters I used were lr 2e-4, training steps 1000000, base model. The command line was like this: python3 run_pretraining.py --data-dir pretrain_chinese_model/ --model-name my_model --init_checkpoint pretrain_chinese_model/models/Chinese-Electra

The loss after 100000 steps was around 3.4. However, when I fine-tuned a classification model from the 100000-step checkpoint, the performance was much worse than with the original Chinese-ELECTRA. I am wondering why even 100000 steps of continued pretraining from Chinese-ELECTRA would lead to such bad performance. Did I make a mistake somewhere?
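One thing that may be worth ruling out in a setup like this is a vocabulary-size mismatch between vocab.txt and the size the model is configured with. The following is only a minimal sketch of such a sanity check, not part of the ELECTRA code: the helper names `count_vocab_entries` and `vocab_matches` are hypothetical, the demo file is a throwaway stand-in for the real vocab.txt, and the expected size (21128 in this post) has to come from whatever the pretraining and fine-tuning configs actually use.

```python
import tempfile
from pathlib import Path


def count_vocab_entries(vocab_path):
    """Count the non-empty lines (one token per line) in a BERT-style vocab file."""
    with open(vocab_path, encoding="utf-8") as f:
        return sum(1 for line in f if line.rstrip("\n"))


def vocab_matches(vocab_path, expected_size):
    """Return True when the vocab file holds exactly expected_size tokens.

    expected_size is an assumption supplied by the caller: it should be the
    vocabulary size the model configuration was given.
    """
    return count_vocab_entries(vocab_path) == expected_size


if __name__ == "__main__":
    # Demo on a tiny temporary vocab file (hypothetical stand-in for vocab.txt).
    with tempfile.TemporaryDirectory() as tmp:
        demo_vocab = Path(tmp) / "vocab.txt"
        demo_vocab.write_text("[PAD]\n[UNK]\n[CLS]\n[SEP]\n", encoding="utf-8")
        print("entries:", count_vocab_entries(demo_vocab))
        print("matches 4:", vocab_matches(demo_vocab, 4))
```

For the case described here, the check would be called as `vocab_matches("vocab.txt", 21128)`, with the path adjusted to wherever the Chinese vocab file lives.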
