I'm trying to train PLBert for Vietnamese (using a multilingual BERT-based model with the wiki-vi dataset), but the Vocab Loss has been 0.0 since the very first step. Is that okay? @yl4579
Step [19920/1000000], Loss: 0.33009, Vocab Loss: 0.00000, Token Loss: 0.27358
Step [19930/1000000], Loss: 0.34967, Vocab Loss: 0.00000, Token Loss: 0.21050
Step [19940/1000000], Loss: 0.31899, Vocab Loss: 0.00000, Token Loss: 0.27819
Step [19950/1000000], Loss: 0.31953, Vocab Loss: 0.00000, Token Loss: 0.27369
Step [19960/1000000], Loss: 0.32448, Vocab Loss: 0.00000, Token Loss: 0.37215