pytorch_pretrained_bert vs transformers #16

@goodmami

The README links to https://github.com/huggingface/pytorch-pretrained-BERT, but this redirects to https://github.com/huggingface/transformers, and I think the former is deprecated. The pytorch-pretrained-bert package still exists on PyPI, but I installed transformers instead. Now I'm getting ModuleNotFoundError: No module named 'pytorch_pretrained_bert'. In line 12 (the import line, shown below in its original form), I simply replaced pytorch_pretrained_bert with transformers:

class EmbeddingsBert(Module):
    def __init__(self, bert_path: str):
        super().__init__()
        from pytorch_pretrained_bert import BertModel, BertTokenizer
        self.bert_embeddings = BertModel.from_pretrained(bert_path)
        self.bert_tokenizer = BertTokenizer.from_pretrained(bert_path, do_lower_case=False)

This gets me a little further, but then I see this (partial traceback):

  File "/home/mwg/wsd/disambiguate/python/getalp/wsd/modules/embeddings/embeddings_bert.py", line 69, in forward
    inputs, _ = self.bert_embeddings(inputs, attention_mask=pad_mask, output_all_encoded_layers=False)
  File "/home/mwg/wsd/disambiguate/py36/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'output_all_encoded_layers'

I looked at the following link but did not see anything about output_all_encoded_layers: https://github.com/huggingface/transformers#Migrating-from-pytorch-pretrained-bert-to-transformers

I then found huggingface/transformers#3541, so I changed line 13 above to pass output_hidden_states=False and removed the output_all_encoded_layers=False argument from line 69 (the original line is shown below):

inputs, _ = self.bert_embeddings(inputs, attention_mask=pad_mask, output_all_encoded_layers=False)

After this I was able to get some output. Can you confirm whether these changes are sufficient? If so, I can put together a PR for the fix.
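
For reference, here is a minimal sketch of the two changes, not a tested patch. The helper names load_bert and encode_last_layer are hypothetical and only there to keep the example self-contained; in the repository the same lines would sit in EmbeddingsBert.__init__ and EmbeddingsBert.forward. It assumes that the first element of the transformers BertModel output is the last layer's hidden states, which is how I read its documentation.

```python
def load_bert(bert_path):
    """Load the model and tokenizer from transformers instead of pytorch_pretrained_bert."""
    # Lazy import, mirroring the original EmbeddingsBert.__init__.
    from transformers import BertModel, BertTokenizer

    # output_hidden_states=False asks for the final layer only. It is also
    # the library default, so passing it just makes the intent explicit.
    model = BertModel.from_pretrained(bert_path, output_hidden_states=False)
    tokenizer = BertTokenizer.from_pretrained(bert_path, do_lower_case=False)
    return model, tokenizer


def encode_last_layer(bert_model, input_ids, pad_mask):
    """Return the last encoder layer's hidden states.

    Stands in for the old call, which passed output_all_encoded_layers=False;
    that keyword is not accepted by transformers and is simply dropped here.
    """
    outputs = bert_model(input_ids, attention_mask=pad_mask)
    # Index instead of unpacking: outputs[0] works whether the model returns
    # a plain tuple or a dict-like output object, while `x, _ = outputs`
    # only works for a two-element tuple.
    return outputs[0]
```

Indexing with outputs[0] rather than unpacking into two names is a deliberate choice: it should keep working if the number or type of returned values differs between transformers versions.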
