Dummy models for CI #256
base: main
Conversation
Codecov Report: ✅ All modified and coverable lines are covered by tests.
... and 5 files with indirect coverage changes
Force-pushed 65b8075 to de1af14, then de1af14 to 62e0b0a.
@@ -0,0 +1,20 @@
encoderfile:
nit: i'd put configs in the respective subfolders of ./models and name them all encoderfile.yml
ah, good idea 👍
rather than in whatever_encoderfile.yaml within ./models, perhaps?
Hmmm, currently ./models is in .gitignore. Maybe we could have some specific dir for encoderfiles?
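For concreteness, one possible layout along these lines (directory and file names are illustrative, nothing here is settled in the thread):

```
models/                          # in .gitignore: downloaded weights live here
encoderfiles/                    # tracked: one config per model
  dummy-sequence/encoderfile.yml
  dummy-token/encoderfile.yml
```

This keeps every config named encoderfile.yml as suggested, while moving them out of the gitignored ./models tree.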
    ORTModelForTokenClassification,
)

AutoConfig.register(DUMMY_SEQUENCE_ENCODER, DummySequenceConfig)
Where is the code where the model weights themselves are generated?
It's in encoderfile/scripts/create_dummy_model.py, line 78 in 62e0b0a:

class DummySequenceClassifier(PreTrainedModel):
I preferred to download the weights as standard; maybe we can optionally generate them from scratch. The procedure is there, in any case.
BTW, fun fact: since the output is dynamically generated, there are actually no weights. So the torch model exporter refuses to write anything, and the ONNX exporter fails because it sees no weights. We need to include a dummy value in the state dict so something gets exported and everything works OK.
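A minimal sketch of that workaround (illustrative only, not the PR's actual create_dummy_model.py; class and buffer names are assumed): a module with no real parameters fabricates a constant output, and a registered buffer keeps the state dict non-empty so the exporters have a tensor to serialize.

```python
import torch
import torch.nn as nn


class ConstantLogits(nn.Module):
    """Parameter-free model returning a constant, predictable output."""

    def __init__(self, num_labels: int = 2):
        super().__init__()
        self.num_labels = num_labels
        # Dummy value kept in the state dict solely so torch/ONNX export
        # sees at least one tensor; without it the export fails.
        self.register_buffer("dummy", torch.zeros(1))

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        batch = input_ids.shape[0]
        # Constant output; adding 0 * buffer keeps the buffer alive
        # in the traced/exported graph.
        return torch.zeros(batch, self.num_labels) + 0.0 * self.dummy


model = ConstantLogits()
logits = model(torch.zeros(3, 7, dtype=torch.long))
```

The output shape follows the usual sequence-classification contract (batch, num_labels), so downstream CI code can assert an exact constant value.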
Here: encoderfile/scripts/create_dummy_model.py, line 95 in 62e0b0a:

self.register_buffer(
We could have hardcoded weights instead (or maybe hardcoded outputs rather) if you'd prefer.
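As a side note, the registration line in the diff can be exercised on its own. A hedged sketch of the pattern, where DUMMY_SEQUENCE_ENCODER and DummySequenceConfig are assumed stand-ins rather than the exact definitions in the PR:

```python
from transformers import AutoConfig, PretrainedConfig

DUMMY_SEQUENCE_ENCODER = "dummy-sequence-encoder"


class DummySequenceConfig(PretrainedConfig):
    # model_type must match the string passed to AutoConfig.register,
    # otherwise transformers rejects the registration.
    model_type = DUMMY_SEQUENCE_ENCODER


AutoConfig.register(DUMMY_SEQUENCE_ENCODER, DummySequenceConfig)

# After registration, the Auto* machinery resolves the dummy type.
config = AutoConfig.for_model(DUMMY_SEQUENCE_ENCODER, num_labels=2)
```

Once the config (and a matching model class via AutoModel.register) is registered, the dummy encoder loads through the same Auto* entry points as a real checkpoint, which is what lets CI reuse the normal loading path.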
For CI purposes, we would need dummy models that return a predictable (e.g. constant) value, with as little processing as possible. These will honor the general interface (token classification, sequence classification, etc.) but just return a dummy value. This allows testing integration and non-ML-related changes efficiently.
Closes #98