-
Notifications
You must be signed in to change notification settings - Fork 29.9k
Update model configs - Allow setters for common properties #13026
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…d as parameters to __init__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The design looks good to me! I think we could have a few more common attributes, since we are in the process of adding them:
- the vocab size (seems to be pretty consistent)
- the embedding size
- the inner size for the feed-forward layers
Those on top of max_position_embeddings
should all be included in common_properties
so that we are sure they are common to each model.
My idea was to put this into an independent, new PR and too keep this PR focused on just changing the getter / setters. My plan is to come up with some scheme which attributes should be common. Here we can differentiate between model types: text (differentiated between encoder only and encoder-decoder), image, audio. I analyzed all 50+ config classes and these are the most common fields:
But as mentioned, I would put this in another PR. |
Hi @sgugger @LysandreJik @patil-suraj @patrickvonplaten I also updated all other config classes so that they all use the I kept the behavior for the config classes as is, i.e. no new getter-methods were added, config classes were just extended to allow setting of the common properties. If a setter method cannot be implemented for a class, an exception is raised:
All unit tests are passing. Would be happy if you could have a look at this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for rolling this out on all models. I think the last thing missing is some documentation in model_classes/config.rst, to list all the common attributes that can be used/set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me - thanks a lot for your work here!
One thing, I'm wondering is whether we should allow this:
config = GPT2Config(hidden_size=4, n_embd=20, max_position_embeddings=80)
at all or just directly throw an error. I think it's cleaner to throw an error here actually instead of silently "disregarding" n_embd=20
. What do you think @nreimers ?
@sgugger Will add a note to the docs @patrickvonplaten Throwing an error is not easy.
are identical calls of the method. We would expect method 1 to work. In order to throw an exception for method 2, we could do:
Do you have other ideas how this could be checked? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Played with it extensively, this is working great! Thanks for working on this @nreimers.
Regarding @patrickvonplaten's comment, I think it's fine as it is but it could maybe use a note in the documentation.
The most important use-case regarding two variables assigned imo is when instantiating a config from a pretrained config and overriding an element, like in this scenario:
config = GPT2Config.from_pretrained("gpt2")
# config.n_embd: 768
config = GPT2Config.from_pretrained("gpt2", hidden_size=2)
# config.n_embd: 2
And this works flawlessly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine by me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very clean implementation. Thanks a lot for working on this!
@sgugger And added a section on the common attributes. Please have a look. |
Hi, However, now the run_examples_torch fails in CircleCI:
Not sure why this happens, as this PR is not touching run_image_classification.py Is this an issue with CircleCI or with the specific unit test? |
Hi @patil-suraj What are the next steps for this PR? Wait until #13438 is merged and then, when all tests are passing, merging this PR? Who will be merging this PR? Should I do it once all tests are passing? |
The failed test is not related to this PR and all of us has approved this PR, so feel free to merge if everything is ready :) |
Update model configs - Allow setters for common properties
Not all models use the same naming for config values, e.g.
hidden_size
is calledn_embd
in GPT2Config. So far, getters had been implemented in the config classes to allow that a GPT2Config can be accessed viaconfig.hidden_size
.But the setters were missing, so that this code fails so far:
Changes
This PR adds an
attribute_map
to the config classes that maps the config parameters. For GPT2, this map looks like this:The
PretrainedConfig
class overwrites the get & set attribute to check for the mappings in the attribute_map:Advantages
config.hidden_size = 4
andGPT2Config(hidden_size=4)
attribute_map
Detailed changes
PretrainedConfig
: Add__setattr__
and__getattribute__
methods. Added docstring forattribute_map
GPT2Config
: Add attribute map, remove old getterstest_configuration_common.py
: Updatecreate_and_test_config_common_properties
method so that it tests that setters exist and workWork in ProgressSo far I only updated the GPT2Config to get your feedback. Unit-Tests for other config classes that have not yet been updated (i.e. don't provide setters for the common fields) like the GPTNeo config class will fail.Once the design of the solution is approved, I will update all other config classes.Update: All config classes updated
Fixes
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
@sgugger @LysandreJik @NielsRogge
Code to test the change
Besides the unit tests, you can use this code to test the changes quickly: