
Update model configs - Allow setters for common properties #13026


Merged: 33 commits, Sep 6, 2021

Conversation

@nreimers (Contributor) commented Aug 6, 2021

Update model configs - Allow setters for common properties

Not all models use the same naming for config values; e.g., hidden_size is called n_embd in GPT2Config. So far, getters had been implemented in the config classes so that a GPT2Config can be read via config.hidden_size.

But the setters were missing, so this code currently fails:

from transformers import GPT2Config

config = GPT2Config()
config.hidden_size = 4  # Fails

config = GPT2Config(hidden_size=4)  # Fails

Changes

This PR adds an attribute_map to the config classes that maps common attribute names to their model-specific counterparts. For GPT2, this map looks like this:

attribute_map = {"hidden_size": "n_embd",
                     "max_position_embeddings": "n_positions",
                     "num_attention_heads": "n_head",
                     "num_hidden_layers": "n_layer"
                     }

The PretrainedConfig class overrides __setattr__ and __getattribute__ to check for mappings in the attribute_map:

    def __setattr__(self, key, value):
        # Translate a mapped name (e.g. hidden_size) to its model-specific
        # counterpart (e.g. n_embd) before setting the attribute.
        if key in super().__getattribute__('attribute_map'):
            key = super().__getattribute__('attribute_map')[key]
        super().__setattr__(key, value)

    def __getattribute__(self, key):
        # Guard against infinite recursion when looking up attribute_map itself.
        if key != 'attribute_map' and key in super().__getattribute__('attribute_map'):
            key = super().__getattribute__('attribute_map')[key]
        return super().__getattribute__(key)
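
For illustration, a quick check of the aliasing behavior (a minimal sketch; the full test script appears under "Code to test the change" below):

from transformers import GPT2Config

config = GPT2Config()
config.hidden_size = 4      # __setattr__ routes this to n_embd via attribute_map
assert config.n_embd == 4   # both names now resolve to the same value
assert config.hidden_size == 4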

Advantages

  • Setters work, i.e. you can use config.hidden_size = 4 and GPT2Config(hidden_size=4)
  • No need to write individual getter or setter methods in the config classes; they are derived from the attribute_map

Detailed changes

  • PretrainedConfig: Add __setattr__ and __getattribute__ methods. Added a docstring for attribute_map
  • GPT2Config: Add the attribute map, remove the old getters
  • test_configuration_common.py: Update the create_and_test_config_common_properties method so that it tests that setters exist and work (see the sketch below)
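
For reference, a rough sketch of what the updated common-properties test might check (the property list and helper names here are assumptions, not the exact diff):

def create_and_test_config_common_properties(self):
    config = self.config_class(**self.inputs_dict)
    # Properties every config is expected to expose under the common names
    common_properties = ["hidden_size", "num_attention_heads", "num_hidden_layers"]
    for prop in common_properties:
        # Getter: the common name must be readable on every config
        self.parent.assertTrue(hasattr(config, prop), msg=f"`{prop}` does not exist")
        # Setter: writing through the common name must round-trip
        setattr(config, prop, 7)
        self.parent.assertEqual(getattr(config, prop), 7, msg=f"`{prop}` was not set correctly")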

Work in Progress

So far I have only updated GPT2Config to get your feedback. Unit tests for config classes that have not yet been updated (i.e., that don't provide setters for the common fields), like the GPTNeo config class, will fail.

Once the design of the solution is approved, I will update all other config classes.

Update: All config classes updated

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@sgugger @LysandreJik @NielsRogge

Code to test the change

Besides the unit tests, you can use this code to test the changes quickly:

from transformers import GPT2Config

config = GPT2Config()

config.hidden_size = 4
print("Hidden size", config.hidden_size, config.n_embd)

config.n_positions = 65
print("n_positions", config.max_position_embeddings, config.n_positions)

config.max_position_embeddings = 123
print("n_positions", config.max_position_embeddings, config.n_positions)


print("\n\n================\n\n")

## Note: conflicting arguments: hidden_size and n_embd are identical fields
# In that case, the synonym (hidden_size) will have higher priority
config = GPT2Config(hidden_size=4, n_embd=20, max_position_embeddings=80)
print("Hidden size", config.hidden_size, config.n_embd)
print("n_positions", config.max_position_embeddings, config.n_positions)

print("Export to json")
config.save_pretrained(".")

## Load config
print("Load from disc")
config = GPT2Config.from_pretrained('.')
print("Hidden size", config.hidden_size, config.n_embd)
print("n_positions", config.max_position_embeddings, config.n_positions)
assert config.hidden_size == config.n_embd
assert config.hidden_size == 4

assert config.max_position_embeddings == config.n_positions
assert config.max_position_embeddings == 80

@nreimers changed the title from "[DRAFT] Update model configs - Allow setters for common properties" to "[WIP] Update model configs - Allow setters for common properties" on Aug 6, 2021
@sgugger (Collaborator) left a comment

The design looks good to me! I think we could have a few more common attributes, since we are in the process of adding them:

  • the vocab size (seems to be pretty consistent)
  • the embedding size
  • the inner size for the feed-forward layers

Those on top of max_position_embeddings should all be included in common_properties so that we are sure they are common to each model.

@nreimers (Contributor, Author) commented Aug 6, 2021

> The design looks good to me! I think we could have a few more common attributes, since we are in the process of adding them:
>
>   • the vocab size (seems to be pretty consistent)
>   • the embedding size
>   • the inner size for the feed-forward layers
>
> Those on top of max_position_embeddings should all be included in common_properties so that we are sure they are common to each model.

My idea was to put this into an independent, new PR and to keep this PR focused on just changing the getters/setters.

My plan is to come up with a scheme for which attributes should be common. Here we can differentiate between model types: text (split into encoder-only and encoder-decoder), image, and audio.

I analyzed all 50+ config classes; these are the most common fields, with the number of config classes that define each:

model_type 55
vocab_size 51
architectures 49
pad_token_id 42
max_position_embeddings 41
num_hidden_layers 40
initializer_range 36
eos_token_id 34
bos_token_id 32
hidden_size 32
layer_norm_eps 32
hidden_act 30
intermediate_size 30
num_attention_heads 29
hidden_dropout_prob 28
attention_probs_dropout_prob 26
transformers_version 25
type_vocab_size 23
attention_dropout 22
gradient_checkpointing 21
dropout 19
activation_dropout 18
d_model 17
init_std 17
activation_function 16

But as mentioned, I would put this in another PR.

@nreimers changed the title from "[WIP] Update model configs - Allow setters for common properties" to "Update model configs - Allow setters for common properties" on Aug 30, 2021
@nreimers (Contributor, Author) commented

Hi @sgugger @LysandreJik @patil-suraj @patrickvonplaten

I have also updated all the other config classes so that they use the attribute_map, meaning common properties (like hidden_size) can be set (config.hidden_size = 123) or passed as an argument (MyConfigClass(hidden_size=123)).

I kept the behavior of the config classes as is, i.e. no new getter methods were added; the config classes were just extended to allow setting the common properties.

If a setter cannot be implemented for a class, an exception is raised; see the sketch below.
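
For illustration, a minimal sketch of that pattern, modeled on configs such as Reformer where num_hidden_layers is derived from another field (the class name here is hypothetical, not the exact diff):

from transformers import PretrainedConfig

class DerivedLayersConfig(PretrainedConfig):
    # Hypothetical config whose layer count is derived from attn_layers,
    # so there is nothing sensible for a setter to assign.
    @property
    def num_hidden_layers(self):
        return len(self.attn_layers)

    @num_hidden_layers.setter
    def num_hidden_layers(self, value):
        raise NotImplementedError(
            "This model does not support setting num_hidden_layers directly; "
            "set attn_layers instead."
        )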

All unit tests are passing.

Would be happy if you could have a look at this PR.

@sgugger (Collaborator) left a comment

Thanks for rolling this out on all models. I think the last thing missing is some documentation in model_classes/config.rst, to list all the common attributes that can be used/set.

@patrickvonplaten (Contributor) left a comment

This looks good to me - thanks a lot for your work here!

One thing I'm wondering is whether we should allow this:

config = GPT2Config(hidden_size=4, n_embd=20, max_position_embeddings=80)

at all, or just throw an error directly. I think it's actually cleaner to throw an error here instead of silently "disregarding" n_embd=20. What do you think, @nreimers?

@nreimers (Contributor, Author) commented Aug 31, 2021

@sgugger Will add a note to the docs

@patrickvonplaten Throwing an error is not easy.

GPT2Config defines n_embd=768 in its __init__ method, so

config = GPT2Config(hidden_size=4)

and

config = GPT2Config(hidden_size=4, n_embd=768)

are identical calls. We would expect the first call to work.

In order to throw an exception for the second call, we could do one of the following:

  • Replace all default parameters with None; if hidden_size is not set, then set n_embd to 768 (sketched below). => Major refactoring of all config classes would be needed, with quite a lot of overhead. Further, default values would no longer be visible from the method definition.
  • Check if n_embd != hidden_size and n_embd != 768. => config = GPT2Config(hidden_size=4, n_embd=8) would throw an error, but config = GPT2Config(hidden_size=4, n_embd=768) would not (also not a nice solution). Major refactoring would also be needed, as we would have to keep track of the default values for all parameters.

Do you have other ideas for how this could be checked?
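
For reference, a minimal sketch of the first option (sentinel defaults); the class name is hypothetical and this is not code from the PR:

from transformers import PretrainedConfig

class SentinelGPT2Config(PretrainedConfig):
    # Hypothetical variant that uses None as a sentinel to detect whether
    # the caller explicitly passed n_embd.
    def __init__(self, n_embd=None, **kwargs):
        if n_embd is not None and "hidden_size" in kwargs:
            raise ValueError("Pass either hidden_size or n_embd, not both.")
        # The real default (768) is applied here, so it is no longer visible
        # in the method signature (the drawback noted above).
        self.n_embd = kwargs.pop("hidden_size", n_embd if n_embd is not None else 768)
        super().__init__(**kwargs)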

@LysandreJik (Member) left a comment

I played with it extensively, and it's working great! Thanks for working on this, @nreimers.

Regarding @patrickvonplaten's comment, I think it's fine as it is but it could maybe use a note in the documentation.

The most important use case for two aliased variables, IMO, is instantiating a config from a pretrained config while overriding an element, as in this scenario:

config = GPT2Config.from_pretrained("gpt2")
# config.n_embd: 768

config = GPT2Config.from_pretrained("gpt2", hidden_size=2)
# config.n_embd: 2

And this works flawlessly.

@patrickvonplaten (Contributor) left a comment

Fine by me!

@patil-suraj (Contributor) left a comment

Very clean implementation. Thanks a lot for working on this!

@nreimers (Contributor, Author) commented Sep 1, 2021

@sgugger I updated the docs (transformers/docs/source/main_classes/configuration.rst) and added a section on the common attributes. Please have a look.

@nreimers (Contributor, Author) commented Sep 6, 2021

Hi,
I just updated the PR with the newest commits from the master branch.

However, the run_examples_torch job now fails in CircleCI:

==================================== ERRORS ====================================
______________ ERROR collecting examples/pytorch/test_examples.py ______________
ImportError while importing test module '/home/circleci/transformers/examples/pytorch/test_examples.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
/usr/local/lib/python3.6/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
examples/pytorch/test_examples.py:51: in <module>
    import run_image_classification
examples/pytorch/image-classification/run_image_classification.py:27: in <module>
    from torchvision.transforms import (
E   ModuleNotFoundError: No module named 'torchvision'

Not sure why this happens, as this PR does not touch run_image_classification.py.

Is this an issue with CircleCI or with the specific unit test?

@patil-suraj (Contributor) commented Sep 6, 2021

Hi @nreimers, it's not related to this PR. That test fails because torchvision is not installed on the CI (it is required by run_image_classification.py) for the examples tests. I've proposed a fix here: #13438

@nreimers (Contributor, Author) commented Sep 6, 2021

Hi @patil-suraj
Thanks for the quick response.

What are the next steps for this PR? Wait until #13438 is merged and then, once all tests pass, merge this PR?

Who will be merging this PR? Should I do it once all tests are passing?

@patil-suraj (Contributor) commented

The failed test is not related to this PR, and all of us have approved it, so feel free to merge if everything is ready :)
