
Commit c7be096

Merge branch 'master' into cli
2 parents: 3492a6e + 33adab2

File tree: 112 files changed, +4670 −858 lines


.circleci/config.yml (+34)
@@ -70,6 +70,27 @@ jobs:
     - run: sudo pip install pytest codecov pytest-cov
     - run: python -m pytest -sv ./transformers/tests/ --cov
     - run: codecov
+  build_py3_custom_tokenizers:
+    working_directory: ~/transformers
+    docker:
+      - image: circleci/python:3.5
+    steps:
+      - checkout
+      - run: sudo pip install --progress-bar off .
+      - run: sudo pip install pytest
+      - run: sudo pip install mecab-python3
+      - run: RUN_CUSTOM_TOKENIZERS=1 python -m pytest -sv ./transformers/tests/tokenization_bert_japanese_test.py
+  build_py2_custom_tokenizers:
+    working_directory: ~/transformers
+    docker:
+      - image: circleci/python:2.7
+    steps:
+      - checkout
+      - run: sudo pip install --progress-bar off .
+      - run: sudo pip install pytest
+      - run: sudo apt-get -y install libmecab-dev mecab mecab-ipadic-utf8 swig
+      - run: sudo pip install mecab-python
+      - run: RUN_CUSTOM_TOKENIZERS=1 python -m pytest -sv ./transformers/tests/tokenization_bert_japanese_test.py
   deploy_doc:
     working_directory: ~/transformers
     docker:
@@ -82,6 +103,16 @@ jobs:
     - run: sudo pip install --progress-bar off -r docs/requirements.txt
     - run: sudo pip install --progress-bar off -r requirements.txt
     - run: ./.circleci/deploy.sh
+  repository_consistency:
+    working_directory: ~/transformers
+    docker:
+      - image: circleci/python:3.5
+    resource_class: small
+    parallelism: 1
+    steps:
+      - checkout
+      - run: sudo pip install requests
+      - run: python ./utils/link_tester.py
 workflow_filters: &workflow_filters
   filters:
     branches:
@@ -91,6 +122,9 @@ workflows:
   version: 2
   build_and_test:
     jobs:
+      - repository_consistency
+      - build_py3_custom_tokenizers
+      - build_py2_custom_tokenizers
       - build_py3_torch_and_tf
      - build_py3_torch
      - build_py3_tf
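
Both new tokenizer jobs prefix pytest with `RUN_CUSTOM_TOKENIZERS=1`, so the MeCab-dependent Japanese tokenizer tests only run when that dependency is installed. A minimal sketch of how a test module could honor such a flag, assuming a plain `unittest`-style gate (the repository's actual mechanism may differ):

```python
import os
import unittest


# Skip MeCab-dependent tests unless the caller opts in, mirroring the
# RUN_CUSTOM_TOKENIZERS=1 prefix used in the CircleCI `run` steps above.
@unittest.skipIf(
    os.environ.get("RUN_CUSTOM_TOKENIZERS", "0") != "1",
    "set RUN_CUSTOM_TOKENIZERS=1 to run custom tokenizer tests",
)
class BertJapaneseTokenizationTest(unittest.TestCase):
    def test_placeholder(self):
        # Real tests would exercise the MeCab-backed tokenizer here.
        self.assertTrue(True)
```

The `repository_consistency` job installs only `requests` before running `./utils/link_tester.py`, which suggests a simple URL checker over the repository's files. A hypothetical sketch of what such a script could look like (the file discovery, regex, and failure policy are assumptions, not the actual script):

```python
import re
import sys

import requests

URL_PATTERN = re.compile(r"https?://[^\s)\"'>\]]+")


def find_broken_links(paths):
    """HEAD-request every URL found in the given files; return the failures."""
    broken = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            urls = set(URL_PATTERN.findall(f.read()))
        for url in urls:
            try:
                status = requests.head(url, allow_redirects=True, timeout=10).status_code
            except requests.RequestException:
                broken.append(url)
                continue
            if status >= 400:
                broken.append(url)
    return broken


if __name__ == "__main__":
    failures = find_broken_links(sys.argv[1:])
    if failures:
        print("Broken links:\n  " + "\n  ".join(failures))
        sys.exit(1)
```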

README.md (+44 −2)
@@ -56,9 +56,10 @@ Choose the right framework for every part of a model's lifetime
 | [Quick tour: Usage](#quick-tour) | Tokenizers & models usage: Bert and GPT-2 |
 | [Quick tour: TF 2.0 and PyTorch ](#Quick-tour-TF-20-training-and-PyTorch-interoperability) | Train a TF 2.0 model in 10 lines of code, load it in PyTorch |
 | [Quick tour: Fine-tuning/usage scripts](#quick-tour-of-the-fine-tuningusage-scripts) | Using provided scripts: GLUE, SQuAD and Text generation |
+| [Quick tour: Share your models ](#Quick-tour-of-model-sharing) | Upload and share your fine-tuned models with the community |
 | [Migrating from pytorch-transformers to transformers](#Migrating-from-pytorch-transformers-to-transformers) | Migrating your code from pytorch-transformers to transformers |
 | [Migrating from pytorch-pretrained-bert to pytorch-transformers](#Migrating-from-pytorch-pretrained-bert-to-transformers) | Migrating your code from pytorch-pretrained-bert to transformers |
-| [Documentation][(v2.2.0/v2.2.1)](https://huggingface.co/transformers/v2.2.0) [(v2.1.1)](https://huggingface.co/transformers/v2.1.1) [(v2.0.0)](https://huggingface.co/transformers/v2.0.0) [(v1.2.0)](https://huggingface.co/transformers/v1.2.0) [(v1.1.0)](https://huggingface.co/transformers/v1.1.0) [(v1.0.0)](https://huggingface.co/transformers/v1.0.0) [(master)](https://huggingface.co/transformers) | Full API documentation and more |
+| [Documentation][(v2.2.0/v2.2.1/v2.2.2)](https://huggingface.co/transformers/v2.2.0) [(v2.1.1)](https://huggingface.co/transformers/v2.1.1) [(v2.0.0)](https://huggingface.co/transformers/v2.0.0) [(v1.2.0)](https://huggingface.co/transformers/v1.2.0) [(v1.1.0)](https://huggingface.co/transformers/v1.1.0) [(v1.0.0)](https://huggingface.co/transformers/v1.0.0) [(master)](https://huggingface.co/transformers) | Full API documentation and more |
 
 ## Installation
 
@@ -144,7 +145,8 @@ At some point in the future, you'll be able to seamlessly move from pre-training
 9. **[CTRL](https://github.com/salesforce/ctrl/)** (from Salesforce) released with the paper [CTRL: A Conditional Transformer Language Model for Controllable Generation](https://arxiv.org/abs/1909.05858) by Nitish Shirish Keskar*, Bryan McCann*, Lav R. Varshney, Caiming Xiong and Richard Socher.
 10. **[CamemBERT](https://camembert-model.fr)** (from Inria/Facebook/Sorbonne) released with the paper [CamemBERT: a Tasty French Language Model](https://arxiv.org/abs/1911.03894) by Louis Martin*, Benjamin Muller*, Pedro Javier Ortiz Suárez*, Yoann Dupont, Laurent Romary, Éric Villemonte de la Clergerie, Djamé Seddah and Benoît Sagot.
 11. **[ALBERT](https://github.com/google-research/ALBERT)** (from Google Research and the Toyota Technological Institute at Chicago) released with the paper [ALBERT: A Lite BERT for Self-supervised Learning of Language Representations](https://arxiv.org/abs/1909.11942), by Zhenzhong Lan, Mingda Chen, Sebastian Goodman, Kevin Gimpel, Piyush Sharma, Radu Soricut.
-11. Want to contribute a new model? We have added a **detailed guide and templates** to guide you in the process of adding a new model. You can find them in the [`templates`](./templates) folder of the repository. Be sure to check the [contributing guidelines](./CONTRIBUTING.md) and contact the maintainers or open an issue to collect feedbacks before starting your PR.
+12. **[T5](https://github.com/google-research/text-to-text-transfer-transformer)** (from Google AI) released with the paper [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/abs/1910.10683) by Colin Raffel and Noam Shazeer and Adam Roberts and Katherine Lee and Sharan Narang and Michael Matena and Yanqi Zhou and Wei Li and Peter J. Liu.
+13. Want to contribute a new model? We have added a **detailed guide and templates** to guide you in the process of adding a new model. You can find them in the [`templates`](./templates) folder of the repository. Be sure to check the [contributing guidelines](./CONTRIBUTING.md) and contact the maintainers or open an issue to collect feedback before starting your PR.
 
 These implementations have been tested on several datasets (see the example scripts) and should match the performances of the original implementations (e.g. ~93 F1 on SQuAD for BERT Whole-Word-Masking, ~88 F1 on RocStories for OpenAI GPT, ~18.3 perplexity on WikiText 103 for Transformer-XL, ~0.916 Pearson R coefficient on STS-B for XLNet). You can find more details on the performances in the Examples section of the [documentation](https://huggingface.co/transformers/examples.html).
 
@@ -445,6 +447,46 @@ python ./examples/run_generation.py \
     --repetition_penalty=1.2 \
 ```
 
+## Quick tour of model sharing
+
+New in `v2.2.2`: you can now upload and share your fine-tuned models with the community, using the <abbr title="Command-line interface">CLI</abbr> that's built into the library.
+
+**First, create an account on [https://huggingface.co/join](https://huggingface.co/join)**. Then:
+
+```shell
+transformers-cli login
+# log in using the same credentials as on huggingface.co
+```
+Upload your model:
+```shell
+transformers-cli upload ./path/to/pretrained_model/
+
+# ^^ Upload folder containing weights/tokenizer/config
+# saved via `.save_pretrained()`
+
+transformers-cli upload ./config.json [--filename folder/foobar.json]
+
+# ^^ Upload a single file
+# (you can optionally override its filename, which can be nested inside a folder)
+```
+
+Your model will then be accessible through its identifier, a concatenation of your username and the folder name above:
+```python
+"username/pretrained_model"
+```
+
+Anyone can load it from code:
+```python
+tokenizer = AutoTokenizer.from_pretrained("username/pretrained_model")
+model = AutoModel.from_pretrained("username/pretrained_model")
+```
+
+Finally, list all your files on S3:
+```shell
+transformers-cli ls
+# List all your S3 objects.
+```
+
 ## Migrating from pytorch-transformers to transformers
 
 Here is a quick summary of what you should take care of when migrating from `pytorch-transformers` to `transformers`.
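
The upload command expects a folder in the format produced by `.save_pretrained()`. As a quick illustration, here is one way such a folder could be prepared before running `transformers-cli upload` (the base model and directory name below are placeholders):

```python
from transformers import AutoModel, AutoTokenizer

# Load (or fine-tune) a model and tokenizer, then serialize both into a
# single folder; the folder name becomes the second half of the shared
# identifier, e.g. "username/my_finetuned_bert".
model = AutoModel.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

save_dir = "./my_finetuned_bert"  # placeholder directory name
model.save_pretrained(save_dir)      # writes pytorch_model.bin + config.json
tokenizer.save_pretrained(save_dir)  # writes the vocabulary files

# Then, from the shell: transformers-cli upload ./my_finetuned_bert/
```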

docs/source/conf.py (+1 −1)
@@ -26,7 +26,7 @@
 # The short X.Y version
 version = u''
 # The full version, including alpha/beta/rc tags
-release = u'2.2.1'
+release = u'2.2.2'
 
 
 # -- General configuration ---------------------------------------------------

docs/source/index.rst (+1)
@@ -58,6 +58,7 @@ The library currently contains PyTorch and Tensorflow implementations, pre-train
    installation
    quickstart
    pretrained_models
+   model_sharing
    examples
    notebooks
    serialization

docs/source/model_sharing.md (+40, new file)
@@ -0,0 +1,40 @@
+# Model upload and sharing
+
+Starting with `v2.2.2`, you can now upload and share your fine-tuned models with the community, using the <abbr title="Command-line interface">CLI</abbr> that's built into the library.
+
+**First, create an account on [https://huggingface.co/join](https://huggingface.co/join)**. Then:
+
+```shell
+transformers-cli login
+# log in using the same credentials as on huggingface.co
+```
+Upload your model:
+```shell
+transformers-cli upload ./path/to/pretrained_model/
+
+# ^^ Upload folder containing weights/tokenizer/config
+# saved via `.save_pretrained()`
+
+transformers-cli upload ./config.json [--filename folder/foobar.json]
+
+# ^^ Upload a single file
+# (you can optionally override its filename, which can be nested inside a folder)
+```
+
+Your model will then be accessible through its identifier, a concatenation of your username and the folder name above:
+```python
+"username/pretrained_model"
+```
+
+Anyone can load it from code:
+```python
+tokenizer = AutoTokenizer.from_pretrained("username/pretrained_model")
+model = AutoModel.from_pretrained("username/pretrained_model")
+```
+
+Finally, list all your files on S3:
+```shell
+transformers-cli ls
+# List all your S3 objects.
+```
+
