
Commit 569f61a: Add TF multiple choice example (#12865)

* Add new multiple-choice example, remove old one

Parent: 4f19881

File tree: 4 files changed, +483 −812 lines
@@ -1,5 +1,5 @@
 <!---
-Copyright 2020 The HuggingFace Team. All rights reserved.
+Copyright 2021 The HuggingFace Team. All rights reserved.
 
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
@@ -13,26 +13,31 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 -->
+# Multiple-choice training (e.g. SWAG)
 
-# Multiple Choice
+This folder contains the `run_swag.py` script, showing an example of *multiple-choice answering* with the
+🤗 Transformers library. For straightforward use-cases you may be able to use these scripts without modification,
+although we have also included comments in the code to indicate areas that you may need to adapt to your own projects.
 
-## Fine-tuning on SWAG
+### Multi-GPU and TPU usage
 
+By default, the script uses a `MirroredStrategy` and will use multiple GPUs effectively if they are available. TPUs
+can also be used by passing the name of the TPU resource with the `--tpu` argument.
+
+### Memory usage and data loading
+
+One thing to note is that all data is loaded into memory in this script. Most multiple-choice datasets are small
+enough that this is not an issue, but if you have a very large dataset you will need to modify the script to handle
+data streaming. This is particularly challenging for TPUs, given the stricter requirements and the sheer volume of data
+required to keep them fed. A full explanation of all the possible pitfalls is a bit beyond this example script and
+README, but for more information you can see the 'Input Datasets' section of
+[this document](https://www.tensorflow.org/guide/tpu).
+
+### Example command
 ```bash
-export SWAG_DIR=/path/to/swag_data_dir
-python ./examples/multiple-choice/run_tf_multiple_choice.py \
---task_name swag \
---model_name_or_path bert-base-cased \
---do_train \
---do_eval \
---data_dir $SWAG_DIR \
---learning_rate 5e-5 \
---num_train_epochs 3 \
---max_seq_length 80 \
---output_dir models_bert/swag_base \
---per_gpu_eval_batch_size=16 \
---per_device_train_batch_size=16 \
---logging-dir logs \
---gradient_accumulation_steps 2 \
---overwrite_output
+python run_swag.py \
+--model_name_or_path distilbert-base-cased \
+--output_dir output \
+--do_eval \
+--do_train
 ```
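The device-placement behaviour the new README describes under "Multi-GPU and TPU usage" can be sketched as a small selection function. Note that `pick_strategy`, its arguments, and its label return values are illustrative assumptions, not code from `run_swag.py`; the real script would construct `tf.distribute` strategy objects rather than return names.

```python
from typing import Optional


def pick_strategy(tpu: Optional[str], num_gpus: int) -> str:
    """Sketch of the strategy choice the README describes.

    An explicit --tpu argument wins, otherwise a MirroredStrategy spans
    any visible GPUs, and a single-device fallback covers CPU-only
    machines. A real TensorFlow program would build
    tf.distribute.TPUStrategy / MirroredStrategy / OneDeviceStrategy
    objects in these branches instead of returning labels.
    """
    if tpu is not None:
        # TPUStrategy is built from a resolver pointed at the named resource.
        return "TPUStrategy"
    if num_gpus > 0:
        # MirroredStrategy replicates the model across all local GPUs.
        return "MirroredStrategy"
    return "OneDeviceStrategy"
```

In actual TensorFlow code, the TPU branch would typically create a `tf.distribute.cluster_resolver.TPUClusterResolver` from the `--tpu` name before constructing the strategy.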

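For the streaming caveat under "Memory usage and data loading", one possible shape of a fix is a lazy batch generator. Here `stream_batches` and the JSON-lines input format are assumptions for illustration, not part of the actual script:

```python
import json


def stream_batches(path, batch_size):
    """Lazily yield lists of examples from a JSON-lines file.

    Because batches are produced one at a time, the whole dataset never
    has to sit in host memory at once (hypothetical helper, not taken
    from run_swag.py).
    """
    with open(path) as f:
        batch = []
        for line in f:
            batch.append(json.loads(line))
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:  # emit the final, possibly smaller, batch
            yield batch
```

A generator like this could be handed to `tf.data.Dataset.from_generator` so that batches are produced on demand instead of materializing the full dataset up front, though keeping a TPU fed this way needs the extra care the README points to.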