docs/source/model_doc/t5.rst (+4 −2)
@@ -20,13 +20,14 @@ Training
 ~~~~~~~~~~~~~~~~~~~~
 T5 is an encoder-decoder model and converts all NLP problems into a text-to-text format. It is trained using teacher forcing.
 This means that for training we always need an input sequence and a target sequence.
-The input sequence is fed to the model using ``input_ids``. The target sequence is shifted to the right, *i.e.* perprended by a start-sequence token and fed to the decoder using the `decoder_input_ids`. In teacher-forcing style, the target sequence is then appended by the EOS token and corresponds to the ``lm_labels``. The PAD token is hereby used as the start-sequence token.
+The input sequence is fed to the model using ``input_ids``. The target sequence is shifted to the right, *i.e.* prepended by a start-sequence token and fed to the decoder using the `decoder_input_ids`. In teacher-forcing style, the target sequence is then appended by the EOS token and corresponds to the ``lm_labels``. The PAD token is hereby used as the start-sequence token.
 T5 can be trained / fine-tuned both in a supervised and unsupervised fashion.
 
 - Unsupervised denoising training
+
 In this setup spans of the input sequence are masked by so-called sentinel tokens (*a.k.a* unique mask tokens)
 and the output sequence is formed as a concatenation of the same sentinel tokens and the *real* masked tokens.
-Each sentinel tokens represents a unique mask token for this sentence and should start with ``<extra_id_1>``, ``<extrac_id_2>``, ... up to ``<extra_id_100>``. As a default 100 sentinel tokens are available in ``T5Tokenizer``.
+Each sentinel token represents a unique mask token for this sentence and should start with ``<extra_id_1>``, ``<extra_id_2>``, ... up to ``<extra_id_100>``. As a default 100 sentinel tokens are available in ``T5Tokenizer``.
 
 *E.g.* the sentence "The cute dog walks in the park" with the masks put on "cute dog" and "the" should be processed as follows:
 
 ::
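
Not part of the diff: the literal block opened by ``::`` above continues in the lines hidden between the two hunks, so the file's own code example is not reproduced here. The following is a minimal, hypothetical sketch of the unsupervised denoising setup described above. It assumes ``T5Tokenizer``, ``T5ForConditionalGeneration``, the ``t5-small`` checkpoint, and the ``lm_labels`` argument visible in the hunk below (later Transformers releases rename it to ``labels``); the exact strings in the file's own example may differ.

::

    # Hypothetical sketch, not the code from the diff.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained('t5-small')
    model = T5ForConditionalGeneration.from_pretrained('t5-small')

    # "cute dog" and "the" are replaced by sentinel tokens in the input; the target
    # lists each sentinel token followed by the real tokens it masks, plus the EOS token.
    input_ids = tokenizer.encode('The <extra_id_1> walks in <extra_id_2> park', return_tensors='pt')
    lm_labels = tokenizer.encode('<extra_id_1> cute dog <extra_id_2> the </s>', return_tensors='pt')

    # Teacher forcing, mirroring the call shown in the second hunk; per the text above,
    # the decoder consumes the target shifted right, with PAD as the start-sequence token.
    outputs = model(input_ids=input_ids, lm_labels=lm_labels)
    loss = outputs[0]  # loss is the first element of the returned tuple in this API version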
@@ -37,6 +38,7 @@ T5 can be trained / fine-tuned both in a supervised and unsupervised fashion.
     model(input_ids=input_ids, lm_labels=lm_labels)
 
 - Supervised training
+
 In this setup the input sequence and output sequence are standard sequence to sequence input output mapping.
 In translation, *e.g.* the input sequence "The house is wonderful." and output sequence "Das Haus ist wunderbar." should
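
Again outside the diff: the sentence above is cut off at the hunk boundary, and the file's own supervised example is not shown. A hedged sketch of the supervised (translation) case described here, under the same assumptions as the sketch above and using the task prefix from the T5 paper ("translate English to German:"); the actual example in the file may be phrased differently.

::

    # Hypothetical sketch, not the code from the diff.
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    tokenizer = T5Tokenizer.from_pretrained('t5-small')
    model = T5ForConditionalGeneration.from_pretrained('t5-small')

    # Supervised training: a plain sequence-to-sequence mapping, with a task
    # prefix telling T5 which text-to-text problem to solve.
    input_ids = tokenizer.encode('translate English to German: The house is wonderful. </s>', return_tensors='pt')
    lm_labels = tokenizer.encode('Das Haus ist wunderbar. </s>', return_tensors='pt')

    # The loss is computed against lm_labels in teacher-forcing style.
    outputs = model(input_ids=input_ids, lm_labels=lm_labels)
    loss = outputs[0]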