Commit e80be7f

docs: add xlm-roberta section to multi-lingual section (#4101)
1 parent 18db92d commit e80be7f

1 file changed, +13 −1 lines changed

1 file changed

+13
-1
lines changed

docs/source/multilingual.rst

@@ -104,4 +104,16 @@ BERT has two checkpoints that can be used for multi-lingual tasks:
 - ``bert-base-multilingual-cased`` (Masked language modeling + Next sentence prediction, 104 languages)

 These checkpoints do not require language embeddings at inference time. They should identify the language
-used in the context and infer accordingly.
+used in the context and infer accordingly.
+
+XLM-RoBERTa
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+XLM-RoBERTa was trained on 2.5TB of newly created clean CommonCrawl data in 100 languages. It provides strong
+gains over previously released multi-lingual models like mBERT or XLM on downstream tasks like classification,
+sequence labeling and question answering.
+
+Two XLM-RoBERTa checkpoints can be used for multi-lingual tasks:
+
+- ``xlm-roberta-base`` (Masked language modeling, 100 languages)
+- ``xlm-roberta-large`` (Masked language modeling, 100 languages)
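
Because both checkpoints were trained with masked language modeling only and take no language
embeddings, using them follows the standard masked-LM workflow. Below is a minimal sketch,
assuming a recent ``transformers`` install where ``AutoModelForMaskedLM`` is available; the
French example sentence is illustrative only:

.. code-block:: python

    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    # No language IDs or language embeddings are passed anywhere;
    # the model infers the language from the context alone.
    tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
    model = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

    # An illustrative French sentence with one masked token.
    inputs = tokenizer("Le chat mange une <mask>.", return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # Locate the mask position and take the highest-scoring token.
    mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
    predicted_id = logits[0, mask_pos].argmax(dim=-1)
    print(tokenizer.decode(predicted_id))

The same snippet works unchanged with ``xlm-roberta-large``; only the checkpoint name differs.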
