Benchmark results

This section presents the results of our benchmark of several language models evaluated on data from MMS.

Our preliminary results were presented in (Rajda et al. 2022), and the final version was presented in (Augustyniak et al. 2023), reviewed at NeurIPS’23.
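The primary metric reported below is macro-averaged F1. As a reference point only, the following minimal sketch shows how such a score could be computed with scikit-learn; the labels and predictions are illustrative placeholders, not benchmark data.

```python
# Minimal sketch of a macro-F1 computation like the one reported in the tables below.
# Labels and predictions are illustrative placeholders, not benchmark data.
from sklearn.metrics import f1_score

# Hypothetical 3-class sentiment labels: 0 = negative, 1 = neutral, 2 = positive.
y_true = [0, 2, 2, 1, 0, 1, 2, 0]
y_pred = [0, 2, 1, 1, 0, 2, 2, 0]

# Macro averaging gives each class equal weight, regardless of its frequency.
macro_f1 = f1_score(y_true, y_pred, average="macro")
print(f"F1 macro: {macro_f1:.3f}")
```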

Benchmark results - F1 Macro scores

Models

| Model | Inf. time [s] | #params | #langs | base\(^a\) | data | reference |
|---|---|---|---|---|---|---|
| mT5 | 1.69 | 277M | 101 | T5 | \(CC^b\) | (Xue et al. 2021) |
| LASER | 1.64 | 52M | 93 | BiLSTM | \(OPUS^c\) | (Artetxe and Schwenk 2019) |
| mBERT | 1.49 | 177M | 104 | BERT | Wiki | (Devlin et al. 2019) |
| MPNet** | 1.38 | 278M | 53 | XLM-R | \(OPUS^c\), \(MUSE^d\), \(Wikititles^e\) | (Reimers and Gurevych 2020) |
| XLM-R-dist** | 1.37 | 278M | 53 | XLM-R | \(OPUS^c\), \(MUSE^d\), \(Wikititles^e\) | (Reimers and Gurevych 2020) |
| XLM-R | 1.37 | 278M | 100 | XLM-R | CC | (Conneau et al. 2020) |
| LaBSE | 1.36 | 470M | 109 | BERT | CC, Wiki + mined bitexts | (Feng et al. 2020) |
| DistilmBERT | 0.79 | 134M | 104 | BERT | Wiki | (Sanh et al. 2020) |
| mUSE-dist** | 0.79 | 134M | 53 | DistilmBERT | \(OPUS^c\), \(MUSE^d\), \(Wikititles^e\) | (Reimers and Gurevych 2020) |
| mUSE-transformer* | 0.65 | 85M | 16 | transformer | mined QA + bitexts, SNLI | (Yang et al. 2020) |
| mUSE-cnn* | 0.12 | 68M | 16 | CNN | mined QA + bitexts, SNLI | (Yang et al. 2020) |
  • * mUSE models were run with the TensorFlow implementation, in contrast to the other models, which were run in PyTorch
  • a the base model is either the monolingual model on which the multilingual model was based or another multilingual model that was used and adapted
  • b the multilingual version of the Colossal Clean Crawled Corpus (mC4)
  • c multiple datasets from the OPUS website (https://opus.nlpl.eu)
  • d bilingual dictionaries from MUSE (https://github.com/facebookresearch/MUSE)
  • e titles of Wikipedia articles in multiple languages
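The inference times above are reported per example. As an illustration only, a sketch along the following lines could be used to embed sentences with one of the listed sentence encoders and record the average cost per sentence; the model identifier and timing loop are assumptions, not the benchmark's exact measurement protocol or hardware setup.

```python
# Illustrative sketch: embed sentences with a multilingual sentence encoder and
# time the average cost per sentence. The model ID and timing loop are assumptions;
# they do not reproduce the benchmark's exact protocol.
import time
from sentence_transformers import SentenceTransformer

# Hypothetical choice: the distilled multilingual MPNet model from sentence-transformers.
model = SentenceTransformer("sentence-transformers/paraphrase-multilingual-mpnet-base-v2")

sentences = [
    "This product exceeded my expectations.",
    "Der Service war leider sehr langsam.",
    "Nie polecam tego hotelu.",
]

start = time.perf_counter()
embeddings = model.encode(sentences, batch_size=32, show_progress_bar=False)
elapsed = time.perf_counter() - start

print(f"Embedding shape: {embeddings.shape}")
print(f"Avg. time per sentence: {elapsed / len(sentences):.3f} s")
```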

Results

References

Artetxe, Mikel, and Holger Schwenk. 2019. “Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond.” Transactions of the Association for Computational Linguistics 7 (September): 597–610. https://doi.org/10.1162/tacl_a_00288.
Augustyniak, Łukasz, Szymon Woźniak, Marcin Gruza, Piotr Gramacki, Krzysztof Rajda, Mikołaj Morzy, and Tomasz Kajdanowicz. 2023. “Massively Multilingual Corpus of Sentiment Datasets and Multi-Faceted Sentiment Classification Benchmark.” https://arxiv.org/abs/2306.07902.
Conneau, Alexis, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, and Veselin Stoyanov. 2020. “Unsupervised Cross-Lingual Representation Learning at Scale.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 8440–51. Online: Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.acl-main.747.
Devlin, Jacob, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. “BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding.” In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–86. Minneapolis, Minnesota: Association for Computational Linguistics. https://doi.org/10.18653/v1/N19-1423.
Feng, Fangxiaoyu, Yinfei Yang, Daniel Cer, Naveen Arivazhagan, and Wei Wang. 2020. “Language-Agnostic BERT Sentence Embedding.” Computing Research Repository arXiv:2007.01852. https://arxiv.org/abs/2007.01852.
Rajda, Krzysztof, Lukasz Augustyniak, Piotr Gramacki, Marcin Gruza, Szymon Woźniak, and Tomasz Kajdanowicz. 2022. “Assessment of Massively Multilingual Sentiment Classifiers.” In Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, 125–40. Dublin, Ireland: Association for Computational Linguistics. https://doi.org/10.18653/v1/2022.wassa-1.13.
Reimers, Nils, and Iryna Gurevych. 2020. “Making Monolingual Sentence Embeddings Multilingual Using Knowledge Distillation.” In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 4512–25. Online: Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.emnlp-main.365.
Sanh, Victor, Lysandre Debut, Julien Chaumond, and Thomas Wolf. 2020. “DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter.” Computing Research Repository arXiv:1910.01108. https://arxiv.org/abs/1910.01108.
Xue, Linting, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, and Colin Raffel. 2021. “mT5: A Massively Multilingual Pre-Trained Text-to-Text Transformer.” In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 483–98. Online: Association for Computational Linguistics. https://doi.org/10.18653/v1/2021.naacl-main.41.
Yang, Yinfei, Daniel Cer, Amin Ahmad, Mandy Guo, Jax Law, Noah Constant, Gustavo Hernandez Abrego, et al. 2020. “Multilingual Universal Sentence Encoder for Semantic Retrieval.” In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 87–94. Online: Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.acl-demos.12.