WAT

The Workshop on Asian Translation

Evaluation Results

BLEU

#	Team	Task	Date/Time	DataID	BLEU										Method	Other Resources	System Description
#	Team	Task	Date/Time	DataID	juman	kytea	mecab	moses-tokenizer	stanford-segmenter-ctb	stanford-segmenter-pku	indic-tokenizer	unuse	myseg	kmseg	Method	Other Resources	System Description
1	ORGANIZER	INDIC21gu-en	2021/04/08 17:20:38	4791	-	-	-	26.21	-	-	-	-	-	-	NMT	No	Bilingual baseline trained on PMI data. Transformer base. LR=10^-3
2	NICT-5	INDIC21gu-en	2021/04/21 15:41:23	5276	-	-	-	33.65	-	-	-	-	-	-	NMT	No	Pretrain MBART on IndicCorp and FT on bilingual PMI data. Beam search. Model is bilingual.
3	SRPOL	INDIC21gu-en	2021/04/21 19:30:48	5326	-	-	-	35.86	-	-	-	-	-	-	NMT	No	Base transformer on all WAT21 data
4	NICT-5	INDIC21gu-en	2021/04/22 11:51:12	5351	-	-	-	33.53	-	-	-	-	-	-	NMT	No	MBART+MNMT. Beam 4.
5	gaurvar	INDIC21gu-en	2021/04/25 18:13:05	5535	-	-	-	15.53	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
6	gaurvar	INDIC21gu-en	2021/04/25 18:31:36	5546	-	-	-	17.48	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
7	gaurvar	INDIC21gu-en	2021/04/25 18:44:52	5557	-	-	-	16.79	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 model trained for multiple indic languages
8	gaurvar	INDIC21gu-en	2021/04/25 18:57:22	5566	-	-	-	17.50	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
9	sakura	INDIC21gu-en	2021/04/30 22:37:18	5871	-	-	-	38.73	-	-	-	-	-	-	NMT	No	Fine-tuning of multilingual mBART many2many model with training corpus.
10	IIIT-H	INDIC21gu-en	2021/05/03 18:12:36	6016	-	-	-	39.39	-	-	-	-	-	-	NMT	No	MNMT system (XX-En) trained via exploiting lexical similarity on PMI+CVIT parallel corpus, then improved using back translation on PMI monolingual data followed by fine tuning.
11	CFILT	INDIC21gu-en	2021/05/04 01:11:08	6053	-	-	-	35.31	-	-	-	-	-	-	NMT	No	Multilingual(Many-to-One(XX-En)) NMT model based on Transformer with shared encoder and decoder.
12	coastal	INDIC21gu-en	2021/05/04 01:43:37	6096	-	-	-	23.88	-	-	-	-	-	-	NMT	No	seq2seq model trained on all WAT2021 data
13	CFILT-IITB	INDIC21gu-en	2021/05/04 01:52:25	6114	-	-	-	28.79	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all indic language data converted to same script
14	CFILT-IITB	INDIC21gu-en	2021/05/04 01:57:18	6125	-	-	-	31.02	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all Indo-Aryan languages data converted to same script
15	coastal	INDIC21gu-en	2021/05/04 05:41:08	6163	-	-	-	34.60	-	-	-	-	-	-	NMT	No	mT5 trained only on PMI
16	sakura	INDIC21gu-en	2021/05/04 13:12:11	6203	-	-	-	39.27	-	-	-	-	-	-	NMT	No	Pre-training multilingual mBART many2many model with training corpus followed by finetuning on PMI Parallel.
17	SRPOL	INDIC21gu-en	2021/05/04 15:25:07	6243	-	-	-	43.98	-	-	-	-	-	-	NMT	No	Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI
18	SRPOL	INDIC21gu-en	2021/05/04 16:29:47	6269	-	-	-	42.87	-	-	-	-	-	-	NMT	No	Many-to-one on all data. Pretrained on BT, finetuned on PMI
19	IITP-MT	INDIC21gu-en	2021/05/04 17:44:27	6282	-	-	-	36.49	-	-	-	-	-	-	NMT	No	Many-to-One model trained on all training data with base Transformer. All indic language data is romanized. Model fine-tuned on BT PMI monolingual corpus.
20	mcairt	INDIC21gu-en	2021/05/04 19:15:33	6334	-	-	-	36.77	-	-	-	-	-	-	NMT	No	multilingual model(many to one) trained on all WAT 2021 data by using base transformer.
21	NICT-5	INDIC21gu-en	2021/06/21 11:53:21	6474	-	-	-	37.49	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on an mbart model trained for over 5 epochs.
22	NICT-5	INDIC21gu-en	2021/06/25 11:47:19	6494	-	-	-	38.13	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on a mbart model trained for over 5 epochs. MNMT model.

RIBES

#	Team	Task	Date/Time	DataID	RIBES										Method	Other Resources	System Description
#	Team	Task	Date/Time	DataID	juman	kytea	mecab	moses-tokenizer	stanford-segmenter-ctb	stanford-segmenter-pku	indic-tokenizer	unuse	myseg	kmseg	Method	Other Resources	System Description
1	ORGANIZER	INDIC21gu-en	2021/04/08 17:20:38	4791	-	-	-	0.764569	-	-	-	-	-	-	NMT	No	Bilingual baseline trained on PMI data. Transformer base. LR=10^-3
2	NICT-5	INDIC21gu-en	2021/04/21 15:41:23	5276	-	-	-	0.810918	-	-	-	-	-	-	NMT	No	Pretrain MBART on IndicCorp and FT on bilingual PMI data. Beam search. Model is bilingual.
3	SRPOL	INDIC21gu-en	2021/04/21 19:30:48	5326	-	-	-	0.818570	-	-	-	-	-	-	NMT	No	Base transformer on all WAT21 data
4	NICT-5	INDIC21gu-en	2021/04/22 11:51:12	5351	-	-	-	0.811609	-	-	-	-	-	-	NMT	No	MBART+MNMT. Beam 4.
5	gaurvar	INDIC21gu-en	2021/04/25 18:13:05	5535	-	-	-	0.668248	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
6	gaurvar	INDIC21gu-en	2021/04/25 18:31:36	5546	-	-	-	0.690114	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
7	gaurvar	INDIC21gu-en	2021/04/25 18:44:52	5557	-	-	-	0.715044	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 model trained for multiple indic languages
8	gaurvar	INDIC21gu-en	2021/04/25 18:57:22	5566	-	-	-	0.712002	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
9	sakura	INDIC21gu-en	2021/04/30 22:37:18	5871	-	-	-	0.834934	-	-	-	-	-	-	NMT	No	Fine-tuning of multilingual mBART many2many model with training corpus.
10	IIIT-H	INDIC21gu-en	2021/05/03 18:12:36	6016	-	-	-	0.830158	-	-	-	-	-	-	NMT	No	MNMT system (XX-En) trained via exploiting lexical similarity on PMI+CVIT parallel corpus, then improved using back translation on PMI monolingual data followed by fine tuning.
11	CFILT	INDIC21gu-en	2021/05/04 01:11:08	6053	-	-	-	0.807849	-	-	-	-	-	-	NMT	No	Multilingual(Many-to-One(XX-En)) NMT model based on Transformer with shared encoder and decoder.
12	coastal	INDIC21gu-en	2021/05/04 01:43:37	6096	-	-	-	0.775336	-	-	-	-	-	-	NMT	No	seq2seq model trained on all WAT2021 data
13	CFILT-IITB	INDIC21gu-en	2021/05/04 01:52:25	6114	-	-	-	0.786408	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all indic language data converted to same script
14	CFILT-IITB	INDIC21gu-en	2021/05/04 01:57:18	6125	-	-	-	0.795199	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all Indo-Aryan languages data converted to same script
15	coastal	INDIC21gu-en	2021/05/04 05:41:08	6163	-	-	-	0.824060	-	-	-	-	-	-	NMT	No	mT5 trained only on PMI
16	sakura	INDIC21gu-en	2021/05/04 13:12:11	6203	-	-	-	0.839416	-	-	-	-	-	-	NMT	No	Pre-training multilingual mBART many2many model with training corpus followed by finetuning on PMI Parallel.
17	SRPOL	INDIC21gu-en	2021/05/04 15:25:07	6243	-	-	-	0.853263	-	-	-	-	-	-	NMT	No	Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI
18	SRPOL	INDIC21gu-en	2021/05/04 16:29:47	6269	-	-	-	0.849734	-	-	-	-	-	-	NMT	No	Many-to-one on all data. Pretrained on BT, finetuned on PMI
19	IITP-MT	INDIC21gu-en	2021/05/04 17:44:27	6282	-	-	-	0.827301	-	-	-	-	-	-	NMT	No	Many-to-One model trained on all training data with base Transformer. All indic language data is romanized. Model fine-tuned on BT PMI monolingual corpus.
20	mcairt	INDIC21gu-en	2021/05/04 19:15:33	6334	-	-	-	0.829389	-	-	-	-	-	-	NMT	No	multilingual model(many to one) trained on all WAT 2021 data by using base transformer.
21	NICT-5	INDIC21gu-en	2021/06/21 11:53:21	6474	-	-	-	0.827233	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on an mbart model trained for over 5 epochs.
22	NICT-5	INDIC21gu-en	2021/06/25 11:47:19	6494	-	-	-	0.830437	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on a mbart model trained for over 5 epochs. MNMT model.

AMFM

#	Team	Task	Date/Time	DataID	AMFM										Method	Other Resources	System Description
#	Team	Task	Date/Time	DataID	unuse	unuse	unuse	unuse	unuse	unuse	unuse	unuse	unuse	unuse	Method	Other Resources	System Description
1	ORGANIZER	INDIC21gu-en	2021/04/08 17:20:38	4791	-	-	-	0.726576	-	-	-	-	-	-	NMT	No	Bilingual baseline trained on PMI data. Transformer base. LR=10^-3
2	NICT-5	INDIC21gu-en	2021/04/21 15:41:23	5276	-	-	-	0.793874	-	-	-	-	-	-	NMT	No	Pretrain MBART on IndicCorp and FT on bilingual PMI data. Beam search. Model is bilingual.
3	SRPOL	INDIC21gu-en	2021/04/21 19:30:48	5326	-	-	-	0.812870	-	-	-	-	-	-	NMT	No	Base transformer on all WAT21 data
4	NICT-5	INDIC21gu-en	2021/04/22 11:51:12	5351	-	-	-	0.796604	-	-	-	-	-	-	NMT	No	MBART+MNMT. Beam 4.
5	gaurvar	INDIC21gu-en	2021/04/25 18:13:05	5535	-	-	-	0.687977	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
6	gaurvar	INDIC21gu-en	2021/04/25 18:31:36	5546	-	-	-	0.696278	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
7	gaurvar	INDIC21gu-en	2021/04/25 18:44:52	5557	-	-	-	0.696879	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 model trained for multiple indic languages
8	gaurvar	INDIC21gu-en	2021/04/25 18:57:22	5566	-	-	-	0.698257	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
9	sakura	INDIC21gu-en	2021/04/30 22:37:18	5871	-	-	-	0.820654	-	-	-	-	-	-	NMT	No	Fine-tuning of multilingual mBART many2many model with training corpus.
10	IIIT-H	INDIC21gu-en	2021/05/03 18:12:36	6016	-	-	-	0.806061	-	-	-	-	-	-	NMT	No	MNMT system (XX-En) trained via exploiting lexical similarity on PMI+CVIT parallel corpus, then improved using back translation on PMI monolingual data followed by fine tuning.
11	CFILT	INDIC21gu-en	2021/05/04 01:11:08	6053	-	-	-	0.797069	-	-	-	-	-	-	NMT	No	Multilingual(Many-to-One(XX-En)) NMT model based on Transformer with shared encoder and decoder.
12	coastal	INDIC21gu-en	2021/05/04 01:43:37	6096	-	-	-	0.779452	-	-	-	-	-	-	NMT	No	seq2seq model trained on all WAT2021 data
13	CFILT-IITB	INDIC21gu-en	2021/05/04 01:52:25	6114	-	-	-	0.765441	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all indic language data converted to same script
14	CFILT-IITB	INDIC21gu-en	2021/05/04 01:57:18	6125	-	-	-	0.776935	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all Indo-Aryan languages data converted to same script
15	coastal	INDIC21gu-en	2021/05/04 05:41:08	6163	-	-	-	0.814168	-	-	-	-	-	-	NMT	No	mT5 trained only on PMI
16	sakura	INDIC21gu-en	2021/05/04 13:12:11	6203	-	-	-	0.818623	-	-	-	-	-	-	NMT	No	Pre-training multilingual mBART many2many model with training corpus followed by finetuning on PMI Parallel.
17	SRPOL	INDIC21gu-en	2021/05/04 15:25:07	6243	-	-	-	0.835789	-	-	-	-	-	-	NMT	No	Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI
18	SRPOL	INDIC21gu-en	2021/05/04 16:29:47	6269	-	-	-	0.833146	-	-	-	-	-	-	NMT	No	Many-to-one on all data. Pretrained on BT, finetuned on PMI
19	IITP-MT	INDIC21gu-en	2021/05/04 17:44:27	6282	-	-	-	0.814556	-	-	-	-	-	-	NMT	No	Many-to-One model trained on all training data with base Transformer. All indic language data is romanized. Model fine-tuned on BT PMI monolingual corpus.
20	mcairt	INDIC21gu-en	2021/05/04 19:15:33	6334	-	-	-	0.819546	-	-	-	-	-	-	NMT	No	multilingual model(many to one) trained on all WAT 2021 data by using base transformer.
21	NICT-5	INDIC21gu-en	2021/06/21 11:53:21	6474	-	-	-	0.000000	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on an mbart model trained for over 5 epochs.
22	NICT-5	INDIC21gu-en	2021/06/25 11:47:19	6494	-	-	-	0.000000	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on a mbart model trained for over 5 epochs. MNMT model.

Notice:

Adequacy-Fluency Metrics (AMFM) is a two-dimensional automatic evaluation metric for machine translation, designed to operate at the sentence level. It scores adequacy and fluency separately, decoupling the semantic and syntactic components of the translation process to give a balanced view of translation quality.
AMFM is calculated without tokenizers.
AMFM is described in detail in the paper "Adequacy–Fluency Metrics: Evaluating MT in the Continuous Space Model Framework". The invited talk at WAT2015 is also a helpful introduction.

HUMAN (WAT2021)

#	Team	Task	Date/Time	DataID	HUMAN	Method	Other Resources	System Description
1	NICT-5	INDIC21gu-en	2021/04/21 15:41:23	5276	Underway	NMT	No	Pretrain MBART on IndicCorp and FT on bilingual PMI data. Beam search. Model is bilingual.
2	NICT-5	INDIC21gu-en	2021/04/22 11:51:12	5351	Underway	NMT	No	MBART+MNMT. Beam 4.
3	gaurvar	INDIC21gu-en	2021/04/25 18:44:52	5557	Underway	NMT	No	Multi Task Multi Lingual T5 model trained for multiple indic languages
4	gaurvar	INDIC21gu-en	2021/04/25 18:57:22	5566	Underway	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
5	sakura	INDIC21gu-en	2021/04/30 22:37:18	5871	Underway	NMT	No	Fine-tuning of multilingual mBART many2many model with training corpus.
6	IIIT-H	INDIC21gu-en	2021/05/03 18:12:36	6016	Underway	NMT	No	MNMT system (XX-En) trained via exploiting lexical similarity on PMI+CVIT parallel corpus, then improved using back translation on PMI monolingual data followed by fine tuning.
7	CFILT	INDIC21gu-en	2021/05/04 01:11:08	6053	Underway	NMT	No	Multilingual(Many-to-One(XX-En)) NMT model based on Transformer with shared encoder and decoder.
8	CFILT-IITB	INDIC21gu-en	2021/05/04 01:52:25	6114	Underway	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all indic language data converted to same script
9	CFILT-IITB	INDIC21gu-en	2021/05/04 01:57:18	6125	Underway	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all Indo-Aryan languages data converted to same script
10	coastal	INDIC21gu-en	2021/05/04 05:41:08	6163	Underway	NMT	No	mT5 trained only on PMI
11	SRPOL	INDIC21gu-en	2021/05/04 15:25:07	6243	Underway	NMT	No	Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI
12	SRPOL	INDIC21gu-en	2021/05/04 16:29:47	6269	Underway	NMT	No	Many-to-one on all data. Pretrained on BT, finetuned on PMI
13	IITP-MT	INDIC21gu-en	2021/05/04 17:44:27	6282	Underway	NMT	No	Many-to-One model trained on all training data with base Transformer. All indic language data is romanized. Model fine-tuned on BT PMI monolingual corpus.
14	mcairt	INDIC21gu-en	2021/05/04 19:15:33	6334	Underway	NMT	No	multilingual model(many to one) trained on all WAT 2021 data by using base transformer.

Notice:

HUMAN (WAT2021) is the result of the Pairwise Crowdsourcing Evaluation at WAT2021.
Each HUMAN (WAT2021) judgement was made by 5 different workers, and the final decision was reached by a vote over their judgements.
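The voting step can be illustrated with a minimal sketch. The exact rule WAT applies is not stated on this page, so the labels ("win", "tie", "loss"), the function name `decide_by_vote`, and the choice to report "tie" when no label leads outright are all assumptions made for illustration only.

```python
from collections import Counter


def decide_by_vote(judgements):
    """Return the label that received the most votes.

    `judgements` holds one label per worker, e.g. 5 labels for 5 workers.
    The labels and the tie-handling rule are hypothetical, not WAT's own.
    """
    if not judgements:
        raise ValueError("at least one judgement is required")
    ranked = Counter(judgements).most_common()
    # If the two most frequent labels have equal counts, no label leads outright.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "tie"
    return ranked[0][0]


# Five hypothetical worker judgements for one sentence pair.
print(decide_by_vote(["win", "win", "tie", "loss", "win"]))
```

With five workers, a label chosen by three of them always wins under this rule; the tie branch only matters for splits such as 2-2-1.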

EVALUATION RESULTS USAGE POLICY

When you use the WAT evaluation results for any purpose, such as:
- writing technical papers,
- making presentations about your system,
- advertising your MT system to customers,
you can use the information about translation directions, scores (both automatic and human evaluations), and the rank of your system among the others. You can also use the scores of the other systems, but you MUST anonymize the other systems' names. In addition, you can show the links (URLs) to the WAT evaluation result pages.
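One way to follow the anonymization requirement when reusing a results table is sketched below. The function name `anonymize_others`, the "Other system N" placeholder format, and the choice of "sakura" as the reporting team are illustrative assumptions; the policy itself does not prescribe any particular scheme. The scores are BLEU values taken from the table on this page.

```python
def anonymize_others(rows, own_team):
    """Replace every team name except `own_team` with a stable placeholder.

    `rows` is a list of (team, score) pairs. The same team always maps to
    the same placeholder, so multiple submissions stay grouped together.
    """
    aliases = {}
    result = []
    for team, score in rows:
        if team != own_team:
            if team not in aliases:
                aliases[team] = f"Other system {len(aliases) + 1}"
            team = aliases[team]
        result.append((team, score))
    return result


# BLEU scores from the table above, reported from the viewpoint of one team.
rows = [("SRPOL", 43.98), ("IIIT-H", 39.39), ("sakura", 39.27), ("SRPOL", 42.87)]
for team, score in anonymize_others(rows, own_team="sakura"):
    print(f"{team}\t{score:.2f}")
```

Keeping the mapping stable per team is a design choice: it preserves the fact that two rows come from the same participant without revealing who that participant is.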

NICT (National Institute of Information and Communications Technology)
Kyoto University
Last Modified: 2018-08-02
