
The Workshop on Asian Translation (WAT)
Evaluation Results


BLEU


Scores appear in the moses-tokenizer column of the original table; the other segmentation columns (juman, kytea, mecab, stanford-segmenter-ctb, stanford-segmenter-pku, indic-tokenizer, unuse, myseg, kmseg) were empty for this task and are omitted here.

# | Team | Task | Date/Time | DataID | BLEU | Method | Other Resources | System Description
1 | SRPOL | INDIC21pa-en | 2021/05/04 15:28:09 | 6249 | 46.39 | NMT | No | Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI.
2 | SRPOL | INDIC21pa-en | 2021/05/04 16:33:03 | 6275 | 44.87 | NMT | No | Many-to-one on all data. Pretrained on BT, finetuned on PMI.
3 | NICT-5 | INDIC21pa-en | 2021/06/21 12:05:01 | 6479 | 43.06 | NMT | No | Using PMI and PIB data for fine-tuning on an mBART model trained for over 5 epochs.
4 | NICT-5 | INDIC21pa-en | 2021/06/25 11:50:20 | 6500 | 42.44 | NMT | No | Using PMI and PIB data for fine-tuning on an mBART model trained for over 5 epochs. MNMT model.
5 | IIIT-H | INDIC21pa-en | 2021/05/03 18:15:36 | 6023 | 41.24 | NMT | No | MNMT system (XX-En) trained by exploiting lexical similarity on the PMI+CVIT parallel corpus, then improved with back-translation on PMI monolingual data, followed by fine-tuning.
6 | sakura | INDIC21pa-en | 2021/05/04 13:20:33 | 6209 | 41.18 | NMT | No | Pre-training of a multilingual mBART many-to-many model on the training corpus, followed by fine-tuning on PMI parallel data.
7 | sakura | INDIC21pa-en | 2021/04/30 22:59:33 | 5877 | 40.38 | NMT | No | Fine-tuning of a multilingual mBART many-to-many model on the training corpus.
8 | mcairt | INDIC21pa-en | 2021/05/04 19:28:11 | 6342 | 38.42 | NMT | No | Multilingual model (many-to-one) trained on all WAT 2021 data using a base Transformer.
9 | IITP-MT | INDIC21pa-en | 2021/05/04 18:12:25 | 6301 | 38.41 | NMT | No | Many-to-one model trained on all training data with a base Transformer. All Indic-language data is romanized. Model fine-tuned on the back-translated PMI monolingual corpus.
10 | CFILT | INDIC21pa-en | 2021/05/04 01:17:27 | 6059 | 38.01 | NMT | No | Multilingual (many-to-one, XX-En) NMT model based on the Transformer with shared encoder and decoder.
11 | SRPOL | INDIC21pa-en | 2021/04/21 19:33:28 | 5332 | 37.61 | NMT | No | Base Transformer on all WAT21 data.
12 | coastal | INDIC21pa-en | 2021/05/04 05:43:28 | 6168 | 35.90 | NMT | No | mT5 trained only on PMI.
13 | NICT-5 | INDIC21pa-en | 2021/04/22 11:53:47 | 5363 | 35.81 | NMT | No | mBART+MNMT. Beam 4.
14 | NICT-5 | INDIC21pa-en | 2021/04/21 15:45:23 | 5288 | 34.34 | NMT | No | Pretrain mBART on IndicCorp and fine-tune on bilingual PMI data. Beam search. Model is bilingual.
15 | CFILT-IITB | INDIC21pa-en | 2021/05/04 01:59:52 | 6129 | 32.34 | NMT | No | Multilingual NMT (many-to-one): Transformer-based model with shared encoder-decoder and shared BPE vocabulary, trained on all Indo-Aryan language data converted to the same script.
16 | CFILT-IITB | INDIC21pa-en | 2021/05/04 01:56:11 | 6123 | 29.87 | NMT | No | Multilingual NMT (many-to-one): Transformer-based model with shared encoder-decoder and shared BPE vocabulary, trained on all Indic language data converted to the same script.
17 | coastal | INDIC21pa-en | 2021/05/04 01:46:35 | 6108 | 25.44 | NMT | No | seq2seq model trained on all WAT2021 data.
18 | NLPHut | INDIC21pa-en | 2021/03/20 00:15:19 | 4615 | 24.35 | NMT | No | Transformer trained on all languages' PMI data, then fine-tuned on all pa-en data.
19 | ORGANIZER | INDIC21pa-en | 2021/04/08 17:25:18 | 4803 | 23.66 | NMT | No | Bilingual baseline trained on PMI data. Transformer base. LR=1e-3.
20 | gaurvar | INDIC21pa-en | 2021/04/25 18:35:36 | 5551 | 18.61 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
21 | gaurvar | INDIC21pa-en | 2021/04/25 19:02:55 | 5572 | 18.59 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
22 | gaurvar | INDIC21pa-en | 2021/04/25 18:49:16 | 5562 | 17.86 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
23 | gaurvar | INDIC21pa-en | 2021/04/25 18:11:23 | 5534 | 16.46 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
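
To reproduce an automatic score on your own output, the snippet below is a minimal sketch of corpus-level BLEU using the sacrebleu Python package. It approximates, rather than reproduces, the official WAT pipeline (which, per the note above, scores English output under Moses tokenization); the file names are hypothetical placeholders.

```python
# Minimal sketch: corpus-level BLEU with sacrebleu.
# File names are hypothetical; one sentence per line is assumed.
import sacrebleu

with open("system_output.en") as f:
    hypotheses = [line.rstrip("\n") for line in f]
with open("reference.en") as f:
    references = [line.rstrip("\n") for line in f]

# sacrebleu takes a list of reference streams; here, a single reference set.
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
print(f"BLEU = {bleu.score:.2f}")
```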


RIBES


As in the BLEU table, scores appear in the moses-tokenizer column; the other segmentation columns were empty and are omitted.

# | Team | Task | Date/Time | DataID | RIBES | Method | Other Resources | System Description
1 | SRPOL | INDIC21pa-en | 2021/05/04 15:28:09 | 6249 | 0.865765 | NMT | No | Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI.
2 | SRPOL | INDIC21pa-en | 2021/05/04 16:33:03 | 6275 | 0.861389 | NMT | No | Many-to-one on all data. Pretrained on BT, finetuned on PMI.
3 | sakura | INDIC21pa-en | 2021/05/04 13:20:33 | 6209 | 0.849499 | NMT | No | Pre-training of a multilingual mBART many-to-many model on the training corpus, followed by fine-tuning on PMI parallel data.
4 | NICT-5 | INDIC21pa-en | 2021/06/25 11:50:20 | 6500 | 0.845966 | NMT | No | Using PMI and PIB data for fine-tuning on an mBART model trained for over 5 epochs. MNMT model.
5 | sakura | INDIC21pa-en | 2021/04/30 22:59:33 | 5877 | 0.844351 | NMT | No | Fine-tuning of a multilingual mBART many-to-many model on the training corpus.
6 | NICT-5 | INDIC21pa-en | 2021/06/21 12:05:01 | 6479 | 0.842755 | NMT | No | Using PMI and PIB data for fine-tuning on an mBART model trained for over 5 epochs.
7 | mcairt | INDIC21pa-en | 2021/05/04 19:28:11 | 6342 | 0.840360 | NMT | No | Multilingual model (many-to-one) trained on all WAT 2021 data using a base Transformer.
8 | IITP-MT | INDIC21pa-en | 2021/05/04 18:12:25 | 6301 | 0.839598 | NMT | No | Many-to-one model trained on all training data with a base Transformer. All Indic-language data is romanized. Model fine-tuned on the back-translated PMI monolingual corpus.
9 | IIIT-H | INDIC21pa-en | 2021/05/03 18:15:36 | 6023 | 0.837608 | NMT | No | MNMT system (XX-En) trained by exploiting lexical similarity on the PMI+CVIT parallel corpus, then improved with back-translation on PMI monolingual data, followed by fine-tuning.
10 | coastal | INDIC21pa-en | 2021/05/04 05:43:28 | 6168 | 0.835327 | NMT | No | mT5 trained only on PMI.
11 | SRPOL | INDIC21pa-en | 2021/04/21 19:33:28 | 5332 | 0.833454 | NMT | No | Base Transformer on all WAT21 data.
12 | NICT-5 | INDIC21pa-en | 2021/04/22 11:53:47 | 5363 | 0.827528 | NMT | No | mBART+MNMT. Beam 4.
13 | CFILT | INDIC21pa-en | 2021/05/04 01:17:27 | 6059 | 0.818396 | NMT | No | Multilingual (many-to-one, XX-En) NMT model based on the Transformer with shared encoder and decoder.
14 | NICT-5 | INDIC21pa-en | 2021/04/21 15:45:23 | 5288 | 0.816975 | NMT | No | Pretrain mBART on IndicCorp and fine-tune on bilingual PMI data. Beam search. Model is bilingual.
15 | CFILT-IITB | INDIC21pa-en | 2021/05/04 01:59:52 | 6129 | 0.805722 | NMT | No | Multilingual NMT (many-to-one): Transformer-based model with shared encoder-decoder and shared BPE vocabulary, trained on all Indo-Aryan language data converted to the same script.
16 | CFILT-IITB | INDIC21pa-en | 2021/05/04 01:56:11 | 6123 | 0.795413 | NMT | No | Multilingual NMT (many-to-one): Transformer-based model with shared encoder-decoder and shared BPE vocabulary, trained on all Indic language data converted to the same script.
17 | coastal | INDIC21pa-en | 2021/05/04 01:46:35 | 6108 | 0.791210 | NMT | No | seq2seq model trained on all WAT2021 data.
18 | NLPHut | INDIC21pa-en | 2021/03/20 00:15:19 | 4615 | 0.766047 | NMT | No | Transformer trained on all languages' PMI data, then fine-tuned on all pa-en data.
19 | ORGANIZER | INDIC21pa-en | 2021/04/08 17:25:18 | 4803 | 0.749459 | NMT | No | Bilingual baseline trained on PMI data. Transformer base. LR=1e-3.
20 | gaurvar | INDIC21pa-en | 2021/04/25 18:49:16 | 5562 | 0.734748 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
21 | gaurvar | INDIC21pa-en | 2021/04/25 19:02:55 | 5572 | 0.730487 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
22 | gaurvar | INDIC21pa-en | 2021/04/25 18:35:36 | 5551 | 0.703876 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
23 | gaurvar | INDIC21pa-en | 2021/04/25 18:11:23 | 5534 | 0.679540 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
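
Unlike BLEU, RIBES is driven by global word order: it combines a rank correlation between aligned word positions with a unigram-precision term and a brevity penalty. The function below is a simplified single-reference sketch using the default weights alpha=0.25 and beta=0.10; it aligns only words that occur exactly once on both sides, whereas the official RIBES tool also disambiguates repeated words using their context.

```python
import math
from itertools import combinations

def simple_ribes(hyp, ref, alpha=0.25, beta=0.10):
    """Simplified RIBES: NKT * precision**alpha * brevity_penalty**beta."""
    # Align hypothesis words to reference positions; for an unambiguous
    # alignment, keep only words occurring exactly once on both sides.
    ranks = [ref.index(w) for w in hyp
             if hyp.count(w) == 1 and ref.count(w) == 1]
    if len(ranks) < 2:
        return 0.0
    # Normalized Kendall's tau: fraction of aligned word pairs that appear
    # in the same relative order in hypothesis and reference.
    pairs = list(combinations(ranks, 2))
    nkt = sum(1 for a, b in pairs if a < b) / len(pairs)
    precision = len(ranks) / len(hyp)                 # unigram precision
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))  # brevity penalty
    return nkt * precision ** alpha * bp ** beta

# Identical sentences with unique words score 1.0.
print(simple_ribes("parliament passed this bill today".split(),
                   "parliament passed this bill today".split()))
```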


AMFM


The original table's ten unused segmentation columns were empty and are omitted.

# | Team | Task | Date/Time | DataID | AMFM | Method | Other Resources | System Description
1 | SRPOL | INDIC21pa-en | 2021/05/04 15:28:09 | 6249 | 0.841641 | NMT | No | Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI.
2 | SRPOL | INDIC21pa-en | 2021/05/04 16:33:03 | 6275 | 0.836440 | NMT | No | Many-to-one on all data. Pretrained on BT, finetuned on PMI.
3 | sakura | INDIC21pa-en | 2021/04/30 22:59:33 | 5877 | 0.823464 | NMT | No | Fine-tuning of a multilingual mBART many-to-many model on the training corpus.
4 | sakura | INDIC21pa-en | 2021/05/04 13:20:33 | 6209 | 0.823371 | NMT | No | Pre-training of a multilingual mBART many-to-many model on the training corpus, followed by fine-tuning on PMI parallel data.
5 | mcairt | INDIC21pa-en | 2021/05/04 19:28:11 | 6342 | 0.818332 | NMT | No | Multilingual model (many-to-one) trained on all WAT 2021 data using a base Transformer.
6 | IITP-MT | INDIC21pa-en | 2021/05/04 18:12:25 | 6301 | 0.815989 | NMT | No | Many-to-one model trained on all training data with a base Transformer. All Indic-language data is romanized. Model fine-tuned on the back-translated PMI monolingual corpus.
7 | SRPOL | INDIC21pa-en | 2021/04/21 19:33:28 | 5332 | 0.815069 | NMT | No | Base Transformer on all WAT21 data.
8 | coastal | INDIC21pa-en | 2021/05/04 05:43:28 | 6168 | 0.814440 | NMT | No | mT5 trained only on PMI.
9 | IIIT-H | INDIC21pa-en | 2021/05/03 18:15:36 | 6023 | 0.811169 | NMT | No | MNMT system (XX-En) trained by exploiting lexical similarity on the PMI+CVIT parallel corpus, then improved with back-translation on PMI monolingual data, followed by fine-tuning.
10 | CFILT | INDIC21pa-en | 2021/05/04 01:17:27 | 6059 | 0.804561 | NMT | No | Multilingual (many-to-one, XX-En) NMT model based on the Transformer with shared encoder and decoder.
11 | NICT-5 | INDIC21pa-en | 2021/04/22 11:53:47 | 5363 | 0.800753 | NMT | No | mBART+MNMT. Beam 4.
12 | NICT-5 | INDIC21pa-en | 2021/04/21 15:45:23 | 5288 | 0.792541 | NMT | No | Pretrain mBART on IndicCorp and fine-tune on bilingual PMI data. Beam search. Model is bilingual.
13 | CFILT-IITB | INDIC21pa-en | 2021/05/04 01:59:52 | 6129 | 0.782112 | NMT | No | Multilingual NMT (many-to-one): Transformer-based model with shared encoder-decoder and shared BPE vocabulary, trained on all Indo-Aryan language data converted to the same script.
14 | coastal | INDIC21pa-en | 2021/05/04 01:46:35 | 6108 | 0.779252 | NMT | No | seq2seq model trained on all WAT2021 data.
15 | CFILT-IITB | INDIC21pa-en | 2021/05/04 01:56:11 | 6123 | 0.772655 | NMT | No | Multilingual NMT (many-to-one): Transformer-based model with shared encoder-decoder and shared BPE vocabulary, trained on all Indic language data converted to the same script.
16 | NLPHut | INDIC21pa-en | 2021/03/20 00:15:19 | 4615 | 0.717322 | NMT | No | Transformer trained on all languages' PMI data, then fine-tuned on all pa-en data.
17 | ORGANIZER | INDIC21pa-en | 2021/04/08 17:25:18 | 4803 | 0.701483 | NMT | No | Bilingual baseline trained on PMI data. Transformer base. LR=1e-3.
18 | gaurvar | INDIC21pa-en | 2021/04/25 19:02:55 | 5572 | 0.694658 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
19 | gaurvar | INDIC21pa-en | 2021/04/25 18:35:36 | 5551 | 0.693631 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
20 | gaurvar | INDIC21pa-en | 2021/04/25 18:49:16 | 5562 | 0.692458 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
21 | gaurvar | INDIC21pa-en | 2021/04/25 18:11:23 | 5534 | 0.686625 | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
22 | NICT-5 | INDIC21pa-en | 2021/06/21 12:05:01 | 6479 | 0.000000 | NMT | No | Using PMI and PIB data for fine-tuning on an mBART model trained for over 5 epochs.
23 | NICT-5 | INDIC21pa-en | 2021/06/25 11:50:20 | 6500 | 0.000000 | NMT | No | Using PMI and PIB data for fine-tuning on an mBART model trained for over 5 epochs. MNMT model.
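
AMFM (Adequacy-Fluency Metrics) scores a translation by combining an adequacy model (AM), the similarity of hypothesis and reference in a low-rank latent space, with a fluency model (FM) based on a language model. The sketch below implements only an AM-style adequacy score with scikit-learn; the latent dimensionality, the word 1-2-gram features, and the training corpus are illustrative assumptions, not the official AMFM configuration.

```python
# Sketch of an AM-style adequacy score: cosine similarity between hypothesis
# and reference in a latent space learned by truncated SVD over bag-of-ngram
# counts. The official AMFM additionally interpolates a fluency (FM) score
# from a language model; all settings here are illustrative assumptions.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

def am_scores(hypotheses, references, train_corpus, dim=500):
    vectorizer = CountVectorizer(ngram_range=(1, 2))
    counts = vectorizer.fit_transform(train_corpus)  # latent space is learned
    svd = TruncatedSVD(n_components=min(dim, counts.shape[1] - 1))
    svd.fit(counts)                                  # ...from monolingual text
    h = svd.transform(vectorizer.transform(hypotheses))
    r = svd.transform(vectorizer.transform(references))
    # One cosine similarity per sentence pair, clipped to [0, 1].
    return [max(0.0, cosine_similarity(h[i:i + 1], r[i:i + 1])[0, 0])
            for i in range(len(hypotheses))]
```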


HUMAN (WAT2022)


No submissions.

HUMAN (WAT2021)


# | Team | Task | Date/Time | DataID | HUMAN | Method | Other Resources | System Description
1 | NLPHut | INDIC21pa-en | 2021/03/20 00:15:19 | 4615 | Underway | NMT | No | Transformer trained on all languages' PMI data, then fine-tuned on all pa-en data.
2 | NICT-5 | INDIC21pa-en | 2021/04/21 15:45:23 | 5288 | Underway | NMT | No | Pretrain mBART on IndicCorp and fine-tune on bilingual PMI data. Beam search. Model is bilingual.
3 | NICT-5 | INDIC21pa-en | 2021/04/22 11:53:47 | 5363 | Underway | NMT | No | mBART+MNMT. Beam 4.
4 | gaurvar | INDIC21pa-en | 2021/04/25 18:35:36 | 5551 | Underway | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
5 | gaurvar | INDIC21pa-en | 2021/04/25 19:02:55 | 5572 | Underway | NMT | No | Multi-task multilingual T5 trained for multiple Indic languages.
6 | sakura | INDIC21pa-en | 2021/04/30 22:59:33 | 5877 | Underway | NMT | No | Fine-tuning of a multilingual mBART many-to-many model on the training corpus.
7 | IIIT-H | INDIC21pa-en | 2021/05/03 18:15:36 | 6023 | Underway | NMT | No | MNMT system (XX-En) trained by exploiting lexical similarity on the PMI+CVIT parallel corpus, then improved with back-translation on PMI monolingual data, followed by fine-tuning.
8 | CFILT | INDIC21pa-en | 2021/05/04 01:17:27 | 6059 | Underway | NMT | No | Multilingual (many-to-one, XX-En) NMT model based on the Transformer with shared encoder and decoder.
9 | CFILT-IITB | INDIC21pa-en | 2021/05/04 01:56:11 | 6123 | Underway | NMT | No | Multilingual NMT (many-to-one): Transformer-based model with shared encoder-decoder and shared BPE vocabulary, trained on all Indic language data converted to the same script.
10 | CFILT-IITB | INDIC21pa-en | 2021/05/04 01:59:52 | 6129 | Underway | NMT | No | Multilingual NMT (many-to-one): Transformer-based model with shared encoder-decoder and shared BPE vocabulary, trained on all Indo-Aryan language data converted to the same script.
11 | coastal | INDIC21pa-en | 2021/05/04 05:43:28 | 6168 | Underway | NMT | No | mT5 trained only on PMI.
12 | SRPOL | INDIC21pa-en | 2021/05/04 15:28:09 | 6249 | Underway | NMT | No | Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI.
13 | SRPOL | INDIC21pa-en | 2021/05/04 16:33:03 | 6275 | Underway | NMT | No | Many-to-one on all data. Pretrained on BT, finetuned on PMI.
14 | IITP-MT | INDIC21pa-en | 2021/05/04 18:12:25 | 6301 | Underway | NMT | No | Many-to-one model trained on all training data with a base Transformer. All Indic-language data is romanized. Model fine-tuned on the back-translated PMI monolingual corpus.
15 | mcairt | INDIC21pa-en | 2021/05/04 19:28:11 | 6342 | Underway | NMT | No | Multilingual model (many-to-one) trained on all WAT 2021 data using a base Transformer.


HUMAN (WAT2020)


No submissions.

HUMAN (WAT2019)


No submissions.

HUMAN (WAT2018)


No submissions.

HUMAN (WAT2017)


No submissions.

HUMAN (WAT2016)


No submissions.

HUMAN (WAT2015)


No submissions.

HUMAN (WAT2014)


No submissions.

EVALUATION RESULTS USAGE POLICY

When you use the WAT evaluation results for any purpose, such as:
- writing technical papers,
- making presentations about your system, or
- advertising your MT system to customers,
you can use the information about translation directions, scores (both automatic and human evaluations), and the rank of your system among the others. You can also use the scores of the other systems, but you MUST anonymize the other systems' names. In addition, you can show links (URLs) to the WAT evaluation result pages.

NICT (National Institute of Information and Communications Technology)
Kyoto University
Last Modified: 2018-08-02