WAT

The Workshop on Asian Translation

Evaluation Results

BLEU

#	Team	Task	Date/Time	DataID	BLEU										Method	Other Resources	System Description
#	Team	Task	Date/Time	DataID	juman	kytea	mecab	moses-tokenizer	stanford-segmenter-ctb	stanford-segmenter-pku	indic-tokenizer	unuse	myseg	kmseg	Method	Other Resources	System Description
1	SRPOL	INDIC21or-en	2021/05/04 15:27:42	6248	-	-	-	37.06	-	-	-	-	-	-	NMT	No	Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI
2	SRPOL	INDIC21or-en	2021/05/04 16:32:35	6274	-	-	-	36.04	-	-	-	-	-	-	NMT	No	Many-to-one on all data. Pretrained on BT, finetuned on PMI
3	IIIT-H	INDIC21or-en	2021/05/03 18:15:02	6022	-	-	-	34.11	-	-	-	-	-	-	NMT	No	MNMT system (XX-En) trained via exploiting lexical similarity on PMI+CVIT parallel corpus, then improved using back translation on PMI monolingual data followed by fine tuning.
4	NICT-5	INDIC21or-en	2021/06/25 11:49:54	6499	-	-	-	33.09	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on a mbart model trained for over 5 epochs. MNMT model.
5	sakura	INDIC21or-en	2021/05/04 13:19:36	6208	-	-	-	32.82	-	-	-	-	-	-	NMT	No	Pre-training multilingual mBART many2many model with training corpus followed by finetuning on PMI Parallel.
6	sakura	INDIC21or-en	2021/04/30 22:55:20	5876	-	-	-	32.67	-	-	-	-	-	-	NMT	No	Fine-tuning of multilingual mBART many2many model with training corpus.
7	NICT-5	INDIC21or-en	2021/06/21 12:04:38	6478	-	-	-	32.08	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on a mbart model trained for over 5 epochs.
8	IITP-MT	INDIC21or-en	2021/05/04 18:07:20	6294	-	-	-	31.19	-	-	-	-	-	-	NMT	No	Many-to-One model trained on all training data with base Transformer. All indic language data is romanized. Model fine-tuned on BT PMI monolingual corpus.
9	CFILT	INDIC21or-en	2021/05/04 01:16:29	6058	-	-	-	30.46	-	-	-	-	-	-	NMT	No	Multilingual(Many-to-One(XX-En)) NMT model based on Transformer with shared encoder and decoder.
10	SRPOL	INDIC21or-en	2021/04/21 19:33:01	5331	-	-	-	30.07	-	-	-	-	-	-	NMT	No	Base transformer on all WAT21 data
11	mcairt	INDIC21or-en	2021/05/04 19:21:48	6338	-	-	-	29.96	-	-	-	-	-	-	NMT	No	multilingual model(many to one) trained on all WAT 2021 data by using base transformer.
12	NICT-5	INDIC21or-en	2021/04/22 11:53:28	5361	-	-	-	27.93	-	-	-	-	-	-	NMT	No	MBART+MNMT. Beam 4.
13	CFILT-IITB	INDIC21or-en	2021/05/04 01:59:27	6128	-	-	-	26.34	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all Indo-Aryan languages data converted to same script
14	NICT-5	INDIC21or-en	2021/04/21 15:44:47	5286	-	-	-	25.81	-	-	-	-	-	-	NMT	No	Pretrain MBART on IndicCorp and FT on bilingual PMI data. Beam search. Model is bilingual.
15	CFILT-IITB	INDIC21or-en	2021/05/04 01:54:16	6119	-	-	-	25.05	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all indic language data converted to same script
16	coastal	INDIC21or-en	2021/05/04 01:46:05	6107	-	-	-	19.61	-	-	-	-	-	-	NMT	No	seq2seq model trained on all WAT2021 data
17	NLPHut	INDIC21or-en	2021/03/19 16:31:18	4597	-	-	-	18.92	-	-	-	-	-	-	NMT	No	Transformer trained using all or-en data and all hi-en data. Then fine-tuned using all or-en data.
18	ORGANIZER	INDIC21or-en	2021/04/08 17:24:42	4801	-	-	-	16.35	-	-	-	-	-	-	NMT	No	Bilingual baseline trained on PMI data. Transformer base. LR=10^-3
19	gaurvar	INDIC21or-en	2021/04/25 18:34:48	5550	-	-	-	13.71	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
20	gaurvar	INDIC21or-en	2021/04/25 19:01:23	5571	-	-	-	13.69	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
21	gaurvar	INDIC21or-en	2021/04/25 18:48:40	5561	-	-	-	13.05	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
22	gaurvar	INDIC21or-en	2021/04/25 18:18:26	5541	-	-	-	12.32	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages

Notice:

This table is sorted by the score for the leftmost segmenter. On the original page, clicking a segmenter link changes the segmenter used for sorting.
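
The ordering described in this notice can be reproduced offline from a tab-separated copy of the table. The following is a minimal sketch, not the site's own code: the three sample rows are taken from the BLEU table, while the reduced column set, the `SAMPLE` constant, and the `sort_by_segmenter` helper are assumptions made for illustration.

```python
import csv
import io

# Three rows copied from the BLEU table, reduced to the columns used here.
# The reduced header is an illustrative assumption, not an official schema.
SAMPLE = (
    "Team\tDataID\tmoses-tokenizer\n"
    "sakura\t6208\t32.82\n"
    "SRPOL\t6248\t37.06\n"
    "NICT-5\t6499\t33.09\n"
)


def sort_by_segmenter(tsv_text, segmenter):
    """Return rows sorted by the given segmenter's score, highest first.

    Rows whose score is "-" (no result for that segmenter) are placed last.
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

    def key(row):
        value = row[segmenter]
        missing = value == "-"
        score = 0.0 if missing else float(value)
        # False sorts before True, so rows with a score come first.
        return (missing, -score)

    return sorted(rows, key=key)


ranked = sort_by_segmenter(SAMPLE, "moses-tokenizer")
print([row["Team"] for row in ranked])
```

Passing a different column name as `segmenter` corresponds to choosing a different segmenter to sort by.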

RIBES

#	Team	Task	Date/Time	DataID	RIBES										Method	Other Resources	System Description
#	Team	Task	Date/Time	DataID	juman	kytea	mecab	moses-tokenizer	stanford-segmenter-ctb	stanford-segmenter-pku	indic-tokenizer	unuse	myseg	kmseg	Method	Other Resources	System Description
1	SRPOL	INDIC21or-en	2021/05/04 15:27:42	6248	-	-	-	0.816956	-	-	-	-	-	-	NMT	No	Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI
2	SRPOL	INDIC21or-en	2021/05/04 16:32:35	6274	-	-	-	0.812816	-	-	-	-	-	-	NMT	No	Many-to-one on all data. Pretrained on BT, finetuned on PMI
3	sakura	INDIC21or-en	2021/04/30 22:55:20	5876	-	-	-	0.801734	-	-	-	-	-	-	NMT	No	Fine-tuning of multilingual mBART many2many model with training corpus.
4	sakura	INDIC21or-en	2021/05/04 13:19:36	6208	-	-	-	0.800209	-	-	-	-	-	-	NMT	No	Pre-training multilingual mBART many2many model with training corpus followed by finetuning on PMI Parallel.
5	mcairt	INDIC21or-en	2021/05/04 19:21:48	6338	-	-	-	0.798326	-	-	-	-	-	-	NMT	No	multilingual model(many to one) trained on all WAT 2021 data by using base transformer.
6	IIIT-H	INDIC21or-en	2021/05/03 18:15:02	6022	-	-	-	0.795132	-	-	-	-	-	-	NMT	No	MNMT system (XX-En) trained via exploiting lexical similarity on PMI+CVIT parallel corpus, then improved using back translation on PMI monolingual data followed by fine tuning.
7	IITP-MT	INDIC21or-en	2021/05/04 18:07:20	6294	-	-	-	0.794791	-	-	-	-	-	-	NMT	No	Many-to-One model trained on all training data with base Transformer. All indic language data is romanized. Model fine-tuned on BT PMI monolingual corpus.
8	SRPOL	INDIC21or-en	2021/04/21 19:33:01	5331	-	-	-	0.789470	-	-	-	-	-	-	NMT	No	Base transformer on all WAT21 data
9	NICT-5	INDIC21or-en	2021/06/25 11:49:54	6499	-	-	-	0.788801	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on a mbart model trained for over 5 epochs. MNMT model.
10	NICT-5	INDIC21or-en	2021/06/21 12:04:38	6478	-	-	-	0.787803	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on a mbart model trained for over 5 epochs.
11	CFILT	INDIC21or-en	2021/05/04 01:16:29	6058	-	-	-	0.772850	-	-	-	-	-	-	NMT	No	Multilingual(Many-to-One(XX-En)) NMT model based on Transformer with shared encoder and decoder.
12	NICT-5	INDIC21or-en	2021/04/22 11:53:28	5361	-	-	-	0.769634	-	-	-	-	-	-	NMT	No	MBART+MNMT. Beam 4.
13	NICT-5	INDIC21or-en	2021/04/21 15:44:47	5286	-	-	-	0.762604	-	-	-	-	-	-	NMT	No	Pretrain MBART on IndicCorp and FT on bilingual PMI data. Beam search. Model is bilingual.
14	CFILT-IITB	INDIC21or-en	2021/05/04 01:59:27	6128	-	-	-	0.761082	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all Indo-Aryan languages data converted to same script
15	CFILT-IITB	INDIC21or-en	2021/05/04 01:54:16	6119	-	-	-	0.754313	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all indic language data converted to same script
16	coastal	INDIC21or-en	2021/05/04 01:46:05	6107	-	-	-	0.737380	-	-	-	-	-	-	NMT	No	seq2seq model trained on all WAT2021 data
17	NLPHut	INDIC21or-en	2021/03/19 16:31:18	4597	-	-	-	0.720916	-	-	-	-	-	-	NMT	No	Transformer trained using all or-en data and all hi-en data. Then fine-tuned using all or-en data.
18	ORGANIZER	INDIC21or-en	2021/04/08 17:24:42	4801	-	-	-	0.679781	-	-	-	-	-	-	NMT	No	Bilingual baseline trained on PMI data. Transformer base. LR=10^-3
19	gaurvar	INDIC21or-en	2021/04/25 18:48:40	5561	-	-	-	0.668638	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
20	gaurvar	INDIC21or-en	2021/04/25 19:01:23	5571	-	-	-	0.662493	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
21	gaurvar	INDIC21or-en	2021/04/25 18:34:48	5550	-	-	-	0.634313	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
22	gaurvar	INDIC21or-en	2021/04/25 18:18:26	5541	-	-	-	0.604738	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages

Notice:

This table is sorted by the score for the leftmost segmenter. On the original page, clicking a segmenter link changes the segmenter used for sorting.

AMFM

#	Team	Task	Date/Time	DataID	AMFM										Method	Other Resources	System Description
#	Team	Task	Date/Time	DataID	unuse	unuse	unuse	unuse	unuse	unuse	unuse	unuse	unuse	unuse	Method	Other Resources	System Description
1	SRPOL	INDIC21or-en	2021/05/04 15:27:42	6248	-	-	-	0.817318	-	-	-	-	-	-	NMT	No	Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI
2	SRPOL	INDIC21or-en	2021/05/04 16:32:35	6274	-	-	-	0.814871	-	-	-	-	-	-	NMT	No	Many-to-one on all data. Pretrained on BT, finetuned on PMI
3	sakura	INDIC21or-en	2021/04/30 22:55:20	5876	-	-	-	0.808239	-	-	-	-	-	-	NMT	No	Fine-tuning of multilingual mBART many2many model with training corpus.
4	sakura	INDIC21or-en	2021/05/04 13:19:36	6208	-	-	-	0.805953	-	-	-	-	-	-	NMT	No	Pre-training multilingual mBART many2many model with training corpus followed by finetuning on PMI Parallel.
5	IIIT-H	INDIC21or-en	2021/05/03 18:15:02	6022	-	-	-	0.804930	-	-	-	-	-	-	NMT	No	MNMT system (XX-En) trained via exploiting lexical similarity on PMI+CVIT parallel corpus, then improved using back translation on PMI monolingual data followed by fine tuning.
6	IITP-MT	INDIC21or-en	2021/05/04 18:07:20	6294	-	-	-	0.803226	-	-	-	-	-	-	NMT	No	Many-to-One model trained on all training data with base Transformer. All indic language data is romanized. Model fine-tuned on BT PMI monolingual corpus.
7	mcairt	INDIC21or-en	2021/05/04 19:21:48	6338	-	-	-	0.795586	-	-	-	-	-	-	NMT	No	multilingual model(many to one) trained on all WAT 2021 data by using base transformer.
8	SRPOL	INDIC21or-en	2021/04/21 19:33:01	5331	-	-	-	0.794017	-	-	-	-	-	-	NMT	No	Base transformer on all WAT21 data
9	CFILT	INDIC21or-en	2021/05/04 01:16:29	6058	-	-	-	0.793769	-	-	-	-	-	-	NMT	No	Multilingual(Many-to-One(XX-En)) NMT model based on Transformer with shared encoder and decoder.
10	NICT-5	INDIC21or-en	2021/04/22 11:53:28	5361	-	-	-	0.782917	-	-	-	-	-	-	NMT	No	MBART+MNMT. Beam 4.
11	NICT-5	INDIC21or-en	2021/04/21 15:44:47	5286	-	-	-	0.780431	-	-	-	-	-	-	NMT	No	Pretrain MBART on IndicCorp and FT on bilingual PMI data. Beam search. Model is bilingual.
12	CFILT-IITB	INDIC21or-en	2021/05/04 01:59:27	6128	-	-	-	0.780009	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all Indo-Aryan languages data converted to same script
13	CFILT-IITB	INDIC21or-en	2021/05/04 01:54:16	6119	-	-	-	0.770941	-	-	-	-	-	-	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all indic language data converted to same script
14	NLPHut	INDIC21or-en	2021/03/19 16:31:18	4597	-	-	-	0.740606	-	-	-	-	-	-	NMT	No	Transformer trained using all or-en data and all hi-en data. Then fine-tuned using all or-en data.
15	ORGANIZER	INDIC21or-en	2021/04/08 17:24:42	4801	-	-	-	0.730819	-	-	-	-	-	-	NMT	No	Bilingual baseline trained on PMI data. Transformer base. LR=10^-3
16	coastal	INDIC21or-en	2021/05/04 01:46:05	6107	-	-	-	0.727657	-	-	-	-	-	-	NMT	No	seq2seq model trained on all WAT2021 data
17	gaurvar	INDIC21or-en	2021/04/25 18:18:26	5541	-	-	-	0.726829	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
18	gaurvar	INDIC21or-en	2021/04/25 18:34:48	5550	-	-	-	0.725121	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
19	gaurvar	INDIC21or-en	2021/04/25 19:01:23	5571	-	-	-	0.721531	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
20	gaurvar	INDIC21or-en	2021/04/25 18:48:40	5561	-	-	-	0.718668	-	-	-	-	-	-	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
21	NICT-5	INDIC21or-en	2021/06/21 12:04:38	6478	-	-	-	0.000000	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on a mbart model trained for over 5 epochs.
22	NICT-5	INDIC21or-en	2021/06/25 11:49:54	6499	-	-	-	0.000000	-	-	-	-	-	-	NMT	No	Using PMI and PIB data for fine-tuning on a mbart model trained for over 5 epochs. MNMT model.

Notice:

This table is sorted by the score for the leftmost segmenter. On the original page, clicking a segmenter link changes the segmenter used for sorting.
Adequacy-Fluency Metrics (AMFM) is a two-dimensional automatic evaluation metric for machine translation, designed to operate at the sentence level. It scores adequacy and fluency separately, decoupling the semantic and syntactic components of the translation process to give a balanced view of translation quality.
AMFM is calculated without tokenizers.
AMFM is described in detail in the paper "Adequacy–Fluency Metrics: Evaluating MT in the Continuous Space Model Framework" [pdf]. The invited talk at WAT2015 is also a helpful introduction [slide].

HUMAN (WAT2022)

Notice:

HUMAN (WAT2022) is the result of the Pairwise Crowdsourcing Evaluation at WAT2022.
HUMAN (WAT2022) was evaluated by 5 different workers, and the final decision was made by a vote over their judgements.

HUMAN (WAT2021)

#	Team	Task	Date/Time	DataID	HUMAN	Method	Other Resources	System Description
1	NLPHut	INDIC21or-en	2021/03/19 16:31:18	4597	Underway	NMT	No	Transformer trained using all or-en data and all hi-en data. Then fine-tuned using all or-en data.
2	NICT-5	INDIC21or-en	2021/04/21 15:44:47	5286	Underway	NMT	No	Pretrain MBART on IndicCorp and FT on bilingual PMI data. Beam search. Model is bilingual.
3	NICT-5	INDIC21or-en	2021/04/22 11:53:28	5361	Underway	NMT	No	MBART+MNMT. Beam 4.
4	gaurvar	INDIC21or-en	2021/04/25 18:34:48	5550	Underway	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
5	gaurvar	INDIC21or-en	2021/04/25 19:01:23	5571	Underway	NMT	No	Multi Task Multi Lingual T5 trained for Multiple Indic Languages
6	sakura	INDIC21or-en	2021/04/30 22:55:20	5876	Underway	NMT	No	Fine-tuning of multilingual mBART many2many model with training corpus.
7	IIIT-H	INDIC21or-en	2021/05/03 18:15:02	6022	Underway	NMT	No	MNMT system (XX-En) trained via exploiting lexical similarity on PMI+CVIT parallel corpus, then improved using back translation on PMI monolingual data followed by fine tuning.
8	CFILT	INDIC21or-en	2021/05/04 01:16:29	6058	Underway	NMT	No	Multilingual(Many-to-One(XX-En)) NMT model based on Transformer with shared encoder and decoder.
9	coastal	INDIC21or-en	2021/05/04 01:46:05	6107	Underway	NMT	No	seq2seq model trained on all WAT2021 data
10	CFILT-IITB	INDIC21or-en	2021/05/04 01:54:16	6119	Underway	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all indic language data converted to same script
11	CFILT-IITB	INDIC21or-en	2021/05/04 01:59:27	6128	Underway	NMT	No	Multilingual NMT (Many to One): Transformer based model with shared encoder-decoder and shared BPE vocabulary trained using all Indo-Aryan languages data converted to same script
12	SRPOL	INDIC21or-en	2021/05/04 15:27:42	6248	Underway	NMT	No	Ensemble of many-to-one on all data. Pretrained on BT, finetuned on PMI
13	SRPOL	INDIC21or-en	2021/05/04 16:32:35	6274	Underway	NMT	No	Many-to-one on all data. Pretrained on BT, finetuned on PMI
14	IITP-MT	INDIC21or-en	2021/05/04 18:07:20	6294	Underway	NMT	No	Many-to-One model trained on all training data with base Transformer. All indic language data is romanized. Model fine-tuned on BT PMI monolingual corpus.
15	mcairt	INDIC21or-en	2021/05/04 19:21:48	6338	Underway	NMT	No	multilingual model(many to one) trained on all WAT 2021 data by using base transformer.

Notice:

HUMAN (WAT2021) is the result of the Pairwise Crowdsourcing Evaluation at WAT2021.
HUMAN (WAT2021) was evaluated by 5 different workers, and the final decision was made by a vote over their judgements.
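
The notice does not state the exact voting rule, so the following is only a sketch of one plausible reading: a plurality vote over five worker judgements for a single sentence pair. The labels `"win"`, `"tie"`, and `"loss"`, the `decide` helper, and the rule that an even split yields a tie are all assumptions, not the documented WAT procedure.

```python
from collections import Counter


def decide(judgements):
    """Return the label chosen by the most workers.

    If the two most frequent labels have the same count, return "tie".
    This tie rule is an assumption made for illustration.
    """
    if not judgements:
        raise ValueError("at least one judgement is required")
    counts = Counter(judgements).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "tie"
    return counts[0][0]


# Five hypothetical worker judgements for one sentence pair.
votes = ["win", "win", "tie", "loss", "win"]
print(decide(votes))  # prints "win"
```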

HUMAN (WAT2020)

Notice:

HUMAN (WAT2020) is the result of the Pairwise Crowdsourcing Evaluation at WAT2020.
HUMAN (WAT2020) was evaluated by 5 different workers, and the final decision was made by a vote over their judgements.

HUMAN (WAT2019)

Notice:

HUMAN (WAT2019) is the result of the Pairwise Crowdsourcing Evaluation at WAT2019.
HUMAN (WAT2019) was evaluated by 5 different workers, and the final decision was made by a vote over their judgements.

HUMAN (WAT2018)

Notice:

HUMAN (WAT2018) is the result of the Pairwise Crowdsourcing Evaluation at WAT2018.
HUMAN (WAT2018) was evaluated by 5 different workers, and the final decision was made by a vote over their judgements.

HUMAN (WAT2017)

Notice:

HUMAN (WAT2017) is the result of the Pairwise Crowdsourcing Evaluation at WAT2017.
HUMAN (WAT2017) was evaluated by 5 different workers, and the final decision was made by a vote over their judgements.

HUMAN (WAT2016)

Notice:

HUMAN (WAT2016) is the result of the Pairwise Crowdsourcing Evaluation at WAT2016.
HUMAN (WAT2016) was evaluated by 5 different workers, and the final decision was made by a vote over their judgements.

HUMAN (WAT2015)

Notice:

HUMAN (WAT2015) is the result of the Pairwise Crowdsourcing Evaluation at WAT2015.
HUMAN (WAT2015) was evaluated by 5 different workers, and the final decision was made by a vote over their judgements.
Details of the evaluation can be found in the PDF document.

HUMAN (WAT2014)

Notice:

HUMAN (WAT2014) is the result of the Pairwise Crowdsourcing Evaluation at WAT2014.
HUMAN (WAT2014) was evaluated by 3 different workers, and the final decision was made by a vote over their judgements.
Details of the evaluation can be found in the PDF document.

EVALUATION RESULTS USAGE POLICY

When you use the WAT evaluation results for any purpose such as:
- writing technical papers,
- making presentations about your system,
- advertising your MT system to the customers,
you can use the information about translation directions, scores (both automatic and human evaluations), and the ranks of your system among the others. You can also use the scores of the other systems, but you MUST anonymize the other systems' names. In addition, you can show links (URLs) to the WAT evaluation result pages.
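
One way to meet the anonymization requirement when reusing scores is to replace every team name other than your own with a neutral placeholder before publishing. The sketch below is an illustration under assumptions: the `anonymize_teams` helper, the "System N" placeholder format, and the choice of IIIT-H as the reporting team are hypothetical, and the scores are taken from the BLEU table on this page.

```python
def anonymize_teams(rows, own_team):
    """Replace every team name except own_team with a stable placeholder.

    rows is a list of (team, score) pairs. The same team always maps to the
    same placeholder, so multiple submissions stay grouped.
    """
    aliases = {}
    result = []
    for team, score in rows:
        if team == own_team:
            label = team
        else:
            if team not in aliases:
                aliases[team] = f"System {len(aliases) + 1}"
            label = aliases[team]
        result.append((label, score))
    return result


# Top four BLEU rows from this page, as (team, score) pairs.
bleu_rows = [("SRPOL", 37.06), ("SRPOL", 36.04), ("IIIT-H", 34.11), ("NICT-5", 33.09)]
for label, score in anonymize_teams(bleu_rows, own_team="IIIT-H"):
    print(label, score)
```

Scores and ranks are left unchanged; only the names of the other systems are replaced.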

NICT (National Institute of Information and Communications Technology)
Kyoto University
Last Modified: 2018-08-02
