Many Asian countries are growing rapidly, and communicating and exchanging information with these countries has become increasingly important. Machine translation technology is essential to meeting this demand for communication.
Machine translation technology has evolved rapidly in recent years and is seeing practical use, especially between European languages. However, translation quality for Asian languages remains lower than for European languages, and machine translation technology for these languages has not yet reached widespread adoption. This is due not only to the lack of language resources for Asian languages but also to the lack of techniques for correctly transferring the meaning of sentences from and to Asian languages. Consequently, a venue for gathering and sharing resources and knowledge about Asian language translation is needed to advance machine translation research for Asian languages.
The Workshop on Machine Translation (WMT), the world's largest machine translation workshop, mainly targets European languages and does not include Asian languages. The International Workshop on Spoken Language Translation (IWSLT) has spoken language translation tasks for some Asian languages using TED talk data, but there is no task for written language.
The Workshop on Asian Translation (WAT) is an open machine translation evaluation campaign focusing on Asian languages. WAT gathers and shares the resources and knowledge of Asian language translation to understand the problems to be solved for the practical use of machine translation technologies among all Asian countries. WAT is unique in that it is an "open innovation platform": the test data is fixed and open, so participants can repeat evaluations on the same data and confirm changes in translation accuracy over time. WAT has no deadline for the automatic translation quality evaluation (continuous evaluation), so participants can submit translation results at any time.
In addition to the shared tasks, the workshop will also feature scientific papers on topics related to machine translation, especially for Asian languages.
Topics of interest include, but are not limited to:
|SunFlare Co., Ltd.|
|Shared Task Submission Deadline|
|Research Paper Submission Deadline|
|System Description Paper for Shared Tasks Submission Deadline|
|Notification of Acceptance for Research Papers|
|Review Feedback of System Description Papers|
|Camera-ready Deadline (both Research and System Description Papers)|
|Workshop Dates||August 6, 2021|
* All deadlines are calculated at 11:59PM UTC-12
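For participants in other time zones, the note above can be turned into a concrete UTC instant. The following is a minimal sketch using only the Python standard library; the function name `deadline_in_utc` and the example date are hypothetical and are not actual WAT deadlines.

```python
from datetime import datetime, timedelta, timezone

# UTC-12 ("Anywhere on Earth"), the offset named in the note above.
AOE = timezone(timedelta(hours=-12), name="AoE")


def deadline_in_utc(year: int, month: int, day: int) -> datetime:
    """Return the UTC instant at which a deadline on the given date closes.

    The deadline is taken to be 11:59 PM in UTC-12 on that date.
    """
    local_deadline = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return local_deadline.astimezone(timezone.utc)


# Hypothetical date, for illustration only; it is not an actual WAT deadline.
example = deadline_in_utc(2021, 1, 31)
print(example.isoformat())  # 2021-02-01T11:59:00+00:00
```

In other words, a deadline dated on a given day closes at 11:59 UTC on the following day.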
|7:05-7:45||Invited Talk I (Chair: Dabre) 40 mins.|
|7:05-7:45||Massively Multilingual Translation and Evaluation|
Francisco Guzmán and Angela Fan
|7:50-9:30||Shared Task I - Restricted / ALT+UCSY / NICT-SAP (Chair: Eriguchi, Ding, Dabre) 20 mins. + 10 mins. * 8|
|7:50-8:10||Task Descriptions and Results: Restricted (Eriguchi) / ALT+UCSY (Ding) / NICT-SAP (Dabre)|
|8:10-8:20||NHK’s Soft Lexically Constrained Neural Machine Translation at WAT 2021|
Hideya Mino, Kazutaka Kinugawa, Hitoshi Ito, Isao Goto, Ichiro Yamada and Takenobu Tokunaga
|8:20-8:30||Input Augmentation Improves Constrained Beam Search for Neural Machine Translation: NTT at WAT 2021|
Katsuki Chousa and Makoto Morishita
|8:30-8:40||NICT's Neural Machine Translation Systems for the WAT21 Restricted Translation Task|
Zuchao Li, Masao Utiyama, Eiichiro Sumita and Hai Zhao
|8:40-8:50||Machine Translation with the Pre-specified Target-side Words Using a Semi-autoregressive Model|
Seiichiro Kondo, Aomi Koyama, Tomoshige Kiyuna, Tosho Hirasawa and Mamoru Komachi
|8:50-9:00||NECTEC’s Participation in WAT-2021|
Zar Zar Hlaing, Ye Kyaw Thu, Thazin Myint Oo, Mya Ei San, Sasiporn Usanavasin, Ponrudee Netisopakul and Thepchai Supnithi
|9:00-9:10||Hybrid Statistical Machine Translation for English-Myanmar: UTYCC-Team1 Submission to WAT-2021|
Ye Kyaw Thu, Thazin Myint Oo, Hlaing Myat Nwe, Khaing Zar Mon, Nang Aeindray Kyaw, Naing Linn Phyo, Nann Hwan Khun and Hnin Aye Thant
|9:10-9:20||NICT-2 Translation System at WAT-2021: Applying a Pretrained Multilingual Encoder-Decoder Model to Low-resource Language Pairs|
Kenji Imamura and Eiichiro Sumita
|9:20-9:30||Rakuten's Participation in WAT 2021: Examining the Effectiveness of Pre-trained Models for Multilingual and Multimodal Machine Translation|
Raymond Hendy Susanto, Dongzhe Wang, Sunil Yadav, Mausam Jain and Ohnmar Htun
|9:35-10:15||Invited Talk II (Chair: Goto) 40 mins.|
|9:35-10:15||Understanding and Improving Context Usage in Context-aware Translation|
|10:20-11:40||Research Paper / Findings I (Chair: Mino) 20 mins. * 4|
|10:20-10:40||BTS: Back TranScription for Speech-to-Text Post-Processor using Text-to-Speech-to-Text|
Chanjun Park, Jaehyung Seo, Seolhwa Lee, Chanhee Lee, Hyeonseok Moon, Sugyeong Eo and Heuiseok Lim
|10:40-11:00||Zero-pronoun Data Augmentation for Japanese-to-English Translation|
Ryokan Ri, Toshiaki Nakazawa and Yoshimasa Tsuruoka
|11:00-11:20||Evaluation Scheme of Focal Translation for Japanese Partially Amended Statutes|
Takahiro Yamakoshi, Takahiro Komamizu, Yasuhiro Ogawa and Katsuhiko Toyama
|11:20-11:40||Joint Optimization of Tokenization and Downstream Model|
Tatsuya Hiraoka, Sho Takase, Kei Uchiumi, Atsushi Keyaki and Naoaki Okazaki
|13:00-14:40||Shared Task II - JPC / Hindi&Malayalam Multimodal / Japanese Multimodal (Chair: Higashiyama, Parida, Nakayama, Chu) 20 mins. + 10 mins. * 8|
|13:00-13:20||Task Descriptions and Results: JPC (Higashiyama) / Hindi&Malayalam Multimodal (Parida) / Japanese Multimodal (Nakayama, Chu)|
|13:20-13:30||TMU NMT System with Japanese BART for the Patent task of WAT 2021|
Hwichan Kim and Mamoru Komachi
|13:30-13:40||System Description for Transperfect|
Wiktor Stribiżew, Fred Bane, José Conceição and Anna Zaretskaya
|13:40-13:50||Bering Lab’s Submissions on WAT 2021 Shared Task|
Heesoo Park and Dongjun Lee
|13:50-14:00||NLPHut’s Participation at WAT2021|
Shantipriya Parida, Subhadarshi Panda, Ketan Kotwal, Amulya Dash, Satya Ranjan Dash, Yashvardhan Sharma, Petr Motlicek and Ondřej Bojar
|14:00-14:10||Improved English to Hindi Multimodal Neural Machine Translation|
Sahinur Rahman Laskar, Abdullah Faiz Ur Rahman Khilji, Darsh Kaushik, Partha Pakray and Sivaji Bandyopadhyay
|14:10-14:20||IITP at WAT 2021: System description for English-Hindi Multimodal Translation Task|
Baban Gain, Dibyanayan Bandyopadhyay and Asif Ekbal
|14:20-14:30||ViTA: Visual-Linguistic Translation by Aligning Object Tags|
Kshitij Gupta, Devansh Gautam and Radhika Mamidi
|14:30-14:40||TMEKU System for the WAT2021 Multimodal Translation Task|
Yuting Zhao, Mamoru Komachi, Tomoyuki Kajiwara and Chenhui Chu
|14:45-15:45||Research Paper / Findings II (Chair: Parida) 20 mins. * 3|
|14:45-15:05||Optimal Word Segmentation for Neural Machine Translation into Dravidian Languages|
Prajit Dhar, Arianna Bisazza and Gertjan van Noord
|15:05-15:25||Itihasa: A large-scale corpus for Sanskrit to English translation|
Rahul Aralikatte, Miryam de Lhoneux, Anoop Kunchukuttan and Anders Søgaard
|15:25-15:45||IndoCollex: A Testbed for Morphological Transformation of Indonesian Word Colloquialism|
Haryo Akbarianto Wibowo, Made Nindyatama Nityasya, Afra Feyza Akyürek, Suci Fitriany, Alham Fikri Aji, Radityo Eko Prasojo and Derry Tanti Wijaya
|15:50-17:20||Shared Task III - MultiIndicMT (Chair: Dabre) 20 mins. + 10 mins. * 8|
|15:50-16:00||Task Descriptions and Results: MultiIndicMT (Parida)|
|16:00-16:10||NICT-5's Submission To WAT 2021: MBART Pre-training And In-Domain Fine Tuning For Indic Languages|
Raj Dabre and Abhisek Chakrabarty
|16:10-16:20||How far can we get with one GPU in 100 hours? CoAStaL at MultiIndicMT Shared Task|
Rahul Aralikatte, Héctor Ricardo Murrieta Bello, Daniel Hershcovich, Marcel Bollmann and Anders Søgaard
|16:20-16:30||IIIT Hyderabad Submission To WAT 2021: Efficient Multilingual NMT systems for Indian languages|
Sourav Kumar, Salil Aggarwal and Dipti Sharma
|16:30-16:40||Language Relatedness and Lexical Closeness can help Improve Multilingual NMT: IITBombay@MultiIndicNMT WAT2021|
Jyotsana Khatri, Nikhil Saini and Pushpak Bhattacharyya
|16:40-16:50||Samsung R&D Institute Poland submission to WAT 2021 Indic Language Multilingual Task|
Adam Dobrowolski, Marcin Szymański, Marcin Chochowski and Paweł Przybysz
|16:50-17:00||Multilingual Machine Translation Systems at WAT 2021: One-to-Many and Many-to-One Transformer based NMT|
Shivam Mhaskar, Aditya Jain, Aakash Banerjee and Pushpak Bhattacharyya
|17:00-17:10||IITP-MT at WAT2021: Indic-English Multilingual Neural Machine Translation using Romanized Vocabulary|
Ramakrishna Appicharla, Kamal Kumar Gupta, Asif Ekbal and Pushpak Bhattacharyya
|17:10-17:20||ANVITA Machine Translation System for WAT 2021 MultiIndicMT Shared Task|
Pavanpankaj Vegi, Sivabhavani J, Biswajit Paul, Chitra Viswanathan and Prasanna Kumar K R
Shared task participation policy:
Research paper and system description paper submission policy:
Baseline systems site is now open.
We will evaluate the translation performance of the submitted results through automatic evaluation and human evaluation.
We will prepare an automatic evaluation server. You will be able to evaluate the translation results at any time using this server.
Human evaluation will be carried out using two methods: Pairwise Crowdsourcing Evaluation and JPO Adequacy Evaluation.
Submission site is now open.
(A user name and password are required for access.)
Evaluation results site is now open.
Participants who submit results for human evaluation are required to submit description papers of their translation systems and evaluation results. All submissions and feedback are handled electronically as below.
The application site for task participants of WAT2021 is now open.
For questions, comments, etc., please email "wat-organizer -at- googlegroups -dot- com".
Japan Patent Office
The Association for Natural Language Processing
National Institute of Information and Communications Technology (NICT)
Asia-Pacific Association for Machine Translation (AAMT)
2020-12-21: site opened
NICT (National Institute of Information and Communications Technology)
Last Modified: 2020-12-21