WAT 2022

The 9th Workshop on Asian Translation

October 17, 2022
(Hosted by COLING 2022)

Note: Remote presenters, please check the WAT2022 page on Underline.

INTRODUCTION

Many Asian countries are growing rapidly, and communicating and exchanging information with them has become increasingly important. To satisfy the demand for communication among these countries, machine translation technology is essential.

Machine translation technology has evolved rapidly in recent years and is now in practical use, especially between European languages. However, translation quality for Asian languages is still lower than for European languages, and machine translation for these languages has not yet become widespread. This is due not only to the lack of language resources for Asian languages but also to the lack of techniques for correctly transferring the meaning of sentences from and to Asian languages. Consequently, a venue for gathering and sharing resources and knowledge about Asian language translation is necessary to advance machine translation research for Asian languages.

The Workshop on Machine Translation (WMT), the world's largest machine translation workshop, mainly targets European languages and does not include Asian languages. The International Workshop on Spoken Language Translation (IWSLT) has spoken language translation tasks for some Asian languages using TED talk data, but there is no task for written language.

The Workshop on Asian Translation (WAT) is an open machine translation evaluation campaign focusing on Asian languages. WAT gathers and shares the resources and knowledge of Asian language translation to understand the problems to be solved for the practical use of machine translation technologies among all Asian countries. WAT is unique in that it is an "open innovation platform": the test data is fixed and open, so participants can repeat evaluations on the same data and confirm changes in translation accuracy over time. WAT has no deadline for the automatic translation quality evaluation (continuous evaluation), so participants can submit translation results at any time.

In addition to the shared tasks, the workshop will also feature scientific papers on topics related to machine translation, especially for Asian languages. Topics of interest include, but are not limited to:

analysis of the automatic/human evaluation results in the past WAT workshops
word-/phrase-/syntax-/semantics-/rule-based, neural and hybrid machine translation
Asian language processing
incorporating linguistic information into machine translation
decoding algorithms
system combination
error analysis
manual and automatic machine translation evaluation
machine translation applications
quality estimation
domain adaptation
machine translation for low resource languages
language resources

IMPORTANT DATES

Shared Task Submission Deadline	July 18, 2022 (extended from July 11, 2022)
Research Paper Submission Deadline	July 11, 2022
System Description Paper for Shared Tasks Submission Deadline	August 1, 2022
Notification of Acceptance for Research Papers	August 22, 2022
Review Feedback of System Description Papers	August 29, 2022
Camera-ready Deadline (both Research and System Description Papers)	September 5, 2022
Workshop Dates	October 17, 2022

* All deadlines are calculated at 11:59PM UTC-12
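
As an illustration only, the following Python sketch converts a deadline stated at 11:59PM UTC-12 into UTC and into the UTC+9 time zone used for the timetable. The helper name `deadline_in` is hypothetical and is not part of any WAT tooling.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset time zones; the names AOE and JST are labels chosen for this sketch.
AOE = timezone(timedelta(hours=-12))  # UTC-12, the zone in which deadlines are calculated
JST = timezone(timedelta(hours=9))    # UTC+9, the zone used by the timetable


def deadline_in(year, month, day, target_tz):
    """Return the 11:59PM UTC-12 deadline on the given date, expressed in target_tz."""
    deadline = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return deadline.astimezone(target_tz)


# Shared task submission deadline (July 18, 2022) in UTC and in UTC+9.
shared_task_utc = deadline_in(2022, 7, 18, timezone.utc)
shared_task_jst = deadline_in(2022, 7, 18, JST)
print(shared_task_utc.isoformat())
print(shared_task_jst.isoformat())
```

Under these assumptions, a deadline of July 18 at 11:59PM UTC-12 falls on July 19 in both UTC and UTC+9.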

SunFlare Co., Ltd.

INVITED TALKS

Prof. Duygu Ataman
Assistant Professor of Computer Science, New York University
Title: Machine translation of Turkic languages: Current approaches and Open challenges

TIMETABLE

Time zone: UTC+9

9:00-9:05	Welcome (Nakazawa)
9:05-9:50	Invited Talk (Chair: ???) 45 mins.
9:05-9:50	Machine translation of Turkic languages: Current approaches and Open challenges	Duygu Ataman
9:50-10:30	Research Paper I (Chair: ???) 20 mins. * 2
9:50-10:10	Comparing BERT-based Reward Functions for Deep Reinforcement Learning in Machine Translation	Yuki Nakatani, Tomoyuki Kajiwara and Takashi Ninomiya
10:10-10:30	Improving Jejueo-Korean Translation With Cross-Lingual Pretraining Using Japanese and Korean	Francis Zheng, Edison Marrese-Taylor and Yutaka Matsuo
10:30-11:00	Break
11:00-12:30	Shared Task I - Restricted / Software / SWSTR (Chair: Abe, Dabre) 20 mins. * 3 + 10 mins. * 3
11:00-11:10	Task Descriptions and Results: Restricted (Abe)
11:10-11:30	TMU NMT System with Automatic Post-Editing by Multi-Source Levenshtein Transformer for the Restricted Translation Task of WAT 2022	Seiichiro Kondo and Mamoru Komachi
11:30-11:40	Task Descriptions and Results: Software (Dabre)
11:40-12:00	HwTscSU's Submissions on WAT 2022 Shared Task	Yilun Liu, Zhen Zhang, Shimin Tao, Junhui Li and Hao Yang
12:00-12:10	Task Descriptions and Results: SWSTR (Dabre)
12:10-12:30	NICT's Submission to the WAT 2022 Structured Document Translation Task	Raj Dabre
12:30-14:00	Lunch Break
14:00-15:30	Shared Task II - Parallel Corpus Filtering / Indic (Chair: Morishita, Parida) 20 mins. * 3 + 10 mins. * 2
14:00-14:10	Task Descriptions and Results: Parallel Corpus Filtering (Morishita)
14:10-14:30	Rakuten's Participation in WAT 2022: Parallel Dataset Filtering by Leveraging Vocabulary Heterogeneity	Alberto Poncelas, Johanes Effendi, Ohnmar Htun, Sunil Kumar Yadav, Dongzhe Wang and Saurabh Jain
14:30-14:40	Task Descriptions and Results: Indic (Parida)
14:40-15:00	NIT Rourkela Machine Translation(MT) System Submission to WAT 2022 for MultiIndicMT: An Indic Language Multilingual Shared Task	Sudhansu Bala Das, Atharv Biradar, Tapas Kumar Mishra and Bidyut Kumar Patra
15:00-15:20	Investigation of Multilingual Neural Machine Translation for Indian Languages	Sahinur Rahman Laskar, Riyanka Manna, Partha Pakray and Sivaji Bandyopadhyay
15:20-16:00	Break
16:00-16:40	Research Paper II (Chair: ???) 20 mins. * 2
16:00-16:20	Does partial pretranslation can improve low ressourced-languages pairs?	Raoul Blin
16:20-16:40	Multimodal Neural Machine Translation with Search Engine Based Image Retrieval	ZhenHao TZH Tang, XiaoBing Zhang, Zi Long and XiangHua Fu
16:40-17:50	Shared Task III - Hindi & Malayalam Multimodal (Chair: Parida) 20 mins. * 3 + 10 mins.
16:40-16:50	Task Descriptions and Results: Multimodal (Parida)
16:50-17:10	Silo NLP's Participation at WAT2022	Shantipriya Parida, Subhadarshi Panda, Stig-Arne Grönroos, Mark Granroth-Wilding and Mika Koistinen
17:10-17:30	PICT@WAT 2022: Neural Machine Translation Systems for Indic Languages	Anupam Atul Patil, Isha Joshi and Dipali Kadam
17:30-17:50	English to Bengali Multimodal Neural Machine Translation using Transliteration-based Phrase Pairs Augmentation	Sahinur Rahman Laskar, Pankaj Dadure, Riyanka Manna, Partha Pakray and Sivaji Bandyopadhyay
	Investigation of English to Hindi Multimodal Neural Machine Translation using Transliteration-based Phrase Pairs Augmentation	Sahinur Rahman Laskar, Rahul Singh, Md Faizal Karim, Riyanka Manna, Partha Pakray and Sivaji Bandyopadhyay
17:50-17:55	Closing (Nakazawa)

POLICY

Shared task participation policy:

There is no limit on the number of teams from one laboratory that may apply for WAT2022.
Each laboratory (not each team) may make at most two submissions for human evaluation in each sub-task.
If a team takes part in human evaluation, it is mandatory to submit its system description paper and to attend the workshop.
A team can submit its system description paper even if it takes part only in automatic evaluation.
If a team takes part in human evaluation but does not submit its system description paper, its evaluation results, both automatic and human, will not be presented in the overview paper provided by the organizers.

Research paper and system description paper submission policy:

All submissions to WAT2022 must follow the policy for COLING 2022 long paper submissions, with the following exceptions.
- Single-blind review (author information need not be anonymized) for the system description papers.
- Author information cannot be changed after the initial submission.
- Multiple Submission Policy: submitted papers must not be submitted elsewhere during the COLING 2022 review period. This policy covers all refereed and archival conferences and workshops (e.g., IJCAI, SIGIR, NAACL). We make one exception to the above: papers can be dual-submitted to both COLING 2022 and a COLING 2022 workshop whose submission deadline falls before our notification date of August 22, 2022. In addition, we will not consider any paper that overlaps significantly in content or results with papers that will be (or have been) published elsewhere. Authors submitting more than one paper to COLING 2022 must ensure that submissions do not overlap significantly (>25%) with each other in content or results.
System description paper submission is only for the participants of shared tasks.

RESEARCH PAPER

WAT2022 invites researchers to submit their original work on machine translation for Asian languages. The scope covers studies and reports on theories, techniques, and resources to improve the machine translation of Asian languages. All submitted research papers will undergo double-blind peer review to decide whether they will appear at the workshop. Topics of interest include, but are not limited to:

Word-/phrase-/syntax-/semantics-/rule-based, neural, and hybrid machine translation
Asian language processing
Incorporating linguistic information into machine translation
Decoding algorithms
System combination
Error analysis
Manual and automatic machine translation evaluation
Machine translation applications
Quality estimation
Domain adaptation
Machine translation for low resource languages
Language resources

The format of research papers is identical to the format of system description papers; please refer to the PAPER SUBMISSION INFORMATION section.

SHARED TASK

Tasks:

Document-level translation tasks:
- ASPEC+ParaNatCom: English --> Japanese Scientific Paper
- BSD Corpus: English <--> Japanese Business Scene Dialogue
- JIJI Corpus: English <--> Japanese Newswire
- NICT-SAP:
  - Unstructured document translation: Hindi/Thai/Indonesian/Malay/Vietnamese <--> English
  - Structured document translation: Japanese/Chinese/Korean <--> English
Multimodal translation tasks:
- Hindi Visual Genome: English --> Hindi
- Malayalam Visual Genome: English --> Malayalam
- Bengali Visual Genome: English --> Bengali
- Ambiguous MS COCO: English <--> Japanese
- Video Guided Ambiguous Subtitling: Japanese --> English
- Khmer Speech Translation: Khmer --> English/French
Indic tasks:
- MultiIndicMT: Assamese/Bengali/Gujarati/Hindi/Kannada/Malayalam/Marathi/Nepali/Odia/Punjabi/Sindhi/Sinhala/Tamil/Telugu/Urdu <--> English
Patent task:
- JPC3: English/Chinese/Korean <--> Japanese
Restricted Translation task:
- Check this page.
Parallel Corpus Filtering task:
- Check this page.

Baseline system:

The baseline systems site is now open.

TRANSLATION TASK EVALUATION

We will evaluate the submitted translation results through both automatic evaluation and human evaluation.

Automatic evaluation:
We will prepare an automatic evaluation server. You will be able to evaluate the translation results at any time using this server.

Metric:
BLEU, RIBES, and AM-FM*
* AM-FM is a two-dimensional automatic evaluation metric for machine translation, designed to operate at the sentence level. It is based on adequacy and fluency, decoupling the semantic and syntactic components of the translation process to provide a balanced view of translation quality.
Format:
The submission format and the submission method are given at the submission site below.
Notice:
When submitting, participants have to agree that the submitted results are attributed to JST and NICT. The results will be used and distributed for research by JST and NICT.
Thanks to the technical collaborators: Luis Fernando D'Haro, Rafael E. Banchs and Haizhou Li.
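
For readers unfamiliar with BLEU, one of the metrics listed above, the following Python sketch shows the general idea: clipped n-gram precision combined with a brevity penalty. It is a simplified illustration with a single reference per sentence and whitespace tokenization, and it is not the evaluation script used by the WAT server, which applies its own tokenization and settings. The function names `ngram_counts` and `corpus_bleu` are hypothetical.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus-level BLEU with one reference per hypothesis.

    hypotheses and references are parallel lists of token lists.
    Returns a score between 0.0 and 1.0 (no smoothing is applied).
    """
    matches = [0] * max_n  # clipped n-gram matches per order
    totals = [0] * max_n   # hypothesis n-grams per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, max_n + 1):
            hyp_counts = ngram_counts(hyp, n)
            ref_counts = ngram_counts(ref, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            totals[n - 1] += sum(hyp_counts.values())
    if min(matches) == 0:
        return 0.0  # without smoothing, any zero precision gives a zero score
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)


hyp = ["a b c d e".split()]
ref = ["a b c d f".split()]
print(round(corpus_bleu(hyp, ref), 4))
```

Scores from a sketch like this are not comparable with the figures reported by the evaluation server; it is meant only to convey what the metric measures.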

Human evaluation:
Human evaluation will be carried out using two methods: Pairwise Crowdsourcing Evaluation and JPO Adequacy Evaluation.

Submission:
The submission site is now open.
(A user name and password are required for access.)

Evaluation results:
The evaluation results site is now open.

PAPER SUBMISSION INFORMATION

Participants who submit results for human evaluation are required to submit description papers of their translation systems and evaluation results. All submissions and feedback are handled electronically as below.

Format and Template:
Participants must use the same format as COLING 2022 long papers and follow the same formatting instructions. We do not distinguish between long and short papers.
We ask you to use the COLING 2022 LaTeX style files or Microsoft Word template that are available on the COLING 2022 conference web site.
Submission site:
https://www.softconf.com/coling2022/WAT2022/

APPLICATION for Shared Tasks

The application site for task participants of WAT2022 is now open.

ORGANIZERS

Toshiaki Nakazawa, The University of Tokyo, Japan [GENERAL, ASPEC+ParaNatCom, BSD]
Isao Goto, Japan Broadcasting Corporation (NHK), Japan [GENERAL, JIJI]
Hideya Mino, Japan Broadcasting Corporation (NHK), Japan [GENERAL, JIJI]
Chenchen Ding, National Institute of Information and Communications Technology (NICT), Japan [GENERAL]
Raj Dabre, National Institute of Information and Communications Technology (NICT), Japan [MultiIndicMT, NICT-SAP]
Anoop Kunchukuttan, Microsoft AI and Research, India [MultiIndicMT]
Shohei Higashiyama, National Institute of Information and Communications Technology (NICT), Japan [JPC]
Hiroshi Manabe, National Institute of Information and Communications Technology (NICT), Japan [GENERAL]
Shantipriya Parida, Silo AI, Finland [Hindi Visual Genome, Malayalam Visual Genome, Bengali Visual Genome]
Ondřej Bojar, Charles University, Prague, Czech Republic [Hindi Visual Genome, Malayalam Visual Genome, Bengali Visual Genome]
Chenhui Chu, Kyoto University, Japan [Ambiguous MS COCO]
Akiko Eriguchi, Microsoft, USA [Restricted Translation]
Kaori Abe, Tohoku University, Japan [Restricted Translation]
Yusuke Oda, LegalForce, Japan [Restricted Translation, Parallel Corpus Filtering]
Makoto Morishita, NTT, Japan [Parallel Corpus Filtering]
Katsuhito Sudoh, Nara Institute of Science and Technology (NAIST), Japan [GENERAL]
Sadao Kurohashi, Kyoto University, Japan [GENERAL]
Pushpak Bhattacharyya, Indian Institute of Technology Patna (IITP), India [GENERAL]

PROGRAM COMMITTEE MEMBERS

Chenhui Chu, Kyoto University, Japan
Sangjee Dondrub, Qinghai Normal University, China
Chao-Hong Liu, ADAPT Centre, Dublin City University, Ireland
Valentin Malykh, Huawei Noah’s Ark Lab / Kazan Federal University, Russian Federation
Takashi Ninomiya, Ehime University, Japan
Katsuhito Sudoh, Nara Institute of Science and Technology (NAIST), Japan
Masao Utiyama, NICT, Japan
Xinyi Wang, Carnegie Mellon University, United States
Jiajun Zhang, Chinese Academy of Sciences, China

TECHNICAL COLLABORATOR

Luis Fernando D'Haro, Universidad Politécnica de Madrid, Spain
Rafael E. Banchs, Nanyang Technological University, Singapore
Haizhou Li, National University of Singapore, Singapore
Chen Zhang, National University of Singapore, Singapore

CONTACT

For questions, comments, etc., please email "wat-organizer -at- googlegroups -dot- com".

Japan Patent Office
The Association for Natural Language Processing
National Institute of Information and Communications Technology (NICT)
Asia-Pacific Association for Machine Translation (AAMT)

PREVIOUS WORKSHOPS

WAT 2021
The 8th Workshop on Asian Translation (WAT 2021) was held on August 6, 2021 in Bangkok, Thailand.
WAT 2020
The 7th Workshop on Asian Translation (WAT 2020) was held in December 2020 in Suzhou, China (Online).
WAT 2019
The 6th Workshop on Asian Translation (WAT 2019) was held in December 2019 in Hong Kong, China.
WAT 2018
The 5th Workshop on Asian Translation (WAT 2018) was held in December 2018 in Hong Kong, China.
WAT 2017
The 4th Workshop on Asian Translation (WAT 2017) was held in November 2017 in Taipei, China.
WAT 2016
The 3rd Workshop on Asian Translation (WAT 2016) was held in December 2016 in Osaka, Japan.
WAT 2015
The 2nd Workshop on Asian Translation (WAT 2015) was held in October 2015 in Kyoto, Japan.
WAT 2014
The 1st Workshop on Asian Translation (WAT 2014) was held in October 2014 in Tokyo, Japan.

CHANGE LOG

2022-03-29: site opened

NICT (National Institute of Information and Communications Technology)
Kyoto University
Last Modified: 2022-03-29
