WAT 2023

The 10th Workshop on Asian Translation


September 4, 2023
(Hosted by the MT Summit 2023)

INTRODUCTION

Many Asian countries are growing rapidly, and the importance of communicating and exchanging information with these countries has intensified. To satisfy the demand for communication among these countries, machine translation technology is essential.

Machine translation technology has evolved rapidly in recent years and is seeing practical use, especially between European languages. However, translation quality for Asian languages remains lower than for European languages, and machine translation technology for these languages has not yet reached widespread use. This is due not only to the lack of language resources for Asian languages but also to the lack of techniques for correctly transferring the meaning of sentences from and to Asian languages. Consequently, a venue for gathering and sharing resources and knowledge about Asian language translation is necessary to advance machine translation research for Asian languages.

The Workshop on Machine Translation (WMT), the world's largest machine translation workshop, mainly targets European languages and does not include Asian languages. The International Workshop on Spoken Language Translation (IWSLT) has spoken language translation tasks for some Asian languages using TED talk data, but there is no task for written language.

The Workshop on Asian Translation (WAT) is an open machine translation evaluation campaign focusing on Asian languages. WAT gathers and shares the resources and knowledge of Asian language translation to understand the problems to be solved for the practical use of machine translation technologies among all Asian countries. WAT is unique in that it is an "open innovation platform": the test data is fixed and open, so participants can repeat evaluations on the same data and confirm changes in translation accuracy over time. WAT has no deadline for the automatic translation quality evaluation (continuous evaluation), so participants can submit translation results at any time.

In addition to the shared tasks, the workshop will also feature scientific papers on topics related to machine translation, especially for Asian languages. Topics of interest include, but are not limited to:

IMPORTANT DATES

Shared Task Submission Deadline: July 14, 2023 (originally July 7, 2023)
Research Paper Submission Deadline: July 14, 2023 (originally July 7, 2023)
System Description Paper for Shared Tasks Submission Deadline: July 14, 2023
Notification of Acceptance for Research Papers: July 28, 2023
Review Feedback of System Description Papers: July 28, 2023
Camera-ready Deadline (both Research and System Description Papers): August 4, 2023
Workshop Date: September 4, 2023

* All deadlines are calculated at 11:59PM UTC-12

INVITED TALK

Mr. Santhosh Thottingal
Principal Software Engineer, Wikimedia Foundation

Title: Machine Translation at Wikipedia

Abstract:
Wikipedia, the multilingual encyclopedia available in over 320 languages, uses machine translation technology primarily for article translation. The translation process involves an integrated tool that utilizes various machine translation services to provide initial translations, which are then refined by editors before publication. To date, approximately 1.6 million articles have been translated. This presentation aims to introduce a human-in-the-loop product design, highlighting the provision of high-quality rich text translations through text-only machine translation, coupled with manual curation facilitated by human edits. Additionally, we will share insights and analytics pertaining to translation quality and translators. The discussion will encompass the machine translation engines employed, ranging from free and open-source systems to self-hosted services and external paid APIs. Wikipedia at present has machine translation capability to translate across 198 languages. Lastly, we will present the optimization techniques employed to scale machine translation models in order to meet the performance requirements of Wikipedia.

TIMETABLE

Time zone: UTC+8
14:00-14:05  Welcome (Toshiaki Nakazawa)

14:05-14:50  Invited Talk (Chair: Raj Dabre) 45 mins.
    14:05-14:50  Machine Translation at Wikipedia
                 Santhosh Thottingal

14:50-15:10  Research Paper (Chair: Isao Goto) 20 mins.
    14:50-15:10  Mitigating Domain Mismatch in Machine Translation via Paraphrasing
                 Hyuga Koretaka, Tomoyuki Kajiwara, Atsushi Fujita and Takashi Ninomiya

15:10-16:05  Shared Task - Hindi/Malayalam/Bengali Multimodal (Chair: Shantipriya Parida) 20 mins. * 2 + 15 mins.
    15:10-15:25  Task Descriptions and Results: multimodal (Shantipriya Parida)
    15:25-15:45  BITS-P at WAT 2023: Improving Indic Language Multimodal Translation by Image Augmentation using Diffusion Models
                 Amulya Dash, Hrithik Raj Gupta and Yashvardhan Sharma
    15:45-16:05  OdiaGenAI's Participation at WAT2023
                 SK Shahid, Guneet Singh Kohli, Sambit Sekhar, Debasish Dhal, Adit Sharma, Shubhendra Khusawash, Shantipriya Parida, Stig-Arne Grönroos and Satya Ranjan Dash

16:05-16:10  Closing (Toshiaki Nakazawa)

POLICY

Shared task participation policy:

Research paper and system description paper submission policy:

RESEARCH PAPER

WAT2023 invites researchers to submit their original work on machine translation for Asian languages. The scope covers studies and reports on theories, techniques, and resources to improve the machine translation of Asian languages. All submitted research papers will undergo double-blind peer review to decide whether they will appear at the workshop. Topics of interest include, but are not limited to:

The format of research papers is identical to the format of system description papers; please refer to the PAPER SUBMISSION INFORMATION section.

SHARED TASK

Tasks:

Baseline system:

The baseline systems site is now open.

TRANSLATION TASK EVALUATION

We will evaluate the submitted translation results using both automatic evaluation and human evaluation.

Automatic evaluation:
We will prepare an automatic evaluation server. You will be able to evaluate the translation results at any time using this server.

Human evaluation:
Human evaluation will be carried out using two methods: Pairwise Crowdsourcing Evaluation and JPO Adequacy Evaluation.

Submission:
The submission site is now open.
(A user name and password are required for access.)

Evaluation results:
The evaluation results site is now open.

PAPER SUBMISSION INFORMATION

Participants who submit results for human evaluation are required to submit papers describing their translation systems and evaluation results. All submissions and feedback are handled electronically as below.

APPLICATION for Shared Tasks

The application site for task participants of WAT2023 is now open.

ORGANIZERS

PROGRAM COMMITTEE MEMBERS

TECHNICAL COLLABORATOR

CONTACT

For questions, comments, etc., please email "wat-organizer -at- googlegroups -dot- com".

Japan Patent Office
The Association for Natural Language Processing
National Institute of Information and Communications Technology (NICT)
Asia-Pacific Association for Machine Translation (AAMT)

PREVIOUS WORKSHOPS

CHANGE LOG

2023-05-13: site opened
2023-06-07: task name using JIJI Corpus was changed

NICT (National Institute of Information and Communications Technology)
Kyoto University
Last Modified: 2023-05-13