WAT 2015
The 2nd Workshop on Asian Translation
10:30-17:00, October 16, 2015
The second conference room, 4th floor,
Campus Plaza Kyoto, Japan
The WAT 2016 site is now open.
The PHOTOS page is now open (2015/10/21)
The PAPERS page was updated (2015/10/21)
INTRODUCTION
The Workshop on Asian Translation (WAT) is a new open evaluation campaign focusing on Asian languages.
We would like to invite a broad range of participants and conduct various forms of machine translation experiments and evaluation.
Collecting and sharing our knowledge will allow us to understand the essence of machine translation and the problems to be solved.
We are working toward the practical use of machine translation among all Asian countries.
For the 2nd WAT, we chose scientific papers and patents as the targeted domain, and selected the languages Japanese, Chinese, Korean and English.
What makes WAT unique:
- Open innovation platform
The test data is fixed and open, so you can repeat evaluations on the same data and confirm changes in translation accuracy over time.
WAT has no deadline for the automatic translation quality evaluation (continuous evaluation), so you can submit translation results at any time.
- Domain and language pairs
WAT is the world's first workshop that uses scientific papers as a domain and Japanese-Chinese as a language pair.
In the future, we will add more Asian languages, such as Korean, Vietnamese, Indonesian, Thai, Myanmar and so on.
- Context-aware evaluation
The test data of WAT is prepared using the paragraph as a unit, while almost all other evaluation campaigns use the sentence as a unit.
We would like to consider how to realize context-aware evaluation in WAT.
- Evaluation method
Evaluation will be done by both automatic and human evaluation.
For human evaluation, WAT will use crowdsourcing, which is low cost and allows multiple evaluations.
SPONSOR
SunFlare
TIMETABLE
10:30 - 10:35 | Welcome: Hideya Mino |
10:35 - 11:05 | Invited talk Ⅰ: Eiichiro Sumita |
11:05 - 11:10 | Break |
11:10 - 11:30 | Overview of WAT2015: Isao Goto |
11:30 - 12:30 | Oral Presentation Ⅰ (3 systems: TOSHIBA, NAVER and Sense) |
| 11:30 - 11:50 Toshiba MT System Description for the WAT2015 Workshop: Satoshi Sonoh and Satoshi Kinoshita |
| 11:50 - 12:10 NAVER Machine Translation System for WAT 2015: Hyoung-Gyu Lee, Jaesong Lee, Jun-Seok Kim and Chang-Ki Lee |
| 12:10 - 12:30 An Awkward Disparity between BLEU / RIBES Scores and Human Judgements in Machine Translation: Liling Tan, Jon Dehdari and Josef van Genabith |
12:30 - 13:30 | Lunch |
13:30 - 14:15 | Invited talk Ⅱ: Haizhou Li |
14:15 - 14:20 | Break |
14:20 - 15:00 | Oral Presentation Ⅱ (2 systems: NAIST and Kyoto-U) |
| 14:20 - 14:40 Neural Reranking Improves Subjective Quality of Machine Translation: NAIST at WAT2015: Graham Neubig, Makoto Morishita and Satoshi Nakamura |
| 14:40 - 15:00 KyotoEBMT System Description for the 2nd Workshop on Asian Translation: John Richardson, Raj Dabre, Chenhui Chu, Fabien Cromières, Toshiaki Nakazawa and Sadao Kurohashi |
15:00 - 15:15 | Poster Booster Session |
15:15 - 16:45 | Poster Presentation (all systems) |
16:45 - 16:55 | Closing: Toshiaki Nakazawa |
16:55 - 17:00 | Commemorative photo |
notice: Lunch, the Poster Presentation, Closing, and the Commemorative photo will be held
in the hall on the 2nd floor.
INVITED TALK
Dr. Eiichiro Sumita,
Associate Director General of Universal Communication Research Institute and
Director of Multilingual Translation Laboratory (NICT)
[Short bio.]
Title: Government project for multi-lingual speech translation system to
bridge the language barrier in Japan
[slide]
(Click here for more information.)
Time: 10:35 - 11:05
Dr. Haizhou Li,
Research Director of the Institute for Infocomm Research in Singapore, Principal Scientist
and Department Head of Human Language Technology
[Short bio.]
Title: Adequacy-Fluency Metrics: Evaluating MT in the Continuous Space Model Framework
[slide]
(Click here for more information.)
Time: 13:30 - 14:15
IMPORTANT DATES
Submission due for pairwise crowdsourcing evaluation | August 31, 2015 (originally August 16, 2015) |
System description draft paper due | September 27, 2015 (originally September 20, 2015) |
Review feedback | October 4, 2015 (originally September 27, 2015) |
System description camera-ready paper due | October 11, 2015 (originally October 4, 2015) |
WAT 2015 | October 16, 2015 (whole day) |
All deadlines are 11:59 PM Pacific Time (UTC-8).
notice: Task participants should submit the translation results for pairwise crowdsourcing evaluation.
TASK
The task is to improve the text translation quality for scientific papers
and patents.
Participants choose any of the subtasks in which they would like to participate
and translate the test data using their machine translation systems.
The WAT organizers will evaluate the submitted results using both automatic evaluation
and human evaluation.
We also provide a baseline machine translation system using Moses.
Subtasks:
- Scientific papers Subtask:
- English -> Japanese
- Japanese -> English
- Chinese -> Japanese
- Japanese -> Chinese
- Patents Subtask:
- Chinese -> Japanese
- Korean -> Japanese
Dataset:
- Scientific papers Subtask:
WAT uses ASPEC
for the dataset including training, development, development test
and test data.
Participants in the scientific papers subtask must obtain a copy of ASPEC
themselves.
ASPEC consists of approximately 3 million Japanese-English parallel sentences
from paper abstracts (ASPEC-JE) and approximately 0.7 million
Japanese-Chinese parallel sentences from paper excerpts (ASPEC-JC).
- Patents Subtask:
WAT uses the JPO Patent Corpus,
which was constructed by the Japan Patent Office (JPO).
This corpus consists of 1 million Chinese-Japanese parallel sentences
and 1 million Korean-Japanese parallel sentences from patent descriptions
in four categories.
Participants in the patents subtask are required to obtain it from the WAT2015 page
of the JPO Patent Corpus site.
Baseline system:
The baseline systems site is now open.
EVALUATION
We will evaluate the translation performance of the submitted results through
both automatic evaluation and human evaluation.
Automatic evaluation:
We will prepare an automatic evaluation server.
You will be able to evaluate the translation results at any time using this server.
- Metric:
BLEU and RIBES
- Format:
The submission format and the submission method are given at the submission site below.
- Notice:
When submitting, participants have to agree that the submitted results
are attributed to JST and NICT.
The results will be used and distributed for research by JST and NICT.
The automatic evaluation server for the patents subtask will be available on May 29.
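Before uploading to the evaluation server, participants may want a rough local sanity check of their output. The following is a minimal, self-contained BLEU sketch (single reference, geometric mean of 1- to 4-gram precisions, brevity penalty, no smoothing); the official scorer's tokenization and options may differ, and the example sentences are made up for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(hypothesis, reference, max_n=4):
    """Illustrative single-reference BLEU; the official scorer
    additionally handles multiple references and corpus-level counts."""
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp = Counter(ngrams(hypothesis, n))
        ref = Counter(ngrams(reference, n))
        clipped = sum(min(c, ref[g]) for g, c in hyp.items())
        total = max(sum(hyp.values()), 1)
        if clipped == 0:          # no matching n-grams: score is 0 unsmoothed
            return 0.0
        log_prec += math.log(clipped / total)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return bp * math.exp(log_prec / max_n)

hyp = "the patent describes a novel method".split()
ref = "the patent describes a new method".split()
print(round(bleu(hyp, ref), 3))  # → 0.537
```

RIBES, the other metric used by WAT, additionally rewards correct word order via rank correlation and is not sketched here.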
Human evaluation:
Human evaluation will be carried out using two methods:
Pairwise Crowdsourcing Evaluation and JPO Adequacy Evaluation.
- Pairwise Crowdsourcing Evaluation:
Pairwise Crowdsourcing Evaluation will be carried out using crowdsourcing.
Organizers will sample 400 sentences from the test data for the pairwise
crowdsourcing evaluation.
Participants can submit translation results for the human evaluation
at most twice before the pairwise crowdsourcing evaluation deadline.
- Metric:
Sentence-by-sentence pairwise evaluation against the baseline system.
The crowdsourcing workers will be asked to judge which of the two translations
is better in terms of adequacy and fluency.
To guarantee the quality of the evaluation, each sentence is evaluated by 5
different workers, and the final decision is made by majority vote of their judgements.
- Format:
The submission format is the same as that of automatic evaluation.
Participants can select their translation results from the ones submitted
for automatic evaluation.
- Ranking:
All systems will be ranked by the percentage of translations judged
to improve upon the baseline system.
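As a concrete illustration of the 5-worker voting and win-percentage ranking described above, the sketch below aggregates judgements per sentence and computes the share of wins over the baseline. The 'win'/'loss'/'tie' labels and the tie-breaking behaviour are assumptions for illustration, not the workshop's exact procedure.

```python
from collections import Counter

def aggregate_votes(judgements):
    """Decide one sentence's outcome from its 5 workers' judgements
    by majority vote (Counter's most_common order breaks exact ties)."""
    return Counter(judgements).most_common(1)[0][0]

def win_percentage(all_judgements):
    """Percentage of sentences judged to improve upon the baseline."""
    decisions = [aggregate_votes(j) for j in all_judgements]
    return 100.0 * sum(d == 'win' for d in decisions) / len(decisions)

# Example: 3 evaluated sentences, 5 judgements each.
sample = [
    ['win', 'win', 'loss', 'win', 'tie'],     # majority: win
    ['loss', 'loss', 'win', 'loss', 'loss'],  # majority: loss
    ['win', 'tie', 'win', 'win', 'loss'],     # majority: win
]
print(round(win_percentage(sample), 1))  # → 66.7
```

Systems would then be ranked by this percentage within each subtask language pair.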
- JPO Adequacy Evaluation:
We will also evaluate using the Content Transmission Level Evaluation criteria defined by the JPO
(pages 5 to 8 of the PDF file (Japanese page)).
We will sample 200 sentences from the pairwise crowdsourcing evaluation data
for the JPO adequacy evaluation.
The JPO adequacy evaluation will be conducted only for the translation results
of the three top-scoring teams in the Pairwise Crowdsourcing Evaluation for each subtask language pair.
Submission:
The submission site is now open.
(A user name and password are required for access.)
Evaluation results:
The evaluation results site is now open.
PAPER SUBMISSION INFORMATION
Participants who submit results for human evaluation should submit description
papers of their translation systems and evaluation results.
All submissions and feedback are handled electronically as below.
- Format:
PDF file format, two (2) columns, no smaller than nine (9) point type font
throughout the paper.
- Language:
Papers should be written in English.
- Encouragement:
We strongly prefer that papers include a section entitled
"Issues for Context-aware Machine Translation" which discusses
the importance and usefulness of context.
- Maximum length:
The maximum length of a submitted paper is eight (8) pages.
- Template:
Participants must use the ACL 2015 LaTeX style files or Microsoft Word style
files that are available on the
ACL 2015 conference web site.
- Submission site:
The paper submission site is open to only task participants.
REGISTRATION
- Registration for task participants:
The registration site
for task participants of WAT2015 is now open.
The registration fee is free for all participants.
- Registration for WAT2015 audience members:
There is no need to register in advance.
Attendance is free for everyone, including participants.
ORGANIZERS
- Toshiaki Nakazawa (Japan Science and Technology Agency (JST), Japan)
- Hideya Mino (National Institute of Information and Communications Technology (NICT), Japan)
- Isao Goto (Japan Broadcasting Corporation (NHK), Japan)
- Graham Neubig (Nara Institute of Science and Technology (NAIST), Japan)
- Eiichiro Sumita (National Institute of Information and Communications Technology (NICT), Japan)
- Sadao Kurohashi (Kyoto University, Japan)
For questions, comments, etc. please email to "wat -at- nlp.ist.i.kyoto-u.ac.jp".
LINKS
- Japan Patent Office
- JPO Patent Corpus
- The Association for Natural Language Processing (Japanese page)
- Asian Scientific Paper Excerpt Corpus (ASPEC)
- Japan Science and Technology Agency (JST)
- National Institute of Information and Communications Technology (NICT)
PREVIOUS WORKSHOPS
- WAT 2014
The 1st Workshop on Asian Translation (WAT 2014) was held in October 2014 in Tokyo, Japan.
CHANGE LOG
2015-10-21: PHOTOS open
2015-10-15: PAPERS open
2015-10-08: TIMETABLE was updated
2015-09-09: TIMETABLE
2015-08-24: SPONSOR
2015-08-10: Evaluation was updated
2015-08-10: IMPORTANT DATES was updated
2015-04-28: IMPORTANT DATES
2015-04-16: Patents Subtask (Korean-Japanese)
2015-03-24: INVITED TALK
2015-02-26: Patents Subtask (Chinese-Japanese)
2015-02-26: REGISTRATION open
2015-02-26: site open
Last Modified: 2015-10-21