WAT 2015
The 2nd Workshop on Asian Translation
10:30-17:00, October 16, 2015
The second conference room, 4th floor,
Campus Plaza Kyoto, Japan
The WAT 2016 site is now open.
The PHOTOS page is now open (2015/10/21)
The PAPERS page was updated (2015/10/21)
INTRODUCTION
The Workshop on Asian Translation (WAT) is a new open evaluation campaign focusing on Asian languages.
We would like to invite a broad range of participants and conduct various forms of machine translation experiments and evaluation.
Collecting and sharing our knowledge will allow us to understand the essence of machine translation and the problems to be solved.
We are working toward the practical use of machine translation among all Asian countries.
For the 2nd WAT, we chose scientific papers and patents as the targeted domain, and selected the languages Japanese, Chinese, Korean and English.
What makes WAT unique:
- Open innovation platform
The test data is fixed and open, so you can repeat evaluations on the same data and confirm changes in translation accuracy over time.
WAT has no deadline for the automatic translation quality evaluation (continuous evaluation), so you can submit translation results at any time.
- Domain and language pairs
WAT is the world's first workshop that uses scientific papers as a domain and Japanese-Chinese as a language pair.
In the future, we will add more Asian languages, such as Korean, Vietnamese, Indonesian, Thai, Myanmar and so on.
- Context-aware evaluation
The test data of WAT is prepared using the paragraph as a unit, while almost all other evaluation campaigns use the sentence as a unit.
We would like to consider how to realize context-aware evaluation in WAT.
- Evaluation method
Evaluation will be done by both automatic and human evaluation.
For human evaluation, WAT will use crowdsourcing, which is low cost and allows multiple evaluations.
SPONSOR
SunFlare
TIMETABLE
10:30 - 10:35 | Welcome: Hideya Mino |
10:35 - 11:05 | Invited talk Ⅰ: Eiichiro Sumita |
11:05 - 11:10 | Break |
11:10 - 11:30 | Overview of WAT2015: Isao Goto |
11:30 - 12:30 | Oral Presentation Ⅰ (3 systems: TOSHIBA, NAVER and Sense) |
| 11:30 - 11:50 Toshiba MT System Description for the WAT2015 Workshop: Satoshi Sonoh and Satoshi Kinoshita |
| 11:50 - 12:10 NAVER Machine Translation System for WAT 2015: Hyoung-Gyu Lee, Jaesong Lee, Jun-Seok Kim and Chang-Ki Lee |
| 12:10 - 12:30 An Awkward Disparity between BLEU / RIBES Scores and Human Judgements in Machine Translation: Liling Tan, Jon Dehdari and Josef van Genabith |
12:30 - 13:30 | Lunch |
13:30 - 14:15 | Invited talk Ⅱ: Haizhou Li |
14:15 - 14:20 | Break |
14:20 - 15:00 | Oral Presentation Ⅱ (2 systems: NAIST and Kyoto-U) |
| 14:20 - 14:40 Neural Reranking Improves Subjective Quality of Machine Translation: NAIST at WAT2015: Graham Neubig, Makoto Morishita and Satoshi Nakamura |
| 14:40 - 15:00 KyotoEBMT System Description for the 2nd Workshop on Asian Translation: John Richardson, Raj Dabre, Chenhui Chu, Fabien Cromières, Toshiaki Nakazawa and Sadao Kurohashi |
15:00 - 15:15 | Poster Booster Session |
15:15 - 16:45 | Poster Presentation (all systems) |
16:45 - 16:55 | Closing: Toshiaki Nakazawa |
16:55 - 17:00 | Commemorative photo |
notice: Lunch, the Poster Presentation, Closing, and the Commemorative photo will be held
in the hall on the 2nd floor.
INVITED TALK
Dr. Eiichiro Sumita,
Associate Director General of Universal Communication Research Institute and
Director of Multilingual Translation Laboratory (NICT)
[Short bio.]
Title: Government project for multi-lingual speech translation system to
bridge the language barrier in Japan
[slide]
(Click here for more information.)
Time: 10:35 - 11:05
Dr. Haizhou Li,
Research Director of the Institute for Infocomm Research in Singapore, Principal Scientist
and Department Head of Human Language Technology
[Short bio.]
Title: Adequacy-Fluency Metrics: Evaluating MT in the Continuous Space Model Framework
[slide]
(Click here for more information.)
Time: 13:30 - 14:15
IMPORTANT DATES
Submission due for pairwise crowdsourcing evaluation | August 31, 2015 (originally August 16, 2015) |
System description draft paper due | September 27, 2015 (originally September 20, 2015) |
Review feedback | October 4, 2015 (originally September 27, 2015) |
System description camera-ready paper due | October 11, 2015 (originally October 4, 2015) |
WAT 2015 | October 16, 2015 (whole day) |
All deadlines are 11:59 PM Pacific Time (UTC-8).
notice: Task participants should submit the translation results for pairwise crowdsourcing evaluation.
TASK
The task is to improve the text translation quality for scientific papers
and patents.
Participants choose any of the subtasks in which they would like to participate
and translate the test data using their machine translation systems.
The WAT organizers will evaluate the submitted results using both automatic evaluation
and human evaluation.
We also provide a baseline machine translation system using Moses.
Subtasks:
- Scientific papers Subtask:
- English -> Japanese
- Japanese -> English
- Chinese -> Japanese
- Japanese -> Chinese
- Patents Subtask:
- Chinese -> Japanese
- Korean -> Japanese
Dataset:
- Scientific papers Subtask:
WAT uses ASPEC
for the dataset including training, development, development test
and test data.
Participants in the scientific papers subtask must obtain a copy of ASPEC
themselves.
ASPEC consists of approximately 3 million Japanese-English parallel sentences
from paper abstracts (ASPEC-JE) and approximately 0.7 million
Japanese-Chinese parallel sentences from paper excerpts (ASPEC-JC).
- Patents Subtask:
WAT uses the JPO Patent Corpus,
which was constructed by the Japan Patent Office (JPO).
This corpus consists of 1 million Chinese-Japanese parallel sentences
and 1 million Korean-Japanese parallel sentences from patent descriptions
in four categories.
Participants in the patents subtask are required to obtain it from the WAT2015 page
of the JPO Patent Corpus site.
Baseline system:
The baseline systems site is now open.
EVALUATION
We will evaluate the translation performance of the submitted results through
both automatic evaluation and human evaluation.
Automatic evaluation:
We will prepare an automatic evaluation server.
You will be able to evaluate the translation results at any time using this server.
- Metric:
BLEU and RIBES
- Format:
The submission format and the submission method are given at the submission site below.
- Notice:
When submitting, participants have to agree that the submitted results
are attributed to JST and NICT.
The results will be used and distributed for research by JST and NICT.
The automatic evaluation server for the patents subtask will be available on May 29.
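Before uploading to the evaluation server, participants may want a rough local sanity check of their output. The following is a minimal, self-contained BLEU sketch (single reference, geometric mean of 1- to 4-gram precisions, brevity penalty, no smoothing); the official scorer's tokenization and options may differ, and the example sentences are made up for illustration.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(hypothesis, reference, max_n=4):
    """Illustrative single-reference BLEU; the official scorer
    additionally handles multiple references and corpus-level counts."""
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp = Counter(ngrams(hypothesis, n))
        ref = Counter(ngrams(reference, n))
        clipped = sum(min(c, ref[g]) for g, c in hyp.items())
        total = max(sum(hyp.values()), 1)
        if clipped == 0:          # no matching n-grams: score is 0 unsmoothed
            return 0.0
        log_prec += math.log(clipped / total)
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return bp * math.exp(log_prec / max_n)

hyp = "the patent describes a novel method".split()
ref = "the patent describes a new method".split()
print(round(bleu(hyp, ref), 3))  # → 0.537
```

RIBES, the other metric used by WAT, additionally rewards correct word order via rank correlation and is not sketched here.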
Human evaluation:
Human evaluation will be carried out using two methods:
Pairwise Crowdsourcing Evaluation and JPO Adequacy Evaluation.
- Pairwise Crowdsourcing Evaluation:
Pairwise Crowdsourcing Evaluation will be carried out using crowdsourcing.
Organizers will sample 400 sentences from the test data for the pairwise
crowdsourcing evaluation.
Participants can submit translation results for the human evaluation
at most twice before the pairwise crowdsourcing evaluation deadline.
- Metric:
Sentence-by-sentence pairwise evaluation against the baseline system.
The crowdsourcing workers will be asked to judge which of the two translations
is better in terms of adequacy and fluency.
To guarantee the quality of the evaluation, each sentence is evaluated by 5
different workers, and the final decision is made by majority vote of their judgements.
- Format:
The submission format is the same as that of automatic evaluation.
Participants can select their translation results from the ones submitted
for automatic evaluation.
- Ranking:
All systems will be ranked by the percentage of translations judged
to improve upon the baseline system.
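As a concrete illustration of the 5-worker voting and win-percentage ranking described above, the sketch below aggregates judgements per sentence and computes the share of wins over the baseline. The 'win'/'loss'/'tie' labels and the tie-breaking behaviour are assumptions for illustration, not the workshop's exact procedure.

```python
from collections import Counter

def aggregate_votes(judgements):
    """Decide one sentence's outcome from its 5 workers' judgements
    by majority vote (Counter's most_common order breaks exact ties)."""
    return Counter(judgements).most_common(1)[0][0]

def win_percentage(all_judgements):
    """Percentage of sentences judged to improve upon the baseline."""
    decisions = [aggregate_votes(j) for j in all_judgements]
    return 100.0 * sum(d == 'win' for d in decisions) / len(decisions)

# Example: 3 evaluated sentences, 5 judgements each.
sample = [
    ['win', 'win', 'loss', 'win', 'tie'],     # majority: win
    ['loss', 'loss', 'win', 'loss', 'loss'],  # majority: loss
    ['win', 'tie', 'win', 'win', 'loss'],     # majority: win
]
print(round(win_percentage(sample), 1))  # → 66.7
```

Systems would then be ranked by this percentage within each subtask language pair.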
- JPO Adequacy Evaluation:
We will also evaluate using the Content Transmission Level Evaluation criteria defined by the JPO
(pages 5 to 8 of the PDF file (Japanese page)).
We will sample 200 sentences from the pairwise crowdsourcing evaluation data
for the JPO adequacy evaluation.
The JPO adequacy evaluation will be conducted only for the translation results
of the three top-scoring teams in the Pairwise Crowdsourcing Evaluation for each subtask language pair.
Submission:
The submission site is now open.
(A user name and password are required for access.)
Evaluation results:
The evaluation results site is now open.
PAPER SUBMISSION INFORMATION
Participants who submit results for human evaluation should submit description
papers of their translation systems and evaluation results.
All submissions and feedback are handled electronically as below.
- Format:
PDF file format, two (2) columns, no smaller than nine (9) point type font
throughout the paper.
- Language:
Papers should be written in English.
- Encouragement:
We strongly prefer that papers include a section entitled
"Issues for Context-aware Machine Translation" which discusses
the importance and usefulness of context.
- Maximum length:
The maximum length of a submitted paper is eight (8) pages.
- Template:
Participants must use the ACL 2015 LaTeX style files or Microsoft Word style
files that are available on the
ACL 2015 conference web site.
- Submission site:
The paper submission site is open to only task participants.
REGISTRATION
- Registration for task participants:
The registration site
for task participants of WAT2015 is now open.
The registration fee is free for all participants.
- Registration for WAT2015 audience members:
There is no need to register in advance.
Attendance is free for everyone, including participants.
ORGANIZERS
- Toshiaki Nakazawa (Japan Science and Technology Agency (JST), Japan)
- Hideya Mino (National Institute of Information and Communications Technology (NICT), Japan)
- Isao Goto (Japan Broadcasting Corporation (NHK), Japan)
- Graham Neubig (Nara Institute of Science and Technology (NAIST), Japan)
- Eiichiro Sumita (National Institute of Information and Communications Technology (NICT), Japan)
- Sadao Kurohashi (Kyoto University, Japan)
For questions, comments, etc. please email to "wat -at- nlp.ist.i.kyoto-u.ac.jp".
LINKS
- Japan Patent Office
- JPO Patent Corpus
- The Association for Natural Language Processing (Japanese page)
- Asian Scientific Paper Excerpt Corpus (ASPEC)
- Japan Science and Technology Agency (JST)
- National Institute of Information and Communications Technology (NICT)
PREVIOUS WORKSHOPS
- WAT 2014
The 1st Workshop on Asian Translation (WAT 2014) was held in October 2014 in Tokyo, Japan.
CHANGE LOG
2015-10-21: PHOTOS open
2015-10-15: PAPERS open
2015-10-08: TIMETABLE was updated
2015-09-09: TIMETABLE
2015-08-24: SPONSOR
2015-08-10: Evaluation was updated
2015-08-10: IMPORTANT DATES was updated
2015-04-28: IMPORTANT DATES
2015-04-16: Patents Subtask (Korean-Japanese)
2015-03-24: INVITED TALK
2015-02-26: Patents Subtask (Chinese-Japanese)
2015-02-26: REGISTRATION open
2015-02-26: site open
Last Modified: 2015-10-21