Traditional ASPEC translation tasks are sentence-level, and their translation quality seems to be saturated. We think it's high time to move on to document-level evaluation. For the first year, we use ParaNatCom (a parallel English-Japanese abstract corpus made from Nature Communications articles) for the development and test sets of the Document-level Scientific Paper Translation sub-task. We cannot provide a document-level training corpus, but you can use ASPEC and any other extra resources.
This year we have only the English-to-Japanese translation direction. We have split ParaNatCom into development and test sets. The file list of the development set is here and that of the test set is here. Participants in this task need to translate all the lines in the test set files under the abstracts directory of ParaNatCom (which contains the English sentences) and submit the translations.
There are reference (Japanese) translations under the abstracts-ja-1 and abstracts-ja-2 directories. You can use those of the development set for tuning your system, but you should not look at those of the test set.
Note that each file is composed of 3 lines:
Each test file must be translated in the same format as the original file, which means that each translated file must contain exactly 3 lines:
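The 3-line requirement can be checked before submission. The following is a minimal sketch, not an official validation tool: it assumes each translated file is saved under the same file name as its English source, and the names `count_lines`, `find_format_errors`, and the output directory are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List

# Each file must contain exactly 3 lines, per the task description.
EXPECTED_LINES = 3


def count_lines(path: Path) -> int:
    """Count the lines in a UTF-8 text file (empty lines are counted too)."""
    with path.open(encoding="utf-8") as f:
        return sum(1 for _ in f)


def find_format_errors(file_names: Iterable[str], output_dir: Path) -> List[str]:
    """Return one message per missing or wrongly sized translated file.

    `file_names` is the list of test set file names; `output_dir` is the
    (hypothetical) directory holding your translations, assumed to reuse
    the source file names.
    """
    errors = []
    for name in file_names:
        translated = output_dir / name
        if not translated.is_file():
            errors.append(f"{name}: missing translation")
            continue
        n = count_lines(translated)
        if n != EXPECTED_LINES:
            errors.append(f"{name}: expected {EXPECTED_LINES} lines, found {n}")
    return errors
```

Calling `find_format_errors(test_file_names, Path("my-translations"))` returns an empty list when every listed file exists and has exactly 3 lines. A common cause of a mismatch is a system that splits or merges sentences within a line, so it may be worth running such a check on the development set first.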
Please note that you need to apply for WAT 2021 before submitting your translation results.
Sampled files of the test set will be human-evaluated. The evaluation criteria will be announced later.
For general questions, comments, etc., please email "wat-organizer -at- googlegroups -dot- com". For questions related to this task, contact "nakazawa -at- logos -dot- t -dot- u-tokyo -dot- ac -dot- jp".
Last Modified: 2020-12-21