JIJI Corpus was constructed by Jiji Press Ltd. in collaboration with the National Institute of Information and Communications Technology (NICT). It is a Japanese-English news corpus of 200K parallel sentences. The data come from Jiji Press news in various categories, including politics, economy, nation, business, markets, and sports. The original news articles were distributed to many newspaper companies, TV stations, and portal sites. Jiji Press aims to introduce machine translation technologies into its daily editorial work in the future.
JIJI Corpus includes:
|Data Type||File Name||Number of sentences|
JIJI Press LTD.
5-15-8 Ginza, Chuo-ku,
Tokyo 104-8178, JAPAN
For questions, comments, etc., please email "wat -at- nlp.ist.i.kyoto-u.ac.jp".
2018-8-16: agreement forms were updated for WAT2018
2017-6-12: site open
NICT (National Institute of Information and Communications Technology)
Last Modified: 2017-06-12