The JIJI Corpus was constructed by Jiji Press Ltd. in collaboration with the National Institute of Information and Communications Technology (NICT). It is a Japanese-English news corpus of 200K parallel sentences. The data come from Jiji Press news articles in various categories, including politics, economy, nation, business, markets, and sports. The original news articles were distributed to many newspaper companies, TV stations, and portal sites. Jiji Press aims to introduce machine translation technologies into its daily editorial work in the future.
JIJI Corpus includes:
| Data Type | File Name | Number of sentences |
JIJI Press LTD.
5-15-8 Ginza, Chuo-ku,
Tokyo 104-8178, JAPAN
For questions, comments, etc., please email "wat -at- nlp.ist.i.kyoto-u.ac.jp".
2017-06-12: Site opened
JST (Japan Science and Technology Agency)
NICT (National Institute of Information and Communications Technology)
Last Modified: 2017-06-12