WAT 2021

Myanmar-English Parallel Data

INTRODUCTION

The parallel data for the Myanmar-English translation tasks at WAT2021 consists of two corpora: the ALT corpus and the UCSY corpus.

DETAIL

The numbers of sentences are as follows:

Data Type   File Name             Number of Sentences
TRAIN       train.ucsy.[my|en]    204,539
TRAIN       train.alt.[my|en]     18,088
DEV         dev.alt.[my|en]       1,000
TEST        test.alt.[my|en]      1,018
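
After downloading, it can be useful to confirm that each file has the expected number of sentences. The following is a minimal sketch in Python, not an official tool. It assumes that each file holds one sentence per line in UTF-8, that "[my|en]" denotes a pair of files ending in ".my" and ".en", and that all files sit in a single directory. The function names count_lines and check_parallel are hypothetical.

```python
from pathlib import Path

# Sentence counts copied from the table above.
EXPECTED_COUNTS = {
    "train.ucsy": 204_539,
    "train.alt": 18_088,
    "dev.alt": 1_000,
    "test.alt": 1_018,
}


def count_lines(path):
    """Count lines in a UTF-8 text file (assumption: one sentence per line)."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def check_parallel(data_dir, expected=EXPECTED_COUNTS):
    """Return a list of problems found; the list is empty if every count matches."""
    problems = []
    for stem, n in expected.items():
        # Each entry is assumed to exist once per language side.
        for lang in ("my", "en"):
            path = Path(data_dir) / f"{stem}.{lang}"
            if not path.exists():
                problems.append(f"{path.name}: missing")
                continue
            found = count_lines(path)
            if found != n:
                problems.append(f"{path.name}: expected {n}, found {found}")
    return problems
```

Calling check_parallel with the directory that contains the extracted files returns an empty list when both language sides of every file match the counts listed above. A mismatch between the ".my" and ".en" sides of the same file would indicate broken sentence alignment.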

HOW TO OBTAIN

- The quality of the UCSY corpus used in WAT2021 has been improved by correcting translation, spelling, and typographic errors.

Myanmar-English Parallel Data for WAT2021

CONTACT

For questions or comments, please email "wat-organizer -at- googlegroups -dot- com".

CHANGE LOG

2020-07-17: site open


NICT (National Institute of Information and Communications Technology)
Kyoto University
Last Modified: 2021-02-12