The Recipe Corpus was constructed by Cookpad Inc. Each recipe consists of a title, ingredients, steps, a description, and a history. A translation pair for a step, a description, or a history does not always consist of a single parallel sentence. Although all texts in the training set can be used for training, only the titles, ingredients, and steps in the test set will be used for evaluation.
The Recipe Corpus includes:
Data Type | File Name | Text Type | Number of Texts |
---|---|---|---|
TRAIN | train.json | Title | 14,779 |
TRAIN | train.json | Ingredient | 127,244 |
TRAIN | train.json | Step | 108,993 |
DEV | dev.json | Title | 500 |
DEV | dev.json | Ingredient | 4,274 |
DEV | dev.json | Step | 3,303 |
DEVTEST | devtest.json | Title | 500 |
DEVTEST | devtest.json | Ingredient | 4,188 |
DEVTEST | devtest.json | Step | 3,086 |
TEST | test.json | Title | 500 |
TEST | test.json | Ingredient | 3,935 |
TEST | test.json | Step | 2,804 |
TEST | test.json | Description | 499 |
TEST | test.json | History | 456 |
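As a rough way to check a downloaded file against the counts above, the texts of each type can be tallied with a short script. This is only a sketch: the JSON schema is not described on this page, so the field names (`title`, `ingredients`, `steps`, `description`, `history`), the assumption that each file holds a list of recipe objects, and the helper names `count_texts` and `count_file` are all hypothetical and should be adapted to the actual files.

```python
import json
from collections import Counter

# Hypothetical field names: the real JSON schema is not documented here.
SINGLE_FIELDS = ("title", "description", "history")  # one text per recipe
LIST_FIELDS = ("ingredients", "steps")               # several texts per recipe


def count_texts(recipes):
    """Count non-empty texts per text type across a list of recipe dicts."""
    counts = Counter()
    for recipe in recipes:
        for field in SINGLE_FIELDS:
            if recipe.get(field):  # skip missing or empty values
                counts[field] += 1
        for field in LIST_FIELDS:
            counts[field] += len(recipe.get(field) or [])
    return counts


def count_file(path):
    """Load a JSON file assumed to hold a list of recipes and count its texts."""
    with open(path, encoding="utf-8") as f:
        return count_texts(json.load(f))


# Tiny in-memory sample in the assumed layout (not taken from the corpus).
sample = [
    {
        "title": "Miso soup",
        "ingredients": ["miso", "tofu"],
        "steps": ["Boil water.", "Add miso."],
        "description": "A simple soup.",
        "history": "",
    },
    {"title": "Rice", "ingredients": ["rice"], "steps": ["Cook rice."]},
]
print(dict(count_texts(sample)))
```

If the real files follow a similar layout, calling `count_file("test.json")` should give per-type totals that can be compared with the TEST rows of the table.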
Please send an e-mail to "recipe-corpus -at- cookpad.com".
For questions, comments, etc., please email "wat -at- nlp.ist.i.kyoto-u.ac.jp".
2017-6-16: site open
JST (Japan Science and Technology Agency)
NICT (National Institute of Information and Communications Technology)
Kyoto University
Last Modified: 2017-08-21