
Recipe Corpus


INTRODUCTION

Recipe Corpus was constructed by Cookpad Inc. Each recipe consists of a title, ingredients, steps, a description, and a history. A translation pair for a step, a description, or a history is not always a single pair of parallel sentences; either side may contain more than one sentence. All texts in the training set can be used for training, but only the titles, ingredients, and steps in the test set will be used for evaluation.

DETAIL

Recipe Corpus includes:

The numbers of texts are as follows:

Data Type  File Name     Text Type    Number of Texts
TRAIN      train.json    Title                 14,779
                         Ingredient           127,244
                         Step                 108,993
DEV        dev.json      Title                    500
                         Ingredient             4,274
                         Step                   3,303
DEVTEST    devtest.json  Title                    500
                         Ingredient             4,188
                         Step                   3,086
TEST       test.json     Title                    500
                         Ingredient             3,935
                         Step                   2,804
                         Description              499
                         History                  456
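
The counts in the table above could be checked after obtaining the files with a short script like the one below. This is only a sketch: the JSON layout is not documented on this page, so the key names ("title", "ingredients", "steps", "description", "history") and the assumption that each file holds a list of recipe objects are hypothetical and may need to be adapted to the actual files.

```python
import json
from collections import Counter


def count_texts(recipes):
    """Tally texts per text type for a list of recipe dicts.

    ASSUMPTION: each recipe is a dict with the hypothetical keys
    "title", "ingredients", "steps", "description", and "history".
    The real corpus files may be structured differently.
    """
    counts = Counter()
    for recipe in recipes:
        if recipe.get("title"):
            counts["Title"] += 1
        # Ingredients and steps are assumed to be lists, one text per item.
        counts["Ingredient"] += len(recipe.get("ingredients", []))
        counts["Step"] += len(recipe.get("steps", []))
        if recipe.get("description"):
            counts["Description"] += 1
        if recipe.get("history"):
            counts["History"] += 1
    return counts


def count_texts_in_file(path):
    """Load one corpus file (assumed to be a JSON list) and tally its texts."""
    with open(path, encoding="utf-8") as f:
        return count_texts(json.load(f))


# Tiny made-up sample in the assumed layout, for illustration only.
sample = [
    {"title": "a", "ingredients": ["x", "y"], "steps": ["s1"]},
    {"title": "b", "ingredients": ["z"], "steps": ["s1", "s2"],
     "description": "d"},
]
sample_counts = count_texts(sample)
for text_type in ("Title", "Ingredient", "Step", "Description", "History"):
    print(text_type, sample_counts[text_type])
```

Calling `count_texts_in_file("train.json")` on the real file would then give totals to compare against the TRAIN rows of the table, provided the assumed layout matches.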

HOW TO OBTAIN

Please send an e-mail to "recipe-corpus -at- cookpad.com".


CONTACT

For questions, comments, etc., please email "wat -at- nlp.ist.i.kyoto-u.ac.jp".


CHANGE LOG

2017-06-16: Site opened


JST (Japan Science and Technology Agency)
NICT (National Institute of Information and Communications Technology)
Kyoto University
Last Modified: 2017-08-21