The NICT Japanese Learner English (JLE) Corpus



Overview


In 2004, the National Institute of Information and Communications Technology (NICT) created a learner corpus, “The NICT JLE Corpus”. The corpus data are transcripts of audio-recorded speech samples (1,281 samples, 1.2 million words, 300 hours in total) from an English oral proficiency interview test, the ACTFL-ALC SST (Standard Speaking Test).

Unique Features

  1. Proficiency Level

  2. Annotation

  3. Sub Corpus

Contents


What's New


Precautions


Sample


Each interview transcript is stored as one TXT file, so the corpus contains 1,281 files in total.

The following is a short quotation from a corpus file.
Each tag has a specific meaning; the meanings of all tags are given in the Tag List.

<head version="1.3">
<date>1999-12-16</date>
<sex>female</sex>
<age></age>
<country>Japan</country>
<overseas></overseas>
<category></category>
<step>1.5</step>
<TOEIC>765</TOEIC>
<TOEFL></TOEFL>
<other_tests></other_tests>
<SST_level>6</SST_level>
<SST_task2>restaurant</SST_task2>
<SST_task3>train_advanced</SST_task3>
<SST_task4>department store</SST_task4>
</head>

...

<stage2>
<task>
<A>I see. O K. Now, let me show you the first picture. Please describe this picture.</A>
<B>O K. <F>Er</F> <R>this is a</R> this is a <.></.> room in a hotel. And <.></.> <F>oh</F> sorry, it's not. Yeah, I think it's a restaurant. And there are three tables, <R>and</R> and there are three couples and <SC>two server</SC> two <R>waiter</R> waiter are serving. And <R>in the</R> in the middle of the restaurant, the couple is <F>er</F> drinking wine. And <F>err</F> the man is <.></.> testing the wine and saying something to the waiter. Maybe he is sommelier. And <R>he</R> he show the bottle to the man. I guess he is explaining something. And <F>er</F> the couple, <F>er</F> they dressed very nicely. <CO><R>And</R> <.></.> <F>mhmm</F> <R>and</R> <.></.> <R>and</R> <F>well</F> and</CO>. <.></.></B>
</task>
<followup>
<A>O K.</A>
<B>O K?</B>
<A>O K. Thank you very much. <F>Er</F> how do you spend time with your husband?</A>
<B><.></.> You mean, in our free time?</B>
<A><F>Mhmm</F>.</A>
<B><F>Er</F> like this? <.></.> <F>Well</F> <F>er</F> <R>I</R> I sometimes eating out with my husband. But we don't get dressed like this. <nvs>laughter</nvs> <..></..></B>
<A>Can you compare the restaurant you often go to to this picture?</A>
<B><nvs>laughter</nvs> It's very different from restaurant to we often go. We often go to a kind of family style restaurant <.></.> such as Denny's or Skylark. So I wish I could <SC>go like</SC> go to a nice restaurant like this.</B>
<A><F>Er</F> what is good about family-type restaurant?</A>
<B><F>Well</F> <SC>fir</SC> at first, it's very cheap and they served very quickly. And, <F>er</F> most of the cases, <F>er</F> that kind of restaurant is in suburb, so <SC>people are very</SC> <F>er</F> people can go there very easily. I think they are good point of family-type restaurant.</B>
</followup>
</stage2>

...
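
As an illustration of how a file in this format might be processed, the following is a minimal Python sketch that reads the header fields and extracts the learner's turns (the <B> elements) with the inline annotation removed. It uses regular expressions rather than an XML parser because tag names such as <.> are not valid XML names. The function names (parse_header, learner_utterances, clean) are hypothetical, and the choice to discard the content of <F>, <R>, <SC>, <CO> and <nvs> together with the <.></.> and <..></..> markers is an assumption based only on how these tags appear in the quotation above; the Tag List remains the authoritative description.

```python
import re

# A shortened excerpt in the same format as the quotation above.
SAMPLE = """<head version="1.3">
<date>1999-12-16</date>
<sex>female</sex>
<age></age>
<TOEIC>765</TOEIC>
<SST_level>6</SST_level>
</head>
<stage2>
<followup>
<A>O K. Thank you very much. <F>Er</F> how do you spend time with your husband?</A>
<B><.></.> You mean, in our free time?</B>
<A><F>Mhmm</F>.</A>
<B><F>Er</F> like this? <.></.> <F>Well</F> <F>er</F> <R>I</R> I sometimes eating out with my husband. But we don't get dressed like this. <nvs>laughter</nvs> <..></..></B>
</followup>
</stage2>
"""


def parse_header(text):
    """Return the fields inside <head>...</head> as a dict (empty fields map to '')."""
    head = re.search(r"<head[^>]*>(.*?)</head>", text, re.S)
    if head is None:
        return {}
    # Each field sits on one line as <name>value</name>.
    return dict(re.findall(r"<(\w+)>(.*?)</\1>", head.group(1)))


def learner_utterances(text):
    """Return the raw content of every <B> element, in document order."""
    return re.findall(r"<B>(.*?)</B>", text, re.S)


def clean(utterance):
    """Strip inline annotation from one utterance (assumed tag handling, see above)."""
    # Drop empty dot-named markers such as <.></.> and <..></..>.
    s = re.sub(r"<(\.+)></\1>", "", utterance)
    # Drop these elements together with their content (assumption, not the official rule).
    s = re.sub(r"<(F|R|SC|CO|nvs)>.*?</\1>", "", s, flags=re.S)
    # Collapse the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", s).strip()


if __name__ == "__main__":
    meta = parse_header(SAMPLE)
    print(meta.get("SST_level"), meta.get("TOEIC"))
    for turn in learner_utterances(SAMPLE):
        print(clean(turn))
```

To process the whole corpus, the same functions could be applied to the text of each of the 1,281 TXT files in turn. Keeping the raw utterance alongside the cleaned one is advisable, since the annotation itself is what most learner-language analyses rely on.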

Download


License


Use and/or redistribution of the NICT JLE Corpus is permitted under the conditions of the Creative Commons Attribution-ShareAlike 3.0 License. Details can be found at http://creativecommons.org/licenses/by-sa/3.0/.

Information Analysis Laboratory (previously Language Infrastructure Group)
National Institute of Information and Communications Technology
Copyright 2004-2012 NICT All Rights Reserved.