Abstract
This paper presents a method that uses a web questionnaire to create a corpus of spontaneous utterances expressing natural ideas, which may contain grammatical mistakes. In an experimental implementation of the method, the subjects were informed that they were receiving nursing care from a person, and they were required to answer a web-based questionnaire in which their responses were recorded as speech utterances. Compared to the Wizard of Oz approach and interview-based corpus-creation methods, the presented method simplifies the collection of utterances. Furthermore, we conducted a two-fold assessment to verify the effectiveness of the presented method. First, the approach exhibited a significant reduction in workload compared to interview-style utterance collection. Second, we compared the variety of expressions collected when subjects were informed that they were communicating with a person and when they were informed that they were communicating with a nursing robot. The results indicate that, [...] robot than for a person, in terms of other metrics such as the time efficiency index, the total number of morphemes, and the average number of morphemes per utterance, [...] human speech target than for a robot.