JMIS - Journal of Management Information Systems

Volume 22, Number 4, 2006, pp. 237-270

Discovering Cues to Error Detection in Speech Recognition Output: A User-Centered Approach

Zhou, Lina, Shi, Yongmei, Zhang, Dongsong, and Sears, Andrew

ABSTRACT: The great potential of speech recognition systems to free users' hands while they interact with computers has inspired a variety of promising applications. However, given the performance of today's state-of-the-art speech recognition technology, widespread acceptance is not realistic without new approaches to detecting and correcting recognition errors effectively. Identifying cues to error detection (CERD) is central to solving this problem. Our survey of the extant literature on the detection and correction of speech recognition errors reveals that the system-initiated, data-driven approach is dominant and that heuristics from human users have been largely overlooked. This may have hindered the advance of speech technology. In this research, we propose a user-centered approach to discovering CERD and carry out user studies to implement it. Content analysis of the collected verbal protocols yields a taxonomy of CERD. The CERD discovered in this study can improve our knowledge of CERD not only by validating CERD from a user's perspective but also by suggesting promising new CERD for detecting speech recognition errors. Moreover, the analysis of CERD in relation to error types and to other CERD provides new insights into the contexts in which specific CERD are effective. The findings of this study can be used not only to improve speech recognition output but also to provide context-aware support for error detection. This will help break down the barrier to mainstream adoption of speech technology in a variety of information systems and applications.

Key words and phrases: cues to error detection, speech recognition, taxonomy, verbal protocol analysis