Paper: SLP-P7.9
Session: Audio-visual and Multimodal Processing
Time: Wednesday, May 17, 10:00 - 12:00
Presentation: Poster
Topic: Speech and Spoken Language Processing: Speech/voice-based human-computer interfaces (HCI)
Title: THE VOCAL JOYSTICK
Authors: Jeff Bilmes, Jonathan Malkin, Xiao Li, Susumu Harada, Kelley Kilanski, Katrin Kirchhoff, Richard Wright, Amarnag Subramanya, James Landay, Patricia Dowden, Howard Chizeck, University of Washington, Seattle, United States
Abstract: The Vocal Joystick is a novel human-computer interface mechanism designed to enable individuals with motor impairments to use vocal parameters to control objects on a computer screen (buttons, sliders, etc.) and ultimately electro-mechanical instruments (e.g., robotic arms, wireless home automation devices). We have developed a working prototype of the “VJ-engine” with which individuals can now control computer mouse movement with their voice. The core engine is currently optimized according to a number of criteria. In this paper, we describe the engine system design, engine optimization, and user-interface improvements, and outline some of the signal processing and pattern recognition modules that were successful. We also describe a recently initiated large vocal data collection effort intended to improve the engine’s accuracy. Lastly, we present new results comparing the Vocal Joystick with a state-of-the-art eye-tracking pointing device, and show that not only is the Vocal Joystick already competitive, but for some tasks it appears to perform better.
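The abstract does not spell out the control mapping, so the following is only a minimal illustrative sketch, not the published VJ-engine: it assumes an upstream module has already extracted per-frame vocal parameters, with vowel quality normalized to two axes that pick a screen direction and frame energy (loudness) setting the cursor speed. All names here (cursor_velocity, vowel_x, vowel_y, energy) are hypothetical.

```python
import numpy as np

def cursor_velocity(vowel_x: float, vowel_y: float,
                    energy: float,
                    max_speed: float = 400.0,   # pixels/second (arbitrary choice)
                    energy_floor: float = 0.01) -> np.ndarray:
    """Return a (dx, dy) cursor velocity in pixels/second for one audio frame.

    vowel_x, vowel_y: assumed vowel-quality coordinates in [-1, 1]
    energy: assumed normalized frame loudness in [0, 1]
    """
    if energy < energy_floor:            # silence: cursor stays put
        return np.zeros(2)
    direction = np.array([vowel_x, vowel_y], dtype=float)
    norm = np.linalg.norm(direction)
    if norm < 1e-6:                      # neutral vowel: no defined direction
        return np.zeros(2)
    direction /= norm                    # unit direction on the screen plane
    speed = min(energy, 1.0) * max_speed # louder phonation moves faster
    return direction * speed

def update_cursor(pos: np.ndarray, frame: dict, frame_dt: float = 0.01) -> np.ndarray:
    """Integrate one frame's velocity into the cursor position."""
    v = cursor_velocity(frame["vowel_x"], frame["vowel_y"], frame["energy"])
    return pos + v * frame_dt
```

In a sketch like this, the per-frame driver loop would call update_cursor once per analysis frame (here every 10 ms); discrete events such as mouse clicks would come from a separate classifier for non-vowel sounds, which the paper's pattern recognition modules presumably cover but which is not shown here.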