Paper: | SLP-P11.9 |
Session: | Front-end For Robust Speech Recognition |
Time: | Wednesday, May 17, 16:30 - 18:30 |
Presentation: |
Poster |
Topic: |
Speech and Spoken Language Processing: End-point detection and barge-in methods |
Title: |
AUTOMATIC SPEECH SEGMENTATION COMBINING AN HMM-BASED APPROACH AND RECURRENCE TREND ANALYSIS |
Authors: |
Runqiang Yan, Shanghai Jiao Tong University, China; Yiqing Zu, Motorola Research Center, China; Yisheng Zhu, Shanghai Jiao Tong University, China |
Abstract: |
To improve the segmentation accuracy of the standard HMM-based approach, this paper presents a nonlinear dynamical method that adjusts phoneme boundaries by detecting and measuring the nonstationarity of speech dynamics. The dynamical systems of different phones exhibit different invariant attractor structures in phase space. Therefore, when adjacent phones are analyzed, there may be a point at which the underlying dynamics change. In this study, the time-dependent recurrence trend (TDRT) is proposed to describe how the local degree of nonstationarity of the speech dynamics changes as time progresses, and the largest paling slope in the windowed recurrence plots (RPs) is identified as the phoneme boundary. Experimental results on the TIMIT database show a 9.41% increase in agreement within 20 ms when TDRT correction is applied. |
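
The abstract does not give the exact definition of TDRT, so the following is only a minimal sketch of the general idea, under stated assumptions. It uses the standard recurrence-quantification "trend" measure (the slope of the per-diagonal recurrence rate against distance from the main diagonal, which quantifies the paling of a recurrence plot) and evaluates it in sliding windows. The function names, the embedding parameters `dim` and `tau`, the threshold rule `eps_frac`, and the window and hop sizes are all hypothetical choices for illustration, not values taken from the paper.

```python
import numpy as np

def delay_embed(x, dim=3, tau=2):
    """Time-delay embedding: each row is a point in reconstructed phase space."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("signal too short for this embedding")
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)

def recurrence_matrix(points, eps):
    """Binary recurrence plot: R[i, j] = 1 when points i and j lie within eps."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return (dist <= eps).astype(int)

def recurrence_trend(R, max_offset=None):
    """Slope of the per-diagonal recurrence rate versus distance from the
    main diagonal. A strongly negative value means the plot pales quickly,
    which is the usual sign of nonstationary dynamics."""
    n = R.shape[0]
    if max_offset is None:
        max_offset = n // 2  # skip the sparse, noisy corner diagonals
    offsets = np.arange(1, max_offset + 1)
    rates = np.array([np.diagonal(R, offset=k).mean() for k in offsets])
    slope, _intercept = np.polyfit(offsets, rates, 1)
    return slope

def windowed_trend(x, win=200, hop=20, dim=3, tau=2, eps_frac=0.1):
    """Trend of the recurrence plot computed in sliding windows.
    Assumption: eps is set to eps_frac times the window's standard deviation."""
    x = np.asarray(x, dtype=float)
    centers, trends = [], []
    for start in range(0, len(x) - win + 1, hop):
        seg = x[start : start + win]
        R = recurrence_matrix(delay_embed(seg, dim, tau), eps_frac * seg.std())
        centers.append(start + win // 2)
        trends.append(recurrence_trend(R))
    return np.array(centers), np.array(trends)
```

Under this reading, a refined boundary candidate near an HMM boundary could be taken as `centers[np.argmin(trends)]`, restricted to windows in the neighborhood of the HMM estimate. Whether the paper selects the boundary in exactly this way is an assumption.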