Minshu Zhan
MIT EECS Undergraduate Research and Innovation Scholar
Modeling Variation of Acoustic Cues in Speech Production
2012–2013
Automatic Speech Recognition (ASR) has achieved its state-of-the-art performance mainly through statistical methods. Although current algorithms succeed under relatively undisturbed conditions, performance can still be improved in noisy settings. Motivated by evidence that human listeners identify phonemes by detecting specific sound patterns that indicate linguistic categories (acoustic cues to distinctive features), the Speech Communication Lab studies these cues in order to construct a better acoustic model for ASR. My work is to model the correlation between variations in the cues and the corresponding speaking context, i.e., to discover which factors lead to which kinds of variation. We believe that such knowledge can reduce unpredictability and thereby lead to better ASR performance.
My UROP project at the Speech Communication Lab involves developing a set of landmark label processing and analysis tools needed for modeling speech acoustics. I have also worked at the Evolutionary Design & Optimization Group, improving the algorithm behind wind farm layout optimization software. At the Singapore SMART center, I wrote GIS data processing programs for a transportation simulation platform developed by MIT and Singapore researchers.