Reconstructing Incomplete and Unreliable Speech Spectrogram for Robust Automatic Speech Recognition

Title Reconstructing Incomplete and Unreliable Speech Spectrogram for Robust Automatic Speech Recognition PDF eBook
Author Shirin Badiezadegan
Publisher
Pages
Release 2015
Genre
ISBN

"The performance of an automatic speech recognition (ASR) system degrades dramatically when speech is corrupted by background noise. In many ASR applications, however, the presence of the background noise is unavoidable. Feature representations in ASR are usually derived from the short-time spectral magnitude of the speech signal, known as the speech spectrogram. The goal of the work in this thesis is to develop noise robust ASR systems by reconstructing noise corrupted speech spectrograms. This is addressed as a data imputation problem within the framework of missing feature theory in computational auditory scene analysis. This thesis presents a number of data imputation techniques which can add noise robustness to an ASR system while making minimum assumptions about the characteristics of the background noise. There are three major contributions in this thesis work. The first relates to the spectrographic mask estimation which is performed to identify noise corrupted features. Having identified the noise corrupted speech features, a spectrogram reconstruction technique is employed to estimate the underlying clean features and reconstruct the noise corrupted features. A mask estimation method, based on speech enhancement techniques presented previously in the literature, is incorporated in a spectrogram reconstruction approach for noise robust ASR. The presented mask estimation technique is shown to perform well both in stationary and non-stationary noisy environments. More importantly, this technique does not require any prior knowledge of the background noise type or the SNR level. The second contribution of this thesis is a filterbank based approach to spectrogram reconstruction based on discrete wavelet transform (DWT) de-noising. In these techniques, speech spectrogram coefficients are input to a DWT filterbank.
Most of the spectrogram reconstruction approaches presented in the literature are model-based techniques that can only provide accurate estimates of the underlying clean speech when the characteristics of the noise corrupted features do not deviate from those of the model. Discrete wavelet transform (DWT) based de-noising methods have been used for signal reconstruction, but often require that the background noise is stationary and modeled by a Gaussian distribution. A novel approach is presented in this thesis for incorporating the information derived from spectrographic masks in a DWT-based de-noising method. It will be shown that the proposed approach reduces the impact of model mismatch associated with parametric approaches and exploits the robustness of non-parametric wavelet de-noising approach. This technique, however, can perform at its best only if some parameters are tuned to the noise conditions. The third contribution of this thesis is a procedure which combines multiple DWT-based reconstructed spectral features using a closed loop optimization algorithm which is related to the overall performance of the ASR system. The feature channels are formed from an ensemble of reconstructed spectrograms generated by applying DWT-based spectrogram reconstruction with multiple parameter settings. The spectrograms associated with these feature channels differ in the degree to which spectral information is suppressed across multiple scales and frequency bands. A consistent increase in word accuracy is reported for this multi-channel performance monitoring approach with respect to an implementation of a more well known minimum mean squared error approach for missing feature based spectrogram reconstruction."
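
As a rough illustration of the two ideas the abstract combines (a spectrographic mask that flags noise corrupted features, and wavelet de-noising applied to spectrogram coefficients), the sketch below may help. It is not the thesis's algorithm. It assumes a one-level Haar transform, a binary mask obtained by thresholding an estimated local SNR, and soft thresholding applied only to coefficient pairs that touch an unreliable feature. The function names, the 0 dB SNR criterion, and the threshold value are all hypothetical choices made for this example.

```python
import math


def binary_mask(noisy_power, noise_power, snr_db=0.0):
    """Return True where a feature is judged reliable (assumed criterion:
    estimated local SNR above snr_db). Inputs are per-bin power values."""
    ratio = 10.0 ** (snr_db / 10.0)
    return [max(y - n, 0.0) > ratio * n for y, n in zip(noisy_power, noise_power)]


def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


def soft_threshold(c, t):
    """Shrink coefficient c toward zero by t, keeping its sign."""
    return math.copysign(max(abs(c) - t, 0.0), c)


def masked_denoise(frame, mask, threshold):
    """De-noise one spectrogram frame (values across frequency bins).

    Detail coefficients are left untouched when both bins they cover are
    marked reliable; otherwise they are soft-thresholded. This gating rule
    is an assumption of the sketch, not a description of the thesis method.
    """
    approx, detail = haar_dwt(frame)
    gated = []
    for k, d in enumerate(detail):
        reliable = mask[2 * k] and mask[2 * k + 1]
        gated.append(d if reliable else soft_threshold(d, threshold))
    return haar_idwt(approx, gated)


# Example: a four-bin frame whose first two bins are flagged unreliable.
frame = [4.0, 2.0, 5.0, 5.0]
mask = [False, False, True, True]
smoothed = masked_denoise(frame, mask, threshold=10.0)
```

With a large threshold, an unreliable pair of bins collapses toward its mean while reliable pairs pass through unchanged. A practical system would use a multi-level transform and, as the abstract notes, thresholds tuned to the noise conditions.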

Reconstruction of Incomplete Spectrograms for Robust Speech Recognition

Title Reconstruction of Incomplete Spectrograms for Robust Speech Recognition PDF eBook
Author Bhiksha Raj Ramakrishnan
Publisher
Pages 0
Release 2000
Genre
ISBN

Robust Speech Recognition of Uncertain or Missing Data

Title Robust Speech Recognition of Uncertain or Missing Data PDF eBook
Author Dorothea Kolossa
Publisher Springer Science & Business Media
Pages 387
Release 2011-07-14
Genre Technology & Engineering
ISBN 3642213170

Automatic speech recognition suffers from a lack of robustness with respect to noise, reverberation and interfering speech. The growing field of speech recognition in the presence of missing or uncertain input data seeks to ameliorate those problems by using not only a preprocessed speech signal but also an estimate of its reliability to selectively focus on those segments and features that are most reliable for recognition. This book presents the state of the art in recognition in the presence of uncertainty, offering examples that utilize uncertainty information for noise robustness, reverberation robustness, simultaneous recognition of multiple speech signals, and audiovisual speech recognition. The book is appropriate for scientists and researchers in the field of speech recognition who will find an overview of the state of the art in robust speech recognition, professionals working in speech recognition who will find strategies for improving recognition results in various conditions of mismatch, and lecturers of advanced courses on speech processing or speech recognition who will find a reference and a comprehensive introduction to the field. The book assumes an understanding of the fundamentals of speech recognition using Hidden Markov Models.

Robust Automatic Speech Recognition with Missing and Unreliable Data

Title Robust Automatic Speech Recognition with Missing and Unreliable Data PDF eBook
Author Ljubomir Josifovski
Publisher
Pages
Release 2002
Genre
ISBN

Acoustical and Environmental Robustness in Automatic Speech Recognition

Title Acoustical and Environmental Robustness in Automatic Speech Recognition PDF eBook
Author Alex Acero
Publisher Springer Science & Business Media
Pages 216
Release 1992-11-30
Genre Technology & Engineering
ISBN 9780792392842

The need for automatic speech recognition systems to be robust with respect to changes in their acoustical environment has become more widely appreciated in recent years, as more systems are finding their way into practical applications. Although the issue of environmental robustness has received only a small fraction of the attention devoted to speaker independence, even speech recognition systems that are designed to be speaker independent frequently perform very poorly when they are tested using a different type of microphone or acoustical environment from the one with which they were trained. The use of microphones other than a "close talking" headset also tends to severely degrade speech recognition performance. Even in relatively quiet office environments, speech is degraded by additive noise from fans, slamming doors, and other conversations, as well as by the effects of unknown linear filtering arising from reverberation from surface reflections in a room, or spectral shaping by microphones or the vocal tracts of individual speakers. Speech-recognition systems designed for long-distance telephone lines, or applications deployed in more adverse acoustical environments such as motor vehicles, factory floors, or outdoors demand far greater degrees of environmental robustness. There are several different ways of building acoustical robustness into speech recognition systems. Arrays of microphones can be used to develop a directionally-sensitive system that resists interference from competing talkers and other noise sources that are spatially separated from the source of the desired speech signal.

Robust Speech Recognition of Uncertain or Missing Data

Title Robust Speech Recognition of Uncertain or Missing Data PDF eBook
Author Dorothea Kolossa
Publisher Springer
Pages 380
Release 2013-01-02
Genre Technology & Engineering
ISBN 9783642213182

Robust Automatic Speech Recognition and Modeling of Auditory Discrimination Experiments with Auditory Spectro-temporal Features

Title Robust Automatic Speech Recognition and Modeling of Auditory Discrimination Experiments with Auditory Spectro-temporal Features PDF eBook
Author Marc René Schädler
Publisher
Pages
Release 2016
Genre
ISBN 9783814223339

Automatic speech recognition (ASR) systems still do not perform as well as human listeners under realistic conditions. The unmatched ability of humans to understand speech in the most difficult acoustic conditions originates from the superior properties of their auditory system. The aim of this thesis is to improve the recognition performance of ASR systems in difficult acoustic conditions by carefully integrating auditory signal processing strategies. To this end, the physiologically inspired extraction of spectro-temporal modulation patterns was successfully integrated into the front-end of a standard ASR system. Further, the joint spectro-temporal processing could be separated into independent temporal and spectral processes. To investigate the reason for the remaining "man-machine gap" in recognition performance, a range of critical auditory discrimination tasks were performed using ASR systems. The comparison with empirical data showed that the separate spectro-temporal modulation front-end provides a suitable auditory model and revealed the importance of across-frequency processing in speech recognition.