Deep Learning Based Speech Quality Prediction

Title Deep Learning Based Speech Quality Prediction PDF eBook
Author Gabriel Mittag
Publisher Springer Nature
Pages 171
Release 2022-02-24
Genre Technology & Engineering
ISBN 3030914798

This book shows how recent deep learning methods can be applied to speech quality prediction and provides an in-depth analysis of the suitability of different deep learning architectures for this task. The author then shows how the resulting model outperforms traditional speech quality models and provides additional information about the cause of a quality impairment through the prediction of the speech quality dimensions of noisiness, coloration, discontinuity, and loudness.

Machine Learning Based Speech Quality Prediction

Title Machine Learning Based Speech Quality Prediction PDF eBook
Author Gabriel Mittag
Publisher
Pages
Release 2022
Genre
ISBN

Simulating Conversations for the Prediction of Speech Quality

Title Simulating Conversations for the Prediction of Speech Quality PDF eBook
Author Thilo Michael
Publisher Springer Nature
Pages 157
Release 2023-06-30
Genre Technology & Engineering
ISBN 3031318447

This book discusses the simulation of conversations through a novel approach of predicting speech quality based on the interactions of two simulated interlocutors. The author describes the setup of a simulation environment that is capable of simulating human dialogue on the speech level. The impact of delay and bursty packet loss on VoIP conversations is investigated and modeled for use in the simulation. Based on parameters extracted from simulated conversations, the author proposes extensions to the E-model, a parametric model standardized by the International Telecommunication Union, in order to predict the quality of the simulated conversations. The author shows that predictions based on the simulated conversations outperform models that rely on the transmission parameters alone.

Speech and Audio Processing for Coding, Enhancement and Recognition

Title Speech and Audio Processing for Coding, Enhancement and Recognition PDF eBook
Author Tokunbo Ogunfunmi
Publisher Springer
Pages 347
Release 2014-10-14
Genre Technology & Engineering
ISBN 1493914561

This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas.

New Era for Robust Speech Recognition

Title New Era for Robust Speech Recognition PDF eBook
Author Shinji Watanabe
Publisher Springer
Pages 433
Release 2017-10-30
Genre Computers
ISBN 331964680X

This book covers the state of the art in deep neural-network-based methods for noise robustness in distant speech recognition applications. It provides insights and detailed descriptions of some of the new concepts and key technologies in the field, including novel architectures for speech enhancement, microphone arrays, robust features, acoustic model adaptation, training data augmentation, and training criteria. The contributed chapters also include descriptions of real-world applications, benchmark tools and datasets widely used in the field. This book is intended for researchers and practitioners working in the field of speech processing and recognition who are interested in the latest deep learning techniques for noise robustness. It will also be of interest to graduate students in electrical engineering or computer science, who will find it a useful guide to this field of research.

Neural Text-to-Speech Synthesis

Title Neural Text-to-Speech Synthesis PDF eBook
Author Xu Tan
Publisher Springer Nature
Pages 214
Release 2023-05-29
Genre Computers
ISBN 9819908272

Text-to-speech (TTS) aims to synthesize intelligible and natural speech based on the given text. It is a hot topic in language, speech, and machine learning research and has broad applications in industry. This book introduces neural network-based TTS in the era of deep learning, aiming to provide a good understanding of neural TTS, current research and applications, and future research trends. This book first introduces the history of TTS technologies and gives an overview of neural TTS, and provides preliminary knowledge on language and speech processing, neural networks and deep learning, and deep generative models. It then introduces neural TTS from the perspective of key components (text analyses, acoustic models, vocoders, and end-to-end models) and advanced topics (expressive and controllable, robust, model-efficient, and data-efficient TTS). It also points out some future research directions and collects some resources related to TTS. This book is the first to introduce neural TTS in a comprehensive and easy-to-understand way and can serve both academic researchers and industry practitioners working on TTS.

Advances in Multimedia Modeling

Title Advances in Multimedia Modeling PDF eBook
Author Susanne Boll
Publisher Springer
Pages 822
Release 2009-12-24
Genre Computers
ISBN 364211301X

The 16th international conference on Multimedia Modeling (MMM2010) was held in the famous mountain city Chongqing, China, January 6–8, 2010, and hosted by Southwest University. MMM is a leading international conference for researchers and industry practitioners to share their new ideas, original research results and practical development experiences from all multimedia related areas. MMM2010 attracted more than 160 regular, special session, and demo session submissions from 21 countries/regions around the world. All submitted papers were reviewed by at least two PC members or external reviewers, and most of them were reviewed by three reviewers. The review process was very selective. From the total of 133 submissions to the main track, 43 (32.3%) were accepted as regular papers, 22 (16.5%) as short papers. In all, 15 papers were received for three special sessions, which is by invitation only, and 14 submissions were received for a demo session, with 9 being selected. Authors of accepted papers come from 16 countries/regions. This volume of the proceedings contains the abstracts of three invited talks and all the regular, short, special session and demo papers. The regular papers were categorized into nine sections: 3D modeling; advanced video coding and adaptation; face, gesture and applications; image processing; image retrieval; learning semantic concepts; media analysis and modeling; semantic video concepts; and tracking and motion analysis. Three special sessions were video analysis and event recognition, cross-X multimedia mining in large scale, and mobile computing and applications. The technical program featured three invited talks, parallel oral presentation of all the accepted regular and special session papers, and poster sessions for short and demo papers.