Combination of Standard and Complementary Models for Audio-Visual Speech Recognition

In this work, new multi-classifier schemes for isolated word speech recognition based on the combination of standard Hidden Markov Models (HMMs) and Complementary Gaussian Mixture Models (CGMMs) are proposed. Typically, in speech recognition systems, each word or phoneme in the vocabulary is represe...

Bibliographic Details
Main authors: Sad, Gonzalo D., Terissi, Lucas D., Gómez, Juan Carlos
Format: Conference object
Language: English
Published: 2015
Subjects:
Online access: http://sedici.unlp.edu.ar/handle/10915/52105
http://44jaiio.sadio.org.ar/sites/default/files/asai113-120.pdf
Contributed by:
id I19-R120-10915-52105
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Speech recognition and synthesis
audio-visual information fusion
decision level fusion
description In this work, new multi-classifier schemes for isolated-word speech recognition are proposed, based on the combination of standard Hidden Markov Models (HMMs) and Complementary Gaussian Mixture Models (CGMMs). Typically, in speech recognition systems, each word or phoneme in the vocabulary is represented by a model trained with samples of that particular class. Recognition is then performed by determining which model best represents the input word or phoneme to be classified. In this paper, a novel classification strategy based on complementary class models is presented. A complementary model for a particular class j is a model trained with instances of all the considered classes except those associated with class j. The proposed classification schemes are evaluated on two audio-visual speech databases under noisy acoustic conditions. Experimental results show that the proposed classification methodologies improve recognition rates over a wide range of signal-to-noise ratios (SNRs).
format Conference object
author Sad, Gonzalo D.
Terissi, Lucas D.
Gómez, Juan Carlos
title Combination of Standard and Complementary Models for Audio-Visual Speech Recognition
publishDate 2015
url http://sedici.unlp.edu.ar/handle/10915/52105
http://44jaiio.sadio.org.ar/sites/default/files/asai113-120.pdf
bdutipo_str Repositories
_version_ 1764820476479143936
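
The description field above defines a complementary model for class j as one trained on instances of every class except j. The sketch below illustrates that idea on toy data and should not be read as the authors' implementation. Several things are assumptions made for illustration: a single diagonal Gaussian stands in for the HMMs and CGMMs named in the record, the class labels and feature vectors are synthetic, all function names are hypothetical, and the decision rule for complementary models (pick the class whose complementary model fits the input worst) is one plausible choice that the abstract does not spell out.

```python
import math
import random
from statistics import fmean, pvariance


def fit_gaussian(samples):
    """Fit a diagonal Gaussian: one (mean, variance) pair per feature dimension.

    Assumption: a stand-in for the HMMs/CGMMs mentioned in the record.
    """
    dims = list(zip(*samples))
    return [(fmean(d), pvariance(d) + 1e-6) for d in dims]


def log_likelihood(x, model):
    """Log-density of feature vector x under a diagonal Gaussian model."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - mean) ** 2 / var)
        for xi, (mean, var) in zip(x, model)
    )


def train(data):
    """data maps each class label to a list of feature vectors."""
    # Standard model for class c: trained only on samples of c.
    standard = {c: fit_gaussian(v) for c, v in data.items()}
    # Complementary model for class c: trained on every class except c.
    complementary = {
        c: fit_gaussian([x for other, v in data.items() if other != c for x in v])
        for c in data
    }
    return standard, complementary


def classify_standard(x, standard):
    """Pick the class whose own model explains x best."""
    return max(standard, key=lambda c: log_likelihood(x, standard[c]))


def classify_complementary(x, complementary):
    """Assumed rule: pick the class whose complementary model explains x worst."""
    return min(complementary, key=lambda c: log_likelihood(x, complementary[c]))


# Synthetic two-dimensional data: three well-separated classes (hypothetical).
rng = random.Random(0)
centers = {"a": (0.0, 0.0), "b": (5.0, 0.0), "c": (0.0, 5.0)}
data = {
    label: [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)) for _ in range(50)]
    for label, (cx, cy) in centers.items()
}

standard, complementary = train(data)
sample = (4.8, 0.2)  # lies close to the center of class "b"
print(classify_standard(sample, standard))
print(classify_complementary(sample, complementary))
```

On this toy data both rules select class "b". A multi-classifier scheme of the kind the record describes would then combine the two sets of scores, but how that combination is done is not stated in the abstract and is left out here.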