Exploring the role of phonetic bottleneck features for speaker and language recognition

Using bottleneck features extracted from a deep neural network (DNN) trained to predict senone posteriors has resulted in new, state-of-the-art technology for language and speaker identification. For language identification, the features' dense phonetic information is believed to enable improve...

Bibliographic Details
Main authors:	McLaren, M.; Ferrer, L.; Lawson, A.; The Institute of Electrical and Electronics Engineers Signal Processing Society
Format:	CONF
Subjects:	Bottleneck Features; Deep Neural Networks; Language Recognition; Speaker Recognition
Online access:	http://hdl.handle.net/20.500.12110/paper_15206149_v2016-May_n_p5575_McLaren
Contributed by:	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA), Universidad de Buenos Aires

id	todo:paper_15206149_v2016-May_n_p5575_McLaren
record_format	dspace
institution	Universidad de Buenos Aires
institution_str	I-28
repository_str	R-134
collection	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic	Bottleneck Features; Deep Neural Networks; Language Recognition; Speaker Recognition
description	Using bottleneck features extracted from a deep neural network (DNN) trained to predict senone posteriors has resulted in new, state-of-the-art technology for language and speaker identification. For language identification, the features' dense phonetic information is believed to enable improved performance by better representing language-dependent phone distributions. For speaker recognition, the role of these features is less clear, given that a bottleneck layer near the DNN output layer is thought to contain limited speaker information. In this article, we analyze the role of bottleneck features in these identification tasks by varying the DNN layer from which they are extracted, under the hypothesis that speaker information is traded for dense phonetic information as the layer moves toward the DNN output layer. Experiments support this hypothesis under certain conditions, and highlight the benefit of using a bottleneck layer close to the DNN output layer when DNN training data is matched to the evaluation conditions, and a layer more central to the DNN otherwise. © 2016 IEEE.
format	CONF
author	McLaren, M.; Ferrer, L.; Lawson, A.; The Institute of Electrical and Electronics Engineers Signal Processing Society
author_sort	McLaren, M.
title	Exploring the role of phonetic bottleneck features for speaker and language recognition
url	http://hdl.handle.net/20.500.12110/paper_15206149_v2016-May_n_p5575_McLaren
