Improving robustness of speaker recognition to new conditions using unlabeled data

Unsupervised techniques for the adaptation of speaker recognition are important due to the problem of condition mismatch that is prevalent when applying speaker recognition technology to new conditions and the general scarcity of labeled 'in-domain' data. In the recent NIST 2016 Speaker Re...

Bibliographic Details
Main authors:	Castan, D., McLaren, M., Ferrer, L., Lawson, A., Lozano-Diez, A., Lacerda F., Strombergsson S., Wlodarczak M., Heldner M., Gustafson J., House D.
Format:	CONF
Subjects:	NIST SRE16; Score Calibration; Score Normalization; Trial-based Calibration; Calibration; Speech communication; Acoustic conditions; Calibration parameters; Score normalization; Speaker clustering; Speaker recognition; Speaker recognition evaluations; Unsupervised techniques; Speech recognition
Online access:	http://hdl.handle.net/20.500.12110/paper_2308457X_v2017-August_n_p3737_Castan
Contributed by:	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA), Universidad de Buenos Aires

id	todo:paper_2308457X_v2017-August_n_p3737_Castan
record_format	dspace
spelling	todo:paper_2308457X_v2017-August_n_p3737_Castan 2023-10-03T16:40:54Z CONF info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/2.5/ar http://hdl.handle.net/20.500.12110/paper_2308457X_v2017-August_n_p3737_Castan
institution	Universidad de Buenos Aires
institution_str	I-28
repository_str	R-134
collection	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic	NIST SRE16 Score Calibration Score Normalization Trial-based Calibration Calibration Speech communication Acoustic conditions Calibration parameters NIST SRE16 Score normalization Speaker clustering Speaker recognition Speaker recognition evaluations Unsupervised techniques Speech recognition
description	Unsupervised techniques for the adaptation of speaker recognition are important due to the problem of condition mismatch that is prevalent when applying speaker recognition technology to new conditions and the general scarcity of labeled 'in-domain' data. In the recent NIST 2016 Speaker Recognition Evaluation (SRE), symmetric score normalization (S-norm) and calibration using unlabeled in-domain data were shown to be beneficial. Because calibration requires speaker labels for training, speaker-clustering techniques were used to generate pseudo-speakers for learning calibration parameters in those cases where only unlabeled in-domain data was available. These methods performed well in SRE16. It is unclear, however, whether those techniques generalize well to other data sources. In this work, we benchmark these approaches on several distinctly different databases, after we describe our SRI-CON-UAM team system submission for the NIST 2016 SRE. Our analysis shows that while the benefit of S-norm is also observed across other datasets, applying speaker-clustered calibration provides considerably greater benefit to the system in the context of new acoustic conditions. Copyright © 2017 ISCA.
format	CONF
author	Castan, D. McLaren, M. Ferrer, L. Lawson, A. Lozano-Diez, A. Lacerda F. Strombergsson S. Wlodarczak M. Heldner M. Gustafson J. House D.
author_sort	Castan, D.
title	Improving robustness of speaker recognition to new conditions using unlabeled data
url	http://hdl.handle.net/20.500.12110/paper_2308457X_v2017-August_n_p3737_Castan
_version_	1807314974585913344
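
The symmetric score normalization (S-norm) named in the description can be illustrated with a minimal sketch. It uses the commonly cited form of S-norm: the raw trial score is standardized once against the enrollment side's cohort scores and once against the test side's cohort scores, and the two results are averaged. The function name, the argument names, and the use of the population standard deviation are assumptions made for illustration. The record does not give the authors' implementation or how they selected the cohort beyond drawing it from unlabeled in-domain data.

```python
from statistics import mean, pstdev


def s_norm(raw_score, enroll_cohort_scores, test_cohort_scores):
    """Symmetric score normalization of a single trial score.

    enroll_cohort_scores: scores of the enrollment side against a cohort
    test_cohort_scores:   scores of the test side against the same cohort
    (hypothetical names; the cohort could be unlabeled in-domain data)
    """
    mu_e, sd_e = mean(enroll_cohort_scores), pstdev(enroll_cohort_scores)
    mu_t, sd_t = mean(test_cohort_scores), pstdev(test_cohort_scores)
    if sd_e == 0 or sd_t == 0:
        raise ValueError("cohort scores must not be constant")
    # Average of the two z-normalized versions of the same raw score.
    return 0.5 * ((raw_score - mu_e) / sd_e + (raw_score - mu_t) / sd_t)


if __name__ == "__main__":
    print(s_norm(2.0, [0.0, 2.0], [1.0, 3.0]))
```

Because both sides of the trial contribute equally, the normalized score does not depend on which recording is treated as enrollment and which as test.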
