A generalization of PLDA for joint modeling of speaker identity and multiple nuisance conditions

Probabilistic linear discriminant analysis (PLDA) is the leading method for computing scores in speaker recognition systems. The method models the vectors representing each audio sample as a sum of three terms: one that depends on the speaker identity, one that models the within-speaker variability,...

Bibliographic Details
Main authors: Ferrer, L., McLaren, M., Sekhar C.C., Rao P., Ghosh P.K., Murthy H.A., Yegnanarayana B., Umesh S., Alku P., Prasanna S.R.M., Narayanan S.
Format: CONF
Subjects:
Online access: http://hdl.handle.net/20.500.12110/paper_2308457X_v2018-September_n_p82_Ferrer
Contributed by:
id todo:paper_2308457X_v2018-September_n_p82_Ferrer
record_format dspace
spelling todo:paper_2308457X_v2018-September_n_p82_Ferrer 2023-10-03T16:40:55Z CONF info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/2.5/ar
institution Universidad de Buenos Aires
institution_str I-28
repository_str R-134
collection Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic Probabilistic linear discriminant analysis
Speaker recognition
Cost functions
Discriminant analysis
Speech communication
Speech processing
Acoustic characteristic
Acoustic conditions
Joint modeling
Speaker recognition system
Speaker variability
Test condition
Speech recognition
description Probabilistic linear discriminant analysis (PLDA) is the leading method for computing scores in speaker recognition systems. The method models the vectors representing each audio sample as a sum of three terms: one that depends on the speaker identity, one that models the within-speaker variability, and one that models any remaining variability. The last two terms are assumed to be independent across samples. We recently proposed an extension of the PLDA method, which we termed Joint PLDA (JPLDA), where the second term is considered dependent on the type of nuisance condition present in the data (e.g., the language or channel). The proposed method led to significant gains for multilanguage speaker recognition when taking language as the nuisance condition. In this paper, we present a generalization of this approach that allows for multiple nuisance terms. We show results using language and several nuisance conditions describing the acoustic characteristics of the sample and demonstrate that jointly including all these factors in the model leads to better results than including only language or acoustic condition factors. Overall, we obtain relative improvements in detection cost function between 5% and 47% for various systems and test conditions with respect to standard PLDA approaches. © 2018 International Speech Communication Association. All rights reserved.
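The model structure summarized in the description can be written out as a short sketch. The symbols used here ($m_i$, $V$, $U$, $U_k$, $y$, $x$, $z$, $s_i$, $c_i$, $K$) are notation assumed for illustration and are not taken from this record; the mean term is omitted by assuming centered vectors, and the exact parameterization in the paper may differ.

```latex
\begin{align}
% Standard PLDA: speaker term, within-speaker term, residual term.
% x_i and z_i are drawn independently for every sample i.
m_i &= V\,y_{s_i} + U\,x_i + z_i \\
% JPLDA: the second term is tied to the nuisance condition c_i
% (e.g., language), so samples with the same condition share x.
m_i &= V\,y_{s_i} + U\,x_{c_i} + z_i \\
% Generalization: K nuisance conditions, each with its own subspace U_k
% and its own condition label c_i^{(k)} for sample i.
m_i &= V\,y_{s_i} + \sum_{k=1}^{K} U_k\,x^{(k)}_{c_i^{(k)}} + z_i
\end{align}
```

Under these assumptions, the first form treats the last two terms as independent across samples, while the second and third forms let samples that share a condition label share the corresponding nuisance variable.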
format CONF
author Ferrer, L.
McLaren, M.
Sekhar C.C.
Rao P.
Ghosh P.K.
Murthy H.A.
Yegnanarayana B.
Umesh S.
Alku P.
Prasanna S.R.M.
Narayanan S.
author_sort Ferrer, L.
title A generalization of PLDA for joint modeling of speaker identity and multiple nuisance conditions
url http://hdl.handle.net/20.500.12110/paper_2308457X_v2018-September_n_p82_Ferrer
_version_ 1807315035840577536