A generalization of PLDA for joint modeling of speaker identity and multiple nuisance conditions
Probabilistic linear discriminant analysis (PLDA) is the leading method for computing scores in speaker recognition systems. The method models the vectors representing each audio sample as a sum of three terms: one that depends on the speaker identity, one that models the within-speaker variability,...
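As a rough illustration of the additive structure the abstract describes, the sketch below draws vectors from a generative model of the form x_i = mu + V y_s(i) + sum_k U_k w_k,c(i) + eps_i: a speaker term shared by all samples of a speaker, one term per nuisance condition (for example language or channel) shared by all samples with the same label, and an independent residual. The symbols, the function name `sample_jplda`, the dimensions, and the randomly drawn parameters are all assumptions made for illustration. This is not the authors' notation, training procedure, or scoring method.

```python
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gaussian_vector(rng, dim, scale=1.0):
    """Draw a vector of independent zero-mean Gaussian values."""
    return [rng.gauss(0.0, scale) for _ in range(dim)]


def random_matrix(rng, rows, cols):
    """Draw a rows x cols matrix of standard Gaussian values."""
    return [gaussian_vector(rng, cols) for _ in range(rows)]


def sample_jplda(speakers, conditions, dim=4, spk_rank=2, cond_rank=2,
                 noise_scale=0.1, seed=0):
    """Draw one vector per sample from an assumed multi-nuisance model.

    speakers:   list of speaker labels, one per sample.
    conditions: list of dicts, one per sample, mapping a nuisance name
                (e.g. "language") to that sample's label for it.
    """
    rng = random.Random(seed)
    # Hypothetical model parameters, drawn at random for illustration only.
    mu = gaussian_vector(rng, dim)
    V = random_matrix(rng, dim, spk_rank)
    names = sorted({name for cond in conditions for name in cond})
    U = {name: random_matrix(rng, dim, cond_rank) for name in names}

    y = {}  # speaker label -> latent shared by all samples of that speaker
    w = {}  # (nuisance name, label) -> latent shared by samples with that label
    samples = []
    for spk, cond in zip(speakers, conditions):
        if spk not in y:
            y[spk] = gaussian_vector(rng, spk_rank)
        x = [m + s for m, s in zip(mu, matvec(V, y[spk]))]
        for name, label in sorted(cond.items()):
            key = (name, label)
            if key not in w:
                w[key] = gaussian_vector(rng, cond_rank)
            x = [a + b for a, b in zip(x, matvec(U[name], w[key]))]
        # Residual term, drawn independently for every sample.
        eps = gaussian_vector(rng, dim, noise_scale)
        samples.append([a + e for a, e in zip(x, eps)])
    return samples
```

With `noise_scale=0.0`, two samples that share a speaker and every nuisance label come out identical, which makes the tying of latent variables across samples easy to see.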
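Main authors: | Ferrer, L.; McLaren, M.; Sekhar C.C.; Rao P.; Ghosh P.K.; Murthy H.A.; Yegnanarayana B.; Umesh S.; Alku P.; Prasanna S.R.M.; Narayanan S. |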
---|---|
Format: | CONF |
Subjects: | |
Online access: | http://hdl.handle.net/20.500.12110/paper_2308457X_v2018-September_n_p82_Ferrer |
Contributed by: |
id |
todo:paper_2308457X_v2018-September_n_p82_Ferrer |
---|---|
record_format |
dspace |
spelling |
todo:paper_2308457X_v2018-September_n_p82_Ferrer 2023-10-03T16:40:55Z CONF info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/2.5/ar http://hdl.handle.net/20.500.12110/paper_2308457X_v2018-September_n_p82_Ferrer |
institution |
Universidad de Buenos Aires |
institution_str |
I-28 |
repository_str |
R-134 |
collection |
Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA) |
topic |
Probabilistic linear discriminant analysis; Speaker recognition; Cost functions; Discriminant analysis; Speech communication; Speech processing; Acoustic characteristic; Acoustic conditions; Joint modeling; Speaker recognition system; Speaker variability; Test condition; Speech recognition |
description |
Probabilistic linear discriminant analysis (PLDA) is the leading method for computing scores in speaker recognition systems. The method models the vectors representing each audio sample as a sum of three terms: one that depends on the speaker identity, one that models the within-speaker variability, and one that models any remaining variability. The last two terms are assumed to be independent across samples. We recently proposed an extension of the PLDA method, which we termed Joint PLDA (JPLDA), where the second term is considered dependent on the type of nuisance condition present in the data (e.g., the language or channel). The proposed method led to significant gains for multilanguage speaker recognition when taking language as the nuisance condition. In this paper, we present a generalization of this approach that allows for multiple nuisance terms. We show results using language and several nuisance conditions describing the acoustic characteristics of the sample and demonstrate that jointly including all these factors in the model leads to better results than including only language or acoustic condition factors. Overall, we obtain relative improvements in detection cost function between 5% and 47% for various systems and test conditions with respect to standard PLDA approaches. © 2018 International Speech Communication Association. All rights reserved. |
format |
CONF |
author |
Ferrer, L.; McLaren, M.; Sekhar C.C.; Rao P.; Ghosh P.K.; Murthy H.A.; Yegnanarayana B.; Umesh S.; Alku P.; Prasanna S.R.M.; Narayanan S. |
title |
A generalization of PLDA for joint modeling of speaker identity and multiple nuisance conditions |
url |
http://hdl.handle.net/20.500.12110/paper_2308457X_v2018-September_n_p82_Ferrer |
_version_ |
1807315035840577536 |