Improving speech synthesis quality by reducing pitch peaks in the source recordings

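We present a method for improving the perceived naturalness of corpus-based speech synthesizers. It consists in removing pronounced pitch peaks in the original recordings, which typically lead to noticeable discontinuities in the synthesized speech. We perceptually evaluated this method using two concatenative and two HMM-based synthesis systems, and found that using it on the source recordings managed to improve the naturalness of the synthesizers and had no effect on their intelligibility. © 2013 Association for Computational Linguistics.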


Bibliographic Details
Main authors: Violante, L., Rodríguez Zivic, P., Gravano, A., Appen ButlerHill; et al.; ETS; Google; Microsoft Research; Rakuten
Format: CONF
Online access: http://hdl.handle.net/20.500.12110/paper_97819372_v_n_p502_Violante
id todo:paper_97819372_v_n_p502_Violante
record_format dspace
spelling todo:paper_97819372_v_n_p502_Violante 2023-10-03T16:44:45Z
Affiliation: Gravano, A., Universidad de Buenos Aires, Facultad de Ciencias Exactas y Naturales, Argentina.
info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by/2.5/ar
institution Universidad de Buenos Aires
institution_str I-28
repository_str R-134
collection Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic Computational linguistics
Continuous speech recognition
Speech synthesis
Corpus-based
HMM-based
Speech synthesizer
Synthesized speech
Audio recordings
description We present a method for improving the perceived naturalness of corpus-based speech synthesizers. It consists in removing pronounced pitch peaks in the original recordings, which typically lead to noticeable discontinuities in the synthesized speech. We perceptually evaluated this method using two concatenative and two HMM-based synthesis systems, and found that using it on the source recordings managed to improve the naturalness of the synthesizers and had no effect on their intelligibility. © 2013 Association for Computational Linguistics.
format CONF
author Violante, L.
Rodríguez Zivic, P.
Gravano, A.
Appen ButlerHill; et al.; ETS; Google; Microsoft Research; Rakuten
author_sort Violante, L.
title Improving speech synthesis quality by reducing pitch peaks in the source recordings
url http://hdl.handle.net/20.500.12110/paper_97819372_v_n_p502_Violante
_version_ 1782030858068164608