Improving speech synthesis quality by reducing pitch peaks in the source recordings
We present a method for improving the perceived naturalness of corpus-based speech synthesizers. It consists in removing pronounced pitch peaks in the original recordings, which typically lead to noticeable discontinuities in the synthesized speech. We perceptually evaluated this method using two co...
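Main authors: | Violante, L.; Rodríguez Zivic, P.; Gravano, A. |
---|---|
Format: | CONF |
Subjects: | Computational linguistics; Continuous speech recognition; Speech synthesis; Corpus-based; HMM-based; Speech synthesizer; Synthesized speech; Audio recordings |
Online access: | http://hdl.handle.net/20.500.12110/paper_97819372_v_n_p502_Violante |
Contributed by: |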
id |
todo:paper_97819372_v_n_p502_Violante |
---|---|
record_format |
dspace |
spelling |
todo:paper_97819372_v_n_p502_Violante; 2023-10-03T16:44:45Z; Improving speech synthesis quality by reducing pitch peaks in the source recordings; Violante, L.; Rodríguez Zivic, P.; Gravano, A.; Appen ButlerHill; et al.; ETS; Google; Microsoft Research; Rakuten; Computational linguistics; Continuous speech recognition; Speech synthesis; Corpus-based; HMM-based; Speech synthesizer; Synthesized speech; Audio recordings; We present a method for improving the perceived naturalness of corpus-based speech synthesizers. It consists in removing pronounced pitch peaks in the original recordings, which typically lead to noticeable discontinuities in the synthesized speech. We perceptually evaluated this method using two concatenative and two HMM-based synthesis systems, and found that using it on the source recordings managed to improve the naturalness of the synthesizers and had no effect on their intelligibility. © 2013 Association for Computational Linguistics. Fil: Gravano, A. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina.; CONF; info:eu-repo/semantics/openAccess; http://creativecommons.org/licenses/by/2.5/ar; http://hdl.handle.net/20.500.12110/paper_97819372_v_n_p502_Violante |
institution |
Universidad de Buenos Aires |
institution_str |
I-28 |
repository_str |
R-134 |
collection |
Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA) |
topic |
Computational linguistics; Continuous speech recognition; Speech synthesis; Corpus-based; HMM-based; Speech synthesizer; Synthesized speech; Audio recordings |
description |
We present a method for improving the perceived naturalness of corpus-based speech synthesizers. It consists in removing pronounced pitch peaks in the original recordings, which typically lead to noticeable discontinuities in the synthesized speech. We perceptually evaluated this method using two concatenative and two HMM-based synthesis systems, and found that using it on the source recordings managed to improve the naturalness of the synthesizers and had no effect on their intelligibility. © 2013 Association for Computational Linguistics. |
format |
CONF |
author |
Violante, L.; Rodríguez Zivic, P.; Gravano, A.; Appen ButlerHill; et al.; ETS; Google; Microsoft Research; Rakuten |
author_sort |
Violante, L. |
title |
Improving speech synthesis quality by reducing pitch peaks in the source recordings |
url |
http://hdl.handle.net/20.500.12110/paper_97819372_v_n_p502_Violante |