Improving speech synthesis quality by reducing pitch peaks in the source recordings

We present a method for improving the perceived naturalness of corpus-based speech synthesizers. It consists in removing pronounced pitch peaks in the original recordings, which typically lead to noticeable discontinuities in the synthesized speech. We perceptually evaluated this method using two co...

Bibliographic Details
Main author:	Gravano, Agustín
Published:	2013
Subjects:	Computational linguistics; Continuous speech recognition; Speech synthesis; Corpus-based; HMM-based; Speech synthesizer; Synthesized speech; Audio recordings
Online access:	https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_97819372_v_n_p502_Violante http://hdl.handle.net/20.500.12110/paper_97819372_v_n_p502_Violante
Contributed by:	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA), Universidad de Buenos Aires

Author affiliation:	Gravano, A. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina.
Full description:	We present a method for improving the perceived naturalness of corpus-based speech synthesizers. It consists in removing pronounced pitch peaks in the original recordings, which typically lead to noticeable discontinuities in the synthesized speech. We perceptually evaluated this method using two concatenative and two HMM-based synthesis systems, and found that using it on the source recordings managed to improve the naturalness of the synthesizers and had no effect on their intelligibility. © 2013 Association for Computational Linguistics.