Improving speech synthesis quality by reducing pitch peaks in the source recordings

Bibliographic Details
Main authors:	Violante, L., Rodríguez Zivic, P., Gravano, A., Appen ButlerHill; et al.; ETS; Google; Microsoft Research; Rakuten
Format:	CONF
Subjects:	Computational linguistics; Continuous speech recognition; Speech synthesis; Corpus-based; HMM-based; Speech synthesizer; Synthesized speech; Audio recordings
Online access:	http://hdl.handle.net/20.500.12110/paper_97819372_v_n_p502_Violante
Contributed by:	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA), Universidad de Buenos Aires

Description
Summary:	We present a method for improving the perceived naturalness of corpus-based speech synthesizers. It consists of removing pronounced pitch peaks from the original recordings, which typically lead to noticeable discontinuities in the synthesized speech. We perceptually evaluated this method using two concatenative and two HMM-based synthesis systems, and found that applying it to the source recordings improved the naturalness of the synthesizers and had no effect on their intelligibility. © 2013 Association for Computational Linguistics.
