Improving speech synthesis quality by reducing pitch peaks in the source recordings

We present a method for improving the perceived naturalness of corpus-based speech synthesizers. It consists in removing pronounced pitch peaks in the original recordings, which typically lead to noticeable discontinuities in the synthesized speech. We perceptually evaluated this method using two co...

Bibliographic Details
Main author:	Gravano, Agustín
Published:	2013
Subjects:	Computational linguistics; Continuous speech recognition; Speech synthesis; Corpus-based; HMM-based; Speech synthesizer; Synthesized speech; Audio recordings
Online access:	https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_97819372_v_n_p502_Violante http://hdl.handle.net/20.500.12110/paper_97819372_v_n_p502_Violante
Contributed by:	Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA), Universidad de Buenos Aires

Similar items

Improving speech synthesis quality by reducing pitch peaks in the source recordings
by: Violante, L., et al.

Prosodic facilitation and interference while judging on the veracity of synthesized statements
by: Gálvez, R.H., et al.

Prosodic facilitation and interference while judging on the veracity of synthesized statements
Published: (2017)

Techniques for noise robustness in automatic speech recognition
by: Virtanen, Tuomas
Published: (2012)

Emilia: a speech corpus for Argentine Spanish text to speech synthesis
by: Torres, H.M., et al.

Emilia: a speech corpus for Argentine Spanish text to speech synthesis
Published: (2019)

Minimizing annotation effort for adaptation of speech-activity detection systems
by: Ferrer, L., et al.

Minimizing annotation effort for adaptation of speech-activity detection systems
Published: (2016)

Isolated Spanish digit recognition based on audio-visual features
by: Sad, Gonzalo D., et al.
Published: (2013)

Improving audio of emergency calls in Spanish performed to the ECU 911 through filters for ASR technology
by: Orellana, Marcos, et al.
Published: (2022)

Embedding EBP in speech and language therapy international examples /
Published: (2010)

Speech act theory and communication : a Univen study /
by: Kaburise, Phyllis
Published: (2011)

Using prosody to classify discourse relations
by: Kleinhans, J., et al.

Free speech in an open society /
by: Smolla, Rodney A.
Published: (1993)

Democracy and the problem of free speech /
by: Sunstein, Cass R.
Published: (1995)

Pragmatic perspectives on language and linguistics.
Published: (2010)

Pragmatic perspectives on language and linguistics.
Published: (2010)

The SRI system for the NIST OpenSAD 2015 speech activity detection evaluation
by: Graciarena, M., et al.

The SRI system for the NIST OpenSAD 2015 speech activity detection evaluation
Published: (2016)

Using prosody to classify discourse relations
Published: (2017)

Excitable speech : a politics of the performative /
by: Butler, Judith, 1956-
Published: (1997)

Assessing the Impact of Contextual Information in Hate Speech Detection
by: Gravano, Agustín, et al.
Published: (2023)

Teamwork Quality Prediction Using Speech-Based Features
by: Meza, Martín, et al.
Published: (2023)

Construcción de una base de datos para el desarrollo de sistemas de conversión de texto a habla
by: Rodriguez, Hernán Gonzalo
Published: (2000)

The 2016 speakers in the wild speaker recognition evaluation
by: McLaren, M., et al.

The 2016 speakers in the wild speaker recognition evaluation
Published: (2016)

Mitigating the effects of non-stationary unseen noises on language recognition performance
by: Ferrer, L., et al.

Mitigating the effects of non-stationary unseen noises on language recognition performance
Published: (2015)

Combination of Standard and Complementary Models for Audio-Visual Speech Recognition
by: Sad, Gonzalo D., et al.
Published: (2015)

Speech technology

Spoken language recognition based on senone posteriors
by: Ferrer, L., et al.

Spoken language recognition based on senone posteriors
Published: (2014)

Detección de palabras claves en lenguajes sin datos de entrenamiento
by: Brusco, Pablo, et al.
Published: (2014)

EURASIP Journal on Audio, Speech, and Music Processing

IVth International Conference on Pragmalinguistics and Speech Practices /
Published: (2011)

Frequency synthesizers : theory and design /
by: Manassewitsch, Vadim, 1927-
Published: (1976)

Framing in discourse /
Published: (1993)

Journal of speech, language, and hearing research

The speakers in the wild (SITW) speaker recognition database
by: McLaren, M., et al.

The speakers in the wild (SITW) speaker recognition database
Published: (2016)