Argentine Spanish segmental duration prediction

Bibliographic Details
Main authors: Torres, H. M., Gurlekian, J. A.
Format: Conference object
Language: English
Published: 2012
Online access: http://sedici.unlp.edu.ar/handle/10915/123916
https://41jaiio.sadio.org.ar/sites/default/files/14_AST_2012.pdf
Description
Summary: In this paper we model the segmental duration of Spanish as spoken in Buenos Aires, with a view to its application in a text-to-speech system. The work was performed on two hand-labeled databases. We use artificial neural networks as the predictor, and all the input features can be extracted automatically from the text to be spoken. We experimented with a single neural network for all phonemes and with one neural network per phoneme. In both cases the results are very promising for the two databases used. The order of importance of the input features proved to be different for each of the methods tested, and also differed according to the speaker's style.