Deep Neural Networks for Shimmer Approximation in Synthesized Audio Signal

Bibliographic Details
Main authors: García, Mario Alejandro; Destéfanis, Eduardo A.
Format: Conference object
Language: English
Published: 2017
Online access: http://sedici.unlp.edu.ar/handle/10915/63484
Description
Summary: Shimmer is a classical acoustic measure of the amplitude perturbation of a signal. In the human voice, this kind of variation makes it possible to characterize properties not only of the voice itself but also of the speaker. In recent years, deep learning techniques have become the state of the art for recognition tasks on voice signals. This work analyzes the relationship between shimmer and deep neural networks. A deep learning model is built that can approximate the shimmer value of a simple synthesized audio signal (stationary and without formants), taking the spectrogram as its input feature. Two conclusions are drawn: first, that for this kind of synthesized signal a neural network like the one proposed can approximate shimmer; and second, that the convolutional layers can be designed to preserve the shimmer information and pass it on to the following layers.
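
To make the quantity being approximated concrete, the following is a minimal sketch of a commonly used "local" shimmer definition (mean absolute difference between the peak amplitudes of consecutive cycles, divided by the mean peak amplitude), applied to a simple stationary sine signal without formants. This is not the authors' code or network: the record does not say which shimmer variant or synthesis procedure the paper uses, and the function names, the 100 Hz fundamental, and the 16 kHz sample rate are all assumptions chosen for illustration.

```python
import math


def local_shimmer(amplitudes):
    """Relative local shimmer: mean |A[i+1] - A[i]| divided by mean A."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two cycle amplitudes")
    diffs = [abs(b - a) for a, b in zip(amplitudes, amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amp = sum(amplitudes) / len(amplitudes)
    return mean_diff / mean_amp


def synthesize(cycle_amplitudes, f0=100.0, sample_rate=16000):
    """Stationary sine with one amplitude per cycle (assumed parameters)."""
    samples_per_cycle = int(round(sample_rate / f0))
    signal = []
    for amp in cycle_amplitudes:
        for n in range(samples_per_cycle):
            signal.append(amp * math.sin(2 * math.pi * n / samples_per_cycle))
    return signal, samples_per_cycle


def cycle_peaks(signal, samples_per_cycle):
    """Peak absolute amplitude of each consecutive cycle-length window."""
    return [
        max(abs(s) for s in signal[i:i + samples_per_cycle])
        for i in range(0, len(signal), samples_per_cycle)
    ]


# Hypothetical per-cycle amplitudes with a small alternating perturbation.
amps = [1.0, 1.1, 1.0, 1.1]
signal, spc = synthesize(amps)
peaks = cycle_peaks(signal, spc)
shimmer = local_shimmer(peaks)
```

A value computed this way could serve as the regression target for a network that receives the spectrogram of `signal` as input; a signal with constant cycle amplitudes yields a shimmer of zero.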