Using prosody to classify discourse relations

This work aims to explore the correlation between the discourse structure of a spoken monologue and its prosody by predicting discourse relations from different prosodic attributes. For this purpose, a corpus of semi-spontaneous monologues in English has been automatically annotated according to the...

Bibliographic Details
Main authors: Kleinhans, J., Farrús, M., Gravano, A., Pérez, J.M., Lai, C., Wanner, L., Lacerda F., Strombergsson S., Wlodarczak M., Heldner M., Gustafson J., House D., Amazon Alexa; Apple; DiDi; et al.; Furhat Robotics; Microsoft
Format: CONF
Subjects:
RST
Online access: http://hdl.handle.net/20.500.12110/paper_2308457X_v2017-August_n_p3201_Kleinhans
id todo:paper_2308457X_v2017-August_n_p3201_Kleinhans
record_format dspace
spelling todo:paper_2308457X_v2017-August_n_p3201_Kleinhans 2023-10-03T16:40:54Z
CONF info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/2.5/ar http://hdl.handle.net/20.500.12110/paper_2308457X_v2017-August_n_p3201_Kleinhans
institution Universidad de Buenos Aires
institution_str I-28
repository_str R-134
collection Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic Discourse structure
Prosody
RST
Speech synthesis
Support vector machines
Continuous speech recognition
Speech
Text processing
Prosodic features
Rhetorical relations
Rhetorical structure theory
Speech rates
Speech understanding
Supervised classification
Speech communication
description This work aims to explore the correlation between the discourse structure of a spoken monologue and its prosody by predicting discourse relations from different prosodic attributes. For this purpose, a corpus of semi-spontaneous monologues in English has been automatically annotated according to the Rhetorical Structure Theory, which models coherence in text via rhetorical relations. From corresponding audio files, prosodic features such as pitch, intensity, and speech rate have been extracted from different contexts of a relation. Supervised classification tasks using Support Vector Machines have been performed to find relationships between prosodic features and rhetorical relations. Preliminary results show that intensity combined with other features extracted from intra- and intersegmental environments is the feature with the highest predictability for a discourse relation. The prediction of rhetorical relations from prosodic features and their combinations is straightforwardly applicable to several tasks such as speech understanding or generation. Moreover, the knowledge of how rhetorical relations should be marked in terms of prosody will serve as a basis to improve speech synthesis applications and make voices sound more natural and expressive. Copyright © 2017 ISCA.
format CONF
author Kleinhans, J.
Farrús, M.
Gravano, A.
Pérez, J.M.
Lai, C.
Wanner, L.
Lacerda F.
Strombergsson S.
Wlodarczak M.
Heldner M.
Gustafson J.
House D.
Amazon Alexa; Apple; DiDi; et al.; Furhat Robotics; Microsoft
author_sort Kleinhans, J.
title Using prosody to classify discourse relations
url http://hdl.handle.net/20.500.12110/paper_2308457X_v2017-August_n_p3201_Kleinhans
_version_ 1807317726762369024
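
Illustrative note: the description above mentions supervised classification with Support Vector Machines over prosodic features such as pitch, intensity, and speech rate. The record does not state which toolkit, kernel, or feature layout the authors used, so the following is only a minimal sketch under stated assumptions: a linear SVM trained with Pegasos-style stochastic subgradient descent, written with the Python standard library only. The feature values, the two class labels, and the function names `train_linear_svm` and `predict` are all made up for illustration and are not data or results from the paper.

```python
import random

# Made-up toy data (NOT from the paper). Each row is assumed to be
# [pitch, intensity, speech_rate], already z-scored per speaker.
FEATURES = [
    [0.2, 1.0, -0.1], [-0.3, 1.2, 0.4], [0.1, 0.8, 0.0], [0.5, 1.5, -0.2],
    [0.1, -1.0, 0.2], [-0.2, -1.3, -0.1], [0.4, -0.9, 0.3], [-0.1, -1.1, 0.0],
]
# +1 and -1 stand for two hypothetical rhetorical relation classes.
LABELS = [1, 1, 1, 1, -1, -1, -1, -1]


def train_linear_svm(rows, labels, lam=0.01, epochs=200, seed=0):
    """Fit a linear SVM by stochastic subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    # Append a constant feature so the last weight plays the role of a bias.
    data = [row + [1.0] for row in rows]
    w = [0.0] * len(data[0])
    order = list(range(len(data)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decreasing step size
            x, y = data[i], labels[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Shrink the weights (gradient of the L2 regulariser).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Move towards the example only if it violates the margin.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w


def predict(w, row):
    """Return +1 or -1 from the sign of the linear score."""
    score = sum(wj * xj for wj, xj in zip(w, row + [1.0]))
    return 1 if score >= 0 else -1


weights = train_linear_svm(FEATURES, LABELS)
print(predict(weights, [0.0, 2.0, 0.0]))   # a high-intensity segment
print(predict(weights, [0.0, -2.0, 0.0]))  # a low-intensity segment
```

In this toy setup the two classes differ mainly in the intensity column, so the learned weight vector leans on that feature. A real experiment would need many more relation classes, features drawn from both intra- and intersegmental contexts, and held-out evaluation.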