Turn-taking cues in task-oriented dialogue

As interactive voice response systems become more prevalent and provide increasingly more complex functionality, it becomes clear that the challenges facing such systems are not solely in their synthesis and recognition capabilities. Issues such as the coordination of turn exchanges between system a...

Bibliographic Details
Main authors: Gravano, A., Hirschberg, J.
Format: JOUR
Subjects:
Online access: http://hdl.handle.net/20.500.12110/paper_08852308_v25_n3_p601_Gravano
Contributed by:
id todo:paper_08852308_v25_n3_p601_Gravano
record_format dspace
spelling todo:paper_08852308_v25_n3_p601_Gravano 2023-10-03T15:40:43Z Fil:Gravano, A. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales; Argentina. info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/2.5/ar
institution Universidad de Buenos Aires
institution_str I-28
repository_str R-134
collection Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic Dialogue
IVR systems
Prosody
Turn-taking
Back channels
Columbia
Interactive voice response
Interactive voice response systems
System usability
Speech recognition
description As interactive voice response systems become more prevalent and provide increasingly more complex functionality, it becomes clear that the challenges facing such systems are not solely in their synthesis and recognition capabilities. Issues such as the coordination of turn exchanges between system and user also play an important role in system usability. In particular, both systems and users have difficulty determining when the other is taking or relinquishing the turn. In this paper, we seek to identify turn-taking cues correlated with human-human turn exchanges which are automatically computable. We compare the presence of potential prosodic, acoustic, and lexico-syntactic turn-yielding cues in prosodic phrases preceding turn changes (smooth switches) vs. turn retentions (holds) vs. backchannels in the Columbia Games Corpus, a large corpus of task-oriented dialogues, to determine which features reliably distinguish between these three. We identify seven turn-yielding cues, all of which can be extracted automatically, for future use in turn generation and recognition in interactive voice response (IVR) systems. Testing Duncan's (1972) hypothesis that these turn-yielding cues are linearly correlated with the occurrence of turn-taking attempts, we further demonstrate that, the greater the number of turn-yielding cues that are present, the greater the likelihood that a turn change will occur. We also identify six cues that precede backchannels, which will also be useful for IVR backchannel generation and recognition; these cues correlate with backchannel occurrence in a quadratic manner. We find similar results for overlapping and for non-overlapping speech. © 2010 Elsevier Ltd. All rights reserved.
format JOUR
author Gravano, A.
Hirschberg, J.
title Turn-taking cues in task-oriented dialogue
url http://hdl.handle.net/20.500.12110/paper_08852308_v25_n3_p601_Gravano