Using LSTM-based Language Models and human Eye Movements metrics to understand next-word predictions

Modern Natural Language Processing (NLP) models can achieve great results on different types of linguistic tasks. This is possible thanks to a high volume of internal parameters that are optimized during the training phase. These parameters allow the models to capture high-level linguistic properties. For example, LSTM...


Bibliographic Details
Main authors: Umfurer, Alfredo, Kamienkowski, Juan E., Bianchi, Bruno
Format: Conference object
Language: English
Published: 2021
Subjects:
Online access: http://sedici.unlp.edu.ar/handle/10915/140235
http://50jaiio.sadio.org.ar/pdfs/asai/ASAI-03.pdf
Contributed by:
id I19-R120-10915-140235
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Ciencias Informáticas
LSTM
Eye Movements
Linear Mixed Models
spellingShingle Ciencias Informáticas
LSTM
Eye Movements
Linear Mixed Models
Umfurer, Alfredo
Kamienkowski, Juan E.
Bianchi, Bruno
Using LSTM-based Language Models and human Eye Movements metrics to understand next-word predictions
topic_facet Ciencias Informáticas
LSTM
Eye Movements
Linear Mixed Models
description Modern Natural Language Processing (NLP) models can achieve great results on different types of linguistic tasks. This is possible thanks to a high volume of internal parameters that are optimized during the training phase. These parameters allow the models to capture high-level linguistic properties. For example, LSTM-based language models are able to find long-term dependencies between words in a text and use them to make predictions about upcoming words. Nevertheless, their complexity makes it hard to understand which features they use to generate predictions. The neurolinguistic field faces a similar issue when studying how our brain processes language. For example, every adult reader is able to understand long texts and to make predictions about upcoming words. Nevertheless, our understanding of how these predictions are driven is limited. During the last decades, the study of eye movements during reading has shed some light on this topic, finding a relation between the time spent on a word (gaze duration) and its processing cost. Here, we aim to understand how LSTM-based models predict future words and how these predictions relate to human predictions, by fitting statistical models commonly used in the neurolinguistic field with gaze duration as the dependent variable. We found that an AWD-LSTM Language Model can partially model eye movements, with high overlap with both human-Predictability and lexical frequency. Interestingly, this last overlap is seen to depend on the training corpus, being lower when the model is fine-tuned with a corpus similar to the one used for testing.
format Objeto de conferencia
Objeto de conferencia
author Umfurer, Alfredo
Kamienkowski, Juan E.
Bianchi, Bruno
author_facet Umfurer, Alfredo
Kamienkowski, Juan E.
Bianchi, Bruno
author_sort Umfurer, Alfredo
title Using LSTM-based Language Models and human Eye Movements metrics to understand next-word predictions
title_short Using LSTM-based Language Models and human Eye Movements metrics to understand next-word predictions
title_full Using LSTM-based Language Models and human Eye Movements metrics to understand next-word predictions
title_fullStr Using LSTM-based Language Models and human Eye Movements metrics to understand next-word predictions
title_full_unstemmed Using LSTM-based Language Models and human Eye Movements metrics to understand next-word predictions
title_sort using lstm-based language models and human eye movements metrics to understand next-word predictions
publishDate 2021
url http://sedici.unlp.edu.ar/handle/10915/140235
http://50jaiio.sadio.org.ar/pdfs/asai/ASAI-03.pdf
work_keys_str_mv AT umfureralfredo usinglstmbasedlanguagemodelsandhumaneyemovementsmetricstounderstandnextwordpredictions
AT kamienkowskijuane usinglstmbasedlanguagemodelsandhumaneyemovementsmetricstounderstandnextwordpredictions
AT bianchibruno usinglstmbasedlanguagemodelsandhumaneyemovementsmetricstounderstandnextwordpredictions
bdutipo_str Repositorios
_version_ 1764820458445733889