A Supervised Term-Weighting Method and its Application to Variable Extraction from Digital Media

Successful modeling and prediction depend on effective methods for the extraction of domain-relevant variables. This paper proposes a methodology for identifying domain-specific terms. The proposed methodology relies on a collection of documents labeled as relevant or irrelevant to the domain under...
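The full abstract in the record's description field describes weighting each term by its descriptive and discriminating power, then combining the two through an adjustable parameter that can favor precision, recall, or a balance of both. The Python sketch below illustrates one way such a weighting could look. The definitions are assumptions made for illustration and are not taken from the paper: descriptive power is approximated as the fraction of relevant documents containing the term (recall-like), discriminating power as the fraction of documents containing the term that are relevant (precision-like), and the two are combined with an F-beta style formula. The names term_weights and beta are hypothetical.

```python
from collections import Counter


def term_weights(docs, labels, beta=1.0):
    """Weight terms from labeled documents (illustrative assumption only).

    docs   -- list of token lists, one per document
    labels -- list of booleans, True if the document is relevant
    beta   -- beta > 1 favors the recall-like descriptive score,
              beta < 1 favors the precision-like discriminating score
    """
    n_relevant = sum(labels)
    df_all = Counter()  # number of documents containing each term
    df_rel = Counter()  # number of relevant documents containing each term
    for tokens, relevant in zip(docs, labels):
        for term in set(tokens):
            df_all[term] += 1
            if relevant:
                df_rel[term] += 1

    weights = {}
    for term, n_all in df_all.items():
        hits = df_rel[term]
        descr = hits / n_relevant if n_relevant else 0.0  # recall-like
        discr = hits / n_all                              # precision-like
        denom = beta ** 2 * discr + descr
        # F-beta style combination of the two scores (assumed form)
        weights[term] = (1 + beta ** 2) * discr * descr / denom if denom else 0.0
    return weights


# Toy collection: two relevant (economic) and two irrelevant documents.
docs = [
    ["inflation", "rate", "bank"],
    ["inflation", "bank", "policy"],
    ["football", "rate"],
    ["bank", "football"],
]
labels = [True, True, False, False]
weights = term_weights(docs, labels)
recall_weights = term_weights(docs, labels, beta=2.0)
```

In this toy collection, "inflation" appears in every relevant document and in no irrelevant one, so it receives the highest weight, while "football" never appears in a relevant document and receives zero. Raising beta increases the weight of "bank", which covers all relevant documents but also occurs in an irrelevant one.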

Bibliographic Details
Main authors: Maisonnave, Mariano; Delbianco, Fernando; Tohmé, Fernando Abel; Maguitman, Ana Gabriela
Format: Conference object
Language: English
Published: 2018
Online access: http://sedici.unlp.edu.ar/handle/10915/70694
http://47jaiio.sadio.org.ar/sites/default/files/ASAI-07.pdf
id I19-R120-10915-70694
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
term weighting
variable extraction
information retrieval
query-term selection
description Successful modeling and prediction depend on effective methods for extracting domain-relevant variables. This paper proposes a methodology for identifying domain-specific terms. The methodology relies on a collection of documents labeled as relevant or irrelevant to the domain under analysis. Based on the labeled document collection, we propose a supervised technique that weights terms according to their descriptive and discriminating power. The descriptive and discriminating values are then combined into a general measure that, through an adjustable parameter, makes it possible to favor different aspects of retrieval independently: maximizing precision, maximizing recall, or balancing the two. The proposed technique is applied to the economic domain and is evaluated empirically through a human-subject experiment involving experts and non-experts in economics. It is also evaluated as a term-weighting technique for query-term selection, showing promising results. Finally, we illustrate the potential of the proposal as a first step toward identifying different types of associations between words.
format Conference object
author Maisonnave, Mariano
Delbianco, Fernando
Tohmé, Fernando Abel
Maguitman, Ana Gabriela
title A Supervised Term-Weighting Method and its Application to Variable Extraction from Digital Media
publishDate 2018
url http://sedici.unlp.edu.ar/handle/10915/70694
http://47jaiio.sadio.org.ar/sites/default/files/ASAI-07.pdf
bdutipo_str Repositorios
_version_ 1764820481693712387