Rebuilding the Story of a Hero: Information Extraction in Ancient Argentinian Texts

A large number of ancient documents on Argentinian history have become available in recent years. This makes it possible to find interesting and useful aggregated information. This work proposes the application of Natural Language Processing, Text Mining and Visualization tools over Argen...

Bibliographic Details
Main authors: Xamena, Eduardo; Marmanillo, Walter Gabriel; Mechaca, Ana Lidia
Format: Conference object
Language: English
Published: 2019
Subjects: Computer Science; Argentinian history; Natural language processing; Text Mining; Visualization; Big document repositories
Online access: http://sedici.unlp.edu.ar/handle/10915/87809
id I19-R120-10915-87809
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Argentinian history
Natural language processing
Text Mining
Visualization
Big document repositories
description A large number of ancient documents on Argentinian history have become available in recent years. This makes it possible to find interesting and useful aggregated information. This work proposes the application of Natural Language Processing, Text Mining and Visualization tools to Argentinian ancient document repositories. Conceptual maps and entity networks are the first target of this preliminary paper. The first step is the normalization of OCR-acquired books of General Güemes. Exploratory analyses reveal numerous spelling errors caused by the OCR acquisition of the volumes. We propose automatic methods for overcoming this issue during normalization. In addition, a first topic landscape of a subset of volumes is obtained and analysed via Topic Modelling tools.
format Conference object
author Xamena, Eduardo
Marmanillo, Walter Gabriel
Mechaca, Ana Lidia
title Rebuilding the Story of a Hero: Information Extraction in Ancient Argentinian Texts
publishDate 2019
url http://sedici.unlp.edu.ar/handle/10915/87809