Rebuilding the Story of a Hero: Information Extraction in Ancient Argentinian Texts
Large numbers of ancient documents on Argentinian history have become available in recent years, making it possible to extract interesting and useful aggregated information. This work proposes the application of Natural Language Processing, Text Mining and Visualization tools over Argen...
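Main authors: | Xamena, Eduardo; Marmanillo, Walter Gabriel; Mechaca, Ana Lidia |
---|---|
Format: | Conference object |
Language: | English |
Published: | 2019 |
Subjects: | Computer Science; Argentinian history; Natural language processing; Text Mining; Visualization; Big document repositories |
Online access: | http://sedici.unlp.edu.ar/handle/10915/87809 |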

id | I19-R120-10915-87809 |
---|---|
record_format | dspace |
institution | Universidad Nacional de La Plata |
institution_str | I-19 |
repository_str | R-120 |
collection | SEDICI (UNLP) |
language | English |
topic | Computer Science; Argentinian history; Natural language processing; Text Mining; Visualization; Big document repositories |
spellingShingle | Computer Science; Argentinian history; Natural language processing; Text Mining; Visualization; Big document repositories; Xamena, Eduardo; Marmanillo, Walter Gabriel; Mechaca, Ana Lidia; Rebuilding the Story of a Hero: Information Extraction in Ancient Argentinian Texts |
topic_facet | Computer Science; Argentinian history; Natural language processing; Text Mining; Visualization; Big document repositories |
description | Large numbers of ancient documents on Argentinian history have become available in recent years, making it possible to extract interesting and useful aggregated information. This work proposes the application of Natural Language Processing, Text Mining and Visualization tools over Argentinian ancient document repositories. Conceptual maps and entity networks are the first target of this preliminary paper. The first step is the normalization of OCR-acquired books of General Güemes. Exploratory analyses reveal numerous spelling errors introduced by the OCR acquisition of the volumes. We propose automatic methods for overcoming this issue during normalization. In addition, a first topic landscape of a subset of volumes is obtained and analysed using Topic Modelling tools. |
format | Conference object |
author | Xamena, Eduardo; Marmanillo, Walter Gabriel; Mechaca, Ana Lidia |
author_facet | Xamena, Eduardo; Marmanillo, Walter Gabriel; Mechaca, Ana Lidia |
author_sort | Xamena, Eduardo |
title | Rebuilding the Story of a Hero: Information Extraction in Ancient Argentinian Texts |
title_short | Rebuilding the Story of a Hero: Information Extraction in Ancient Argentinian Texts |
title_full | Rebuilding the Story of a Hero: Information Extraction in Ancient Argentinian Texts |
title_fullStr | Rebuilding the Story of a Hero: Information Extraction in Ancient Argentinian Texts |
title_full_unstemmed | Rebuilding the Story of a Hero: Information Extraction in Ancient Argentinian Texts |
title_sort | rebuilding the story of a hero: information extraction in ancient argentinian texts |
publishDate | 2019 |
url | http://sedici.unlp.edu.ar/handle/10915/87809 |
work_keys_str_mv | AT xamenaeduardo rebuildingthestoryofaheroinformationextractioninancientargentiniantexts AT marmanillowaltergabriel rebuildingthestoryofaheroinformationextractioninancientargentiniantexts AT mechacaanalidia rebuildingthestoryofaheroinformationextractioninancientargentiniantexts |
bdutipo_str | Repositories |
_version_ | 1764820489405988868 |
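
The abstract mentions normalizing OCR-acquired text that contains many spelling errors, but this record does not describe the authors' method. The following is only a minimal sketch of one possible approach, assuming a hypothetical lexicon of known-good words: it reattaches stray spacing accents (as in "G¨uemes") and then snaps each token to its closest lexicon entry using Python's standard `difflib`. The function names, the lexicon and the 0.8 similarity cutoff are illustrative assumptions, not taken from the paper.

```python
import difflib
import re
import unicodedata

# Spacing accent marks that OCR/PDF extraction may emit *before* the letter
# they belong to, mapped to the equivalent combining mark (assumed mapping).
SPACING_TO_COMBINING = {
    "\u00a8": "\u0308",  # diaeresis
    "\u00b4": "\u0301",  # acute accent
}


def repair_accents(text):
    """Attach a stray spacing accent to the letter that follows it."""
    def fix(match):
        mark, letter = match.group(1), match.group(2)
        # Compose letter + combining mark into a single precomposed character.
        return unicodedata.normalize("NFC", letter + SPACING_TO_COMBINING[mark])
    return re.sub("([\u00a8\u00b4])([A-Za-z])", fix, text)


def correct_token(token, lexicon, cutoff=0.8):
    """Return the closest lexicon entry, or the token unchanged if none is close."""
    if token in lexicon:
        return token
    matches = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
    return matches[0] if matches else token


def normalize(text, lexicon):
    """Repair accents, then correct each word against the (hypothetical) lexicon."""
    text = repair_accents(text)
    return re.sub(r"\w+", lambda m: correct_token(m.group(0), lexicon), text)


if __name__ == "__main__":
    lexicon = ["General", "Güemes"]  # hypothetical list of known-good words
    print(normalize("Generai G\u00a8uemes", lexicon))
```

A fixed lexicon with a similarity cutoff keeps the sketch conservative: tokens with no sufficiently close match are left untouched rather than being forced to a wrong correction.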