Implementations of data mining and natural language processing on news portals publications: an exploratory approach to hypothesis generation in semiotic interpretation

This article presents preliminary results from an empirical, exploratory approach to a set of news articles published on six Argentine general-interest news portals, together with the comments posted there by reader–users through Facebook. The material presented here forms part of a broader research...

Bibliographic Details
Main authors: Alomar, Francisco; Raimondo Anselmino, Natalia; Sollberger, Dolores; Gindín, Irene Lis
Format: Journal article
Language: Spanish
Published: Universidad Nacional de Rosario 2026
Subjects: semiodata; data mining; natural language processing; news portals; discursive genres
Online access: https://aprendoconnooj.unr.edu.ar/index.php/revista/article/view/45
id I15-R219-article-45
record_format ojs
institution Universidad Nacional de Rosario
institution_str I-15
repository_str R-219
container_title_str Aprendo con NooJ
language Spanish
format Journal article
topic semiodata
data mining
natural language processing
news portals
discursive genres
author Alomar, Francisco
Raimondo Anselmino, Natalia
Sollberger, Dolores
Gindín, Irene Lis
title Implementations of data mining and natural language processing on news portals publications: an exploratory approach to hypothesis generation in semiotic interpretation
description This article presents preliminary results from an empirical, exploratory study of a set of news articles published on six Argentine general-interest news portals, together with the comments posted there by reader-users through Facebook. The material forms part of a broader research project on the platformization of discourses about the public/common, focused on a case of “narcoterrorism” in the city of Rosario (Argentina) during 2024 and conducted by a multidisciplinary research team. The findings derive from the application of data mining and natural language processing (NLP) algorithms, including the vectorization of texts through embeddings, dimensionality reduction, and the identification of algorithmic groupings projected into graphical representations. These procedures, implemented within a strategy of methodological combination referred to as semiodata, are understood to operate as triggers for abductive inferences that generate working hypotheses. The analytical path proposed here outlines an alternative entry point for the sociosemiotic analysis of mediatized discourses and applies a specific level of observation; taken together, the two prove fruitful for distinguishing boundaries at the level of discursive genres through the visualization of geometric distances. In this sense, algorithmic implementations may be useful for identifying certain invariant disparities linked to enunciative properties of a generic order, at least with regard to the distinction between primary and secondary genres, whereas other differences (such as those related to the variety of journalistic genres) do not appear to have been captured computationally.
publisher Universidad Nacional de Rosario
publishDate 2026
url https://aprendoconnooj.unr.edu.ar/index.php/revista/article/view/45
first_indexed 2026-05-04T05:11:19Z
last_indexed 2026-05-04T05:11:19Z
_version_ 1864233364162084864
spelling I15-R219-article-45 2026-04-16T14:29:33Z
Implementaciones en minería de datos y procesamiento del lenguaje natural sobre publicaciones en portales de noticias: un enfoque exploratorio para la generación de hipótesis en la interpretación semiótica (original Spanish title)
Universidad Nacional de Rosario
2026-04-16
info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Peer-reviewed article
application/pdf
https://aprendoconnooj.unr.edu.ar/index.php/revista/article/view/45
10.35305/an.vi6.45
Aprendo con NooJ; No. 6 (2026)
2718-8574
spa
https://aprendoconnooj.unr.edu.ar/index.php/revista/article/view/45/69
https://creativecommons.org/licenses/by-nc-sa/4.0
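
The description field above names a three-step procedure: vectorization of texts, dimensionality reduction, and algorithmic grouping read as geometric distances. The record does not say which models or libraries the authors used, so the following is only a minimal sketch under stated assumptions: bag-of-words counts stand in for learned embeddings, a power-iteration PCA stands in for whatever reduction method was applied, and a two-cluster k-means stands in for the grouping step. The six texts are invented placeholders, not material from the study's corpus, and all function names are hypothetical.

```python
import math
import random

# Toy placeholder texts (invented for illustration; NOT the study's corpus).
DOCS = [
    "the provincial government announced new security measures on monday",
    "officials reported that the investigation into the attack continues",
    "the ministry confirmed the arrest according to judicial sources",
    "this is terrible i cannot believe it",
    "i am so tired of this nobody does anything",
    "what a shame i hope it ends soon",
]

def vectorize(docs):
    """Bag-of-words counts: an assumed, crude stand-in for embeddings."""
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for d in docs:
        v = [0.0] * len(vocab)
        for w in d.split():
            v[index[w]] += 1.0
        vectors.append(v)
    return vectors

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """Geometric distance between two text vectors (0 means same direction)."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def pca_2d(vectors, iters=200, seed=0):
    """Project mean-centered vectors onto their top two principal
    directions, found by power iteration with deflation."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    X = [[v[j] - mean[j] for j in range(dim)] for v in vectors]
    rng = random.Random(seed)
    components = []
    for _ in range(2):
        w = [rng.uniform(-1, 1) for _ in range(dim)]
        for _ in range(iters):
            # Remove the directions that were already extracted.
            for c in components:
                proj = dot(w, c)
                w = [wi - proj * ci for wi, ci in zip(w, c)]
            # Multiply by X^T X without building the matrix.
            scores = [dot(row, w) for row in X]
            w = [sum(scores[i] * X[i][j] for i in range(n)) for j in range(dim)]
            norm = math.sqrt(dot(w, w)) or 1.0
            w = [wi / norm for wi in w]
        components.append(w)
    return [[dot(row, c) for c in components] for row in X]

def two_means(points, iters=20):
    """k-means with k=2, seeded deterministically with the first and last point."""
    centers = [list(points[0]), list(points[-1])]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min((0, 1), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

vectors = vectorize(DOCS)
points = pca_2d(vectors)
labels = two_means(points)
for doc, (x, y), label in zip(DOCS, points, labels):
    print(f"cluster={label}  x={x:+.2f}  y={y:+.2f}  {doc[:40]}")
```

Plotting the printed coordinates gives the kind of two-dimensional view in which distances between groups can be inspected by eye. In the spirit of the abstract, any grouping that appears is a prompt for a working hypothesis to be checked against the texts, not a finding in itself.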