Implementations of data mining and natural language processing on news portals publications: an exploratory approach to hypothesis generation in semiotic interpretation
This article presents preliminary results from an empirical, exploratory approach to a set of news articles published on six Argentine general-interest news portals, together with the comments posted there by reader–users through Facebook. The material presented here forms part of a broader research...
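| Main authors: | Alomar, Francisco; Raimondo Anselmino, Natalia; Sollberger, Dolores; Gindín, Irene Lis |
|---|---|
| Format: | Journal article |
| Language: | Spanish |
| Published: | Universidad Nacional de Rosario, 2026 |
| Subjects: | semiodata; data mining; natural language processing; news portals; discursive genres |
| Online access: | https://aprendoconnooj.unr.edu.ar/index.php/revista/article/view/45 |
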
| id | I15-R219-article-45 |
|---|---|
| record_format | ojs |
| institution | Universidad Nacional de Rosario |
| institution_str | I-15 |
| repository_str | R-219 |
| container_title_str | Aprendo con NooJ |
| language | Spanish |
| format | Journal article |
| topic | semiodata; data mining; natural language processing; news portals; discursive genres |
| author | Alomar, Francisco; Raimondo Anselmino, Natalia; Sollberger, Dolores; Gindín, Irene Lis |
| title | Implementations of data mining and natural language processing on news portals publications: an exploratory approach to hypothesis generation in semiotic interpretation |
| description | This article presents preliminary results from an empirical, exploratory study of a set of news articles published on six Argentine general-interest news portals, together with the comments that reader-users posted there through Facebook. The material forms part of a broader research project, conducted by a multidisciplinary team, on the platformization of discourses about the public/common, focused on a case of “narcoterrorism” in the city of Rosario (Argentina) during 2024. The findings derive from the application of data mining and natural language processing (NLP) algorithms, including the vectorization of texts through embeddings, dimensionality reduction, and the identification of algorithmic groupings projected into graphical representations. These procedures (implemented within a strategy of methodological combination referred to as semiodata) are understood to operate as triggers for abductive inferences that generate working hypotheses. The analytical path proposed here outlines an alternative entry point for the sociosemiotic analysis of mediatized discourses and applies a specific level of observation; together, these prove fruitful for distinguishing boundaries at the level of discursive genres through the visualization of geometric distances. In this sense, the algorithmic implementations may be useful for identifying certain invariant disparities linked to enunciative properties of a generic order (at least with regard to the distinction between primary and secondary genres), whereas other differences, such as those related to the variety of journalistic genres, do not appear to have been captured computationally. |
| publisher | Universidad Nacional de Rosario |
| publishDate | 2026 |
| url | https://aprendoconnooj.unr.edu.ar/index.php/revista/article/view/45 |
| first_indexed | 2026-05-04T05:11:19Z |
| last_indexed | 2026-05-04T05:11:19Z |
| _version_ | 1864233364162084864 |
| spelling | I15-R219-article-45; 2026-04-16T14:29:33Z; original title: Implementaciones en minería de datos y procesamiento del lenguaje natural sobre publicaciones en portales de noticias: un enfoque exploratorio para la generación de hipótesis en la interpretación semiótica; Universidad Nacional de Rosario; 2026-04-16; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; peer-reviewed article; application/pdf; https://aprendoconnooj.unr.edu.ar/index.php/revista/article/view/45; DOI 10.35305/an.vi6.45; Aprendo con NooJ, No. 6 (2026); ISSN 2718-8574; spa; https://aprendoconnooj.unr.edu.ar/index.php/revista/article/view/45/69; https://creativecommons.org/licenses/by-nc-sa/4.0 |
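
The description above names three processing steps: vectorization of texts, dimensionality reduction, and algorithmic grouping of the projected points. The following is a minimal sketch of that sequence, not the authors' implementation. It assumes bag-of-words counts as a stand-in for learned embeddings, a power-iteration PCA for the reduction to two dimensions, and a two-cluster k-means for the grouping. The sample texts and all function names (`vectorize`, `project_2d`, `two_means`) are hypothetical and are not taken from the study's corpus or code.

```python
import math
from collections import Counter


def vectorize(texts):
    """Bag-of-words count vectors (assumed stand-in for embeddings)."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = []
    for t in texts:
        counts = Counter(t.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vectors


def center(vectors):
    """Subtract the column means so each dimension has mean zero."""
    n = len(vectors)
    means = [sum(col) / n for col in zip(*vectors)]
    return [[x - m for x, m in zip(v, means)] for v in vectors]


def top_component(rows, iters=200):
    """Leading principal direction via power iteration on X^T X."""
    dim = len(rows[0])
    w = [1.0 / (i + 1) for i in range(dim)]  # deterministic start vector
    start_norm = math.sqrt(sum(x * x for x in w))
    w = [x / start_norm for x in w]
    for _ in range(iters):
        scores = [sum(a * b for a, b in zip(r, w)) for r in rows]
        w_new = [sum(s * r[j] for s, r in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w_new))
        if norm == 0.0:  # no variance left in the data
            break
        w = [x / norm for x in w_new]
    return w


def project_2d(vectors):
    """Project centered vectors onto their first two principal directions."""
    rows = center(vectors)
    coords = []
    for _ in range(2):
        w = top_component(rows)
        scores = [sum(a * b for a, b in zip(r, w)) for r in rows]
        coords.append(scores)
        # Deflate: remove the direction just found before the next pass.
        rows = [[a - s * b for a, b in zip(r, w)] for r, s in zip(rows, scores)]
    return list(zip(coords[0], coords[1]))


def two_means(points, iters=20):
    """k-means with k=2, seeded with the first point and the one farthest from it."""
    centers = [points[0], max(points, key=lambda p: math.dist(p, points[0]))]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            for p in points
        ]
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


# Hypothetical toy texts: two article-like sentences and two comment-like ones.
texts = [
    "the provincial government announced new security measures in the city",
    "the ministry reported that the investigation into the attacks continues",
    "this is terrible i am scared",
    "i cannot believe this is happening",
]
points = project_2d(vectorize(texts))
labels = two_means(points)
for text, point, label in zip(texts, points, labels):
    print(label, [round(c, 3) for c in point], text)
```

In a setting like the one described, the printed coordinates would be plotted and the distances between groups inspected by the analyst; the grouping is a prompt for hypotheses about genre boundaries, not a classification result in itself.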