Automatic classification of news articles in Spanish

Bibliographic Details
Main authors: Ceccatto, Hermenegildo Alejandro; Calvo, Rafael A.; García Adeva, Juan José; Cerviño Beresi, U.
Format: Conference object
Language: English
Published: 2004
Online access: http://sedici.unlp.edu.ar/handle/10915/22551
Description
Summary: We apply machine learning techniques to the automatic classification of news articles from the local newspaper La Capital of Rosario, Argentina. The corpus (LCC) is an archive of approximately 75,000 manually categorized articles in Spanish published in 1991. We benchmark on LCC three widely used supervised learning methods: k-Nearest Neighbors, Naïve Bayes and Artificial Neural Networks, illustrating the corpus properties.