A PSO-based clustering approach assisted by initial clustering information

Clustering of short texts is an important research area because of its applicability in information retrieval and text mining. To this end was proposed CLUDIPSO, a discrete Particle Swarm Optimization algorithm to cluster short texts. Initial results showed that CLUDIPSO has performed well in small...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Velázquez, Carlos, Cagnina, Leticia, Errecalde, Marcelo Luis
Formato: Objeto de conferencia
Lenguaje:Inglés
Publicado: 2012
Materias:
Acceso en línea:http://sedici.unlp.edu.ar/handle/10915/23753
Aporte de:
Descripción
Sumario:Clustering of short texts is an important research area because of its applicability in information retrieval and text mining. To this end was proposed CLUDIPSO, a discrete Particle Swarm Optimization algorithm to cluster short texts. Initial results showed that CLUDIPSO has performed well in small collections of short texts. However, later works showed some drawbacks when dealing with larger collections. In this paper we present a hybridization of CLUDIPSO to overcome these drawbacks, by providing information in the initial cycles of the algorithm to avoid a random search and thus speed up the convergence process. This is achieved by using a pre-clustering obtained with the Expectation-Maximization method which is included in the initial population of the algorithm. The results obtained with the hybrid version show a significant improvement over those obtained with the original version.