Application of a Bayesian Semi-supervised Learning Strategy to Network Intrusion Detection

Bibliographic Details
Main authors: Catania, Carlos Adrián, García Garino, Carlos, Bromberg, Facundo
Format: Conference object
Language: English
Published: 2010
Online access: http://sedici.unlp.edu.ar/handle/10915/152809
http://39jaiio.sadio.org.ar/sites/default/files/39jaiio-asai-16.pdf
Description
Summary: Supervised learning classifiers have proved to be a viable solution in the network intrusion detection field. In practice, however, it is difficult to obtain the labeled data required to implement these approaches. An alternative that avoids the need for large labeled datasets is to use classifiers that follow a semi-supervised strategy. These classifiers learn from both labeled and unlabeled data points. One such semi-supervised approach, originally applied to text classification, combines a naïve Bayes (NB) classifier with the expectation maximization (EM) algorithm. Despite some differences, network intrusion detection shares many of the characteristics of the document classification problem: labeled data is extremely hard to obtain, whereas unlabeled data is plentiful and easily accessible. This work aims to determine the viability of applying semi-supervised techniques to network intrusion detection, with special focus on the combination of the NB classifier and EM. A set of experiments conducted on the 1998 DARPA dataset shows that using EM with unlabeled data can provide significant benefits in classification performance, reducing the amount of labeled data required by 90%.
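The NB and EM combination mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a multinomial naïve Bayes model over count features with Laplace smoothing, and the toy data, the function names (`train_nb`, `posterior`, `nb_em`, `predict`) and the parameters (`alpha`, `n_iter`) are hypothetical choices made only for illustration. The data is invented and is not drawn from the 1998 DARPA dataset.

```python
import math

def train_nb(X, R, n_classes, alpha=1.0):
    """M-step: fit a multinomial NB from count vectors X and soft class weights R."""
    n_feat = len(X[0])
    class_w = [alpha] * n_classes                       # smoothed class mass
    feat_w = [[alpha] * n_feat for _ in range(n_classes)]  # smoothed feature mass
    for x, r in zip(X, R):
        for c in range(n_classes):
            class_w[c] += r[c]
            for j, v in enumerate(x):
                feat_w[c][j] += r[c] * v
    total = sum(class_w)
    log_prior = [math.log(w / total) for w in class_w]
    log_lik = [[math.log(w / sum(row)) for w in row] for row in feat_w]
    return log_prior, log_lik

def posterior(model, x):
    """E-step for one point: class posterior P(c | x) under the current model."""
    log_prior, log_lik = model
    scores = [lp + sum(v * ll for v, ll in zip(x, row))
              for lp, row in zip(log_prior, log_lik)]
    m = max(scores)                                     # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def nb_em(X_lab, y_lab, X_unl, n_classes, n_iter=10):
    """Train on labeled data, then alternate E and M steps using unlabeled data."""
    # Labeled points keep fixed one-hot responsibilities throughout.
    R_lab = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in y_lab]
    model = train_nb(X_lab, R_lab, n_classes)
    for _ in range(n_iter):
        R_unl = [posterior(model, x) for x in X_unl]                # E-step
        model = train_nb(X_lab + X_unl, R_lab + R_unl, n_classes)   # M-step
    return model

def predict(model, x):
    """Return the index of the most probable class for x."""
    p = posterior(model, x)
    return max(range(len(p)), key=p.__getitem__)

# Hypothetical toy data: each row counts two discretized connection attributes.
# Class 0 stands for "normal" traffic, class 1 for "attack" traffic.
X_lab = [[5, 0], [0, 5]]
y_lab = [0, 1]
X_unl = [[4, 1], [5, 1], [1, 4], [0, 6]]

model = nb_em(X_lab, y_lab, X_unl, n_classes=2)
print(predict(model, [6, 1]), predict(model, [1, 6]))
```

In this sketch only two points carry labels; the four unlabeled points receive soft class assignments in each E-step and contribute to the parameter estimates in the following M-step, which is the mechanism by which unlabeled data can compensate for scarce labels.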