Application of a Bayesian Semi-supervised Learning Strategy to Network Intrusion Detection

Supervised learning classifiers have proved to be a viable solution in the network intrusion detection field. In practice, however, it is difficult to obtain the required labeled data for implementing these approaches. An alternative approach that reduces the need for labeled datasets consists of usin...

Bibliographic Details
Main authors: Catania, Carlos Adrián, García Garino, Carlos, Bromberg, Facundo
Format: Conference object
Language: English
Published: 2010
Subjects: Computer Science, Intrusion Detection Systems, Semi-supervised Learning, Expectation Maximization
Online access: http://sedici.unlp.edu.ar/handle/10915/152809
http://39jaiio.sadio.org.ar/sites/default/files/39jaiio-asai-16.pdf
id I19-R120-10915-152809
record_format dspace
spelling I19-R120-10915-152809
2023-05-11T20:09:49Z
issn:1850-2784
2023-05-11T15:06:25Z
Sociedad Argentina de Informática e Investigación Operativa
http://creativecommons.org/licenses/by-nc-sa/4.0/
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
application/pdf
175-186
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Intrusion Detection Systems
Semi-supervised Learning
Expectation Maximization
description Supervised learning classifiers have proved to be a viable solution in the network intrusion detection field. In practice, however, it is difficult to obtain the labeled data these approaches require. An alternative that reduces the need for labeled datasets is to use classifiers that follow a semi-supervised strategy. These classifiers learn from both labeled and unlabeled datapoints. One such semi-supervised approach, originally applied to text classification, combines a naïve Bayes (NB) classifier with the expectation maximization (EM) algorithm. Despite some differences, network intrusion detection shares many of the characteristics of the document classification problem: labeled data are extremely hard to obtain, whereas unlabeled data are plentiful and easily accessible. This work aims to determine the viability of applying semi-supervised techniques to network intrusion detection, with special focus on the combination of the NB classifier and EM. A set of experiments conducted on the 1998 DARPA dataset shows that using EM with unlabeled data can provide significant benefits in classification performance, reducing the amount of labeled data required by 90%.
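The NB and EM combination named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the multinomial event model, the count-valued features, the Laplace smoothing constant `alpha`, and the function names `fit_nb`, `predict_proba`, and `em_nb` are all assumptions made for illustration, loosely following the text-classification formulation the description refers to.

```python
import numpy as np

def fit_nb(X, R, alpha=1.0):
    """M-step: fit multinomial naive Bayes from soft class weights R (n x k)."""
    class_weight = R.sum(axis=0)  # expected number of points per class
    log_prior = np.log((class_weight + alpha) /
                       (class_weight.sum() + alpha * R.shape[1]))
    counts = R.T @ X + alpha      # smoothed expected feature counts (k x d)
    log_lik = np.log(counts / counts.sum(axis=1, keepdims=True))
    return log_prior, log_lik

def predict_proba(X, log_prior, log_lik):
    """E-step: posterior P(class | x) under the current parameters."""
    joint = X @ log_lik.T + log_prior          # log joint, shape n x k
    joint -= joint.max(axis=1, keepdims=True)  # stabilise before exponentiating
    p = np.exp(joint)
    return p / p.sum(axis=1, keepdims=True)

def em_nb(X_lab, y_lab, X_unl, n_classes, n_iter=20):
    """Semi-supervised NB: labeled points keep their labels, unlabeled get soft ones."""
    R_lab = np.eye(n_classes)[y_lab]           # one-hot weights for labeled data
    params = fit_nb(X_lab, R_lab)              # initialise from labeled data only
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        R_unl = predict_proba(X_unl, *params)  # E-step on unlabeled data
        params = fit_nb(X_all, np.vstack([R_lab, R_unl]))  # M-step on all data
    return params
```

A fixed iteration count is used here for brevity; a likelihood-based stopping rule would be the more usual choice in practice.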
format Conference object
author Catania, Carlos Adrián
García Garino, Carlos
Bromberg, Facundo
title Application of a Bayesian Semi-supervised Learning Strategy to Network Intrusion Detection
publishDate 2010
url http://sedici.unlp.edu.ar/handle/10915/152809
http://39jaiio.sadio.org.ar/sites/default/files/39jaiio-asai-16.pdf