Combine vector quantization and support vector machine for imbalanced datasets

In cases of extremely imbalanced, high-dimensional datasets, standard machine learning techniques tend to be overwhelmed by the large classes. This paper rebalances skewed datasets by compressing the majority class. This approach combines Vector Quantization and Support Vector Machine and construc...
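
As a rough illustration of the compression step described in the abstract, the sketch below replaces the majority class with a small codebook and leaves the minority class untouched. It uses plain k-means (Lloyd's algorithm) as the vector quantizer, which is an assumption: this record does not say which quantizer or codebook size the paper uses. The names `build_codebook` and `rebalance`, the toy data, and the choice of one codeword per minority sample are all hypothetical. The SVM training step is not shown; the rebalanced `X`, `y` would be passed to any SVM implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_codebook(points, k, iterations=20, seed=0):
    """Lloyd's (k-means) iteration: return k codewords summarizing `points`.

    Assumption: k-means stands in for the paper's vector quantizer.
    """
    rng = random.Random(seed)
    # Start from k distinct samples of the data.
    codebook = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assign every point to the cell of its nearest codeword.
        cells = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, codebook[j]))
            cells[nearest].append(p)
        # Move each codeword to the centroid of its cell (unchanged if the cell is empty).
        for j, cell in enumerate(cells):
            if cell:
                codebook[j] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def rebalance(majority, minority, k=None):
    """Compress the majority class to k codewords and keep the minority class as-is.

    Hypothetical default: one codeword per minority sample, giving a 1:1 ratio.
    """
    k = len(minority) if k is None else k
    codebook = build_codebook(majority, k)
    X = codebook + [list(p) for p in minority]
    y = [0] * len(codebook) + [1] * len(minority)
    return X, y


# Toy data (made up for illustration): 500 majority points versus 10 minority points.
rng = random.Random(42)
majority = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(500)]
minority = [[rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)] for _ in range(10)]

X, y = rebalance(majority, minority)
print(len(X), y.count(0), y.count(1))  # prints: 20 10 10
```

A larger codebook keeps more of the majority class's structure at the cost of a less balanced training set, which is the trade-off between information loss and undersampling that the abstract mentions.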

Bibliographic Details
Main authors: Yu, Ting; Debenham, John; Jan, Tony; Simoff, Simeon
Format: Conference object
Language: English
Published: 2006
Subjects: Computer Science; Databases
Online access: http://sedici.unlp.edu.ar/handle/10915/23866
id I19-R120-10915-23866
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Databases
spellingShingle Computer Science
Databases
Yu, Ting
Debenham, John
Jan, Tony
Simoff, Simeon
Combine vector quantization and support vector machine for imbalanced datasets
topic_facet Computer Science
Databases
description In cases of extremely imbalanced, high-dimensional datasets, standard machine learning techniques tend to be overwhelmed by the large classes. This paper rebalances skewed datasets by compressing the majority class. The approach combines Vector Quantization and Support Vector Machine into a new method, VQ-SVM, that rebalances datasets without significant information loss. Issues such as distortion and support vectors are discussed to address the trade-off between information loss and undersampling. Experiments compare VQ-SVM and standard SVM on imbalanced datasets with varied imbalance ratios; the results show that VQ-SVM outperforms SVM, especially on extremely imbalanced large datasets.
format Conference object
Conference object
author Yu, Ting
Debenham, John
Jan, Tony
Simoff, Simeon
author_facet Yu, Ting
Debenham, John
Jan, Tony
Simoff, Simeon
author_sort Yu, Ting
title Combine vector quantization and support vector machine for imbalanced datasets
title_short Combine vector quantization and support vector machine for imbalanced datasets
title_full Combine vector quantization and support vector machine for imbalanced datasets
title_fullStr Combine vector quantization and support vector machine for imbalanced datasets
title_full_unstemmed Combine vector quantization and support vector machine for imbalanced datasets
title_sort combine vector quantization and support vector machine for imbalanced datasets
publishDate 2006
url http://sedici.unlp.edu.ar/handle/10915/23866
work_keys_str_mv AT yuting combinevectorquantizationandsupportvectormachineforimbalanceddatasets
AT debenhamjohn combinevectorquantizationandsupportvectormachineforimbalanceddatasets
AT jantony combinevectorquantizationandsupportvectormachineforimbalanceddatasets
AT simoffsimeon combinevectorquantizationandsupportvectormachineforimbalanceddatasets
bdutipo_str Repositories
_version_ 1764820466346754050