On the statistical comparison of feature selection methods and the role of experts: the case of Las Vegas strip

A statistical comparison of feature selection methods is performed. Feature selection is an important issue in Data Mining and Data Science, and the results obtained from different methods are hard to compare. The evaluation of metrics and ways of comparison is therefore an importan...

Bibliographic Details
Main authors: Barraza, Néstor Rubén; Moreno, Antonio A.
Format: Conference object
Language: English
Published: 2020
Subjects:
Online access: http://sedici.unlp.edu.ar/handle/10915/116426
http://49jaiio.sadio.org.ar/pdfs/asai/ASAI-02.pdf
Contributed by:
id I19-R120-10915-116426
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
Big data
Feature selection
Wrapper
Filtered
Lasso
Expert role
description A statistical comparison of feature selection methods is performed. Feature selection is an important issue in Data Mining and Data Science, and the results obtained from different methods are hard to compare. The evaluation of metrics and ways of comparison is therefore an important matter of study. Our study is performed on a real dataset with a small number of records that was previously analyzed in the literature, drawing attention to the conclusions that can be applied where only poor levels of statistical significance can be obtained because of the relatively low number of samples. The use of inter-rater agreement coefficients is introduced as a novel approach, extending a previous study. Boruta and tree-based methodologies are shown to perform rather well even on small data. Our metrics can be used to guide expert opinion in making the final decision. This work extends the results obtained in a previous analysis of the same dataset.
format Conference object
author Barraza, Néstor Rubén
Moreno, Antonio A.
title On the statistical comparison of feature selection methods and the role of experts: the case of Las Vegas strip
publishDate 2020
url http://sedici.unlp.edu.ar/handle/10915/116426
http://49jaiio.sadio.org.ar/pdfs/asai/ASAI-02.pdf
bdutipo_str Repositories