On the class distribution labelling step sensitivity of co-training

Co-training can learn from datasets having a small number of labelled examples and a large number of unlabelled ones. It is an iterative algorithm where examples labelled in previous iterations are used to improve the classification of examples from the unlabelled set. However, as the number of ini...

Bibliographic Details
Main authors: Matsubara, Edson T., Monard, Maria C., Prati, Ronaldo
Format: Conference object
Language: English
Published: 2006
Subjects:
Online access: http://sedici.unlp.edu.ar/handle/10915/23900
Contributed by:
id I19-R120-10915-23900
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Computer Science
iterative algorithm
label
challenging domains
spellingShingle Computer Science
iterative algorithm
label
challenging domains
Matsubara, Edson T.
Monard, Maria C.
Prati, Ronaldo
On the class distribution labelling step sensitivity of co-training
topic_facet Computer Science
iterative algorithm
label
challenging domains
description Co-training can learn from datasets that have a small number of labelled examples and a large number of unlabelled ones. It is an iterative algorithm in which examples labelled in previous iterations are used to improve the classification of examples from the unlabelled set. However, because the number of initial labelled examples is often small, we do not have reliable estimates of the underlying population that generated the data. In this work we claim that the proportion in which examples are labelled is a key parameter of co-training. We also report a series of experiments investigating how the proportion in which examples are labelled at each step influences co-training performance. The results show that co-training should be used with care in challenging domains.
format Conference object
Conference object
author Matsubara, Edson T.
Monard, Maria C.
Prati, Ronaldo
author_facet Matsubara, Edson T.
Monard, Maria C.
Prati, Ronaldo
author_sort Matsubara, Edson T.
title On the class distribution labelling step sensitivity of co-training
title_short On the class distribution labelling step sensitivity of co-training
title_full On the class distribution labelling step sensitivity of co-training
title_fullStr On the class distribution labelling step sensitivity of co-training
title_full_unstemmed On the class distribution labelling step sensitivity of co-training
title_sort on the class distribution labelling step sensitivity of co-training
publishDate 2006
url http://sedici.unlp.edu.ar/handle/10915/23900
work_keys_str_mv AT matsubaraedsont ontheclassdistributionlabellingstepsensitivityofcotraining
AT monardmariac ontheclassdistributionlabellingstepsensitivityofcotraining
AT pratironaldo ontheclassdistributionlabellingstepsensitivityofcotraining
bdutipo_str Repositories
_version_ 1764820466383454209