On the class distribution labelling step sensitivity of co-training
Co-training can learn from datasets having a small number of labelled examples and a large number of unlabelled ones. It is an iterative algorithm where examples labelled in previous iterations are used to improve the classification of examples from the unlabelled set. However, as the number of ini...
| Main authors: | Matsubara, Edson T.; Monard, Maria C.; Prati, Ronaldo |
|---|---|
| Format: | Conference object |
| Language: | English |
| Published: | 2006 |
| Subjects: | |
| Online access: | http://sedici.unlp.edu.ar/handle/10915/23900 |
| Contributed by: | |

| id | I19-R120-10915-23900 |
|---|---|
| record_format | dspace |
| institution | Universidad Nacional de La Plata |
| institution_str | I-19 |
| repository_str | R-120 |
| collection | SEDICI (UNLP) |
| language | English |
| topic | Computer Science iterative algorithm label challenging domains |
| spellingShingle | Computer Science iterative algorithm label challenging domains Matsubara, Edson T. Monard, Maria C. Prati, Ronaldo On the class distribution labelling step sensitivity of co-training |
| topic_facet | Computer Science iterative algorithm label challenging domains |
| description | Co-training can learn from datasets that have a small number of labelled examples and a large number of unlabelled ones. It is an iterative algorithm in which examples labelled in previous iterations are used to improve the classification of examples from the unlabelled set. However, because the number of initial labelled examples is often small, we do not have reliable estimates of the underlying population that generated the data. In this work we claim that the proportion in which examples are labelled is a key parameter of co-training. We also carried out a series of experiments to investigate how the proportion in which examples are labelled at each step influences co-training performance. The results show that co-training should be used with care in challenging domains. |
| format | Conference object |
| author | Matsubara, Edson T.; Monard, Maria C.; Prati, Ronaldo |
| author_facet | Matsubara, Edson T.; Monard, Maria C.; Prati, Ronaldo |
| author_sort | Matsubara, Edson T. |
| title | On the class distribution labelling step sensitivity of co-training |
| title_short | On the class distribution labelling step sensitivity of co-training |
| title_full | On the class distribution labelling step sensitivity of co-training |
| title_fullStr | On the class distribution labelling step sensitivity of co-training |
| title_full_unstemmed | On the class distribution labelling step sensitivity of co-training |
| title_sort | on the class distribution labelling step sensitivity of co-training |
| publishDate | 2006 |
| url | http://sedici.unlp.edu.ar/handle/10915/23900 |
| work_keys_str_mv | AT matsubaraedsont ontheclassdistributionlabellingstepsensitivityofcotraining AT monardmariac ontheclassdistributionlabellingstepsensitivityofcotraining AT pratironaldo ontheclassdistributionlabellingstepsensitivityofcotraining |
| bdutipo_str | Repositories |
| _version_ | 1764820466383454209 |
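
The abstract describes co-training as an iterative loop in which the number of examples labelled per class at each step is a tunable parameter. The following is a minimal sketch of that loop, not the authors' implementation: the nearest-centroid classifier is a stand-in base learner chosen only to keep the example self-contained, and the names `fit_centroids`, `predict`, `co_train` and `per_class` are hypothetical. It assumes two feature views of the same examples and a seed set containing at least two classes.

```python
import random
from statistics import mean


def fit_centroids(X, y):
    """Stand-in base learner: one mean feature vector per class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Return (nearest class, margin to the second-nearest class)."""
    dist = {c: sum((a - b) ** 2 for a, b in zip(x, mu))
            for c, mu in centroids.items()}
    ranked = sorted(dist, key=dist.get)
    return ranked[0], dist[ranked[1]] - dist[ranked[0]]


def co_train(view_a, view_b, labelled, per_class, iterations):
    """Grow the labelled set; per_class[c] is how many examples of class c
    each view's classifier labels per iteration (the labelling proportion)."""
    labelled = dict(labelled)  # index -> label; the caller's dict is not modified
    for _ in range(iterations):
        for view in (view_a, view_b):
            idx = sorted(labelled)
            model = fit_centroids([view[i] for i in idx],
                                  [labelled[i] for i in idx])
            pool = [i for i in range(len(view)) if i not in labelled]
            scored = [(i, *predict(model, view[i])) for i in pool]
            for cls, quota in per_class.items():
                # Most confident unlabelled examples predicted as this class.
                ranked = sorted((s for s in scored if s[1] == cls),
                                key=lambda s: -s[2])
                for i, label, _margin in ranked[:quota]:
                    labelled[i] = label
    return labelled


# Synthetic, well-separated data (an assumption for illustration only).
rng = random.Random(0)
truth = [0] * 30 + [1] * 30
view_a = [[5 * t + rng.uniform(-1, 1)] for t in truth]
view_b = [[5 * t + rng.uniform(-1, 1)] for t in truth]
seed = {0: 0, 30: 1}  # one labelled example per class

# Label two class-0 examples for every class-1 example at each step.
result = co_train(view_a, view_b, seed, per_class={0: 2, 1: 1}, iterations=3)
```

Changing `per_class` changes the class distribution of the growing labelled set independently of the true distribution in the data, which is the sensitivity the record's abstract refers to.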