Towards Information Quality Assurance in Spanish: Wikipedia
Featured Articles (FA) are considered the best articles that Wikipedia has to offer, and in recent years researchers have examined whether and how they can be distinguished from “ordinary” articles. Likewise, identifying what issues have to be enhanced or fixed in ordinar...
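Main authors: | Ferretti, Edgardo; Soria, Matías; Pérez Casseignau, Sebastián; Pohn, Lian; Urquiza, Guido; Gómez, Sergio Alejandro; Errecalde, Marcelo Luis |
---|---|
Format: | Article |
Language: | English |
Published: | 2017 |
Subjects: | Computer Science; featured article identification; information quality; quality flaws prediction; Wikipedia |
Online access: | http://sedici.unlp.edu.ar/handle/10915/59979 http://journal.info.unlp.edu.ar/wp-content/uploads/2017/05/JCST-44-Paper-4.pdf |
Contributed by: |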
id | I19-R120-10915-59979 |
---|---|
record_format | dspace |
institution | Universidad Nacional de La Plata |
institution_str | I-19 |
repository_str | R-120 |
collection | SEDICI (UNLP) |
language | English |
topic | Computer Science; featured article identification; information quality; quality flaws prediction; Wikipedia |
spellingShingle | Computer Science; featured article identification; information quality; quality flaws prediction; Wikipedia; Ferretti, Edgardo; Soria, Matías; Pérez Casseignau, Sebastián; Pohn, Lian; Urquiza, Guido; Gómez, Sergio Alejandro; Errecalde, Marcelo Luis; Towards Information Quality Assurance in Spanish: Wikipedia |
topic_facet | Computer Science; featured article identification; information quality; quality flaws prediction; Wikipedia |
description | Featured Articles (FA) are considered the best articles that Wikipedia has to offer, and in recent years researchers have examined whether and how they can be distinguished from “ordinary” articles. Likewise, identifying which issues must be fixed or improved in ordinary articles to raise their quality is a recent key research trend. Most approaches developed to address these information quality problems have been proposed for the English Wikipedia. However, few efforts have been made for the Spanish Wikipedia, even though Spanish is one of the most widely spoken languages in the world by number of native speakers. In this respect, we present a breakdown of Spanish Wikipedia’s quality flaw structure. In addition, we carry out studies with three different corpora to automatically assess information quality in Spanish Wikipedia, where FA identification is evaluated as a binary classification task. Our evaluation in a unified setting allows the performance achieved by our approach on the Spanish version to be compared with that on the English version. The best results show that FA identification in Spanish can be performed with an F1 score of 0.88 using a document model consisting of only twenty-six features and a Support Vector Machine as the classification algorithm. |
format | Article |
author | Ferretti, Edgardo; Soria, Matías; Pérez Casseignau, Sebastián; Pohn, Lian; Urquiza, Guido; Gómez, Sergio Alejandro; Errecalde, Marcelo Luis |
author_facet | Ferretti, Edgardo; Soria, Matías; Pérez Casseignau, Sebastián; Pohn, Lian; Urquiza, Guido; Gómez, Sergio Alejandro; Errecalde, Marcelo Luis |
author_sort | Ferretti, Edgardo |
title | Towards Information Quality Assurance in Spanish: Wikipedia |
title_short | Towards Information Quality Assurance in Spanish: Wikipedia |
title_full | Towards Information Quality Assurance in Spanish: Wikipedia |
title_fullStr | Towards Information Quality Assurance in Spanish: Wikipedia |
title_full_unstemmed | Towards Information Quality Assurance in Spanish: Wikipedia |
title_sort | towards information quality assurance in spanish: wikipedia |
publishDate | 2017 |
url | http://sedici.unlp.edu.ar/handle/10915/59979 http://journal.info.unlp.edu.ar/wp-content/uploads/2017/05/JCST-44-Paper-4.pdf |
work_keys_str_mv | AT ferrettiedgardo towardsinformationqualityassuranceinspanishwikipedia AT soriamatias towardsinformationqualityassuranceinspanishwikipedia AT perezcasseignausebastian towardsinformationqualityassuranceinspanishwikipedia AT pohnlian towardsinformationqualityassuranceinspanishwikipedia AT urquizaguido towardsinformationqualityassuranceinspanishwikipedia AT gomezsergioalejandro towardsinformationqualityassuranceinspanishwikipedia AT errecaldemarceloluis towardsinformationqualityassuranceinspanishwikipedia |
bdutipo_str | Repositories |
_version_ | 1764820478186225665 |
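
The description in this record reports FA identification as a binary classification task scored by F1. As a rough illustration of what that metric measures, the sketch below computes F1 for the positive class from a pair of label lists. The function name `f1_score_binary` and the toy labels are hypothetical and are not taken from the paper; the authors' corpora, features, and classifier are not reproduced here.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return F1 for the positive class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = featured article, 0 = ordinary article).
# These are illustrative only and are not data from the paper.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
score = f1_score_binary(y_true, y_pred)
print(round(score, 2))
```

With these toy labels there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75 and so is F1.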