Towards Smart Data Technologies for Big Data Analytics

Bibliographic Details
Main authors: Basgall, María José; Naiouf, Marcelo; Herrera, Francisco; Fernández, Alberto
Format: Conference object
Language: English
Published: 2020
Online access: http://sedici.unlp.edu.ar/handle/10915/104775
Description
Summary: Currently, the publicly available datasets for Big Data Analytics vary in quality, yet obtaining the expected behavior from Machine Learning algorithms is crucial. Furthermore, since working with a huge amount of data is usually a time-demanding task, high-quality data is required. Smart Data refers to the process of transforming Big Data into clean and reliable data. This can be accomplished by converting the data, reducing unnecessary data volume, or applying preprocessing techniques with the aim of improving data quality while still obtaining trustworthy results. We present the properties that affect the quality of data. We also discuss the available proposals for analyzing the quality of huge amounts of data and for coping with low-quality datasets in a scalable way. Furthermore, the need for a methodology towards Smart Data is highlighted.