Job Schedulers for Machine Learning and Data Mining algorithms distributed in Hadoop

The standard scheduler of Hadoop does not consider the characteristics of jobs such as computational demand, inputs / outputs, dependencies, location of the data, etc., which could be a valuable source to allocate resources to jobs in order to optimize their use. The objective of this research is to...

Descripción completa

Guardado en:
Detalles Bibliográficos
Autores principales: Cornejo, Félix Martín, Zunino, Alejandro, Murazzo, María Antonia
Formato: Objeto de conferencia
Lenguaje:Inglés
Publicado: 2018
Materias:
Acceso en línea:http://sedici.unlp.edu.ar/handle/10915/69919
Aporte de:
Descripción
Sumario:The standard scheduler of Hadoop does not consider the characteristics of jobs such as computational demand, inputs / outputs, dependencies, location of the data, etc., which could be a valuable source to allocate resources to jobs in order to optimize their use. The objective of this research is to take advantage of this information for planning, limiting the scope to ML / DM algorithms, in order to improve the execution times with respect to existing schedulers. The aim is to improve Hadoop job schedulers, seeking to optimize the execution times of machine learning and data mining algorithms in Clusters.