Comparison of variable selection procedures to model weather-pathogen relation in crops
Large volumes of georeferenced climatic data are now easily accessible. These data can be used to model the relationship between climatic conditions and disease, which requires multiple meteorological variables that are usually correlated and redundant. The selection of variables allows the identificati...
Main authors: | Suarez, Franco Marcelo; Bruno, Cecilia; Giménez Pecci, María de la Paz; Balzarini, Mónica |
---|---|
Format: | Journal article |
Language: | Spanish |
Published: | Facultad de Ciencias Agropecuarias, 2024 |
Subjects: | LASSO; Stepwise; Boruta; logistic regression |
Online access: | https://revistas.unc.edu.ar/index.php/agris/article/view/40871 |
Contributed by: |
id |
I10-R352-article-40871 |
---|---|
record_format |
ojs |
spelling |
I10-R352-article-40871; 2024-07-10T00:40:59Z.
Title: Comparison of variable selection procedures to model weather-pathogen relation in crops. Original title: Comparación de procedimientos de selección de variables para la modelación de la relación clima-patógenos en cultivos.
Authors: Suarez, Franco Marcelo; Bruno, Cecilia; Giménez Pecci, María de la Paz; Balzarini, Mónica.
Keywords: LASSO; Stepwise; Boruta; logistic regression.
Abstract: Large volumes of georeferenced climatic data are now easily accessible. These data can be used to model the relationship between climatic conditions and disease, which requires multiple meteorological variables that are usually correlated and redundant. Variable selection identifies a subset of relevant regressors for building predictive models. Stepwise, Boruta, and LASSO are variable selection procedures of different natures, so their relative performance has been scarcely explored. The objective of this work was to compare these methods, applied simultaneously, in building regression models to predict disease risk from climatic data. Three georeferenced databases with presence/absence values of different pathogens in maize crops in Argentina were used. For each scenario, climatic variables were obtained for the period from before sowing until harvest. All three variable selection methods yielded predictive models with classification accuracy close to 70 %. However, LASSO produced the best predictions, selecting an intermediate number of variables: more than Stepwise and fewer than Boruta. The results could be extended to other pathosystems and contribute to the construction of alarm systems based on climatic variables.
Publisher: Facultad de Ciencias Agropecuarias. Date: 2024-01-05. Type: info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion. Formats: text/html; application/pdf.
Identifier: https://revistas.unc.edu.ar/index.php/agris/article/view/40871; 10.31047/1668.298x.v40.n2.40871.
Source: AgriScientia; Vol. 40 No. 2 (2023); 37-48; 1668-298X; 10.31047/1668.298x.v40.n2. Language: spa.
Files: https://revistas.unc.edu.ar/index.php/agris/article/view/40871/45750; https://revistas.unc.edu.ar/index.php/agris/article/view/40871/44495.
Rights: Copyright 2024 Franco Marcelo Suarez, Cecilia Bruno, María de la Paz Giménez Pecci, Mónica Balzarini; https://creativecommons.org/licenses/by-sa/4.0 |
institution |
Universidad Nacional de Córdoba |
institution_str |
I-10 |
repository_str |
R-352 |
container_title_str |
AgriScientia |
language |
Spanish |
format |
Journal article |
topic |
LASSO; Stepwise; Boruta; logistic regression |
spellingShingle |
LASSO; Stepwise; Boruta; logistic regression. Suarez, Franco Marcelo; Bruno, Cecilia; Giménez Pecci, María de la Paz; Balzarini, Mónica. Comparison of variable selection procedures to model weather-pathogen relation in crops |
topic_facet |
LASSO; Stepwise; Boruta; logistic regression |
author |
Suarez, Franco Marcelo; Bruno, Cecilia; Giménez Pecci, María de la Paz; Balzarini, Mónica |
author_facet |
Suarez, Franco Marcelo; Bruno, Cecilia; Giménez Pecci, María de la Paz; Balzarini, Mónica |
author_sort |
Suarez, Franco Marcelo |
title |
Comparison of variable selection procedures to model weather-pathogen relation in crops |
title_short |
Comparison of variable selection procedures to model weather-pathogen relation in crops |
title_full |
Comparison of variable selection procedures to model weather-pathogen relation in crops |
title_fullStr |
Comparison of variable selection procedures to model weather-pathogen relation in crops |
title_full_unstemmed |
Comparison of variable selection procedures to model weather-pathogen relation in crops |
title_sort |
comparison of variable selection procedures to model weather-pathogen relation in crops |
description |
Large volumes of georeferenced climatic data are now easily accessible. These data can be used to model the relationship between climatic conditions and disease, which requires multiple meteorological variables that are usually correlated and redundant. Variable selection identifies a subset of relevant regressors for building predictive models. Stepwise, Boruta, and LASSO are variable selection procedures of different natures, so their relative performance has been scarcely explored. The objective of this work was to compare these methods, applied simultaneously, in building regression models to predict disease risk from climatic data. Three georeferenced databases with presence/absence values of different pathogens in maize crops in Argentina were used. For each scenario, climatic variables were obtained for the period from before sowing until harvest. All three variable selection methods yielded predictive models with classification accuracy close to 70 %. However, LASSO produced the best predictions, selecting an intermediate number of variables: more than Stepwise and fewer than Boruta. The results could be extended to other pathosystems and contribute to the construction of alarm systems based on climatic variables. |
publisher |
Facultad de Ciencias Agropecuarias |
publishDate |
2024 |
url |
https://revistas.unc.edu.ar/index.php/agris/article/view/40871 |
work_keys_str_mv |
AT suarezfrancomarcelo comparisonofvariableselectionprocedurestomodelweatherpathogenrelationincrops AT brunocecilia comparisonofvariableselectionprocedurestomodelweatherpathogenrelationincrops AT gimenezpeccimariadelapaz comparisonofvariableselectionprocedurestomodelweatherpathogenrelationincrops AT balzarinimonica comparisonofvariableselectionprocedurestomodelweatherpathogenrelationincrops AT suarezfrancomarcelo comparaciondeprocedimientosdeselecciondevariablesparalamodelaciondelarelacionclimapatogenosencultivos AT brunocecilia comparaciondeprocedimientosdeselecciondevariablesparalamodelaciondelarelacionclimapatogenosencultivos AT gimenezpeccimariadelapaz comparaciondeprocedimientosdeselecciondevariablesparalamodelaciondelarelacionclimapatogenosencultivos AT balzarinimonica comparaciondeprocedimientosdeselecciondevariablesparalamodelaciondelarelacionclimapatogenosencultivos |
first_indexed |
2024-09-03T22:16:40Z |
last_indexed |
2024-09-03T22:16:40Z |
_version_ |
1809214918032883712 |
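
The abstract in this record describes LASSO as a variable selection procedure for logistic regression on correlated, redundant climatic predictors. The sketch below illustrates that general idea only; it is not the authors' implementation, and the record does not state which software they used. Everything in it is an assumption made for illustration: the data are synthetic, the variable names (`temperature`, `humidity`, `temperature_copy`, `noise_a`, `noise_b`) are hypothetical, and the solver is a hand-written proximal gradient loop rather than a library routine.

```python
import math
import random

# Hypothetical predictor names; they do not come from the article.
FEATURES = ["temperature", "humidity", "temperature_copy", "noise_a", "noise_b"]


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal operator of the L1 norm)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def make_synthetic_data(n=400, seed=0):
    """Synthetic presence/absence data with one redundant and two irrelevant predictors."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        temperature = rng.gauss(0, 1)
        humidity = rng.gauss(0, 1)
        temperature_copy = temperature + rng.gauss(0, 0.1)  # nearly redundant
        noise_a, noise_b = rng.gauss(0, 1), rng.gauss(0, 1)  # unrelated to disease
        logit = 1.5 * temperature - 1.5 * humidity  # assumed true relationship
        y.append(1 if rng.random() < sigmoid(logit) else 0)
        X.append([temperature, humidity, temperature_copy, noise_a, noise_b])
    return X, y


def fit_lasso_logistic(X, y, penalty, step=0.5, iterations=300):
    """L1-penalized logistic regression fitted by proximal gradient descent.

    Returns (intercept, coefficients). The intercept is not penalized.
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(iterations):
        grad0, grad = 0.0, [0.0] * p
        for row, target in zip(X, y):
            linear = intercept + sum(c * v for c, v in zip(coefs, row))
            err = sigmoid(linear) - target
            grad0 += err / n
            for j in range(p):
                grad[j] += err * row[j] / n
        intercept -= step * grad0
        # Gradient step followed by soft-thresholding sets weak coefficients to exactly zero.
        coefs = [soft_threshold(c - step * g, step * penalty) for c, g in zip(coefs, grad)]
    return intercept, coefs


def select_variables(names, coefs):
    """Names of the variables whose coefficient was not shrunk to zero."""
    return [name for name, c in zip(names, coefs) if c != 0.0]


if __name__ == "__main__":
    X, y = make_synthetic_data()
    for penalty in (0.01, 0.05, 0.2):
        _, coefs = fit_lasso_logistic(X, y, penalty)
        print(penalty, select_variables(FEATURES, coefs))
```

Running the script prints the variables retained at each penalty value. Larger penalties retain fewer variables, which is the mechanism by which LASSO can land between a sparser and a more inclusive selection procedure; in practice the penalty would typically be chosen by cross-validation rather than fixed by hand.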