Robust estimation of multivariate location and scatter in the presence of missing data

Two main issues regarding data quality are data contamination (outliers) and data completion (missing data). These two problems have attracted much attention and research but surprisingly, they are seldom considered together. Popular robust methods such as S-estimators of multivariate location and s...


Bibliographic Details
Main authors: Danilov, M., Yohai, V.J., Zamar, R.H.
Format: JOUR
Subjects: Consistent; Elliptical distribution; EM algorithm; Fixed point equation
Online access: http://hdl.handle.net/20.500.12110/paper_01621459_v107_n499_p1178_Danilov
Contributed by:
id todo:paper_01621459_v107_n499_p1178_Danilov
record_format dspace
spelling todo:paper_01621459_v107_n499_p1178_Danilov 2023-10-03T15:01:31Z JOUR info:eu-repo/semantics/openAccess http://creativecommons.org/licenses/by/2.5/ar
institution Universidad de Buenos Aires
institution_str I-28
repository_str R-134
collection Biblioteca Digital - Facultad de Ciencias Exactas y Naturales (UBA)
topic Consistent
Elliptical distribution
EM algorithm
Fixed point equation
description Two main issues regarding data quality are data contamination (outliers) and data completion (missing data). These two problems have attracted much attention and research but surprisingly, they are seldom considered together. Popular robust methods such as S-estimators of multivariate location and scatter offer protection against outliers but cannot deal with missing data, except for the obviously inefficient approach of deleting all incomplete cases. We generalize the definition of S-estimators of multivariate location and scatter to simultaneously deal with missing data and outliers. We show that the proposed estimators are strongly consistent under elliptical models when data are missing completely at random. We derive an algorithm similar to the Expectation-Maximization algorithm for computing the proposed estimators. This algorithm is initialized by an extension for missing data of the minimum volume ellipsoid. We assess the performance of our proposal by Monte Carlo simulation and give some real data examples. This article has supplementary material online. © 2012 American Statistical Association.
format JOUR
author Danilov, M.
Yohai, V.J.
Zamar, R.H.
title Robust estimation of multivariate location and scatter in the presence of missing data
url http://hdl.handle.net/20.500.12110/paper_01621459_v107_n499_p1178_Danilov
_version_ 1807321794916384768