Multivariate location and scatter matrix estimation under cellwise and casewise contamination

Real data may contain both cellwise outliers and casewise outliers. There is a vast literature on robust estimation for casewise outliers, but only a scant literature for cellwise outliers and almost none for both types of outliers. Estimation of multivariate location and scatter matrix is a corner...

Bibliographic Details
Main author: Leung, A.
Other authors: Yohai, V., Zamar, R.
Format: Book chapter
Language: English
Published: Elsevier B.V. 2017
Online access: Scopus record
DOI
Handle
Digital Library record
Contributed by: Reference record: Request the resource here
LEADER 07723caa a22007217a 4500
001 PAPER-14879
003 AR-BaUEN
005 20230518204529.0
008 190410s2017 xx ||||fo|||| 00| 0 eng|d
024 7 |2 scopus  |a 2-s2.0-85013669307 
040 |a Scopus  |b spa  |c AR-BaUEN  |d AR-BaUEN 
030 |a CSDAD 
100 1 |a Leung, A. 
245 1 0 |a Multivariate location and scatter matrix estimation under cellwise and casewise contamination 
260 |b Elsevier B.V.  |c 2017 
270 1 0 |m Leung, A.; Department of Statistics, University of British Columbia, 3182-2207 Main Mall, Vancouver, Canada; email: andy.leung@stat.ubc.ca 
506 |2 openaire  |e Editorial policy 
504 |a Agostinelli, C., Leung, A., Yohai, V.J., Zamar, R.H., Rejoinder on: Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination (2015) TEST, 24 (3), pp. 484-488 
504 |a Agostinelli, C., Leung, A., Yohai, V.J., Zamar, R.H., Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination (2015) TEST, 24 (3), pp. 441-461 
504 |a Alqallaf, F.A., Konis, K.P., Martin, R.D., Zamar, R.H., Scalable robust covariance and correlation estimates for data mining (2002) Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’02), pp. 14-23 
504 |a Alqallaf, F., Van Aelst, S., Yohai, V.J., Zamar, R.H., Propagation of outliers in multivariate data (2009) Ann. Statist., 37 (1), pp. 311-331 
504 |a Danilov, M., Yohai, V.J., Zamar, R.H., Robust estimation of multivariate location and scatter in the presence of missing data (2012) J. Amer. Statist. Assoc., 107, pp. 1178-1186 
504 |a Farcomeni, A., Robust constrained clustering in presence of entry-wise outliers (2014) Technometrics, 56, pp. 102-111 
504 |a Friedman, J., Hastie, T., Tibshirani, R., Sparse inverse covariance estimation with the graphical lasso (2008) Biostatistics, 9 (3), pp. 432-441 
504 |a Gnanadesikan, R., Kettenring, J.R., Robust estimates, residuals, and outlier detection with multiresponse data (1972) Biometrics, 28, pp. 81-124 
504 |a Hall, P., Marron, J., Neeman, A., Geometric representation of high dimension, low sample size data (2005) J. R. Stat. Soc. Ser. B Stat. Methodol., 67, pp. 427-444 
504 |a Leung, A., Danilov, M., Yohai, V., Zamar, R., GSE: Robust Estimation in the Presence of Cellwise and Casewise Contamination and Missing Data (2015), R package version 3.2.3 
504 |a Maronna, R.A., Comments on: Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination (2015) TEST, 24 (3), pp. 471-472 
504 |a Maronna, R.A., Martin, R.D., Yohai, V.J., Robust Statistics: Theory and Methods (2006), John Wiley & Sons, Chichester 
504 |a Maronna, R.A., Yohai, V.J., Robust and efficient estimation of high dimensional scatter and location (2015) 
504 |a Martin, R., Robust covariances: Common risk versus specific risk outliers (2013), presented at the 2013 R-Finance Conference, Chicago, IL, www.rinfinance.com/agenda/2013/talk/DougMartin.pdf (visited 2016-08-24) 
504 |a Peña, D., Prieto, F.J., Multivariate outlier detection and robust covariance matrix estimation (2001) Technometrics, 43, pp. 286-310 
504 |a Rocke, D.M., Robustness properties of S-estimators of multivariate location and shape in high dimension (1996) Ann. Statist., 24, pp. 1327-1345 
504 |a Rousseeuw, P.J., Croux, C., Alternatives to the median absolute deviation (1993) J. Amer. Statist. Assoc., 88, pp. 1273-1283 
504 |a Rousseeuw, P.J., Van den Bossche, W., Comments on: Robust estimation of multivariate location and scatter in the presence of cellwise and casewise contamination (2015) TEST, 24 (3), pp. 473-477 
504 |a Rousseeuw, P.J., Van den Bossche, W., Detecting deviating data cells (2016), [stat.ME] 
504 |a Van Aelst, S., Vandervieren, E., Willems, G., A Stahel-Donoho estimator based on Huberized outlyingness (2012) Comput. Statist. Data Anal., 56, pp. 531-542 
520 3 |a Real data may contain both cellwise outliers and casewise outliers. There is a vast literature on robust estimation for casewise outliers, but only a scant literature for cellwise outliers and almost none for both types of outliers. Estimation of multivariate location and scatter matrix is a cornerstone in multivariate data analysis. A two-step approach was recently proposed to perform robust estimation of multivariate location and scatter matrix in the presence of cellwise and casewise outliers. In the first step a univariate filter was applied to remove cellwise outliers. In the second step a generalized S-estimator was used to downweight casewise outliers. This proposal can be further improved in three main directions. First, through the introduction of a consistent bivariate filter to be used in combination with the univariate filter in the first step. Second, through the proposal of a new fast subsampling procedure to generate starting points for the generalized S-estimator in the second step. Third, through the use of a non-monotonic weight function for the generalized S-estimator to better handle casewise outliers in high dimension. A simulation study and a real data example show that, unlike the original two-step procedure, the modified two-step approach performs and scales well in high dimension. Moreover, they show that the modified procedure outperforms the original one and other state-of-the-art robust procedures under cellwise and casewise data contamination. © 2017 Elsevier B.V.  |l eng 
536 |a Funding details: Natural Sciences and Engineering Research Council of Canada, RGPIN-2014-05227 
536 |a Funding details: The research of Ruben Zamar and Andy Leung was partially funded by the Natural Sciences and Engineering Research Council of Canada (Grant No. RGPIN-2014-05227). 
593 |a Department of Statistics, University of British Columbia, 3182-2207 Main Mall, Vancouver, British Columbia, V6T 1Z4, Canada 
593 |a Departamento de Matemática, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pabellón 1, Buenos Aires, 1426, Argentina 
690 1 0 |a CELLWISE OUTLIERS 
690 1 0 |a COMPONENTWISE CONTAMINATION 
690 1 0 |a MULTIVARIATE LOCATION AND SCATTER 
690 1 0 |a ROBUST ESTIMATION 
690 1 0 |a LOCATION 
690 1 0 |a MATRIX ALGEBRA 
690 1 0 |a MULTIVARIANT ANALYSIS 
690 1 0 |a CELLWISE OUTLIERS 
690 1 0 |a COMPONENTWISE 
690 1 0 |a MULTIVARIATE DATA ANALYSIS 
690 1 0 |a ROBUST ESTIMATION 
690 1 0 |a ROBUST PROCEDURES 
690 1 0 |a SIMULATION STUDIES 
690 1 0 |a TWO-STEP APPROACH 
690 1 0 |a TWO-STEP PROCEDURE 
690 1 0 |a STATISTICS 
700 1 |a Yohai, V. 
700 1 |a Zamar, R. 
773 0 |d Elsevier B.V., 2017  |g v. 111  |h pp. 59-76  |p Comput. Stat. Data Anal.  |x 01679473  |w (AR-BaUEN)CENRE-4276  |t Computational Statistics and Data Analysis 
856 4 1 |u https://www.scopus.com/inward/record.uri?eid=2-s2.0-85013669307&doi=10.1016%2fj.csda.2017.02.007&partnerID=40&md5=e0a14817f35f4bb98df80af5bd50b460  |y Registro en Scopus 
856 4 0 |u https://doi.org/10.1016/j.csda.2017.02.007  |y DOI 
856 4 0 |u https://hdl.handle.net/20.500.12110/paper_01679473_v111_n_p59_Leung  |y Handle 
856 4 0 |u https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_01679473_v111_n_p59_Leung  |y Registro en la Biblioteca Digital 
961 |a paper_01679473_v111_n_p59_Leung  |b paper  |c PE 
962 |a info:eu-repo/semantics/article  |a info:ar-repo/semantics/artículo  |b info:eu-repo/semantics/publishedVersion 
999 |c 75832