Quality control of genotypes using heritability estimates of gene content at the marker

Quality control filtering of single-nucleotide polymorphisms (SNPs) is a key step when analyzing genomic data. Here we present a practical method to identify low-quality SNPs, meaning markers whose genotypes are wrongly assigned for a large proportion of individuals, by estimating the heritability o...

Bibliographic Details
Other Authors: Forneris, Natalia Soledad; Legarra, Andres; Vitezica, Zulma Gladis; Tsuruta, Shogo; Aguilar, Ignacio; Misztal, Ignacy; Cantet, Rodolfo Juan Carlos
Format: Article
Language: English
Subjects:
SNP
Online access: http://ri.agro.uba.ar/files/intranet/articulo/2015forneris1.pdf
LEADER 04186nab a22004697a 4500
001 20190506153019.0
003 AR-BaUFA
005 20220502145431.0
008 190506t2015 xxud||||o|||| 00| | eng d
999 |c 46309  |d 46309 
999 |d 46309 
022 |a 1943-2631 
024 |a 10.1534/genetics.114.173559 
040 |a AR-BaUFA 
245 1 |a Quality control of genotypes using heritability estimates of gene content at the marker 
520 |a Quality control filtering of single-nucleotide polymorphisms (SNPs) is a key step when analyzing genomic data. Here we present a practical method to identify low-quality SNPs, meaning markers whose genotypes are wrongly assigned for a large proportion of individuals, by estimating the heritability of gene content at each marker, where gene content is the number of copies of a particular reference allele in a genotype of an animal (0, 1, or 2). If there is no mutation at the marker, gene content has an additive heritability of 1 by construction. The method uses restricted maximum likelihood (REML) to estimate heritability of gene content at each SNP and also builds a likelihood-ratio test statistic to test for zero error variance in genotyping. As a by-product, estimates of the allele frequencies of markers at the base population are obtained. Using simulated data with 10% permutation error (4% actual error) in genotyping, the method had a specificity of 0.96 (4% of correct markers are rejected) and a sensitivity of 0.99 (1% of wrong markers are accepted) if markers with heritability lower than 0.975 are discarded. Checking of Mendelian errors resulted in a lower sensitivity (0.84) for the same simulation. The proposed method is further illustrated with a real data set with genotypes from 3534 animals genotyped for 50,433 markers from the Illumina PorcineSNP60 chip and a pedigree of 6473 individuals; those markers underwent very little quality control. A total of 4099 markers with P-values lower than 0.01 were discarded based on our method, with associated estimates of heritability as low as 0.12. Contrary to other techniques, our method uses all information in the population simultaneously, can be used in any population with markers and pedigree recordings, and is simple to implement using standard software for REML estimation. Scripts for its use are provided. 
653 |a GENE CONTENT 
653 |a QUALITY CONTROL 
653 |a SNP 
653 |a GENOMIC SELECTION 
653 |a REML 
653 |a SHARED DATA RESOURCE 
653 |a GENPRED 
700 1 |9 29153  |a Forneris, Natalia Soledad  |u Universidad de Buenos Aires. Facultad de Agronomía. Departamento de Producción Animal. Buenos Aires, Argentina.  |u CONICET. Buenos Aires, Argentina. 
700 1 |9 67204  |a Legarra, Andres  |u INRA. Génétique. Physiologie et Systèmes d’Elevage. Castanet-Tolosan, France.  |u Université de Toulouse. INP. ENSAT. Génétique. Physiologie et Systèmes d’Elevage. Castanet-Tolosan, France. 
700 1 |a Vitezica, Zulma Gladis  |u INRA. Génétique. Physiologie et Systèmes d’Elevage. Castanet-Tolosan, France.  |u Université de Toulouse. INP. ENSAT. Génétique. Physiologie et Systèmes d’Elevage. Castanet-Tolosan, France.  |9 7786 
700 1 |a Tsuruta, Shogo  |u University of Georgia. Animal and Dairy Science. Athens, Georgia.  |9 68612 
700 1 |a Aguilar, Ignacio  |u Instituto Nacional de Investigación Agropecuaria. Canelones, Uruguay.  |9 68613 
700 1 |a Misztal, Ignacy  |u University of Georgia. Animal and Dairy Science. Athens, Georgia.  |9 67508 
700 1 |9 12817  |a Cantet, Rodolfo Juan Carlos  |u Universidad de Buenos Aires. Facultad de Agronomía. Departamento de Producción Animal. Buenos Aires, Argentina.  |u CONICET. Buenos Aires, Argentina. 
773 |t Genetics  |g vol.199, no.3 (2015), p.675–681, grafs. 
856 |f 2015forneris1  |i en reservorio  |q application/pdf  |u http://ri.agro.uba.ar/files/intranet/articulo/2015forneris1.pdf  |x ARTI201904 
856 |u https://www.genetics.org/  |z LINK TO PUBLISHER 
942 |c ARTICULO 
942 |c ENLINEA 
976 |a AAG