GBStools: A Statistical Method for Estimating Allelic Dropout in Reduced Representation Sequencing Data

Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. V...

Bibliographic Details
Main authors: Cooke, T. F., Yee, M.-C., Muzzio, Marina, Sockell, A., Bell, R., Cornejo, O. E., Kelley, J. L., Bailliet, Graciela, Bravi, Claudio Marcelo, Bustamante, Carlos D., Kenny, E. E.
Format: Article
Language: English
Published: 2016
Subjects:
GBS
Online access: http://sedici.unlp.edu.ar/handle/10915/87069
id I19-R120-10915-87069
record_format dspace
institution Universidad Nacional de La Plata
institution_str I-19
repository_str R-120
collection SEDICI (UNLP)
language English
topic Exact Sciences
GBS
genetic variation
description Reduced representation sequencing methods such as genotyping-by-sequencing (GBS) enable low-cost measurement of genetic variation without the need for a reference genome assembly. These methods are widely used in genetic mapping and population genetics studies, especially with non-model organisms. Variant calling error rates, however, are higher in GBS than in standard sequencing, in particular due to restriction site polymorphisms, and few computational tools exist that specifically model and correct these errors. We developed a statistical method to remove errors caused by restriction site polymorphisms, implemented in the software package GBStools. We evaluated it in several simulated data sets, varying in number of samples, mean coverage and population mutation rate, and in two empirical human data sets (N = 8 and N = 63 samples). In our simulations, GBStools improved genotype accuracy more than commonly used filters such as Hardy-Weinberg equilibrium p-values. GBStools is most effective at removing genotype errors in data sets over 100 samples when coverage is 40X or higher, and the improvement is most pronounced in species with high genomic diversity. We also demonstrate the utility of GBS and GBStools for human population genetic inference in Argentine populations and reveal widely varying individual ancestry proportions and an excess of singletons, consistent with recent population growth.
format Article
author Cooke, T. F.
Yee, M.-C.
Muzzio, Marina
Sockell, A.
Bell, R.
Cornejo, O. E.
Kelley, J. L.
Bailliet, Graciela
Bravi, Claudio Marcelo
Bustamante, Carlos D.
Kenny, E. E.
author_sort Cooke, T. F.
title GBStools: A Statistical Method for Estimating Allelic Dropout in Reduced Representation Sequencing Data
publishDate 2016
url http://sedici.unlp.edu.ar/handle/10915/87069
bdutipo_str Repositories
_version_ 1764820489610461184