False-positive rates in two-point parametric linkage analysis

Silke Szymczak, Claire Simpson, Cheryl D. Cropp, Joan E. Bailey-Wilson

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Two-point linkage analyses of whole genome sequence data are a promising approach to identify rare variants that segregate with complex diseases in large pedigrees because, in theory, the causal variants have been genotyped. We used whole genome sequence data and simulated traits provided by Genetic Analysis Workshop 18 to evaluate the proportion of false-positive findings in a binary trait using classic two-point parametric linkage analysis. False-positive genome-wide significant log of odds (LOD) scores were identified in more than 80% of 50 replicates for a binary phenotype generated by dichotomizing a quantitative trait that was simulated with a polygenic component (that was not based on any of the provided whole genome sequence genotypes). In contrast, when the trait was truly nongenetic (created by randomly assigning affected-unaffected status), the number of false-positive results was well controlled. These results suggest that when using two-point linkage analyses on whole genome sequence data, one should carefully examine regions yielding significant two-point LOD scores with multipoint analysis and that a more stringent significance threshold may be needed.
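For readers unfamiliar with the statistic being thresholded, the following is a minimal sketch of a two-point LOD score. It assumes the simplest textbook setting of phase-known, fully informative meioses, which is far simpler than the pedigree likelihoods a parametric linkage program evaluates, and it is not the authors' pipeline. The function names `two_point_lod` and `max_lod` and the grid search over the recombination fraction are illustrative assumptions. The abstract does not state which genome-wide threshold was used, so none is hard-coded here.

```python
import math


def _k_log10(count: int, prob: float) -> float:
    """Return count * log10(prob), treating 0 * log10(0) as 0."""
    if count == 0:
        return 0.0
    if prob == 0.0:
        return float("-inf")
    return count * math.log10(prob)


def two_point_lod(recombinants: int, meioses: int, theta: float) -> float:
    """LOD score at recombination fraction theta for phase-known meioses.

    LOD(theta) = log10( theta^r * (1 - theta)^(n - r) / 0.5^n )
    """
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    nonrecombinants = meioses - recombinants
    # log10 likelihood under linkage at the given theta
    log_l_linked = _k_log10(recombinants, theta) + _k_log10(nonrecombinants, 1.0 - theta)
    # log10 likelihood under free recombination (theta = 0.5)
    log_l_unlinked = meioses * math.log10(0.5)
    return log_l_linked - log_l_unlinked


def max_lod(recombinants: int, meioses: int, grid_size: int = 51):
    """Maximize the LOD score over an evenly spaced grid of theta in [0, 0.5]."""
    thetas = [0.5 * i / (grid_size - 1) for i in range(grid_size)]
    best = max(thetas, key=lambda t: two_point_lod(recombinants, meioses, t))
    return two_point_lod(recombinants, meioses, best), best
```

Under these assumptions, ten meioses with no recombinants give a maximum LOD of 10 × log10(2) ≈ 3.01 at a recombination fraction of 0, and any data evaluated at a recombination fraction of 0.5 give a LOD of exactly 0.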

Original language: English (US)
Article number: 110
Journal: BMC Proceedings
Volume: 8
DOI: 10.1186/1753-6561-8-S1-S110
State: Published - Jun 17 2014
Externally published: Yes

Fingerprint

Genes
Genome
Pedigree
Genotype
Phenotype
Education

All Science Journal Classification (ASJC) codes

  • Biochemistry, Genetics and Molecular Biology (all)

Cite this

False-positive rates in two-point parametric linkage analysis. / Szymczak, Silke; Simpson, Claire; Cropp, Cheryl D.; Bailey-Wilson, Joan E.

In: BMC Proceedings, Vol. 8, 110, 17.06.2014.

@article{a78dfd358e1c4bf39398e7d89d3604c7,
title = "False-positive rates in two-point parametric linkage analysis",
author = "Silke Szymczak and Claire Simpson and Cropp, {Cheryl D.} and Bailey-Wilson, {Joan E.}",
year = "2014",
month = "6",
day = "17",
doi = "10.1186/1753-6561-8-S1-S110",
language = "English (US)",
volume = "8",
journal = "BMC Proceedings",
issn = "1753-6561",
publisher = "BioMed Central",

}

TY - JOUR

T1 - False-positive rates in two-point parametric linkage analysis

AU - Szymczak, Silke

AU - Simpson, Claire

AU - Cropp, Cheryl D.

AU - Bailey-Wilson, Joan E.

PY - 2014/6/17

Y1 - 2014/6/17

UR - http://www.scopus.com/inward/record.url?scp=85018193067&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85018193067&partnerID=8YFLogxK

U2 - 10.1186/1753-6561-8-S1-S110

DO - 10.1186/1753-6561-8-S1-S110

M3 - Article

VL - 8

JO - BMC Proceedings

JF - BMC Proceedings

SN - 1753-6561

M1 - 110

ER -