Linear mixed models for association analysis of quantitative traits with next-generation sequencing data

Chi-Yang Chiu, Fang Yuan, Bing song Zhang, Ao Yuan, Xin Li, Hong Bin Fang, Kenneth Lange, Daniel E. Weeks, Alexander F. Wilson, Joan E. Bailey-Wilson, Anthony M. Musolf, Dwight Stambolian, M'Hamed Lajmi Lakhal-Chaieb, Richard J. Cook, Francis J. McMahon, Christopher I. Amos, Momiao Xiong, Ruzong Fan

Research output: Contribution to journal › Article

Abstract

We develop linear mixed models (LMMs) and functional linear mixed models (FLMMs) for gene-based tests of association between a quantitative trait and genetic variants on pedigrees. The effects of a major gene are modeled as a fixed effect, the contributions of polygenes are modeled as a random effect, and the correlations of pedigree members are modeled via inbreeding/kinship coefficients. F-statistics and χ² likelihood ratio test (LRT) statistics based on the LMMs and FLMMs are constructed to test for association. We show empirically that the F-distributed statistics provide a good control of the type I error rate. The F-test statistics of the LMMs have similar or higher power than the FLMMs, kernel-based famSKAT (family-based sequence kernel association test), and burden test famBT (family-based burden test). The F-statistics of the FLMMs perform well when analyzing a combination of rare and common variants. For small samples, the LRT statistics of the FLMMs control the type I error rate well at the nominal levels α = 0.01 and 0.05. For moderate/large samples, the LRT statistics of the FLMMs control the type I error rates well. The LRT statistics of the LMMs can lead to inflated type I error rates. The proposed models are useful in whole genome and whole exome association studies of complex traits.
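
The model structure described in the abstract (variant effects as fixed effects, a polygenic random effect, and pedigree correlation through kinship coefficients) can be illustrated with a small toy example. The sketch below is not the authors' implementation. It assumes nuclear families with unrelated, non-inbred parents, a trait covariance proportional to h2 * 2Φ + (1 - h2) * I with the heritability h2 treated as known (the paper estimates variance components), and genotypes drawn independently rather than by Mendelian transmission. All function and variable names are hypothetical.

```python
import numpy as np

def nuclear_family_kinship(n_children):
    """Kinship matrix for two unrelated, non-inbred parents and their children."""
    n = 2 + n_children
    phi = np.full((n, n), 0.25)   # parent-child and sib-sib kinship are both 1/4
    phi[0, 1] = phi[1, 0] = 0.0   # the two founders are assumed unrelated
    np.fill_diagonal(phi, 0.5)    # self-kinship without inbreeding
    return phi

def rss(A, y):
    """Residual sum of squares from an ordinary least-squares fit of y on A."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)

def gls_f_test(y, X, G, V):
    """F-statistic for H0: no variant effect, with Cov(y) = sigma^2 * V and V known.

    X holds covariates (including the intercept), G holds variant genotypes.
    The data are whitened with the Cholesky factor of V, so the usual
    nested-model F-test applies to the transformed regression.
    """
    L = np.linalg.cholesky(V)
    yw, Xw, Gw = (np.linalg.solve(L, M) for M in (y, X, G))
    full = np.hstack([Xw, Gw])
    q = G.shape[1]                      # numerator degrees of freedom
    df = len(y) - full.shape[1]         # denominator degrees of freedom
    rss_null, rss_full = rss(Xw, yw), rss(full, yw)
    F = ((rss_null - rss_full) / q) / (rss_full / df)
    return F, q, df

# Toy data: 50 nuclear families with two children each (assumed values).
rng = np.random.default_rng(0)
n_fam, n_children, h2 = 50, 2, 0.5
phi = nuclear_family_kinship(n_children)
V_fam = h2 * 2.0 * phi + (1.0 - h2) * np.eye(phi.shape[0])
V = np.kron(np.eye(n_fam), V_fam)       # families are mutually independent
n = V.shape[0]

X = np.ones((n, 1))                                   # intercept only
G = rng.binomial(2, 0.1, size=(n, 3)).astype(float)   # three variants, MAF 0.1
noise = np.linalg.cholesky(V) @ rng.standard_normal(n)

y_null = 1.0 + noise                                  # no variant effect
y_alt = 1.0 + G @ np.array([2.0, 2.0, 2.0]) + noise   # strong variant effects

F_null, q, df = gls_f_test(y_null, X, G, V)
F_alt, _, _ = gls_f_test(y_alt, X, G, V)
print(f"F under null: {F_null:.3f}, F under alternative: {F_alt:.3f}, df = ({q}, {df})")
```

Under these assumptions the statistic follows an F distribution with (q, df) degrees of freedom when there is no variant effect, which is the kind of reference distribution the F-tests in the abstract rely on. A functional variant of the model would replace the columns of G with basis-expanded genotype functions; that step is not shown here.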

Original language: English (US)
Pages (from-to): 189-206
Number of pages: 18
Journal: Genetic Epidemiology
Volume: 43
Issue number: 2
DOIs: https://doi.org/10.1002/gepi.22177
State: Published - Mar 1 2019

Fingerprint

Linear Models
Pedigree
Exome
Inbreeding
Genome-Wide Association Study
Genes

All Science Journal Classification (ASJC) codes

  • Epidemiology
  • Genetics(clinical)

Cite this

Linear mixed models for association analysis of quantitative traits with next-generation sequencing data. / Chiu, Chi-Yang; Yuan, Fang; Zhang, Bing song; Yuan, Ao; Li, Xin; Fang, Hong Bin; Lange, Kenneth; Weeks, Daniel E.; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Musolf, Anthony M.; Stambolian, Dwight; Lakhal-Chaieb, M'Hamed Lajmi; Cook, Richard J.; McMahon, Francis J.; Amos, Christopher I.; Xiong, Momiao; Fan, Ruzong.

In: Genetic Epidemiology, Vol. 43, No. 2, 01.03.2019, p. 189-206.

Chiu, C-Y, Yuan, F, Zhang, BS, Yuan, A, Li, X, Fang, HB, Lange, K, Weeks, DE, Wilson, AF, Bailey-Wilson, JE, Musolf, AM, Stambolian, D, Lakhal-Chaieb, MHL, Cook, RJ, McMahon, FJ, Amos, CI, Xiong, M & Fan, R 2019, 'Linear mixed models for association analysis of quantitative traits with next-generation sequencing data', Genetic Epidemiology, vol. 43, no. 2, pp. 189-206. https://doi.org/10.1002/gepi.22177
Chiu, Chi-Yang ; Yuan, Fang ; Zhang, Bing song ; Yuan, Ao ; Li, Xin ; Fang, Hong Bin ; Lange, Kenneth ; Weeks, Daniel E. ; Wilson, Alexander F. ; Bailey-Wilson, Joan E. ; Musolf, Anthony M. ; Stambolian, Dwight ; Lakhal-Chaieb, M'Hamed Lajmi ; Cook, Richard J. ; McMahon, Francis J. ; Amos, Christopher I. ; Xiong, Momiao ; Fan, Ruzong. / Linear mixed models for association analysis of quantitative traits with next-generation sequencing data. In: Genetic Epidemiology. 2019 ; Vol. 43, No. 2. pp. 189-206.
@article{7814edff67fa4f488e78a4c6200df8e9,
title = "Linear mixed models for association analysis of quantitative traits with next-generation sequencing data",
author = "Chi-Yang Chiu and Fang Yuan and Zhang, {Bing song} and Ao Yuan and Xin Li and Fang, {Hong Bin} and Kenneth Lange and Weeks, {Daniel E.} and Wilson, {Alexander F.} and Bailey-Wilson, {Joan E.} and Musolf, {Anthony M.} and Dwight Stambolian and Lakhal-Chaieb, {M'Hamed Lajmi} and Cook, {Richard J.} and McMahon, {Francis J.} and Amos, {Christopher I.} and Momiao Xiong and Ruzong Fan",
year = "2019",
month = "3",
day = "1",
doi = "10.1002/gepi.22177",
language = "English (US)",
volume = "43",
pages = "189--206",
journal = "Genetic Epidemiology",
issn = "0741-0395",
publisher = "Wiley-Liss Inc.",
number = "2",

}

TY - JOUR

T1 - Linear mixed models for association analysis of quantitative traits with next-generation sequencing data

AU - Chiu, Chi-Yang

AU - Yuan, Fang

AU - Zhang, Bing song

AU - Yuan, Ao

AU - Li, Xin

AU - Fang, Hong Bin

AU - Lange, Kenneth

AU - Weeks, Daniel E.

AU - Wilson, Alexander F.

AU - Bailey-Wilson, Joan E.

AU - Musolf, Anthony M.

AU - Stambolian, Dwight

AU - Lakhal-Chaieb, M'Hamed Lajmi

AU - Cook, Richard J.

AU - McMahon, Francis J.

AU - Amos, Christopher I.

AU - Xiong, Momiao

AU - Fan, Ruzong

PY - 2019/3/1

Y1 - 2019/3/1

UR - http://www.scopus.com/inward/record.url?scp=85058072566&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85058072566&partnerID=8YFLogxK

U2 - 10.1002/gepi.22177

DO - 10.1002/gepi.22177

M3 - Article

VL - 43

SP - 189

EP - 206

JO - Genetic Epidemiology

JF - Genetic Epidemiology

SN - 0741-0395

IS - 2

ER -