Meta-analysis of complex diseases at gene level with generalized functional linear models

Ruzong Fan, Yifan Wang, Chi-Yang Chiu, Wei Chen, Haobo Ren, Yun Li, Michael Boehnke, Christopher I. Amos, Jason H. Moore, Momiao Xiong

Research output: Contribution to journal › Article

9 Citations (Scopus)

Abstract

We developed generalized functional linear models (GFLMs) to perform a meta-analysis of multiple case-control studies to evaluate the relationship of genetic data to dichotomous traits, adjusting for covariates. Unlike the previously developed meta-analysis for sequence kernel association tests (MetaSKATs), which are based on mixed-effect models to make the contributions of major gene loci random, GFLMs are fixed models; i.e., genetic effects of multiple genetic variants are fixed. Based on GFLMs, we developed chi-squared-distributed Rao’s efficient score test and likelihood-ratio test (LRT) statistics to test for an association between a complex dichotomous trait and multiple genetic variants. We then performed extensive simulations to evaluate the empirical type I error rates and power performance of the proposed tests. The Rao’s efficient score test statistics of GFLMs are very conservative and have higher power than MetaSKATs when some causal variants are rare and some are common. When the causal variants are all rare [i.e., minor allele frequencies (MAF) < 0.03], the Rao’s efficient score test statistics have similar or slightly lower power than MetaSKATs. The LRT statistics generate accurate type I error rates for homogeneous genetic-effect models but may inflate type I error rates for heterogeneous genetic-effect models owing to the large numbers of degrees of freedom; they have similar or slightly higher power than the Rao’s efficient score test statistics. GFLMs were applied to analyze genetic data of 22 gene regions of type 2 diabetes data from a meta-analysis of eight European studies and detected significant association for 18 genes (P < 3.10 × 10⁻⁶), tentative association for 2 genes (HHEX and HMGA2; P ≈ 10⁻⁵), and no association for 2 genes, while MetaSKATs detected none. In addition, the traditional additive-effect model detects association at gene HHEX. GFLMs and related tests can analyze rare or common variants or a combination of the two and can be useful in whole-genome and whole-exome association studies.
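
As a rough illustration of the kind of test the abstract describes, the sketch below computes a chi-squared score statistic pooled over several case-control studies. It is not the authors' implementation. It assumes a homogeneous genetic effect, a separate intercept per study, no covariates, and a Fourier basis expansion of the genetic effect function; the names fourier_basis, basis_scores, solve and meta_score_statistic are hypothetical and chosen only for this example.

```python
import math

def fourier_basis(t, k):
    """k-th Fourier basis function on [0, 1]: 1, sin(2*pi*t), cos(2*pi*t), ..."""
    if k == 0:
        return 1.0
    m = (k + 1) // 2  # frequency index
    angle = 2.0 * math.pi * m * t
    return math.sin(angle) if k % 2 == 1 else math.cos(angle)

def basis_scores(genotypes, positions, n_basis):
    """Collapse each genotype row into n_basis regressors:
    W[i][k] = sum_j g_ij * phi_k(t_j), with positions t_j scaled to [0, 1]."""
    return [[sum(g * fourier_basis(t, k) for g, t in zip(row, positions))
             for k in range(n_basis)] for row in genotypes]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("information matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def meta_score_statistic(studies):
    """studies: list of (W, y) pairs, y coded 0/1 (control/case).
    Sums per-study score vectors and information matrices (assumed
    homogeneous effects, study-specific intercepts, no covariates) and
    returns (statistic, degrees of freedom)."""
    k = len(studies[0][0][0])
    u = [0.0] * k
    v = [[0.0] * k for _ in range(k)]
    for w, y in studies:
        n = len(y)
        p = sum(y) / n  # case fraction under the null, per study
        means = [sum(row[c] for row in w) / n for c in range(k)]
        for row, yi in zip(w, y):
            d = [row[c] - means[c] for c in range(k)]
            for a in range(k):
                u[a] += d[a] * (yi - p)
                for b in range(k):
                    v[a][b] += p * (1.0 - p) * d[a] * d[b]
    x = solve(v, u)
    return sum(ui * xi for ui, xi in zip(u, x)), k
```

Under the null hypothesis of no association, the returned statistic would be compared with a chi-squared distribution on the returned degrees of freedom (for example with scipy.stats.chi2.sf, which is not imported here). Collapsing many variants into a few basis regressors is what keeps the degrees of freedom small when a region contains both rare and common variants.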

Original language: English (US)
Pages (from-to): 457-470
Number of pages: 14
Journal: Genetics
Volume: 202
Issue number: 2
DOI: https://doi.org/10.1534/genetics.115.180869
State: Published - Feb 1 2016

Fingerprint

Meta-Analysis
Linear Models
Genes
Genetic Models
Exome
Genome-Wide Association Study
Gene Frequency
Type 2 Diabetes Mellitus
Case-Control Studies

All Science Journal Classification (ASJC) codes

  • Genetics

Cite this

Meta-analysis of complex diseases at gene level with generalized functional linear models. / Fan, Ruzong; Wang, Yifan; Chiu, Chi-Yang; Chen, Wei; Ren, Haobo; Li, Yun; Boehnke, Michael; Amos, Christopher I.; Moore, Jason H.; Xiong, Momiao.

In: Genetics, Vol. 202, No. 2, 01.02.2016, p. 457-470.

Fan, R, Wang, Y, Chiu, C-Y, Chen, W, Ren, H, Li, Y, Boehnke, M, Amos, CI, Moore, JH & Xiong, M 2016, 'Meta-analysis of complex diseases at gene level with generalized functional linear models', Genetics, vol. 202, no. 2, pp. 457-470. https://doi.org/10.1534/genetics.115.180869
Fan, Ruzong ; Wang, Yifan ; Chiu, Chi-Yang ; Chen, Wei ; Ren, Haobo ; Li, Yun ; Boehnke, Michael ; Amos, Christopher I. ; Moore, Jason H. ; Xiong, Momiao. / Meta-analysis of complex diseases at gene level with generalized functional linear models. In: Genetics. 2016 ; Vol. 202, No. 2. pp. 457-470.
@article{a22c287fcd894a469c6e671a0f7ae0f0,
title = "Meta-analysis of complex diseases at gene level with generalized functional linear models",
author = "Ruzong Fan and Yifan Wang and Chi-Yang Chiu and Wei Chen and Haobo Ren and Yun Li and Michael Boehnke and Amos, {Christopher I.} and Moore, {Jason H.} and Momiao Xiong",
year = "2016",
month = "2",
day = "1",
doi = "10.1534/genetics.115.180869",
language = "English (US)",
volume = "202",
pages = "457--470",
journal = "Genetics",
issn = "0016-6731",
publisher = "Genetics Society of America",
number = "2",
}

TY - JOUR

T1 - Meta-analysis of complex diseases at gene level with generalized functional linear models

AU - Fan, Ruzong

AU - Wang, Yifan

AU - Chiu, Chi-Yang

AU - Chen, Wei

AU - Ren, Haobo

AU - Li, Yun

AU - Boehnke, Michael

AU - Amos, Christopher I.

AU - Moore, Jason H.

AU - Xiong, Momiao

PY - 2016/2/1

Y1 - 2016/2/1

UR - http://www.scopus.com/inward/record.url?scp=84979937494&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84979937494&partnerID=8YFLogxK

U2 - 10.1534/genetics.115.180869

DO - 10.1534/genetics.115.180869

M3 - Article

VL - 202

SP - 457

EP - 470

JO - Genetics

JF - Genetics

SN - 0016-6731

IS - 2

ER -