A flexible estimating equations approach for mapping function-valued traits

Hao Xiong, Evan H. Goulding, Elaine J. Carlson, Laurence H. Tecott, Charles E. McCulloch, Saunak Sen

Research output: Contribution to journal (Article)

17 Citations (Scopus)

Abstract

In genetic studies, many interesting traits, including growth curves and skeletal shape, have temporal or spatial structure. They are better treated as curves or function-valued traits. Identification of genetic loci contributing to such traits is facilitated by specialized methods that explicitly address the function-valued nature of the data. Current methods for mapping function-valued traits are mostly likelihood-based, requiring specification of the distribution and error structure. However, such specification is difficult or impractical in many scenarios. We propose a general functional regression approach based on estimating equations that is robust to misspecification of the covariance structure. Estimation is based on a two-step least-squares algorithm, which is fast and applicable even when the number of time points exceeds the number of samples. It is also flexible due to a general linear functional model; changing the number of covariates does not necessitate a new set of formulas and programs. In addition, many meaningful extensions are straightforward. For example, we can accommodate incomplete genotype data, and the algorithm can be trivially parallelized. The framework is an attractive alternative to likelihood-based methods when the covariance structure of the data is not known. It provides a good compromise between model simplicity, statistical efficiency, and computational speed. We illustrate our method and its advantages using circadian mouse behavioral data.
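The abstract does not spell out the two-step least-squares algorithm, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes a linear functional model in which each coefficient curve is a linear combination of known basis functions, and it uses a working-independence (ordinary least-squares) fit in both steps. The function names, the polynomial basis, and the simulated backcross-style data are all hypothetical choices made for illustration.

```python
import numpy as np


def two_step_least_squares(Y, X, Psi):
    """Illustrative two-step fit (an assumption, not the paper's code).

    Y   : (n, T) matrix, one observed trait curve per individual
    X   : (n, p) design matrix (intercept, genotype, other covariates)
    Psi : (T, k) basis functions evaluated at the T time points
    Returns C of shape (p, k) such that beta(t_j) is approximated by C @ Psi[j].
    """
    # Step 1: ordinary least squares at every time point simultaneously.
    # Only X needs full column rank, so T may exceed n.
    B_pointwise, *_ = np.linalg.lstsq(X, Y, rcond=None)        # (p, T)
    # Step 2: project each pointwise coefficient curve onto the basis.
    C_T, *_ = np.linalg.lstsq(Psi, B_pointwise.T, rcond=None)  # (k, p)
    return C_T.T


def polynomial_basis(times, degree):
    """Hypothetical basis: monomials on time rescaled to [0, 1]."""
    t = (times - times.min()) / (times.max() - times.min())
    return np.vander(t, degree + 1, increasing=True)           # (T, degree + 1)


# Simulated data: more time points (T = 50) than individuals (n = 30).
rng = np.random.default_rng(0)
n, T = 30, 50
times = np.linspace(0.0, 24.0, T)
genotype = rng.integers(0, 2, size=n)            # 0/1 genotype at one locus
X = np.column_stack([np.ones(n), genotype])      # intercept + genotype effect
Psi = polynomial_basis(times, degree=2)

# Made-up true coefficients: row 0 is the baseline curve, row 1 the QTL effect.
C_true = np.array([[1.0, 0.5, -0.2],
                   [0.0, 2.0, -1.0]])
Y_clean = X @ C_true @ Psi.T
Y = Y_clean + 0.1 * rng.standard_normal((n, T))

C_hat = two_step_least_squares(Y, X, Psi)
C_hat_clean = two_step_least_squares(Y_clean, X, Psi)
```

In this sketch neither step estimates or inverts a T-by-T covariance matrix, which is why it still runs when the number of time points exceeds the number of samples. Changing the number of covariates only changes the columns of `X`, and scanning many loci would amount to repeating the same call with a different genotype column, which can be done independently for each locus.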

Original language: English (US)
Pages (from-to): 305-316
Number of pages: 12
Journal: Genetics
Volume: 189
Issue number: 1
DOIs: https://doi.org/10.1534/genetics.111.129221
State: Published - Sep 1 2011

Fingerprint

Genetic Loci
Statistical Models
Least-Squares Analysis
Linear Models
Genotype
Growth

All Science Journal Classification (ASJC) codes

  • Genetics

Cite this

Xiong, H., Goulding, E. H., Carlson, E. J., Tecott, L. H., McCulloch, C. E., & Sen, S. (2011). A flexible estimating equations approach for mapping function-valued traits. Genetics, 189(1), 305-316. https://doi.org/10.1534/genetics.111.129221

@article{13a6190d13d74282993cccd4c868f6b2,
title = "A flexible estimating equations approach for mapping function-valued traits",
abstract = "In genetic studies, many interesting traits, including growth curves and skeletal shape, have temporal or spatial structure. They are better treated as curves or function-valued traits. Identification of genetic loci contributing to such traits is facilitated by specialized methods that explicitly address the function-valued nature of the data. Current methods for mapping function-valued traits are mostly likelihood-based, requiring specification of the distribution and error structure. However, such specification is difficult or impractical in many scenarios. We propose a general functional regression approach based on estimating equations that is robust to misspecification of the covariance structure. Estimation is based on a two-step least-squares algorithm, which is fast and applicable even when the number of time points exceeds the number of samples. It is also flexible due to a general linear functional model; changing the number of covariates does not necessitate a new set of formulas and programs. In addition, many meaningful extensions are straightforward. For example, we can accommodate incomplete genotype data, and the algorithm can be trivially parallelized. The framework is an attractive alternative to likelihood-based methods when the covariance structure of the data is not known. It provides a good compromise between model simplicity, statistical efficiency, and computational speed. We illustrate our method and its advantages using circadian mouse behavioral data.",
author = "Hao Xiong and Goulding, {Evan H.} and Carlson, {Elaine J.} and Tecott, {Laurence H.} and McCulloch, {Charles E.} and Saunak Sen",
year = "2011",
month = "9",
day = "1",
doi = "10.1534/genetics.111.129221",
language = "English (US)",
volume = "189",
pages = "305--316",
journal = "Genetics",
issn = "0016-6731",
publisher = "Genetics Society of America",
number = "1",

}

TY - JOUR

T1 - A flexible estimating equations approach for mapping function-valued traits

AU - Xiong, Hao

AU - Goulding, Evan H.

AU - Carlson, Elaine J.

AU - Tecott, Laurence H.

AU - McCulloch, Charles E.

AU - Sen, Saunak

PY - 2011/9/1

Y1 - 2011/9/1

N2 - In genetic studies, many interesting traits, including growth curves and skeletal shape, have temporal or spatial structure. They are better treated as curves or function-valued traits. Identification of genetic loci contributing to such traits is facilitated by specialized methods that explicitly address the function-valued nature of the data. Current methods for mapping function-valued traits are mostly likelihood-based, requiring specification of the distribution and error structure. However, such specification is difficult or impractical in many scenarios. We propose a general functional regression approach based on estimating equations that is robust to misspecification of the covariance structure. Estimation is based on a two-step least-squares algorithm, which is fast and applicable even when the number of time points exceeds the number of samples. It is also flexible due to a general linear functional model; changing the number of covariates does not necessitate a new set of formulas and programs. In addition, many meaningful extensions are straightforward. For example, we can accommodate incomplete genotype data, and the algorithm can be trivially parallelized. The framework is an attractive alternative to likelihood-based methods when the covariance structure of the data is not known. It provides a good compromise between model simplicity, statistical efficiency, and computational speed. We illustrate our method and its advantages using circadian mouse behavioral data.

UR - http://www.scopus.com/inward/record.url?scp=80052648285&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=80052648285&partnerID=8YFLogxK

U2 - 10.1534/genetics.111.129221

DO - 10.1534/genetics.111.129221

M3 - Article

VL - 189

SP - 305

EP - 316

JO - Genetics

JF - Genetics

SN - 0016-6731

IS - 1

ER -