Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set

K. J. Archer, Valeria Mas

Research output: Contribution to journal, Article

4 Citations (Scopus)

Abstract

Many investigators conducting translational research are performing high-throughput genomic experiments and then developing multigenic classifiers using the resulting high-dimensional data set. In a large number of applications, the class to be predicted may be inherently ordinal. Examples of ordinal outcomes include tumor-node-metastasis (TNM) stage (I, II, III, IV); drug toxicity evaluated as none, mild, moderate, or severe; and response to treatment classified as complete response, partial response, stable disease, or progressive disease. While one can apply nominal response classification methods to ordinal response data, in doing so some information is lost that may improve the predictive performance of the classifier. This study examined the effectiveness of alternative ordinal splitting functions combined with bootstrap aggregation for classifying an ordinal response. We demonstrate that the ordinal impurity and ordered twoing methods have desirable properties for classifying ordinal response data and both perform well in comparison to other previously described methods. Developing a multigenic classifier is a common goal for microarray studies, and therefore application of the ordinal ensemble methods is demonstrated on a high-throughput methylation data set.
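
The abstract names ordered twoing and bootstrap aggregation but does not spell out either, so what follows is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the ordered twoing rule as commonly described for CART (collapse the ordinal classes into contiguous superclasses {<= j} and {> j}, and score a split by the best two-class Gini decrease over all j), uses single-feature stumps in place of full trees, and combines the bootstrap replicates by majority vote. All function names are hypothetical.

```python
import random
from collections import Counter


def gini_two_class(labels, cut):
    """Gini impurity after collapsing ordinal labels into {<= cut} vs {> cut}."""
    if not labels:
        return 0.0
    p = sum(1 for y in labels if y <= cut) / len(labels)
    return 2.0 * p * (1.0 - p)


def ordered_twoing_gain(left, right, levels):
    """Best two-class Gini decrease over the contiguous superclass cuts."""
    parent = left + right
    n = len(parent)
    best = 0.0
    for cut in levels[:-1]:  # J - 1 cut points on an ordinal scale with J levels
        gain = (gini_two_class(parent, cut)
                - len(left) / n * gini_two_class(left, cut)
                - len(right) / n * gini_two_class(right, cut))
        best = max(best, gain)
    return best


def mode(labels):
    """Most frequent label (ties resolved by first occurrence)."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(x, y, levels):
    """Fit a one-split tree on a single numeric feature (assumption: stumps only)."""
    best_gain, best_thr = -1.0, None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2.0  # candidate threshold midway between neighbors
        left = [yi for xi, yi in zip(x, y) if xi <= thr]
        right = [yi for xi, yi in zip(x, y) if xi > thr]
        gain = ordered_twoing_gain(left, right, levels)
        if gain > best_gain:
            best_gain, best_thr = gain, thr
    if best_thr is None:  # feature is constant in this sample: predict one class
        label = mode(y)
        return (None, label, label)
    left = [yi for xi, yi in zip(x, y) if xi <= best_thr]
    right = [yi for xi, yi in zip(x, y) if xi > best_thr]
    return (best_thr, mode(left), mode(right))


def bagged_stumps(x, y, levels, n_trees=25, seed=0):
    """Fit one stump per bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(x)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([x[i] for i in idx], [y[i] for i in idx], levels))
    return stumps


def predict(stumps, xi):
    """Aggregate the stump predictions by majority vote."""
    votes = [l if thr is None or xi <= thr else r for thr, l, r in stumps]
    return mode(votes)


# Toy data: one feature, ordinal stages coded 1 < 2 < 3 (illustrative only).
x = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
y = [1, 1, 1, 3, 3, 3]
stumps = bagged_stumps(x, y, levels=[1, 2, 3])
print(predict(stumps, 0.05), predict(stumps, 0.95))
```

A full ordinal ensemble would grow deeper trees over many features (for example, one per methylation probe) and could swap `ordered_twoing_gain` for another ordinal splitting function without changing the bagging loop.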

Original language: English (US)
Pages (from-to): 3597-3610
Number of pages: 14
Journal: Statistics in Medicine
Volume: 28
Issue number: 29
DOI: 10.1002/sim.3707
State: Published - Dec 20 2009

Fingerprint

Bootstrap
Methylation
High Throughput
Aggregation
Prediction
Translational Medical Research
Classifier
Drug-Related Side Effects and Adverse Reactions
Research Personnel
Neoplasm Metastasis
Ensemble Methods
Metastasis
Datasets
High-dimensional Data
Toxicity
Impurities
Microarray
Categorical or nominal
Genomics
Tumor

All Science Journal Classification (ASJC) codes

  • Epidemiology
  • Statistics and Probability

Cite this

Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set. / Archer, K. J.; Mas, Valeria.

In: Statistics in Medicine, Vol. 28, No. 29, 20.12.2009, p. 3597-3610.

@article{072d603747424ef1ab8bf335ab1a1d5c,
title = "Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set",
abstract = "Many investigators conducting translational research are performing high-throughput genomic experiments and then developing multigenic classifiers using the resulting high-dimensional data set. In a large number of applications, the class to be predicted may be inherently ordinal. Examples of ordinal outcomes include tumor-node-metastasis (TNM) stage (I, II, III, IV); drug toxicity evaluated as none, mild, moderate, or severe; and response to treatment classified as complete response, partial response, stable disease, or progressive disease. While one can apply nominal response classification methods to ordinal response data, in doing so some information is lost that may improve the predictive performance of the classifier. This study examined the effectiveness of alternative ordinal splitting functions combined with bootstrap aggregation for classifying an ordinal response. We demonstrate that the ordinal impurity and ordered twoing methods have desirable properties for classifying ordinal response data and both perform well in comparison to other previously described methods. Developing a multigenic classifier is a common goal for microarray studies, and therefore application of the ordinal ensemble methods is demonstrated on a high-throughput methylation data set.",
author = "Archer, {K. J.} and Valeria Mas",
year = "2009",
month = "12",
day = "20",
doi = "10.1002/sim.3707",
language = "English (US)",
volume = "28",
pages = "3597--3610",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "John Wiley and Sons Ltd",
number = "29",

}

TY - JOUR

T1 - Ordinal response prediction using bootstrap aggregation, with application to a high-throughput methylation data set

AU - Archer, K. J.

AU - Mas, Valeria

PY - 2009/12/20

Y1 - 2009/12/20

N2 - Many investigators conducting translational research are performing high-throughput genomic experiments and then developing multigenic classifiers using the resulting high-dimensional data set. In a large number of applications, the class to be predicted may be inherently ordinal. Examples of ordinal outcomes include tumor-node-metastasis (TNM) stage (I, II, III, IV); drug toxicity evaluated as none, mild, moderate, or severe; and response to treatment classified as complete response, partial response, stable disease, or progressive disease. While one can apply nominal response classification methods to ordinal response data, in doing so some information is lost that may improve the predictive performance of the classifier. This study examined the effectiveness of alternative ordinal splitting functions combined with bootstrap aggregation for classifying an ordinal response. We demonstrate that the ordinal impurity and ordered twoing methods have desirable properties for classifying ordinal response data and both perform well in comparison to other previously described methods. Developing a multigenic classifier is a common goal for microarray studies, and therefore application of the ordinal ensemble methods is demonstrated on a high-throughput methylation data set.

UR - http://www.scopus.com/inward/record.url?scp=72849123246&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=72849123246&partnerID=8YFLogxK

U2 - 10.1002/sim.3707

DO - 10.1002/sim.3707

M3 - Article

C2 - 19697302

AN - SCOPUS:72849123246

VL - 28

SP - 3597

EP - 3610

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 29

ER -