Drug safety data mining with a tree-based scan statistic

Martin Kulldorff, Inna Dashevsky, Taliser R. Avery, Arnold K. Chan, Robert Davis, David Graham, Richard Platt, Susan E. Andrade, Denise Boudreau, Margaret J. Gunter, Lisa J. Herrinton, Pamala A. Pawloski, Marsha A. Raebel, Douglas Roblin, Jeffrey S. Brown

Research output: Contribution to journal (Article)

18 Citations (Scopus)

Abstract

Purpose: In post-marketing drug safety surveillance, data mining can potentially detect rare but serious adverse events. Assessing an entire collection of drug-event pairs is traditionally performed on a predefined level of granularity. It is unknown a priori whether a drug causes a very specific or a set of related adverse events, such as mitral valve disorders, all valve disorders, or different types of heart disease. This methodological paper evaluates the tree-based scan statistic data mining method to enhance drug safety surveillance.

Methods: We use a three-million-member electronic health records database from the HMO Research Network. Using the tree-based scan statistic, we assess the safety of selected antifungal and diabetes drugs, simultaneously evaluating overlapping diagnosis groups at different granularity levels, adjusting for multiple testing. Expected and observed adverse event counts were adjusted for age, sex, and health plan, producing a log likelihood ratio test statistic.

Results: Out of 732 evaluated disease groupings, 24 were statistically significant, divided among 10 non-overlapping disease categories. Five of the 10 signals are known adverse effects, four are likely due to confounding by indication, while one may warrant further investigation.

Conclusion: The tree-based scan statistic can be successfully applied as a data mining tool in drug safety surveillance using observational data. The total number of statistical signals was modest and does not imply a causal relationship. Rather, data mining results should be used to generate candidate drug-event pairs for rigorous epidemiological studies to evaluate the individual and comparative safety profiles of drugs.
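The abstract describes scanning overlapping diagnosis groups at several granularity levels, scoring each with a log likelihood ratio, and adjusting for multiple testing. The sketch below illustrates that idea on a toy tree. It is not the authors' implementation: the tree, the counts, and the function names are hypothetical, the conditional Poisson form of the log likelihood ratio is an assumption (the abstract does not state the formula), and the Monte Carlo maximum is one common way to adjust for the many overlapping groups.

```python
import math
import random

# Hypothetical toy diagnosis tree: each node maps to its parent (None = root).
PARENT = {
    "heart disease": None,
    "valve disorders": "heart disease",
    "mitral valve disorders": "valve disorders",
    "aortic valve disorders": "valve disorders",
    "arrhythmia": "heart disease",
}
LEAVES = ["mitral valve disorders", "aortic valve disorders", "arrhythmia"]

# Made-up leaf-level counts; the expected counts stand in for values that
# would already be adjusted for covariates such as age, sex, and health plan.
OBSERVED = {"mitral valve disorders": 12, "aortic valve disorders": 3, "arrhythmia": 5}
EXPECTED = {"mitral valve disorders": 4.0, "aortic valve disorders": 4.0, "arrhythmia": 12.0}


def node_totals(leaf_values):
    """Sum leaf values up the tree so each node covers all its descendants."""
    totals = {node: 0.0 for node in PARENT}
    for leaf, value in leaf_values.items():
        node = leaf
        while node is not None:
            totals[node] += value
            node = PARENT[node]
    return totals


def llr(c, n, total):
    """Assumed conditional Poisson log likelihood ratio; zero unless c > n."""
    if c <= n:
        return 0.0
    inside = c * math.log(c / n)
    outside = (total - c) * math.log((total - c) / (total - n)) if c < total else 0.0
    return inside + outside


def scan(observed, expected):
    """Return (largest LLR, node) over every node in the tree."""
    total = sum(observed.values())
    scale = total / sum(expected.values())  # condition on the total event count
    obs = node_totals(observed)
    exp = node_totals({leaf: value * scale for leaf, value in expected.items()})
    return max((llr(obs[node], exp[node], total), node) for node in PARENT)


def monte_carlo_p(observed, expected, replicas=999, seed=0):
    """Compare the real maximum LLR with maxima from null-hypothesis replicas."""
    rng = random.Random(seed)
    total = int(sum(observed.values()))
    weights = [expected[leaf] for leaf in LEAVES]
    real_stat, node = scan(observed, expected)
    exceed = 0
    for _ in range(replicas):
        # Redistribute all events across leaves in proportion to expectation.
        draws = rng.choices(LEAVES, weights=weights, k=total)
        simulated = {leaf: draws.count(leaf) for leaf in LEAVES}
        if scan(simulated, expected)[0] >= real_stat:
            exceed += 1
    return real_stat, node, (exceed + 1) / (replicas + 1)


stat, node, p_value = monte_carlo_p(OBSERVED, EXPECTED)
print(f"strongest signal: {node}, LLR = {stat:.2f}, adjusted p = {p_value:.3f}")
```

Because the p-value compares the observed maximum with the maximum over all nodes in each simulated data set, a single test covers every granularity level at once, which is the sense in which the scan adjusts for multiple testing in this sketch.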

Original language: English (US)
Pages (from-to): 517-523
Number of pages: 7
Journal: Pharmacoepidemiology and Drug Safety
Volume: 22
Issue number: 5
DOI: https://doi.org/10.1002/pds.3423
State: Published - May 2013
Externally published: Yes

Fingerprint

Data Mining
Safety
Pharmaceutical Preparations
Health Maintenance Organizations
Electronic Health Records
Marketing
Mitral Valve
Epidemiologic Studies
Heart Diseases
Databases
Health
Research

All Science Journal Classification (ASJC) codes

  • Pharmacology (medical)
  • Epidemiology

Cite this

Kulldorff, M., Dashevsky, I., Avery, T. R., Chan, A. K., Davis, R., Graham, D., ... Brown, J. S. (2013). Drug safety data mining with a tree-based scan statistic. Pharmacoepidemiology and Drug Safety, 22(5), 517-523. https://doi.org/10.1002/pds.3423

@article{d2992277a3214af6ad62dc7bab6bcf60,
title = "Drug safety data mining with a tree-based scan statistic",
author = "Martin Kulldorff and Inna Dashevsky and Avery, {Taliser R.} and Chan, {Arnold K.} and Robert Davis and David Graham and Richard Platt and Andrade, {Susan E.} and Denise Boudreau and Gunter, {Margaret J.} and Herrinton, {Lisa J.} and Pawloski, {Pamala A.} and Raebel, {Marsha A.} and Douglas Roblin and Brown, {Jeffrey S.}",
year = "2013",
month = "5",
doi = "10.1002/pds.3423",
language = "English (US)",
volume = "22",
pages = "517--523",
journal = "Pharmacoepidemiology and Drug Safety",
issn = "1053-8569",
publisher = "John Wiley and Sons Ltd",
number = "5",

}

TY  - JOUR
T1  - Drug safety data mining with a tree-based scan statistic
AU  - Kulldorff, Martin
AU  - Dashevsky, Inna
AU  - Avery, Taliser R.
AU  - Chan, Arnold K.
AU  - Davis, Robert
AU  - Graham, David
AU  - Platt, Richard
AU  - Andrade, Susan E.
AU  - Boudreau, Denise
AU  - Gunter, Margaret J.
AU  - Herrinton, Lisa J.
AU  - Pawloski, Pamala A.
AU  - Raebel, Marsha A.
AU  - Roblin, Douglas
AU  - Brown, Jeffrey S.
PY  - 2013/5
Y1  - 2013/5
UR  - http://www.scopus.com/inward/record.url?scp=84877619276&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84877619276&partnerID=8YFLogxK
U2  - 10.1002/pds.3423
DO  - 10.1002/pds.3423
M3  - Article
VL  - 22
SP  - 517
EP  - 523
JO  - Pharmacoepidemiology and Drug Safety
JF  - Pharmacoepidemiology and Drug Safety
SN  - 1053-8569
IS  - 5
ER  -