PhysOnline

An open source machine learning pipeline for real-time analysis of streaming physiological waveform

Jacob R. Sutton, Ruhi Mahajan, Oguz Akbilgic, Rishikesan Kamaleswaran

Research output: Contribution to journal > Article

3 Citations (Scopus)

Abstract

Real-time analysis of streaming physiological data to identify earlier abnormal conditions is an important aspect of precision medicine. However, open-source systems supporting this workflow are lacking. In this paper, we present PhysOnline, a pipeline built on the open-source Apache Spark platform to ingest streaming physiological data for online feature extraction and machine learning. We consider scalability factors for horizontal deployment to support growing analysis requirements. We further integrate real-time feature extraction, including pattern recognition methods as well as descriptive statistical components to identify temporal characteristics of waveform signals. These generated features are then used for machine learning and for real-time classification of abnormal conditions. As a case study, we present the online classification of electrocardiography recordings for screening Paroxysmal Atrial Fibrillation (PAF) and demonstrate that our pipeline can predict persons developing PAF at least 45 min. before an episode of that condition. This pipeline can be applied in domains where pattern matching, temporal abstractions, and morphological characteristics can be used for real-time classification of streaming time-series data.
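The descriptive statistical feature extraction mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use Apache Spark: it is plain Python, and the function name `window_features`, its `window_size` and `step` parameters, and the choice of mean, standard deviation, minimum, and maximum as features are all assumptions made for illustration only.

```python
from collections import deque
from statistics import mean, pstdev
from typing import Deque, Dict, Iterable, Iterator


def window_features(
    samples: Iterable[float], window_size: int, step: int
) -> Iterator[Dict[str, float]]:
    """Yield descriptive statistics for each sliding window of a sample stream.

    Hypothetical helper: the window length, step, and feature set are
    illustrative assumptions, not values taken from the paper.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    # A bounded deque keeps only the most recent `window_size` samples.
    buffer: Deque[float] = deque(maxlen=window_size)
    since_last = 0  # samples seen since the last emitted window
    for value in samples:
        buffer.append(value)
        since_last += 1
        # Emit once the buffer is full and at least `step` new samples arrived.
        if len(buffer) == window_size and since_last >= step:
            since_last = 0
            yield {
                "mean": mean(buffer),
                "std": pstdev(buffer),
                "min": min(buffer),
                "max": max(buffer),
            }


# Usage: feed any iterable of waveform samples; features arrive incrementally.
for features in window_features([1, 2, 3, 4, 5, 6], window_size=4, step=2):
    print(features)
```

Because the function is a generator over an arbitrary iterable, each feature dictionary is produced as soon as its window is complete, which mirrors the online setting described above; in a real deployment the resulting feature vectors would be passed on to a trained classifier.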

Original language: English (US)
Article number: 8353460
Pages (from-to): 59-65
Number of pages: 7
Journal: IEEE Journal of Biomedical and Health Informatics
Volume: 23
Issue number: 1
DOI: 10.1109/JBHI.2018.2832610
State: Published - Jan 1 2019

Fingerprint

Learning systems
Pipelines
Feature extraction
Pattern matching
Electrocardiography
Electric sparks
Medicine
Pattern recognition
Scalability
Time series
Statistical methods
Screening
Atrial Fibrillation
Precision Medicine
Workflow
Machine Learning

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Computer Science Applications
  • Electrical and Electronic Engineering
  • Health Information Management

Cite this

PhysOnline: An open source machine learning pipeline for real-time analysis of streaming physiological waveform. / Sutton, Jacob R.; Mahajan, Ruhi; Akbilgic, Oguz; Kamaleswaran, Rishikesan.

In: IEEE Journal of Biomedical and Health Informatics, Vol. 23, No. 1, 8353460, 01.01.2019, p. 59-65.

@article{cdd8d943c6dc418ca11c758c1ba8ef25,
title = "{PhysOnline}: An open source machine learning pipeline for real-time analysis of streaming physiological waveform",
abstract = "Real-time analysis of streaming physiological data to identify earlier abnormal conditions is an important aspect of precision medicine. However, open-source systems supporting this workflow are lacking. In this paper, we present PhysOnline, a pipeline built on the open-source Apache Spark platform to ingest streaming physiological data for online feature extraction and machine learning. We consider scalability factors for horizontal deployment to support growing analysis requirements. We further integrate real-time feature extraction, including pattern recognition methods as well as descriptive statistical components to identify temporal characteristics of waveform signals. These generated features are then used for machine learning and for real-time classification of abnormal conditions. As a case study, we present the online classification of electrocardiography recordings for screening Paroxysmal Atrial Fibrillation (PAF) and demonstrate that our pipeline can predict persons developing PAF at least 45 min. before an episode of that condition. This pipeline can be applied in domains where pattern matching, temporal abstractions, and morphological characteristics can be used for real-time classification of streaming time-series data.",
author = "Sutton, Jacob R. and Mahajan, Ruhi and Akbilgic, Oguz and Kamaleswaran, Rishikesan",
year = "2019",
month = "1",
day = "1",
doi = "10.1109/JBHI.2018.2832610",
language = "English (US)",
volume = "23",
pages = "59--65",
journal = "IEEE Journal of Biomedical and Health Informatics",
issn = "2168-2194",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "1",
}

TY - JOUR

T1 - PhysOnline

T2 - An open source machine learning pipeline for real-time analysis of streaming physiological waveform

AU - Sutton, Jacob R.

AU - Mahajan, Ruhi

AU - Akbilgic, Oguz

AU - Kamaleswaran, Rishikesan

PY - 2019/1/1

Y1 - 2019/1/1

N2 - Real-time analysis of streaming physiological data to identify earlier abnormal conditions is an important aspect of precision medicine. However, open-source systems supporting this workflow are lacking. In this paper, we present PhysOnline, a pipeline built on the open-source Apache Spark platform to ingest streaming physiological data for online feature extraction and machine learning. We consider scalability factors for horizontal deployment to support growing analysis requirements. We further integrate real-time feature extraction, including pattern recognition methods as well as descriptive statistical components to identify temporal characteristics of waveform signals. These generated features are then used for machine learning and for real-time classification of abnormal conditions. As a case study, we present the online classification of electrocardiography recordings for screening Paroxysmal Atrial Fibrillation (PAF) and demonstrate that our pipeline can predict persons developing PAF at least 45 min. before an episode of that condition. This pipeline can be applied in domains where pattern matching, temporal abstractions, and morphological characteristics can be used for real-time classification of streaming time-series data.

AB - Real-time analysis of streaming physiological data to identify earlier abnormal conditions is an important aspect of precision medicine. However, open-source systems supporting this workflow are lacking. In this paper, we present PhysOnline, a pipeline built on the open-source Apache Spark platform to ingest streaming physiological data for online feature extraction and machine learning. We consider scalability factors for horizontal deployment to support growing analysis requirements. We further integrate real-time feature extraction, including pattern recognition methods as well as descriptive statistical components to identify temporal characteristics of waveform signals. These generated features are then used for machine learning and for real-time classification of abnormal conditions. As a case study, we present the online classification of electrocardiography recordings for screening Paroxysmal Atrial Fibrillation (PAF) and demonstrate that our pipeline can predict persons developing PAF at least 45 min. before an episode of that condition. This pipeline can be applied in domains where pattern matching, temporal abstractions, and morphological characteristics can be used for real-time classification of streaming time-series data.

UR - http://www.scopus.com/inward/record.url?scp=85059798747&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85059798747&partnerID=8YFLogxK

U2 - 10.1109/JBHI.2018.2832610

DO - 10.1109/JBHI.2018.2832610

M3 - Article

VL - 23

SP - 59

EP - 65

JO - IEEE Journal of Biomedical and Health Informatics

JF - IEEE Journal of Biomedical and Health Informatics

SN - 2168-2194

IS - 1

M1 - 8353460

ER -