Optimal feature selection using fuzzy combination of feature subset for transcriptome data

Vikas Singh, Harsh Vardhan, Nishchal K. Verma, Yan Cui

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

Applying machine learning algorithms directly to high-dimensional datasets, such as those encountered in transcriptome analysis, may lead to high time complexity and poor model performance, especially when the number of samples is small compared to the dimensionality. Selecting an optimal set of features therefore becomes an essential task for such datasets. Filter methods are one of the main classes of feature selection techniques: a score is assigned to each feature based on criteria such as information gain, statistical measures, or similarity-based measures, and the best-scoring features are then selected. Applying filter methods to the complete dataset yields features that perform well over the dataset as a whole but might perform poorly in certain regions of the data, which reduces accuracy for data points in those regions. To overcome this degradation in performance, we propose two novel methods that assign a robust score by using a fuzzy combination of the region-specific optimal feature subsets obtained with a standard feature selection algorithm (we use mRMR in this paper). We compare the results with the state-of-the-art feature selection algorithm mRMR (Minimum Redundancy Maximum Relevance) in terms of accuracy on certain standard datasets.
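The general idea in the abstract (score features per region of the data, then combine the region-specific subsets with fuzzy weights) can be illustrated with a small sketch. This is not the authors' method: the abstract does not specify either of the two proposed scoring rules. The sketch assumes fuzzy c-means style memberships to given region centres, substitutes a weighted absolute Pearson correlation for mRMR as the filter score, and uses a membership-weighted vote as the combination. All function names, the toy data, and the centres are hypothetical.

```python
import math

def memberships(x, centres, m=2.0):
    """Fuzzy c-means style membership of sample x in each region (assumed rule)."""
    d = [math.dist(x, c) for c in centres]
    if any(v == 0.0 for v in d):  # sample lies exactly on a centre
        hits = [1.0 if v == 0.0 else 0.0 for v in d]
        return [h / sum(hits) for h in hits]
    p = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** p for dj in d) for di in d]

def weighted_abs_corr(col, y, w):
    """Stand-in filter score (not mRMR): |weighted Pearson correlation| with the label."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, col)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, col, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, col))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return abs(cov) / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def fuzzy_combined_scores(X, y, centres, k):
    """Each region votes for its top-k features; votes are weighted by mean membership."""
    n_feat = len(X[0])
    U = [memberships(x, centres) for x in X]  # U[i][r]: membership of sample i in region r
    scores = [0.0] * n_feat
    for r in range(len(centres)):
        w = [u[r] for u in U]
        local = [weighted_abs_corr([x[j] for x in X], y, w) for j in range(n_feat)]
        top = sorted(range(n_feat), key=lambda j: local[j], reverse=True)[:k]
        region_weight = sum(w) / len(X)  # share of the data that belongs to region r
        for j in top:
            scores[j] += region_weight
    return scores

# Toy data: feature 2 defines the region; feature 0 tracks the label only in
# region A (feature 2 = 0) and feature 1 only in region B (feature 2 = 10).
X = [[0, 0, 0], [1, 0, 0], [0, 0, 0], [1, 0, 0],
     [0, 0, 10], [0, 1, 10], [0, 0, 10], [0, 1, 10]]
y = [0, 1, 0, 1, 0, 1, 0, 1]
centres = [[0.5, 0.0, 0.0], [0.0, 0.5, 10.0]]  # hypothetical region centres

scores = fuzzy_combined_scores(X, y, centres, k=1)
print(scores)
```

On this toy data, features 0 and 1 each receive a positive combined score because each is the top feature in one region, while feature 2 receives none. A single global filter would see each of them weakened by the region where it carries no signal, which is the failure mode the abstract describes.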

Original language: English (US)
Title of host publication: 2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781509060207
DOI: https://doi.org/10.1109/FUZZ-IEEE.2018.8491683
State: Published - Oct 12 2018
Event: 2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018 - Rio de Janeiro, Brazil
Duration: Jul 8 2018 to Jul 13 2018

Publication series

Name: IEEE International Conference on Fuzzy Systems
Volume: 2018-July
ISSN (Print): 1098-7584

Other

Other: 2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018
Country: Brazil
City: Rio de Janeiro
Period: 7/8/18 to 7/13/18

Fingerprint

Set theory
Feature Selection
Feature extraction
Filter Method
Subset
Redundancy
Information Gain
Learning algorithms
Time Complexity
Dimensionality
Assign
Learning systems
Learning Algorithm
Machine Learning
Degradation
High-dimensional
Term
Relevance
Standards
Model

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Artificial Intelligence
  • Applied Mathematics

Cite this

Singh, V., Vardhan, H., Verma, N. K., & Cui, Y. (2018). Optimal feature selection using fuzzy combination of feature subset for transcriptome data. In 2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018 - Proceedings [8491683] (IEEE International Conference on Fuzzy Systems; Vol. 2018-July). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/FUZZ-IEEE.2018.8491683

Optimal feature selection using fuzzy combination of feature subset for transcriptome data. / Singh, Vikas; Vardhan, Harsh; Verma, Nishchal K.; Cui, Yan.

2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2018. 8491683 (IEEE International Conference on Fuzzy Systems; Vol. 2018-July).

Singh, V, Vardhan, H, Verma, NK & Cui, Y 2018, Optimal feature selection using fuzzy combination of feature subset for transcriptome data. in 2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018 - Proceedings., 8491683, IEEE International Conference on Fuzzy Systems, vol. 2018-July, Institute of Electrical and Electronics Engineers Inc., 2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018, Rio de Janeiro, Brazil, 7/8/18. https://doi.org/10.1109/FUZZ-IEEE.2018.8491683
Singh V, Vardhan H, Verma NK, Cui Y. Optimal feature selection using fuzzy combination of feature subset for transcriptome data. In 2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2018. 8491683. (IEEE International Conference on Fuzzy Systems). https://doi.org/10.1109/FUZZ-IEEE.2018.8491683
Singh, Vikas ; Vardhan, Harsh ; Verma, Nishchal K. ; Cui, Yan. / Optimal feature selection using fuzzy combination of feature subset for transcriptome data. 2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2018. (IEEE International Conference on Fuzzy Systems).
@inproceedings{4723dc295e5949faa5ef5a44d3d29285,
title = "Optimal feature selection using fuzzy combination of feature subset for transcriptome data",
author = "Vikas Singh and Harsh Vardhan and Verma, {Nishchal K.} and Yan Cui",
year = "2018",
month = "10",
day = "12",
doi = "10.1109/FUZZ-IEEE.2018.8491683",
language = "English (US)",
series = "IEEE International Conference on Fuzzy Systems",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018 - Proceedings",
address = "United States",

}

TY - GEN

T1 - Optimal feature selection using fuzzy combination of feature subset for transcriptome data

AU - Singh, Vikas

AU - Vardhan, Harsh

AU - Verma, Nishchal K.

AU - Cui, Yan

PY - 2018/10/12

Y1 - 2018/10/12

UR - http://www.scopus.com/inward/record.url?scp=85060484189&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85060484189&partnerID=8YFLogxK

U2 - 10.1109/FUZZ-IEEE.2018.8491683

DO - 10.1109/FUZZ-IEEE.2018.8491683

M3 - Conference contribution

AN - SCOPUS:85060484189

T3 - IEEE International Conference on Fuzzy Systems

BT - 2018 IEEE International Conference on Fuzzy Systems, FUZZ 2018 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -