Assessing the quality of biclusters using fuzzy biclustering index

Nishchal Kumar Verma, Esha Dutta, Yan Cui

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Several algorithms have been proposed in the literature for extracting local patterns from a large data matrix. This data mining technique is known as biclustering. Each biclustering algorithm specialises in extracting a different kind of bicluster: some detect equal biclusters, whereas others identify scaled biclusters (Madeira et al., 2004). For any practical database, the biclusters present are not known in advance, so it is unclear which biclustering algorithm should be used. In such a scenario, it is important to define metrics that compare the quality of the extracted biclusters and hence the quality of the biclustering algorithm. In this paper, we define novel measures, the Hausdorff distance between biclusters and a global silhouette index, for estimating the quality of biclusters extracted by existing algorithms. We also combine these metrics with the proportion of enriched biclusters extracted to define an overall index, the Fuzzy Biclustering Index (FBI), for comparing the various algorithms. For a given data set, the higher the FBI, the better the biclustering algorithm.
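For orientation only, the sketch below computes the classical Hausdorff distance between two point sets, treating each bicluster as a list of row vectors over the same columns. This is an assumption made for illustration: the abstract does not give the paper's own bicluster distance, and the names `directed_hausdorff`, `hausdorff`, `bic_a` and `bic_b` are hypothetical.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Classical (symmetric) Hausdorff distance between point sets `a` and `b`."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Toy data (hypothetical): each bicluster is a list of rows restricted
# to the same two columns, so rows can be compared as Euclidean points.
bic_a = [(1.0, 2.0), (1.5, 2.5)]
bic_b = [(1.0, 2.0), (4.0, 6.0)]

d = hausdorff(bic_a, bic_b)
print(f"Hausdorff distance: {d:.4f}")
```

Under this classical definition, a small value means every row of one bicluster lies close to some row of the other. How the paper's measure handles biclusters defined over different column sets is not stated in the abstract.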

Original language: English (US)
Pages (from-to): 291-311
Number of pages: 21
Journal: International Journal of Data Mining and Bioinformatics
Volume: 15
Issue number: 4
DOI: 10.1504/IJDMB.2016.078145
State: Published - Jan 1 2016

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Biochemistry, Genetics and Molecular Biology (all)
  • Library and Information Sciences

Cite this

Assessing the quality of biclusters using fuzzy biclustering index. / Verma, Nishchal Kumar; Dutta, Esha; Cui, Yan.

In: International Journal of Data Mining and Bioinformatics, Vol. 15, No. 4, 01.01.2016, p. 291-311.

@article{619001ff54954694a2cd1bab3685dd00,
title = "Assessing the quality of biclusters using fuzzy biclustering index",
abstract = "Several algorithms are proposed in the literature for extracting local patterns from a large data matrix. This technique of data mining is known as biclustering. Each of the biclustering algorithms is specialised in extracting different kinds of biclusters. Some algorithms detect equal biclusters, whereas some identify scaled biclusters (Madeira et al., 2004). For any practical database, since we are not aware of the biclusters present in it, we are not sure of the biclustering algorithm to be used. In such a scenario, it is important to define metrics to compare the quality of the extracted biclusters and hence the quality of the biclustering algorithm. In this paper, we have defined novel measures of Hausdorff distance between biclusters and global silhouette index for estimating the quality of biclusters extracted by the existing algorithms. We have also combined these metrics with the proportion of enriched biclusters extracted and defined an overall index defined as the Fuzzy Biclustering Index (FBI) to compare the various algorithms. For a given data set, higher is the FBI, better is the biclustering algorithm.",
author = "Verma, Nishchal Kumar and Dutta, Esha and Cui, Yan",
year = "2016",
month = "1",
day = "1",
doi = "10.1504/IJDMB.2016.078145",
language = "English (US)",
volume = "15",
pages = "291--311",
journal = "International Journal of Data Mining and Bioinformatics",
issn = "1748-5673",
publisher = "Inderscience Enterprises Ltd",
number = "4"
}

TY - JOUR

T1 - Assessing the quality of biclusters using fuzzy biclustering index

AU - Verma, Nishchal Kumar

AU - Dutta, Esha

AU - Cui, Yan

PY - 2016/1/1

Y1 - 2016/1/1

N2 - Several algorithms are proposed in the literature for extracting local patterns from a large data matrix. This technique of data mining is known as biclustering. Each of the biclustering algorithms is specialised in extracting different kinds of biclusters. Some algorithms detect equal biclusters, whereas some identify scaled biclusters (Madeira et al., 2004). For any practical database, since we are not aware of the biclusters present in it, we are not sure of the biclustering algorithm to be used. In such a scenario, it is important to define metrics to compare the quality of the extracted biclusters and hence the quality of the biclustering algorithm. In this paper, we have defined novel measures of Hausdorff distance between biclusters and global silhouette index for estimating the quality of biclusters extracted by the existing algorithms. We have also combined these metrics with the proportion of enriched biclusters extracted and defined an overall index defined as the Fuzzy Biclustering Index (FBI) to compare the various algorithms. For a given data set, higher is the FBI, better is the biclustering algorithm.

AB - Several algorithms are proposed in the literature for extracting local patterns from a large data matrix. This technique of data mining is known as biclustering. Each of the biclustering algorithms is specialised in extracting different kinds of biclusters. Some algorithms detect equal biclusters, whereas some identify scaled biclusters (Madeira et al., 2004). For any practical database, since we are not aware of the biclusters present in it, we are not sure of the biclustering algorithm to be used. In such a scenario, it is important to define metrics to compare the quality of the extracted biclusters and hence the quality of the biclustering algorithm. In this paper, we have defined novel measures of Hausdorff distance between biclusters and global silhouette index for estimating the quality of biclusters extracted by the existing algorithms. We have also combined these metrics with the proportion of enriched biclusters extracted and defined an overall index defined as the Fuzzy Biclustering Index (FBI) to compare the various algorithms. For a given data set, higher is the FBI, better is the biclustering algorithm.

UR - http://www.scopus.com/inward/record.url?scp=84981505054&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84981505054&partnerID=8YFLogxK

U2 - 10.1504/IJDMB.2016.078145

DO - 10.1504/IJDMB.2016.078145

M3 - Article

VL - 15

SP - 291

EP - 311

JO - International Journal of Data Mining and Bioinformatics

JF - International Journal of Data Mining and Bioinformatics

SN - 1748-5673

IS - 4

ER -