Automatic cluster identification for environmental applications using the self-organizing maps and a new genetic algorithm

Tonny J. Oyana, Dajun Dai

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

A rapid increase in the dimensionality of environmental data emphasizes the importance of developing data-driven inductive approaches to geographic analysis. This article uses a loosely coupled strategy to combine self-organizing maps (SOM) with a new genetic algorithm (GA) for automatic identification of clusters in multidimensional environmental datasets. In the first stage, we employ the classic SOM because it can handle dimensional interactions and reveal the number of clusters through visualization, and thus provides insight into the original data. In the second stage, the new GA delineates the cluster boundaries using a flexibly oriented elliptical search window. To test this approach, one synthetic and two real-world datasets are employed. The results indicate a robust and reliable approach that supports better understanding and interpretation of massive multivariate environmental datasets. Other key benefits are that it detects clusters accurately in a computationally fast and efficient manner and is highly flexible. In short, the article presents a computational approach to facilitate knowledge discovery in massive multivariate environmental datasets, whose growth continues to accelerate.
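The "flexibly oriented elliptical search window" mentioned in the abstract can be pictured as a rotated ellipse that either contains a data point or does not. The following is a minimal Python sketch of that geometric test only. It is not the authors' genetic algorithm, and the parameterization (center, two semi-axes, rotation angle) and all function names are assumptions made for illustration.

```python
import math


def in_elliptical_window(point, center, semi_major, semi_minor, angle_rad):
    """Return True if `point` lies inside or on a rotated ellipse.

    Hypothetical parameterization (not taken from the article):
    `center` is (x, y), `semi_major` and `semi_minor` are the axis
    half-lengths, and `angle_rad` is the counterclockwise rotation of
    the major axis from the x-axis.
    """
    dx = point[0] - center[0]
    dy = point[1] - center[1]
    cos_a = math.cos(angle_rad)
    sin_a = math.sin(angle_rad)
    # Rotate the offset into the ellipse's own axis-aligned frame.
    u = dx * cos_a + dy * sin_a
    v = -dx * sin_a + dy * cos_a
    # Standard ellipse inequality in that frame.
    return (u / semi_major) ** 2 + (v / semi_minor) ** 2 <= 1.0


def count_points_in_window(points, center, semi_major, semi_minor, angle_rad):
    """Count how many points fall inside the window."""
    return sum(
        in_elliptical_window(p, center, semi_major, semi_minor, angle_rad)
        for p in points
    )
```

A search procedure such as a GA could, under these assumptions, vary the five window parameters and score each candidate window by the points it captures; how the article actually encodes and scores windows is not stated in the abstract.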

Original language: English (US)
Pages (from-to): 53-69
Number of pages: 17
Journal: Geocarto International
Volume: 25
Issue number: 1
DOI: 10.1080/10106040802711687
State: Published - Feb 1 2010

Fingerprint

genetic algorithm
visualization
interpretation
interaction
knowledge

All Science Journal Classification (ASJC) codes

  • Geography, Planning and Development
  • Water Science and Technology

Cite this

Automatic cluster identification for environmental applications using the self-organizing maps and a new genetic algorithm. / Oyana, Tonny J.; Dai, Dajun.

In: Geocarto International, Vol. 25, No. 1, 01.02.2010, p. 53-69.

@article{1c3f99de53df4cff9f10da178e57bffb,
title = "Automatic cluster identification for environmental applications using the self-organizing maps and a new genetic algorithm",
abstract = "A rapid increase of environmental data dimensionality emphasizes the importance of developing data-driven inductive approaches to geographic analysis. This article uses a loosely coupled strategy to combine the technique of self-organizing maps (SOM) with a new genetic algorithm (GA) for automatic identification of clusters in multidimensional environmental datasets. In the first stage, we employ the well-known classic SOM because it is able to handle the dimensional interactions and capture the number of clusters via visualization; and thus provide extraordinary insights into original data. In the second stage, this new GA rigorously delineates the cluster boundaries using a flexibly oriented elliptical search window. To test this approach, one synthetic and two real-world datasets are employed. The results confirm a more robust and reliable approach that provides a better understanding and interpretation of massive multivariate environmental datasets, thus maximizing our insights. Other key benefits include the fact that it provides a computationally fast and efficient environment to accurately detect clusters, and is highly flexible. In a nutshell, the article presents a computational approach to facilitate knowledge discovery of massive multivariate environmental datasets; as we are too familiar with their accelerating growth rate.",
author = "Oyana, Tonny J. and Dai, Dajun",
year = "2010",
month = "2",
day = "1",
doi = "10.1080/10106040802711687",
language = "English (US)",
volume = "25",
pages = "53--69",
journal = "Geocarto International",
issn = "1010-6049",
publisher = "Taylor and Francis Ltd.",
number = "1",
}

TY - JOUR

T1 - Automatic cluster identification for environmental applications using the self-organizing maps and a new genetic algorithm

AU - Oyana, Tonny J.

AU - Dai, Dajun

PY - 2010/2/1

Y1 - 2010/2/1

AB - A rapid increase of environmental data dimensionality emphasizes the importance of developing data-driven inductive approaches to geographic analysis. This article uses a loosely coupled strategy to combine the technique of self-organizing maps (SOM) with a new genetic algorithm (GA) for automatic identification of clusters in multidimensional environmental datasets. In the first stage, we employ the well-known classic SOM because it is able to handle the dimensional interactions and capture the number of clusters via visualization; and thus provide extraordinary insights into original data. In the second stage, this new GA rigorously delineates the cluster boundaries using a flexibly oriented elliptical search window. To test this approach, one synthetic and two real-world datasets are employed. The results confirm a more robust and reliable approach that provides a better understanding and interpretation of massive multivariate environmental datasets, thus maximizing our insights. Other key benefits include the fact that it provides a computationally fast and efficient environment to accurately detect clusters, and is highly flexible. In a nutshell, the article presents a computational approach to facilitate knowledge discovery of massive multivariate environmental datasets; as we are too familiar with their accelerating growth rate.

UR - http://www.scopus.com/inward/record.url?scp=77951081879&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77951081879&partnerID=8YFLogxK

U2 - 10.1080/10106040802711687

DO - 10.1080/10106040802711687

M3 - Article

VL - 25

SP - 53

EP - 69

JO - Geocarto International

JF - Geocarto International

SN - 1010-6049

IS - 1

ER -