Data mining for GENE expression profiles from DNA microarray

Sung Bae Cho, Hong Hee Won

Research output: Contribution to journal › Article › peer-review

17 Citations (Scopus)


Microarray technology has produced a large volume of data, turning many problems in biology into problems of computing. As a result, techniques for extracting useful information from these data have been developed. In particular, microarray technology has been applied to the prediction and diagnosis of cancer, where it is expected to help predict and diagnose cancer accurately. To classify cancer precisely, we have to select the genes related to cancer, because gene expression data extracted from microarrays are noisy. In this paper, we explore seven feature selection methods and four classifiers, and propose ensemble classifiers, evaluating the feature selection methods and machine learning classifiers systematically on three benchmark datasets: the leukemia, colon cancer, and lymphoma datasets. The classifiers are combined by majority voting, weighted voting, and a Bayesian approach to improve classification performance. Experimental results show that the ensemble of several base classifiers produces the best recognition rate on the benchmark datasets.
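
As an illustration of two of the combination schemes named in the abstract, the following is a minimal sketch of majority voting and weighted voting over the labels predicted by several base classifiers. It is not the authors' implementation: the function names, the class labels ("tumor", "normal"), the tie-breaking rule, and the use of per-classifier validation accuracy as the weight are all assumptions made for the example. The Bayesian combination is not shown.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers.

    Ties are broken in favour of the tied label that appears first in
    `predictions` (an assumption of this sketch, not taken from the paper).
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    `weights[i]` is the weight of the classifier that produced
    `predictions[i]`, for example its accuracy on held-out data
    (again an assumption of this sketch).
    """
    if not predictions or len(predictions) != len(weights):
        raise ValueError("predictions and weights must be non-empty and equal in length")
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    best = max(totals.values())
    for label in predictions:
        if totals[label] == best:
            return label


# Hypothetical outputs of three base classifiers for one sample.
sample_predictions = ["tumor", "normal", "normal"]
sample_weights = [0.9, 0.3, 0.4]  # hypothetical validation accuracies

majority_label = majority_vote(sample_predictions)
weighted_label = weighted_vote(sample_predictions, sample_weights)
```

The two schemes can disagree: in the sample above, two of the three classifiers vote "normal", but the single "tumor" vote carries more total weight, so weighting lets one reliable classifier override several weaker ones.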

Original language: English
Pages (from-to): 593-608
Number of pages: 16
Journal: International Journal of Software Engineering and Knowledge Engineering
Issue number: 6
Publication status: Published - 2003 Dec

Bibliographical note

Funding Information:
This work was supported by the Biometrics Engineering Research Center and a grant from the Korea Health 21 R&D Project, Ministry of Health & Welfare, Republic of Korea.

All Science Journal Classification (ASJC) codes

  • Software
  • Computer Networks and Communications
  • Computer Graphics and Computer-Aided Design
  • Artificial Intelligence

