Abstract
It is important to use a better criterion for the selection and discretization of attributes in the generation of decision trees, so that the resulting classifier for pattern recognition can access huge amounts of data intelligently and efficiently. Two well-known criteria are gain and gain ratio, both based on the entropy of partitions. In this paper we propose a new criterion, also based on entropy, and use both theoretical analysis and computer simulation to demonstrate that it works better than gain or gain ratio in a wide variety of situations. We use the usual entropy calculation, except that the base of the logarithm is not two but the number of successors to the node. Our theoretical analysis identifies specific situations in which the new criterion always works better than gain or gain ratio, and the simulation results suggest that it also covers the other situations not addressed by the analysis.
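To make the quantities in the abstract concrete, the following Python sketch computes entropy with a configurable logarithm base, together with the standard gain and gain ratio criteria. This is an illustrative reading of the abstract, not the authors' implementation; the function names, the example data, and the choice of where the non-standard base is applied are assumptions.

```python
# Minimal sketch of the entropy-based criteria discussed in the abstract.
# Names and example data are illustrative assumptions, not the paper's code.
import math
from collections import Counter


def entropy(labels, base=2):
    """Shannon entropy of a label sequence, with a configurable log base."""
    total = len(labels)
    if total == 0 or base <= 1:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log(c / total, base) for c in counts.values())


def information_gain(parent_labels, partitions, base=2):
    """Gain = parent entropy minus the weighted entropy of the partitions."""
    total = len(parent_labels)
    weighted = sum(len(p) / total * entropy(p, base) for p in partitions)
    return entropy(parent_labels, base) - weighted


def gain_ratio(parent_labels, partitions, base=2):
    """Gain divided by the split information (entropy of the partition sizes)."""
    total = len(parent_labels)
    sizes = [len(p) / total for p in partitions]
    split_info = -sum(s * math.log(s, base) for s in sizes if s > 0)
    if split_info == 0:
        return 0.0
    return information_gain(parent_labels, partitions, base) / split_info


# Example: a 3-way split evaluated with the log base set to the number of
# successors (k = 3), as the abstract describes, versus the usual base 2.
parent = ["a"] * 6 + ["b"] * 6
split = [["a", "a", "a", "b"], ["a", "a", "b", "b"], ["a", "b", "b", "b"]]
k = len(split)
print("gain (base 2):      ", information_gain(parent, split, base=2))
print("gain (base k=3):    ", information_gain(parent, split, base=k))
print("gain ratio (base 2):", gain_ratio(parent, split, base=2))
```

Changing the base only rescales the entropy by a constant factor for a fixed number of successors, so the interesting comparisons in the paper arise when candidate splits have different numbers of successors.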
| Original language | English |
| --- | --- |
| Pages (from-to) | 1371-1375 |
| Number of pages | 5 |
| Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
| Volume | 19 |
| Issue number | 12 |
| DOIs | |
| Publication status | Published - 1997 |
Bibliographical note
Funding Information: The authors wish to thank the anonymous reviewers for their helpful suggestions in improving the earlier draft of this paper. This research was supported by Korea Telecom Research and Development Group under Contract 96-22.
All Science Journal Classification (ASJC) codes
- Software
- Computer Vision and Pattern Recognition
- Computational Theory and Mathematics
- Artificial Intelligence
- Applied Mathematics