An extension of regression trees to generate better predictive models

Hyunjoong Kim, Frank M. Guess, Timothy M. Young

Research output: Contribution to journal > Article > peer-review

3 Citations (Scopus)


For situations where the data are drawn from reasonably homogeneous populations, traditional methods such as multiple regression typically yield insightful analyses. For situations where the data are drawn from more heterogeneous populations, decision tree approaches, such as Classification and Regression Trees (CART) and Generalized, Unbiased, Interaction, Detection, and Estimation (GUIDE), are more likely to recognize idiosyncratic subpopulations and interactions automatically. In contrast to CART, however, GUIDE yields models with better predictive performance for each subpopulation. This article extends the idea of GUIDE to handle analysis of covariance-type problems. This article compares GUIDE modeling to various decision tree methods and to multiple regression. The article identifies and discusses the relative advantages and disadvantages of multiple regression, CART, and GUIDE. GUIDE produces quality or reliability models that exhibit greater predictive accuracy than multiple regression or CART for complex, highly diverse populations. Also, GUIDE is readily applicable to many other areas, such as repairability and maintainability settings involving both qualitative and quantitative variables. A small case study of an engineered wood product, medium-density fiberboard, is presented to illustrate the application of GUIDE. Accepted in 2005 for a special issue on Reliability co-edited by Hoang Pham, Rutgers University; Dong Ho Park, Hallym University, Korea; and Richard Cassady, University of Arkansas.

Original language: English
Pages (from-to): 43-54
Number of pages: 12
Journal: IIE Transactions (Institute of Industrial Engineers)
Issue number: 1
Publication status: Published - 2012 Jan 1

Bibliographical note

Funding Information:
Hyunjoong Kim is a Full Professor in the Department of Applied Statistics at Yonsei University, South Korea. He earned a Ph.D. in Statistics from the University of Wisconsin-Madison. He received the Best Teaching Award at Yonsei University in 2006 and 2007 and the Professional Development Award at the University of Tennessee in 2002. He has held grants from the Korea Science and Engineering Foundation, the Korea Research Foundation, the National Center for Health Statistics, and the College of Business Administration at the University of Tennessee. He is on the Editorial Board of The Korean Communication in Statistics. He has been a Program Committee Member for the Conference on Korean Data Mining Society, November 2007; Asian Institute in Statistical Genetics, July 2005; and The New Frontiers of Statistical Data Mining, Knowledge Discovery, and E-Business, June 2002. He has published numerous articles in statistics, industrial engineering, and data mining journals, including Journal of the American Statistical Association, Journal of Computational and Graphical Statistics, IIE Transactions, Journal of Statistical Computation and Simulation, International Journal of Industrial Engineering, and Statistical Data Mining and Knowledge Discovery. His research interests lie in data mining, tree-based statistical modeling, and statistical computing.

Funding Information:
Hyunjoong Kim’s work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science, and Technology (2010-0008769). Frank M. Guess is grateful for funding from the competitive Scholarly Research Grant Program of the College of Business Administration at the University of Tennessee. Timothy M. Young appreciates funding from the United States Department of Agriculture, Special Wood Utilization Research Grants, which is administered by the University of Tennessee Agricultural Experiment Station and the Tennessee Forest Products Center.

All Science Journal Classification (ASJC) codes

  • Industrial and Manufacturing Engineering

