Explanatory model of dry eye disease using health and nutrition examinations: Machine learning and network-based factor analysis from a national survey

Sang Min Nam, Thomas A. Peterson, Atul J. Butte, Kyoung Yul Seo, Hyun Wook Han

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)


Background: Dry eye disease (DED) is a complex disease of the ocular surface, and its associated factors are important for understanding and effectively treating DED.

Objective: This study aimed to provide an integrative and personalized model of DED by making an explanatory model of DED using as many factors as possible from the Korea National Health and Nutrition Examination Survey (KNHANES) data.

Methods: Using KNHANES data for 2012 (4391 sample cases), a point-based scoring system was created for ranking factors associated with DED and assessing patient-specific DED risk. First, decision trees and lasso were used to classify continuous factors and to select important factors, respectively. Next, a survey-weighted multiple logistic regression was trained using these factors, and points were assigned using the regression coefficients. Finally, network graphs of partial correlations between factors were utilized to study the interrelatedness of DED-associated factors.

Results: The point-based model achieved an area under the curve of 0.70 (95% CI 0.61-0.78), and 13 of 78 factors considered were chosen. Important factors included sex (+9 points for women), corneal refractive surgery (+9 points), current depression (+7 points), cataract surgery (+7 points), stress (+6 points), age (54-66 years; +4 points), rhinitis (+4 points), lipid-lowering medication (+4 points), and intake of omega-3 (0.43%-0.65% kcal/day; −4 points). Among these, the age group 54 to 66 years had high centrality in the network, whereas omega-3 had low centrality.

Conclusions: Integrative understanding of DED was possible using the machine learning-based model and network-based factor analysis. This method for finding important risk factors and identifying patient-specific risk could be applied to other multifactorial diseases.
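As a rough illustration of how a point-based score of this kind is applied to one person, the following Python sketch sums the points reported in the abstract for whichever factors are present. It is not the authors' implementation: the dictionary keys and the function name `partial_ded_score` are hypothetical labels chosen for this example, only the 9 factors whose points appear in the abstract are included (the full model uses 13), and no risk threshold or probability mapping is assumed because none is given here.

```python
# Points as reported in the abstract (9 of the 13 selected factors).
# The key names are hypothetical labels invented for this sketch.
ABSTRACT_POINTS = {
    "female": 9,
    "corneal_refractive_surgery": 9,
    "current_depression": 7,
    "cataract_surgery": 7,
    "stress": 6,
    "age_54_to_66": 4,
    "rhinitis": 4,
    "lipid_lowering_medication": 4,
    "omega3_0_43_to_0_65_pct_kcal": -4,  # the one negative-point factor listed
}


def partial_ded_score(present_factors):
    """Sum the abstract's points for the factors present in one person.

    The result is only a partial score, because the abstract does not
    list the points for the remaining selected factors.
    """
    present = set(present_factors)  # ignore accidental duplicates
    unknown = present - set(ABSTRACT_POINTS)
    if unknown:
        raise KeyError(f"factors not listed in the abstract: {sorted(unknown)}")
    return sum(ABSTRACT_POINTS[name] for name in present)


if __name__ == "__main__":
    # Hypothetical person: a woman reporting stress and rhinitis, with
    # omega-3 intake in the 0.43%-0.65% kcal/day band.
    example = ["female", "stress", "rhinitis", "omega3_0_43_to_0_65_pct_kcal"]
    print(partial_ded_score(example))  # 9 + 6 + 4 - 4 = 15
```

Because the score is additive, each factor's contribution can be read directly from the table, which is what makes the model usable for ranking factors as well as for scoring individuals.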

Original language: English
Article number: e16153
Journal: JMIR Medical Informatics
Issue number: 2
Publication status: Published - 2020 Feb

Bibliographical note

Publisher Copyright:
© Sang Min Nam, Thomas A Peterson, Atul J Butte, Kyoung Yul Seo, Hyun Wook Han. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 20.02.2020. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.

All Science Journal Classification (ASJC) codes

  • Health Informatics
  • Health Information Management

