A method for obtaining rich data from PubMed using SVM

Junbum Cha, Jeongwoo Kim, Yunku Yeu, Sanghyun Park

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


As text mining advances rapidly in the biomedical field, the importance of text data is increasing. Most text data is obtained through a Medical Subject Headings (MeSH) term search; in this process, a large amount of valuable data is missed because it has not yet been indexed with MeSH terms. In this paper, we propose a method for obtaining text data beyond what a conventional MeSH term search returns. To obtain this additional data, we used a Support Vector Machine (SVM) to classify documents as related or unrelated. We evaluated the quality of the resulting data in a study of lung cancer, using a frequency-based text mining approach. We confirmed that the data extracted using our method provided as much valuable information as data found by searching with MeSH terms. Further, we found that including the additionally extracted data increased the amount of information found by 40%.
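The classification step described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the abstract does not state the features, kernel, training data, or SVM implementation the authors used. The sketch assumes bag-of-words counts, a linear SVM trained by plain subgradient descent on the hinge loss, and a toy training set in which documents already known to be related (for example, through MeSH indexing) are labeled +1 and unrelated ones -1. All function names and example texts are hypothetical.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [t for t in text.lower().split() if t.isalpha()]


def featurize(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary (assumed features)."""
    counts = Counter(tokenize(text))
    return [float(counts[w]) for w in vocab]


def score(w, b, x):
    """Signed decision value of the linear classifier for feature vector x."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50):
    """Linear SVM via subgradient descent on the L2-regularized hinge loss.

    Samples are visited in a fixed order, so training is deterministic.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * score(w, b, xi)
            # Regularization step: shrink the weights slightly.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge-loss step: only samples inside the margin move w and b.
            if margin < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b


# Hypothetical training set: +1 = related to lung cancer, -1 = unrelated.
train_docs = [
    ("lung cancer tumor egfr mutation therapy", 1),
    ("lung carcinoma smoking tumor risk", 1),
    ("soil bacteria nitrogen crop yield", -1),
    ("galaxy star formation telescope survey", -1),
]

vocab = sorted({tok for text, _ in train_docs for tok in tokenize(text)})
X = [featurize(text, vocab) for text, _ in train_docs]
y = [label for _, label in train_docs]
w, b = train_linear_svm(X, y)

# Hypothetical documents that a MeSH term search would not have returned.
unindexed_docs = [
    "egfr mutation in lung tumor patients",
    "crop yield under nitrogen stress",
]
predictions = [
    1 if score(w, b, featurize(text, vocab)) >= 0 else -1
    for text in unindexed_docs
]
print(predictions)
```

In practice a library SVM with TF-IDF features and a held-out validation set would likely replace this hand-written trainer; the point here is only the shape of the pipeline: label known documents, train a classifier, then apply it to documents the MeSH search missed.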

Original language: English
Title of host publication: 2016 Symposium on Applied Computing, SAC 2016
Publisher: Association for Computing Machinery
Number of pages: 3
ISBN (Electronic): 9781450337397
Publication status: Published - 2016 Apr 4
Event: 31st Annual ACM Symposium on Applied Computing, SAC 2016 - Pisa, Italy
Duration: 2016 Apr 4 → 2016 Apr 8

Publication series

Name: Proceedings of the ACM Symposium on Applied Computing


Other: 31st Annual ACM Symposium on Applied Computing, SAC 2016

Bibliographical note

Publisher Copyright:
© 2016 ACM.

All Science Journal Classification (ASJC) codes

  • Software

