Text categorization of biomedical data sets using graph kernels and a controlled vocabulary

Said Bleik, Meenakshi Mishra, Jun Huan, Min Song

Research output: Contribution to journal › Article › peer-review

28 Citations (Scopus)


Recently, graph representations of text have been showing improved performance over conventional bag-of-words representations in text categorization applications. In this paper, we present a graph-based representation for biomedical articles and use graph kernels to classify those articles into high-level categories. In our representation, common biomedical concepts and semantic relationships are identified with the help of an existing ontology and are used to build a rich graph structure that provides a consistent feature set and preserves additional semantic information that could improve a classifier's performance. We attempt to classify the graphs using both a set-based graph kernel that is capable of dealing with the disconnected nature of the graphs and a simple linear kernel. Finally, we report the results comparing the classification performance of the kernel classifiers to common text-based classifiers.

Original language: English
Article number: 6475935
Pages (from-to): 1211-1217
Number of pages: 7
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Issue number: 5
Publication status: Published - 2013 Sept

All Science Journal Classification (ASJC) codes

  • Biotechnology
  • Genetics
  • Applied Mathematics

