TY - GEN
T1 - CGM
T2 - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009
AU - Bleik, Said
AU - Song, Min
AU - Smalter, Aaron
AU - Huan, Jun
AU - Lushington, Gerald
PY - 2009
Y1 - 2009
N2 - Text categorization is used to organize and manage biomedical text databases that are growing at an exponential rate. The feature representation of documents is a crucial factor in the performance of text categorization. Most successful existing techniques use a vector representation based on key entities extracted from the text. In this paper, we investigate a new direction in which we represent a document as a graph. In this representation, we identify high-level concepts and build a rich graph structure that contains additional concepts and relationships. We then use graph kernel techniques to perform text categorization. The results show a significant improvement in accuracy compared to categorization based only on the extracted concepts.
AB - Text categorization is used to organize and manage biomedical text databases that are growing at an exponential rate. The feature representation of documents is a crucial factor in the performance of text categorization. Most successful existing techniques use a vector representation based on key entities extracted from the text. In this paper, we investigate a new direction in which we represent a document as a graph. In this representation, we identify high-level concepts and build a rich graph structure that contains additional concepts and relationships. We then use graph kernel techniques to perform text categorization. The results show a significant improvement in accuracy compared to categorization based only on the extracted concepts.
UR - http://www.scopus.com/inward/record.url?scp=72849119302&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72849119302&partnerID=8YFLogxK
U2 - 10.1109/BIBMW.2009.5332134
DO - 10.1109/BIBMW.2009.5332134
M3 - Conference contribution
AN - SCOPUS:72849119302
SN - 9781424451210
T3 - Proceedings - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009
SP - 38
EP - 43
BT - Proceedings - 2009 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2009
Y2 - 1 November 2009 through 4 November 2009
ER -