TY - GEN
T1 - Extracting and mining protein-protein interaction network from biomedical literature
AU - Hu, Xiaohua
AU - Yoo, Illhoi
AU - Song, Il Yeol
AU - Song, Min
AU - Han, Jianchao
AU - Lechner, Mark
PY - 2004
Y1 - 2004
N2 - In this paper we present SPIE-DM (Scalable and Portable Information Extraction and Data Mining), a biomedical literature data mining system that extracts and mines the protein-protein interaction network from biomedical literature such as MedLine. SPIE-DM consists of two phases. In Phase 1, we develop a Scalable and Portable IE method (SPIE) to extract protein-protein interactions from the biomedical literature. The extracted protein-protein interactions form a scale-free network graph. In Phase 2, we apply a novel clustering method, SFCluster, to mine the protein-protein interaction network. The clusters in the network graph represent potential protein complexes, which are very important for biologists studying protein functionality. The clustering algorithm takes into account the characteristics of scale-free network graphs and is based on the local density of each vertex and its neighborhood functions, which can be used to find more meaningful clusters at different density levels. Experiments with SPIE-DM on around 1600 chromatin proteins indicate that our system is very promising for extracting and mining information from biomedical literature databases.
AB - In this paper we present SPIE-DM (Scalable and Portable Information Extraction and Data Mining), a biomedical literature data mining system that extracts and mines the protein-protein interaction network from biomedical literature such as MedLine. SPIE-DM consists of two phases. In Phase 1, we develop a Scalable and Portable IE method (SPIE) to extract protein-protein interactions from the biomedical literature. The extracted protein-protein interactions form a scale-free network graph. In Phase 2, we apply a novel clustering method, SFCluster, to mine the protein-protein interaction network. The clusters in the network graph represent potential protein complexes, which are very important for biologists studying protein functionality. The clustering algorithm takes into account the characteristics of scale-free network graphs and is based on the local density of each vertex and its neighborhood functions, which can be used to find more meaningful clusters at different density levels. Experiments with SPIE-DM on around 1600 chromatin proteins indicate that our system is very promising for extracting and mining information from biomedical literature databases.
UR - http://www.scopus.com/inward/record.url?scp=17044371130&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=17044371130&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:17044371130
SN - 0780387287
SN - 9780780387287
T3 - Proceedings of the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'04
SP - 244
EP - 251
BT - Proceedings of the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'04
T2 - Proceedings of the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB'04
Y2 - 7 October 2004 through 8 October 2004
ER -