TY - GEN
T1 - Mining entity translations from comparable corpora
T2 - 20th ACM Conference on Information and Knowledge Management, CIKM'11
AU - Kim, Jinhan
AU - Jiang, Long
AU - Hwang, Seung Won
AU - Song, Young In
AU - Zhou, Ming
PY - 2011
Y1 - 2011
N2 - This paper addresses the problem of mining named entity translations from comparable corpora, specifically English and Chinese named entity translations. We first observe that existing approaches use one or more of the following named entity similarity metrics: entity, entity context, and relationship. Inspired by this observation, we propose a new holistic approach that (1) combines all similarity types used and (2) additionally considers relationship context similarity between pairs of named entities, a missing quadrant in the taxonomy of similarity metrics. We abstract the named entity translation problem as the matching of two named entity graphs extracted from the comparable corpora. Specifically, named entity graphs are first constructed from comparable corpora to extract relationships between named entities. Entity similarity and entity context similarity are then calculated for every pair of bilingual named entities. A reinforcing method is then used to reflect relationship similarity and relationship context similarity between named entities. According to our experimental results, our holistic graph-based approach significantly outperforms previous approaches.
UR - http://www.scopus.com/inward/record.url?scp=83055179414&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=83055179414&partnerID=8YFLogxK
U2 - 10.1145/2063576.2063764
DO - 10.1145/2063576.2063764
M3 - Conference contribution
AN - SCOPUS:83055179414
SN - 9781450307178
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 1295
EP - 1304
BT - CIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management
Y2 - 24 October 2011 through 28 October 2011
ER -