Mining entity translations from comparable corpora: A holistic graph mapping approach

Jinhan Kim, Long Jiang, Seung Won Hwang, Young In Song, Ming Zhou

Research output: Chapter in Book/Report/Conference proceedingConference contribution

12 Citations (Scopus)

Abstract

This paper addresses the problem of mining named entity translations from comparable corpora, specifically, mining English and Chinese named entity translation. We first observe that existing approaches use one or more of the following named entity similarity metrics: entity, entity context, and relationship. Inspired by this observation, in this paper, we propose a new holistic approach, by (1) combining all similarity types used and (2) additionally considering relationship context similarity between pairs of named entities, a missing quadrant in the taxonomy of similarity metrics. We abstract the named entity translation problem as the matching of two named entity graphs extracted from the comparable corpora. Specifically, named entity graphs are first constructed from comparable corpora to extract relationship between named entities. Entity similarity and entity context similarity are then calculated from every pair of bilingual named entities. A reinforcing method is utilized to reflect relationship similarity and relationship context similarity between named entities. According to our experimental results, our holistic graph-based approach significantly outperforms previous approaches.

Original languageEnglish
Title of host publicationCIKM'11 - Proceedings of the 2011 ACM International Conference on Information and Knowledge Management
Pages1295-1304
Number of pages10
DOIs
Publication statusPublished - 2011
Event20th ACM Conference on Information and Knowledge Management, CIKM'11 - Glasgow, United Kingdom
Duration: 2011 Oct 242011 Oct 28

Publication series

NameInternational Conference on Information and Knowledge Management, Proceedings

Other

Other20th ACM Conference on Information and Knowledge Management, CIKM'11
Country/TerritoryUnited Kingdom
CityGlasgow
Period11/10/2411/10/28

All Science Journal Classification (ASJC) codes

  • Decision Sciences(all)
  • Business, Management and Accounting(all)

Fingerprint

Dive into the research topics of 'Mining entity translations from comparable corpora: A holistic graph mapping approach'. Together they form a unique fingerprint.

Cite this