Tracking word semantic change in biomedical literature

Erjia Yan, Yongjun Zhu

Research output: Contribution to journal › Article › peer-review

16 Citations (Scopus)


Up to this point, research on written scholarly communication has focused primarily on syntactic, rather than semantic, analyses. Consequently, we have yet to understand semantic change as it applies to disciplinary discourse. The objective of this study is to illustrate word semantic change in biomedical literature. To that end, we identify a set of representative words in biomedical literature based on word frequency and word-topic probability distributions. A word2vec language model is then applied to the identified words in order to measure word- and topic-level semantic changes. We find that for the selected words in PubMed, overall, meanings are becoming more stable in the 2000s than they were in the 1980s and 1990s. At the topic level, the global distance of most topics (19 out of 20 tested) is declining, suggesting that the words used to discuss these topics are stabilizing semantically. Similarly, the local distance of most topics (19 out of 20) is also declining, showing that the meanings of words from these topics are becoming more consistent with those of their semantic neighbors. At the word level, this paper identifies two different trends in word semantics, as measured by the aforementioned distance metrics: on the one hand, words can form clusters with their semantic neighbors, and these words, as a cluster, coevolve semantically; on the other hand, words can drift apart from their semantic neighbors while nonetheless stabilizing in the global context. In relating our work to language laws on semantic change, we find no overwhelming evidence to support either the law of parallel change or the law of conformity.
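The abstract names two metrics, global distance and local distance, without defining them. As a rough illustration only, the sketch below assumes one common formulation: global distance is the cosine distance between a word's own vectors in two time periods (with the two embedding spaces already aligned), and local distance is the cosine distance between the word's similarity profiles over its nearest neighbors in each period. The function names, the toy two-dimensional vectors, and the neighborhood size `k` are all hypothetical; they are not taken from the paper, and real vectors would come from word2vec models trained on each period of PubMed text.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def global_distance(word, emb_old, emb_new):
    """Assumed definition: distance between a word's vectors in two periods.

    Only meaningful if the two embedding spaces have been aligned beforehand.
    """
    return cosine_distance(emb_old[word], emb_new[word])

def nearest_neighbors(word, emb, k):
    """Return the k words closest to `word` within one period's embeddings."""
    others = [w for w in emb if w != word]
    others.sort(key=lambda w: cosine_distance(emb[word], emb[w]))
    return others[:k]

def local_distance(word, emb_old, emb_new, k=2):
    """Assumed definition: change in a word's similarity to its neighbors.

    Builds one similarity profile per period over the union of the word's
    nearest neighbors in both periods, then compares the two profiles.
    """
    shared = set(emb_old) & set(emb_new)
    old = {w: emb_old[w] for w in shared}
    new = {w: emb_new[w] for w in shared}
    neighbors = sorted(
        set(nearest_neighbors(word, old, k)) | set(nearest_neighbors(word, new, k))
    )
    profile_old = [1.0 - cosine_distance(old[word], old[n]) for n in neighbors]
    profile_new = [1.0 - cosine_distance(new[word], new[n]) for n in neighbors]
    return cosine_distance(profile_old, profile_new)

# Made-up 2-D vectors for illustration; not data from the study.
emb_1980s = {
    "virus": [1.0, 0.0],
    "infection": [0.9, 0.1],
    "computer": [0.0, 1.0],
    "software": [0.1, 0.9],
}
emb_2000s = dict(emb_1980s, virus=[0.6, 0.8])  # only "virus" moves

if __name__ == "__main__":
    print("global:", global_distance("virus", emb_1980s, emb_2000s))
    print("local:", local_distance("virus", emb_1980s, emb_2000s))
```

Under these assumptions, a word whose vector and neighborhood are unchanged between periods scores near zero on both measures, while a word that moves toward a different set of neighbors scores higher on both. Averaging such scores over the words assigned to a topic would give a topic-level figure, though whether the study aggregates this way is not stated in the abstract.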

Original language: English
Pages (from-to): 76-86
Number of pages: 11
Journal: International Journal of Medical Informatics
Publication status: Published - 2018 Jan

Bibliographical note

Publisher Copyright:
© 2017 Elsevier B.V.

All Science Journal Classification (ASJC) codes

  • Health Informatics

