Predictive effects of structural variation on citation counts

Chaomei Chen

Research output: Contribution to journal, Article, peer-review

207 Citations (Scopus)


A critical part of scientific activity is to discern how a new idea is related to what we know and what may become possible. As new scientific publications arrive at a rate that rapidly outpaces our capacity for reading, analyzing, and synthesizing scientific knowledge, we need to augment ourselves with information that can effectively guide us through the rapidly growing intellectual space. In this article, we address a fundamental issue concerning what kinds of information may serve as early signs of potentially valuable ideas. In particular, we are interested in information that is routinely available and derivable upon the publication of a scientific paper, without assuming the availability of additional information such as its usage and citations. We propose a theoretical and computational model that predicts the potential of a scientific publication in terms of the degree to which it alters the intellectual structure of the state of the art. The structural variation approach focuses on the novel boundary-spanning connections introduced by a new article to the intellectual space. We validate the role of boundary-spanning in predicting future citations using three metrics of structural variation (modularity change rate, cluster linkage, and Centrality Divergence) along with more commonly studied predictors of citations such as the number of coauthors, the number of cited references, and the number of pages. Main effects of these factors are estimated for five cases using zero-inflated negative binomial regression models of citation counts.
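For readers unfamiliar with zero-inflated negative binomial (ZINB) models of count data, the following is a minimal sketch of the probability model, not the article's implementation. It assumes the common NB2 parameterization (mean `mu`, dispersion `alpha`), a single zero-inflation probability `pi`, and a log link; the abstract does not state these choices, and every function name and numeric value here is hypothetical.

```python
import math

def nb_log_pmf(k, mu, alpha):
    """Log-probability of count k under a negative binomial (NB2)
    with mean mu > 0 and dispersion alpha > 0 (variance mu + alpha * mu**2)."""
    r = 1.0 / alpha      # size parameter
    p = r / (r + mu)     # success probability, so the mean is r * (1 - p) / p = mu
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log1p(-p))

def zinb_pmf(k, mu, alpha, pi):
    """Zero-inflated NB: with probability pi the count is a structural zero
    (for example, a paper that is never cited); otherwise it follows the NB above."""
    nb = math.exp(nb_log_pmf(k, mu, alpha))
    return pi + (1.0 - pi) * nb if k == 0 else (1.0 - pi) * nb

def mean_count(intercept, coefs, predictors):
    """Log link: the NB mean is exp(intercept + sum of coef * predictor)."""
    return math.exp(intercept + sum(b * x for b, x in zip(coefs, predictors)))

def zinb_log_likelihood(counts, mus, alpha, pi):
    """Log-likelihood of observed counts given per-observation means."""
    return sum(math.log(zinb_pmf(k, mu, alpha, pi)) for k, mu in zip(counts, mus))

# Illustrative values only; these are NOT estimates from the article.
# Hypothetical predictors: a cluster linkage score and the number of coauthors.
mu = mean_count(0.5, [0.8, 0.05], [1.0, 4.0])
print(zinb_pmf(0, mu, alpha=0.5, pi=0.3))
print(zinb_log_likelihood([0, 2, 7], [mu, mu, mu], alpha=0.5, pi=0.3))
```

In a regression setting the coefficients, `alpha`, and `pi` would be chosen to maximize this log-likelihood; the zero-inflation component accounts for the excess of uncited papers that a plain negative binomial would underestimate.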
Key findings indicate that (a) structural variations measured by cluster linkage are a better predictor of citation counts than are the more commonly studied variables such as the number of references cited, (b) the number of coauthors and the number of references are also good predictors of global citation counts, though to a lesser extent, and (c) the Centrality Divergence metric is potentially valuable for detecting boundary-spanning activities at interdisciplinary levels. The structural variation approach offers a new way to monitor and discern the potential of newly published papers in context. The boundary-spanning mechanism offers a conceptually simplified and unifying explanation of the roles played by commonly studied extrinsic properties of a publication in the study of citation behavior.
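The intuition behind boundary-spanning can be sketched with a toy network. The example below computes Newman modularity for a fixed partition and reports the relative drop in modularity after one link is added. This is only one plausible reading of "modularity change rate"; the abstract does not give the formula, and the function names, the toy graph, and the sign convention are all assumptions made for illustration.

```python
from collections import defaultdict

def modularity(edges, cluster_of):
    """Newman modularity Q of an undirected, unweighted graph for a fixed partition:
    Q = sum over clusters of (internal_edges / m) - (cluster_degree_sum / (2 * m)) ** 2."""
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in the same cluster
    degree_sum = defaultdict(int)  # total degree of the nodes in each cluster
    for u, v in edges:
        degree_sum[cluster_of[u]] += 1
        degree_sum[cluster_of[v]] += 1
        if cluster_of[u] == cluster_of[v]:
            internal[cluster_of[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum)

def modularity_change_rate(edges, new_edges, cluster_of):
    """Relative drop in Q after adding new_edges, keeping the partition fixed.
    (Assumed definition, chosen for illustration.)"""
    base = modularity(edges, cluster_of)
    updated = modularity(edges + new_edges, cluster_of)
    return (base - updated) / base

# Toy baseline: two 4-node rings standing in for two clusters of cited references.
ring_a = [("a1", "a2"), ("a2", "a3"), ("a3", "a4"), ("a4", "a1")]
ring_b = [("b1", "b2"), ("b2", "b3"), ("b3", "b4"), ("b4", "b1")]
baseline = ring_a + ring_b
cluster_of = {n: n[0] for edge in baseline for n in edge}  # cluster label "a" or "b"

# A hypothetical new paper adds either a within-cluster or a boundary-spanning link.
within = modularity_change_rate(baseline, [("a1", "a3")], cluster_of)
spanning = modularity_change_rate(baseline, [("a1", "b1")], cluster_of)
print(f"within-cluster link: {within:.4f}")
print(f"boundary-spanning link: {spanning:.4f}")
```

In this toy case the link between clusters lowers modularity far more than the link inside a cluster, which is the kind of structural signal the approach treats as an early indicator of a paper's potential.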

Original language: English
Pages (from-to): 431-449
Number of pages: 19
Journal: Journal of the American Society for Information Science and Technology
Issue number: 3
Publication status: Published - 2012 Mar

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Networks and Communications
  • Artificial Intelligence

