Abstract
We present a method for detecting the novelty of a research paper. Because novelty in scholarly literature must also be assessed against the larger research community, we propose a network-based approach to feature extraction. Two graphs are introduced: a macro-level graph, in which authors and documents are the nodes, and a micro-level graph, in which keywords, topics, and words are the nodes. After a seed graph is constructed, papers are added incrementally, and the resulting changes in the graph are recorded as each paper's feature set. An autoencoder neural network is then used as the novelty detection model. The experimental results show that the commonly used text-based approach, TF-IDF features with a one-class SVM, is not suitable for detecting the novelty of a research paper. Among the constructed graphs, keyword-level graph features perform best when evaluated by regression analysis. We also combine the macro-level features, the micro-level features, and all features, and find that the combination of keyword, topic, and word features performs best under both regression and citation count analysis. Other factors that could affect citation counts, impact, and audience are also discussed.
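The incremental step described in the abstract can be illustrated with a minimal sketch. It assumes a keyword co-occurrence graph and three simple change counts (new nodes, new edges, reinforced edges); the abstract does not say which graph changes are recorded, so the class `KeywordGraph`, the feature names, and the toy papers are all hypothetical.

```python
from itertools import combinations


class KeywordGraph:
    """Undirected keyword co-occurrence graph (an assumed micro-level graph)."""

    def __init__(self):
        self.nodes = set()
        self.edges = {}  # frozenset({a, b}) -> co-occurrence count

    def add_paper(self, keywords):
        """Add one paper and return the graph changes it caused.

        The three counts are illustrative features, not the paper's feature set.
        """
        kws = sorted(set(keywords))
        new_nodes = sum(1 for k in kws if k not in self.nodes)
        new_edges = 0
        reinforced_edges = 0
        for a, b in combinations(kws, 2):
            edge = frozenset((a, b))
            if edge in self.edges:
                reinforced_edges += 1
            else:
                new_edges += 1
            self.edges[edge] = self.edges.get(edge, 0) + 1
        self.nodes.update(kws)
        return {
            "new_nodes": new_nodes,
            "new_edges": new_edges,
            "reinforced_edges": reinforced_edges,
        }


# Hypothetical seed corpus used to build the initial graph.
seed_papers = [
    ["graph", "novelty detection"],
    ["graph", "autoencoder"],
]

graph = KeywordGraph()
for paper_keywords in seed_papers:
    graph.add_paper(paper_keywords)

# Papers added after the seed graph; each call yields that paper's features.
familiar = graph.add_paper(["graph", "autoencoder"])
unusual = graph.add_paper(["graph", "citation analysis", "topic model"])

print(familiar)
print(unusual)
```

In this toy run the second paper introduces new keywords and new links, while the first only reinforces an existing link. Feature vectors of this kind would then be passed to the novelty detection model, which in the paper is an autoencoder; that step is omitted here.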
Original language | English |
---|---|
Pages (from-to) | 542-557 |
Number of pages | 16 |
Journal | Information Sciences |
Volume | 422 |
Publication status | Published - Jan 2018 |
Bibliographical note
Funding Information: This project is supported by Microsoft Research. This project acknowledges the role of the Microsoft Cognitive Service API and its contribution to the success of this research. This project is also partly supported by the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea (NRF-2015S1A3A2046711).
Publisher Copyright:
© 2017 Elsevier Inc.
All Science Journal Classification (ASJC) codes
- Software
- Information Systems and Management
- Artificial Intelligence
- Theoretical Computer Science
- Control and Systems Engineering
- Computer Science Applications