Using Twitter to predict when vulnerabilities will be exploited

Haipeng Chen, Rui Liu, Noseong Park, V. S. Subrahmanian

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

25 Citations (Scopus)

Abstract

When a new cyber-vulnerability is detected, a Common Vulnerability and Exposure (CVE) number is attached to it. Malicious “exploits” may use these vulnerabilities to carry out attacks. Unlike prior work, which studies whether a CVE will be used in an exploit, we study the problem of predicting when an exploit is first seen. This is an important question for system administrators, who need to devote scarce resources to corrective action when a new vulnerability emerges. Moreover, past works assume that CVSS scores (released by NIST) are available for predictions, but we show that, on average, 49% of real-world exploits occur before CVSS scores are published. This means that past works, which use CVSS scores, miss almost half of the exploits. In this paper, we propose a novel framework that uses Twitter discussion to predict when a vulnerability will be exploited, without using CVSS score information. We introduce the unique concept of a family of CVE-Author-Tweet (CAT) graphs and build a novel set of features based on such graphs. We define recurrence relations capturing “hotness” of tweets, “expertise” of Twitter users on CVEs, and “availability” of information about CVEs, and prove that we can solve these recurrences via a fixed-point algorithm. Our second innovation adopts Hawkes processes to estimate the number of tweets/retweets related to the CVEs. Using the above two sets of novel features, we propose two ensemble forecast models, FEEU (for classification) and FRET (for regression), to predict when a CVE will be exploited. Compared with natural adaptations of past works (which predict if an exploit will be used), FEEU increases F1 score by 25.1%, while FRET decreases MAE by 37.2%.
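To make the Hawkes-process idea in the abstract concrete, here is a minimal sketch of how a self-exciting process can turn a history of tweet timestamps into an expected count of future tweets. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the kernel, the parameters, or how they are fitted. The exponential kernel, the function names, the parameter values, and the sample timestamps below are all hypothetical.

```python
import math

def intensity(t, events, mu, alpha, beta):
    """Hawkes intensity at time t with an exponential kernel (an assumption).

    mu    : baseline rate of new tweets
    alpha : jump in intensity caused by each past tweet
    beta  : decay rate of that excitation
    Events at or before t are counted, so this is the right-limit at t.
    """
    return mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events if ti <= t)

def expected_count(events, t_now, horizon, mu, alpha, beta):
    """Expected number of events in (t_now, t_now + horizon] given the history.

    For the exponential kernel the expected intensity m(s) obeys
    dm/ds = beta*mu - (beta - alpha)*m, which has a closed-form solution;
    integrating it over the horizon gives the expected count.
    """
    if not 0 <= alpha < beta:
        raise ValueError("need 0 <= alpha < beta for a stable process")
    rate = beta - alpha                       # net decay of expected intensity
    m0 = intensity(t_now, events, mu, alpha, beta)
    m_inf = beta * mu / rate                  # long-run expected intensity
    return m_inf * horizon + (m0 - m_inf) * (1 - math.exp(-rate * horizon)) / rate

if __name__ == "__main__":
    # Hypothetical tweet times (hours since a CVE was first mentioned).
    tweet_times = [0.0, 0.4, 0.5, 1.7, 1.8, 1.9]
    forecast = expected_count(tweet_times, t_now=2.0, horizon=24.0,
                              mu=0.1, alpha=0.6, beta=1.0)
    print(f"expected tweets in the next 24 hours: {forecast:.2f}")
```

A forecast of this kind could serve as one input feature alongside the graph-based ones; in practice mu, alpha, and beta would be estimated from data rather than set by hand.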

Original language: English
Title of host publication: KDD 2019 - Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 3143-3152
Number of pages: 10
ISBN (Electronic): 9781450362016
Publication status: Published - 2019 Jul 25
Event: 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2019 - Anchorage, United States
Duration: 2019 Aug 4 → 2019 Aug 8

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference: 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2019
Country/Territory: United States
City: Anchorage
Period: 19/8/4 → 19/8/8

Bibliographical note

Funding Information:
This work is supported by ONR grants N00014-18-1-2670 and N00014-16-1-2896 and ARO grant W911NF-13-1-0421.

Publisher Copyright:
© 2019 Association for Computing Machinery.

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
