Suicidality Detection on Social Media Using Metadata and Text Feature Extraction and Machine Learning

Woojin Jung, Donghun Kim, Seojin Nam, Yongjun Zhu

Research output: Contribution to journal › Article › peer-review


In this study, we implemented machine learning models that detect suicidality posts on Twitter. We randomly selected and annotated 20,000 tweets and explored metadata and text features to build effective models. Metadata features were studied in detail to understand their potential and importance in suicidality detection models. Results showed that posting type (i.e., reply or not) and time-related features such as the month, the day of the week, and the time of day (AM vs. PM) were the most important metadata features in suicidality detection models. Specifically, a social media post is more likely to indicate suicidality if it is a reply to other users rather than an original tweet. Moreover, tweets created in the afternoon, on Fridays and weekends, and in fall have higher probabilities of being detected as suicidality tweets than those created at other times. By integrating metadata and text features, we obtained a model with good performance (F1 score of 0.846) that can assist humans in detecting suicidality posts on social media in real-world settings.
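
As a rough illustration of the metadata features named in the abstract (posting type, month, day of the week, AM vs. PM), the sketch below derives them from a post's timestamp and reply field. It is not the authors' implementation: the function names, the `in_reply_to` argument, the `is_weekend` flag, and the Northern Hemisphere month-to-season mapping are all assumptions made for this example, and the abstract does not say how the study encoded these features.

```python
from datetime import datetime
from typing import Optional

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def season_of(month: int) -> str:
    """Map a month number (1-12) to a season.

    Assumption: Northern Hemisphere meteorological seasons.
    """
    if month in (12, 1, 2):
        return "winter"
    if month in (3, 4, 5):
        return "spring"
    if month in (6, 7, 8):
        return "summer"
    return "fall"


def metadata_features(created_at: datetime, in_reply_to: Optional[str]) -> dict:
    """Build a metadata feature dictionary for one post.

    `in_reply_to` is a hypothetical field holding the ID of the post
    being replied to, or None for an original post.
    """
    return {
        "is_reply": in_reply_to is not None,            # posting type
        "month": created_at.month,                      # 1-12
        "season": season_of(created_at.month),
        "day_of_week": DAY_NAMES[created_at.weekday()],  # Monday is index 0
        "is_weekend": created_at.weekday() >= 5,        # Saturday or Sunday
        "is_pm": created_at.hour >= 12,                 # AM vs. PM
    }


if __name__ == "__main__":
    # A reply posted on the afternoon of Friday, 9 October 2020.
    print(metadata_features(datetime(2020, 10, 9, 15, 30), "12345"))
```

A dictionary like this could be concatenated with text features before being passed to a classifier; which text features and which classifier to use are outside what the abstract specifies.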

Original language: English
Pages (from-to): 13-28
Number of pages: 16
Journal: Archives of Suicide Research
Issue number: 1
Publication status: Published - 2023

Bibliographical note

Publisher Copyright:
© 2021 International Academy for Suicide Research.

All Science Journal Classification (ASJC) codes

  • Clinical Psychology
  • Psychiatry and Mental health

