A Vision Transformer Enhanced with Patch Encoding for Malware Classification

Kyoung Won Park, Sung Bae Cho

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)


As software technology has advanced and brought various benefits, malicious attacks aimed at stealing confidential and corporate information have steadily increased. Recent deep learning models that operate on images converted from malicious code achieve meaningful results, but they have difficulty classifying samples of malware families such as Ramnit, Tracur, and Obfuscator.ACY, which have similar structures in the image. Rather than observing only the overall global features, a method is needed that considers the positions of local features and learns the relationships between them. In this paper, we propose a vision transformer enhanced with an additional encoding of multiple patches that captures the location information of local features and the relationship information between them. To learn from the position information and all relationships between patches, [CLS] tokens that can summarize all of this information are utilized. 10-fold cross-validation on the Microsoft challenge dataset shows that the proposed model achieves better accuracy than comparable studies. The misclassification analysis confirms that the proposed method can detect samples of the same malware family that slip past the conventional deep learning model. Additional analysis with the activation map highlights which structural and sequential features are extracted to detect different codes belonging to the same malware family.
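The input pipeline the abstract describes (malicious code converted to an image, split into patches, position information attached, and a [CLS] token used as a summary slot) can be illustrated with a minimal, generic ViT-style sketch. This is not the authors' implementation: the byte-per-pixel conversion, the image width, the patch size, the embedding dimension, and all function names below are assumptions chosen for illustration, and the randomly initialised vectors stand in for parameters that would be learned in a real model.

```python
import random


def bytes_to_image(data, width):
    """Lay raw bytes out row by row as a grayscale image (one byte = one pixel).

    Assumption: byte-per-pixel conversion; the last row is zero-padded.
    """
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row += [0] * (width - len(row))
        rows.append(row)
    return rows


def split_into_patches(image, patch):
    """Cut the image into non-overlapping patch x patch tiles, each flattened.

    Rows and columns that do not fill a whole patch are discarded.
    Pixel values are scaled to [0, 1].
    """
    height = len(image) // patch * patch
    width = len(image[0]) // patch * patch
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            flat = [
                image[top + dy][left + dx] / 255.0
                for dy in range(patch)
                for dx in range(patch)
            ]
            patches.append(flat)
    return patches


def build_token_sequence(patches, dim, seed=0):
    """Project patches to `dim`, prepend a [CLS] token, add position embeddings.

    The projection, [CLS] vector, and position embeddings are random
    stand-ins for learned parameters (hypothetical values).
    """
    rng = random.Random(seed)
    patch_len = len(patches[0])
    proj = [[rng.gauss(0, 0.02) for _ in range(dim)] for _ in range(patch_len)]
    cls_token = [rng.gauss(0, 0.02) for _ in range(dim)]
    pos = [[rng.gauss(0, 0.02) for _ in range(dim)]
           for _ in range(len(patches) + 1)]

    tokens = [cls_token]
    for p in patches:
        # Linear projection of one flattened patch into the embedding space.
        tokens.append(
            [sum(p[i] * proj[i][j] for i in range(patch_len)) for j in range(dim)]
        )
    # Position index 0 belongs to [CLS]; indices 1..N follow raster patch order.
    return [[t + e for t, e in zip(tok, emb)] for tok, emb in zip(tokens, pos)]
```

For example, `build_token_sequence(split_into_patches(bytes_to_image(data, 16), 4), 8)` would yield one [CLS] token followed by one token per patch. In a full model this sequence would pass through transformer encoder layers, and the output at the [CLS] position would feed a classifier over malware families.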

Original language: English
Title of host publication: Intelligent Data Engineering and Automated Learning – IDEAL 2022 - 23rd International Conference, IDEAL 2022, Proceedings
Editors: Hujun Yin, David Camacho, Peter Tino
Publisher: Springer Science and Business Media Deutschland GmbH
Number of pages: 11
ISBN (Print): 9783031217524
Publication status: Published - 2022
Event: 23rd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2022 - Manchester, United Kingdom
Duration: 2022 Nov 24 → 2022 Nov 26

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13756 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 23rd International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2022
Country/Territory: United Kingdom

Bibliographical note

Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • General Computer Science

