JISE

Journal of Information Science and Engineering, Vol. 38 No. 6, pp. 1317-1334

Learning Dynamic Malware Representation from Common Behavior

YI-TING HUANG¹,⁺, TING-YI CHEN², SHUN-WEN HSIAO³,
AND YEALI S. SUN²
¹Research Center for Information Technology Innovation
Academia Sinica
Taipei, 115 Taiwan
²Information Management
National Taiwan University
Taipei, 106 Taiwan
³Management Information Systems
National Chengchi University
Taipei, 116 Taiwan
E-mail: ythuang@iis.sinica.edu.tw⁺; r06725035@g.ntu.edu.tw;
hsiaom@nccu.edu.tw; sunny@ntu.edu.tw

Malware analysis has been extensively investigated as the number and types of malware have increased dramatically. However, most previous studies use end-to-end systems to detect whether a sample is malicious or to identify its malware family. In this paper, we introduce a framework composed of two components, RasMMA and RasNN, that accounts for the common characteristics within a family. RasMMA extracts the common behaviors of malware, while RasNN is designed to pretrain a composition of these common behaviors as a malware representation. Unlike end-to-end models, the pretrained malware representation can be fine-tuned with one additional output layer and applied to other malware tasks, such as family classification. We conduct broad experiments to determine the influence of individual framework components and the feasibility of a task-specific extension model. The results show that the proposed framework outperforms the baselines and demonstrate that the learned malware representation can be applied to other cybersecurity applications and outperform the existing system.

Keywords: deep learning, dynamic analysis, malware behavior analysis, malware family classification, malware representation
