Evaluating Behavior Graph Reduction Strategies for Machine Learning-Based Malware Detection
Graph-based representations of program behavior are a powerful foundation for machine learning-based malware detection. However, the large size and complexity of these behavior graphs pose scalability challenges. This paper presents a systematic evaluation of five graph reduction strategies—covering both coarsening and sparsification—designed to simplify graphs while preserving meaningful behavioral features. Using dynamic analysis data from Windows PE32 binaries, we analyze the impact of these reductions on computational efficiency, detection performance, and model robustness against adversarial mimicry attacks. Our results show that several strategies substantially reduce graph size and extraction time without significant accuracy loss. We also find that coarsening based on API calls’ action maintains stronger robustness to adversarial manipulation.