Analysis of Machine Learning Approaches to Packing Detection
Packing is a widely used obfuscation technique by which malware hides its content and behavior. Much research has explored how to detect packed programs through approaches as varied as entropy analysis, syntactic signatures, and, more recently, machine learning classifiers built on a range of features. Yet no robust results indicate which algorithms perform best or which features are most significant. Reviews of these results highlight how accuracy, cost, generalization capability, and other measures complicate evaluation. Our work addresses these deficiencies by assessing nine machine-learning approaches using 119 features to identify which features are most significant for packing detection, which algorithms offer the best performance, and which algorithms are most economical.
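
For context, the entropy analysis mentioned above is commonly based on the Shannon entropy of a file's bytes: compressed or encrypted content tends toward the maximum of 8 bits per byte, while ordinary code and data usually score lower. The following is a minimal sketch of that measure only, not one of the approaches evaluated in this work; the function name `byte_entropy` is a hypothetical choice made for illustration, and no decision threshold is implied.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0).

    Illustrative helper (hypothetical name): real detectors typically
    compute this per section or per sliding window, not per whole file.
    """
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value 0..255
    # H = sum over byte values of p * log2(1 / p), with p = count / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


if __name__ == "__main__":
    print(byte_entropy(b"AAAAAAAA"))        # constant data: lowest entropy
    print(byte_entropy(bytes(range(256))))  # uniform bytes: highest entropy
```

A single entropy value is only one candidate feature; it would be one input to a classifier alongside others, not a detector on its own.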