# macro_fc

**Repository Path**: chensheng101/macro_fc

## Basic Information

- **Project Name**: macro_fc
- **Description**: Combine obfuscation features and suspicious keywords to detect malicious VBA macro
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-09-20
- **Last Updated**: 2023-09-20

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# macro_fc

## 1.Introduction
Current research concentrates on detecting malicious macros by examining either macro obfuscation or suspicious keywords, such as strings and functions. Both approaches have limitations. Obfuscation detection alone cannot distinguish benign macros from malicious ones, which leads to numerous false positives. Keyword-based detection relies on a set of identifiable suspicious keywords, but the VBA runtime offers a wide range of capabilities that give malware operators new ways to achieve their objectives, which poses a challenge to such models.

In this project, we combine obfuscation features and suspicious keywords in commonly used machine learning models, including random forest (RF), multilayer perceptron (MLP), support vector machine (SVM), and k-nearest neighbors (KNN), to detect malicious VBA macros. We analyze 77 obfuscation features from the attacker's point of view and identify 46 suspicious keywords in macros. Our method was evaluated on two datasets. Dataset1 is a public dataset containing 2939 benign samples and 13734 malicious samples. Dataset2 comprises 2885 new samples from VirusTotal, all of which were observed after the publication date of dataset1.
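As a rough illustration of what combining the two feature families can look like, the sketch below builds one feature vector per macro by concatenating a few obfuscation indicators with suspicious-keyword counts. The keyword list, the three obfuscation indicators, and all function names are assumptions made for this example; they are not the 77 features or 46 keywords used in this project, which are not listed in this README.

```python
import re

# Hypothetical keyword list for illustration only; not the project's 46 keywords.
SUSPICIOUS_KEYWORDS = ["Shell", "CreateObject", "AutoOpen", "URLDownloadToFile"]


def keyword_features(code):
    """Count case-insensitive whole-word occurrences of each keyword."""
    return [
        len(re.findall(r"\b" + re.escape(kw) + r"\b", code, flags=re.IGNORECASE))
        for kw in SUSPICIOUS_KEYWORDS
    ]


def obfuscation_features(code):
    """Compute three illustrative obfuscation indicators (assumed, not the project's)."""
    strings = re.findall(r'"([^"]*)"', code)
    # Number of '&' characters, a crude proxy for string concatenation.
    concat_ops = code.count("&")
    # Number of Chr(...) / ChrW(...) calls, often used to hide characters.
    chr_calls = len(re.findall(r"\bChrW?\$?\s*\(", code, flags=re.IGNORECASE))
    # Average length of string literals; heavy splitting tends to shorten them.
    avg_len = sum(len(s) for s in strings) / len(strings) if strings else 0.0
    return [concat_ops, chr_calls, avg_len]


def extract_features(code):
    """Concatenate obfuscation indicators and keyword counts into one vector."""
    return obfuscation_features(code) + keyword_features(code)


# A tiny made-up macro used only to exercise the functions.
sample = 'Sub AutoOpen()\n    x = "po" & "wer" & Chr(115)\n    Shell x & "hell"\nEnd Sub'
features = extract_features(sample)
print(features)
```

A vector like this could then be passed to any of the classifiers named above. Because the columns have different scales (counts versus average lengths), distance-based or margin-based models would likely need scaling first.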

Results demonstrate that our proposed method outperforms existing research and delivers more consistent results in detecting unknown samples. Furthermore, an ensemble of multiple classifiers with distinct feature selection further improves the detection rate.
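The ensemble idea can be sketched as majority voting over several classifiers, each of which sees a different subset of the feature columns. The example below is a minimal stand-in under stated assumptions: every ensemble member is a small pure-Python k-nearest-neighbors classifier (the project's notebooks use RF, MLP, SVM, and KNN), and the toy data, column meanings, and feature subsets are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, cols, k=3):
    """Label x by majority vote among its k nearest training rows,
    measuring Euclidean distance over the columns in `cols` only."""
    def dist(a, b):
        return math.sqrt(sum((a[c] - b[c]) ** 2 for c in cols))

    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def ensemble_predict(train_X, train_y, x, feature_subsets, k=3):
    """Majority vote over one KNN member per feature subset."""
    votes = [knn_predict(train_X, train_y, x, cols, k) for cols in feature_subsets]
    return Counter(votes).most_common(1)[0][0]


# Toy data (made up). Assumed columns:
# [concat_ops, chr_calls, shell_count, createobject_count]
train_X = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [9, 7, 1, 1],
    [8, 5, 2, 0],
    [7, 9, 1, 2],
]
train_y = [0, 0, 0, 1, 1, 1]  # 0 = benign, 1 = malicious

# Hypothetical subsets: obfuscation only, keywords only, and all features.
# An odd number of members avoids tied votes for two classes.
feature_subsets = [[0, 1], [2, 3], [0, 1, 2, 3]]

# Heavily obfuscated macro with no suspicious keywords.
query = [8, 6, 0, 0]
print(ensemble_predict(train_X, train_y, query, feature_subsets))
```

In this toy case the keyword-only member votes benign, because the query contains no suspicious keywords, while the two members that see the obfuscation columns vote malicious, so the ensemble labels the sample malicious. This is the kind of disagreement that combining both feature families is meant to resolve.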

## 2.Dataset 
* dataset1
    - ds1.xls
* dataset2
    - ds2.xls

## 3.Source code of models
* RF
    - RF.ipynb
* MLP
    - MLP.ipynb
* SVM
    - SVM_sca.ipynb
* KNN
    - KNN.ipynb