# KNighter
**Repository Path**: wsljy2021/KNighter
## Basic Information
- **Project Name**: KNighter
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: Apache-2.0
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-11-24
- **Last Updated**: 2025-11-24
## README
# KNighter: Transforming Static Analysis with LLM-Synthesized Checkers

## Table of Contents
- [About](#about)
- [Getting Started](#getting-started)
  - [Docker Setup (Recommended)](#docker-setup-recommended)
  - [Manual Environment Setup (Alternative)](#manual-environment-setup-alternative)
- [Running KNighter](#running-knighter)
- [Architecture Documentation](#architecture-documentation)
## About
**KNighter** is a checker synthesis tool that uses Large Language Models (LLMs) to automatically generate static analysis checkers from historical patch commits.
### Key Features
- **🤖 LLM-Powered Generation**: Automatically synthesizes static analysis checkers using state-of-the-art language models
- **📊 Multi-step Pipeline**: Employs a sophisticated generation → refinement → triage workflow for high-quality results
- **🔍 Historical Learning**: Learns from real-world patch commits to understand common bug patterns
- **⚡ LLVM Integration**: Built on top of LLVM for robust static analysis capabilities
- **🐧 Linux Kernel Focus**: Specialized for finding bugs in large-scale C/C++ codebases like the Linux kernel
The detected bugs 🐛 can be found [here](https://docs.google.com/spreadsheets/d/1WzUhbUK0JE9QahywsfEBGEy94-o5F1A0F921ljyPuJk/edit?usp=sharing).
> [!IMPORTANT]
> We are continuously improving the documentation and adding new features. Please stay tuned for updates.
## Getting Started
### Docker Setup (Recommended)
**🐳 Docker Installation Options**
#### Option 1: Docker Hub (Recommended)
```bash
docker pull knighterhub/knighter
```
#### Option 2: Build from Source
```bash
git clone https://github.com/ise-uiuc/KNighter.git KNighter
cd KNighter
docker build -t knighter .
```
**🚀 Running the Container**
```bash
# If you pulled the image from Docker Hub
docker run -it knighterhub/knighter
# If you built the image from source
docker run -it knighter
```
**⚙️ Environment Initialization**

When running the container for the first time, initialize the environment:
```bash
cd /app
# This takes a while: it downloads the dependencies and compiles LLVM
python3 scripts/init_docker.py
```
This downloads LLVM and Linux kernel source code into `/data/llvm` and `/data/linux`.
**API Key Configuration:**
```bash
echo 'openai_key: "YOUR_OPENAI_API_KEY"' > /app/llm_keys.yaml
```
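The `echo` command above leaves the key in your shell history. As a hedged alternative, the sketch below writes the same one-line `llm_keys.yaml` from Python and restricts the file to the current user. The `write_key_file` helper and the `OPENAI_API_KEY` environment variable mentioned in the usage note are assumptions for illustration, not part of KNighter.

```python
import os
from pathlib import Path


def write_key_file(path, openai_key):
    """Write a one-line llm_keys.yaml that only the current user can read."""
    if not openai_key:
        raise ValueError("empty API key")
    path = Path(path)
    # Mode 0o600 is applied when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(f'openai_key: "{openai_key}"\n')
    # Tighten permissions as well in case the file already existed.
    os.chmod(path, 0o600)
    return path
```

Inside the container this could be called as `write_key_file("/app/llm_keys.yaml", os.environ["OPENAI_API_KEY"])`, assuming you exported the key under that variable name first.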
### Manual Environment Setup (Alternative)
> **Note**: For detailed setup steps, refer to `scripts/init_docker.py` which contains the complete initialization process.
**🔧 Manual Installation Steps**

**Step 1: Install Dependencies**
Download and build [LLVM-18.1.8](https://github.com/llvm/llvm-project/releases/tag/llvmorg-18.1.8):
```sh
wget https://github.com/llvm/llvm-project/archive/refs/tags/llvmorg-18.1.8.zip
unzip llvmorg-18.1.8.zip
```
Clone the Linux kernel source code:
```sh
git clone https://github.com/torvalds/linux.git
```
Install Python dependencies:
```sh
# Option 1: Using uv (recommended for faster installs)
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.cargo/env
uv pip install -r requirements.txt

# Option 2: Using regular pip
pip3 install -r requirements.txt

# Fetch the git submodules (run this with either option)
git submodule update --init --recursive
```
**Step 2: Configuration Files**
Set up your `config.yaml` (see `scripts/init_docker.py` for reference):
```yaml
result_dir: "result-checkers"
LLVM_dir: "/PATH/TO/LLVM_DIR"
checker_nums: 10
linux_dir: "/PATH/TO/LINUX_DIR"
key_file: "llm_keys.yaml"
model: "o3-mini"
```
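Before running the pipeline, it can help to confirm that `config.yaml` contains every key shown above and that the two directories exist. The sketch below is a minimal check, assuming the file stays a flat list of `key: value` pairs. `load_flat_config` and `check_config` are hypothetical helpers, and the parser is deliberately not a general YAML parser.

```python
from pathlib import Path

# Keys taken from the example config.yaml above.
REQUIRED_KEYS = ("result_dir", "LLVM_dir", "checker_nums", "linux_dir", "key_file", "model")


def load_flat_config(text):
    """Parse flat `key: value` lines; blank lines and # comments are skipped.

    All values are returned as strings with surrounding double quotes removed.
    """
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split at the first colon only, so values such as URLs stay intact.
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip().strip('"')
    return config


def check_config(config):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in config]
    for key in ("LLVM_dir", "linux_dir"):
        if key in config and not Path(config[key]).is_dir():
            problems.append(f"{key} is not a directory: {config[key]}")
    return problems
```

A typical call would be `check_config(load_flat_config(Path("config.yaml").read_text()))`, printing whatever the returned list contains.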
Set up the `llm_keys.yaml` file (see `llm_keys_example.yaml` for reference):
```yaml
openai_key: "sk-..."
claude_key: "sk-ant-..."
google_key: "AIza..."
deepseek_key: "sk-..."
# For local models (optional)
# In config, use "local:model_name" format to use local models
# Like "local:openai/gpt-oss-120b"
base_url: "http://localhost:8000/v1"
api_key: "dummy"
```
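The comments above describe a `local:model_name` convention for the `model` field. As an illustration only, such a string could be split as shown below. `split_model_name` is a hypothetical helper and says nothing about how KNighter itself parses the value.

```python
LOCAL_PREFIX = "local:"


def split_model_name(model):
    """Return (is_local, name) for a `model` config value."""
    if model.startswith(LOCAL_PREFIX):
        name = model[len(LOCAL_PREFIX):]
        if not name:
            raise ValueError("'local:' must be followed by a model name")
        return True, name
    # Anything without the prefix is treated as a hosted model name.
    return False, model
```

With this sketch, `split_model_name("local:openai/gpt-oss-120b")` yields `(True, "openai/gpt-oss-120b")`, while `split_model_name("o3-mini")` yields `(False, "o3-mini")`.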
**Step 3: LLVM Setup**
```sh
python3 scripts/setup_llvm.py LLVM_PATH
```
## Running KNighter
### Quick Start (Docker)
For rapid evaluation, use the debug dataset:
```bash
cd /app/src
# Step 1: Generate checkers for debug commits
python3 main.py gen --config_file /app/config-generate.yaml --commit_file=/app/commits/commits-debug.txt
# Step 2: Refine generated checkers
python3 main.py refine --config_file /app/config-refine-debug.yaml /app/result-generate
# Step 3: Triage and analyze results
python3 main.py triage --config_file /app/config-triage-debug.yaml /app/result-refine-debug
```
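The three steps above depend on each other, so a later step is only meaningful if the earlier one succeeded. A minimal sketch of a driver that runs them in order and stops at the first failure follows. The command lines are copied from the quick start, while `PIPELINE` and `run_pipeline` are hypothetical names introduced here.

```python
import subprocess

# Commands copied from the quick start above; they are run from /app/src.
PIPELINE = [
    ["python3", "main.py", "gen", "--config_file", "/app/config-generate.yaml",
     "--commit_file=/app/commits/commits-debug.txt"],
    ["python3", "main.py", "refine", "--config_file", "/app/config-refine-debug.yaml",
     "/app/result-generate"],
    ["python3", "main.py", "triage", "--config_file", "/app/config-triage-debug.yaml",
     "/app/result-refine-debug"],
]


def run_pipeline(commands, cwd="/app/src", runner=subprocess.run):
    """Run each command in order and return how many were run.

    With the default runner, check=True raises CalledProcessError on the
    first non-zero exit status, so later steps are skipped after a failure.
    """
    for argv in commands:
        runner(argv, cwd=cwd, check=True)
    return len(commands)
```

Inside the container, `run_pipeline(PIPELINE)` would execute all three steps. The `runner` parameter exists only so the function can be exercised without launching real processes.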
**📋 Pipeline Modes & Usage**

**Available Operation Modes:**
| Mode | Purpose | Description |
|------|---------|-------------|
| `gen` | Generation | Generate new checkers from commit patches |
| `refine` | Refinement | Improve and validate generated checkers |
| `scan` | Scanning | Scan the kernel with validated checkers |
| `triage` | Analysis | Analyze and categorize scan results |
**Basic Usage (Manual Setup):**
```bash
cd src
python3 main.py <mode> --commit_file=<commit_file> --config_file=<config_file>
```
**Example:**
```bash
python3 main.py gen --commit_file=../commits/commits-selected.txt --config_file=config.yaml
```
**⚙️ Configuration Files**

| File | Purpose | Key Parameters |
|------|---------|----------------|
| `config-generate.yaml` | Checker generation | `model`, `checker_nums`, `result_dir` |
| `config-refine.yaml` | Refinement process | `jobs`, `scan_timeout`, `scan_commit` |
| `config-triage.yaml` | Result analysis | Analysis parameters |
Modify these files to experiment with different parameters from the paper evaluation.
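One way to explore parameters is to generate several variants of a base configuration and run each in turn. The sketch below assumes the flat `key: value` layout used by `config.yaml`. `render_flat_yaml` and `make_variants` are hypothetical helpers, and the values in the usage note are placeholders rather than settings from the paper.

```python
import copy


def render_flat_yaml(config):
    """Render a flat dict as `key: value` lines, double-quoting string values."""
    lines = []
    for key, value in config.items():
        rendered = f'"{value}"' if isinstance(value, str) else str(value)
        lines.append(f"{key}: {rendered}")
    return "\n".join(lines) + "\n"


def make_variants(base, key, values):
    """Yield (suffix, config) pairs, each overriding one key of the base config."""
    for value in values:
        variant = copy.deepcopy(base)
        variant[key] = value
        # Suffix result_dir so that runs do not overwrite each other's output.
        variant["result_dir"] = f'{base["result_dir"]}-{key}-{value}'
        yield f"{key}-{value}", variant
```

For example, `make_variants(base, "checker_nums", [5, 20])` produces two configurations whose `result_dir` values end in `-checker_nums-5` and `-checker_nums-20`. Each can be written to its own file with `render_flat_yaml` and passed to `main.py` through `--config_file`.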
## Architecture Documentation
**🏗️ System Architecture Overview**

KNighter implements a multi-stage pipeline for automated checker synthesis:
1. **Commit Analysis**: Extract bug patterns from historical patches
2. **Checker Generation**: Use LLMs to synthesize static analysis checkers
3. **Refinement**: Validate and improve generated checkers through compilation and testing
4. **Deployment**: Apply refined checkers to target codebases
5. **Triage**: Analyze and categorize detected issues
For comprehensive architecture documentation, see [`ARCHITECTURE.md`](ARCHITECTURE.md).
---
**Citation**: If you use KNighter in your research, please cite our paper:
```bibtex
@inproceedings{knighter,
  title = {KNighter: Transforming Static Analysis with LLM-Synthesized Checkers},
  author = {Yang, Chenyuan and Zhao, Zijie and Xie, Zichen and Li, Haoyu and Zhang, Lingming},
  year = {2025},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3731569.3764827},
  doi = {10.1145/3731569.3764827},
  booktitle = {Proceedings of the ACM SIGOPS 31st Symposium on Operating Systems Principles},
  location = {Seoul, Republic of Korea},
  series = {SOSP '25}
}
```