diff --git a/dllm-feature-introduce.md b/dllm-feature-introduce.md
new file mode 100644
index 0000000000000000000000000000000000000000..ee70a1be9a93f82331e2f6d48dcf390ecc9de85f
--- /dev/null
+++ b/dllm-feature-introduce.md
@@ -0,0 +1,105 @@
+# dllm
+
+`dllm` stands for "distributed LLM" and aims to provide better tooling for distributed vLLM serving.
+
+## Build guide
+
+> **TL;DR**
+>
+> ```bash
+> yum install python3-pip gcc g++ make cmake spdlog-devel -y
+> pip install --upgrade pip
+> pip install --upgrade wheel setuptools ninja pybind11 chariot-ds
+>
+> python3 setup.py bdist_wheel
+> ```
+
+### Build requirements
+
+**Build tools**
+
+* `gcc/g++/make/cmake`: can be installed with `yum install gcc g++ make cmake -y`
+* `ninja`: can be installed with `pip install ninja`
+* `python/pip`: can be installed with `yum install python3-pip -y; pip install --upgrade pip`
+* `wheel/setuptools`: can be installed with `pip install --upgrade wheel setuptools`
+
+> NOTE: Upgrading setuptools is necessary on most operating systems.
+
+**Dependencies**
+
+* `spdlog`: can be installed with `yum install spdlog-devel -y`
+* `pybind11`: can be installed with `pip install pybind11`
+* `chariot-ds`: can be installed with `pip install chariot-ds`
+* `Ascend CANN`: see https://www.hiascend.com/software/cann for installation instructions
+
+### Build command
+
+```bash
+bash build.sh
+# or: python3 setup.py bdist_wheel
+```
+
+## Install guide
+
+```bash
+pip install dist/dllm-*.whl
+```
+
+## Use guide
+
+### Deploy dependencies
+
+> NOTE: After deploying chariot-ds, set the environment variable `DS_WORKER_ADDR="{IP}:{PORT}"` on each node before starting Ray.
+
+1. chariot-ds: follow https://pypi.org/project/chariot-ds/
+2. Ray: follow https://docs.ray.io/en/latest/cluster/vms/user-guides/launching-clusters/on-premises.html#on-prem
+
+### Deploy dllm
+
+Taking vllm-mindspore as an example, suppose you deploy:
+
+* 1 prefill instance with parallel config [TP: 4, DP: 4, EP: 16]
+* 1 decode instance with parallel config [TP: 4, DP: 4, EP: 16]
+
+The deploy command looks like this:
+
+```bash
+dllm deploy \
+    --prefill-instances-num=1 \
+    --decode-instances-num=1 \
+    -ptp=4 -dtp=4 -pdp=4 -ddp=4 -pep=16 -dep=16 \
+    --prefill-startup-params="vllm-mindspore serve --model=/workspace/models/qwen2.5_7B --trust_remote_code --max-num-seqs=256 --max_model_len=1024 --max-num-batched-tokens=1024 --block-size=128 --gpu-memory-utilization=0.93" \
+    --decode-startup-params="vllm-mindspore serve --model=/workspace/models/qwen2.5_7B --trust_remote_code --max-num-seqs=256 --max_model_len=1024 --max-num-batched-tokens=1024 --block-size=128 --gpu-memory-utilization=0.93"
+```
+
+After a successful deployment, `localhost:8000` serves a fully OpenAI-compatible API endpoint:
+
+```bash
+curl -X POST "http://127.0.0.1:8000/v1/completions" -H "Content-Type: application/json" -H "Authorization: Bearer YOUR_API_KEY" -d '{
+  "model": "/workspace/models/qwen2.5_7B",
+  "prompt": "Alice is ",
+  "max_tokens": 50,
+  "temperature": 0
+}'
+```
+
+### Enable KV cache protection
+
+To prevent private data leakage, dllm supports KV cache protection by encrypting KV cache data while it is transmitted between prefill and decode instances in a PD-disaggregated deployment.
+
+KV cache data is encrypted by sec-mask in parallel with inference to improve encryption performance.
+
+To enable KV cache protection, set the environment variable `ENABLE_KVC_PROTECT=True` **before starting Ray**, as in the sketch below.
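+
+A minimal sketch of the per-node setup (the IP and port are placeholders; the `ray start` commands follow the Ray on-premises guide linked above):
+
+```bash
+# Run on every node before starting Ray.
+export DS_WORKER_ADDR="10.0.0.1:9000"    # chariot-ds worker address (placeholder IP:PORT)
+export ENABLE_KVC_PROTECT=True           # encrypt KV cache in transit between prefill and decode
+
+# On the head node:
+ray start --head --port=6379
+# On each worker node, point at the head node's address:
+ray start --address=10.0.0.1:6379
+```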