Supported Models
Generative Models

| Model | Support | INT8 (W8A8) | AWQ (W4A16) | GPTQ (WNA16) | LoRA | Tensor Parallel | Expert Parallel | Data Parallel | Kunlun Graph |
|---|---|---|---|---|---|---|---|---|---|
| Qwen2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Qwen2.5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Qwen3 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Qwen3-Moe | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Qwen3-Next | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| MiMo-V2-Flash | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| Llama2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Llama3 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Llama3.1 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| gpt-oss | ✅ | ✅ | ✅ | ✅ | ✅ | | | | |
| GLM4.5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| GLM4.5Air | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| GLM4.7 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| GLM5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| Kimi-K2 | ✅ | - | ✅ | - | ✅ | ✅ | ✅ | | |
| DeepSeek-R1 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| DeepSeek-V3 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| DeepSeek-V3.2 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |


Multimodal Language Models

| Model | Support | INT8 (W8A8) | AWQ (W4A16) | GPTQ (WNA16) | LoRA | Tensor Parallel | Expert Parallel | Data Parallel | Kunlun Graph |
|---|---|---|---|---|---|---|---|---|---|
| Qwen2-VL | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Qwen2.5-VL | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Qwen3-VL | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Qwen3-VL-MoE | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| Qwen3-Omni-MoE | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | |
| InternVL-2.5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| InternVL-3.5 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |
| InternS1 | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | | |