Qwen 3.5 — Local AI Model by Alibaba Cloud

Qwen 3.5 is Alibaba's next-generation flagship model family, released in February 2026. It introduces a hybrid Gated DeltaNet + MoE architecture that delivers frontier performance at a fraction of the compute. The 9B variant outperforms 120B models on reasoning benchmarks while fitting on an 8 GB GPU. Native multimodal vision, a 256K context window, support for 201 languages, and an Apache 2.0 license make it one of the most versatile open models available.

Hardware Requirements

Qwen 3.5 0.8B · Min 1 GB VRAM · Q4_K_M · 256,000 ctx · ollama run qwen3.5:0.8b
Qwen 3.5 2B · Min 3 GB VRAM · Q4_K_M · 256,000 ctx · ollama run qwen3.5:2b
Qwen 3.5 4B · Min 4 GB VRAM · Q4_K_M · 256,000 ctx · ollama run qwen3.5:4b
Qwen 3.5 9B · Min 7 GB VRAM · Q4_K_M · 256,000 ctx · ollama run qwen3.5:9b
Qwen 3.5 27B · Min 17 GB VRAM · Q4_K_M · 256,000 ctx · ollama run qwen3.5:27b
Qwen 3.5 35B-A3B · Min 20 GB VRAM · Q4_K_M · 256,000 ctx · ollama run qwen3.5:35b-a3b
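
To choose a variant for a given GPU, the list above can be treated as a lookup table. The following is a minimal Python sketch, not an official tool: the function name pick_variant is hypothetical, the VRAM figures are copied from the list and assume Q4_K_M quantization, and ranking variants by their minimum VRAM is an assumption used here as a rough proxy for capability.

```python
from typing import Optional

# (Ollama tag, minimum VRAM in GB at Q4_K_M), copied from the list above.
VARIANTS = [
    ("qwen3.5:0.8b", 1),
    ("qwen3.5:2b", 3),
    ("qwen3.5:4b", 4),
    ("qwen3.5:9b", 7),
    ("qwen3.5:27b", 17),
    ("qwen3.5:35b-a3b", 20),
]


def pick_variant(vram_gb: float) -> Optional[str]:
    """Return the tag of the listed variant with the highest minimum-VRAM
    requirement that still fits in vram_gb, or None if nothing fits."""
    fitting = [(need, tag) for tag, need in VARIANTS if need <= vram_gb]
    if not fitting:
        return None
    # Assumption: a higher VRAM requirement means a more capable variant.
    return max(fitting)[1]
```

For example, pick_variant(8) returns "qwen3.5:9b", which matches the 8 GB GPU figure quoted for the 9B variant. The listed minimums leave no headroom, so long prompts near the 256,000-token context limit may need more memory than the table suggests.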

How to Run Locally

Install Ollama, then start the smallest variant with: ollama run qwen3.5:0.8b

Minimum VRAM: 1 GB for the 0.8B variant; larger variants need more, as listed under Hardware Requirements. All listed figures assume Q4_K_M quantization.
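
Once a model is running, it can also be called from a script through Ollama's local HTTP API. The following is a minimal Python sketch using only the standard library. It assumes Ollama is serving on its default address (http://localhost:11434) and that the qwen3.5:0.8b tag from this page has already been pulled. The helper names build_generate_request and generate are hypothetical, chosen for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(prompt: str, model: str = "qwen3.5:0.8b") -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint."""
    # "stream": False asks for one JSON object instead of a chunked stream.
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "qwen3.5:0.8b", timeout: float = 120.0) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.loads(response.read().decode("utf-8"))
    # The non-streaming reply carries the generated text under "response".
    return body["response"]
```

Calling generate("Say hello in one sentence.") should return the model's reply as a string, provided the server is up. Swapping the model argument for another tag from the Hardware Requirements list targets a larger variant.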