作者: Jakub Rusinowski · 最后更新: 2026年6月15日
Alibaba's April 2026 follow-up to Qwen 3.5. As of June 15, 2026 only two tiers have been released — a 27B dense model (Apr 21–22) and a 35B-A3B MoE model (Apr 16) — both Apache 2.0, with native text/image/video input and ~256K context (extensible to ~1M via YaRN). Qwen 3.6-Plus/Plus-Preview/Max-Preview exist but are proprietary, API-only, and not listed here.
| Qwen 3.6 27B | Min 16.8 GB VRAM · Q4_K_M · 262,144 ctx · ollama run qwen3.6:27b |
| Qwen 3.6 35B-A3B | Min 21 GB VRAM · Q4_K_M · 262,144 ctx · ollama run qwen3.6:35b-a3b |
Install Ollama then run: ollama run qwen3.6:27b
Minimum VRAM: 17 GB. For best results use Q4_K_M quantization.
Qwen 3.6 needs about 17 GB VRAM at Q4_K_M quantization for its smallest variant. Variants: Qwen 3.6 27B (16.8 GB, Q4_K_M); Qwen 3.6 35B-A3B (21 GB, Q4_K_M). On Apple Silicon, unified memory counts toward this requirement.
Yes — Qwen 3.6 runs on an RTX 4090 (24 GB) and other 24 GB cards such as the RTX 3090. Smaller variants also fit comfortably on 8–16 GB GPUs at Q4_K_M.
Q4_K_M is the best balance of quality and VRAM for Qwen 3.6 in most cases. Choose Q8_0 for near-lossless quality if you have spare VRAM, or smaller quants (Q3/Q2) only when memory is tight.
Install Ollama, then run: ollama run qwen3.6:27b. This downloads Qwen 3.6 and starts a local, OpenAI-compatible endpoint — no internet connection is needed after the initial download.