Best LLMs for 20 GB VRAM

By Jakub Rusinowski · Last updated: June 15, 2026

These are the strongest local models that fit entirely in 20 GB of VRAM, ranked by capability. Each entry lists the quantization level needed to fit, the resulting size in memory, and an estimated tokens/sec figure.

GPUs at This Tier

Ranked Models

Qwen 2.5 Family: Qwen 2.5 14B Instruct · Q4_K_M · 9.5 GB · ~84 tok/s on AMD Radeon RX 7900 XT
Qwen 3.5 (Legacy Listing, Unverified): Qwen 3.5 122B-A10B (MoE) · Q4_K_M · 13.5 GB · ~400 tok/s on AMD Radeon RX 7900 XT
Gemma 4 (Legacy Listing, Unverified): Gemma 4 27B ⭐ · Q4_K_M · 14 GB · ~57 tok/s on AMD Radeon RX 7900 XT
Qwen 3: Qwen 3 14B · Q4_K_M · 9.5 GB · ~84 tok/s on AMD Radeon RX 7900 XT
Codestral: Codestral 22B · Q4_K_M · 13 GB · ~62 tok/s on AMD Radeon RX 7900 XT
DeepSeek R1: DeepSeek R1 Distill Qwen 14B · Q4_K_M · 9.2 GB · ~87 tok/s on AMD Radeon RX 7900 XT
Phi-4 Family: Phi-4 (14B) · Q4_K_M · 9.2 GB · ~87 tok/s on AMD Radeon RX 7900 XT
Qwen3-Coder: Qwen3-Coder 80B-A3B (MoE) · Q4_K_M · 7.5 GB · ~400 tok/s on AMD Radeon RX 7900 XT
Gemma 3: Gemma 3 27B Instruct · Q4_K_M · 16.5 GB · ~48 tok/s on AMD Radeon RX 7900 XT
Mistral Small 3.1: Mistral Small 3.1 24B · Q4_K_M · 14.5 GB · ~55 tok/s on AMD Radeon RX 7900 XT
Qwen 3.6: Qwen 3.6 27B · Q4_K_M · 16.8 GB · ~48 tok/s on AMD Radeon RX 7900 XT
InternLM 3: InternLM 3 20B Instruct · Q4_K_M · 12.5 GB · ~64 tok/s on AMD Radeon RX 7900 XT
Mistral Family: Mistral Small 3 (24B) · Q4_K_M · 14.5 GB · ~55 tok/s on AMD Radeon RX 7900 XT
Qwen 3: Qwen 3 30B-A3B (MoE) · Q4_K_M · 8 GB · ~400 tok/s on AMD Radeon RX 7900 XT
Granite 3.0: Granite 3.0 8B Instruct · Q4_K_M · 5.5 GB · ~145 tok/s on AMD Radeon RX 7900 XT
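
The tokens/sec figures for the dense models in this list are consistent with a simple memory-bandwidth-bound estimate: during decoding, each generated token reads roughly the whole set of weights once, so throughput is at most bandwidth divided by model size. The sketch below illustrates that arithmetic. It assumes 800 GB/s for the RX 7900 XT (a spec-sheet figure, not stated on this page), and the function name is hypothetical. It is not confirmed to be how the listed numbers were produced, and the MoE entries listed at ~400 tok/s do not follow this dense-model formula.

```python
# Rough decode-speed estimate for a memory-bandwidth-bound GPU.
# Assumption: every generated token reads all resident weights once,
# so tokens/sec is bounded by bandwidth / model size.

RX_7900_XT_BANDWIDTH_GBPS = 800.0  # assumed spec-sheet value, in GB/s


def estimate_tokens_per_sec(model_size_gb: float,
                            bandwidth_gbps: float = RX_7900_XT_BANDWIDTH_GBPS) -> float:
    """Return an upper-bound tokens/sec for a dense model of the given size."""
    if model_size_gb <= 0:
        raise ValueError("model_size_gb must be positive")
    return bandwidth_gbps / model_size_gb


if __name__ == "__main__":
    # Sizes taken from the ranked list above.
    for name, size_gb in [("Qwen 2.5 14B Instruct", 9.5),
                          ("Codestral 22B", 13.0),
                          ("Gemma 3 27B Instruct", 16.5)]:
        print(f"{name}: ~{estimate_tokens_per_sec(size_gb):.0f} tok/s")
```

Real throughput is usually lower than this bound, since it ignores KV-cache reads, kernel efficiency, and backend differences.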

FAQ

What LLMs run well with 20 GB VRAM?

The top five entries in the ranking, Qwen 2.5 Family, Qwen 3.5 (Legacy Listing, Unverified), Gemma 4 (Legacy Listing, Unverified), Qwen 3, and Codestral, are all listed as fitting in 20 GB of VRAM at Q4_K_M quantization.
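
Whether a model fits can be sanity-checked from its parameter count and quantization. The sketch below is a rough estimate under stated assumptions, not the method used for this page: the bits-per-weight values are approximate averages for llama.cpp-style quantization types, the 1.5 GB overhead for KV cache and runtime buffers is a placeholder, and the function names are hypothetical.

```python
# Rough check of whether a quantized model fits in a VRAM budget.
# All constants are approximations chosen for illustration.

# Assumed average bits per weight for common GGUF quantization types.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5, "F16": 16.0}


def estimate_weights_gb(params_billions: float, quant: str = "Q4_K_M") -> float:
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billions * BITS_PER_WEIGHT[quant] / 8.0


def fits_in_vram(params_billions: float, vram_gb: float = 20.0,
                 quant: str = "Q4_K_M", overhead_gb: float = 1.5) -> bool:
    """True if the weights plus an assumed fixed overhead fit in vram_gb."""
    return estimate_weights_gb(params_billions, quant) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for params in (14, 27, 70):
        size = estimate_weights_gb(params)
        print(f"{params}B at Q4_K_M: ~{size:.1f} GB, fits in 20 GB: {fits_in_vram(params)}")
```

For mixture-of-experts models, pass the total parameter count rather than the active count, since all experts have to be resident for the model to sit entirely in VRAM. Longer context windows also raise the overhead beyond the fixed placeholder used here.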

Which GPUs have 20 GB VRAM?

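AMD Radeon RX 7900 XT (20 GB). This tier's listing also groups the AMD Radeon RX 7900 XTX (24 GB) and the Apple M2 and Apple M3 (unified memory, with capacity depending on configuration).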
