Written by Jakub Rusinowski · Last updated June 15, 2026
These are the strongest local models that fit entirely in 16 GB of VRAM, ranked by capability. Each entry lists the quantization level needed to fit, the resulting model size, and the estimated tokens/sec.
| Family | Model | Quantization | Size | Est. speed on AMD Radeon RX 9060 XT 16GB |
| --- | --- | --- | --- | --- |
| Qwen 2.5 Family | Qwen 2.5 14B Instruct | Q4_K_M | 9.5 GB | ~34 tok/s |
| Qwen 3.5 (Legacy Listing — Unverified) | Qwen 3.5 122B-A10B (MoE) | Q4_K_M | 13.5 GB | ~289 tok/s |
| Qwen 3 | Qwen 3 14B | Q4_K_M | 9.5 GB | ~34 tok/s |
| Codestral | Codestral 22B | Q4_K_M | 13 GB | ~25 tok/s |
| DeepSeek R1 | DeepSeek R1 Distill Qwen 14B | Q4_K_M | 9.2 GB | ~35 tok/s |
| Phi-4 Family | Phi-4 (14B) | Q4_K_M | 9.2 GB | ~35 tok/s |
| Qwen3-Coder | Qwen3-Coder 80B-A3B (MoE) | Q4_K_M | 7.5 GB | ~400 tok/s |
| InternLM 3 | InternLM 3 20B Instruct | Q4_K_M | 12.5 GB | ~26 tok/s |
| Qwen 3 | Qwen 3 30B-A3B (MoE) | Q4_K_M | 8 GB | ~400 tok/s |
| Granite 3.0 | Granite 3.0 8B Instruct | Q4_K_M | 5.5 GB | ~58 tok/s |
| Qwen 3.5 (Legacy Listing — Unverified) | Qwen 3.5 14B | Q4_K_M | 9.5 GB | ~34 tok/s |
| Qwen 2.5 Family | Qwen 2.5 7B Instruct | Q4_K_M | 4.8 GB | ~67 tok/s |
| Llama 4 | Llama 4 Scout 17B | Q4_K_M | 10.5 GB | ~195 tok/s |
| Qwen 3 | Qwen 3 8B | Q4_K_M | 5.5 GB | ~58 tok/s |
| DeepSeek R1 | DeepSeek R1 Distill Llama 8B | Q4_K_M | 5.8 GB | ~55 tok/s |
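
The estimated speeds in the listing above look consistent with a simple memory-bandwidth rule of thumb: during decoding, each generated token requires reading the active weights once, so tokens/sec is roughly bandwidth divided by the bytes read per token. The sketch below is an illustration of that rule, not the page's confirmed method. The function name `estimate_tok_per_sec`, the 320 GB/s bandwidth figure for the RX 9060 XT, the scaling of MoE models by their active-to-total parameter ratio, and the 400 tok/s cap are all assumptions chosen because they reproduce the listed numbers.

```python
def estimate_tok_per_sec(size_gb, bandwidth_gb_s=320.0, active_fraction=1.0, cap=400.0):
    """Rough decode-speed estimate for a bandwidth-bound model.

    size_gb         -- quantized model size from the listing
    bandwidth_gb_s  -- ASSUMED GPU memory bandwidth (320 GB/s for the RX 9060 XT)
    active_fraction -- ASSUMED share of weights read per token
                       (1.0 for dense models, active/total params for MoE)
    cap             -- ASSUMED upper bound on the estimate
    """
    gb_read_per_token = size_gb * active_fraction
    return min(cap, bandwidth_gb_s / gb_read_per_token)


# Dense models: every weight is read for every token.
print(round(estimate_tok_per_sec(9.5)))   # Qwen 2.5 14B Instruct -> 34
print(round(estimate_tok_per_sec(13.0)))  # Codestral 22B -> 25

# MoE models: only the active experts are read (3B of 30B, 10B of 122B).
print(round(estimate_tok_per_sec(8.0, active_fraction=3 / 30)))     # -> 400
print(round(estimate_tok_per_sec(13.5, active_fraction=10 / 122)))  # -> 289
```

Real throughput depends on the runtime, context length, and compute limits, so figures produced this way are best read as upper-bound estimates rather than measurements.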
Qwen 2.5 Family, Qwen 3.5 (Legacy Listing — Unverified), Qwen 3, Codestral, and DeepSeek R1 all fit in 16 GB of VRAM at Q4_K_M.
GPUs with 16 GB of VRAM include the AMD Radeon RX 9060 XT 16GB, NVIDIA GeForce RTX 5060 Ti 16GB, NVIDIA GeForce RTX 4060 Ti 16GB, and AMD Radeon RX 7800 XT.