Autor: Jakub Rusinowski · Ostatnia aktualizacja: 15 czerwca 2026
Mistral AI's second-generation coding specialist released April 2026. The 123B Sparse variant scores 71.6% on SWE-bench Verified — the highest open-source score for code agent tasks at the time of release. Built for agentic software engineering: multi-file editing, repo navigation, and test-driven development. Apache 2.0 licensed.
| Devstral-2 123B | Min 68 GB VRAM · Q4_K_M · 128,000 ctx · ollama run devstral:123b |
| Devstral-2 22B | Min 13 GB VRAM · Q4_K_M · 128,000 ctx · ollama run devstral:22b |
Install Ollama then run: ollama run devstral:123b
Minimum VRAM: 13 GB. For best results use Q4_K_M quantization.
Devstral-2 needs about 13 GB VRAM at Q4_K_M quantization for its smallest variant. Variants: Devstral-2 123B (68 GB, Q4_K_M); Devstral-2 22B (13 GB, Q4_K_M). On Apple Silicon, unified memory counts toward this requirement.
Yes — Devstral-2 runs on an RTX 4090 (24 GB) and other 24 GB cards such as the RTX 3090. Smaller variants also fit comfortably on 8–16 GB GPUs at Q4_K_M.
Q4_K_M is the best balance of quality and VRAM for Devstral-2 in most cases. Choose Q8_0 for near-lossless quality if you have spare VRAM, or smaller quants (Q3/Q2) only when memory is tight.
Install Ollama, then run: ollama run devstral:123b. This downloads Devstral-2 and starts a local, OpenAI-compatible endpoint — no internet connection is needed after the initial download.