GPT-oss 120B — Local AI Model by OpenAI

Written by Jakub Rusinowski · Last updated July 30, 2026

OpenAI's open-weight 120B dense model released April 2026 under Apache 2.0 — their first substantial open-source release since GPT-2. Matches GPT-4o on most benchmarks at 65 GB Q4, making it the highest-quality open model at the workstation tier. Signals OpenAI's strategic shift toward supporting on-premise enterprise deployments.

Hardware Requirements

GPT-oss 120B Min 73 GB VRAM · Q4_K_M · 128,000 ctx · ollama run gpt-oss:120b

Recommended GPU

The cheapest GPU that runs GPT-oss 120B locally (min 73 GB VRAM) is the AMD Ryzen AI Max+ 395 (96 GB).

Affiliate disclosure: Some links on this page are affiliate links — if you buy through them, LLM Configurator may earn a commission at no extra cost to you. As an Amazon Associate, LLM Configurator earns from qualifying purchases.

GMKtec EVO-X2 (Ryzen AI Max+ 395, 128GB)

Launch MSRP: $2,349

2026 prices are volatile — check the current listing.

Check price on Amazon

How to Run Locally

Install Ollama then run: ollama run gpt-oss:120b

Minimum VRAM: 73 GB. For best results use Q4_K_M quantization.

GPT-oss 120B — Frequently Asked Questions

How much VRAM does GPT-oss 120B need?

GPT-oss 120B needs about 73 GB VRAM at Q4_K_M quantization for its smallest variant. Variants: GPT-oss 120B (73 GB, Q4_K_M). On Apple Silicon, unified memory counts toward this requirement.

Can I run GPT-oss 120B on an RTX 4090 (24 GB)?

GPT-oss 120B's smallest variant needs about 73 GB, which exceeds a single RTX 4090 (24 GB). Use multiple GPUs, a higher-VRAM card, or Apple Silicon with large unified memory.

What quantization should I use for GPT-oss 120B?

Q4_K_M is the best balance of quality and VRAM for GPT-oss 120B in most cases. Choose Q8_0 for near-lossless quality if you have spare VRAM, or smaller quants (Q3/Q2) only when memory is tight.

How do I run GPT-oss 120B with Ollama?

Install Ollama, then run: ollama run gpt-oss:120b. This downloads GPT-oss 120B and starts a local, OpenAI-compatible endpoint — no internet connection is needed after the initial download.