Written by Jakub Rusinowski · Last updated June 15, 2026
Founder, LLM Configurator — AI educator & workshop leader on local LLM deployment
RuntimeError: No CUDA-capable device is detected
Your code or runtime can't find the GPU at all — torch.cuda.is_available() returns False, and anything that needs CUDA bails immediately. The card is physically there, but the software stack can't see it.
Unlike an out-of-memory error, this one is about visibility, not capacity. Something in the chain (NVIDIA driver, CUDA runtime, the framework's CUDA build, or an environment variable) is broken or mismatched, so the GPU never gets exposed to your process. Common causes: a missing or outdated driver, a CPU-only PyTorch install, CUDA_VISIBLE_DEVICES set to an empty string, or running inside a container without GPU passthrough.
Start at the bottom of the stack. nvidia-smi talks directly to the driver — if it lists your card, the driver is fine and the problem is higher up (framework/toolkit). If nvidia-smi itself fails or shows nothing, install or update the NVIDIA driver first; nothing above it can work until this does.
nvidia-smi
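If you want to run this driver check from a script, a minimal Python sketch could look like the following. The function name driver_sees_gpu and the 10-second timeout are illustrative assumptions; the only external call is nvidia-smi -L, which prints one line per GPU.

```python
import shutil
import subprocess


def driver_sees_gpu(timeout_s: float = 10.0) -> tuple[bool, str]:
    """Return (ok, detail): whether nvidia-smi can list at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False, "nvidia-smi not found on PATH (driver probably not installed)"
    try:
        # -L prints one line per GPU, starting with "GPU <index>:"
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=timeout_s
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"nvidia-smi could not be run: {exc}"
    gpus = [line for line in result.stdout.splitlines() if line.startswith("GPU ")]
    if result.returncode != 0 or not gpus:
        detail = (result.stderr or result.stdout).strip() or "no GPUs listed"
        return False, detail
    return True, "\n".join(gpus)


if __name__ == "__main__":
    ok, detail = driver_sees_gpu()
    print("driver OK" if ok else "driver problem", "->", detail)
```

A False result here means the problem sits at the driver level, so there is little point debugging the framework yet.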
A plain pip install torch can pull a CPU-only wheel that will never see the GPU. Reinstall a CUDA build whose CUDA version is no newer than the one your driver supports (nvidia-smi shows it in its header as "CUDA Version"). After installing, verify from Python that CUDA is now available.
# example: CUDA 12.1 build of PyTorch
pip install torch --index-url https://download.pytorch.org/whl/cu121
python -c "import torch; print(torch.cuda.is_available())"
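When the one-liner prints False, it helps to know whether the installed wheel was built with CUDA at all. The sketch below is one way to inspect that; describe_torch_build is a hypothetical helper name, while torch.__version__, torch.version.cuda and torch.cuda.is_available() are standard PyTorch attributes.

```python
import importlib.util


def describe_torch_build() -> dict:
    """Summarize the installed PyTorch build without assuming a GPU exists."""
    if importlib.util.find_spec("torch") is None:
        return {"installed": False}
    import torch  # imported lazily so the check also works where torch is absent

    return {
        "installed": True,
        "version": torch.__version__,  # CPU-only wheels often end in "+cpu"
        "built_with_cuda": torch.version.cuda,  # None on a CPU-only build
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_build())
```

If built_with_cuda is None, the wheel is CPU-only and needs reinstalling. If it shows a version but cuda_available is still False, the likelier suspects are the driver, the environment, or the container setup.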
If CUDA_VISIBLE_DEVICES is set to an empty string or to an index that does not exist, CUDA hides every GPU from the process. Unset it (or set it to a valid index) and retry.
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES-<unset>}" # empty or a wrong index? fix it:
unset CUDA_VISIBLE_DEVICES
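The same check can be made from inside the Python process that fails, which is useful when a launcher or notebook kernel sets the variable for you. This is a small sketch; classify_visible_devices is a hypothetical helper, and it only reports the state of the variable without validating that each listed device exists.

```python
import os
from typing import Mapping, Optional


def classify_visible_devices(env: Optional[Mapping[str, str]] = None) -> str:
    """Describe how CUDA_VISIBLE_DEVICES is set in the given environment."""
    env = os.environ if env is None else env
    value = env.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return "unset: all GPUs are visible"
    if value.strip() == "":
        return "empty: every GPU is hidden"
    # A non-empty value restricts visibility; a wrong entry can still hide the card.
    return f"set to {value!r}: only these devices are exposed, check each entry exists"


if __name__ == "__main__":
    print(classify_visible_devices())
```

Passing a plain dict instead of the real environment makes it easy to try the three cases by hand.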
Containers don't get the GPU by default. You need the NVIDIA Container Toolkit installed on the host and the --gpus all flag on docker run, otherwise the container sees no CUDA device.
docker run --gpus all <image>
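A quick way to confirm passthrough works is to run nvidia-smi inside the container itself. The sketch below wraps that in Python; the helper names are hypothetical, the image name is whatever you pass in, and it assumes docker is on PATH and the NVIDIA Container Toolkit is installed on the host.

```python
import shutil
import subprocess
from typing import List


def gpu_smoke_test_cmd(image: str) -> List[str]:
    """Build a docker command that runs nvidia-smi inside the given image."""
    return ["docker", "run", "--rm", "--gpus", "all", image, "nvidia-smi"]


def run_gpu_smoke_test(image: str) -> bool:
    """Return True if the container starts and nvidia-smi exits successfully."""
    if shutil.which("docker") is None:
        return False  # docker is not installed or not on PATH
    return subprocess.run(gpu_smoke_test_cmd(image)).returncode == 0
```

If this fails while nvidia-smi works on the host, the gap is in the container runtime configuration rather than in the driver.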
When the GPU is finally visible, the next wall people hit is memory — the model being too big for the card. That part is worth getting right up front: check which model and quant fit your VRAM so you don't trade a visibility error for an out-of-memory one.
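As a rough first pass on the fit question, the weights alone take about parameters × bits per weight ÷ 8 bytes. The sketch below applies that arithmetic. The 0.8 headroom factor is an assumption for illustration, not a measured figure, and it ignores the KV cache and runtime overhead, so treat a "fits" result as optimistic.

```python
def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB (weights only)."""
    return n_params * bits_per_weight / 8 / 2**30


def fits(n_params: float, bits_per_weight: float, vram_gib: float,
         headroom: float = 0.8) -> bool:
    """True if the weights fit in an assumed usable fraction of VRAM."""
    return weights_gib(n_params, bits_per_weight) <= vram_gib * headroom


if __name__ == "__main__":
    # Example: a 7-billion-parameter model on an 8 GiB card
    for bits in (16, 8, 4):
        print(bits, "bits:", round(weights_gib(7e9, bits), 2), "GiB,",
              "fits" if fits(7e9, bits, 8) else "does not fit")
```

Real quantization formats often use slightly more bits per weight than their nominal label, so the true footprint tends to be somewhat higher than this estimate.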
A faulty GPU is rarely the cause. It is almost always a software or visibility issue: a missing or mismatched driver, a CPU-only framework build, or an environment or container setting. Start with nvidia-smi: if that sees the card, the hardware is fine.
If nvidia-smi lists the card but torch.cuda.is_available() still returns False, that points to a CPU-only or wrong-CUDA-version PyTorch install. Reinstall a CUDA build that your driver supports, then check torch.cuda.is_available() again.
You usually do not need the full CUDA Toolkit. For most prebuilt frameworks (PyTorch, Ollama, llama.cpp releases) a recent NVIDIA driver is enough, because the CUDA runtime ships with the package. The full toolkit is needed mainly when you compile CUDA code yourself.