Can I Run Llama 3.2 Family on Apple M4 Max?

Name: LLM Configurator — GPU VRAM Checker
Author: LLM Configurator

Author: Jakub Rusinowski · Last updated: September 25, 2024

Yes, comfortably: Llama 3.2 90B Vision Instruct at Q4_K_M (54 GB, ~10 tok/s est.) leaves about 74 GB of the 128 GB unified memory free.

Affiliate disclosure: Some links on this page are affiliate links. If you make a purchase through them, LLM Configurator may earn a commission at no extra cost to you. As an Amazon Associate, LLM Configurator earns from qualifying purchases.

Check price on Amazon: Apple Mac Studio M4 Max

Apple M4 Max Specs

VRAM	128 GB unified memory
Memory Bandwidth	546 GB/s

Llama 3.2 Family Sizes That Fit the Apple M4 Max

Llama 3.2 90B Vision Instruct	Q4_K_M · 54 GB · ~10 tok/s (est.)
Llama 3.2 11B Vision Instruct	Q4_K_M · 7.8 GB · ~70 tok/s (est.)
Llama 3.2 3B Instruct	Q4_K_M · 2.2 GB · ~248 tok/s (est.)
Llama 3.2 1B Instruct	Q4_K_M · 0.8 GB · ~400 tok/s (est.)
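
The page does not say how its throughput estimates are derived. Most of the figures above are consistent with a common rule of thumb: token generation is memory-bandwidth bound, so tokens per second is roughly memory bandwidth divided by model size. The sketch below assumes that rule and uses illustrative function names (`headroom_gb`, `est_tok_per_s`) that are not taken from the source.

```python
# Assumption: decoding is memory-bandwidth bound, i.e. each generated token
# streams the full set of weights from memory once. This is a rule of thumb,
# not the page's documented method.

MEMORY_GB = 128.0        # unified memory, from the spec table
BANDWIDTH_GB_S = 546.0   # memory bandwidth, from the spec table

# Q4_K_M sizes in GB, copied from the table above
MODELS = {
    "Llama 3.2 90B Vision Instruct": 54.0,
    "Llama 3.2 11B Vision Instruct": 7.8,
    "Llama 3.2 3B Instruct": 2.2,
    "Llama 3.2 1B Instruct": 0.8,
}


def headroom_gb(size_gb: float, memory_gb: float = MEMORY_GB) -> float:
    """Memory left after loading the weights (ignores KV cache and OS usage)."""
    return memory_gb - size_gb


def est_tok_per_s(size_gb: float, bandwidth_gb_s: float = BANDWIDTH_GB_S) -> float:
    """Bandwidth-bound upper estimate of decode speed in tokens per second."""
    return bandwidth_gb_s / size_gb


for name, size in MODELS.items():
    print(f"{name}: {headroom_gb(size):.1f} GB free, "
          f"~{est_tok_per_s(size):.0f} tok/s (bandwidth-bound estimate)")
```

For the 90B, 11B, and 3B models this reproduces the listed figures. For the 1B model the rule gives a higher number than the listed ~400 tok/s, so the page presumably applies a cap or overhead for very small models.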

Buy vs. rent Llama 3.2 Family

Buy the GPU	~$3,499 · Apple M4 Max · MSRP
Rent by the hour	from $0.77/hr · A100 (80 GB) class

At 2 hrs/day, buying (~$3,499) beats renting at $0.77/hr after about 4,544 rental hours, or roughly 6.2 years.
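
The break-even point can be checked with a few lines of arithmetic. This is a minimal sketch that compares purchase price against hourly rental only. It assumes no electricity, resale value, or price changes, and the function name `break_even_years` is illustrative.

```python
def break_even_years(purchase_usd: float,
                     rent_usd_per_hr: float,
                     hours_per_day: float,
                     days_per_year: float = 365.0) -> float:
    """Years of use after which cumulative rental cost equals the purchase price."""
    rental_hours = purchase_usd / rent_usd_per_hr   # hours of rental the purchase price buys
    days = rental_hours / hours_per_day             # days of use at the given daily usage
    return days / days_per_year


# Figures from this page: ~$3,499 MSRP, $0.77/hr rental, 2 hrs/day
print(f"{break_even_years(3499, 0.77, 2):.1f} years")
```

With a 365-day year this comes to roughly 6.2 years. Counting in 30-day months (a 360-day year) gives about 6.3, so the exact figure depends on the day-count convention.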

Affiliate links — we may earn a commission if you sign up, at no extra cost to you.

Vast.ai $0.77/hr · typical low · varies

Rent on Vast.ai →

RunPod $1.39/hr