Can I Run Llama 3.2 Family on Apple M3 Max?

Name: LLM Configurator — GPU VRAM Checker
Author: LLM Configurator

Written by Jakub Rusinowski · Last updated September 25, 2024

Yes, comfortably. Llama 3.2 90B Vision Instruct at Q4_K_M takes about 54 GB of the 64 GB of unified memory, leaving ~10 GB of headroom, at an estimated ~7 tok/s.

Affiliate disclosure: Some links on this page are affiliate links — if you buy through them, LLM Configurator may earn a commission at no extra cost to you. As an Amazon Associate, LLM Configurator earns from qualifying purchases.

Apple M3 Max Specs

VRAM	64 GB unified memory
Memory Bandwidth	400 GB/s

Llama 3.2 Family Sizes That Fit the Apple M3 Max

Model	Quantization	Size	Speed (est.)
Llama 3.2 90B Vision Instruct	Q4_K_M	54 GB	~7 tok/s
Llama 3.2 11B Vision Instruct	Q4_K_M	7.8 GB	~51 tok/s
Llama 3.2 3B Instruct	Q4_K_M	2.2 GB	~182 tok/s
Llama 3.2 1B Instruct	Q4_K_M	0.8 GB	~400 tok/s
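
The page does not say how the speed estimates are derived. A minimal sketch, assuming the common rule of thumb that token generation is memory-bandwidth bound (each token reads all weights once, so tok/s ≈ bandwidth ÷ model size), reproduces the first three rows from the specs above. The function names are illustrative, and the headroom figure ignores the KV cache, context length, and memory used by the OS.

```python
# Rule-of-thumb sketch (an assumption, not the page's stated method):
# decoding is treated as memory-bandwidth bound.
MEMORY_GB = 64.0        # unified memory, from the specs table
BANDWIDTH_GB_S = 400.0  # memory bandwidth, from the specs table

# Q4_K_M sizes in GB, copied from the table above.
MODELS = {
    "Llama 3.2 90B Vision Instruct": 54.0,
    "Llama 3.2 11B Vision Instruct": 7.8,
    "Llama 3.2 3B Instruct": 2.2,
    "Llama 3.2 1B Instruct": 0.8,
}


def headroom_gb(size_gb, memory_gb=MEMORY_GB):
    """Memory left after loading the weights (no KV cache or OS overhead)."""
    return memory_gb - size_gb


def est_tok_per_s(size_gb, bandwidth_gb_s=BANDWIDTH_GB_S):
    """Upper-bound estimate: one full pass over the weights per token."""
    return bandwidth_gb_s / size_gb


for name, size in MODELS.items():
    print(f"{name}: headroom {headroom_gb(size):.1f} GB, "
          f"~{est_tok_per_s(size):.0f} tok/s")
```

Under this assumption the 1B model comes out near 500 tok/s rather than the ~400 tok/s listed, so the page's estimate for small models presumably includes some other adjustment.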

Buy vs. rent Llama 3.2 Family

Buy the GPU: ~$2,499 (Apple M3 Max, MSRP)

Rent by the hour: from $0.77/hr (A100 80 GB class)

At 2 hrs/day, renting at $0.77/hr costs about $1.54 per day, so buying (~$2,499) breaks even after roughly 1,620 days, or about 4.4 years.
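
The break-even figure can be checked with a short calculation. This is a sketch under simple assumptions: it compares only the purchase price against hourly rental fees and ignores electricity, resale value, and any speed difference between the two machines. The function name is illustrative.

```python
def break_even_years(purchase_usd, rate_usd_per_hr, hours_per_day):
    """Years of use after which cumulative rental fees equal the purchase price."""
    rental_hours = purchase_usd / rate_usd_per_hr  # hours of rental the price buys
    return rental_hours / hours_per_day / 365


# Figures from this page: ~$2,499 to buy, 2 hrs/day of use.
print(f"$0.77/hr: {break_even_years(2499, 0.77, 2):.2f} years")
print(f"$1.39/hr: {break_even_years(2499, 1.39, 2):.2f} years")
```

At the higher $1.39/hr rate listed on this page, the same calculation gives a break-even point of roughly two and a half years.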

Vast.ai $0.77/hr (typical low; prices vary)

RunPod $1.39/hr