SmolLM2 — Local AI Model by HuggingFace

The smallest production-quality LLMs. HuggingFace's SmolLM2 models are designed to run on microcontrollers, phones, and browsers via WebAssembly. Despite their tiny size, they show surprising capability thanks to careful data curation with the SmolTalk dataset.

Hardware Requirements

SmolLM2 1.7B Instruct · Min 1 GB VRAM · Q4_K_M · 8,192 ctx · ollama run smollm2:1.7b
SmolLM2 360M Instruct · Min 1 GB VRAM · Q4_K_M · 8,192 ctx · ollama run smollm2:360m

How to Run Locally

Install Ollama, then run: ollama run smollm2:1.7b

Minimum VRAM: 1 GB. For best results, use the Q4_K_M quantization.
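Beyond the interactive ollama run session, a running Ollama server also exposes an HTTP API on port 11434, so you can call SmolLM2 from code. The sketch below is a minimal example using only the Python standard library; it assumes a local Ollama server with the smollm2:1.7b model already pulled, and the helper names (build_payload, generate) are illustrative, not part of any official SDK.

```python
import json
import urllib.request

# Default endpoint of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt, model="smollm2:1.7b"):
    """Build the JSON body for Ollama's /api/generate endpoint."""
    # stream=False asks for a single JSON response instead of a stream of chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt, model="smollm2:1.7b"):
    """Send one non-streaming generation request to the local Ollama server."""
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # The completed text is returned under the "response" key.
        return json.loads(resp.read())["response"]
```

Usage: generate("Why is the sky blue?") returns the model's completion as a string, provided the Ollama server is running and the model has been pulled.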