GLM-4.7 / GLM-Z1 — Local AI Model by Zhipu AI

Zhipu AI's February 2026 MoE model. GLM-4.7 uses a mixture-of-experts (MoE) architecture with 32B active parameters and scores on par with Claude Opus 4.5 on hallucination leaderboards. The reasoning-focused GLM-Z1 variant posts near-GPT-5.2 scores on the Artificial Intelligence Index. Strong multilingual support, with particular depth in Chinese.

Hardware Requirements

GLM-4.7 9B: Min 6 GB VRAM · Q4_K_M · 128,000 ctx · ollama run glm4:9b
GLM-Z1 32B (Reasoning): Min 20 GB VRAM · Q4_K_M · 128,000 ctx · ollama run glm-z1:32b
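As a rough sanity check on those minimums, Q4_K_M stores weights at roughly 4.8-4.9 bits per parameter. The sketch below is a back-of-envelope estimate only; the bits-per-weight and overhead constants are approximations I am assuming, not official figures, and actual usage varies with context length and KV-cache size.

```python
def estimate_vram_gb(params_billion: float,
                     bits_per_weight: float = 4.85,
                     overhead_gb: float = 1.0) -> float:
    """Rough VRAM estimate for a quantized model:
    weight storage plus a flat allowance for KV cache
    and runtime overhead (approximation only)."""
    weight_gb = params_billion * bits_per_weight / 8
    return weight_gb + overhead_gb

print(round(estimate_vram_gb(9), 1))   # 9B model, near the 6 GB minimum
print(round(estimate_vram_gb(32), 1))  # 32B model, near the 20 GB minimum
```

The estimates land close to the stated 6 GB and 20 GB minimums, which is why Q4_K_M is the usual choice for consumer GPUs.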

How to Run Locally

Install Ollama, then run: ollama run glm4:9b

Minimum VRAM: 6 GB for the 9B model. For best results, use the Q4_K_M quantization.
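Once the model is running, Ollama also exposes a local REST API on port 11434, which you can call from scripts instead of the interactive CLI. A minimal Python sketch using only the standard library (the model tag `glm4:9b` comes from above; the prompt is just an example):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    req = build_request("glm4:9b", "Summarize mixture-of-experts in one sentence.")
    # Requires the Ollama server to be running (`ollama serve` or the desktop app).
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])
```

With "stream": False the server returns one JSON object whose "response" field holds the full completion; set it to True to receive incremental JSON lines instead.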