WizardLM Evol Instruct 70k: LLM Instruction Dataset

This dataset was built with the Evol-Instruct method, which uses GPT-4 to progressively rewrite simple instructions into more complex and more diverse ones. The resulting ~70k examples range from basic to graduate-level difficulty. WizardLM 13B, fine-tuned on this data alone, outperformed Vicuna 13B on complex reasoning tasks.
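The rewriting loop described above can be sketched as follows. This is a minimal illustration, not the WizardLM authors' implementation: the two prompt templates, the `evolve` function, and the `fake_rewrite` stand-in for an LLM call are all hypothetical names introduced here, and a real pipeline would call an actual model and apply stricter filtering.

```python
import random
from typing import Callable, List, Optional

# Hypothetical templates that paraphrase the idea of making an instruction
# deeper (harder) or broader (more diverse). Not the published prompts.
DEPTH_TEMPLATE = (
    "Rewrite the following instruction so it is more complex, "
    "for example by adding one constraint:\n{instruction}"
)
BREADTH_TEMPLATE = (
    "Write a new instruction in the same domain as the following one, "
    "but on a less common topic:\n{instruction}"
)


def evolve(
    seed: str,
    rewrite: Callable[[str], str],
    rounds: int = 3,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Return the chain [seed, evolved_1, evolved_2, ...].

    `rewrite` is any callable that takes a prompt and returns the model's
    rewritten instruction (assumed interface).
    """
    rng = rng or random.Random(0)
    chain = [seed]
    for _ in range(rounds):
        template = rng.choice([DEPTH_TEMPLATE, BREADTH_TEMPLATE])
        prompt = template.format(instruction=chain[-1])
        candidate = rewrite(prompt).strip()
        # Drop failed evolutions: empty output or no change at all.
        if candidate and candidate != chain[-1]:
            chain.append(candidate)
    return chain


def fake_rewrite(prompt: str) -> str:
    """Stand-in for an LLM call so the sketch runs offline."""
    instruction = prompt.split("\n", 1)[1]
    return instruction + " Explain each step."


chain = evolve("Add 2 and 3.", fake_rewrite, rounds=2)
for step, instruction in enumerate(chain):
    print(step, instruction)
```

Each element of the returned chain is a candidate training instruction, so a single seed can yield several examples of increasing difficulty.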

Dataset Details

Provider: WizardLM
Category: instruction
Size: 70k rows
License: Apache 2.0
Downloads: 800k
Tags: Synthetic, Evol-Instruct, Complex-Reasoning, GPT-4
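
# Requires the Hugging Face `datasets` package (pip install datasets)
from datasets import load_dataset
# Repository ID as published on the Hugging Face Hub; verify before use
ds = load_dataset("WizardLM/WizardLM_evol_instruct_70k")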
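
Once loaded, each row typically needs to be turned into a single training string before fine-tuning. The sketch below shows one way to do that. It assumes each record has `instruction` and `output` fields (check `ds["train"].column_names` to confirm), and the prompt layout and the `to_training_text` helper are illustrative choices, not a format prescribed by the dataset.

```python
from typing import Dict


def to_training_text(record: Dict[str, str]) -> Dict[str, str]:
    """Join an instruction/response pair into one training string.

    Assumes the record has "instruction" and "output" keys; the
    "### Instruction" / "### Response" layout is a hypothetical choice.
    """
    instruction = record["instruction"].strip()
    response = record["output"].strip()
    text = f"### Instruction:\n{instruction}\n\n### Response:\n{response}"
    return {"text": text}


# A hand-written record in the assumed schema, so the sketch runs offline.
example = {
    "instruction": "  Explain what a hash map is. ",
    "output": "A hash map stores key-value pairs.\n",
}
print(to_training_text(example)["text"])
```

If the field names match, the same function can be applied to the whole split with `ds["train"].map(to_training_text)`, which adds a `text` column alongside the original fields.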
