# meta-llama/Llama-4-Maverick-17B-128E-Instruct

## Model Information
meta-llama/Llama-4-Maverick-17B-128E-Instruct is an instruction-tuned model developed by Meta as part of the Llama 4 "Maverick" series. With 17B active parameters routed across 128 experts (roughly 400B parameters in total), it is designed to deliver high performance at a fraction of the inference cost of larger dense models. It demonstrates strong generalization across multilingual, coding, and reasoning tasks while remaining efficient enough for scalable deployment.
- Model Developer: Meta
- Model Release Date: April 2025
- Supported Languages: English (primary), with broad multilingual generalization including French, Spanish, German, Portuguese, Japanese, Korean, and Hindi
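
A minimal usage sketch is shown below, assuming the Hugging Face `transformers` library and access to the gated checkpoint; the loading options, dtype, and hardware requirements will vary with your setup:

```python
# Minimal chat sketch using the transformers text-generation pipeline.
# Assumes the gated checkpoint is accessible and that device_map/dtype
# suit your hardware; this is an illustration, not a deployment recipe.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Llama-4-Maverick-17B-128E-Instruct",
    device_map="auto",
    torch_dtype="bfloat16",
)

messages = [
    {"role": "user", "content": "Explain mixture-of-experts routing in two sentences."}
]
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"][-1]["content"])  # assistant reply
```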
## Model Architecture
meta-llama/Llama-4-Maverick-17B-128E-Instruct uses a Mixture-of-Experts (MoE) architecture: only a small fraction of its parameters is active for any given token, which keeps inference compute low while preserving the capacity of a much larger model.
Key Features:
- Model Type: Decoder-only Transformer
- Parameter Count: 17B active parameters across 128 experts (~400B total)
- MoE Routing: Sparse activation; each token is processed by a small number of experts rather than the full model (see the sketch after this list)
- Context Length: Up to 1M tokens
- Training Techniques:
  - Instruction tuning on curated multi-task datasets
  - Reinforcement Learning from Human Feedback (RLHF)
  - Safety alignment and toxicity mitigation
- Tokenizer: Extended version of the Llama 3 tokenizer
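
To make the routing idea concrete, here is a generic sparse-MoE layer sketch in PyTorch: a router scores all experts, each token is dispatched to its top-scoring routed expert plus an always-on shared expert, and the outputs are combined. The dimensions and structure are illustrative assumptions, not Meta's actual implementation:

```python
# Illustrative sparse MoE layer: top-1 routed expert + always-on shared expert.
# Toy dimensions; the real Llama 4 layers differ in shape and use fused kernels.
import torch
import torch.nn as nn

def ffn(d_model, d_ff):
    return nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))

class SparseMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=128, n_experts=8):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(ffn(d_model, d_ff) for _ in range(n_experts))
        self.shared_expert = ffn(d_model, d_ff)

    def forward(self, x):                          # x: (n_tokens, d_model)
        probs = self.router(x).softmax(dim=-1)     # (n_tokens, n_experts)
        weight, expert_id = probs.max(dim=-1)      # top-1 routing per token
        out = self.shared_expert(x)                # shared expert sees every token
        for e in expert_id.unique().tolist():      # dispatch token groups to experts
            mask = expert_id == e
            out[mask] = out[mask] + weight[mask, None] * self.experts[e](x[mask])
        return out

layer = SparseMoELayer()
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

Only the selected experts run for each token, so per-token FLOPs track the active parameter count rather than the total.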
The Maverick design combines Mixture-of-Experts scalability with general-purpose instruction following, making it well suited to serving workloads in compute-constrained environments.
## Benchmark Scores
| Category     | Benchmark            | Shots | Metric           | Llama 4 Maverick 17B-128E Instruct |
|--------------|----------------------|-------|------------------|------------------------------------|
| General      | MMLU (CoT)           | 0     | Acc. (avg)       | 86.5 |
| General      | MMLU Pro (CoT)       | 5     | Acc. (avg)       | 58.6 |
| Steerability | IFEval               | –     | –                | 91.3 |
| Reasoning    | GPQA Diamond (CoT)   | 0     | Accuracy         | 45.3 |
| Code         | HumanEval            | 0     | Pass@1           | 83.7 |
| Code         | MBPP EvalPlus (base) | 0     | Pass@1           | 84.1 |
| Math         | MATH (CoT)           | 0     | Sympy Score      | 58.3 |
| Tool Use     | BFCL v2              | 0     | AST Macro Avg.   | 79.4 |
| Multilingual | MGSM                 | 0     | EM (exact match) | 76.8 |
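
For the code rows, Pass@1 is conventionally computed with the unbiased pass@k estimator from the HumanEval paper (Chen et al., 2021); a short reference implementation:

```python
# Unbiased pass@k estimator (Chen et al., 2021): with n samples per task,
# c of which pass the tests, pass@k = 1 - C(n-c, k) / C(n, k).
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    if n - c < k:  # every size-k draw contains at least one passing sample
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=10, c=4, k=1))  # 0.4 — for k=1 this reduces to c/n
```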
Llama 4 Maverick 17B-128E sets a high bar for compute-efficient instruction-following models, offering near-flagship quality at a much smaller active-parameter footprint.