meta-llama/Llama-4-Maverick-17B-128E-Instruct

Model Information

meta-llama/Llama-4-Maverick-17B-128E-Instruct is an instruction-tuned Mixture-of-Experts model developed by Meta as part of the Llama 4 "Maverick" series. It has 128 experts but activates only 17B parameters per token, which lets it deliver high performance at a fraction of the inference cost of dense models of comparable quality. It generalizes well across multilingual, coding, and reasoning tasks while remaining efficient enough for scalable deployment.

  • Model Developer: Meta
  • Model Release Date: April 2025
  • Supported Languages: English (primary), with broad multilingual generalization including French, Spanish, German, Portuguese, Japanese, Korean, and Hindi
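
The inference-cost claim can be made concrete with a common rule of thumb: a forward pass costs roughly 2 floating-point operations per active parameter per token. The sketch below applies that rule to the 17B active parameters listed on this page. The 70B dense baseline is a hypothetical comparison point, not a model named here, and the rule ignores attention and routing overhead, so treat the result as an order-of-magnitude estimate only.

```python
def forward_flops_per_token(active_params: int) -> int:
    """Rule-of-thumb forward-pass cost: one multiply and one add per active weight."""
    return 2 * active_params


MAVERICK_ACTIVE = 17_000_000_000  # 17B active parameters, as listed on this page
DENSE_BASELINE = 70_000_000_000   # hypothetical dense model, for comparison only

# Fraction of the dense baseline's per-token compute that the MoE model needs.
ratio = forward_flops_per_token(MAVERICK_ACTIVE) / forward_flops_per_token(DENSE_BASELINE)
print(f"{ratio:.2f}")  # prints 0.24
```

Note that this counts compute only. All experts still have to be held in memory, so the memory footprint is governed by the total parameter count rather than the active one.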

Model Architecture

meta-llama/Llama-4-Maverick-17B-128E-Instruct uses a Mixture-of-Experts (MoE) architecture, enabling efficient compute utilization with high performance.

Key Features:

  • Model Type: Decoder-only Transformer
  • Parameter Count: 17B active parameters, 128 total experts
  • MoE Routing: Sparse activation (2 experts per token)
  • Context Length: Up to 32,000 tokens
  • Training Techniques:
    • Instruction tuning on curated multi-task datasets
    • Reinforcement Learning from Human Feedback (RLHF)
    • Safety alignment and toxicity mitigation
  • Tokenizer: Extended version of the Llama 3 tokenizer

The Maverick architecture is designed to combine the scalability of Mixture-of-Experts with general-purpose reasoning, which makes it well suited to serving workloads where per-token compute is the main constraint.
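
To illustrate the sparse activation described above, here is a minimal sketch of generic top-k expert routing for a single token, using 128 experts and 2 experts per token as listed in the feature list. The gating function (softmax over the selected logits), the toy experts, and all function names are assumptions made for illustration. This page does not describe the model's actual router, so this is not Meta's implementation.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(router_logits, top_k=2):
    """Return (expert_index, weight) pairs for the top_k highest-scoring experts."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:top_k]
    # Renormalise over the selected experts only (an assumed gating choice).
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_layer(token, router_logits, experts, top_k=2):
    """Weighted sum of the selected experts' outputs; other experts never run."""
    out = [0.0] * len(token)
    for idx, weight in route_token(router_logits, top_k):
        expert_out = experts[idx](token)
        out = [o + weight * y for o, y in zip(out, expert_out)]
    return out


NUM_EXPERTS = 128
# Toy experts: expert i scales its input by (i + 1). Purely illustrative.
experts = [lambda v, s=i + 1: [s * x for x in v] for i in range(NUM_EXPERTS)]

rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]  # stand-in router scores
selected = route_token(logits)
output = moe_layer([1.0, 2.0], logits, experts)
print(len(selected), "of", NUM_EXPERTS, "experts ran for this token")
```

Only the selected experts are evaluated for each token, which is where the compute saving comes from. Production routers typically add details such as load balancing that this sketch leaves out.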


Benchmark Scores

Category     | Benchmark            | Shots | Metric           | Llama 4 Maverick 17B-128E Instruct
-------------|----------------------|-------|------------------|-----------------------------------
General      | MMLU (CoT)           | 0     | Acc. (avg)       | 86.5
             | MMLU Pro (CoT)       | 5     | Acc. (avg)       | 58.6
Steerability | IFEval               |       |                  | 91.3
Reasoning    | GPQA Diamond (CoT)   | 0     | Accuracy         | 45.3
Code         | HumanEval            | 0     | Pass@1           | 83.7
             | MBPP EvalPlus (base) | 0     | Pass@1           | 84.1
Math         | MATH (CoT)           | 0     | Sympy Score      | 58.3
Tool Use     | BFCL v2              | 0     | AST Macro Avg.   | 79.4
Multilingual | MGSM                 | 0     | EM (exact match) | 76.8

Llama 4 Maverick 17B-128E targets compute-efficient instruction following: because only 17B parameters are active per token, it offers near-flagship quality at a lower per-token inference cost.
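
For readers unfamiliar with the Pass@1 metric reported for HumanEval and MBPP, the sketch below shows the widely used unbiased pass@k estimator that was introduced alongside HumanEval. This page does not state how its scores were computed (number of samples per problem, sampling temperature), so the function names and the sample data here are illustrative assumptions rather than the evaluation setup behind the table.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples generated, c of them pass the tests."""
    if n - c < k:
        # Fewer than k failing samples exist, so any k draws include a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k: int = 1) -> float:
    """Mean pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical run: one sample per problem, three of four problems solved.
hypothetical_results = [(1, 1), (1, 0), (1, 1), (1, 1)]
score = benchmark_pass_at_k(hypothetical_results, k=1)
print(f"{100 * score:.1f}")  # prints 75.0
```

With a single sample per problem, Pass@1 reduces to the fraction of problems whose generated solution passes all tests.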

