openai/gpt-oss-20b

Model Information

openai/gpt-oss-20b is the smaller of the two open-weight models in OpenAI’s gpt-oss family (alongside gpt-oss-120b), designed to balance reasoning strength, adaptability, and deployment efficiency. It runs on commonly available hardware while still supporting full chain-of-thought reasoning, configurable reasoning levels, and native tool-use integration.

This model is particularly well-suited for developers and researchers seeking a powerful yet cost-efficient foundation for production workloads, fine-tuning, and experimentation without requiring large-scale infrastructure.

  • Model Developer: OpenAI
  • Model Release Date: August 5, 2025
  • Supported Languages: English (primary), with generalization across multiple languages
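
As a quick orientation, here is a minimal usage sketch with the Hugging Face transformers library. The generation settings and device mapping are assumptions rather than official recommendations; the reasoning level is requested in the system message, following published gpt-oss usage examples.

```python
# Minimal quick-start sketch (settings are illustrative assumptions).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",
    torch_dtype="auto",   # pick an appropriate dtype automatically
    device_map="auto",    # spread the model across available devices
)

messages = [
    # Select the reasoning level (low / medium / high) via the system message.
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "Explain why the sky is blue in two sentences."},
]

result = generator(messages, max_new_tokens=256)
print(result[0]["generated_text"][-1]["content"])  # the assistant's reply
```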

Model Architecture

openai/gpt-oss-20b is a sparse Mixture-of-Experts (MoE) Transformer, optimized to deliver strong reasoning without the infrastructure demands of much larger dense models. Because only a few experts are activated for each token, per-token compute stays low, making the model well suited to research, prototyping, and production in environments with limited GPU capacity.
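
To make the sparse routing concrete, here is a small illustrative NumPy sketch of top-k expert selection. It mirrors the 32-experts / 4-active configuration listed below, but it is a simplified illustration, not the actual gpt-oss implementation.

```python
# Illustrative top-k MoE routing sketch (not the real gpt-oss code).
import numpy as np

def moe_layer(x, gate_w, expert_ws, k=4):
    """Route each token to its top-k experts and mix their outputs."""
    logits = x @ gate_w                          # (tokens, n_experts) router scores
    topk = np.argsort(logits, axis=-1)[:, -k:]   # indices of the k best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, topk[t]]
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                 # softmax over selected experts only
        for w, e in zip(weights, topk[t]):
            out[t] += w * (x[t] @ expert_ws[e])  # weighted sum of expert outputs
    return out

rng = np.random.default_rng(0)
d, n_experts = 64, 32
x = rng.standard_normal((8, d))                  # 8 tokens of width d
gate_w = rng.standard_normal((d, n_experts))
expert_ws = rng.standard_normal((n_experts, d, d))
y = moe_layer(x, gate_w, expert_ws)              # only 4 of 32 experts run per token
```

Only the selected experts' weights participate in each token's forward pass, which is why a model with ~21B total parameters needs only a few billion active parameters per token.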

  • Type: Decoder-only Transformer (MoE)
  • Total Parameters: 21B (~3.6B active per token)
  • Layers: 24, with 32 experts per layer (4 active per token)
  • Context Length: Up to 128K tokens
  • Attention: Grouped multi-query attention with Rotary Position Embeddings (RoPE), alternating dense and locally banded sparse layers
  • Quantization: MXFP4 post-training quantization of the MoE weights, allowing the model to run within 16 GB of memory
  • Training Format: Harmony response format (supports structured, reliable outputs)
  • Reasoning Levels: Adjustable — low, medium, high
  • Core Capabilities: Function calling, tool integration, Python execution, structured outputs
  • Fine-tuning: Supported on a single GPU, including consumer-grade hardware
  • License: Apache 2.0
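
Because the model is trained for native function calling, it works with OpenAI-compatible tool-calling APIs. The sketch below assumes the model is served locally behind an OpenAI-compatible endpoint (for example via vLLM); the URL and the get_weather tool are illustrative assumptions.

```python
# Hedged tool-calling sketch against an assumed local OpenAI-compatible server.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[{"role": "user", "content": "What's the weather in Berlin?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)  # structured call, or None if not used
```

If the model decides to use the tool, tool_calls carries the function name and JSON-encoded arguments; your code executes the function and returns the result in a follow-up message.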

Benchmark Scores

Category            Benchmark                   Metric    gpt-oss-20b (Low / Med / High reasoning)
General Knowledge   MMLU (no tools)             Accuracy  75.2 / 80.5 / 84.1
Competition Math    AIME 2024 (no tools)        Accuracy  41.8 / 63.4 / 78.9
Competition Math    AIME 2024 (with tools)      Accuracy  59.7 / 77.5 / 88.3
Competition Math    AIME 2025 (no tools)        Accuracy  39.1 / 62.0 / 75.4
Competition Math    AIME 2025 (with tools)      Accuracy  58.2 / 80.3 / 89.5
Science Reasoning   GPQA Diamond (no tools)     Accuracy  55.9 / 61.2 / 68.7
Science Reasoning   GPQA Diamond (with tools)   Accuracy  57.0 / 62.1 / 70.1
Programming         Codeforces (no tools)       Elo       1422 / 1820 / 2050
Programming         Codeforces (with tools)     Elo       1489 / 1930 / 2167
Health Domain       HealthBench                 Score     47.3 / 50.1 / 52.9

Scores rise consistently with higher reasoning levels, and tool use adds further gains, most notably on competition math (roughly 9 to 19 points on AIME) and competitive programming (roughly 70 to 120 Elo on Codeforces).

