OpenAI Open Source Model (gpt-oss series). Source: OpenAI
OpenAI has released the gpt-oss series, a new line of open-weight models built for strong reasoning, agentic tasks, and flexible developer use. The series has two variants:
gpt-oss-120b: 117B total parameters, 5.1B of them active. Suited to production, general-purpose, high-reasoning use cases, and fits on a single H100 GPU.
gpt-oss-20b: 21B total parameters, 3.6B of them active. Suited to low-latency, local, or specialized use cases.
Apache 2.0 License: Use, modify, and distribute the models, including in commercial products, under a permissive license.
Custom reasoning levels: Choose the model's reasoning level (low, medium, or high) based on your use case and latency needs.
Full chain-of-thought: Full access to how the model reasons, which helps with debugging and understanding its outputs.
Fine-tuning: Train the models further to fit your specific needs.
Built-in tools: Includes function calling, web browsing, Python code execution, and structured outputs.
Native MXFP4 quantization: The models use the memory-efficient MXFP4 format, which allows gpt-oss-120b to run on a single H100 GPU and gpt-oss-20b to run within 16GB of memory.
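To see why these memory figures are plausible, here is a rough back-of-the-envelope sketch. It assumes every parameter is stored in MXFP4, and that MXFP4 costs about 4.25 bits per parameter (a 4-bit value plus one 8-bit scale shared by each block of 32 values). Both are simplifying assumptions that the text above does not state. Real checkpoints may keep some tensors in higher precision, and inference also needs memory for activations and the KV cache, so treat the result as a lower bound rather than a measurement.

```python
# Back-of-the-envelope weight-memory estimate for MXFP4 checkpoints.
# Assumption: every parameter is stored in MXFP4, i.e. a 4-bit value plus
# one shared 8-bit scale per block of 32 values (4.25 bits per parameter).
MXFP4_BITS_PER_PARAM = 4 + 8 / 32  # 4.25


def weight_memory_gb(num_params: float,
                     bits_per_param: float = MXFP4_BITS_PER_PARAM) -> float:
    """Return the approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Parameter counts are the totals quoted in the model list above.
for name, params in [("gpt-oss-120b", 117e9), ("gpt-oss-20b", 21e9)]:
    print(f"{name}: ~{weight_memory_gb(params):.1f} GB of weights")
```

Under these assumptions the estimates come out to roughly 62 GB and 11 GB, which is consistent with the single-H100 and 16GB claims.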
NOTE
gpt-oss-120b and gpt-oss-20b were trained on the harmony response format, which is the only format that should be used with them.
The harmony response format is made up of “messages” and the model may generate several messages at once. Each message generally follows this structure:
```text
<|start|>{header}<|message|>{content}<|end|>
```
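As a minimal sketch of that structure, the helper below assembles messages with plain string formatting. The function name `render_message`, the `Reasoning: high` system text, and the open `<|start|>assistant` turn at the end are illustrative assumptions, not an official API. For real prompts, prefer the tooling described in the OpenAI Harmony Response Format documentation, since a complete system message typically carries more than this.

```python
def render_message(header: str, content: str, end_token: str = "<|end|>") -> str:
    """Render one message as <|start|>{header}<|message|>{content}{end_token}."""
    return f"<|start|>{header}<|message|>{content}{end_token}"


# Hypothetical prompt: a system message that sets the reasoning level,
# followed by a user message. The "Reasoning: high" wording is an assumption.
prompt = (
    render_message("system", "Reasoning: high")
    + render_message("user", "What is 2 + 2?")
    + "<|start|>assistant"  # leave the assistant turn open for the model to complete
)
print(prompt)
```

Messages are simply concatenated, with no separator between them. The header can carry more than a role, for example `assistant<|channel|>final`.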
Here’s an example of how special tokens are used in the harmony message format for chat conversations. For more use cases, see the OpenAI Harmony Response Format.
```text
<|channel|>analysis<|message|>User asks: "What is 2 + 2?" Simple arithmetic. Provide answer.<|end|><|start|>assistant<|channel|>final<|message|>2 + 2 = 4.<|return|>
```
The output includes a message in the analysis channel for the model's chain-of-thought reasoning. Then it switches to the final channel and ends with <|return|> when the final answer is generated.
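The channel structure makes it straightforward to separate the reasoning from the answer. The sketch below uses a regular expression to group message contents by channel. The helper name `split_channels` is hypothetical, and the pattern only handles the simple case shown above, where the channel name is immediately followed by `<|message|>`. Tool calls and other richer headers would need a proper parser.

```python
import re

# Matches "<|channel|>NAME<|message|>CONTENT" up to the next <|end|> or <|return|>.
CHANNEL_MESSAGE = re.compile(
    r"<\|channel\|>(?P<channel>\w+)<\|message\|>(?P<content>.*?)"
    r"(?:<\|end\|>|<\|return\|>)",
    re.DOTALL,
)


def split_channels(raw_output: str) -> dict[str, list[str]]:
    """Group message contents by channel name, preserving their order."""
    channels: dict[str, list[str]] = {}
    for match in CHANNEL_MESSAGE.finditer(raw_output):
        channels.setdefault(match["channel"], []).append(match["content"])
    return channels


raw = (
    '<|channel|>analysis<|message|>User asks: "What is 2 + 2?" Simple arithmetic. '
    "Provide answer.<|end|><|start|>assistant<|channel|>final<|message|>2 + 2 = 4.<|return|>"
)
parsed = split_channels(raw)
print(parsed["analysis"][0])  # the chain-of-thought text
print(parsed["final"][0])     # 2 + 2 = 4.
```

In an application you would typically show only the `final` channel to end users and keep the `analysis` channel for debugging.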
If you want to try the code, you can install the gpt-oss package directly from PyPI.
```bash
# if you just need the tools
pip install gpt-oss
# if you want to try the torch implementation
pip install gpt-oss[torch]
# if you want to try the triton implementation
pip install gpt-oss[triton]
```