Peter Zhang
Mar 17, 2026 18:05
OpenAI releases GPT-5.4 mini and nano models with 2x faster speeds and dramatically lower prices, targeting coding assistants and agentic AI systems.
OpenAI dropped its most cost-efficient models yet on March 17, 2026: GPT-5.4 mini and nano, aimed at developers building latency-sensitive applications where the flagship model's horsepower becomes overkill.
The mini variant runs more than twice as fast as GPT-5 mini while approaching the full GPT-5.4's performance on coding benchmarks. On SWE-Bench Pro, mini scored 54.4% compared to the flagship's 57.7%, a narrow gap that matters when you're paying 75 cents per million input tokens instead of premium rates.
Nano goes even cheaper at $0.20 per million input tokens and $1.25 per million output tokens. OpenAI positions it for classification, data extraction, and what they call "coding subagents": smaller AI workers handling simpler tasks within larger systems.
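At those rates, workload cost is simple arithmetic. A quick sketch using the prices quoted above (mini's output rate isn't stated in the announcement, so the mini figure below covers input tokens only):

```python
def token_cost(input_tokens: int, output_tokens: int,
               input_rate: float, output_rate: float) -> float:
    """Cost in dollars; rates are quoted per million tokens."""
    return (input_tokens / 1e6) * input_rate + (output_tokens / 1e6) * output_rate

# Nano: $0.20/M input, $1.25/M output (rates from the announcement)
nano = token_cost(10_000_000, 2_000_000, 0.20, 1.25)
print(f"nano, 10M in / 2M out: ${nano:.2f}")  # $4.50

# Mini at $0.75/M input; output rate not stated, so input only
mini_in = token_cost(10_000_000, 0, 0.75, 0.0)
print(f"mini, 10M in (input only): ${mini_in:.2f}")  # $7.50
```

At this volume the spread is modest in absolute dollars, but it compounds quickly for the high-throughput subagent workloads described below.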
The Subagent Play
Here's where this gets interesting for developers building agentic systems. OpenAI is explicitly pushing a tiered architecture: let GPT-5.4 handle planning and complex judgment while mini or nano subagents execute narrower tasks in parallel. In their Codex platform, mini uses only 30% of the GPT-5.4 quota.
The benchmark numbers back this up. Mini hit 72.1% on OSWorld-Verified for computer use tasks, nearly matching the flagship's 75%, while nano dropped to 39%. Translation: mini can interpret screenshots and navigate interfaces almost as well as the big model, but nano shouldn't touch these workflows.
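That capability split implies a simple routing rule for a tiered system. A hypothetical dispatcher sketch (the task categories and the mapping are illustrative, following the guidance above; only the model names come from the announcement):

```python
# Illustrative task-to-model routing for a tiered agent system.
# Computer-use work (screenshots, UI navigation) stays on mini, which
# nearly matches the flagship (72.1% vs 75% OSWorld-Verified); nano
# (39%) is reserved for classification and extraction subtasks.
ROUTES = {
    "planning": "gpt-5.4",            # complex judgment stays on the flagship
    "computer_use": "gpt-5.4-mini",
    "coding_subtask": "gpt-5.4-mini",
    "classification": "gpt-5.4-nano",
    "extraction": "gpt-5.4-nano",
}

def pick_model(task_type: str) -> str:
    # Unknown task types fall back to the flagship rather than
    # risk sending them to a model that can't handle them.
    return ROUTES.get(task_type, "gpt-5.4")
```

The fallback direction is the design choice that matters: when in doubt, escalate to the stronger model and pay more, rather than silently degrade quality.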
Where Each Model Fits
The performance spread tells you exactly what OpenAI optimized for:
Mini excels at coding (54.4% SWE-Bench Pro, 60% Terminal-Bench 2.0) and tool-calling (93.4% on τ2-bench telecom tasks). It supports a 400K context window with text and image inputs, web search, and function calling.
Nano trades capability for cost efficiency. It scored 52.4% on SWE-Bench Pro and 46.3% on Terminal-Bench 2.0, respectable for a model at one-quarter mini's price point. But its long-context performance drops significantly, hitting just 33.1% on the 128K-256K needle retrieval test.
Hebbia’s CTO Aabhas Sharma noted that mini “matched or exceeded competitive models on several output tasks and citation recall at a much lower cost” while achieving “stronger source attribution than the larger GPT-5.4 model.”
Availability
Mini is live across the API, Codex, and ChatGPT. Free and Go users can access it via the Thinking feature; other tiers get it as a rate limit fallback for GPT-5.4 Thinking.
Nano remains API-only, a signal that OpenAI sees it primarily as infrastructure for developers rather than a consumer-facing product.
For teams running high-volume AI workloads, the math just changed. The question is no longer whether to use smaller models; it's figuring out which tasks actually need the flagship.
Image source: Shutterstock

