OpenAI has launched its o3-mini AI model as it tries to stay ahead in the AI race. The release comes in response to China’s DeepSeek R1 and is part of the o3 series that OpenAI announced last December.
The o3-mini-high model, available to ChatGPT Plus and Pro users, delivers top performance in several key areas. Here’s what makes it stand out.
The o3-mini-high model excels in coding, outperforming many AI competitors. When tested, it produced a working Python snake game with autonomous snakes on the first attempt, with no follow-up fixes needed.
It has an Elo rating of 2,130 on Codeforces, placing it among the world’s top 2,500 programmers. On the SWE-bench Verified benchmark, it scored 49.3%, surpassing OpenAI’s larger o1 model.
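To give a rough sense of what a 2,130 rating means, here is a minimal sketch of the standard Elo expected-score formula. It is only an illustration: Codeforces computes ratings with its own Elo-like system based on contest standings rather than head-to-head games, and the 1,500-rated opponent below is a hypothetical figure chosen for the example, not a number from any benchmark.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of player A against player B.

    The result lies between 0 and 1 and can be read as A's expected
    share of the points (a draw counts as half a point).
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


# Rating reported for o3-mini-high in the article.
O3_MINI_HIGH_RATING = 2130
# Hypothetical opponent rating, assumed purely for illustration.
HYPOTHETICAL_OPPONENT_RATING = 1500

if __name__ == "__main__":
    expected = elo_expected_score(O3_MINI_HIGH_RATING, HYPOTHETICAL_OPPONENT_RATING)
    print(f"Expected score vs. a {HYPOTHETICAL_OPPONENT_RATING}-rated opponent: {expected:.3f}")
```

Under this textbook formula, a 630-point rating gap corresponds to an expected score above 0.97, which is why a rating in this range is associated with the top tier of competitors.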
Math is another area where o3-mini-high shines. It scored 87.3% on the 2024 American Invitational Mathematics Examination (AIME), outperforming the larger o1 model.
On the FrontierMath benchmark, which features expert-level problems, it solved 20% of the problems when given eight attempts. Even with a single attempt, it scored 9.2%, significantly higher than other AI models, most of which struggle to reach 2%.
The o3-mini-high model also handles advanced science questions well, scoring 79.7% on the GPQA Diamond benchmark. This test evaluates AI models on complex topics in biology, physics, and chemistry.
In comparison, Google’s Gemini 2.0 Flash Thinking scored 73.3%, while Claude 3.5 Sonnet reached only 65%. This makes o3-mini-high one of the best AI models for science-related queries.
Despite its smaller size, o3-mini-high performs well in general knowledge. On the MMLU benchmark, which tests AI models across multiple subjects, it scored 86.9%, close to GPT-4o’s 88.7%. It doesn’t beat larger models here, but the upcoming full o3 model is expected to lead in this area.
Unlike many AI models, o3-mini can access the web, allowing it to retrieve and analyze the latest information.
Its knowledge cutoff is October 2023, but web search lets it draw on more recent information. OpenAI’s competitor, DeepSeek R1, also offers this feature, but most other AI models do not.
Should You Upgrade to ChatGPT Plus?
Free ChatGPT users can access o3-mini, but only at medium reasoning effort, which uses less compute than the high setting.
Upgrading to ChatGPT Plus ($20/month) unlocks o3-mini-high, which provides stronger performance in coding, math, and science. For coders, researchers, and STEM students, the upgrade is well worth it.