Byte-Sized Intelligence July 31 2025

AI aces Olympiad Math; emergent skills in action

This week: AI hit gold at the world’s toughest math competition; let’s break down what that means and explore emergent abilities in AI models.

AI in Action

When AI solves Olympiad math like a human [Milestone/Reasoning]

In a breakthrough moment for AI reasoning, Google DeepMind’s Gemini just earned a gold-medal score at the International Mathematical Olympiad (IMO) using its new Deep Think model. The IMO is a global competition known for its notoriously difficult, proof-based math problems. The model solved 5 out of 6 questions, reaching a score on par with the world’s top student competitors. This is a milestone that once seemed out of reach for machines.

For years, AI models struggled with Olympiad math. Smaller systems, like GPT-2 or early GPT-3, failed to reason through multi-step logic or produce valid proofs. Some previous approaches relied on symbolic solvers or code-based workarounds. But Gemini did something fundamentally different: it solved every problem entirely in natural language. That means no calculators, no plugins, and no translating math into formal programming syntax. Instead, the model reasoned through each problem using plain English (and LaTeX-style formatting to represent formulas), writing out step-by-step explanations just like a human competitor might on paper. It shows that AI models can now work through logic flexibly rather than relying on memorized patterns.

Gemini’s success came from its new Deep Think architecture, which explores multiple solution paths in parallel and chooses the best one. OpenAI, too, reached a similar milestone: its experimental reasoning model also achieved gold-level performance, though it didn’t officially enter the competition. Independent IMO medalists reviewed its results, which relied on a different method, combining massive compute with consensus-based reasoning.

These contrasting methods reflect two philosophies in AI: Gemini focuses on building new reasoning processes, while OpenAI refines intelligence through scale and alignment strategy. Both are impressive, and both suggest that advanced models are beginning to tackle problems in ways that look more like reasoning, not just pattern matching.

Bits of Brilliance

When AI starts doing things you didn’t teach it [AI Model/Emergence]

One of the most surprising things about large AI models is that sometimes, they develop skills no one explicitly taught them. These are called emergent abilities: new capabilities that only appear once a model crosses a certain threshold of size, compute, or training data. Smaller models fail completely, but suddenly, something just clicks.

For example, at scale, models start to show chain-of-thought reasoning, solving logic problems step by step. They also pick up code generation, writing and debugging scripts in multiple languages without formal programming training. And with the right prompts, some can simulate multi-step planning, like booking trips or managing schedules, mimicking tool use without actually using tools.

These abilities weren’t manually programmed. They emerged, unexpectedly, through scale. That’s what makes large models powerful, and unpredictable. Researchers still don’t fully understand why these tipping points happen, only that they tend to appear suddenly once a model becomes large enough. It’s both exciting and unsettling.

Emergent abilities suggest scaling can produce surprisingly flexible forms of intelligence, but they also raise deeper questions: If a model develops skills we didn’t directly teach it, how do we ensure it behaves as expected? And who gets access to these capabilities, especially as they become more powerful? Emergence sure is exciting, but it also challenges how we think about control, trust, and responsibility in AI.

Next week, we’ll look more closely at some of the most useful emergent skills: why some models develop them better than others, and what that means for future work.

Curiosity in Clicks

How does your chatbot solve math questions? [Reasoning/Math]

This week, we saw AI models hit gold-medal scores at the world’s top math competition. Want to see how your chatbot stacks up? Let’s try an Olympiad-style question with your chatbot.

“Solve this step by step like a math student: In a group of 100 people, each person shakes hands with every other person exactly once. How many handshakes happen in total? Explain how you get the answer.”
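If you want to check your chatbot’s work, this puzzle has a closed-form answer: with n people, the number of handshakes is “n choose 2”, or n(n − 1)/2. Here is a minimal sketch in Python (the function name `handshakes` is just illustrative, not part of any library):

```python
from math import comb

def handshakes(n: int) -> int:
    # Each pair of people shakes hands exactly once,
    # so the total is "n choose 2" = n * (n - 1) / 2.
    return comb(n, 2)

print(handshakes(100))  # 100 * 99 / 2 = 4950
```

So a chatbot reasoning correctly, with or without formulas, should land on 4,950.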

Now, take it further: ask the same question again, but say, “Answer using only natural language, without equations or formulas.” Can it still reason its way through?

This is chain-of-thought reasoning in action, one of the emergent abilities that large models begin to develop once they scale.

Byte-Sized Intelligence is a personal newsletter created for educational and informational purposes only. The content reflects the personal views of the author and does not represent the opinions of any employer or affiliated organization. This publication does not offer financial, investment, legal, or professional advice. Any references to tools, technologies, or companies are for illustrative purposes only and do not constitute endorsements. Readers should independently verify any information before acting on it. All AI-generated content or tool usage should be approached critically. Always apply human judgment and discretion when using or interpreting AI outputs.