Recursion Is The Next Scaling Law In AI
Imagine an artificial intelligence model with only 7 million parameters, trained from scratch, that outperforms models hundreds of times larger, trained on the entire Internet, on problems such as Sudoku or the well-known ARC Prize tests. It seems impossible, doesn't it? Yet in 2025, two academic papers showed that it is not necessary to keep inflating model size to get better performance: the gain comes from recursion applied at inference time, that is, while the model is reasoning, not while it is training.

The conventional wisdom about AI was clear: the larger the model, the more powerful it becomes. But this rule is crumbling. Recursive models such as HRM and TRM show that progress comes not only from scale, but from how the model manages to "think in multiple steps", recursively, during reasoning. Recursion, that is, applying the same set of rules to its own output multiple times, makes it possible to tackle problems that large LLMs only address superficially.

Take Francois Chopard, one of the key players in this shift. He recounts how, until 2016, the hope for AI was pinned on RNNs: recurrent models that were held back by technical problems such as the notorious "backpropagation through time", in which gradients exploded or vanished as they were propagated back through many steps, making the deepest networks unstable to train. Then came the Transformers, which process everything in parallel during training and bypass these problems, but they pay a price: every time they have to reason, they have to "remember" the entire context, as if every time you read a page you had to carry an entire volume of Shakespeare with you. It seems powerful, but it actually holds them back on tasks that need real chains of reasoning, such as sorting a list or solving a Sudoku. One example sticks in the mind: if you ask an LLM to sort a list with 31 items in a single forward pass, but the model only has 30 levels of "depth", it simply can't do it.
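The depth limit in the sorting example can be illustrated with a toy analogy. This is not how an LLM sorts, and the function names are made up for illustration. The sketch assumes each "layer" may only compare and swap neighbouring items, so a value moves at most one position per layer. With a fixed budget of 30 layers, a reversed list of 31 items cannot be fully sorted, while applying the same layer repeatedly until nothing changes always finishes the job.

```python
def compare_exchange_layer(values, parity):
    """One toy 'layer': compare neighbouring pairs starting at index
    `parity` (0 or 1) and swap any pair that is out of order."""
    out = list(values)
    for i in range(parity, len(out) - 1, 2):
        if out[i] > out[i + 1]:
            out[i], out[i + 1] = out[i + 1], out[i]
    return out


def fixed_depth_sort(values, depth):
    """Fixed-depth model: exactly `depth` layers, no more, whatever the input."""
    state = list(values)
    for layer in range(depth):
        state = compare_exchange_layer(state, layer % 2)
    return state


def recursive_sort(values):
    """Recursive model: reuse the same two layers until the state stops changing."""
    state = list(values)
    while True:
        refined = compare_exchange_layer(compare_exchange_layer(state, 0), 1)
        if refined == state:  # no swap happened, so every neighbouring pair is ordered
            return state
        state = refined


items = list(range(31, 0, -1))  # 31 items in reverse order
print(fixed_depth_sort(items, depth=30) == sorted(items))  # False
print(recursive_sort(items) == sorted(items))              # True
```

In this toy, the smallest item starts 30 positions from where it belongs and gets no partner in the very first layer, so 30 layers leave it one step short. No amount of training data changes that, only more steps do.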
It's not a question of data; it's a structural barrier. That's why HRM and TRM make the difference. HRM, for example, takes inspiration from the human brain, where different parts work at different frequencies: a low level handles quick details, and a high level controls slower, deeper strategies. But the real magic lies in the outer refinement loop, which lets the model go over its own answers several times, improving them each time, without the model itself having to grow exponentially.

The trick is to get around the old curse of backpropagation through time with ideas known as "deep equilibrium" and "truncated backpropagation": instead of propagating errors through all the recursions, training stops at one point and starts again from the current state, creating a kind of mini-batch over the internal memory rather than over the inputs. In practice, at each cycle, the model updates two types of memory: a local one, ZL, which works on the details, and a more global one, ZH, which keeps track of the overall picture.

This scheme makes it possible to solve problems that LLMs handle only with "hacks" such as chain of thought, that is, writing out each piece of reasoning step by step, or delegating to external tools such as Python functions. But beware: even these shortcuts stop where human knowledge stops. If you want a model to discover a new algorithm, such as merge sort, without anyone ever teaching it, chain of thought is not enough. True recursion, on the other hand, can do it. The Sudoku example makes this clear: the recursive model can discover strategies never seen before, without being guided step by step by human data.

And there's more: TRM takes simplification to the extreme. It reduces the network layers to just one, goes from 27 to 7 million parameters, and yet its accuracy rises from 70% to 87% on tasks like ARC Prize. This flips the logic: the goal is no longer to "just go bigger", but to "think more deeply".
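The two-memory loop described above for HRM can be sketched roughly as follows. This is a toy under stated assumptions, not the papers' implementation: the update rules, their coefficients, the loop counts, and the names `low_level_step`, `high_level_step` and `refine` are all hypothetical. The `tracked` flag only marks which outer cycle would carry gradients under truncated backpropagation. In an automatic-differentiation framework the earlier cycles would run with the state detached.

```python
import math


def low_level_step(z_l, z_h, x):
    """Hypothetical fast update: refine details (ZL) given the plan (ZH) and the input."""
    return [math.tanh(0.5 * l + 0.3 * h + 0.2 * xi) for l, h, xi in zip(z_l, z_h, x)]


def high_level_step(z_h, z_l):
    """Hypothetical slow update: revise the overall picture (ZH) from the details (ZL)."""
    return [math.tanh(0.7 * h + 0.3 * l) for h, l in zip(z_h, z_l)]


def refine(x, outer_cycles=3, high_steps=2, low_steps=4):
    """Run the outer refinement loop; the same two functions are reused every cycle."""
    z_l = [0.0] * len(x)  # local memory, updated often
    z_h = [0.0] * len(x)  # global memory, updated rarely
    trace = []
    for cycle in range(outer_cycles):
        # Truncation (assumed scheme): only the last cycle would be differentiated;
        # earlier cycles just hand their final state to the next one.
        tracked = cycle == outer_cycles - 1
        for _ in range(high_steps):
            for _ in range(low_steps):
                z_l = low_level_step(z_l, z_h, x)
            z_h = high_level_step(z_h, z_l)
        trace.append((cycle, tracked))
    return z_h, z_l, trace


z_h, z_l, trace = refine([0.1, -0.4, 0.9])
print(trace)  # [(0, False), (1, False), (2, True)]
```

The point of the sketch is that more outer cycles means more computation on the same parameters. The depth of reasoning is set by a loop count, not by the number of weights.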
And there is a quote from Mel Mitchell, a researcher mentioned in the podcast, that captures the point: "It is sufficient, not necessary, to go bigger to improve. It is sufficient, not necessary, to add more recursion." The question that remains is: what happens if you combine these two forces? If tomorrow there are gigantic models that can also reason recursively, the scale of what they can do will change again.

Not everyone is convinced that drawing heavily on biology is the right way to go: sometimes machine learning works better when it moves away from the human brain and adapts to the hardware, as shown by the transition from AlexNet to VGG, where "neural" inspirations were abandoned in favor of a simplicity that works well on GPUs. But the fact remains: recursion allows tiny models to beat giants, as long as the problem requires multi-step reasoning.

Today, recursive models are task-specific: a TRM that can do Sudoku can't solve a maze, and vice versa. But as soon as a way is found to generalize this recursion, we will have agents capable of really reasoning "like thinking beings", not just imitating them. The phrase to remember is this: the next scaling law for AI will not just be "bigger is better", but "more recursive is better".

This summary covers an episode of Decoded by Y Combinator and saves you 34 minutes of listening.