Tiny Recursive Model (TRM): A Deep Dive

Deep dive into the architecture and inner workings of the 7M parameter Tiny Recursive Model (TRM) that beats the most advanced reasoning LLMs on complex problems.

Oct 16, 2025

∙ Paid

A new class of AI models is emerging. It’s called the Recursive Reasoning Model.

The architecture of the earliest recursive reasoning model, called the Hierarchical Reasoning Model (HRM), was published by Sapient Intelligence in early 2025.

The biologically inspired HRM consists of two interdependent neural networks operating at different frequencies (one updating faster than the other).

With just 27 million parameters, it outperforms powerful LLMs on complex tasks, such as solving challenging Sudoku puzzles, finding optimal paths in large mazes, and on ARC-AGI, when trained using only 1,000 examples.

While these results were impressive enough, new research published by a researcher at Samsung SAIT AI Lab has improved HRMs to reach even better performance.

Their newly proposed Tiny Recursive Model (TRM) uses a single small network with only two layers.

Architecture of the Tiny Recursive Model (TRM)

With only 7 million parameters, the TRM achieves a test accuracy of 45% on ARC-AGI-1 and 8% on ARC-AGI-2. (If you’re new to them, ARC-AGI benchmarks are specifically designed to act as a helpful signal for reaching AGI.)

This is a better result than most advanced reasoning LLMs available today (including Deepseek R1, o3-mini, and Gemini 2.5 Pro), achieved with less than 0.01% of their parameters.

Here is a story where we understand the architecture and inner workings of the Tiny Recursive Model (TRM) and explore the reasons why it beats most advanced reasoning LLMs available to us today.

Let’s begin!

Before we start, I want to introduce you to my new book called ‘LLMs In 100 Images’.

It is a collection of 100 easy-to-follow visuals that describe the most important concepts you need to master LLMs today.

Grab your copy today at a special early bird discount using this link.

But First, How Do LLMs Reason?

Reasoning in LLMs is a popular area of AI research.

To ensure that LLMs can reliably answer complex queries, they use a technique called Chain-of-Thought (CoT) reasoning. CoT imitates human reasoning by having the LLM produce step-by-step reasoning traces before giving its final answer.

Using CoT involves generating more tokens during inference (called Inference-time scaling), which in turn means using more inference compute. Generating high-quality CoT traces also means training the LLM to do so by using high-quality training datasets using expensive RL techniques.

Despite their extensive use, CoT-based LLMs still fail on benchmarks like ARC-AGI. As an example, while humans can fully solve the tasks in ARC-AGI-2, OpenAI’s o3 (high) achieves merely 6.5% accuracy on it.

A ray of hope emerged with the introduction of the Hierarchical Reasoning Model (HRM) in early 2025, which, with only 27 million parameters and using only 1000 training samples, achieved exceptional performance on complex reasoning tasks, including ARC-AGI.

HRM outperforms all other reasoning LLMs on complex Sudoku puzzles, optimal pathfinding in large mazes, and ARC-AGI benchmarks. (Image from the ArXiv research paper titled ‘Hierarchical Reasoning Model’)

To understand Tiny Recursive Models, you’ll have to understand HRMs well. Let’s learn about them in depth before proceeding further.

What Is The Hierarchical Reasoning Model (HRM)?

The HRM architecture is inspired by the human brain and consists of four components:

Input network (f(I)), which converts a given input into embeddings, passing it to the low-level module
A faster, low-level module (f(L)) for detailed computations (the “Worker” module)
A slower, high-level module (f(H)) for abstract, deliberate reasoning (the “Controller” module)
Output head (f(O)), which gets the output from the high-level module and produces a final output

Both low and high-level modules follow the 4-layer Transformer architecture, with:

RMSNorm
No bias in linear layers (following the PaLM architecture)
Rotary embeddings, and
SwiGLU activation function

An HRM uses two interdependent neural networks that work at different paces, similar to how the brain processes information when solving problems. (Image from the ArXiv research paper titled ‘Hierarchical Reasoning Model’)

Understanding A Forward Pass In An HRM

Given an input x̃, the high-level and low-level modules start with their initial latent vectors (z(H) and z(L), respectively) and recursively update them.

The low-level module updates z(L) at a higher frequency, and the high-level module updates z(H) at a lower frequency.

After recursion, the latent vector of the high-level module z(H) is used to reach the final answer ŷ using the output head.

Mathematically, a forward pass of HRM looks as follows:

Let’s understand this step by step.

Into AI

Tiny Recursive Model (TRM): A Deep Dive

Deep dive into the architecture and inner workings of the 7M parameter Tiny Recursive Model (TRM) that beats the most advanced reasoning LLMs on complex problems.

But First, How Do LLMs Reason?

What Is The Hierarchical Reasoning Model (HRM)?

Understanding A Forward Pass In An HRM

This post is for paid subscribers