August Recap On "Into AI"
Here are the 5 interesting research papers that we discussed this month.
Hey there! 👋🏻
Thanks for being a curious subscriber to this publication. I’m glad you’re one of the individuals who love staying at the forefront of AI research and development.
August was a month marked by some amazing research publications!
Here are 5 of them that we discussed in depth this month.
Before we move forward, I wanted to let you know that my latest book, called “LLMs In 100 Images”, has become my best-read book of all time!
It is a collection of 100 easy-to-follow visuals that describe the most important concepts you need to master LLMs today.
Grab your copy today at a special early bird discount using this link.
Next up, the August recap!
1. A 27M Hierarchical Reasoning Model Beats OpenAI's 'o3-mini-high'
A small Singapore-based AI lab, founded in 2024, called Sapient Intelligence, published a new AI architecture called Hierarchical Reasoning Model (HRM).
With just 27 million parameters and 1,000 training samples, HRM achieves near-perfect scores on complex Sudoku puzzles and maze path-finding, tasks where models like o3‑mini‑high, Claude 3.7 8K, and DeepSeek‑R1 all score zero.
2. Top Vision Models Cannot Really See Our World
A new benchmark, known as the Turing Eye Test (TET), reveals that state-of-the-art multimodal large language models continue to struggle with accurately perceiving visual data.
In this article, we also learn how multimodal LLMs work and how they are trained, starting from the basics.
3. You Don’t Need Normalization In Transformers Anymore
Meta researchers introduced Dynamic Tanh (DyT), a simple element-wise function (a scaled tanh) that removes the need for normalization layers in Transformers.
DyT simplifies architectures, reduces complexity, and still matches or even improves performance across vision and language tasks, making it a clean and efficient alternative for future AI models.
In this article, we also learn how Normalization works, in depth and from scratch.
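To make the idea concrete, here is a minimal NumPy sketch of the DyT operation described above: a learnable scalar alpha inside a tanh, followed by a per-channel scale and shift. The shapes, the initialization value, and the class interface are illustrative assumptions, not the paper's exact training setup.

```python
import numpy as np

class DyT:
    """Dynamic Tanh: y = gamma * tanh(alpha * x) + beta (a sketch)."""

    def __init__(self, dim, alpha_init=0.5):
        self.alpha = alpha_init       # learnable scalar (illustrative init)
        self.gamma = np.ones(dim)     # learnable per-channel scale
        self.beta = np.zeros(dim)     # learnable per-channel shift

    def __call__(self, x):
        # Unlike LayerNorm, no mean or variance statistics are computed:
        # tanh squashes extreme activations element-wise instead.
        return self.gamma * np.tanh(self.alpha * x) + self.beta

layer = DyT(dim=4)
x = np.array([[-10.0, -1.0, 1.0, 10.0]])
y = layer(x)
print(y)  # outputs stay bounded in (-1, 1) per channel
```

The appeal is that this drops in where a normalization layer would sit, with no batch statistics to track, which is part of why it simplifies the architecture.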
4. R-Zero: A Method For Training Reasoning LLMs With Zero Data Is Here
R-Zero is a framework that lets LLMs improve their reasoning skills without external training data.
By having one copy of the model generate challenging problems (the Challenger) and another attempt to solve them (the Solver), the system self-trains on tasks where performance hovers around 50%.
This self-play approach improves reasoning ability and generalizes across domains, paving the way for truly self-improving AI.
We also discuss Self-evolving LLMs from the basics in this article.
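The Challenger/Solver loop above can be sketched in a toy form. A real R-Zero system uses two copies of an LLM; here the "Challenger" proposes problems of random difficulty and the "Solver" is simulated by a simple skill-vs-difficulty success probability. All function names, numbers, and the success model are my own illustrative assumptions, kept only to show the "train near 50% solve rate" filtering step.

```python
import random

def solve_rate(skill, difficulty, trials=200):
    # Empirical success rate of the simulated Solver on one problem.
    p = max(0.0, min(1.0, skill - difficulty + 0.5))
    wins = sum(random.random() < p for _ in range(trials))
    return wins / trials

def r_zero_round(skill, n_candidates=50, band=(0.4, 0.6)):
    # Challenger: propose candidate problems of random difficulty.
    candidates = [random.uniform(0.0, 1.0) for _ in range(n_candidates)]
    # Keep only problems where the Solver's success rate is near 50%,
    # i.e. hard enough to be informative but not hopeless.
    return [d for d in candidates if band[0] <= solve_rate(skill, d) <= band[1]]

random.seed(0)
curriculum = r_zero_round(skill=0.3)
print(f"kept {len(curriculum)} problems near the 50% frontier")
```

In the actual framework, the kept problems become the Solver's next training batch, and both roles improve together; this sketch only shows the curriculum-selection idea.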
5. LLMs Pass On Generational Trauma Just Like Humans Do
This research by Anthropic and Truthful AI uncovers a surprising phenomenon called "Subliminal learning," where a teacher LLM unintentionally transmits personal traits or preferences to a student LLM even when the training data are unrelated to that trait.
This research explains how latent biases or misalignments can silently spread during model distillation, making it difficult for trained LLMs to remain fair, accurate, and aligned.
In this article, we also compare Knowledge Distillation with Supervised Fine-Tuning (SFT).
Alongside these research-based articles, I published an article discussing the 10 Must-Know Concepts For AI Engineers To Build Better Systems.
It will help you understand what building production-ready AI systems truly requires.
I also collaborated on an article titled ‘LLMs: Common Terms Explained, Simply’, which explains modern LLMs from the absolute basics. This one is a must-read as well!
I plan to write more articles on RL and its role in training LLMs this month, and I am excited to publish them soon!
Thanks again for being a curious reader of ‘Into AI’. See you soon! 👋🏻