MonDive#40: AI That Improves Itself
How AutoResearch creates a self-improving AI loop through testing, scoring, and repeating

Welcome to MonDive
Today in MonDive, we’re looking at AutoResearch, a new open-source project from Andrej Karpathy that has been getting a lot of attention in the AI world.
But this edition is not for AI researchers only.
We’ll break down the idea in plain language, show why people are excited about it, and look at how the same concept could be useful for emails, newsletters, landing pages, and other measurable work.
By the end, you’ll understand the AutoResearch mindset without needing to code or understand machine learning.
Alright, let’s dive in.
Free PPC & Pipeline Audit for Smarter with AI Subscribers
We've helped SaaS products turn clicks into a qualified pipeline. Now we're offering a limited number of free PPC audits. What you get:
- Step-by-step funnel diagnosis to find where leads drop off
- Prioritized fix list: keywords, match types, ad creative, landing pages
- Coverage across Google, LinkedIn, Meta, and Reddit
- Or a media plan from scratch if you're just starting out
No strings. Just clarity on what's holding back your pipeline.
Why AutoResearch Matters
Most people use AI as an assistant.
They ask a question, get an answer, and then decide what to do next.
AutoResearch introduces a different way of working with AI.
Instead of helping with one task at a time, it points toward AI systems that can keep improving a result through testing and feedback.
That matters because many useful improvements do not come from one perfect idea. They come from trying small changes, seeing what worked, and building on the better result.
That is the simple idea behind self-improving AI: the system tests, measures, keeps the better version, and uses that better version as the starting point for the next test.
For businesses, creators, developers, and researchers, this changes how improvement works. Instead of relying only on human guesses, teams can build systems that learn from results over time.
AutoResearch is still early and technical, but the mindset behind it is important for everyone:
AI is no longer only helping people think of ideas. It is starting to help test which ideas actually work.
What Is AutoResearch?
AutoResearch is an open-source project released by Andrej Karpathy, an AI researcher and former AI lead at OpenAI and Tesla.
In Karpathy’s example, the system is used to improve a small AI model. The AI agent runs an experiment, measures the result, picks the better version, and then uses that winner as the starting point for the next test.

The idea works like a simple science process.
Start with a goal. Choose a metric. Run an experiment. Measure the result. Pick the winner. Then slightly change the winning version and test again.
That is the core of AutoResearch.
And this same loop does not have to stay inside AI model training.
The same concept can be used for cold emails, landing pages, ads, prompts, product descriptions, research reports, or any workflow where there is a clear result to improve.
The simple rule is:
If there is a clear number to measure and something the AI can safely change, an AutoResearch-style loop can be built around it.
How AutoResearch Works
AutoResearch works through a controlled experiment loop.

It starts with a baseline. This is the current version of whatever the system is trying to improve.
Then the AI creates one small change. In Karpathy’s original project, that change happens inside the model training setup. In simpler business use cases, the change could be an email subject line, a landing page headline, an ad hook, or a prompt.
After that, the system runs a short test and checks the result against the chosen metric.
If the new version performs better, the system keeps it.
If the new version performs worse, the system removes it and returns to the previous version.
Then the process starts again with another small change.
Over time, the system builds a record of what worked, what failed, and which direction produced better results.
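For readers who want to see the mechanics, the loop described above can be sketched in a few lines of Python. This is an illustrative sketch, not AutoResearch's actual code: `propose_change` and `run_test` are hypothetical stand-ins for whatever generates a variation and measures it against the chosen metric.

```python
import random

def autoresearch_loop(baseline, propose_change, run_test, rounds=10):
    """Generic keep-the-winner experiment loop.

    baseline       -- the current best version (any object)
    propose_change -- function: version -> slightly changed version
    run_test       -- function: version -> score (higher is better)
    """
    best = baseline
    best_score = run_test(best)
    history = [(best, best_score)]
    for _ in range(rounds):
        challenger = propose_change(best)    # one small change to the winner
        score = run_test(challenger)         # short test against the metric
        if score > best_score:               # keep the better version...
            best, best_score = challenger, score
        history.append((challenger, score))  # ...and record every attempt
    return best, best_score, history

# Toy demo: "improve" a number toward 100 by nudging it up or down.
best, score, _ = autoresearch_loop(
    baseline=0,
    propose_change=lambda v: v + random.choice([-1, 1]),
    run_test=lambda v: -abs(100 - v),  # higher means closer to 100
    rounds=500,
)
```

Because losing changes are discarded, the best score can never get worse; the `history` list is the record of what worked and what failed.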
Where AutoResearch Could Be Used

Now, let’s make the idea practical.
Here are three simple ways an AutoResearch-style loop could show up in everyday business and content work.
(Quick note: A metric is the number used to judge success, like reply rate, open rate, click rate, conversion rate, or booked calls.)
1. Cold Email Optimization
A cold email loop could test different email versions and measure which one gets more positive replies.
Instead of guessing which message sounds better, the system could compare the current email against a new version and keep the one that performs better.
It could test:
- Opening lines
- Offer angles
- CTA wording
- Follow-up messages
Here’s a simple example:
Email A says: “Quick question, are you looking for help with Google Ads?”
Email B says: “Many local businesses waste money on Google Ads because the wrong people click.”
The system sends both versions to similar audiences, checks which one gets more interested replies, and keeps the better version for the next test.
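As a sketch, the keep-the-winner step could be as simple as comparing positive-reply rates. The function name and the numbers below are hypothetical, just to make the comparison concrete:

```python
def pick_winner(results):
    """Keep the email version with the higher positive-reply rate.

    results -- dict mapping version name to (positive_replies, emails_sent)
    """
    rates = {name: replies / sent for name, (replies, sent) in results.items()}
    return max(rates, key=rates.get)

# Hypothetical numbers for the two emails above.
winner = pick_winner({
    "A: quick question": (12, 500),   # 2.4% positive reply rate
    "B: wasted ad spend": (21, 500),  # 4.2% positive reply rate
})
# The winner becomes the baseline for the next round of testing.
```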
- Metric: Positive reply rate
- AI changes: Email copy and structure
- Goal: More qualified replies
2. Newsletter Optimization
A newsletter loop could test different subject lines, intros, preview text, or content angles to see what readers respond to best.
For example:
Version A says: “The AI loop everyone is missing.”
Version B says: “How AutoResearch helps AI improve itself.”
The system could compare open rate, click rate, or replies, then keep the stronger version as the starting point for the next edition.
It could test:
- Subject lines
- Preview text
- Opening hooks
- CTA placement
- Metric: Open rate, click rate, replies
- AI changes: Subject line, intro, preview text, section title
- Goal: More readers opening, clicking, and engaging with the newsletter
3. Landing Page Testing
A landing page loop could test different versions of a page to see which one turns more visitors into customers, signups, or booked calls.
For example:
Version A says: “AI Tools for Better Marketing”
Version B says: “Stop Guessing Which Marketing Ideas Work”
The system could send traffic to both versions, measure which one gets more action, and keep the stronger version for the next test.
It could test:
- Headlines
- CTA buttons
- Social proof
- Offer positioning
- Metric: Conversion rate
- AI changes: Headline, CTA, offer, page section
- Goal: More visitors taking action
The Beginner-Friendly Explanation
Best video to watch first if AutoResearch feels confusing.
- Explains the core idea in simple language: AI runs tests, checks results, keeps what works, and repeats.
- Shows why the loop needs a clear metric, so the AI knows what “better” means.
- Helps readers understand where AutoResearch works well and where it can fail.
See How You Can Use It
- Shows how AutoResearch can be used for real business experiments, not just AI model training.
- Uses cold email as a simple example: test two email versions and keep the one that gets more replies.
- Helps you understand the idea of a baseline and a challenger version.
- Gives practical examples like email copy, landing pages, ads, newsletter subject lines, and product descriptions.
How did you feel about today’s MonDive? Was this guide easy to follow?
Know someone who may be interested?
And that's a wrap on today's MonDive!
