
AI Agents Are No Longer Assisting Scientists. They Are Doing the Science.

·8 mins·
Carles Abarca
Writing about AI, digital transformation, and the forces reshaping technology.

In March 2026, something shifted. Not a bigger model. Not a higher benchmark. Something deeper: AI agents stopped being tools that help scientists and started producing scientific knowledge on their own.

This week, three events converged, and I believe they mark a tipping point with no return. The ShipSquad team documented it brilliantly in their analysis “AutoResearch, OpenClaw, Claude Opus 4.6: AI Agents Are Now Doing the Science”, and it inspired me to dig deeper into what this means for scientific research as we know it.

The facts

Andrej Karpathy released AutoResearch — a 630-line Python framework that lets an AI agent run hundreds of machine learning experiments autonomously on a single GPU. You let it run overnight. You wake up to a better model and a complete discovery log. In 48 hours, the agent found ~20 improvements no human had identified, cutting GPT-2 training time by 11%.
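Neither the article nor this post reproduces AutoResearch's actual code; the sketch below is a hypothetical illustration of the loop described above (propose a tweak, run an experiment, keep what helps, log everything). Every function name, parameter, and the fake loss formula are my own inventions, not Karpathy's framework.

```python
import random

# Hypothetical sketch of an overnight autonomous-experiment loop,
# in the spirit of (but not copied from) AutoResearch.

def propose_change(rng):
    """Agent proposes one hyperparameter tweak to try next."""
    return {"lr": rng.choice([1e-4, 3e-4, 1e-3]),
            "batch_size": rng.choice([32, 64, 128])}

def run_experiment(config, rng):
    """Stand-in for a real training run; returns a fake validation loss."""
    base = 2.0
    noise = rng.random() * 0.5
    return base / (config["batch_size"] ** 0.1) + config["lr"] * 10 + noise

def overnight_run(n_experiments=100, seed=0):
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    log = []  # the "discovery log" the author wakes up to
    for i in range(n_experiments):
        config = propose_change(rng)
        result = run_experiment(config, rng)
        improved = result < best_loss
        if improved:
            best_config, best_loss = config, result
        log.append({"run": i, "config": config,
                    "loss": result, "improved": improved})
    return best_config, best_loss, log

best, loss, log = overnight_run()
print(f"{len(log)} experiments overnight, best loss {loss:.3f} with {best}")
```

A real agent would edit training code and launch genuine GPU runs; the point here is the shape of the loop, and the log of kept improvements a human reviews in the morning.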

Anthropic’s Claude Opus 4.6 discovered 22 zero-day vulnerabilities in Firefox — 14 of them high-severity — in just two weeks. For context: those 14 represent nearly one-fifth of all serious Firefox vulnerabilities patched throughout 2025. An AI model matched months of human security research.

Sakana AI unveiled The AI Scientist v2 — an agentic system that generates hypotheses, designs experiments, executes them, analyzes results, and writes the complete scientific paper. The result: the first entirely AI-generated scientific paper accepted through peer review at an academic workshop.

Three events. The same week. The same conclusion: AI agents are now producing original knowledge.

AI has discovered before, but never on its own

To understand why this moment is different, trace the evolution:

Era 1: AI as microscope (2018-2023)

AI amplified human researchers’ capabilities. Humans asked the questions, designed the experiments, and AI processed data at impossible scales.

  • AlphaFold (2020-2024) predicted the structure of all 200 million known proteins, a problem that had resisted solution for 50 years. Hassabis and Jumper won the 2024 Nobel Prize in Chemistry for it. But the question itself, “can we predict protein structures?”, was formulated by humans.

  • Halicin (2020) — MIT researchers used AI to screen 100 million chemical compounds and discovered a new antibiotic capable of killing multi-drug-resistant bacteria, including the dreaded Acinetobacter baumannii. But the experimental design was human.

  • GNoME by DeepMind (2023) discovered 2.2 million new crystals, including 380,000 stable materials, roughly a tenfold increase over everything humanity had found in the entire history of materials science. But the evaluation framework was designed by Google researchers.

Era 2: AI as colleague (2024-2025)

AI began proposing hypotheses and designing experiments, but under human supervision.

  • FunSearch by DeepMind (2024) used an LLM to discover new solutions to open problems in pure mathematics — the first time a language model made a genuine discovery in formal sciences.

  • Insilico Medicine got its molecule rentosertib — designed entirely by generative AI — through a Phase IIa clinical trial with positive results for idiopathic pulmonary fibrosis. From concept to human testing in under 30 months, when the traditional process takes 10-15 years.

  • MIT Antibiotics-AI Project (2025) moved from discovering existing antibiotics to designing entirely new molecules using generative AI capable of killing resistant bacteria. No longer searching for needles in a haystack; they are manufacturing new needles.

Era 3: AI as autonomous researcher (2026 →)

And now, in March 2026, we cross the threshold: AI formulates its own questions, designs its own experiments, executes them, and produces papers accepted by human reviewers. Not science fiction. It is AutoResearch, AI Scientist v2, and Claude Opus doing independent security research.

The Great Acceleration: when AI researches faster than humans can read

I considered making a safer prediction — forecasting the percentage growth of AI-authored scientific papers. Instead, here is the unfiltered version.

Before 2030, AI agents will have produced more verifiable scientific discoveries in materials science, drug discovery, and computational mathematics than all human researchers combined in those same disciplines.

This is not hyperbole. It is arithmetic.

Consider the numbers: in a matter of weeks, GNoME discovered 380,000 stable materials, versus the roughly 48,000 humanity had accumulated over centuries by 2023. AutoResearch runs 100 experiments per night, months of a PhD student’s work. And AI Scientist v2 can generate a complete scientific paper in hours, not months.

Now scale that. Not one agent, but thousands. Not one night, but every night. Not one domain, but all of them.

The Compound Acceleration phenomenon

What we are witnessing is not linear acceleration. It is compound acceleration: each agent discovery feeds the next research cycle. One agent discovers a new material → another simulates its properties → another designs applications → another writes the paper. In parallel. 24/7. No vacations, no ego, no departmental politics.
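That cycle can be pictured as a pipeline in which each agent's output becomes the next agent's input. The stage names and return values below are purely illustrative, not any real system's API:

```python
# Illustrative compound-acceleration pipeline: each agent's output
# feeds the next research stage. All names here are hypothetical.

def discover_material(seed_idea):
    return f"material derived from {seed_idea}"

def simulate_properties(material):
    return {"material": material, "stable": True, "conductivity": "high"}

def design_application(properties):
    return f"battery electrolyte using {properties['material']}"

def write_paper(application):
    return f"Draft paper: '{application}' (auto-generated)"

# The stages run as a chain; in practice many chains would run in parallel.
STAGES = [discover_material, simulate_properties, design_application, write_paper]

def research_cycle(seed_idea):
    result = seed_idea
    for stage in STAGES:
        result = stage(result)  # each discovery feeds the next stage
    return result

paper = research_cycle("lithium-rich oxide")
print(paper)
```

The compounding comes from running thousands of such chains concurrently, with each finished paper seeding new cycles.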

Human science operates at publication speed: one paper every 6-18 months. Agentic science will operate at computation speed: one discovery per minute.

Problems we will solve decades ahead of schedule

This acceleration is not just quantitative. There are problems we believed were decades away that research agents could solve much sooner:

  • Antibiotic resistance — AI is already designing new molecules against superbugs. With autonomous agents iterating 24/7 on thousands of variants, we could have a complete new generation of antibiotics before 2030. WHO-cited projections warn that antimicrobial resistance could kill 10 million people per year by 2050. Agents could get there first, for the benefit of all humanity.

  • Nuclear fusion — The greatest challenge is controlling plasma. DeepMind already used AI to control plasma shape in the TCV tokamak. Autonomous agents simulating and optimizing millions of magnetic configurations could compress decades of experimental research into years.

  • Cellular aging — AlphaFold solved protein structures. The next frontier is understanding the interactions between proteins, genes, and cellular processes that cause aging. It is a problem of massive combinatorial complexity — exactly the kind of problem where agents excel.

  • New energy materials — GNoME already opened the door with 380,000 stable materials. Agents systematically exploring that space could find the room-temperature superconductor, the perfect battery electrolyte, or the catalyst that makes industrial carbon capture viable.

The question is no longer whether AI will surpass humans in scientific output. It is when. And my answer is: in many fields, it is already happening. We just have not updated our metrics to measure it.

The consequences nobody wants to discuss

1. The end of “batch research”

Today, a researcher can run perhaps 5-10 experiments per week. AutoResearch demonstrates that an agent runs 100 per night. When research shifts from a sequential human process to a continuous autonomous one, knowledge production will multiply by orders of magnitude.
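A back-of-the-envelope check on that claim, using only the numbers quoted in this post:

```python
# Throughput comparison using the figures quoted in the post.
human_per_week = 7.5            # midpoint of 5-10 experiments per week
agent_per_night = 100           # AutoResearch's reported overnight runs

human_per_year = human_per_week * 52     # ~390 experiments
agent_per_year = agent_per_night * 365   # 36,500 experiments

speedup = agent_per_year / human_per_year
print(f"one agent covers ~{speedup:.0f} researchers' worth of experiments per year")
```

Roughly two orders of magnitude, before counting any parallelism across agents.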

2. The radical democratization of discovery

Karpathy proved that a single GPU and 630 lines of code are enough for autonomous research. A PhD student in Monterrey, Lagos, or Bangalore can now compete in knowledge production with a Stanford lab. The barrier is no longer budget; it is the ability to formulate good questions and direct agents with precision.

3. The “Orchestra Conductor” as the new scientific role

The scientist of the future will not be the one pipetting in the lab or writing code. It will be the one who “programs in Markdown” — who writes the precise instructions that guide squadrons of autonomous agents. This is exactly what Karpathy demonstrates with his program.md file: the future of directing research is not writing better code, but writing better programs in natural language.

This paradigm shift is not new. I explored it in my article “The Next AI Wave: Agents” — where I argued that autonomous agents represent a fundamental shift from assistive AI to autonomous execution. The same applies to science: the researcher becomes the conductor of research agent orchestras.

4. The peer review crisis

If AI Scientist v2 already generates papers accepted at workshops, how long until it produces papers accepted at top conferences? And how will we distinguish between a paper written by an agent and one written by a human? The peer review system, designed to evaluate human work, is unprepared for a world where papers are generated at industrial speed. We will need new evaluation frameworks — and possibly, AI agents reviewing the papers of other AI agents.

What this means for universities

The institutions that most quickly integrate autonomous research agents into their labs will lead knowledge production in the next decade. It is not about buying bigger GPUs. It is about:

  1. Training “research agent directors” — scientists who know how to formulate questions and direct AI squadrons.
  2. Building autonomous experimentation infrastructure — labs where agents can run thousands of experiments without continuous human oversight.
  3. Redefining authorship and intellectual property — if an AI agent discovers a molecule that cures a disease, who owns the discovery?
  4. Measuring scientific output differently — current indicators (papers, citations, h-index) are metrics designed for human speed. We need metrics that capture agentic velocity.

The race has started. And the window of opportunity to position yourself is now.

Conclusion

In just six years, AI went from predicting protein structures to discovering antibiotics, from designing materials to writing scientific papers, from finding known vulnerabilities to discovering zero-days. The trajectory is clear and accelerating exponentially.

AI agents will not replace scientists. They will make scientists who do not use them irrelevant. And that transition, unlike the one we lived through in the software industry, will not take a decade. It will take months.

Welcome to the era of the Great Scientific Acceleration. Those who understand it first will lead the science of the future. Those who ignore it will read about it in papers written by machines.

