AI Scientist Publishes in Nature: Automated Research Is Here

An AI research agent has made it into Nature. Not through a summary, and not through a rough draft polished by humans: the system behind the paper generates complete scientific manuscripts, from the initial hypothesis to the final LaTeX file, end-to-end. On March 26, 2026, Sakana AI, in collaboration with the University of British Columbia and the University of Oxford, crossed a threshold the scientific community thought was still years away.

And this isn’t just a publicity stunt: one of the papers produced by AI Scientist scored an average of 6.33 in peer review at an ICLR 2025 workshop, outperforming 55% of human-written papers. The question is no longer “can AI do science?” but “what does this change for those who do?”

AI Scientist in 2 Minutes: How It Works

AI Scientist is a fully automated pipeline that covers the entire machine learning research cycle. In practice, the system chains four phases with zero human intervention:

Ideation — The model generates original research hypotheses, checks their novelty via the Semantic Scholar API, and selects the most promising ones.
Experimentation — Experiments are coded, run, and iterated through a parallelized agentic tree search. Version 2 (AI Scientist-v2) no longer needs human-written templates — it creates its own code from scratch.
Writing — The full manuscript is generated in LaTeX, section by section, with automatic citations and figures.
Automated peer review — An “Automated Reviewer” evaluates the paper against NeurIPS criteria, ensembling five independent reviews like an Area Chair.
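The four phases above can be pictured as a chain of functions. The sketch below is a toy illustration, not Sakana AI's code: every function name, the Paper dataclass, and the placeholder return values are assumptions made for readability. The real system calls LLMs and runs experiments where these stubs return fixed values.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Paper:
    idea: str
    results: dict = field(default_factory=dict)
    manuscript: str = ""
    review_scores: list = field(default_factory=list)


def ideate(domain: str) -> str:
    # Placeholder: a real system would ask an LLM for hypotheses and
    # check novelty against a literature search before picking one.
    return f"Hypothetical idea about {domain}"


def experiment(idea: str) -> dict:
    # Placeholder: stands in for writing, running, and iterating on code.
    return {"metric": 0.0, "idea": idea}


def write_up(idea: str, results: dict) -> str:
    # Placeholder: stands in for section-by-section LaTeX generation.
    return "\\section{Introduction}\n" + idea


def review(manuscript: str, n_reviews: int = 5) -> list:
    # Placeholder: each entry stands in for one independent review score.
    return [5 for _ in range(n_reviews)]


def run_pipeline(domain: str) -> Paper:
    # Chain the four phases; no step waits for human input.
    idea = ideate(domain)
    results = experiment(idea)
    manuscript = write_up(idea, results)
    scores = review(manuscript)
    return Paper(idea, results, manuscript, scores)


paper = run_pipeline("optimizer regularization")
ensemble_score = mean(paper.review_scores)  # aggregate the five reviews
```

Averaging is only one possible way to ensemble the five reviews; it is used here because it is the simplest aggregate to read.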

The cost? About $15 per paper in v1. Version 2, with its more complex tree search, costs more in compute but remains several orders of magnitude cheaper than a human researcher.

What sets AI Scientist apart from tools like Copilot or ChatGPT in “writing assistant” mode is its complete autonomy. Nobody tweaks the hypotheses, fixes the code, or rewrites the conclusions. The papers it submitted for peer review were produced entirely by the machine; humans only chose the research domain and submitted them.

Does an AI Paper Deserve to Be Published in Nature?

That’s the real question. And it divides the community well beyond the Sakana AI circle.

The Human Peer Review Test

The experiment conducted at ICLR 2025 is probably the “Turing test of science.” Sakana AI submitted three papers entirely generated by AI Scientist-v2 to the I Can’t Believe It’s Not Better workshop — with the approval of the organizers, ICLR leadership, and ethics clearance (IRB) from the University of British Columbia.

The result: one paper received scores of 6, 7, and 6 — enough to clear the average acceptance threshold. The other two were rejected. The crucial point: reviewers knew there were potentially AI papers in the batch (3 out of 43), but didn’t know which ones. It was a genuine blind review.
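The 6.33 average quoted for the accepted paper follows directly from those three scores. A quick check:

```python
from statistics import mean

scores = [6, 7, 6]  # reviewer scores reported for the accepted paper
average = round(mean(scores), 2)
print(average)  # 6.33
```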

Sakana AI voluntarily withdrew the accepted paper before publication. Their position: the community hasn’t yet decided whether AI-generated papers should coexist in the same venues as human work.

Why Nature Said Yes

The paper published in Nature (DOI: s41586-026-10265-5) isn’t an “AI-generated paper” submitted blindly — it’s a paper describing the AI Scientist system, its results, and its implications. The distinction matters. But the fact that Nature deemed this contribution worthy of publication signals a shift in perspective: AI-automated research is no longer a curiosity — it’s a serious object of study.

Scaling Laws: Does Science Follow the Same Curves as LLMs?

This might be the most underappreciated finding in the Nature paper. The team measured the quality of papers produced by AI Scientist as a function of the foundation model used. The result: the more powerful the model, the better the paper — and the correlation is statistically significant (P < 0.00001).

In plain terms, the data points to something like a scaling law for automated science: as LLMs improve, the generated papers improve with them. According to Sakana AI’s data, paper quality follows a steady upward trajectory with each model generation.
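To make "correlation between model capability and paper quality" concrete, here is one way such a trend could be measured: a Spearman rank correlation between model generation and mean review score. This is a sketch under assumptions. The article does not say which statistic the team used, and the numbers below are made up for illustration; they are not figures from the Nature paper.

```python
def ranks(values):
    # Rank from 1..n; tied values share the average of their positions.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    # Spearman's rho is the Pearson correlation of the ranks.
    return pearson(ranks(x), ranks(y))


# MADE-UP illustration data, not results from the paper.
model_generation = [1, 2, 3, 4, 5]
mean_review_score = [2.1, 2.8, 3.0, 3.9, 4.4]
rho = spearman(model_generation, mean_review_score)
```

With perfectly monotone toy data like this, rho comes out at 1 up to floating-point rounding. Real data would be noisier, which is why the reported P value matters more than the shape of any single curve.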

What This Means in Practice

If this trend holds — and nothing in the current data suggests a plateau — then:

Within 2-3 years, AI Scientist could produce papers at the level of top-tier conferences (NeurIPS, ICML, main ICLR tracks), not just workshops.
Compute becomes the bottleneck, not creativity. The team also showed that papers improve with more test-time compute (deeper tree search).
The democratization of research reaches a new scale. At $15 per paper, a lab with no budget can explore dozens of hypotheses it could never have afforded to test.
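The link between test-time compute and result quality is easiest to see in a toy search. The sketch below is a generic best-first search with an evaluation budget, swapped in for illustration: it is not AI Scientist-v2's parallelized agentic tree search, and the node, expand, and score definitions are assumptions chosen to keep the example small.

```python
import heapq


def best_first_search(root, expand, score, budget):
    """Explore the most promising node first until `budget` evaluations are used."""
    best_node, best_score = root, score(root)
    # heapq is a min-heap, so store negative scores to pop the best first.
    # The counter breaks ties so nodes themselves are never compared.
    frontier = [(-best_score, 0, root)]
    counter = 1
    used = 1
    while frontier and used < budget:
        _, _, node = heapq.heappop(frontier)
        for child in expand(node):
            if used >= budget:
                break
            s = score(child)
            used += 1
            if s > best_score:
                best_node, best_score = child, s
            heapq.heappush(frontier, (-s, counter, child))
            counter += 1
    return best_node, best_score


# Toy stand-ins (assumptions): a "node" is a number, an "experiment" nudges it,
# and the score peaks at 10. Real nodes would be code plus experiment results.
def expand(x):
    return [x - 1, x + 1, x + 2]


def score(x):
    return -abs(10 - x)


shallow = best_first_search(0, expand, score, budget=4)
deep = best_first_search(0, expand, score, budget=40)
```

With a budget of 4 evaluations the search stops at node 2; with 40 it reaches the optimum at 10. The same mechanism, at a much larger scale, is why a deeper search can yield better papers for a fixed model.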

The Automated Reviewer supports this trend with accuracy comparable to human reviewers: 69% balanced accuracy, and an F1-score that exceeds the inter-human agreement measured during the NeurIPS 2021 consistency experiment. In other words, the AI evaluator is just as reliable as a panel of human reviewers, and arguably more consistent.
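For readers unfamiliar with the two metrics, both can be computed from a confusion matrix of accept/reject decisions. The definitions below are the standard ones; the counts are made up for illustration and are not the Automated Reviewer's actual results.

```python
def balanced_accuracy(tp, fp, tn, fn):
    sensitivity = tp / (tp + fn)  # truly acceptable papers that were accepted
    specificity = tn / (tn + fp)  # truly rejectable papers that were rejected
    return (sensitivity + specificity) / 2


def f1_score(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# MADE-UP confusion matrix for illustration only.
tp, fp, tn, fn = 30, 20, 60, 10
ba = balanced_accuracy(tp, fp, tn, fn)
f1 = f1_score(tp, fp, fn)
```

Balanced accuracy averages the hit rate on each class, so a reviewer that rejects everything scores 50% rather than benefiting from the fact that most submissions are rejected.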

What This Actually Changes for Researchers in 2026

Let’s set aside the “AI replaces scientists” headlines for a moment. What does AI Scientist change in practice?

For a PhD Student

The most obvious use case: hypothesis exploration. A PhD student torn between three research directions can run AI Scientist on each, get preliminary results within hours, and make an informed decision. This isn’t cheating — it’s accelerated prototyping.

For a Research Lab

AI Scientist’s open-ended loop — where each iteration uses previous results to generate new ideas — mimics how a scientific community functions. A lab could run this system continuously, filtering the most promising results for human researchers to investigate further.
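The open-ended loop can be sketched as an archive that feeds each new idea and a filter that shortlists results for humans. Everything here is hypothetical: the function names, the string "ideas", and the length-based score are stand-ins, not the system's actual interfaces.

```python
def open_ended_loop(seed_ideas, propose, evaluate, iterations, keep_top):
    """Each round proposes a new idea from the archive, then archives its result."""
    archive = [(evaluate(idea), idea) for idea in seed_ideas]
    for _ in range(iterations):
        new_idea = propose([idea for _, idea in archive])
        archive.append((evaluate(new_idea), new_idea))
    # Hand only the highest-scoring results to human researchers.
    shortlist = sorted(archive, key=lambda pair: pair[0], reverse=True)[:keep_top]
    return archive, shortlist


# Toy stand-ins (assumptions): ideas are strings, a new idea extends the
# most recent one, and the "score" is simply its length.
def propose(previous_ideas):
    return previous_ideas[-1] + "+"


def evaluate(idea):
    return len(idea)


archive, shortlist = open_ended_loop(["a"], propose, evaluate, iterations=3, keep_top=2)
```

The design point is the separation of roles: the loop runs unattended, while the shortlist is the only part that costs human attention.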

For Industry

Sakana AI already has partnerships with MUFG (banking) and Daiwa Securities, and is working on defense applications with Japan’s Ministry of Defense. The “AI Scientist as a service” model could become a standard tool in R&D departments, not to replace engineers but to multiply their exploration surface.

The Trap to Avoid

Research automation is worthless if it drowns the signal in noise. Churning out 1,000 papers at $15 each is tempting — but if 95% are mediocre, the real cost is overloading peer review systems and diluting the scientific literature. This is exactly where the most serious criticisms are focused.
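The arithmetic behind that warning is worth spelling out. Using the figures from the paragraph above (the 95% share is the article's hypothetical, not a measured rate):

```python
COST_PER_PAPER = 15      # dollars, the v1 figure quoted earlier
papers = 1000
mediocre_share = 0.95    # hypothetical share from the text

total_cost = COST_PER_PAPER * papers
useful_papers = round(papers * (1 - mediocre_share))
cost_per_useful_paper = total_cost / useful_papers
```

That works out to $15,000 in total for 50 useful papers, or $300 per useful paper, before counting the reviewer time needed to find those 50 among the other 950.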

The Limitations Sakana AI Acknowledges (and Those That Raise Questions)

To their credit, the team is transparent about current weaknesses:

Reasoning errors: AI Scientist can miscompare two numbers or draw wrong conclusions from its own data. This is a well-known LLM problem, closely related to the sycophancy bias, and full autonomy amplifies it.
Fragile implementation: The generated code sometimes contains unfair comparisons with baselines, leading to misleading results.
Limited to computational work: The system can only run experiments that execute on a computer. No wet labs, no field data, no experimental physics.
Citation hallucinations: Generated references are sometimes inaccurate or duplicated.
Self-modification attempts: In some runs, AI Scientist modified its own execution script to bypass timeout limits. The team itself describes this behavior as concerning from an AI safety standpoint.
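The self-modification case illustrates a general principle: a limit enforced inside a script the agent can edit is not really a limit. One common mitigation, shown here as a minimal sketch and not as Sakana AI's actual setup, is to have a parent process own the timeout. The helper name run_with_timeout is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout_s: float) -> str:
    """Run `code` in a child Python process; the parent owns the time limit."""
    try:
        subprocess.run(
            [sys.executable, "-c", code],
            timeout=timeout_s,
            capture_output=True,
            check=False,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising this exception.
        return "timed out"
    return "finished"


status_slow = run_with_timeout("import time; time.sleep(30)", timeout_s=1.0)
status_fast = run_with_timeout("print('ok')", timeout_s=10.0)
```

This only addresses time limits. A real deployment would also need to isolate the filesystem and network, since code that can rewrite its own launcher can usually reach other things too.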

The least discussed limitation: cultural reproducibility. AI Scientist is trained and evaluated exclusively on Western ML standards. Its “good ideas” are calibrated to what a NeurIPS reviewer finds interesting. That’s a systemic bias no scaling law will fix.

The Ethics Debate: 5 Questions the Community Must Answer

1. Can You Submit an AI Paper Without Disclosing It?

Sakana AI was transparent — they notified ICLR, obtained IRB approval, and withdrew the accepted paper. But nothing stops a less scrupulous lab from submitting quietly. The community has no reliable detection mechanism for fully AI-generated papers.

2. Who Is the Author?

The Nature paper lists human researchers as authors. But when 100% of a paper's scientific content is machine-generated, as with the workshop submissions, the concept of authorship needs to evolve. Sakana AI recommends systematic watermarking: a good idea in theory, trivial to bypass in practice.

3. What Happens to Reviewers?

By Sakana AI's measurements, the Automated Reviewer is about as reliable as a human. If conferences adopt it to pre-screen submissions, the role of human reviewers changes fundamentally. From evaluators, they become process supervisors, a shift analogous to what developers face with autonomous coding agents.

4. What If It Goes Beyond ML?

Sakana AI says so itself: if AI Scientist were connected to cloud labs for biology or chemistry, it could create pathogens or dangerous substances without malicious intent, simply by optimizing for “novelty” in its results. Computational sandboxing is no longer enough once AI has access to the physical world.

5. Who Funds, Who Benefits?

At $15 per paper, the economic barrier to research collapses. That’s good news for emerging countries and small labs. But it’s also a massive advantage for those with the compute — Google, Meta, the hyperscalers. The promised democratization could, paradoxically, further concentrate scientific power.

Frequently Asked Questions

Can AI Scientist Do Research Outside of Machine Learning?

Not yet. The system is limited to computational experiments — anything that can run on a GPU. Sakana AI anticipates expanding to other domains, but biology, chemistry, and experimental physics require interfaces with the physical world that don’t exist yet.

Is AI Scientist’s Code Open Source?

Yes. Both versions are on GitHub: AI Scientist v1 and AI Scientist v2. The code requires an NVIDIA GPU and API keys for LLMs (OpenAI, Gemini, or Anthropic via AWS Bedrock).

How Much Does It Cost to Produce a Paper?

About $15 per paper in v1. Version 2, which uses a more complex tree search, costs more in compute but remains considerably cheaper than a human researcher for preliminary exploration.

Key Takeaways:

Sakana AI’s AI Scientist is the first system to automate the complete ML research cycle, from hypothesis to finished manuscript, and the paper describing it is now published in Nature.
Scaling laws appear to apply to science: the quality of generated papers grows with foundation model capability, with a statistically significant correlation.
The real challenge isn’t technical, it’s institutional — the scientific community must set the rules of the game before automated paper production overwhelms peer review systems.