Modern large language models are trapped in a novelty paradox: everything in their output that sounds convincing and accurate is deeply derivative, while anything claiming to be original turns out to be a hallucination. Richard Sutton, a founding father of reinforcement learning and Turing Award winner, puts it bluntly: mainstream generative AI lacks the "engine" required for real scientific breakthroughs. In Sutton’s view, systems designed to predict the next token are merely mirrors of their training data. They masterfully mimic patterns but are inherently incapable of verifying their own results or developing their own ideas. This sets a hard ceiling on the technology: it can summarize the past, but it cannot systematically construct the future of science.
The Failure of Imitation
Sutton points to a fundamental limitation of current AI. LLMs, image generators, and video models are trained on colossal datasets to produce content that resembles the source material as closely as possible. As he wryly notes, when the results are good, the credit goes to the data the model learned from. When the results are truly new, they lose touch with reality, because the model has no internal mechanism to test its own hypotheses. He recalls the old academic joke: "Your work is both new and good. But the part that is good is not new, and the part that is new is not good."
"Novelty flashes for a moment, but if its value is not recognized by the system, it flickers out and disappears forever."
This lack of an "internal censor" is what separates a predictive engine from a scientist. Sutton defines genuine discovery as a three-stage process: variation, evaluation, and selection. Generative AI excels at variation, producing endless versions of text or code, but it fails at the second stage, evaluation. Without the ability to test hypotheses against a clear objective, the model cannot perform the selection that progress depends on. It produces "candidates," but still requires a human to decide which is a breakthrough and which is nonsense.
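The three stages can be sketched in a few lines of Python. This is only an illustrative toy, not Sutton's formulation: the function names, the numeric "candidate," and the objective (get close to a target value of 3.0) are all assumptions made for the example. The point is the contrast between a loop that scores and keeps its best proposal and one that merely keeps generating.

```python
import random


def vary(candidate, rng):
    """Variation: propose a new candidate by a small random perturbation."""
    return candidate + rng.uniform(-1.0, 1.0)


def evaluate(candidate, target=3.0):
    """Evaluation: score against an explicit objective (higher is better).

    The target value is a hypothetical stand-in for a real-world test.
    """
    return -(candidate - target) ** 2


def discover(steps=2000, seed=0):
    """Variation + evaluation + selection: keep a proposal only if it scores better."""
    rng = random.Random(seed)
    best = 0.0
    best_score = evaluate(best)
    for _ in range(steps):
        proposal = vary(best, rng)
        score = evaluate(proposal)
        if score > best_score:  # selection: the improvement is recognized and kept
            best, best_score = proposal, score
    return best


def generate_only(steps=2000, seed=0):
    """Variation alone: every proposal replaces the last, good or bad."""
    rng = random.Random(seed)
    current = 0.0
    for _ in range(steps):
        current = vary(current, rng)  # nothing is scored, so nothing is retained
    return current
```

In this toy, `discover()` settles near the target because each improvement is scored and kept, while `generate_only()` just wanders: any value it happens to pass through is lost on the next step, since nothing in the loop can recognize it as good.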
The Architecture of True Creativity
Sutton isn't writing off AI entirely, but he draws a thick line between imitation models and systems capable of creativity. He cites AlphaGo, AlphaZero, and AlphaFold as projects that moved beyond simple mimicry. What unites them is an evaluation loop that operates independently of content generation. In AlphaGo, a move either brings the system closer to victory or it doesn't; in programming, code either passes its tests or it fails. This feedback allows the system to search for and find solutions a human would never consider, such as AlphaGo's famous "Move 37" against Lee Sedol.
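The programming case lends itself to a small sketch. The code below is a minimal illustration under stated assumptions, not a description of any real system: the hard-coded `CANDIDATES` list is a hypothetical stand-in for whatever a generative model proposes, and `TESTS` plays the role of the external verifier. The loop accepts a candidate only if it passes every test, and reports failure instead of guessing when none does.

```python
def verify(candidate, tests):
    """Return True only if the candidate passes every test without raising."""
    for args, expected in tests:
        try:
            if candidate(*args) != expected:
                return False
        except Exception:
            return False  # a crash counts as a failed test
    return True


def closed_loop(candidates, tests):
    """Return the name of the first verified candidate, or None if all fail."""
    for name, fn in candidates:
        if verify(fn, tests):
            return name
    return None  # no candidate could be verified, so none is accepted


# Hypothetical specification: the function should add two numbers.
TESTS = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]

# Hypothetical model outputs: one wrong, one that crashes, one correct.
CANDIDATES = [
    ("multiply", lambda a, b: a * b),
    ("crashes", lambda a, b: a / 0),
    ("add", lambda a, b: a + b),
]

print(closed_loop(CANDIDATES, TESTS))      # prints: add
print(closed_loop(CANDIDATES[:2], TESTS))  # prints: None
```

The verifier never inspects how a candidate was produced; it only checks behavior against the tests. That separation is what lets the generator propose freely while the loop as a whole stays grounded.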
According to Sutton, the industry will achieve more if it shifts focus from mindlessly scaling datasets to architectures that include formal verifiers, search, and reinforcement learning (RL). For business and tech leaders, the signal is clear: the path to the "autonomous scientist" lies not in parameter growth, but in creating agents that interact with an environment and evaluate themselves. Betting on pure LLMs in R&D-intensive industries risks a degradation of expertise: you will end up with mountains of low-quality hypotheses and no way to verify them.
The economic value of current generative AI will remain at the level of a "high-speed assistant" until it incorporates the self-correction mechanisms found in game-playing and mathematical systems. Investing in pure generation as a primary research tool is a recipe for stagnation, as these models cannot distinguish an epiphany from a hallucination. A real competitive advantage will go to those who integrate language models into closed-loop systems with formal verifiers. To move beyond imitation, we must build systems capable of proving they are right, or admitting they are wrong, without human intervention.