While the industry drowns in marketing hype and meaningless benchmarks, the team at Google DeepMind has decided to bring some order to the terminology. In their paper "Measuring Progress Towards AGI: A Cognitive Taxonomy," Shane Legg and his colleagues suggest we stop guessing and implement a scientific classification of cognitive abilities. This shift is long overdue: current tests have devolved into a data-memorization contest rather than a true measure of intelligence.
Instead of evaluating isolated tasks, DeepMind proposes assessing AI across ten fundamental pillars, including metacognition, executive functions, and social intelligence. This isn't just an academic exercise. For business leaders, the transition offers a way to forecast the automation of entire cognitive chains rather than just text generation. If a model lacks planning and self-correction (executive functions), integrating it into complex business processes will only drive up the hidden costs of human oversight.
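The pillar-based view can be made concrete with a small sketch. The snippet below is illustrative only, not DeepMind's scoring scheme: it represents a model's profile across cognitive pillars (only the three named in the text are real; the rest would follow) and applies a weakest-link aggregation, capturing the idea that one missing ability, such as planning, bottlenecks the whole cognitive chain.

```python
# Illustrative sketch, not DeepMind's actual methodology.
# Scores are hypothetical values in [0, 1].
profile = {
    "metacognition": 0.62,
    "executive_functions": 0.35,   # planning and self-correction
    "social_intelligence": 0.48,
    # ... the remaining pillars from the taxonomy would go here
}

def readiness(profile: dict[str, float]) -> float:
    """Weakest-link aggregate: overall readiness is capped by the
    lowest pillar score, since a single gap (e.g. no planning)
    undermines any end-to-end cognitive chain built on the model."""
    return min(profile.values())

def gaps(profile: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Pillars scoring below threshold: these are the areas where
    human oversight costs accumulate after deployment."""
    return [p for p, s in profile.items() if s < threshold]
```

Under this toy aggregation, a model that excels at text generation but scores poorly on executive functions still gets a low overall readiness, which is exactly the failure mode the article warns about.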
The evaluation methodology is getting tougher. DeepMind insists on a three-stage protocol using hidden datasets to eliminate data contamination, alongside mandatory comparisons with representative human samples. To prove the approach is viable, the company is launching a Kaggle hackathon focused on the most problematic areas: learning and social interaction.
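The comparison step of such a protocol can be sketched as follows. This is a hedged illustration under stated assumptions, not DeepMind's pipeline: the model is scored only on hidden, never-published items (so memorization cannot help), and its mean score is then expressed relative to a representative human sample.

```python
import statistics

def evaluate(model_score_fn, hidden_items: list[str]) -> float:
    """Mean score over a hidden item set the model could not have
    seen during training (the anti-contamination step)."""
    return statistics.mean(model_score_fn(item) for item in hidden_items)

def vs_human_baseline(model_mean: float, human_scores: list[float]) -> float:
    """Model score as a z-score against the human sample:
    positive means above the human mean, negative means below."""
    mu = statistics.mean(human_scores)
    sigma = statistics.stdev(human_scores)
    return (model_mean - mu) / sigma
```

For example, a model averaging 0.7 against a human sample of [0.5, 0.6, 0.7] lands one standard deviation above the human mean. The human baseline is the point of the exercise: a raw benchmark number says little without knowing where representative people land on the same hidden items.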
Executives and CTOs should examine this framework now. When deploying AI agents, the total cost of ownership (TCO) will depend directly on how well gaps in their cognitive architecture are closed, not on inflated scores from standard benchmarks.