Modern AI agents built on Large Language Models (LLMs) face a persistent compositionality crisis: they excel at recognizing standard patterns but break down when presented with new combinations of familiar tasks. According to Mahnoor Shahid and Hannes Rothe of the University of Duisburg-Essen, the problem lies at the foundation: models rely on statistical probability rather than a true understanding of cause and effect. Their proposed framework, AGEL-Comp (Action-Grounded Experiential Learning), aims to move AI past its "hallucinatory" phase by grounding actions in a rigorous logical environment.
Rather than inflating parameter counts in hopes of a miracle, AGEL-Comp uses a Causal Program Graph (CPG), a dynamic hypergraph that serves as a world model. In this architecture, actions are not merely sampled from a probability distribution; they are embedded in a procedural structure. This lets the system build explicit, interpretable links between causes and effects, preventing scenarios where an agent tries to "open a door without a key" simply because that sequence appeared frequently in its training data. Here, logic takes the lead, while the neural network merely suggests options.
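The paper does not ship reference code, but the idea is easy to see in miniature. The Python sketch below treats each action as a hyperedge linking a set of precondition facts to a set of effect facts; every name in it (`CausalProgramGraph`, `Action`, `is_executable`) is illustrative, not taken from AGEL-Comp.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Action:
    """An action node, grounded by explicit pre- and postconditions."""
    name: str
    preconditions: frozenset  # facts that must hold before execution
    effects: frozenset        # facts the action makes true afterwards

@dataclass
class CausalProgramGraph:
    """Toy stand-in for the paper's CPG: each action is a hyperedge
    linking its precondition facts to its effect facts."""
    actions: dict = field(default_factory=dict)

    def add_action(self, action: Action) -> None:
        self.actions[action.name] = action

    def is_executable(self, name: str, state: set) -> bool:
        # Admissible only if every precondition holds in the current
        # state, no matter how often the sequence co-occurred in text.
        return self.actions[name].preconditions <= state

    def execute(self, name: str, state: set) -> set:
        if not self.is_executable(name, state):
            raise ValueError(f"preconditions of '{name}' not satisfied")
        return state | self.actions[name].effects

# The door/key example from the text: the graph refuses "open_door"
# until "has(key)" is actually true, whatever the statistics suggest.
cpg = CausalProgramGraph()
cpg.add_action(Action("open_door",
                      preconditions=frozenset({"has(key)", "at(door)"}),
                      effects=frozenset({"open(door)"})))

state = {"at(door)"}                           # no key yet
print(cpg.is_executable("open_door", state))   # False: blocked by logic
state = cpg.execute("open_door", state | {"has(key)"})
print("open(door)" in state)                   # True
```

Because the preconditions are explicit facts rather than learned weights, a human can read off exactly why the door stayed shut, which is the kind of interpretability the authors are after.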
To learn without constant weight retraining, the system employs Inductive Logic Programming (ILP): it synthesizes new Horn clauses, transforming raw experience into formal rules. Essentially, the agent writes its own code of conduct on the fly, adapting to its environment without expensive fine-tuning. Furthermore, suggestions from the LLM, which acts as a planner, are never taken at face value. They are vetted by a Neural Theorem Prover (NTP), a strict filter that verifies the logical consistency of each hypothesis before the agent performs any physical action.
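Again in miniature, and again with invented names: the toy below induces a Horn clause ("open_door is executable if has(key) and at(door) hold") by intersecting the states of successful episodes, then uses the learned rules as a hard-logic stand-in for the verification step. A real ILP system searches clause space far more carefully, and the paper's NTP performs proof search rather than a simple subset test; this only sketches the control flow.

```python
# A Horn clause here is "head :- body": the action is executable
# whenever every literal in the body holds in the current state.

def induce_rule(action, episodes):
    """episodes: list of (state: frozenset of facts, succeeded: bool).
    Naive induction: the body is the most specific set of facts common
    to all successes, and it must not cover any recorded failure."""
    positives = [s for s, ok in episodes if ok]
    negatives = [s for s, ok in episodes if not ok]
    body = frozenset.intersection(*positives)
    if any(body <= neg for neg in negatives):
        raise ValueError("no consistent Horn clause under this bias")
    return (action, body)  # read as: executable(action) :- body

def verify(proposal, state, rules):
    """Verification filter: an LLM-proposed action is cleared for
    execution only if some learned rule proves it admissible."""
    return any(action == proposal and body <= state
               for action, body in rules)

episodes = [
    (frozenset({"has(key)", "at(door)"}), True),
    (frozenset({"has(key)", "at(door)", "dark"}), True),
    (frozenset({"at(door)"}), False),            # no key: opening failed
]
rules = [induce_rule("open_door", episodes)]
print(rules)  # [('open_door', frozenset({'has(key)', 'at(door)'}))]

# The planner proposes an action; the prover vetoes it without the key.
print(verify("open_door", {"at(door)"}, rules))              # False
print(verify("open_door", {"at(door)", "has(key)"}, rules))  # True
```

The point of the split is visible even at this scale: the planner may propose anything, but nothing reaches the actuators until a rule derived from actual experience licenses it.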
Experiments in the Retro Quest environment have confirmed that while traditional neural models lose their way when tasks are reshuffled, AGEL-Comp remains functional. However, a corporate rollout is still a long way off. Shahid and Rothe’s current methodology is limited to simulations, and scaling this deductive-abductive learning cycle to the chaos of real-world business is a non-trivial challenge. Nevertheless, the research highlights a clear trend: to build truly reliable agents, the industry must return to symbolic rules and verify every step through cold logic rather than statistical plausibility.