Traditional cybersecurity perimeters are proving powerless against the next generation of social engineering. Researchers have unveiled PhySE, a framework that pairs AR headsets with Large Language Models (LLMs) to execute sophisticated attacks in real time. Unlike earlier attempts at automated manipulation, which relied on offline data gathering, PhySE eliminates that collection latency through a built-in Vision-Language Model (VLM). The system recognizes an interlocutor on sight, analyzes their digital footprint, and displays contextual prompts directly on the attacker's lenses without interrupting the flow of live dialogue.
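The loop described above can be sketched abstractly. This is a minimal illustration of the recognize → profile → prompt flow, not the authors' implementation; every function name and data structure here is a hypothetical placeholder standing in for the VLM, footprint-lookup, and LLM stages.

```python
from dataclasses import dataclass

@dataclass
class Overlay:
    """What ends up rendered on the lenses: who, and what cue to show."""
    target_id: str
    prompt: str

def recognize_face(frame: bytes) -> str:
    # Placeholder: in the described system a VLM identifies the interlocutor.
    return "target-001"

def lookup_footprint(target_id: str) -> dict:
    # Placeholder: aggregation of the target's public digital footprint.
    return {"role": "unknown", "interests": []}

def generate_prompt(footprint: dict) -> str:
    # Placeholder: an LLM would produce a context-aware conversational cue.
    return f"Open with a neutral remark about {footprint['role']}."

def frame_to_overlay(frame: bytes) -> Overlay:
    """One pass of the loop: camera frame in, lens overlay out."""
    tid = recognize_face(frame)
    footprint = lookup_footprint(tid)
    return Overlay(target_id=tid, prompt=generate_prompt(footprint))
```

The point of the sketch is the shape of the pipeline: because all three stages run per frame, there is no separate reconnaissance phase and hence no collection latency.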

The core of the system is an adaptive psychological agent. According to a preprint published on arXiv, this AI engine replaces static scripts with dynamic behavioral strategies. Instead of memorized phrases, the attacker receives instructions based on the victim's current reactions. Effectively, PhySE solves the "cold start" problem in personalization: a psychological profile and trust-building tactics are generated the very second the glasses' camera captures the target's face. The methodology was validated in a study involving 60 volunteers across 360 controlled dialogue scenarios.
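The adaptive-agent idea reduces to choosing the next cue from the victim's most recent observed reaction, with a generic fallback that covers the "cold start" before anything has been observed. A minimal sketch, assuming a simple reaction-to-strategy lookup; the labels are illustrative, not the authors' taxonomy:

```python
from typing import Optional

# Illustrative mapping; the paper's agent would generate these dynamically.
STRATEGY_BY_REACTION = {
    "skeptical": "de_escalate",
    "engaged": "build_trust",
    "neutral": "probe",
}

def select_strategy(reaction: Optional[str]) -> str:
    """Pick the next behavioral strategy from the latest observed reaction."""
    if reaction is None:
        # Cold start: no reaction observed yet, so fall back to a generic
        # opener generated at the moment the target's face is captured.
        return "generic_opener"
    return STRATEGY_BY_REACTION.get(reaction, "probe")
```

This is what distinguishes the design from a static script: the instruction shown to the attacker is a function of live feedback rather than a fixed sequence.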

While vendors position AR glasses as tools for productivity and instant knowledge access, in practice they are becoming an ideal weapon for "on-the-ground" spear phishing. The developers of PhySE have demonstrated how the synergy of physical-world intelligence gathering and LLMs turns the visual and vocal data of top executives into a detailed map of vulnerabilities. This is no longer a digital threat that can be patched with a software update; it is a direct exploitation of human psychology, with augmented reality serving as cover for deep manipulation. Face-to-face meetings are ceasing to be a safe harbor, turning instead into a space where every word and facial expression can be used against you.

Tags: AI Safety, Cybersecurity, Computer Vision, Digital Transformation