X5 Tech has demonstrated that turning neural networks into a viable business tool isn't about endless model fine-tuning; it's a relentless battle for engineering efficiency. Over eight months, project "Ivanych" evolved from a raw MVP into an industrial pipeline serving 27,000 Pyaterochka and Perekrestok stores. Instead of chasing "universal AI," Ivan Popov's team bet on a pragmatic hybrid of 26 specialized models, including CNNs and VLMs. This architecture has automated 62 scenarios, ranging from cleanliness control in meat departments to geolocation verification.
Tech Stack vs. Operational Gaps
The project’s focus shifted from pure Computer Vision to rigorous productization. The golden rule: every VLM request must be economically justified. To avoid burning the budget on hardware, the team adopted Triton Inference Server for model serving and built asynchronous pipelines on Kafka. This allowed the system to process its photo stream on infrastructure comparable to a standard office server (4 vCPUs and 16 GB RAM). In our view, this is a healthy reality check for those who reflexively purchase GPUs for every task.
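To make the asynchronous idea concrete, here is a minimal sketch of a queue-based pipeline. It is not X5 Tech's code: an in-memory `asyncio.Queue` stands in for a Kafka topic, and the hypothetical `infer` coroutine stands in for a request to an inference server. The names `worker`, `run_pipeline`, and `process_batch` are illustrative assumptions as well.

```python
import asyncio
from typing import Dict, List, Optional


async def infer(photo_id: str) -> str:
    """Hypothetical stand-in for a request to an inference server."""
    await asyncio.sleep(0.01)  # simulate network and model latency
    return f"verdict-for-{photo_id}"


async def worker(queue: "asyncio.Queue[Optional[str]]",
                 results: Dict[str, str]) -> None:
    """Pull photo IDs off the queue until a None sentinel arrives."""
    while True:
        photo_id = await queue.get()
        if photo_id is None:  # sentinel: no more work for this worker
            break
        results[photo_id] = await infer(photo_id)


async def run_pipeline(photo_ids: List[str],
                       n_workers: int = 4) -> Dict[str, str]:
    # The bounded queue (an assumption) applies backpressure:
    # the producer waits when the workers fall behind.
    queue: "asyncio.Queue[Optional[str]]" = asyncio.Queue(maxsize=100)
    results: Dict[str, str] = {}
    workers = [asyncio.create_task(worker(queue, results))
               for _ in range(n_workers)]
    for photo_id in photo_ids:
        await queue.put(photo_id)
    for _ in workers:  # one sentinel per worker to shut them down
        await queue.put(None)
    await asyncio.gather(*workers)
    return results


def process_batch(photo_ids: List[str], n_workers: int = 4) -> Dict[str, str]:
    """Synchronous entry point for the sketch."""
    return asyncio.run(run_pipeline(photo_ids, n_workers))
```

The point of the design is that workers spend most of their time waiting on I/O rather than computing, so a few CPU cores can keep many inference requests in flight at once.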
The Economics of the "Mystery Shopper"
Automating moderation for the "Mystery Shopper Club" radically improved the unit economics. Previously, manual survey moderation dragged on for weeks, killing participant loyalty. Now, the share of same-day verifications has jumped from zero to 40%. According to X5 Tech, moderator productivity has increased sevenfold, while operating expenses have been halved.
"Ivanych" is not just an algorithm, but a service that has effectively replaced manual labor on a national scale, processing 10 million photos every month.
Scaling wasn't without its quirks: for instance, the model for the meat processing area persistently mistook ground meat for dirt. The team solved the context issue by deploying "narrow" classifiers for specific zones and a "safe mode." If the AI doubts its verdict, the photo is sent for manual review. This preserved user trust, ensuring participants receive their loyalty points on time rather than "sometime later."
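The combination of zone-specific classifiers and a "safe mode" can be sketched roughly as follows. This is an illustration under stated assumptions, not the team's implementation: the zone names, the stub classifiers, the `Verdict` type, and the 0.85 confidence threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# A classifier takes photo bytes and returns (label, confidence in [0, 1]).
Classifier = Callable[[bytes], Tuple[str, float]]


@dataclass
class Verdict:
    zone: str
    label: str
    confidence: float
    needs_manual_review: bool


def review_photo(photo: bytes, zone: str,
                 classifiers: Dict[str, Classifier],
                 threshold: float = 0.85) -> Verdict:
    """Route a photo to its zone's classifier; fall back to a human when unsure."""
    classifier = classifiers.get(zone)
    if classifier is None:
        # No narrow model for this zone: safe mode by default.
        return Verdict(zone, "unknown", 0.0, True)
    label, confidence = classifier(photo)
    # Safe mode: low-confidence verdicts go to manual review.
    return Verdict(zone, label, confidence, confidence < threshold)


# Hypothetical stubs standing in for trained zone-specific models.
def meat_department_stub(photo: bytes) -> Tuple[str, float]:
    return ("clean", 0.62)  # e.g. ground meat that resembles dirt


def entrance_stub(photo: bytes) -> Tuple[str, float]:
    return ("clean", 0.97)


CLASSIFIERS: Dict[str, Classifier] = {
    "meat_department": meat_department_stub,
    "entrance": entrance_stub,
}
```

With these stubs, a meat department photo scored at 0.62 is flagged for manual review, while an entrance photo scored at 0.97 is accepted automatically. Raising the threshold shifts more work to moderators in exchange for fewer wrong automatic verdicts.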
The bigger question remains: can this experience be transferred to inventory management and stock control? In those areas, the cost of an AI error isn't measured by a shopper's frustration, but by direct losses from empty shelves and inaccurate stock levels. For now, the X5 case stands as a rare example of common sense prevailing over the hype surrounding "smart" systems.