In busy warehouses, even small inefficiencies can slow down operations. One of the most time-consuming and error-prone tasks is matching incoming product labels with packing slips and purchase orders. Traditionally done by hand, this process slows inventory updates, introduces errors and ties up staff.

WWT recently piloted an AI-powered system to automate this work. The goal: Understand the effort required to deploy it and the impact it could make on accuracy, efficiency and scalability — first in our warehouses and then in our clients' environments.

Learn how we did it: Automating Label Scanning at Scale with AI at the Edge

The problem: Manual label matching slows everything down

In most warehouse environments, receiving teams must manually read shipping labels, compare them to printed documentation and reconcile that data with backend systems. This creates bottlenecks, especially during peak times, and introduces opportunities for human error that ripple throughout the supply chain.

The solution: AI that works where the work happens

To address this, we developed and tested a custom system using edge-based AI scanning technology. Installed at receiving stations, the system captures and processes label information in real time, right where packages arrive.

Instead of sending data to the cloud and waiting for a response, each unit processes information locally using edge devices. This minimizes latency, lowers operating costs and keeps the system responsive — even in high-throughput environments.

Built for the real world

Rather than relying on off-the-shelf models, we trained a custom AI model to identify the specific label formats used in our operations. This allowed for greater accuracy and reliability than generic solutions, which often struggle with real-world label variation and lack the precision our workflows require.

WWT chose to build this system in-house to meet several key requirements: customizability, integration with our existing systems, and the ability to scale across warehouse locations with minimal redesign. Commercial solutions didn't offer the flexibility or precision we needed, particularly for tight integration with our WMS and ERP platforms, or for adapting to future operational changes.

Infrared sensors detect when a package comes into view, while LED indicators confirm scan status to the operator. Together, these features improve traceability and reduce handling errors.
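As a rough illustration of how a sensor-triggered scan with LED feedback might be wired together, here is a minimal Python sketch. The LED colors, the `process_events` function and the `scan_label` callback are hypothetical names chosen for this example; the article does not describe the actual firmware or control logic.

```python
from enum import Enum
from typing import Callable, Iterable, List


class Led(Enum):
    # Hypothetical LED states; the real indicator scheme is not documented here.
    OFF = "off"
    AMBER = "amber"  # scan in progress
    GREEN = "green"  # scan succeeded
    RED = "red"      # scan failed


def process_events(samples: Iterable[bool], scan_label: Callable[[], bool]) -> List[Led]:
    """Trigger one scan per package and record each LED change.

    `samples` is a stream of sensor readings (True = package in view).
    A scan fires only on the transition from "absent" to "present", so a
    package that stays in view is not scanned twice.
    """
    led_log: List[Led] = []
    previously_present = False
    for present in samples:
        if present and not previously_present:
            led_log.append(Led.AMBER)
            succeeded = scan_label()  # assumed to return True on success
            led_log.append(Led.GREEN if succeeded else Led.RED)
        elif not present and previously_present:
            led_log.append(Led.OFF)  # package left the station
        previously_present = present
    return led_log
```

Triggering on the transition rather than on every reading is one simple way to tie each scan to a single package, which is the kind of traceability the sensors are meant to support.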

Traditional optical character recognition tools didn't provide consistent results, so we used multimodal LLMs to extract text with greater accuracy and less preprocessing. This allows the system to adapt to varying label conditions without retraining.
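When a multimodal model is asked to return label data as JSON, the response still needs validation before it reaches downstream systems. The sketch below shows one way that check could look. The field names (`po_number`, `sku`, `quantity`) and the `parse_label_response` helper are assumptions made for illustration, not the schema used in the pilot.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LabelFields:
    # Hypothetical schema for illustration only.
    po_number: str
    sku: str
    quantity: int


def parse_label_response(raw: str) -> Optional[LabelFields]:
    """Validate a model's JSON reply; return None if it is unusable."""
    try:
        data = json.loads(raw)
        if not isinstance(data, dict):
            return None
        po_number = str(data["po_number"]).strip()
        sku = str(data["sku"]).strip()
        quantity = int(data["quantity"])
    except (KeyError, TypeError, ValueError):
        # ValueError also covers json.JSONDecodeError and bad integers.
        return None
    if not po_number or not sku or quantity <= 0:
        return None
    return LabelFields(po_number, sku, quantity)
```

Returning `None` for anything malformed lets the caller route the label to manual review instead of writing a bad record.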

Finally, the system connects directly to our WMS and ERP platforms to update records, reconcile discrepancies and flag issues in real time — without additional manual steps.
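The reconciliation step can be pictured as comparing scanned quantities against what the purchase order expects. The following is a minimal sketch under that assumption; the `Discrepancy` type and `reconcile` function are hypothetical and do not represent the actual WMS or ERP integration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Discrepancy:
    sku: str
    expected: int
    received: int


def reconcile(expected: Dict[str, int], scanned: List[Tuple[str, int]]) -> List[Discrepancy]:
    """Compare purchase-order quantities with scanned label quantities.

    `expected` maps SKU to ordered quantity; `scanned` is a list of
    (SKU, quantity) pairs, one per label. Returns every SKU whose totals
    differ, including SKUs that were scanned but never ordered.
    """
    received: Dict[str, int] = {}
    for sku, quantity in scanned:
        received[sku] = received.get(sku, 0) + quantity

    issues: List[Discrepancy] = []
    for sku in sorted(set(expected) | set(received)):
        want = expected.get(sku, 0)
        got = received.get(sku, 0)
        if want != got:
            issues.append(Discrepancy(sku, want, got))
    return issues
```

For example, an order for 10 units of one SKU and 5 of another, scanned as 10 and 3 plus one unexpected item, would yield two discrepancies to flag.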

Paired with a handheld app for floor staff

To further drive efficiency and supplement our automated solution, we developed a Swift-based mobile app that allows our warehouse workers to capture label images manually. The app:

  • Uses the device camera to capture images.
  • Uses API calls to GPT-4o or a similar multimodal model to extract structured text.
  • Publishes extracted data to a message queue (RabbitMQ, Kafka, etc.), making it instantly available for processing.

This handheld solution extends the reach of our automated system, enabling multiple staff members to contribute to label processing when automated capture isn't feasible.
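The publish step of that workflow might look something like the sketch below, written in Python for brevity rather than in the app's Swift. To keep it self-contained, the standard library's `queue.Queue` stands in for a real broker such as RabbitMQ or Kafka, and the envelope fields (`id`, `captured_at`, `source`, `fields`) are assumptions made for this example.

```python
import json
import queue
import uuid
from datetime import datetime, timezone


def build_message(fields: dict, source: str) -> str:
    """Wrap extracted label fields in a JSON envelope (hypothetical format)."""
    envelope = {
        "id": str(uuid.uuid4()),                                # unique message ID
        "captured_at": datetime.now(timezone.utc).isoformat(),  # UTC timestamp
        "source": source,                                       # e.g. "handheld"
        "fields": fields,
    }
    return json.dumps(envelope, sort_keys=True)


def publish(broker: queue.Queue, fields: dict, source: str = "handheld") -> str:
    """Put the serialized message on the queue and return the message body.

    `broker` is an in-process stand-in; a real deployment would call the
    client library of whichever message queue is in use.
    """
    body = build_message(fields, source)
    broker.put(body)
    return body
```

Tagging each message with its source would let downstream consumers treat handheld captures and fixed-station scans the same way while still telling them apart.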

Results: A faster, smarter warehouse

The pilot produced clear, measurable outcomes:

  • Minutes instead of hours: What once took hours per shipment now takes minutes, dramatically improving throughput.
  • Fewer errors: High-accuracy label detection reduced the need for manual corrections.
  • Lower costs: Processing data on the edge cut down on cloud usage and associated fees.
  • Scalable by design: The system is modular and can be adapted across warehouse locations and use cases.
  • More productive teams: With less time spent on routine tasks, staff can focus on higher-value work.
  • Better data, faster decisions: Real-time updates allow for quicker, more informed responses to operational issues.

What's next

This pilot wasn't just about optimizing one process. It was a test of what's possible when applied AI meets real-world operations. The modular architecture opens the door for further enhancements, from new sensors to smarter models, all deployable at scale.