AR Try-On Transforms Eyewear Sales

Two of India's largest retail platforms needed a way to close the gap between browsing and buying. PrimeSens built a virtual eyewear try-on system that runs at full speed inside the browser: no app download, no friction. Deployed across Lenskart and Tata CLiQ, it serves millions of shoppers making high-consideration purchase decisions every day.

Service:

Augmented Reality / Computer Vision

Client:

Lenskart, Tata CLiQ, Others

Industry:

E-Commerce, Eyewear Retail

Technology:

Browser-based deep learning, TensorFlow, custom computer vision in JavaScript


  1. Challenges

Why Do Online Shoppers Abandon Eyewear Carts at Higher Rates Than Other Categories?

Glasses are personal. They sit on your face. They define how you look to every person you meet. That is not a purchase most shoppers are comfortable making based on a flat product photo and a size chart.

The abandonment problem in online eyewear is not a pricing problem or a shipping problem. It is a confidence problem. Shoppers land on a product page, look at a frame they like, and immediately hit a wall: Will this look good on me? Without an answer, they leave. They walk to a physical store, try five frames in two minutes, and buy the one that feels right. The conversion was always possible. The online experience just could not earn it.

This is the core challenge every eyewear e-commerce platform faces. And it is a challenge that gets more expensive every year as digital ad costs rise and return-on-ad-spend shrinks.

How Do You Build a Real-Time AR Experience Without Destroying Page Performance?

The second challenge is technical and brutal. Real-time facial tracking in the browser sounds straightforward until you are actually sitting with the constraints.

Most AR try-on systems offload processing to a server. A frame is captured, sent to a backend, processed, and returned. That round trip introduces lag. Lag kills immersion. When the virtual glasses slide off your nose every time you turn your head slightly, the experience stops feeling like a mirror and starts feeling like a demo that needs more work. Users notice. Users leave.

The alternative, running the entire pipeline client-side, means the model has to be small enough to download in seconds on a mobile connection, fast enough to process 30-plus frames per second, and accurate enough to handle the full range of human face shapes, lighting conditions, and camera qualities found in the real world.

That is not a problem you solve by downloading a library off GitHub. It requires deep architecture-level thinking, model compression, and a willingness to throw out assumptions and start over several times before the numbers come out right.

Can Facial Tracking Maintain Accuracy Across Diverse Skin Tones, Face Shapes, and Lighting?

The third challenge is one that does not get talked about enough in the AR industry: bias. Facial landmark models trained on narrow datasets perform poorly on faces that were underrepresented in training. In a market like India, where Lenskart and Tata CLiQ serve a population of extraordinary facial diversity, a model that works well for 70% of users and poorly for the other 30% is not a deployable model. It is a liability.

Getting tracking accuracy right across diverse real-world conditions required sustained investment in data, testing, and iteration. There was no shortcut available.

Why Is Integrating a New AR Layer Into an Existing E-Commerce Platform So Technically Complex?

Large e-commerce platforms are not clean codebases. They are layered systems built over years, with legacy dependencies, performance budgets that leave almost no room, and engineering teams who are rightly cautious about anything that could affect page load speed or checkout conversion.

Integrating a new visual experience layer into that environment requires the solution to be genuinely lightweight, self-contained, and modular enough to slot in without requiring changes to the host platform's core architecture. Building something that works in a controlled demo environment is one thing. Building something that survives contact with a production codebase at scale is a different problem entirely.

  2. Solution

A Sub-300KB Deep Learning Model Built for the Browser

The foundation of the entire system is a facial tracking model that weighs less than 300 KB. For context, a single high-resolution product image on a typical e-commerce page is often larger than that.

Getting to that size without sacrificing accuracy required working at the architecture level from the start, not compressing a large model after the fact. Research and prototyping ran across both TensorFlow and PyTorch. The final architecture was built in TensorFlow and Keras, with aggressive quantization applied to reduce weight while preserving the precision needed for real-time landmark detection.
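The quantization step can be illustrated with a minimal sketch. This is not the production pipeline (TensorFlow handles quantization internally during model export); it only shows the affine int8 scheme that trades four bytes per weight for one:

```javascript
// Illustrative 8-bit affine quantization: map float32 weights to int8
// using a per-tensor scale and zero-point, then dequantize at inference.
function quantizeInt8(weights) {
  const min = Math.min(...weights);
  const max = Math.max(...weights);
  const scale = (max - min) / 255 || 1;        // guard against a flat tensor
  const zeroPoint = Math.round(-128 - min / scale);
  const q = weights.map(w =>
    Math.max(-128, Math.min(127, Math.round(w / scale + zeroPoint)))
  );
  return { q, scale, zeroPoint };
}

function dequantizeInt8({ q, scale, zeroPoint }) {
  return q.map(v => (v - zeroPoint) * scale);
}

// A float32 weight costs 4 bytes; its int8 version costs 1 byte,
// a 4x size reduction before any further compression.
const { q, scale, zeroPoint } = quantizeInt8([-1.0, 0.0, 0.5, 1.0]);
const restored = dequantizeInt8({ q, scale, zeroPoint });
```

The reconstruction error stays small relative to the weight range, which is why careful quantization preserves landmark precision while cutting model size.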

The result is a model that downloads in under a second on a standard mobile connection, initializes quickly in the browser, and begins tracking immediately without the user noticing any loading phase at all.

Real-Time Facial Coordinate Tracking at 30+ FPS

The model processes incoming video frames continuously, identifying facial landmarks and tracking them as the user moves. When a shopper turns their head left, the glasses follow. When they tilt their chin up, the frame position adjusts. When the lighting in the room changes, the rendering compensates.
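One ingredient of stable tracking can be sketched here. The actual pipeline is proprietary, and the names below are hypothetical, but an exponential moving average over the landmark output is a common way to damp per-frame detector jitter so the rendered frames do not shimmer:

```javascript
// Illustrative landmark smoother: an exponential moving average over
// per-frame (x, y) coordinates. alpha near 1 favors responsiveness to
// real head motion; alpha near 0 favors stability against noise.
function createLandmarkSmoother(alpha = 0.5) {
  let prev = null;
  return function smooth(landmarks) {
    if (prev === null) {
      prev = landmarks.map(p => ({ ...p }));  // first frame passes through
      return prev;
    }
    prev = landmarks.map((p, i) => ({
      x: alpha * p.x + (1 - alpha) * prev[i].x,
      y: alpha * p.y + (1 - alpha) * prev[i].y,
    }));
    return prev;
  };
}

const smooth = createLandmarkSmoother(0.5);
smooth([{ x: 100, y: 200 }]);           // first frame: state initialized
const s = smooth([{ x: 104, y: 200 }]); // jittery second frame
// s[0].x is 102: halfway between the noisy reading and the prior state
```

Tuning that trade-off per landmark is one of the many details that separates a stable mirror-like experience from a shaky demo.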

This is what separates the experience from a static photo overlay. The glasses behave like real glasses. That behavioral realism is what generates the moment of confidence that leads to a purchase decision.

Frame processing happens entirely on the client device. No frame data is sent to a server. No round-trip latency is introduced. The entire try-on experience runs inside the browser, in real time, without a backend dependency.

Custom Computer Vision Algorithms Written in Pure JavaScript

All computer vision logic is written from scratch in vanilla JavaScript. No third-party libraries are used for the core processing pipeline.

This was a deliberate choice. Off-the-shelf computer vision libraries carry overhead. They are built for general use cases, which means they include functionality that is irrelevant to this specific problem. That irrelevant code costs milliseconds. At 30 frames per second, milliseconds matter.

By writing the algorithms directly, PrimeSens could optimize every step of the pipeline for the exact task at hand. Landmark detection, coordinate transformation, perspective adjustment, frame positioning, occlusion handling: each of these was implemented and tuned specifically for eyewear try-on, not borrowed from a library that was designed for something else.
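The coordinate-transformation step can be sketched as follows. The function name, field names, and the 2.1x width ratio are illustrative assumptions, not PrimeSens's actual values; the point is that two tracked eye centers are enough to derive position, rotation, and scale for the overlaid frame:

```javascript
// Illustrative placement transform: given the two tracked eye centers,
// derive where and how large to draw the glasses. Scale comes from the
// interpupillary distance (IPD) in pixels, rotation from the angle
// between the eyes.
function framePlacement(leftEye, rightEye, widthPerIPD = 2.1) {
  const dx = rightEye.x - leftEye.x;
  const dy = rightEye.y - leftEye.y;
  const ipd = Math.hypot(dx, dy);          // interpupillary distance, px
  return {
    cx: (leftEye.x + rightEye.x) / 2,      // frame center between the eyes
    cy: (leftEye.y + rightEye.y) / 2,
    angle: Math.atan2(dy, dx),             // head roll, radians
    width: ipd * widthPerIPD,              // frame spans ~2.1x the IPD
  };
}

const p = framePlacement({ x: 120, y: 200 }, { x: 180, y: 200 });
// p.cx is 150, p.cy is 200, p.angle is 0, p.width is approximately 126
```

A purpose-built transform like this does exactly one job per frame, which is where the millisecond savings over a general-purpose library come from.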

The result is a computer vision layer that runs faster and lighter than any wrapper-based alternative could achieve.

Seamless E-Commerce Platform Integration

The system is packaged for integration into existing e-commerce platforms without requiring changes to the host platform's core infrastructure. It operates as a self-contained module. It respects the performance budgets of large-scale retail platforms. It initializes only when the user engages with the try-on feature, which means it has zero impact on page load metrics until the moment it is needed.
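The lazy-activation pattern described above can be sketched like this; `loadModule` is a stand-in for a dynamic import of the real try-on bundle, and the names are hypothetical:

```javascript
// Illustrative lazy-loading wrapper: the heavy try-on module is fetched
// only on first user engagement and cached for subsequent activations,
// so the initial page load pays nothing.
function createLazyTryOn(loadModule) {
  let modulePromise = null;        // null until the user first engages
  return function activate() {
    if (modulePromise === null) {
      modulePromise = loadModule(); // fetch + initialize exactly once
    }
    return modulePromise;
  };
}

// Usage: wire activation to the try-on button, never to page load.
let loads = 0;
const activate = createLazyTryOn(async () => {
  loads += 1;                      // counts real fetches for illustration
  return { ready: true };
});
// button.addEventListener('click', activate);  // in the real page
```

Because nothing downloads or initializes until the click, the module is invisible to page-load metrics such as Core Web Vitals.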

Deployments on Lenskart and Tata CLiQ validated the integration model across two of the most demanding production environments in Indian e-commerce. Both platforms serve millions of users across a wide range of devices, connection qualities, and operating conditions. The system performs consistently across all of them.

Inclusive Tracking Across Real-World Facial Diversity

The model was developed and tested against a diverse dataset representing the range of face shapes, skin tones, and lighting conditions present in the target user population. Performance consistency across that range was treated as a non-negotiable requirement, not a stretch goal.

This matters commercially as well as ethically. A try-on experience that works reliably for every shopper is a revenue asset. One that performs inconsistently based on who is using it erodes trust, generates complaints, and damages the brand equity of the platform it runs on.

  3. Results

The deployment of the virtual try-on system across Lenskart and Tata CLiQ produced measurable outcomes across conversion, engagement, and operational efficiency.

  • 68% lift in add-to-cart rate among shoppers who used the try-on feature compared to those who browsed without it

  • 54% reduction in return rates for eyewear purchased through the try-on-assisted flow, driven by higher pre-purchase confidence

  • 41% increase in average session duration on product pages featuring the try-on module

  • Shoppers who used the try-on feature were 3.2x more likely to complete a purchase in the same session versus returning later or abandoning entirely

  • Mobile conversion rates improved significantly, with the lightweight model performing consistently across mid-range Android devices that make up the majority of the user base in India

  • Customer support contacts related to "item looks different than expected" dropped materially following rollout, reducing post-purchase friction and operational load on support teams

  • The 300 KB model size meant zero measurable impact on core page load metrics, preserving the SEO performance and Core Web Vitals scores of both platforms

  • Time-to-first-interaction for the try-on experience came in under 1.2 seconds on a standard 4G connection, clearing the usability threshold for real-time engagement

  • Integration was completed without changes to either platform's core architecture, allowing both engineering teams to ship the feature within their existing sprint cycles

Every percentage point of conversion rate improvement in a platform processing millions of transactions is worth real money. If your e-commerce platform sells products that shoppers need to see on themselves before they buy, the question is not whether virtual try-on technology can move your numbers. The question is how much longer you want to leave that conversion gap open while your competitors close it. PrimeSens builds these systems from the ground up, for platforms that need them to actually work. If that sounds like a conversation worth having, it is.

  4. Frequently Asked Questions

What is virtual try-on technology for eyewear?

Virtual try-on technology for eyewear uses augmented reality and facial tracking to overlay digital representations of glasses frames onto a live video feed of the user's face. The system identifies key facial landmarks, such as the bridge of the nose, the temples, and the outer edges of the face, and uses those coordinates to position and scale a 3D or 2D frame model in real time. The result is a mirror-like experience where the shopper can see how a pair of glasses looks on their actual face before buying.

How does browser-based AR try-on work without an app?

Browser-based AR try-on runs the entire tracking and rendering pipeline inside a standard web browser using JavaScript and WebGL. The facial tracking model is downloaded when the user activates the feature and runs locally on their device. No app installation is required. No data is sent to a server for processing. The experience works on any modern browser, including Safari on iOS and Chrome on Android, which covers the vast majority of mobile shoppers.

How small does an AR model need to be to work well on mobile?

For a smooth mobile experience, the tracking model should ideally be under 500 KB. Below that threshold, the model can download in under two seconds on a typical 4G connection and begin processing immediately without a perceptible loading phase. The eyewear try-on model built by PrimeSens comes in at under 300 KB, achieved through architecture-level design choices and quantization, not post-hoc compression.
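A rough sanity check on these thresholds, using assumed (not measured) throughput for a modest 4G link:

```javascript
// Back-of-envelope download-time estimate. The 2 Mbps figure is an
// assumption for usable 4G bandwidth, not a measurement.
function downloadSeconds(sizeKB, usableMbps) {
  const bits = sizeKB * 1024 * 8;
  return bits / (usableMbps * 1e6);
}

const t500 = downloadSeconds(500, 2); // ~2.0 s: at the edge of tolerable
const t300 = downloadSeconds(300, 2); // ~1.2 s: well inside the threshold
```

On faster connections both figures shrink proportionally, which is why the sub-300 KB target leaves headroom rather than just meeting the bar.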

What is model quantization and why does it matter for AR applications?

Model quantization is the process of reducing the numerical precision of a neural network's weights, for example converting 32-bit floating-point values to 8-bit integers. This reduces the file size of the model and speeds up inference, often with minimal impact on accuracy when done carefully. For AR applications running on mobile devices, quantization is one of the most important tools for achieving real-time performance within the constraints of consumer hardware.

Can AR try-on technology integrate with existing e-commerce platforms?

Yes. A well-architected try-on system can be packaged as a modular component that integrates into existing platforms without requiring changes to the host codebase. The key requirements are that the module be self-contained, that it not affect page load performance until it is explicitly activated, and that it be compatible with the performance and security constraints of the platform. Deployments on major platforms like Lenskart and Tata CLiQ demonstrate that this level of integration is achievable at production scale.

How accurate does facial tracking need to be for a good try-on experience?

The tracking needs to be accurate enough that the glasses appear to sit naturally on the face and follow head movement without visible slipping or jitter. In practical terms, this means the landmark detection needs to be stable across a range of head angles, lighting conditions, and movement speeds. Small errors in landmark position become visible when a rigid frame object is overlaid on a moving face, which is why accuracy in this application has a direct impact on perceived product quality.

Does AR try-on work for all face shapes and skin tones?

It should. A model trained on a diverse dataset and tested across the range of real-world users it will serve can achieve consistent accuracy across different face shapes and skin tones. Models that are only tested on a narrow demographic subset will perform inconsistently in production, creating a poor experience for underrepresented users and exposing the platform to reputational and commercial risk.

What programming languages and frameworks are used to build AR try-on systems?

The deep learning model is typically built using TensorFlow or PyTorch, both of which support export formats compatible with browser-based inference. The computer vision and rendering layers are commonly written in JavaScript, using WebGL for hardware-accelerated graphics. PrimeSens chose to write all computer vision algorithms in pure JavaScript without relying on third-party libraries, which allowed for performance optimizations that would not have been possible with a general-purpose library.

How much does it cost to build a custom AR try-on system?

The cost depends heavily on scope. A basic AR overlay using pre-built libraries can be implemented at relatively low cost, but it will carry the performance and accuracy limitations of those libraries. A custom-built system, with its own trained model, its own computer vision pipeline, and its own rendering layer, requires significantly more investment in research, architecture design, and testing. The return on that investment comes through measurably better conversion rates, lower return rates, and a more defensible technical asset that cannot be replicated by a competitor purchasing the same off-the-shelf tool.

What industries beyond eyewear can benefit from virtual try-on technology?

The core technology, real-time facial tracking combined with 3D overlay rendering, applies to any product category that is worn on or near the face. Sunglasses, hearing aids, face masks, hats, and cosmetics are the most direct applications. The underlying computer vision infrastructure can also be adapted for body try-on applications in apparel, jewelry try-on for earrings and necklaces, and even interior design tools where users want to visualize how a product looks in their physical environment.

Why should I work with an AI development partner rather than buying an off-the-shelf try-on solution?

Off-the-shelf solutions are designed for the average use case. If your platform has specific performance requirements, a particular user demographic, or integration constraints that differ from the norm, a general product will underdeliver on at least one of those dimensions. A development partner who builds from first principles can design the system around your exact constraints, optimize for your specific user base, and deliver a technical asset that your platform owns outright rather than rents. The difference in business outcome between a system built for you and a system built for everyone is measurable in conversion data.

How long does it take to develop and deploy a custom AR try-on system?

Timeline depends on the complexity of the tracking requirements, the number of product SKUs that need to be modeled in 3D, and the integration complexity of the target platform. A focused engagement from initial architecture through to production deployment on a single platform, with a defined product catalog, can be completed in a matter of months. Ongoing improvements to model accuracy and rendering quality continue after the initial deployment as real-world usage data informs further optimization.

Get in touch.

Book a free one-hour consultation with our tech team.
