
In a sunlit conference room in San Francisco, a group of developers huddle over a table scattered with phones, tablets, and laptops. The air is electric with possibility, but tinged with frustration. One device keeps lagging, another refuses to load the demo. The problem isn’t the code. It’s the connection.
“We’ve optimised everything,” one engineer mutters, thumbing through logs. “But when the signal drops, the whole thing falls apart.”
Welcome to the fragile reality of AI on mobile.
Despite astonishing leaps in generative AI (image generation, natural language interfaces, and real-time summarisation), the entire experience often hinges on a single, outdated assumption: that the user is always connected to the cloud. As AI continues to embed itself deeper into our daily lives, from customer support to healthcare diagnostics to personal assistants, that assumption is becoming less and less viable.
The next frontier of AI isn’t in bigger models or smarter prompts. It’s in bringing intelligence closer to the user: physically. And that means rethinking where data lives, how it moves, and what happens when the cloud is simply out of reach.
Cloud can’t keep up
To understand the shift, start with how most AI systems currently operate. When you open a chatbot on your phone or ask your smart assistant a question, the request typically travels across the internet to a cloud server where the large language model (LLM) processes the input, consults relevant databases, and sends a response back.
This setup works beautifully when bandwidth is strong, latency is low, and privacy isn’t a concern.
But increasingly, those conditions don’t hold.
Users are demanding smarter apps that work offline, in real-time, without sacrificing responsiveness or control. They expect AI to assist them on subways, in rural areas, on factory floors, and in high-security environments where external data connections are limited or forbidden. And they expect it to do so while preserving data privacy and ensuring low power consumption on resource-constrained devices.
The cloud, once the crown jewel of modern computing, is beginning to look like a bottleneck.
On-device AI is ready—almost
Technically, we’re closer than ever to running real AI locally.
Thanks to advances in quantisation, pruning, and specialised silicon—such as Apple’s Neural Engine, Google’s Tensor SoC, and Qualcomm’s AI chips—it’s now possible to run compressed versions of LLMs directly on mobile hardware. Models like Phi-2, LLaMA, and Gemma have been distilled to sizes small enough to live on-device and still perform impressively on many natural language tasks.
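To make the compression idea concrete, here is a minimal sketch of symmetric post-training quantisation, the simplest form of the technique mentioned above: float32 weights are mapped to int8, shrinking storage four-fold at the cost of a small, bounded rounding error. (This is a toy illustration in NumPy, not how any particular vendor's toolchain works.)

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric post-training quantisation: map float32 weights to int8."""
    scale = np.abs(weights).max() / 127.0   # one scale for the whole tensor
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights from the int8 representation."""
    return q.astype(np.float32) * scale

# A toy weight matrix: 4 bytes per value in float32, 1 byte in int8.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)

print(w.nbytes)   # 262144 bytes in float32
print(q.nbytes)   # 65536 bytes in int8, a 4x reduction
```

Real deployments layer on per-channel scales, pruning, and hardware-specific formats, but the trade-off is the same: less memory and faster integer arithmetic in exchange for a small, controlled loss of precision.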
But here’s the kicker: AI doesn’t exist in a vacuum. Models need data. They need context. And they need it fast.
Without local access to relevant, real-time data (user preferences, recent activity, stored documents, or business-critical information) the model may be running on your device, but it’s still blind. That’s the paradox. We’ve brought the brain to the edge, but not the memory.
The data problem at the edge
Imagine a travel assistant that knows your itinerary, hotel reservations, and preferences. You’re in a taxi in Rome, with no Wi-Fi, and you ask for a dinner recommendation near your hotel. The model runs on your phone, but the hotel address? That’s stored in the cloud. Without it, the AI draws a blank.
Or take a field medic using an AI tool to interpret patient data in a disaster zone. The model is on the device, but if the patient records are locked in a remote database, the application fails at the moment it’s needed most.
The problem isn’t just fetching data, either. It’s ensuring that the data is accurate, synchronised, and secure across millions of devices, many of which may be offline for hours or days at a time. Data has to move intelligently between the cloud and the edge, updating incrementally, resolving conflicts, and maintaining integrity without draining battery life or compromising privacy.
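One common building block for the conflict-resolution part of that challenge is a last-writer-wins merge: each replica timestamps its writes, and when two copies reconcile, the newer write survives. A minimal sketch (the data shapes and tie-break rule here are illustrative assumptions, not a production sync protocol):

```python
def merge(local: dict, remote: dict) -> dict:
    """Last-writer-wins merge of two replicas.

    Each replica maps key -> (value, timestamp). On conflict, the newer
    write survives; ties break deterministically on the value so every
    device converges to the same state regardless of merge order.
    """
    merged = dict(local)
    for key, (value, ts) in remote.items():
        if key not in merged or (ts, value) > (merged[key][1], merged[key][0]):
            merged[key] = (value, ts)
    return merged

# A phone edits offline while the cloud copy also changes.
phone = {"hotel": ("Hotel Roma", 100), "dinner": ("trattoria", 180)}
cloud = {"hotel": ("Hotel Roma Centrale", 150)}

state = merge(phone, cloud)
# The cloud's later hotel update wins; the offline dinner note survives.
```

Last-writer-wins is the bluntest instrument available; richer schemes (vector clocks, CRDTs) preserve more concurrent edits, but all of them share this core requirement of deterministic convergence across replicas.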
Solving this data mobility challenge is the linchpin to unlocking AI’s full potential on mobile.
Rethinking infrastructure for AI at the edge
What’s needed now is a new kind of infrastructure, one that treats data like a living, breathing entity that must exist both on the device and in the cloud, capable of thriving in both connected and disconnected environments.
This means:
- Offline-first design: AI apps must function even without a network. No fallback mode. Full functionality, even in airplane mode.
- Bidirectional sync: Changes made on-device should sync back seamlessly once a connection returns, without data loss or duplication.
- Latency-free access: Models must be able to query data at memory speed, not over-the-air latency, especially during inference.
- Security and privacy at the core: Sensitive information should never leave the device without explicit consent and encryption.
- Scalability across fleets: Whether it’s 100 or 10 million devices, the system must keep them in sync, without micromanagement.
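The first two requirements above, offline-first operation and bidirectional sync, are often implemented with an outbox pattern: every write lands in local storage immediately, and a queue of pending changes is replayed upstream whenever connectivity returns. A minimal sketch, with all names hypothetical:

```python
class OfflineStore:
    """Offline-first key-value store with a replay outbox.

    Writes always succeed locally; the outbox records them so they
    can be pushed upstream when a connection becomes available.
    """

    def __init__(self):
        self.data = {}      # local replica, always readable
        self.outbox = []    # pending (key, value) writes

    def write(self, key, value):
        self.data[key] = value            # the app keeps working offline
        self.outbox.append((key, value))  # remember the change for later

    def sync(self, push):
        """Replay pending writes via `push(key, value)`; keep failures queued."""
        remaining = []
        for key, value in self.outbox:
            try:
                push(key, value)
            except ConnectionError:
                remaining.append((key, value))  # retry on the next sync
        self.outbox = remaining

# Usage: writes work in airplane mode, then drain once back online.
store = OfflineStore()
store.write("note", "book trattoria near hotel")

cloud = {}
store.sync(lambda k, v: cloud.update({k: v}))
```

A real system would also pull remote changes, deduplicate replays, and resolve conflicts, but the principle holds: the local store is the source of truth for the user, and the network is an optimisation, not a dependency.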
In short, the backend of AI needs to evolve, away from centralised architectures and toward distributed, intelligent data systems purpose-built for the edge.
A future that’s closer than it looks
We are entering a phase where the intelligence is not just in the cloud, not just in the model, but in the choreography between device, data, and environment.
AI that truly feels human (contextual, responsive, and always available) will not be achieved through brute compute or bigger models. It will come from systems designed to operate at the speed of thought, wherever that thought occurs, and regardless of whether a signal bar is present.
The future of AI on mobile is not about replacing the cloud, but about liberating the user from it.
And that future is already starting to take shape. Quietly. Locally. One intelligent, offline interaction at a time.
—
Editor’s note: e27 aims to foster thought leadership by publishing views from the community. Share your opinion by submitting an article, video, podcast, or infographic.
The post Why the future of AI on mobile may not be in the cloud appeared first on e27.
