Ant Group launches LingGuang, a multimodal AI assistant built for the app era

Alibaba-affiliated companies are making a broad push into the consumer market.

On November 19, Ant Group officially launched LingGuang, an artificial intelligence assistant that delivers multimodal responses. Instead of presenting only text, each interaction produces an interactive webpage capable of generating images, text, 3D models, animations, maps, tables, and audio or video.

Within one of its modules, users can input natural language instructions and LingGuang can generate an editable, interactive mini app in roughly 30 seconds, according to the company. The app supports multiple formats of information output.

Another module functions as a visual assistant, allowing users to upload or take photos in real time. The AI interprets the image and provides relevant information or performs follow-up tasks.

As of November 24, less than a week after launch, LingGuang had surpassed one million downloads and ranked sixth on app store charts. Its early trajectory has outpaced several well-known AI apps. According to mobile analytics firm Appfigures, the recently publicized Sora 2 took five days to reach the same milestone.

On November 18, one day before LingGuang’s debut, Alibaba refreshed and consolidated its consumer-facing AI products under the relaunched Qwen app.

When asked why two conversational AI assistants were released in such close succession, Ant Group CTO He Zhengyu said the teams did not coordinate in advance and that the timing was a coincidence.

The backdrop is that Alibaba historically did not emphasize consumer AI applications. In 2025, however, the company began accelerating its consumer strategy. At Qwen’s launch event, Alibaba stated that it intended to compete directly in consumer AI, positioning the space as a new entry point for users.

Alibaba is deliberately not relying on a single product. “Jack Ma encouraged us to push ourselves to the top of the app store rankings,” He said.

Given rapid gains in model capabilities and an uncertain competitive landscape, pursuing several directions is viewed internally as a safer path. He Zhengyu offered an analogy:

“If you’re trying to find water in a desert, you wouldn’t send everyone in one direction. You’d send teams in several directions.”

The products are also differentiated.

The Qwen app showcases Alibaba’s underlying model capabilities and is well suited to general Q&A, long-form writing, and complex reasoning.

LingGuang, by contrast, emphasizes mobile-native interaction. Beyond answering questions, it renders rich multimodal content and supports code generation, which allows the AI to build mini apps for users.

He Zhengyu added that LingGuang is not intended to be a universal assistant and does not aim to compete with companionship-oriented platforms such as Doubao. Its positioning remains that of an efficiency tool.

Built for higher information density

LingGuang differs most clearly from general-purpose AI assistants in how it presents information.

It moves beyond the conventional chat-based interface. Instead of returning text alone, it can draw images, create animations, render 3D models, and produce charts. Cai Wei, LingGuang’s product lead, likened this to a teacher who explains a concept by sketching visuals on the spot so that students can grasp it.

For example, when a user asks how to make sweet-and-sour ribs, a typical AI assistant might produce a long, text-heavy recipe. LingGuang aims to improve comprehension by offering more visual formats.

In a test conducted by 36Kr using the sweet-and-sour ribs prompt, LingGuang generated a visual card within seconds. The card included an image of the finished dish, step-by-step instructions, varied font sizes, subheadings, diagrams, and stickers. The layout was designed for quick scanning.

Cai compared this shift to moving from the era of text-heavy emails to modern webpages, which integrate images, video, and interactive elements. “We want AI-generated answers to achieve that level of information density,” he said.

This approach aligns with how users naturally process information. It also works across varied scenarios, such as receiving a chart while drafting a paper, viewing a 3D floor plan while discussing home renovations, or watching an animation of planetary orbits during a conversation about the solar system.

“We encounter all kinds of information every day and often drown in it,” Cai said. “But which parts actually matter? With search engines, you get a list of links and must click into each one. We want a more efficient way to maximize the speed of information transfer.”

Optimizing information presentation is only the first step. LingGuang’s second major feature generates interactive mini apps based on user needs.

If a user asks for a timer, LingGuang produces one. These mini apps can be used immediately, edited, saved, and shared.

The concept is not new, as model developers often demonstrate similar capabilities. The difficulty lies in whether AI-generated web pages or apps are functional and usable at scale.

Much of this challenge stems from model architecture and engineering. A short instruction in Mandarin Chinese, such as “create a centered blue button,” may require dozens or hundreds of characters of underlying code to generate an interactive component.

The model must translate concise requests into large volumes of executable code, which increases computational load and latency. To address this, LingGuang incorporates engineering optimizations designed to maintain performance and stability.
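The gap between a short request and the code behind it can be made concrete with a rough sketch. Both strings below are assumptions for illustration only: the Mandarin phrasing is one possible rendering of the “centered blue button” request, and the HTML is a hand-written stand-in, not LingGuang’s actual output. The helper name `expansion_ratio` is likewise hypothetical.

```python
# Hypothetical illustration: neither string is LingGuang's real input or output.
# Assumed Mandarin phrasing of "create a centered blue button".
INSTRUCTION = "创建一个居中的蓝色按钮"

# Hand-written stand-in for the kind of markup such a request might expand into.
GENERATED_HTML = """<div style="display:flex;justify-content:center;align-items:center;height:100vh">
  <button style="background:#1677ff;color:#fff;border:none;padding:12px 24px;border-radius:8px"
          onclick="alert('clicked')">Click me</button>
</div>"""


def expansion_ratio(instruction: str, code: str) -> float:
    """Return how many characters of code correspond to one character of instruction."""
    if not instruction:
        raise ValueError("instruction must not be empty")
    return len(code) / len(instruction)


if __name__ == "__main__":
    ratio = expansion_ratio(INSTRUCTION, GENERATED_HTML)
    print(f"instruction: {len(INSTRUCTION)} characters")
    print(f"generated code: {len(GENERATED_HTML)} characters")
    print(f"expansion: about {ratio:.0f}x")
```

Even this minimal button needs far more characters of code than the request that describes it, and a model has to generate every one of them before anything appears on screen.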

This requires not only code generation but also precise reasoning, tool use such as dynamic map or chart generation, mathematical capability for data visualization, and accurate interpretation of user intent.

Compared with other web app generation tools, LingGuang’s differentiating feature is that it operates directly on mobile devices and produces outputs that are reportedly usable right away.

Relieved of the burden of overbuilding

The launch of DeepSeek R1 in January marked a turning point in Ant’s commitment to artificial general intelligence, or AGI. He Zhengyu said he felt “excitement, urgency, and shame” in response to the announcement.

After the Lunar New Year, Ant formed an independent AGI team called Inclusion AI, bringing research, engineering, and product development under one structure.

Strategically, Ant chose not to compete with general-purpose assistants such as Doubao, which center on voice-driven companionship and use cases built around time spent in the app. Instead, Ant focused on two narrower but high-leverage directions: coding capability and multimodal output. This positioning shaped LingGuang as an efficiency tool rather than a universal assistant.

The decision required tradeoffs. For instance, although many model developers emphasized reasoning in 2025, LingGuang did not incorporate extensive reasoning features. “DeepSeek already does that extremely well and solves many problems. We don’t need to reinvent it,” Cai said.

Ant is betting that coding capabilities will continue to advance.

When the LingGuang project received approval in March, coding capabilities in foundation models were still limited, and attempts to generate a mini app from a single instruction produced poor results.

“We believed coding would become increasingly important,” Cai said. “But how far and how fast it would improve was extremely uncertain.”

Model capabilities set the ceiling for product performance. Ant advanced along two tracks: the foundation model team strengthened low-level coding skills, while the application team handled post-training and product adjustments.

LingGuang’s feature iterations were designed for long-term value through reusable modules. When the underlying model improves, post-training refinements layer on top rather than requiring a rebuild.

Since DeepSeek’s release, AI applications have diverged in product philosophy. At this stage, user preference may matter more than short-term competition.

Differentiation has become a central challenge in the AI market. General-purpose assistants, supported by rapidly evolving foundation models, now crowd the field. Doubao emphasizes accessible, voice-first multimodal interaction, while DeepSeek and Kimi prioritize professional and productivity-oriented scenarios.

Ant’s strategy, if summarized in one line, is to build the “QR code” of the AGI era.

This means identifying a simple, high-value application that fits the market with minimal cost, choosing narrow entry points, and delivering concentrated utility. “We didn’t invent the QR code,” He said. “But we popularized it widely by embedding it into payment scenarios. AI apps face similar challenges today.”

Looking ahead, LingGuang is planning a mini app ecosystem with a marketplace, hosting platform, and sharing tools. “We want to lower the barriers for anyone to create and consume mini apps,” Cai said.

KrASIA Connection features translated and adapted content that was originally published by 36Kr. This article was written by Deng Yongyi for 36Kr.