MiniMax’s Hailuo AI text-to-video generator: How does it stack up?

MiniMax, a rising Chinese artificial intelligence startup, is making waves with the release of its latest text-to-video AI model, known as Video-01 and commercially as Hailuo AI. Backed by tech giants Alibaba and Tencent, the company launched the new model in early September, attracting attention for its ability to generate realistic video clips.

Official demo video of MiniMax’s Hailuo AI. Video source: MiniMax.

The introduction of Hailuo AI comes at a pivotal moment in the generative AI industry, as the race to dominate the AI-generated video market heats up. Established offerings such as OpenAI’s Sora, Runway, and Google’s Veo are already competing in this space, while publicly listed Chinese company Kuaishou recently released Kling AI, its generative AI platform, which also offers text-to-video features.

Hailuo AI versus Kling AI

To assess the model’s performance, an initial test used several of the same prompts previously given to Kuaishou’s Kling AI.

Video generated by Hailuo AI in response to a prompt requesting a “cute kitten eating lunch like a human.”

Video generated by Hailuo AI in response to a prompt written in Chinese, also requesting a “cute kitten eating lunch like a human.” As with Kling AI, the result did not differ significantly from the version generated in response to the English prompt.

The kitten prompts produced no significant difference in rendering quality between English and Chinese, but Kling AI handled the unconventional request with notably greater precision. In KrASIA’s previous test, Kling AI portrayed the kitten using utensils, a closer approximation of how a human would eat. Hailuo AI’s result, by contrast, depicted the cat eating much as a cat naturally would.

However, in a head-to-head comparison on the prompt “realistic puppy driving a car,” Hailuo AI delivered the more realistic interpretation. Its result showed a puppy at the wheel with smooth, believable motion, though it was less imaginative in anthropomorphizing the puppy driver.

Video generated by Hailuo AI in response to a prompt requesting a “realistic puppy driving a car.”

In addition, various media reports suggest that Hailuo AI’s key strength is its ability to render human-like movements with a high degree of realism. To check this claim, Hailuo AI was given two prompts involving complex movement: “astronauts repairing a space station orbiting Earth” and “medieval knights in combat.”

As with the puppy driver prompt, Hailuo AI was somewhat conservative in interpreting the astronauts’ movements, though the results were satisfactory considering the early stage of text-to-video AI generation.

Video generated by Hailuo AI in response to a prompt requesting “astronauts repairing a space station orbiting Earth.”

However, when generating the knights video, Hailuo AI seemed to struggle with the complexity of the scene. A knight appeared out of nowhere, seemingly emerging from the back of another character, and the knights’ movements were not sufficiently coherent.

Video generated by Hailuo AI in response to a prompt requesting “medieval knights in combat.”

To see whether a more specific prompt could improve results, a follow-up prompt, “two medieval knights in combat,” was tried. The results were better but still fell short: some movements felt overly repetitive, and the pace was far too fast to pass as realistic.

Video generated by Hailuo AI in response to a prompt requesting “two medieval knights in combat.”

The current version of Hailuo AI generates six-second video clips at 1280×720 resolution and 25 frames per second. The short clip duration is a limitation, though MiniMax has promised to address it in future updates. A new iteration of Hailuo AI is already in development and is expected to offer longer clips and add features such as image-to-video conversion, which Kling AI already provides.

Founded in 2021 by Yan Junjie, former vice president and head of general AI technology at SenseTime, MiniMax has quickly gained a foothold in the AI industry. In March this year, reports emerged that the company had raised at least USD 600 million from investors including HongShan, with Alibaba expected to lead the round. This adds to the USD 250 million MiniMax raised in the middle of last year from Tencent and other investors. The latest Alibaba-led round is said to value MiniMax at more than USD 2.5 billion.

MiniMax was also among the initial companies approved by Beijing to offer large language models (LLMs) for public use in August 2023, placing it among the leading AI firms in China as the country accelerates its focus on advancing AI capabilities.

Aside from Hailuo AI, MiniMax offers a range of AI solutions, including speech and language generation. Its Talkie AI app has been downloaded millions of times and was dubbed one of the “hottest entertainment apps” in the US by The Wall Street Journal.

With strong financial backing and a growing portfolio of AI tools, MiniMax’s advances will increase pressure on other companies in the AI race. As further refinements to Hailuo AI are made, it remains to be seen how far MiniMax can go.
