If you follow any AI influencers or creators on social media, there’s a good chance you may have seen them more excited than usual lately about a new AI video generation model called “Kling.”
The videos it generates from text prompts and a handful of configurable in-app settings look incredibly realistic, on par with Sora, OpenAI’s still non-public, invitation-only AI video model. OpenAI has shared Sora with a small group of artists and filmmakers in a closed beta while it tests the model and its adversarial (read: risky, objectionable) uses.
In fact, Kling even posted a video on its YouTube channel yesterday imitating one of the first third-party videos generated with Sora, “air head” by the creative agency shy kids.
Embedded below is Kling’s video, “a day with the Balloon Man”:
And below that is “air head” made by shy kids with OpenAI’s Sora:
But where did Kling come from? What does it offer? And how can you get your hands on it? Read on to find out.
What is Kling and where did it come from?
The South China Morning Post (SCMP) newspaper and website reported that the new AI video model was developed by Kuaishou Technology, the maker of Kuaishou, the second most popular short video creation and viewing app in China (branded Kwai outside of the country), with 400 million daily active users (DAUs).
That puts Kuaishou/Kwai just behind Douyin, the Chinese version of TikTok from ByteDance, which counts 600 million DAUs.
As such, Kling will likely be of immediate interest to users in China and encourage them to check out Kuaishou so they can get their hands on the compelling new video model, giving the app a boost in its battle for users against Douyin.
Here’s how SCMP describes it:
“The Kling AI Model, which is in the trial stage, can process text into video clips up to 2 minutes long with 1080p resolution, supporting various aspect ratios, app operator Kuaishou Technology said. It can interpret prompts to generate videos that mimic the physical world and create imaginative scenes from text instructions, it added.”
According to several sources cited by Perplexity, the Kling AI model:
“employs a unique 3D Variational Autoencoder (VAE) for face and body reconstruction, enabling detailed expression and limb movement from a single full-body image. This technology is further enhanced by a 3D spatiotemporal joint attention mechanism, which allows the model to handle complex scenes and movements, ensuring that generated content adheres to the laws of physics”
How can you use Kling and how much does it cost?
Kling can be accessed for free through the apps Kuaishou, Kwai and KwaiCut (the latter a video editing competitor to TikTok’s CapCut).
Unfortunately for those outside of China, users report you need a Chinese phone number in order to download and access the model.
However, Justine Moore, a partner at venture capital firm a16z, posted a workaround on X that involves using a burner phone number and entering it through the KwaiCut app, though she cautioned she wasn’t sure it was advisable:
U.S. filmmaker Dustin Hollywood also suggested using ChatGPT to translate the app’s menus and screens.
What is Kling capable of and good for?
According to videos posted by early users, the new AI video generation model is good for creating a wide range of immersive, realistic, and detailed video in high resolution, including everything from realistic action scenes…
…to those mimicking first-person shooter video games…
…and high fantasy series like House of the Dragon or Game of Thrones.
Dustin Hollywood reports it takes around 2 minutes to generate a video based on an “intermediate” level complexity text prompt. However, as he notes, the generator struggles with accurately depicting race/skin color (a similar issue caused Google’s Gemini AI image generation capability to be openly mocked and reviled by prominent Silicon Valley libertarians and conservatives).
Yet these issues aside, it’s clear Kling is a powerful new AI tool, and it already has filmmakers such as Dustin Hollywood reassessing their opinion of Sora and questioning OpenAI’s invitation-only release strategy so far.
Will Kling pressure U.S.-based AI video model providers such as OpenAI, Runway, and Pika to step up the quality of their generations and resolution? It seems highly likely. Now, whether they can match Kling in short order is an open question.
But either way, if you’re interested in AI filmmaking or filmmaking in general, the arrival of Kling is definitely something worth getting excited about. Hopefully, a full U.S. release without requiring users to have Chinese phone numbers is not far off.