Chinese AI company Shengshu Technology has upgraded its AI video generation tool, Vidu, adding the ability to create videos from images. The Beijing-based company said on Wednesday that the new feature can combine up to three separate images into a single cohesive video clip, which it says offers a new level of visual consistency. Vidu, initially launched in April, previously let users create 8-second videos from text prompts, and it quickly went viral on platforms like TikTok for its ability to turn photos into lifelike scenes.
Shengshu’s co-founder and CEO Jiayu Tang said Vidu is already gaining traction among advertisers, animators, and businesses. He said customers typically spend 100,000 yuan to 1 million yuan ($13,871 to $138,711) per month, a sign of the tool’s revenue potential in the AI content creation industry. Other AI platforms have developed text-to-video or image-to-video tools, but the quality and coherence of their output vary widely. Shengshu claims a breakthrough in “visual consistency,” meaning it can integrate distinct images into smooth, realistic video scenes. For example, Vidu can merge images of a person, a shirt, and a moped into a video in which the person appears to wear the shirt and ride the moped through a chosen setting.
The release challenges OpenAI’s upcoming model, Sora, which was announced in February with the promise of generating one-minute videos from text. Sora has yet to be publicly released, however, which gives Vidu’s new capabilities a head start for now. Shengshu’s chief technology officer, Fan Bao, described visual consistency as a key technical hurdle, one the team prioritized from the start. Bao said Vidu produces high-quality videos that avoid the disjointedness often seen in other AI video generators.
On potential copyright issues, Tang said Shengshu’s team is open to agreements under which the AI can replicate an artist’s style for commercial purposes, and he noted that he has not encountered significant legal cases related to image-based content. Vidu’s platform prohibits the public from using images of celebrities or “sensitive” individuals, and it bans explicit or violent content. For personal images, Shengshu says it adheres to global data protection regulations and destroys user data after processing.
Shengshu Technology was founded last year with support from prominent backers, including Baidu Ventures, Ant Group (an affiliate of Alibaba), Zhipu AI, Qiming Venture Partners, and the city of Beijing. Its AI operations run on rented cloud servers in China and internationally. With Vidu’s latest upgrade, Shengshu is positioning itself as a competitor in the global AI video generation market.
