Geely Auto, StepFun to Open Source Two Advanced AI Models for Global Developers

Huxin Luo in Business, News, Technology, Top News

Chongqing - Geely Auto Group (Geely Auto), a Chinese new energy vehicle (NEV) maker, and StepFun, a Chinese generative AI model developer, announced on February 18 that they will open-source two jointly developed large multimodal models for global developers.

Geely Auto and StepFun have partnered in AI model development. Geely Auto focuses on scene design and evaluation, while StepFun leads pretraining. The open-sourced models are Step-Video-T2V (video generation) and Step-Audio (voice interaction).

Football match footage generated by Step-Video-T2V. (Photo/Geely Auto)

Geely Auto claims that Step-Video-T2V can generate videos of up to 204 frames at 540P resolution. With 30 billion parameters, it surpasses Tencent’s 13-billion-parameter HunYuan-Video model, which the company says makes it the largest video generation model globally.

Step-Audio, the industry's first open-source product-level voice interaction model, can generate emotional tones and personalized styles based on different scene requirements. It also supports voice replication and role-playing, making it suitable for applications in film and television, social interactions, and gaming.

As an automaker, Geely Auto has long prioritized the development of AI technology, particularly in smart vehicle AI. To this end, Geely Auto plans to integrate these two open-source models into its development of intelligent vehicles, boosting its advancements in smart driving and human-vehicle interaction.

Geely Auto says it has integrated Step-Video-T2V and Step-Audio with its proprietary WiseStar AI large model, which covers full-scenario applications.

Step-Video-T2V acts as a "virtual driving test field" for WiseStar AI's autonomous driving system, generating diverse driving-scenario videos based on real-world driving data. After optimization by WiseStar AI, these videos are converted into training data, giving the autonomous driving system a continuous supply of new training scenarios.

Step-Audio will improve human-vehicle interaction by enhancing voice command responsiveness and emotional understanding. It will also enable WiseStar AI to optimize navigation and offer personalized voice options for a more engaging experience.

The Galaxy Starship 7 EM-i model is equipped with Geely Auto's latest AI technology. (Photo/Geely Auto)

In addition to the two open-source models, Geely Auto is also using DeepSeek to train the WiseStar AI model's function-calling (FunctionCall) and proactive user-interaction capabilities, helping the model better understand vague user intentions, anticipate potential needs based on the vehicle's environment, and proactively respond to users.
