Alibaba has released its image- and video-generating AI model, Wan 2.1, as open source, making its code and architecture publicly accessible for use, modification, and development.
The company introduced four versions—T2V-1.3B, T2V-14B, I2V-14B-720P, and I2V-14B-480P—to improve accuracy in image and video generation. The models support text-to-video, image-to-video, video editing, text-to-image, and video-to-audio applications.
Alibaba’s decision aligns with a similar move by Chinese firm DeepSeek, which has also open-sourced AI models. Wan 2.1, first launched in January under the name Wanx, is expected to compete directly with OpenAI’s Sora.
The “14B” designation indicates models with 14 billion parameters, enabling them to process larger datasets and generate more refined visuals. The two I2V-14B versions generate videos at 720P and 480P respectively, while T2V-14B supports both resolutions and is the only model capable of rendering text in both Chinese and English within generated videos.
The T2V-1.3B is optimized for consumer hardware, requiring only 8.19 GB of VRAM; on an RTX 4090, it generates a five-second 480P video in about four minutes. In addition to Wan 2.1, Alibaba previewed QwQ-Max, a reasoning model that will be open-sourced upon its full release.
The company also announced plans to invest 380 billion yuan (about $52 billion) over the next three years to strengthen its AI and cloud computing infrastructure.