Wan: New Open-Source Models Advance AI Video Generation

AI-powered video generation has advanced rapidly in recent years. With Wan, a research team now presents a series of open-source models that set new standards for performance, versatility, and accessibility.
Wan builds on the established diffusion-transformer paradigm and introduces several innovations: a novel VAE (Variational Autoencoder) for compressing video into a latent space, scalable pre-training strategies, large-scale curated datasets, and automated evaluation metrics. Together, these enable improved performance across a broad range of applications.
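To make the role of the video VAE concrete: a diffusion transformer denoises a compressed latent, not raw pixels. The plain-Python sketch below illustrates how much a causal video VAE shrinks the tensor the transformer must process. The compression factors used here (4x temporal, 8x spatial, 16 latent channels) are illustrative assumptions in the spirit of common video VAEs, not Wan's published specification.

```python
# Toy illustration of latent-space compression by a causal video VAE.
# The strides and channel count below are illustrative assumptions,
# not Wan's published architecture figures.

def latent_shape(num_frames, height, width,
                 t_stride=4, s_stride=8, latent_channels=16):
    """Return (T', H', W', C) of the latent a diffusion transformer denoises.

    A causal VAE keeps the first frame intact and downsamples the remaining
    frames temporally, so T' = 1 + (num_frames - 1) // t_stride.
    """
    t = 1 + (num_frames - 1) // t_stride
    return (t, height // s_stride, width // s_stride, latent_channels)

if __name__ == "__main__":
    frames, height, width = 81, 480, 832   # an example clip size in RGB pixels
    latent = latent_shape(frames, height, width)
    pixel_values = frames * height * width * 3
    latent_values = latent[0] * latent[1] * latent[2] * latent[3]
    print(latent)  # (21, 60, 104, 16)
    print(f"compression: {pixel_values / latent_values:.1f}x fewer values")
```

Working in this much smaller latent space is what makes training and sampling billion-parameter video diffusion models tractable at all.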
Performance and Scalability
The flagship of the Wan series, the 14B model with 14 billion parameters, was trained on billions of images and videos. The results demonstrate how well video generation models scale with data and model size. In both internal and external benchmarks, Wan outperforms existing open-source models as well as commercial solutions.
Versatility and Applications
Wan offers two main models: a 1.3B model and the aforementioned 14B model. The smaller model is particularly resource-efficient, running on consumer graphics cards with as little as 8.19 GB of VRAM, while the larger model maximizes generation quality. Wan's applications are diverse and include, among others, generating videos from images (image-to-video), instruction-guided video editing, and creating personalized videos. In total, Wan covers up to eight different tasks.
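A rough back-of-envelope calculation makes the resource difference between the two models plausible. The sketch below counts only the weight memory, assuming half-precision (2 bytes per parameter); these assumptions are mine for illustration. Activations, the text encoder, and the VAE add further overhead, which is why the real 1.3B footprint is the quoted 8.19 GB rather than just the weight memory.

```python
# Back-of-envelope weight-memory estimate: parameters x bytes per parameter.
# Assumes 16-bit (2-byte) weights, an illustrative assumption; activation,
# text-encoder, and VAE memory are not modeled and come on top.

def weight_gib(num_params, bytes_per_param=2):
    """Memory for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30

for name, params in [("Wan 1.3B", 1.3e9), ("Wan 14B", 14e9)]:
    print(f"{name}: ~{weight_gib(params):.1f} GiB of weights at 16-bit precision")
```

The 14B model's weights alone exceed the memory of most consumer GPUs, which is why it targets maximum quality rather than accessibility.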
Open Source and Community
A central aspect of Wan is the complete release of the source code and all models. This openness is meant to accelerate progress in the video generation community and open up new creative possibilities for video production. Research also benefits from access to high-quality video foundation models.
Outlook
Wan represents an important step towards powerful and accessible AI video generation. The open-source nature of the project allows the community to build upon the existing models, further develop them, and explore new applications. The combination of scalable architecture, efficient implementation, and open availability makes Wan a promising tool for the future of video production.