LingBot-World is an open-source, high-fidelity world simulator that advances world modeling through video generation. Built on Wan2.2, it simulates realistic, dynamic environments across diverse styles, including real-world, scientific, and stylized domains. It maintains long-term temporal consistency, keeping scenes and interactions coherent over minute-level horizons. It also supports real-time interaction with sub-second latency at 16 FPS, which makes it well suited to interactive applications and rapid experimentation. Both the code and the models are publicly released to help narrow the gap between closed and open world-model systems. Target areas include content creation, gaming, robotics, and embodied AI.
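To make the stated numbers concrete, the short sketch below works out what 16 FPS implies over a minute-long rollout. It is plain arithmetic on the figures quoted above, not a description of how LingBot-World schedules generation. The `FrameBudget` class and the 60-second horizon are illustrative assumptions standing in for "minute-level".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameBudget:
    """Hypothetical helper: frame arithmetic for a fixed-rate rollout."""

    fps: int          # target frame rate (16 in the description above)
    seconds: float    # rollout length; 60 s is an assumed "minute-level" value

    @property
    def frame_interval_ms(self) -> float:
        # Average time available per frame to sustain the target rate.
        return 1000.0 / self.fps

    @property
    def total_frames(self) -> int:
        # Number of frames that must stay mutually consistent over the rollout.
        return round(self.fps * self.seconds)


budget = FrameBudget(fps=16, seconds=60.0)
print(f"{budget.frame_interval_ms} ms per frame")   # 62.5 ms per frame
print(f"{budget.total_frames} frames per minute")   # 960 frames per minute
```

Note that frame rate and latency are separate quantities: the per-frame interval bounds sustained throughput, while the sub-second latency figure describes how quickly the output responds to an input.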
Features
- High-fidelity world simulation across realistic, scientific, and stylized environments
- Long-term memory with minute-level temporal consistency
- Real-time interactive generation with sub-second latency at 16 FPS
- Image-to-video generation with optional camera pose and action control signals
- Support for multiple resolutions, including 480P and 720P output
- Scalable inference using distributed GPU execution (torchrun, FSDP)
- Open-source code and publicly released pretrained models
- Designed for research and applications in AIGC, gaming, and robot learning
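
For readers unfamiliar with the distributed execution mentioned in the list above, the following minimal sketch shows how a script launched by `torchrun` can discover its place in the job. `torchrun` exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` to every worker process. The `LaunchInfo` class and `read_launch_info` function are hypothetical names used for illustration only and are not part of the LingBot-World codebase, whose actual entry points and flags may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LaunchInfo:
    """Hypothetical container for the values torchrun exports per worker."""

    rank: int        # global index of this process across all nodes
    local_rank: int  # index of this process on its node (commonly the GPU index)
    world_size: int  # total number of processes in the job


def read_launch_info(env=None) -> LaunchInfo:
    """Read torchrun's environment variables, defaulting to a single process."""
    env = os.environ if env is None else env
    return LaunchInfo(
        rank=int(env.get("RANK", 0)),
        local_rank=int(env.get("LOCAL_RANK", 0)),
        world_size=int(env.get("WORLD_SIZE", 1)),
    )


if __name__ == "__main__":
    info = read_launch_info()
    print(f"process {info.rank} of {info.world_size}, local device {info.local_rank}")
```

Saved under an assumed file name such as `launch_info.py`, this could be started on one 8-GPU node with `torchrun --nproc_per_node=8 launch_info.py`. Run with plain `python`, it falls back to a single-process configuration.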