Open-source deep-learning framework
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Official repository for LTX-Video
Project Lyra: Open Generative 3D World Models
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Continuous Autonomy for the AI SDK
Achieving 3x+ generation speedup on reasoning tasks
Text and image to video generation: CogVideoX and CogVideo
PyTorch code and models for the DINOv2 self-supervised learning method
Video understanding codebase from FAIR for reproducing video models
4M: Massively Multimodal Masked Modeling
Controllable & emotion-expressive zero-shot TTS
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Chat & pretrained large audio language model proposed by Alibaba Cloud
Real-time behaviour synthesis with MuJoCo, using Predictive Control
GLM-130B: An Open Bilingual Pre-Trained Model (ICLR 2023)
Implementation of model parallel autoregressive transformers on GPUs
T5-Small: Lightweight text-to-text transformer for NLP tasks
Tencent’s 36-language state-of-the-art translation model