An experimental version of the DeepSeek model
Large Multimodal Models for Video Understanding and Editing
Ling is a MoE LLM provided and open-sourced by InclusionAI
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
State-of-the-art (SoTA) text-to-video pre-trained model
Release for Improved Denoising Diffusion Probabilistic Models
A Conversational Speech Generation Model
AI Suite for upscaling, interpolating & restoring images/videos
Code release for the ConvNeXt V2 model
Reference implementation of the Transformer architecture optimized
Code release for "Masked-attention Mask Transformer
Code for the paper "Improved Techniques for Training GANs"
Tencent’s 36-language state-of-the-art translation model