GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image
NetEase Youdao's open-source embedding and reranker models
Visual Causal Flow
Open-source multi-speaker long-form text-to-speech model
An AI-powered security review GitHub Action using Claude
Qwen-Image-Layered: Layered Decomposition for Inherent Editability
A Powerful Native Multimodal Model for Image Generation
Audio foundation model excelling in audio understanding
Video understanding codebase from FAIR for reproducing video models
Multimodal Diffusion with Representation Alignment
Foundational Models for State-of-the-Art Speech and Text Translation
New family of code large language models (LLMs)
The ChatGPT Retrieval Plugin lets you easily find personal documents
Encoder of greater-than-word length text trained on a variety of data
Official code for Style Aligned Image Generation via Shared Attention
Let us control diffusion models
Code release for "Masked-attention Mask Transformer
PyTorch implementation of MAE
Compact English sentence embedding model for semantic search tasks
Efficient English embedding model for semantic search and retrieval
BGE-Large v1.5: High-accuracy English embedding model for retrieval
Advanced bilingual image editing with semantic control
Custom BLEURT model for evaluating text similarity using PyTorch