A Conversational Speech Generation Model
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Open Multilingual Multimodal Chat LMs
An Open Bilingual Chat LLM
Janus-Series: Unified Multimodal Understanding and Generation Models
tiktoken is a fast BPE tokeniser for use with OpenAI's models
The ChatGPT Retrieval Plugin lets you easily find personal
CLIP: predict the most relevant text snippet given an image
JetBrains’ 4B parameter code model for completions
Dia-1.6B generates lifelike English dialogue and vocal expressions
Tencent’s 36-language state-of-the-art translation model
OpenAI’s compact 20B open model for fast, agentic, and local use
State-of-the-art RL-trained coding agent for complex SWE tasks
CTC-based forced aligner for audio-text in 158 languages
Mirror of Ultralytics YOLO-World model weights for object detection
Speaker segmentation model for 10s audio chunks with powerset labels
Open-weight, large-scale hybrid-attention reasoning model
Vision-language-action model for robot control via images and text
Detects speech activity in audio using pyannote.audio 2.1 pipeline
Time series forecasting model using T5 architecture with 46M params