CLIP: predict the most relevant text snippet given an image
Open source platform for the machine learning lifecycle
Provides CTP stock options and Zhongtai Securities XTP
Get a ChatGPT plugin up and running in under 5 minutes
MemU is an open-source memory framework for AI companions
Benchmarking Multimodal Agents for Open-Ended Tasks
RL research on Android devices
The no-nonsense RAG chunking library
Easily turn large sets of image URLs into an image dataset
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
End-to-end speech processing toolkit
Sample code and notebooks for Generative AI on Google Cloud
A minimal yet professional single agent demo project
Open-source large language model family from Tencent Hunyuan
SAPIEN Manipulation Skill Framework
Collection of reference environments, offline reinforcement learning
Simple and easily configurable grid world environments
OpenDILab Decision AI Engine
MARS5 speech model (TTS) from CAMB.AI
Real-time, voice-interactive digital human
One-click deployment (including offline integration package)
Towards Human-Level Text-to-Speech through Style Diffusion
A TTS model capable of generating ultra-realistic dialogue
Mixture-of-Experts Vision-Language Models for Advanced Multimodal
Renderer for the harmony response format to be used with gpt-oss