Real-World Centric Foundation GUI Agents
An experimental version of the DeepSeek model
Text- and image-to-video generation: CogVideoX (2024) and CogVideo
Models for object and human mesh reconstruction
A sound cloning tool with a web interface, using your voice
A Powerful Native Multimodal Model for Image Generation
Generate short videos with one click using an AI LLM
⚡ Building applications with LLMs through composability ⚡
Open Source Document Management System for Digital Archives
Image polygonal annotation with Python
Easily turn large sets of image URLs into an image dataset
Label Studio is a multi-type data labeling and annotation tool
A Python tool that uses GPT-4, FFmpeg, and OpenCV
Powerful tool that lets you create and run intelligent agents
ChatGLM-6B: An Open Bilingual Dialogue Language Model
LLM-based autonomous agent that does comprehensive online research
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
State-of-the-art TTS model under 25MB
Qwen-Image is a powerful image generation foundation model
The official gpt4free repository
Let's make video diffusion practical
Offline text-to-speech synthesis for Python
Generate audiobooks from EPUBs, PDFs and text with captions
Qwen2.5-VL is the multimodal large language model series
ChemCrow