Generate Any 3D Scene in Seconds
MARS5 speech model (TTS) from CAMB.AI
Large language model and vision-language model based on linear attention
Foundational model for human-like, expressive TTS
Multi-lingual large voice generation model, providing inference
Towards Real-World Vision-Language Understanding
Real-time voice interactive digital human
Documentation for Google's Gen AI site - including Gemini API & Gemma
Build Vision Agents quickly with any model or video provider
Concatenate a directory full of files into a single prompt
Provides CTP stock options and Zhongtai Securities XTP
Implementation of Make-A-Video, a new SOTA text-to-video generator
Guiding Instruction-based Image Editing via Multimodal Large Language Models
Diffusion Transformer with Fine-Grained Chinese Understanding
Sample code and notebooks for Generative AI on Google Cloud
Flexible Photo Recrafting While Preserving Your Identity
Bailing is a voice dialogue robot similar to GPT-4o
Toolkit for audio, music, and speech generation
Open source terminal session recorder
Large Multimodal Models for Video Understanding and Editing
One-click deployment (including offline integration package)
dude (uncomplicated data extraction): A simple framework
Framework for building neural networks
A single Gradio + React WebUI with extensions for ACE-Step
Curl cryptocurrency exchange rates