A tool for learning vector representations of words and entities
Free, high-quality text-to-speech API endpoint to replace OpenAI
Flexible Photo Recrafting While Preserving Your Identity
ComfyUI integration for Microsoft's VibeVoice text-to-speech model
Deploy and share agents with open infrastructure
A state-of-the-art open visual language model
Refactoring ChatBot+LLM, GPT-3.5-turbo, ChatGPT Bot/Voice Assistant
Django-friendly finite state machine support
A set of Docker images for training and serving models in TensorFlow
A multi-function Discord bot
"Big Model" trains a visual multimodal VLM with 26M parameters
A chatbot built on a large model
Collection of reference environments for offline reinforcement learning
Simple and easily configurable grid world environments
The most powerful Android RPA agent framework
Implementation of "MobileCLIP" (CVPR 2024)
Code release for Cut and Learn for Unsupervised Object Detection
VMZ: Model Zoo for Video Modeling
High-resolution models for human tasks
Towards Real-World Vision-Language Understanding
Code for the paper "Evaluating Large Language Models Trained on Code"
CLIP: predict the most relevant text snippet given an image
Ling is a MoE LLM provided and open-sourced by InclusionAI
Personalize Any Characters with a Scalable Diffusion Transformer
Stable Virtual Camera: Generative View Synthesis with Diffusion Models