Chinese and English multimodal conversational language model
Easy-to-use and powerful NLP library with Awesome model zoo
Tensor search for humans
NLP Cloud serves high-performance pre-trained or custom models for NER
Towards Real-World Vision-Language Understanding
AutoGluon: AutoML for Image, Text, and Tabular Data
State-of-the-art diffusion models for image and audio generation
Chat & pretrained large vision language model
High-Resolution 3D Assets Generation with Large Scale Diffusion Models
Implementation of Make-A-Video, a new SOTA text-to-video generator
Accurate × Fast × Comprehensive
Official Python inference and LoRA trainer package
Fast-stable-diffusion + DreamBooth
Dealing with all unstructured data, such as reverse image search
Parse files for optimal RAG
ComfyUI wrapper nodes for WanVideo and related models
21 Lessons, Get Started Building with Generative AI
"Big Model" trains a visual multimodal VLM with 26M parameters
Implementation of "MobileCLIP" (CVPR 2024)
Multimodal embedding and reranking models built on Qwen3-VL
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System
Large-language-model & vision-language-model based on Linear Attention
Unified Multimodal Understanding and Generation Models
Multilingual sentence & image embeddings with BERT
Free, high-quality text-to-speech API endpoint to replace OpenAI