Helping you get the most out of AWS, wherever you use MCP
100–200× Acceleration for Video Diffusion Models
RGBD video generation model conditioned on camera input
Designed for text embedding and ranking tasks
HY-Motion model for 3D character animation generation
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
OCR expert VLM powered by Hunyuan's native multimodal architecture
High-Fidelity and Controllable Generation of Textured 3D Assets
AIMET is a library that provides advanced quantization and compression techniques
A Universal Customization Method for Single and Multi Conditioning
Foundational model for human-like, expressive TTS
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Geographic add-ons for Django REST Framework
GUI Exploration Lab. One of the best GUI agent solutions
StreamSpeech is a seamless model for offline speech recognition
Repo of Qwen2-Audio chat & pretrained large audio language model
GPT4V-level open-source multi-modal model based on Llama3-8B
Ollama Python library
PyTorch extensions for fast R&D prototyping and Kaggle farming
Framework for building neural networks
This repository contains the official implementation of FastVLM
The Open Source Cowork Desktop to Unlock Your Exceptional Productivity
Drop in a screenshot and convert it to clean code
Prompt Declaration Language is a declarative prompt programming language
Interpretable prompting and models for NLP