CLI tool for configuring and monitoring Claude Code
Long-form streaming TTS system for multi-speaker dialogue generation
No-code LLM Platform to launch APIs and ETL Pipelines
The power of Claude Code / GeminiCLI / CodexCLI
Fast and Universal 3D reconstruction model for versatile tasks
4M: Massively Multimodal Masked Modeling
This repository contains the official implementation of FastVLM
ICLR 2024 Spotlight: curation/training code, metadata, distribution
A PyTorch library for implementing flow matching algorithms
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Memory-efficient and performant finetuning of Mistral's models
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Diffusion Transformer with Fine-Grained Chinese Understanding
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
A GPT-4o-Level MLLM for Vision, Speech and Multimodal Live Streaming
Benchmarking Multimodal Agents for Open-Ended Tasks
An industrial-grade federated learning framework
A simple tool for reading in poorly redacted documents
Open 3D Engine (O3DE) is an Apache 2.0-licensed multi-platform 3D
Korvus is a search SDK that unifies the entire RAG pipeline
Open-source Agent OS for hardware intelligence
All-in-one WebUI for AI generative image and video creation
Benchmark LLMs by fighting in Street Fighter 3
Repo of the Qwen2-Audio chat & pretrained large audio language models
Automated translation solution for visual novels