Free, high-quality text-to-speech API endpoint to replace OpenAI
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
LLM-based agent for general purpose software engineering tasks
Powering Amazon custom machine learning chips
Inference script for Oasis 500M
Document Image Parsing via Heterogeneous Anchor Prompting
Framework for building neural networks
StreamSpeech is a seamless model for offline speech recognition
Toolkit for audio, music, and speech generation
Advanced techniques for RAG systems
The best ChatGPT that $100 can buy
A secure sandbox environment for malware developers and red teamers
A Model Context Protocol server for searching and analyzing arXiv
4M: Massively Multimodal Masked Modeling
Guiding Instruction-based Image Editing via Multimodal Large Language Models
Refer and Ground Anything Anywhere at Any Granularity
Supercharge Your LLM with the Fastest KV Cache Layer
A Model Context Protocol (MCP) Gateway & Registry
The official Meta Llama 3 GitHub site
Utilities intended for use with Llama models
Agent toolkit providing semantic retrieval and editing capabilities
Open-source platform for building enterprise-grade agents
FAIR Sequence Modeling Toolkit 2
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
ICLR2024 Spotlight: curation/training code, metadata, distribution