An opinionated CLI to transcribe audio files with Whisper on-device
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Paste Markdown and AI responses into Word and Excel instantly
Multilingual speech recognition and audio understanding model
An extensive node suite that enables ComfyUI to process 3D inputs
LLM-based agent for general purpose software engineering tasks
Supercharge Your LLM with the Fastest KV Cache Layer
Agent toolkit providing semantic retrieval and editing capabilities
A Production-ready Reinforcement Learning AI Agent Library
Technical principles related to large models
The largest collection of PyTorch image encoders / backbones
A library for accelerating Transformer models on NVIDIA GPUs
Hummingbird compiles trained ML models into tensor computation
Book about interpretable machine learning
GPT-4V-level open-source multi-modal model based on Llama3-8B
Build production-ready AI agents in both Python and TypeScript
Zero-code platform for building AI agents from natural language input
MCP server enabling AI agents to control and automate Windows OS
Framework for building real-time voice and multimodal AI agents
HivisionIDPhotos: a lightweight and efficient AI ID photo tool
Designed for training LLM/VLM agents via RL
Build AI WhatsApp Bots with Pure Python
Bringing BERT into modernity via both architecture changes and scaling
A lightweight framework for building LLM-based agents
SDG is a specialized framework