Renderer for the harmony response format to be used with gpt-oss
Interpretable prompting and models for NLP
Entity-relationship diagram generation tool
The 100 line AI agent that solves GitHub issues
Reflexion: Language Agents with Verbal Reinforcement Learning
Test-Time Reinforcement Learning
How to optimize some algorithms in CUDA
[NeurIPS2025 Spotlight] Quantized Attention
Generative AI reference workflows
A list of free LLM inference resources accessible via API
A frontier, first-principles handbook
A New Axis of Sparsity for Large Language Models
"Big Model" trains a visual multimodal VLM with 26M parameters
Spanish-language course repository that teaches fundamentals of SQL
LLM training in simple, raw C/CUDA
K8s-mcp-server is a Model Context Protocol (MCP) server
Collect, organize, use, and share, all in OmniBox
Low-latency REST API for serving text-embeddings
Central interface to connect your LLMs with external data
Bridging Reasoning and Action Prediction
OpenRecall is a fully open-source, privacy-first alternative
AudioMuse-AI is an Open Source Dockerized environment
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
Tom Preston-Werner's obvious, minimal language
A framework that facilitates all stages of LLM development