A high-throughput and memory-efficient inference and serving engine
A modular graph-based Retrieval-Augmented Generation (RAG) system
Tensor search for humans
Run Llama 2 inference in one file of pure C
AI-powered CLI git wrapper, boilerplate code generator, chat history
Experimental search engine for conversational AI, such as parl.ai