How to optimize some algorithms in CUDA
MobileLLM: Optimizing Sub-billion Parameter Language Models
Real-time multi-AI collaboration: Claude, Codex & Gemini
Cybersecurity AI (CAI), the framework for AI Security
Hypernetworks that adapt LLMs for specific benchmark tasks
Specify a GitHub or local repo, GitHub pull request
The SOTA Open-Source Browser Agent
Inference code for CodeLlama models
AI-Powered Data Processing: Use LOTUS to process all of your datasets
Performance-optimized AI inference on your GPUs
Run LLM prompts from your shell
Qwen3-Coder is the code version of Qwen3
Implement a CPU from scratch and play with large model deployments
Inference Llama 2 in one file of pure C
AI-Driven Exploration in the Space of Code
Collect, organize, use, and share, all in OmniBox
ChatGLM2-6B: An Open Bilingual Chat LLM
Streamlines and simplifies prompt design for both developers
SimpleMem: Efficient Lifelong Memory for LLM Agents
ChatGLM3 series: Open Bilingual Chat LLMs
Anomaly detection related books, papers, videos, and toolboxes
GLM-4 series: Open Multilingual Multimodal Chat LMs
Designed for text embedding and ranking tasks
Unify Efficient Fine-tuning of RAG Retrieval, including Embedding
Did you say you like data?