Accelerate local LLM inference and finetuning
Advanced techniques for RAG systems
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Control Gmail, Google Calendar, Docs, Sheets, Slides, Chat, Forms
A course on LLM inference serving on Apple Silicon
GPT-4V-level open-source multi-modal model based on Llama3-8B
Multilingual sentence & image embeddings with BERT
Open-weight, large-scale hybrid-attention reasoning model
Traditional Mandarin LLMs for Taiwan
Autoregressive Model Beats Diffusion
Constrained Value Alignment via Safe Reinforcement Learning
An efficient forwarding service designed for LLMs
Inference Llama 2 in one file of pure C
Inference code for CodeLlama models
Committed to building an open, public welfare
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
A Pioneering Open-Source Alternative to GPT-4o
Open-source, high-performance Mixture-of-Experts large language model
Chat & pretrained large audio language model proposed by Alibaba Cloud
Unify Efficient Fine-tuning of RAG Retrieval, including Embedding
Chat & pretrained large vision language model
Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project
Ray Aviary - evaluate multiple LLMs easily
A repository that contains models, datasets, and fine-tuning