Showing 53 open source projects for "optimization"

  • 1
    AIDE ML

    AI-Driven Exploration in the Space of Code

    ...AIDE ML is packaged as a Python toolkit with built-in utilities such as command-line tools, configuration presets, and visualization interfaces that allow researchers to observe how the search process evolves. The framework is designed for experimentation and academic research into automated programming and machine learning optimization.
    Downloads: 0 This Week
  • 2
    GLM-5.1

    GLM-5: From Vibe Coding to Agentic Engineering

    GLM-5.1 is a next-generation large language model developed by Z.ai for advanced coding, reasoning, and long-horizon agentic engineering tasks. Built as the successor to GLM-5, the model significantly improves performance in software engineering benchmarks, repository generation, and real-world terminal-based workflows. GLM-5.1 is designed to remain effective over extended problem-solving sessions, allowing it to iteratively refine strategies, analyze failures, and sustain productivity...
    Downloads: 69 This Week
  • 3
    Heretic

    Fully automatic censorship removal for language models

    Heretic is an open-source Python tool that automatically removes the built-in censorship or “safety alignment” from transformer-based language models so they respond to a broader range of prompts with fewer refusals. It works by applying directional ablation techniques and a parameter optimization strategy to adjust internal model behaviors without expensive post-training or altering the core capabilities. Designed for researchers and advanced users, Heretic makes it possible to study and experiment with uncensored model responses in a reproducible, automated way. The project can decensor many popular dense and some mixture-of-experts (MoE) models, supporting workflows that would otherwise require manual tuning. ...
    Downloads: 12 This Week
  • 4
    RLHF-Reward-Modeling

    Recipes to train reward model for RLHF

    ...The repository provides training recipes and implementations for building reward and preference models using modern machine learning frameworks. It supports multiple optimization strategies commonly used in alignment pipelines, including reinforcement learning with PPO, iterative supervised fine-tuning using rejection sampling, and direct preference optimization methods. The project also includes evaluation results showing that the trained reward models can achieve competitive performance compared with other open-source alignment systems.
    Downloads: 0 This Week
  • 5
    SkillOpt

    Text-space optimizer that trains reusable natural-language skills

    ...Its output is a deployable best_skill.md artifact that can be reused across agent tasks. The project is focused on making agents more effective through text-space optimization rather than traditional fine-tuning. It is most useful for AI researchers and agent developers studying self-improving workflows, skill libraries, and evaluation-driven prompt refinement.
    Downloads: 1 This Week
  • 6
    LLaMA-Factory

    Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

    LLaMA-Factory is a fine-tuning and training framework for large language models and vision-language models. Despite its name, it is not limited to Meta's LLaMA family and covers more than 100 models. It lets researchers and developers train and customize these models efficiently.
    Downloads: 11 This Week
  • 7
    dive-into-llms

    "Dive into LLMs" series of practical programming tutorials

    ...It includes code samples, tutorials, and conceptual breakdowns that bridge the gap between academic research and real-world implementation. The project also highlights best practices for working with LLMs, including prompt design and optimization strategies. By focusing on clarity and depth, it serves as both a teaching tool and a reference for developers. Overall, dive-into-llms provides a structured and practical approach to mastering modern language model technology.
    Downloads: 0 This Week
  • 8
    tiny-llm

    A course of learning LLM inference serving on Apple Silicon

    ...The project is structured as a guided course that walks developers through the process of implementing the core components required to run a modern language model, including attention mechanisms, token generation, and optimization techniques. Rather than relying on high-level machine learning frameworks, the codebase uses mostly low-level array and matrix manipulation APIs so that developers can understand exactly how model inference works internally. The project demonstrates how to load and run models such as Qwen-style architectures while progressively implementing performance improvements like KV caching, request batching, and optimized attention mechanisms. ...
    Downloads: 2 This Week
  • 9
    ERNIE

    The official repository for ERNIE 4.5 and ERNIEKit

    ...It supports both full-parameter training and parameter-efficient approaches so teams can choose between maximum quality and lower-cost adaptation depending on their constraints. The project also emphasizes optimization techniques for large-scale training, including mixed-precision and hybrid-parallel strategies that are commonly needed for multi-node GPU clusters. In addition to training, it includes guidance and example materials intended to help developers adopt ERNIE models for real product scenarios rather than only research demonstrations.
    Downloads: 2 This Week
  • 10
    Nano-vLLM

    A lightweight vLLM implementation built from scratch

    ...The project recreates the core functionality of vLLM in a simplified architecture written in approximately a thousand lines of Python, making it easier for developers and researchers to understand how modern LLM inference systems work. Despite its compact design, nano-vllm incorporates advanced optimization techniques such as prefix caching, tensor parallelism, and CUDA graph execution to achieve high performance during model inference. The engine is intended primarily for educational use, experimentation, and lightweight deployments where a full production-grade inference stack may be unnecessary. Its API closely mirrors that of the original vLLM framework, allowing developers familiar with vLLM to adopt the tool with minimal changes.
    Downloads: 1 This Week
  • 11
    MiniOneRec

    Minimal reproduction of OneRec

    ...The framework provides an end-to-end pipeline for building generative recommender systems, including semantic identifier construction, supervised fine-tuning, and reinforcement learning-based optimization. Semantic IDs are created using techniques such as quantized variational autoencoders to convert item features into token sequences that can be modeled by transformer architectures. Developers can train and evaluate recommendation models using different backbone language models while benefiting from the generative framework’s parameter efficiency and scalability.
    Downloads: 0 This Week
  • 12
    how-to-optim-algorithm-in-cuda

    How to optimize some algorithm in cuda

    ...Instead of presenting only theoretical explanations, the repository includes hand-written CUDA implementations of fundamental operations such as reductions, element-wise computations, softmax, and attention mechanisms. These examples show how different optimization techniques influence performance on modern GPU hardware and allow readers to experiment with real implementations. The repository also contains extensive learning notes that summarize CUDA programming concepts, GPU architecture details, and performance engineering strategies.
    Downloads: 0 This Week
  • 13
    Context Engineering

    A frontier, first-principles handbook

    Context Engineering is a comprehensive, open-source project serving as a first-principles handbook for the emerging discipline of context design and optimization in AI. Moving beyond traditional prompt engineering, this repository defines and explores how to craft and provide complete context payloads — not just single prompts — to large language models so they can perform tasks more reliably and intelligently. It takes inspiration from thought leaders like Andrej Karpathy and bridges theory with practical examples, offering structured guidance on context orchestration, memory, retrieval, and state control within AI workflows. ...
    Downloads: 0 This Week
  • 14
    mllm

    Fast Multimodal LLM on Mobile Devices

    ...Implemented primarily in C and C++, it is designed to operate with minimal external dependencies while taking advantage of hardware-specific acceleration technologies such as ARM NEON and x86 AVX2 instructions. The system supports multiple optimization techniques including quantization, pruning, and speculative decoding to improve performance while reducing computational overhead. It also provides tools to convert models from popular formats like PyTorch checkpoints into optimized runtime formats that can be executed on supported hardware platforms.
    Downloads: 0 This Week
  • 15
    LightLLM

    LightLLM is a Python-based LLM (Large Language Model) inference

    ...The framework enables developers to run and serve modern language models with significantly improved speed and resource efficiency compared to many traditional inference systems. Built primarily in Python, the project integrates optimization techniques and ideas from several leading open-source implementations, including FasterTransformer, vLLM, and FlashAttention, to accelerate token generation and reduce latency. LightLLM is designed to handle large-scale model workloads in production environments, supporting efficient batching and GPU utilization for fast inference across multiple requests. ...
    Downloads: 0 This Week
  • 16
    PyTorch-Tutorial-2nd

    CV, NLP, LLM project applications, and advanced engineering deployment

    ...The project serves as a practical companion to a second edition of a PyTorch learning guide and is designed to help learners understand neural network concepts through hands-on coding examples. The repository covers a wide range of topics including tensor operations, neural network construction, model training workflows, and optimization strategies. It also introduces practical machine learning techniques such as convolutional neural networks, recurrent networks, and other architectures commonly used in modern AI applications. Each tutorial focuses on step-by-step implementation so learners can understand how theoretical concepts translate into working code. The materials are designed for both beginners and intermediate developers who want to gain practical experience building deep learning models using PyTorch.
    Downloads: 0 This Week
  • 17
    Xtuner

    A Next-Generation Training Engine Built for Ultra-Large MoE Models

    ...Its architecture incorporates memory-efficient optimizations that allow researchers to train large models even when computational resources are limited. XTuner is also designed to integrate with modern AI ecosystems, supporting multimodal training, reinforcement learning optimization, and instruction tuning pipelines.
    Downloads: 0 This Week
  • 18
    LLM Course

    Course to get into Large Language Models (LLMs)

    ...Learners get exposure to multiple adaptation strategies—LoRA/QLoRA, instruction fine-tuning, and alignment techniques—so they can choose approaches that fit their hardware and budgets. The materials also cover inference optimization and quantization to make serving LLMs feasible on commodity GPUs or even CPUs, which is crucial for side projects and startups. Evaluation is treated as a first-class topic, with examples of automatic and human-in-the-loop methods to catch regressions and verify quality beyond simple loss values. By the end, students have a mental model and a practical toolkit for iterating on datasets, training configs, etc.
    Downloads: 0 This Week
  • 19
    SWIFT LLM

    Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs

    SWIFT LLM is a comprehensive framework developed within the ModelScope ecosystem for training, fine-tuning, evaluating, and deploying large language models and multimodal models. The platform provides a full machine learning pipeline that supports tasks ranging from model pre-training to reinforcement learning alignment techniques. It integrates with popular inference engines such as vLLM and LMDeploy to accelerate deployment and runtime performance. The framework also includes support for...
    Downloads: 3 This Week
  • 20
    LLMs-from-scratch

    Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

    ...The focus is on readability, correctness, and experimentation, making it ideal for students and practitioners transitioning from theory to working systems. By the end, you have a grounded sense of how data pipelines, optimization, and inference interact to produce fluent text.
    Downloads: 4 This Week
  • 21
    SageAttention

    [NeurIPS2025 Spotlight] Quantized Attention

    SageAttention is an open-source optimization library designed to accelerate the attention mechanism used in transformer-based neural networks. Since attention operations are often the most computationally expensive component of modern AI models, SageAttention introduces quantization techniques that significantly reduce computational overhead while preserving model accuracy.
    Downloads: 2 This Week
  • 22
    DecryptPrompt

    Summarize Prompt & LLM papers, open source data & models

    ...The project collects papers, technical reports, and research materials that explore prompting techniques, model architectures, and reasoning strategies used in modern AI systems. It serves as a structured knowledge base where developers and researchers can quickly find key papers about topics such as chain-of-thought reasoning, prompt optimization, reasoning frameworks, and model training techniques. The repository organizes research into thematic sections that cover different prompting methodologies and reasoning paradigms used in LLM development. Many of the resources focus on understanding how prompts influence model behavior and how prompting strategies can improve reasoning or efficiency.
    Downloads: 1 This Week
  • 23
    LLM Action

    Technical principles related to large models

    LLM-Action is a knowledge and tutorial repository that shares principles, techniques, and real-world experience related to large language models (LLMs), focusing on LLM engineering, deployment, optimization, inference, compression, and tooling. It organizes content into domains such as training, inference, compression, alignment, evaluation, pipelines, and applications. It includes sections covering infrastructure, engineering, and deployment; repository templates, sample code, and resource links; and articles and code on LLM compression (quantization and pruning).
    Downloads: 0 This Week
  • 24
    VibeThinker

    Diversity-driven optimization and large-model reasoning ability

    VibeThinker is a compact but high-capability open-source language model released by WeiboAI (Sina AI Lab). It contains about 1.5 billion parameters, far smaller than many “frontier” models, yet it is explicitly optimized for reasoning, mathematics, and code generation tasks rather than general open-domain chat. The innovation lies in its training methodology: the team uses what they call the Spectrum-to-Signal Principle (SSP), where a first stage emphasizes diversity of reasoning paths (the...
    Downloads: 6 This Week
  • 25
    Google Workspace MCP Server

    Control Gmail, Google Calendar, Docs, Sheets, Slides, Chat, Forms

    Google Workspace MCP is an open-source server that connects AI assistants to Google Workspace services through the Model Context Protocol (MCP), allowing large language models to interact directly with productivity tools. The project exposes a wide set of Google services including Gmail, Google Drive, Docs, Sheets, Slides, Calendar, Chat, and other Workspace components as structured tools that an AI system can call programmatically. By acting as a bridge between AI clients and the Google...
    Downloads: 3 This Week