Search Results for "optimization"

Showing 55 open source projects for "optimization"

Category: Large Language Models (LLM)

1

AIDE ML

AI-Driven Exploration in the Space of Code

...AIDE ML is packaged as a Python toolkit with built-in utilities such as command-line tools, configuration presets, and visualization interfaces that allow researchers to observe how the search process evolves. The framework is designed for experimentation and academic research into automated programming and machine learning optimization.

Downloads: 0 This Week

Last Update: 2026-03-09
2

GLM-5.1

GLM-5: From Vibe Coding to Agentic Engineering

GLM-5.1 is a next-generation large language model developed by Z.ai for advanced coding, reasoning, and long-horizon agentic engineering tasks. Built as the successor to GLM-5, the model significantly improves performance in software engineering benchmarks, repository generation, and real-world terminal-based workflows. GLM-5.1 is designed to remain effective over extended problem-solving sessions, allowing it to iteratively refine strategies, analyze failures, and sustain productivity...

Downloads: 69 This Week

Last Update: 3 days ago
3

Heretic

Fully automatic censorship removal for language models

Heretic is an open-source Python tool that automatically removes the built-in censorship or “safety alignment” from transformer-based language models so they respond to a broader range of prompts with fewer refusals. It works by applying directional ablation techniques and a parameter optimization strategy to adjust internal model behaviors without expensive post-training or altering the core capabilities. Designed for researchers and advanced users, Heretic makes it possible to study and experiment with uncensored model responses in a reproducible, automated way. The project can decensor many popular dense and some mixture-of-experts (MoE) models, supporting workflows that would otherwise require manual tuning. ...

Downloads: 12 This Week

Last Update: 6 days ago
4

Headroom

Compress tool outputs, logs, files, and RAG chunks

Headroom is a context optimization layer for LLM applications that compresses information before it reaches the model. It sits between an application and an LLM provider, intercepting requests and forwarding a shorter optimized prompt. The project is designed to reduce token usage while preserving the answer quality needed for agent workflows. It can compress tool outputs, logs, RAG chunks, files, and conversation history.

Downloads: 4 This Week

Last Update: 4 days ago
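
As a rough illustration of the kind of compression the Headroom description refers to, the sketch below collapses repeated log lines and trims the result to a budget before it would be placed in a prompt. It is not Headroom's API: `compress_context`, `max_chars`, and the character-based budget (a stand-in for a token count) are assumptions made for this example.

```python
def compress_context(lines, max_chars=200):
    """Collapse consecutive duplicate lines, then keep the newest lines that fit max_chars."""
    collapsed = []  # list of [text, repeat_count]
    for line in lines:
        if collapsed and collapsed[-1][0] == line:
            collapsed[-1][1] += 1
        else:
            collapsed.append([line, 1])

    rendered = [
        f"{text} (x{count})" if count > 1 else text
        for text, count in collapsed
    ]

    # Walk backwards so the most recent lines survive when the budget is tight.
    kept = []
    used = 0
    for text in reversed(rendered):
        cost = len(text) + 1  # +1 for the newline separator
        if used + cost > max_chars:
            break
        kept.append(text)
        used += cost
    return "\n".join(reversed(kept))


logs = ["connect ok", "retry", "retry", "retry", "timeout after 30s"]
compressed = compress_context(logs)
tight = compress_context(logs, max_chars=20)
```

With the default budget the three `retry` lines become one annotated line; with a tight budget only the most recent line is kept. A real compression layer would presumably measure tokens rather than characters and weigh relevance, which this sketch does not attempt.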
5

RLHF-Reward-Modeling

Recipes to train reward model for RLHF

...The repository provides training recipes and implementations for building reward and preference models using modern machine learning frameworks. It supports multiple optimization strategies commonly used in alignment pipelines, including reinforcement learning with PPO, iterative supervised fine-tuning using rejection sampling, and direct preference optimization methods. The project also includes evaluation results showing that the trained reward models can achieve competitive performance compared with other open-source alignment systems.

Downloads: 0 This Week

Last Update: 2026-03-06
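
For readers unfamiliar with the objectives named in the RLHF-Reward-Modeling description, the sketch below shows the standard pairwise (Bradley-Terry) reward-model loss and the direct preference optimization (DPO) loss on scalar inputs. It is a minimal illustration of the textbook formulas, not code from the repository; the function names and the default `beta` are assumptions.

```python
import math


def pairwise_reward_loss(r_chosen, r_rejected):
    """Bradley-Terry loss: -log(sigmoid(r_chosen - r_rejected)), computed stably."""
    margin = r_chosen - r_rejected
    # -log(sigmoid(m)) == softplus(-m) == max(-m, 0) + log1p(exp(-|m|))
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def dpo_loss(logp_chosen, logp_rejected, ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss: the same pairwise form, applied to policy-vs-reference log-ratio gaps."""
    chosen_ratio = logp_chosen - ref_logp_chosen
    rejected_ratio = logp_rejected - ref_logp_rejected
    return pairwise_reward_loss(beta * chosen_ratio, beta * rejected_ratio)


# The loss shrinks as the chosen response is scored further above the rejected one.
loss_tied = pairwise_reward_loss(1.0, 1.0)
loss_correct = pairwise_reward_loss(2.0, 0.0)
loss_wrong = pairwise_reward_loss(0.0, 2.0)
```

In training, the scalar rewards would come from a model's outputs over a batch of preference pairs and the loss would be averaged; the scalars here only show the shape of the objective.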
6

SkillOpt

Text-space optimizer that trains reusable natural-language skills

...Its output is a deployable best_skill.md artifact that can be reused across agent tasks. The project is focused on making agents more effective through text-space optimization rather than traditional fine-tuning. It is most useful for AI researchers and agent developers studying self-improving workflows, skill libraries, and evaluation-driven prompt refinement.

Downloads: 1 This Week

Last Update: 2026-06-02
7

LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

LLaMA-Factory is a fine-tuning and training framework that, despite its name, is billed as covering 100+ LLMs and VLMs rather than only Meta's LLaMA models. It enables researchers and developers to train and customize these models efficiently using advanced optimization techniques.

Downloads: 11 This Week

Last Update: 2026-05-30
8

dive-into-llms

"Dive into LLMs" series of practical programming tutorials

...It includes code samples, tutorials, and conceptual breakdowns that bridge the gap between academic research and real-world implementation. The project also highlights best practices for working with LLMs, including prompt design and optimization strategies. By focusing on clarity and depth, it serves as both a teaching tool and a reference for developers. Overall, dive-into-llms provides a structured and practical approach to mastering modern language model technology.

Downloads: 0 This Week

Last Update: 2026-04-15
9

All-in-RAG

Big Model Application Development Practice 1

...It explains the full development pipeline required to create knowledge-aware AI assistants, including data preparation, document indexing, vector embedding generation, and retrieval strategies. The project also explores advanced topics such as hybrid retrieval methods, query optimization, and evaluation techniques for improving system accuracy. Alongside theoretical explanations, the repository includes hands-on exercises and example projects that demonstrate how to build production-ready RAG systems. These projects guide developers through the process of integrating vector databases, embedding models, and large language models into a unified application.

Downloads: 0 This Week

Last Update: 2026-06-05
10

tiny-llm

A course of learning LLM inference serving on Apple Silicon

...The project is structured as a guided course that walks developers through the process of implementing the core components required to run a modern language model, including attention mechanisms, token generation, and optimization techniques. Rather than relying on high-level machine learning frameworks, the codebase uses mostly low-level array and matrix manipulation APIs so that developers can understand exactly how model inference works internally. The project demonstrates how to load and run models such as Qwen-style architectures while progressively implementing performance improvements like KV caching, request batching, and optimized attention mechanisms. ...

Downloads: 2 This Week

Last Update: 7 days ago
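
The KV caching mentioned in the tiny-llm description can be sketched in a few lines of plain Python. This is a toy single-head example, not code from the course: the projection matrices `W_K` and `W_V`, the use of the token vector itself as the query, and all function names are assumptions chosen to keep the example small.

```python
import math

W_K = [[1.0, 0.0], [0.0, 1.0]]   # toy key projection (assumption)
W_V = [[0.5, 0.5], [1.0, -1.0]]  # toy value projection (assumption)


def project(weight, x):
    """Matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def attend(query, keys, values):
    """Scaled dot-product attention for one query over a list of keys and values."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total for i in range(dim)]


def decode_without_cache(tokens):
    """Re-project the whole prefix at every step."""
    outputs, calls = [], 0
    for t in range(len(tokens)):
        prefix = tokens[: t + 1]
        keys = [project(W_K, x) for x in prefix]
        values = [project(W_V, x) for x in prefix]
        calls += 2 * len(prefix)
        outputs.append(attend(tokens[t], keys, values))
    return outputs, calls


def decode_with_cache(tokens):
    """Project only the newest token and append it to the cached keys and values."""
    keys, values, outputs, calls = [], [], [], 0
    for x in tokens:
        keys.append(project(W_K, x))
        values.append(project(W_V, x))
        calls += 2
        outputs.append(attend(x, keys, values))
    return outputs, calls


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
uncached_out, uncached_calls = decode_without_cache(tokens)
cached_out, cached_calls = decode_with_cache(tokens)
```

Both functions produce the same attention outputs, but the cached version performs a number of projections that grows linearly with sequence length, while the uncached version grows quadratically.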
11

ERNIE

The official repository for ERNIE 4.5 and ERNIEKit

...It supports both full-parameter training and parameter-efficient approaches so teams can choose between maximum quality and lower-cost adaptation depending on their constraints. The project also emphasizes optimization techniques for large-scale training, including mixed-precision and hybrid-parallel strategies that are commonly needed for multi-node GPU clusters. In addition to training, it includes guidance and example materials intended to help developers adopt ERNIE models for real product scenarios rather than only research demonstrations.

Downloads: 2 This Week

Last Update: 2026-03-04
12

Nano-vLLM

A lightweight vLLM implementation built from scratch

...The project recreates the core functionality of vLLM in a simplified architecture written in approximately a thousand lines of Python, making it easier for developers and researchers to understand how modern LLM inference systems work. Despite its compact design, nano-vllm incorporates advanced optimization techniques such as prefix caching, tensor parallelism, and CUDA graph execution to achieve high performance during model inference. The engine is intended primarily for educational use, experimentation, and lightweight deployments where a full production-grade inference stack may be unnecessary. Its API closely mirrors that of the original vLLM framework, allowing developers familiar with vLLM to adopt the tool with minimal changes.

Downloads: 1 This Week

Last Update: 2026-04-26
13

MiniOneRec

Minimal reproduction of OneRec

...The framework provides an end-to-end pipeline for building generative recommender systems, including semantic identifier construction, supervised fine-tuning, and reinforcement learning-based optimization. Semantic IDs are created using techniques such as quantized variational autoencoders to convert item features into token sequences that can be modeled by transformer architectures. Developers can train and evaluate recommendation models using different backbone language models while benefiting from the generative framework’s parameter efficiency and scalability.

Downloads: 0 This Week

Last Update: 2026-05-14
14

how-to-optim-algorithm-in-cuda

How to optimize some algorithm in cuda

...Instead of presenting only theoretical explanations, the repository includes hand-written CUDA implementations of fundamental operations such as reductions, element-wise computations, softmax, and attention mechanisms. These examples show how different optimization techniques influence performance on modern GPU hardware and allow readers to experiment with real implementations. The repository also contains extensive learning notes that summarize CUDA programming concepts, GPU architecture details, and performance engineering strategies.

Downloads: 0 This Week

Last Update: 2026-06-08
15

Context Engineering

A frontier, first-principles handbook

Context Engineering is a comprehensive, open-source project serving as a first-principles handbook for the emerging discipline of context design and optimization in AI. Moving beyond traditional prompt engineering, this repository defines and explores how to craft and provide complete context payloads — not just single prompts — to large language models so they can perform tasks more reliably and intelligently. It takes inspiration from thought leaders like Andrej Karpathy and bridges theory with practical examples, offering structured guidance on context orchestration, memory, retrieval, and state control within AI workflows. ...

Downloads: 0 This Week

Last Update: 2026-02-27
16

mllm

Fast Multimodal LLM on Mobile Devices

...Implemented primarily in C and C++, it is designed to operate with minimal external dependencies while taking advantage of hardware-specific acceleration technologies such as ARM NEON and x86 AVX2 instructions. The system supports multiple optimization techniques including quantization, pruning, and speculative decoding to improve performance while reducing computational overhead. It also provides tools to convert models from popular formats like PyTorch checkpoints into optimized runtime formats that can be executed on supported hardware platforms.

Downloads: 0 This Week

Last Update: 2026-03-09
17

LightLLM

LightLLM is a Python-based LLM (Large Language Model) inference

...The framework enables developers to run and serve modern language models with significantly improved speed and resource efficiency compared to many traditional inference systems. Built primarily in Python, the project integrates optimization techniques and ideas from several leading open-source implementations, including FasterTransformer, vLLM, and FlashAttention, to accelerate token generation and reduce latency. LightLLM is designed to handle large-scale model workloads in production environments, supporting efficient batching and GPU utilization for fast inference across multiple requests. ...

Downloads: 0 This Week

Last Update: 2026-03-05
18

PyTorch-Tutorial-2nd

CV, NLP, LLM project applications, and advanced engineering deployment

...The project serves as a practical companion to a second edition of a PyTorch learning guide and is designed to help learners understand neural network concepts through hands-on coding examples. The repository covers a wide range of topics including tensor operations, neural network construction, model training workflows, and optimization strategies. It also introduces practical machine learning techniques such as convolutional neural networks, recurrent networks, and other architectures commonly used in modern AI applications. Each tutorial focuses on step-by-step implementation so learners can understand how theoretical concepts translate into working code. The materials are designed for both beginners and intermediate developers who want to gain practical experience building deep learning models using PyTorch.

Downloads: 0 This Week

Last Update: 2026-03-04
19

Xtuner

A Next-Generation Training Engine Built for Ultra-Large MoE Models

...Its architecture incorporates memory-efficient optimizations that allow researchers to train large models even when computational resources are limited. XTuner is also designed to integrate with modern AI ecosystems, supporting multimodal training, reinforcement learning optimization, and instruction tuning pipelines.

Downloads: 0 This Week

Last Update: 2026-03-04
20

LLM Course

Course to get into Large Language Models (LLMs)

...Learners get exposure to multiple adaptation strategies—LoRA/QLoRA, instruction fine-tuning, and alignment techniques—so they can choose approaches that fit their hardware and budgets. The materials also cover inference optimization and quantization to make serving LLMs feasible on commodity GPUs or even CPUs, which is crucial for side projects and startups. Evaluation is treated as a first-class topic, with examples of automatic and human-in-the-loop methods to catch regressions and verify quality beyond simple loss values. By the end, students have a mental model and a practical toolkit for iterating on datasets, training configs, etc.

Downloads: 0 This Week

Last Update: 2026-02-05
21

SWIFT LLM

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs

SWIFT LLM is a comprehensive framework developed within the ModelScope ecosystem for training, fine-tuning, evaluating, and deploying large language models and multimodal models. The platform provides a full machine learning pipeline that supports tasks ranging from model pre-training to reinforcement learning alignment techniques. It integrates with popular inference engines such as vLLM and LMDeploy to accelerate deployment and runtime performance. The framework also includes support for...

Downloads: 3 This Week

Last Update: 3 days ago
22

LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

...The focus is on readability, correctness, and experimentation, making it ideal for students and practitioners transitioning from theory to working systems. By the end, you have a grounded sense of how data pipelines, optimization, and inference interact to produce fluent text.

Downloads: 4 This Week

Last Update: 2026-06-02
23

SageAttention

[NeurIPS2025 Spotlight] Quantized Attention

SageAttention is an open-source optimization library designed to accelerate the attention mechanism used in transformer-based neural networks. Since attention operations are often the most computationally expensive component of modern AI models, SageAttention introduces quantization techniques that significantly reduce computational overhead while preserving model accuracy.

Downloads: 2 This Week

Last Update: 2026-03-08
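
To make the idea of quantized attention concrete, the sketch below quantizes a query and a key vector to 8-bit integers, accumulates their dot product in integer arithmetic, and rescales once at the end. This is generic symmetric int8 quantization, not SageAttention's actual scheme, which the listing does not describe; `quantize_int8` and `quantized_dot` are hypothetical names.

```python
def quantize_int8(vector):
    """Symmetric per-vector quantization: returns (integers in [-127, 127], scale)."""
    max_abs = max(abs(x) for x in vector)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    return [round(x / scale) for x in vector], scale


def quantized_dot(query, key):
    """Approximate a float dot product with an integer accumulate and one rescale."""
    q_int, q_scale = quantize_int8(query)
    k_int, k_scale = quantize_int8(key)
    accumulator = sum(a * b for a, b in zip(q_int, k_int))  # integer-only work
    return accumulator * q_scale * k_scale


query = [0.5, -1.0, 0.25]
key = [1.0, 0.5, -0.75]
exact = sum(q * k for q, k in zip(query, key))
approx = quantized_dot(query, key)
error = abs(approx - exact)
```

The attention score computed this way differs from the exact value by a small rounding error, which is the trade-off such libraries aim to keep low enough to preserve model accuracy.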
24

DecryptPrompt

Summarize Prompt & LLM papers, open source data & models

...The project collects papers, technical reports, and research materials that explore prompting techniques, model architectures, and reasoning strategies used in modern AI systems. It serves as a structured knowledge base where developers and researchers can quickly find key papers about topics such as chain-of-thought reasoning, prompt optimization, reasoning frameworks, and model training techniques. The repository organizes research into thematic sections that cover different prompting methodologies and reasoning paradigms used in LLM development. Many of the resources focus on understanding how prompts influence model behavior and how prompting strategies can improve reasoning or efficiency.

Downloads: 1 This Week

Last Update: 2026-05-06
25

LLM Action

Technical principles related to large models

LLM-Action is a knowledge and tutorial repository that shares principles, techniques, and real-world experience related to large language models (LLMs), focusing on LLM engineering, deployment, optimization, inference, compression, and tooling. It organizes content in domains like training, inference, compression, alignment, evaluation, pipelines, and applications. It also includes sections covering infrastructure, engineering, and deployment; repository templates, sample code, and resource links; and articles and code on LLM compression (quantization, pruning).

Downloads: 0 This Week

Last Update: 2026-05-25

© 2026 Slashdot Media. All Rights Reserved.
