Search Results for "optimization"

Showing 53 open source projects for "optimization"

1

AIDE ML

AI-Driven Exploration in the Space of Code

...AIDE ML is packaged as a Python toolkit with built-in utilities such as command-line tools, configuration presets, and visualization interfaces that allow researchers to observe how the search process evolves. The framework is designed for experimentation and academic research into automated programming and machine learning optimization.

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
2

GLM-5.1

GLM-5: From Vibe Coding to Agentic Engineering

GLM-5.1 is a next-generation large language model developed by Z.ai for advanced coding, reasoning, and long-horizon agentic engineering tasks. Built as the successor to GLM-5, the model significantly improves performance in software engineering benchmarks, repository generation, and real-world terminal-based workflows. GLM-5.1 is designed to remain effective over extended problem-solving sessions, allowing it to iteratively refine strategies, analyze failures, and sustain productivity...

Downloads: 69 This Week

Last Update: 3 days ago
See Project
3

Heretic

Fully automatic censorship removal for language models

Heretic is an open-source Python tool that automatically removes the built-in censorship or “safety alignment” from transformer-based language models so they respond to a broader range of prompts with fewer refusals. It works by applying directional ablation techniques and a parameter optimization strategy to adjust internal model behaviors without expensive post-training or altering the core capabilities. Designed for researchers and advanced users, Heretic makes it possible to study and experiment with uncensored model responses in a reproducible, automated way. The project can decensor many popular dense and some mixture-of-experts (MoE) models, supporting workflows that would otherwise require manual tuning. ...

Downloads: 12 This Week

Last Update: 6 days ago
See Project
4

RLHF-Reward-Modeling

Recipes to train reward model for RLHF

...The repository provides training recipes and implementations for building reward and preference models using modern machine learning frameworks. It supports multiple optimization strategies commonly used in alignment pipelines, including reinforcement learning with PPO, iterative supervised fine-tuning using rejection sampling, and direct preference optimization methods. The project also includes evaluation results showing that the trained reward models can achieve competitive performance compared with other open-source alignment systems.

Downloads: 0 This Week

Last Update: 2026-03-06
See Project
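
The RLHF-Reward-Modeling entry above mentions training reward and preference models. As a rough illustration, a widely used objective for pairwise reward models is the Bradley-Terry loss, which pushes the score of the preferred response above the score of the rejected one. Whether the repository uses exactly this objective is an assumption, and the function names below are hypothetical.

```python
import math

def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry negative log-likelihood: -log(sigmoid(r_chosen - r_rejected)).

    Assumption: scalar rewards for one (chosen, rejected) response pair.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch to avoid overflow in exp().
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

def batch_loss(pairs):
    """Mean loss over a list of (reward_chosen, reward_rejected) tuples."""
    return sum(pairwise_preference_loss(c, r) for c, r in pairs) / len(pairs)

if __name__ == "__main__":
    # The loss shrinks as the chosen response is scored further above the rejected one.
    print(pairwise_preference_loss(2.0, 0.0))
    print(pairwise_preference_loss(0.0, 2.0))
```

In a real training loop the rewards would come from a model's scalar head and the loss would be backpropagated; this sketch only shows the arithmetic.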
5

SkillOpt

Text-space optimizer that trains reusable natural-language skills

...Its output is a deployable best_skill.md artifact that can be reused across agent tasks. The project is focused on making agents more effective through text-space optimization rather than traditional fine-tuning. It is most useful for AI researchers and agent developers studying self-improving workflows, skill libraries, and evaluation-driven prompt refinement.

Downloads: 1 This Week

Last Update: 2026-06-02
See Project
6

LLaMA-Factory

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

LLaMA-Factory is a unified fine-tuning and training framework that, per its project tagline, supports more than 100 LLMs and VLMs, including Meta's LLaMA models. It enables researchers and developers to train and customize these models efficiently using advanced optimization techniques.

Downloads: 11 This Week

Last Update: 2026-05-30
See Project
7

dive-into-llms

"Dive into LLMs" series of practical programming tutorials

...It includes code samples, tutorials, and conceptual breakdowns that bridge the gap between academic research and real-world implementation. The project also highlights best practices for working with LLMs, including prompt design and optimization strategies. By focusing on clarity and depth, it serves as both a teaching tool and a reference for developers. Overall, dive-into-llms provides a structured and practical approach to mastering modern language model technology.

Downloads: 0 This Week

Last Update: 2026-04-15
See Project
8

tiny-llm

A course of learning LLM inference serving on Apple Silicon

...The project is structured as a guided course that walks developers through the process of implementing the core components required to run a modern language model, including attention mechanisms, token generation, and optimization techniques. Rather than relying on high-level machine learning frameworks, the codebase uses mostly low-level array and matrix manipulation APIs so that developers can understand exactly how model inference works internally. The project demonstrates how to load and run models such as Qwen-style architectures while progressively implementing performance improvements like KV caching, request batching, and optimized attention mechanisms. ...

Downloads: 2 This Week

Last Update: 2026-06-13
See Project
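
The tiny-llm entry above lists KV caching among the performance improvements it implements. A minimal sketch of the idea, assuming a single attention head and plain Python lists instead of an array library: keys and values of earlier tokens are stored, so each decode step only adds the new token rather than recomputing the whole sequence. The class and function names are hypothetical and not taken from the project.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention of one query over all given positions."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

class KVCache:
    """Keeps keys/values of past tokens so a decode step only appends one entry."""

    def __init__(self):
        self.keys = []
        self.values = []

    def step(self, query, key, value):
        # Append the new token's key/value, then attend over everything cached so far.
        self.keys.append(key)
        self.values.append(value)
        return attend(query, self.keys, self.values)

if __name__ == "__main__":
    cache = KVCache()
    print(cache.step([1.0, 0.0], [1.0, 0.0], [2.0, 3.0]))
    print(cache.step([0.0, 1.0], [0.0, 1.0], [4.0, 5.0]))
```

The cached result for the latest token matches what a full recomputation over all keys and values would give; the saving is that earlier keys and values are never rebuilt.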
9

ERNIE

The official repository for ERNIE 4.5 and ERNIEKit

...It supports both full-parameter training and parameter-efficient approaches so teams can choose between maximum quality and lower-cost adaptation depending on their constraints. The project also emphasizes optimization techniques for large-scale training, including mixed-precision and hybrid-parallel strategies that are commonly needed for multi-node GPU clusters. In addition to training, it includes guidance and example materials intended to help developers adopt ERNIE models for real product scenarios rather than only research demonstrations.

Downloads: 2 This Week

Last Update: 2026-03-04
See Project
10

Nano-vLLM

A lightweight vLLM implementation built from scratch

...The project recreates the core functionality of vLLM in a simplified architecture written in approximately a thousand lines of Python, making it easier for developers and researchers to understand how modern LLM inference systems work. Despite its compact design, nano-vllm incorporates advanced optimization techniques such as prefix caching, tensor parallelism, and CUDA graph execution to achieve high performance during model inference. The engine is intended primarily for educational use, experimentation, and lightweight deployments where a full production-grade inference stack may be unnecessary. Its API closely mirrors that of the original vLLM framework, allowing developers familiar with vLLM to adopt the tool with minimal changes.

Downloads: 1 This Week

Last Update: 2026-04-26
See Project
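
The Nano-vLLM entry above names prefix caching as one of its optimizations. One common way to implement the idea, sketched here under assumptions, is to split a prompt into fixed-size token blocks and hash each block together with the hash of the block before it, so a matching hash implies the whole prefix matches. The block size, the class name, and the placeholder stored per block are all hypothetical and not taken from the project.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size, chosen small for illustration

def block_hashes(token_ids, block_size=BLOCK_SIZE):
    """Chain-hash every full block so each hash identifies the entire prefix up to it."""
    hashes = []
    prev = b""
    full = len(token_ids) - len(token_ids) % block_size  # ignore a trailing partial block
    for start in range(0, full, block_size):
        block = token_ids[start:start + block_size]
        digest = hashlib.sha256(prev + repr(block).encode()).digest()
        hashes.append(digest)
        prev = digest
    return hashes

class PrefixCache:
    def __init__(self):
        self.blocks = {}  # hash -> placeholder standing in for a cached KV block

    def match_and_insert(self, token_ids):
        """Return how many leading tokens hit the cache, then register this request's blocks."""
        hashes = block_hashes(token_ids)
        hit_blocks = 0
        for h in hashes:
            if h not in self.blocks:
                break
            hit_blocks += 1
        for h in hashes:
            self.blocks.setdefault(h, "kv-block")
        return hit_blocks * BLOCK_SIZE

if __name__ == "__main__":
    cache = PrefixCache()
    print(cache.match_and_insert(list(range(1, 11))))
    print(cache.match_and_insert([1, 2, 3, 4, 5, 6, 7, 8, 99, 100, 101, 102]))
```

Because the hashes are chained, a request that shares a later block but differs in an earlier one gets no cache hit, which is the behavior a prefix cache needs.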
11

MiniOneRec

Minimal reproduction of OneRec

...The framework provides an end-to-end pipeline for building generative recommender systems, including semantic identifier construction, supervised fine-tuning, and reinforcement learning-based optimization. Semantic IDs are created using techniques such as quantized variational autoencoders to convert item features into token sequences that can be modeled by transformer architectures. Developers can train and evaluate recommendation models using different backbone language models while benefiting from the generative framework’s parameter efficiency and scalability.

Downloads: 0 This Week

Last Update: 2026-05-14
See Project
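
The MiniOneRec entry above describes semantic IDs that turn item features into token sequences via quantized variational autoencoders. The quantization step can be sketched as residual quantization: at each level the nearest codebook vector is chosen, its index becomes one token, and the remainder is passed to the next level. This assumes a residual scheme with fixed, hand-written codebooks; a trained system would learn the encoder and codebooks, and the function names are hypothetical.

```python
def nearest(codebook, vec):
    """Index of the codebook entry closest to vec in squared Euclidean distance."""
    def sqdist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vec))
    return min(range(len(codebook)), key=lambda i: sqdist(codebook[i]))

def semantic_id(vec, codebooks):
    """Quantize vec level by level; the chosen indices form the item's token sequence."""
    residual = list(vec)
    tokens = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        tokens.append(idx)
        # Subtract the chosen code so the next level models what is still unexplained.
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return tokens

if __name__ == "__main__":
    # Toy two-level codebooks (made-up values, not learned).
    codebooks = [
        [[0.0, 0.0], [10.0, 10.0]],
        [[0.0, 0.0], [1.0, -1.0]],
    ]
    print(semantic_id([11.0, 9.0], codebooks))
```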
12

how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA

...Instead of presenting only theoretical explanations, the repository includes hand-written CUDA implementations of fundamental operations such as reductions, element-wise computations, softmax, and attention mechanisms. These examples show how different optimization techniques influence performance on modern GPU hardware and allow readers to experiment with real implementations. The repository also contains extensive learning notes that summarize CUDA programming concepts, GPU architecture details, and performance engineering strategies.

Downloads: 0 This Week

Last Update: 2026-06-08
See Project
13

Context Engineering

A frontier, first-principles handbook

Context Engineering is a comprehensive, open-source project serving as a first-principles handbook for the emerging discipline of context design and optimization in AI. Moving beyond traditional prompt engineering, this repository defines and explores how to craft and provide complete context payloads — not just single prompts — to large language models so they can perform tasks more reliably and intelligently. It takes inspiration from thought leaders like Andrej Karpathy and bridges theory with practical examples, offering structured guidance on context orchestration, memory, retrieval, and state control within AI workflows. ...

Downloads: 0 This Week

Last Update: 2026-02-27
See Project
14

mllm

Fast Multimodal LLM on Mobile Devices

...Implemented primarily in C and C++, it is designed to operate with minimal external dependencies while taking advantage of hardware-specific acceleration technologies such as ARM NEON and x86 AVX2 instructions. The system supports multiple optimization techniques including quantization, pruning, and speculative decoding to improve performance while reducing computational overhead. It also provides tools to convert models from popular formats like PyTorch checkpoints into optimized runtime formats that can be executed on supported hardware platforms.

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
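
The mllm entry above lists quantization among its optimization techniques. A minimal sketch of one common variant, symmetric per-tensor int8 quantization, is shown below: weights are mapped to integers in [-127, 127] with a single scale and mapped back with a bounded rounding error. Which scheme mllm actually uses is not stated in the listing, so this is an assumption, and the function names are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: returns (int8 values, scale)."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map integer codes back to approximate float weights."""
    return [v * scale for v in q]

if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    # Each restored weight differs from the original by at most half a quantization step.
    print(q, scale)
    print(max(abs(w - r) for w, r in zip(weights, restored)))
```

The design trade-off is the usual one: a single scale keeps storage and arithmetic simple, at the cost of precision when a few large weights dominate the range.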
15

LightLLM

LightLLM is a Python-based LLM (Large Language Model) inference

...The framework enables developers to run and serve modern language models with significantly improved speed and resource efficiency compared to many traditional inference systems. Built primarily in Python, the project integrates optimization techniques and ideas from several leading open-source implementations, including FasterTransformer, vLLM, and FlashAttention, to accelerate token generation and reduce latency. LightLLM is designed to handle large-scale model workloads in production environments, supporting efficient batching and GPU utilization for fast inference across multiple requests. ...

Downloads: 0 This Week

Last Update: 2026-03-05
See Project
16

PyTorch-Tutorial-2nd

CV, NLP, LLM project applications, and advanced engineering deployment

...The project serves as a practical companion to a second edition of a PyTorch learning guide and is designed to help learners understand neural network concepts through hands-on coding examples. The repository covers a wide range of topics including tensor operations, neural network construction, model training workflows, and optimization strategies. It also introduces practical machine learning techniques such as convolutional neural networks, recurrent networks, and other architectures commonly used in modern AI applications. Each tutorial focuses on step-by-step implementation so learners can understand how theoretical concepts translate into working code. The materials are designed for both beginners and intermediate developers who want to gain practical experience building deep learning models using PyTorch.

Downloads: 0 This Week

Last Update: 2026-03-04
See Project
17

Xtuner

A Next-Generation Training Engine Built for Ultra-Large MoE Models

...Its architecture incorporates memory-efficient optimizations that allow researchers to train large models even when computational resources are limited. XTuner is also designed to integrate with modern AI ecosystems, supporting multimodal training, reinforcement learning optimization, and instruction tuning pipelines.

Downloads: 0 This Week

Last Update: 2026-03-04
See Project
18

LLM Course

Course to get into Large Language Models (LLMs)

...Learners get exposure to multiple adaptation strategies—LoRA/QLoRA, instruction fine-tuning, and alignment techniques—so they can choose approaches that fit their hardware and budgets. The materials also cover inference optimization and quantization to make serving LLMs feasible on commodity GPUs or even CPUs, which is crucial for side projects and startups. Evaluation is treated as a first-class topic, with examples of automatic and human-in-the-loop methods to catch regressions and verify quality beyond simple loss values. By the end, students have a mental model and a practical toolkit for iterating on datasets, training configs, etc.

Downloads: 0 This Week

Last Update: 2026-02-05
See Project
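
The LLM Course entry above mentions LoRA as one of the adaptation strategies it covers. The core idea can be sketched in a few lines: the frozen weight matrix W is left untouched, and a low-rank product B·A, scaled by alpha / r, is added to its output. This is a generic sketch using plain Python lists, not code from the course, and the helper names are hypothetical.

```python
def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vec)) for row in matrix]

def lora_forward(W, A, B, x, alpha):
    """Compute W x + (alpha / r) * B (A x), where r is the rank (rows of A).

    W: d_out x d_in frozen weights; A: r x d_in; B: d_out x r (trainable factors).
    """
    r = len(A)
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0]]
    A = [[1.0, 1.0]]          # rank r = 1
    B_zero = [[0.0], [0.0]]   # zero B (the usual initialization) leaves the output unchanged
    B = [[1.0], [2.0]]
    x = [1.0, 2.0]
    print(lora_forward(W, A, B_zero, x, alpha=2.0))
    print(lora_forward(W, A, B, x, alpha=2.0))
```

Only A and B would be trained, which for rank r means r × (d_in + d_out) parameters instead of d_in × d_out.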
19

SWIFT LLM

Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs

SWIFT LLM is a comprehensive framework developed within the ModelScope ecosystem for training, fine-tuning, evaluating, and deploying large language models and multimodal models. The platform provides a full machine learning pipeline that supports tasks ranging from model pre-training to reinforcement learning alignment techniques. It integrates with popular inference engines such as vLLM and LMDeploy to accelerate deployment and runtime performance. The framework also includes support for...

Downloads: 3 This Week

Last Update: 3 days ago
See Project
20

LLMs-from-scratch

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

...The focus is on readability, correctness, and experimentation, making it ideal for students and practitioners transitioning from theory to working systems. By the end, you have a grounded sense of how data pipelines, optimization, and inference interact to produce fluent text.

Downloads: 4 This Week

Last Update: 2026-06-02
See Project
21

SageAttention

[NeurIPS2025 Spotlight] Quantized Attention

SageAttention is an open-source optimization library designed to accelerate the attention mechanism used in transformer-based neural networks. Since attention operations are often the most computationally expensive component of modern AI models, SageAttention introduces quantization techniques that significantly reduce computational overhead while preserving model accuracy.

Downloads: 2 This Week

Last Update: 2026-03-08
See Project
22

DecryptPrompt

Summarize Prompt & LLM papers, open source data & models

...The project collects papers, technical reports, and research materials that explore prompting techniques, model architectures, and reasoning strategies used in modern AI systems. It serves as a structured knowledge base where developers and researchers can quickly find key papers about topics such as chain-of-thought reasoning, prompt optimization, reasoning frameworks, and model training techniques. The repository organizes research into thematic sections that cover different prompting methodologies and reasoning paradigms used in LLM development. Many of the resources focus on understanding how prompts influence model behavior and how prompting strategies can improve reasoning or efficiency.

Downloads: 1 This Week

Last Update: 2026-05-06
See Project
23

LLM Action

Technical principles related to large models

LLM-Action is a knowledge and tutorial repository that shares principles, techniques, and real-world experience related to large language models (LLMs), focusing on LLM engineering, deployment, optimization, inference, compression, and tooling. It organizes content into domains such as training, inference, compression, alignment, evaluation, pipelines, and applications. The repository includes sections covering infrastructure, engineering, and deployment; templates, sample code, and resource links; and articles and code on LLM compression (quantization, pruning).

Downloads: 0 This Week

Last Update: 2026-05-25
See Project
24

VibeThinker

Diversity-driven optimization and large-model reasoning ability

VibeThinker is a compact but high-capability open-source language model released by WeiboAI (Sina AI Lab). It contains about 1.5 billion parameters, far smaller than many “frontier” models, yet it is explicitly optimized for reasoning, mathematics, and code generation tasks rather than general open-domain chat. The innovation lies in its training methodology: the team uses what they call the Spectrum-to-Signal Principle (SSP), where a first stage emphasizes diversity of reasoning paths (the...

Downloads: 6 This Week

Last Update: 4 days ago
See Project
25

Google Workspace MCP Server

Control Gmail, Google Calendar, Docs, Sheets, Slides, Chat, Forms

Google Workspace MCP is an open-source server that connects AI assistants to Google Workspace services through the Model Context Protocol (MCP), allowing large language models to interact directly with productivity tools. The project exposes a wide set of Google services including Gmail, Google Drive, Docs, Sheets, Slides, Calendar, Chat, and other Workspace components as structured tools that an AI system can call programmatically. By acting as a bridge between AI clients and the Google...

Downloads: 3 This Week

Last Update: 3 days ago
See Project
