routing free download

Showing 27 open source projects for "routing"

View related business solutions

Artificial Intelligence Python Clear Filters & Widen Search

Gemini 3 and 200+ AI Models on One Platform
Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.

Start Free
Auth0 B2B Essentials: SSO, MFA, and RBAC Built In
Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.

Sign Up Free
1

caveman

Why use many token when few token do trick

...The project often serves as a foundation for developers who want to build applications quickly without being constrained by heavy conventions or extensive configuration. It may include utilities for routing, state handling, or simple server logic, depending on its implementation scope. Caveman embraces a philosophy of “less is more,” encouraging developers to focus on core functionality rather than framework overhead. Its design makes it particularly useful for experimentation, small tools, or proof-of-concept applications.

Downloads: 11 This Week

Last Update: 1 day ago
See Project
2

OpenMythos

A theoretical reconstruction of the Claude Mythos architecture

...It divides computation into three main stages, including a pre-processing phase, a looped recurrent reasoning block, and a final output refinement stage, creating a structured pipeline for inference. The architecture incorporates advanced techniques such as mixture-of-experts routing, adaptive computation time, and multiple attention mechanisms to dynamically allocate compute where needed. It is highly configurable through a centralized configuration system, allowing experimentation with different architectural parameters such as loop depth, attention type.

Downloads: 17 This Week

Last Update: 2026-04-27
See Project
3

Claude Cognitive

Persistent context and multi-instance coordination

...It introduces an attention-based context router that prioritizes files and content relevant to the current development discussion — tagging them as HOT, WARM, or COLD based on recency and keyword activation — so Claude Code doesn’t waste token budget rereading irrelevant code. This context routing dramatically reduces redundant token usage and accelerates large codebase interactions by focusing only on what truly matters to the current task. Additionally, Claude-Cognitive includes a pool coordinator to share state across multiple Claude Code instances, preserving what’s been learned or completed and preventing repetitive debugging or redundant exploration.

Downloads: 7 This Week

Last Update: 2026-01-28
See Project
4

NVIDIA cuOpt

GPU accelerated decision optimization

NVIDIA cuOpt is a GPU-accelerated optimization engine designed to solve complex mathematical optimization problems at large scale. It supports a range of optimization models including linear programming (LP), mixed integer linear programming (MILP), quadratic programming (QP), and vehicle routing problems (VRP). Built primarily in C++, cuOpt leverages NVIDIA GPUs to deliver near real-time solutions for optimization tasks involving millions of variables and constraints. The platform provides multiple interfaces, including C, Python, and server APIs, allowing developers to integrate optimization capabilities into applications and services. cuOpt is designed for high-performance environments and can be deployed across cloud, hybrid, or on-premise infrastructures. ...

Downloads: 3 This Week

Last Update: 2026-04-09
See Project
Try Google Cloud Risk-Free With $300 in Credit
No hidden charges. No surprise bills. Cancel anytime.

Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.

Start Free
5

OpenAI Forward

An efficient forwarding service designed for LLMs

...Its main purpose is to make model access more manageable and efficient by adding operational controls such as request rate limiting, token rate limiting, caching, logging, routing, and key management around existing LLM endpoints. The project can proxy both local and cloud-hosted language model services, which makes it useful for teams that want a single control layer regardless of whether they are using something like LocalAI or a hosted provider compatible with OpenAI-style APIs. A major emphasis of the repository is asynchronous performance, using tools such as uvicorn, aiohttp, and asyncio to support high-throughput forwarding workloads.

Downloads: 0 This Week

Last Update: 2026-03-10
See Project
6

MoBA

MoBA: Mixture of Block Attention for Long-Context LLMs

...Instead of forcing each token to attend to every other token in the sequence, MoBA divides the context into blocks and dynamically routes queries to only the most relevant segments of information. This routing strategy reduces the computational cost associated with traditional attention while preserving performance on reasoning and long-context tasks. The approach allows language models to scale to significantly longer input contexts without the quadratic computational cost normally associated with transformer attention mechanisms.

Downloads: 0 This Week

Last Update: 2026-03-06
See Project
7

DreamO

A Unified Framework for Image Customization

...Built on a diffusion-transformer (DiT) backbone, it supports a diverse set of tasks — including identity preservation, virtual “try-on” (e.g. clothing, accessories), style transfer, IP adaptation (objects/characters), and layout/condition-aware customizations — all handled within the same unified architecture. DreamO’s design introduces a feature routing constraint that helps disentangle different control conditions (like identity, style, clothing) when more than one is specified, which significantly reduces conflicts and artifacts when combining controls. It also uses a “placeholder strategy” to precisely align conditional inputs (e.g. where to place clothing or objects) in generated images, giving users fine-grained control over composition.

Downloads: 0 This Week

Last Update: 2025-12-02
See Project
8

Pruna AI

Pruna is a model optimization framework built for developers

Pruna is an open-source, self-hostable AI inference engine designed to help teams deploy and manage large language models (LLMs) efficiently across private or hybrid infrastructures. Built with performance and developer ergonomics in mind, Pruna simplifies inference workflows by enabling multi-model orchestration, autoscaling, GPU resource allocation, and compatibility with popular open-source models. It is ideal for companies or teams looking to reduce reliance on external APIs while...

Downloads: 5 This Week

Last Update: 2026-04-22
See Project
9

Semantic Router

Superfast AI decision making and processing of multi-modal data

Semantic Router is a superfast decision-making layer for your LLMs and agents. Rather than waiting for slow, unreliable LLM generations to make tool-use or safety decisions, we use the magic of semantic vector space — routing our requests using semantic meaning. Combining LLMs with deterministic rules means we can be confident that our AI systems behave as intended. Cramming agent tools into the limited context window is expensive, slow, and fundamentally limited. Semantic Router enables lightning-fast and cheap tool usage that can scale to many thousands of tools. ...

Downloads: 6 This Week

Last Update: 2025-11-18
See Project
Enterprise-grade ITSM, for every business
Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.

Try it Free
10

KServe

Standardized Serverless ML Inference Platform on Kubernetes

KServe provides a Kubernetes Custom Resource Definition for serving machine learning (ML) models on arbitrary frameworks. It aims to solve production model serving use cases by providing performant, high abstraction interfaces for common ML frameworks like Tensorflow, XGBoost, ScikitLearn, PyTorch, and ONNX. It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting edge serving features like GPU Autoscaling, Scale to Zero, and...

Downloads: 8 This Week

Last Update: 2026-04-29
See Project
11

ChatGPT Clone

ChatGPT interface with better UI

...The project is useful for prototyping assistants, documentation bots, and internal developer tools without committing to a specific vendor or UI framework. Configuration is kept simple so newcomers can get a working chat in minutes and then dial in features like authentication or multi-model routing. While it illustrates how to hook into third-party LLM endpoints, it is typically positioned as an educational, self-hosted starter that you should operate responsibly and within provider's terms of use.

Downloads: 10 This Week

Last Update: 2025-11-07
See Project
12

Cactus

Low-latency AI inference engine optimized for mobile devices

Cactus is a low-latency, energy-efficient AI inference framework designed specifically for mobile devices and wearables, enabling advanced machine learning capabilities directly on-device. It provides a full-stack architecture composed of an inference engine, a computation graph system, and highly optimized hardware kernels tailored for ARM-based processors. Cactus emphasizes efficient memory usage through techniques such as zero-copy computation graphs and quantized model formats, allowing...

Downloads: 6 This Week

Last Update: 2026-04-18
See Project
13

Every Code

Local AI coding agent CLI with multi-agent orchestration tools

Every Code (often referred to simply as Code) is a fast, local AI-powered coding agent designed to run directly in the terminal environment. It is a community-driven fork of the Codex CLI, with a strong emphasis on improving real-world developer ergonomics and workflows. Every Code enhances the traditional coding assistant model by introducing multi-agent orchestration, allowing multiple AI agents to collaborate, compare solutions, and refine outputs in parallel. It supports integration with...

Downloads: 7 This Week

Last Update: 3 days ago
See Project
14

Parallax

Parallax is a distributed model serving framework

Parallax is a decentralized inference framework designed to run large language models across distributed computing resources. Instead of relying on centralized GPU clusters in data centers, the system allows multiple heterogeneous machines to collaborate in serving AI inference workloads. Parallax divides model layers across different nodes and dynamically coordinates them to form a complete inference pipeline. A two-stage scheduling architecture determines how model layers are allocated to...

Downloads: 3 This Week

Last Update: 2026-03-09
See Project
15

AgentScope

Build and run agents you can see, understand and trust

AgentScope is a production-ready agent framework designed to help developers build, deploy, and scale intelligent agentic applications. It provides essential abstractions that evolve with advancing LLM capabilities, emphasizing reasoning, tool use, and flexible orchestration rather than rigid prompt constraints. With built-in support for ReAct agents, memory, planning, human-in-the-loop control, and real-time voice interaction, developers can create powerful agents in minutes. AgentScope...

Downloads: 6 This Week

Last Update: 2026-04-28
See Project
16

ContextForge MCP Gateway

A Model Context Protocol (MCP) Gateway & Registry

MCP Context Forge is a feature-rich gateway and registry that federates Model Context Protocol (MCP) servers and traditional REST services behind a single, governed endpoint. It exposes an MCP-compliant interface to clients while handling discovery, authentication, rate limiting, retries, and observability on the server side. The gateway scales horizontally, supports multi-cluster deployments on Kubernetes, and uses Redis for federation and caching across instances. Operators can define...

Downloads: 4 This Week

Last Update: 2026-05-01
See Project
17

Lingua-Py

The most accurate natural language detection library for Python

...This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing applications. In cases where you don't need the full-fledged functionality of those systems or don't want to learn the ropes of those, a small flexible library comes in handy.

Downloads: 0 This Week

Last Update: 2026-03-09
See Project
18

ZAPI

ZAPI by Adopt AI is an open-source Python library

ZAPI is a developer-centric API framework that streamlines building, testing, and deploying APIs with strong type safety and minimal boilerplate, helping teams deliver backend services faster with fewer errors. It emphasizes a declarative router and schema model that uses types to define request and response formats, providing clear contracts for frontend and backend teams while automatically generating documentation. Zapi abstracts many repetitive tasks such as validation, authentication...

Downloads: 0 This Week

Last Update: 2026-02-05
See Project
19

SkillForge

Ultimate meta-skill for generating best-in-class Claude Code skills

SkillForge is a systematic methodology and tooling framework for creating high-quality AI “skills” specifically optimized for Claude Code integrations, treating skill creation as an engineering discipline rather than an ad-hoc art form. It introduces a multi-phase architecture where every input or request is triaged intelligently, analyzed deeply through structured lenses, specified formally, synthesized with automated generation, and finally subjected to multi-agent review before...

Downloads: 0 This Week

Last Update: 2026-02-25
See Project
20

Build Your Own OpenClaw

A step-by-step guide to build your own AI agent

Build Your Own OpenClaw is a step-by-step educational framework that teaches developers how to construct a fully functional AI agent system from scratch, gradually evolving from a simple chat loop into a multi-agent, production-ready architecture. The project is structured into 18 progressive stages, each introducing a new concept such as tool usage, memory persistence, event-driven design, and multi-agent coordination, with each step including both explanatory documentation and runnable...

Downloads: 0 This Week

Last Update: 2026-05-04
See Project
21

MiniMax-01

Large-language-model & vision-language-model based on Linear Attention

MiniMax-01 is the official repository for two flagship models: MiniMax-Text-01, a long-context language model, and MiniMax-VL-01, a vision-language model built on top of it. MiniMax-Text-01 uses a hybrid attention architecture that blends Lightning Attention, standard softmax attention, and Mixture-of-Experts (MoE) routing to achieve both high throughput and long-context reasoning. It has 456 billion total parameters with 45.9 billion activated per token and is trained with advanced parallel strategies such as LASP+, varlen ring attention, and Expert Tensor Parallelism, enabling a training context of 1 million tokens and up to 4 million tokens at inference. ...

Downloads: 0 This Week

Last Update: 2025-12-01
See Project
22

Warlock-Studio

AI Suite for upscaling, interpolating & restoring images/videos

v6.0. Warlock-Studio is a Windows application that uses Real-ESRGAN, BSRGAN, IRCNN, GFPGAN, RealESRNet, RealESRAnime and RIFE Artificial Intelligence models to upscale, restore faces, interpolate frames and reduce noise in images and videos. the application supports GPU acceleration (including multi-GPU setups) and offers batch processing for large workloads. It includes drag-and-drop handling for single or multiple files, optional pre-resize functions, and an automatic tiling system...

Downloads: 35 This Week

Last Update: 2026-02-16
See Project
23

LLaMA-MoE

Building Mixture-of-Experts from LLaMA with Continual Pre-training

LLaMA-MoE is an open-source project that builds mixture-of-experts language models from LLaMA through expert partitioning and continual pre-training. The repository is centered on making MoE research more accessible by offering smaller and more affordable models with only about 3.0 to 3.5 billion activated parameters, which helps reduce deployment and experimentation costs. Its architecture works by splitting LLaMA feed-forward networks into sparse experts and adding gating mechanisms so...

Downloads: 4 This Week

Last Update: 2026-03-10
See Project
24

e-Dokyumento

e-Dokyumento is web-based Document Management System (DMS)

e-Dokyumento is opensource web-based Document Management System (DMS) A Document Management which automates the basic office document workflow such as receiving, filing, routing, and approving through capturing (scanning), digitizing (OCR Reading), storing, tagging, and electronically routing and approving (e-signature) of electronic documents. # Demo : https://e-dokyumento.herokuapp.com/ https://edokyu.seillig.com/ (refer to Readme.md for the accounts) #Dockerhub: https://hub.docker.com/r/nelsonmaligro/edokyumento # Install using the ISO: 1. ...

2 Reviews

Downloads: 2 This Week

Last Update: 2022-05-14
See Project
25

CapsGNN

A PyTorch implementation of "Capsule Graph Neural Network"

...Inspired by the Capsule Neural Network (CapsNet), we propose the Capsule Graph Neural Network (CapsGNN), which adopts the concept of capsules to address the weakness in existing GNN-based graph embeddings algorithms. By extracting node features in the form of capsules, routing mechanism can be utilized to capture important information at the graph level. As a result, our model generates multiple embeddings for each graph to capture graph properties from different aspects.

Downloads: 0 This Week

Last Update: 2022-08-18
See Project