Page 4 | token free download

Showing 216 open source projects for "token"


Filters: Artificial Intelligence, Linux

1

camofox-browser

Headless browser automation server for AI agents to visit sites

...The project is designed around a REST API, making it easier for agents and external tools to create tabs, navigate pages, click elements, type input, scroll, capture screenshots, and manage browsing sessions programmatically. Instead of returning large volumes of raw HTML, it emphasizes accessibility snapshots and stable element references, which reduces token usage and creates more reliable interaction flows for AI-driven browsing. It also supports practical operational features such as per-user session isolation, cookie importing for authenticated browsing, proxy and GeoIP routing.

Downloads: 16 This Week

Last Update: 2026-05-24
2

MiniMax-M1

Open-weight, large-scale hybrid-attention reasoning model

...Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to support a native context length of 1 million tokens while using far fewer FLOPs than comparable reasoning models for very long generations. The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. M1 is further trained with large-scale reinforcement learning over diverse tasks.

Downloads: 1 This Week

Last Update: 2025-12-01
3

Code-Mode

Plug-and-play library to enable agents to call MCP and UTCP tools

...The repository contains both TypeScript and Python libraries, plus a code-mode-mcp component for integrating with MCP and UTCP ecosystems. Benchmarks in the README highlight improvements in latency and token cost for scenarios involving multiple tools, showing that code execution often outperforms traditional JSON-based function calling.

Downloads: 0 This Week

Last Update: 2025-11-25
4

Context Mode

Context window optimization for AI coding agents

Context Mode is a development approach and tooling concept that enhances how AI-assisted coding environments manage and inject context into language model interactions. It focuses on improving the relevance and accuracy of AI-generated outputs by controlling what information is provided to the model at each step. The project explores structured context management, enabling developers to define how files, code snippets, and metadata are included in prompts. It is particularly useful for large...

Downloads: 10 This Week

Last Update: 2026-06-02
5

Suno AI API

Use an API to call the music generation AI of suno.ai

Suno API is an unofficial open-source interface that enables developers to programmatically interact with Suno’s AI music generation platform, allowing automated creation of songs, lyrics, and audio content through API calls. It replicates the behavior of Suno’s web-based creation tools by reverse engineering internal endpoints and exposing them through a developer-friendly interface built with Python and FastAPI. The system supports asynchronous processing, enabling efficient handling of...

Downloads: 15 This Week

Last Update: 2026-03-18
6

Starknet MCP Server

MCP server that provides LLM with tools for interacting with Starknet

Starknet MCP Server is a comprehensive Model Context Protocol (MCP) server for the Starknet blockchain. It provides AI agents with the ability to interact with Starknet networks, query blockchain data, manage wallets, and interact with smart contracts.

Downloads: 0 This Week

Last Update: 2025-11-26
7

Qwen3

Qwen3 is the large language model series developed by the Qwen team

Qwen3 is a cutting-edge large language model (LLM) series developed by the Qwen team at Alibaba Cloud. The latest updated version, Qwen3-235B-A22B-Instruct-2507, features significant improvements in instruction-following, reasoning, knowledge coverage, and long-context understanding up to 256K tokens. It delivers higher quality and more helpful text generation across multiple languages and domains, including mathematics, coding, science, and tool usage. Various quantized versions,...

1 Review

Downloads: 20 This Week

Last Update: 2026-01-09
8

LLM Gateway

Route, manage, and analyze your LLM requests across multiple providers

LLM Gateway is an open-source middleware that consolidates interactions with multiple LLM providers—such as OpenAI, Anthropic, Google Vertex AI—behind a single, unified API compatible with OpenAI's spec. Designed for both self-hosted and cloud use, it enables developers to route requests dynamically, secure and manage API keys, monitor token usage and costs, and analyze performance metrics. With optional UI, telemetry, and Docker deployment, it's ideal for teams aiming to centralize LLM orchestration and gain visibility into AI usage.

Downloads: 3 This Week

Last Update: 2025-12-18
9

Edgee

AI gateway with token compression for Claude Code, Codex, and more

Edgee is an edge-native execution platform designed to run AI-driven logic and data processing directly at the network edge, reducing latency and improving responsiveness for modern applications. It enables developers to deploy functions and workflows closer to users, allowing real-time processing without relying heavily on centralized cloud infrastructure. The platform is built to support event-driven architectures, where actions are triggered by incoming requests, user behavior, or...

Downloads: 8 This Week

Last Update: 6 days ago
10

MemPalace

The highest-scoring AI memory system ever benchmarked

MemPalace is an open-source AI memory system designed to solve one of the most persistent limitations of large language models: the loss of context between sessions. Instead of relying on summarization or selective extraction like most memory tools, it takes a radically different approach by storing conversations in their entirety and making them retrievable through structured organization and semantic search. The system is inspired by the classical “memory palace” mnemonic technique,...

Downloads: 11 This Week

Last Update: 2026-06-06
11

LangGraph

Build resilient language agents as graphs

LangGraph is a library for building stateful, multi-actor applications with LLMs, used to create agent and multi-agent workflows. Compared to other LLM frameworks, it offers these core benefits: cycles, controllability, and persistence. LangGraph allows you to define flows that involve cycles, essential for most agentic architectures, differentiating it from DAG-based solutions. As a very low-level framework, it provides fine-grained control over both the flow and state of your application,...

Downloads: 11 This Week

Last Update: 2 days ago
12

Anthropic SDK Python

Provides convenient access to the Anthropic REST API from any Python 3

The anthropic-sdk-python repository is the official Python client library for the Anthropic (Claude) REST API. It provides a user-friendly, type-safe interface, with both synchronous and asynchronous clients, for sending chat and completion requests to models like Claude. The library defines all request and response parameters as typed Python objects, handles serialization and deserialization automatically, and wraps HTTP logic (timeouts, retries, error...

Downloads: 15 This Week

Last Update: 5 days ago
13

AGiXT

AGiXT is a dynamic AI Automation Platform

AGiXT is a dynamic Artificial Intelligence Automation Platform engineered to orchestrate efficient AI instruction management and task execution across a multitude of providers. Our solution infuses adaptive memory handling with a broad spectrum of commands to enhance AI's understanding and responsiveness, leading to improved task completion. The platform's smart features, like Smart Instruct and Smart Chat, seamlessly integrate web search, planning strategies, and conversation continuity,...

Downloads: 3 This Week

Last Update: 2026-04-08
14

OpenClaw Office

OpenClaw Office is the visual monitoring and management frontend

...The platform enables real-time visualization of agent states, interactions, and workflows, making complex multi-agent coordination easier to understand and debug. Users can observe communication flows between agents through visual connections, track token usage and operational costs, and analyze performance through integrated dashboards and charts. The system also includes live chat capabilities, allowing users to monitor conversations and tool calls as they occur.

Downloads: 8 This Week

Last Update: 2026-05-10
15

uqlm

Uncertainty Quantification for Language Models (UQLM), a Python package

...The library includes both black-box and white-box approaches to uncertainty estimation. Black-box methods evaluate model outputs through multiple generations or comparative analysis, while white-box methods rely on token probabilities produced during inference. UQLM also supports ensemble strategies and model-as-judge approaches for evaluating responses. By combining multiple uncertainty metrics, the system provides more reliable indicators of when language model outputs may be unreliable.

Downloads: 7 This Week

Last Update: 6 days ago
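
The white-box and black-box ideas in the uqlm description can be illustrated with a minimal sketch. This is not the uqlm API: the function names below are hypothetical, the white-box scores assume you already have per-token log-probabilities from your model, and the black-box score is a simple exact-match agreement rate chosen for illustration.

```python
import math
from collections import Counter


def normalized_sequence_probability(token_logprobs):
    """White-box score: geometric mean of the token probabilities, in (0, 1]."""
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def min_token_probability(token_logprobs):
    """White-box score: probability of the least likely token in the response."""
    if not token_logprobs:
        raise ValueError("token_logprobs must not be empty")
    return math.exp(min(token_logprobs))


def agreement_rate(responses):
    """Black-box score: share of sampled responses equal to the most common one.

    Responses are compared after trimming whitespace and lowercasing
    (an assumption made for this sketch; real systems may compare semantically).
    """
    normalized = [r.strip().lower() for r in responses]
    if not normalized:
        raise ValueError("responses must not be empty")
    _, count = Counter(normalized).most_common(1)[0]
    return count / len(normalized)
```

Low values from either kind of score suggest the output deserves a second look. Several such scores could be averaged into one indicator, which is one simple way to read the "combining multiple uncertainty metrics" idea.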
16

VoxCPM

TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning

...This design helps decouple semantic and acoustic information while preserving fine-grained prosody, leading to more stable and expressive generation than many discrete-token systems. Trained on a large 1.8-million-hour bilingual corpus, VoxCPM can infer appropriate speaking style from context, dynamically adjusting intonation, rhythm, and emotional tone. It supports zero-shot voice cloning from a short reference audio clip, capturing timbre, accent, and pacing to closely mimic a target speaker without per-speaker fine-tuning.

Downloads: 19 This Week

Last Update: 2026-04-28
17

FastDeploy

High-performance Inference and Deployment Toolkit for LLMs and VLMs

...The platform enables developers to deploy trained models quickly using optimized inference pipelines that support GPUs, specialized AI accelerators, and other hardware architectures. FastDeploy includes advanced acceleration technologies such as speculative decoding, multi-token prediction, and efficient KV cache management to improve throughput and latency during inference. It also offers compatibility with OpenAI-style APIs and vLLM-like interfaces, allowing developers to integrate deployed models easily into existing applications and services.

Downloads: 6 This Week

Last Update: 2026-04-08
18

GPUStack

Performance-optimized AI inference on your GPUs

GPUStack is an open-source GPU cluster management platform designed to simplify the deployment and operation of artificial intelligence models across heterogeneous hardware environments. The system aggregates GPU resources from multiple machines into a unified cluster so developers and administrators can run large language models and other AI workloads efficiently across distributed infrastructure. Instead of requiring complex orchestration systems such as Kubernetes, GPUStack provides a...

Downloads: 6 This Week

Last Update: 2026-04-21
19

LLM Telegram Bot

A Telegram bot for Large Language Models

...The system supports multiple modes or personas, enabling users to switch between different conversational styles or use cases. It also allows adjustment of generation parameters such as temperature and token limits, giving users control over response behavior. The architecture is modular, making it easy to extend or adapt for different workflows or integrations.

Downloads: 2 This Week

Last Update: 2026-04-20
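
To make the two generation parameters mentioned for LLM Telegram Bot concrete, here is a rough sketch of what temperature and a token limit typically do during sampling. It is not code from the project: `logits_fn`, `generate`, and the toy logits are hypothetical stand-ins for a real model.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def generate(logits_fn, temperature, max_tokens, rng):
    """Toy sampling loop that stops once max_tokens tokens have been produced.

    logits_fn(history) is a hypothetical callback returning one logit per
    vocabulary entry; a real model would take its place.
    """
    tokens = []
    while len(tokens) < max_tokens:
        probs = softmax_with_temperature(logits_fn(tokens), temperature)
        tokens.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return tokens
```

Calling `generate(lambda history: [1.0, 2.0, 3.0], 0.7, 5, random.Random(0))` returns exactly five token ids. Raising the temperature spreads the picks more evenly across the three entries.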
20

Reins

Ollama client that simplifies experimenting with LLMs

...It provides a highly customizable chat interface where users can configure system prompts, switch models dynamically, and adjust inference parameters such as temperature, token limits, and context size on a per-conversation basis. The application is built to run across platforms including mobile and desktop environments, making it accessible for a wide range of users who want consistent control over their AI workflows. It also includes features for editing and regenerating messages, enabling iterative refinement of outputs without restarting conversations. ...

Downloads: 8 This Week

Last Update: 2026-04-21
21

OpenClaw Opik Observability Plugin

Official plugin for OpenClaw that exports agent traces to Opik

...Each time an AI agent performs an action—such as calling a large language model, invoking a tool, accessing memory, or delegating to a sub-agent—the plugin records the full interaction and sends it to Opik for analysis and visualization. This allows developers to inspect inputs, outputs, token usage, latency, and execution flow across complex multi-step agent workflows. The goal of the project is to provide transparency into the internal reasoning and operational pipeline of agent systems so developers can diagnose failures, control costs, and improve reliability.

Downloads: 8 This Week

Last Update: 2026-05-22
22

NVIDIA NeMo Agent Toolkit

Library for efficiently connecting and optimizing teams of AI agents

...The toolkit integrates with popular agent frameworks such as LangChain, LlamaIndex, CrewAI, Microsoft Semantic Kernel, and Google ADK. Developers can monitor agent execution, trace workflows, and analyze token-level performance to identify bottlenecks and improve efficiency. NeMo Agent Toolkit also supports evaluation systems, prompt optimization, and reinforcement learning techniques to enhance agent behavior over time. By combining instrumentation, workflow orchestration, and performance optimization tools, the platform helps developers deploy scalable and intelligent multi-agent systems.

Downloads: 6 This Week

Last Update: 2026-05-21
23

Generative AI JS

This SDK is now deprecated, use the new unified Google GenAI SDK

deprecated-generative-ai-js is a JavaScript/TypeScript client and example suite for interacting with Gemini generative APIs in web and Node.js environments. Though deprecated in favor of the unified Google GenAI SDK, the repo shows how to wrap HTTP/WS endpoints, manage streaming responses, and interoperate with browser UI or server logic. The examples include chat widgets, prompt pipelines, and generalized inference utilities. It also deals with streaming cancellation, retries, backoff...

Downloads: 6 This Week

Last Update: 2025-10-06
24

SageMaker Hugging Face Inference Toolkit

Library for serving Transformers models on Amazon SageMaker

SageMaker Hugging Face Inference Toolkit is an open-source library for serving Transformers models on Amazon SageMaker. This library provides default pre-processing, predict and postprocessing for certain Transformers models and tasks. It utilizes the SageMaker Inference Toolkit for starting up the model server, which is responsible for handling inference requests. For the Dockerfiles used for building SageMaker Hugging Face Containers, see AWS Deep Learning Containers. The SageMaker Hugging...

Downloads: 6 This Week

Last Update: 2026-03-17
25

VideoRAG

VideoRAG: Chat with Your Videos

...When a user query is received, VideoRAG locates semantically relevant moments in the video using the embedding index, retrieves associated clips or transcripts, and feeds them to a generative model to produce accurate, grounded answers or summaries. This approach allows it to handle videos of arbitrary length without requiring the entire content to be passed into the model at once, overcoming token limits and enabling detailed, context-aware interaction.

Downloads: 1 This Week

Last Update: 2026-03-18
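
The retrieval flow described for VideoRAG (embed the query, find the closest clips in an index, pass only those to the model) can be sketched as follows. This is an illustrative approximation, not VideoRAG's implementation: the index layout, the function names, and the use of plain cosine similarity over small vectors are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_top_k(query_vec, clip_index, k=2):
    """Return the k clips whose embeddings are most similar to the query.

    clip_index is a hypothetical list of (clip_id, embedding, transcript) tuples.
    """
    ranked = sorted(
        clip_index,
        key=lambda clip: cosine_similarity(query_vec, clip[1]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, clips):
    """Assemble a prompt containing only the retrieved transcripts, not the whole video."""
    context = "\n".join(f"[{clip_id}] {text}" for clip_id, _, text in clips)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

Because only the top-k transcripts reach the prompt, its size depends on `k` rather than on the video's length, which is how this kind of pipeline stays within a model's token limit.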