Showing 116 open source projects for "token"

  • 1
    DeepSeek-V3

    Powerful AI language model (MoE) optimized for efficiency/performance

    DeepSeek-V3 is a robust Mixture-of-Experts (MoE) language model developed by DeepSeek, featuring a total of 671 billion parameters, with 37 billion activated per token. It employs Multi-head Latent Attention (MLA) and the DeepSeekMoE architecture to enhance computational efficiency. The model introduces an auxiliary-loss-free load balancing strategy and a multi-token prediction training objective to boost performance. Trained on 14.8 trillion diverse, high-quality tokens, DeepSeek-V3 underwent supervised fine-tuning and reinforcement learning to fully realize its capabilities. ...
    Downloads: 64 This Week
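    The phrase "37 billion activated per token" reflects sparse expert routing: each token is sent to only a few experts. A minimal sketch of generic top-k routing follows. It is not DeepSeek-V3's actual gating code; the function names, the softmax over selected scores, and the toy experts are assumptions for illustration.

```python
import math

def top_k_route(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Indices of the k largest gate scores (ties keep their original order).
    top = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the weights sum to 1.
    peak = max(gate_scores[i] for i in top)
    exps = [math.exp(gate_scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, gate_scores, k=2):
    """Combine the outputs of the selected experts; the others are never evaluated."""
    return sum(weight * experts[i](x) for i, weight in top_k_route(gate_scores, k))

# Toy experts (hypothetical): plain functions standing in for feed-forward blocks.
experts = [lambda x: x, lambda x: 2 * x, lambda x: 0.0, lambda x: 4 * x]
output = moe_forward(3.0, experts, gate_scores=[0.1, 2.0, -1.0, 2.0], k=2)
```

    Only the selected experts run, which is why the active parameter count per token can be far smaller than the total.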
  • 2
    TokenCost

    Easy token price estimates for 400+ LLMs. TokenOps

    TokenCost is an open-source developer utility designed to estimate the cost of using large language model APIs by calculating token usage and translating it into real monetary values. The tool focuses on helping developers understand how much their prompts and generated completions cost when interacting with commercial AI models. It works by counting tokens in prompts and responses before or after sending requests and then applying pricing information associated with different models. ...
    Downloads: 3 This Week
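    The cost arithmetic described above (token counts multiplied by per-model prices) can be sketched as follows. This does not use TokenCost's own API; the model name, the prices, and `estimate_cost` are hypothetical placeholders.

```python
from decimal import Decimal

# Hypothetical prices in USD per one million tokens. Real prices differ by
# provider and change over time.
PRICES_PER_MILLION = {
    "example-model": {"input": Decimal("3.00"), "output": Decimal("15.00")},
}

def estimate_cost(model, prompt_tokens, completion_tokens):
    """Return the estimated request cost in USD as a Decimal."""
    price = PRICES_PER_MILLION[model]
    total = prompt_tokens * price["input"] + completion_tokens * price["output"]
    return total / Decimal(1_000_000)

cost = estimate_cost("example-model", prompt_tokens=1000, completion_tokens=500)
```

    Decimal is used instead of float so that small per-token prices do not accumulate rounding error.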
  • 3
    FastVLM

    This repository contains the official implementation of FastVLM

    ...Apple’s research brief frames FastVLM as targeting real-time or latency-sensitive scenarios, where lowering visual token pressure is critical to interactive UX. In short, it’s a practical recipe to make VLMs fast without exotic token-selection heuristics.
    Downloads: 0 This Week
  • 4
    OpenAI Privacy Filter

    Bidirectional token-classification model for identifiable info

    OpenAI Privacy Filter is an open-weight machine learning model designed to detect and mask personally identifiable information in text with high efficiency and contextual awareness. It operates as a bidirectional token classification system that labels sensitive data in a single forward pass rather than generating text sequentially, enabling fast processing for large datasets. The model supports long-context inputs, allowing it to analyze extensive documents without chunking, which improves consistency in redaction tasks. It can run locally on standard hardware, ensuring that sensitive information never leaves the user’s environment and supporting privacy-first workflows. ...
    Downloads: 2 This Week
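    Once a token classifier has labelled every token, redaction reduces to a single pass over the labels. The sketch below assumes per-token labels are already available; the label names ("O" for non-sensitive, "NAME", "PHONE") and the `mask_tokens` helper are hypothetical and do not come from the model's documentation.

```python
def mask_tokens(tokens, labels, placeholder="[REDACTED]"):
    """Replace each run of sensitive tokens with a single placeholder."""
    masked = []
    previous_sensitive = False
    for token, label in zip(tokens, labels):
        sensitive = label != "O"  # "O" marks a non-sensitive token (assumed scheme)
        if not sensitive:
            masked.append(token)
        elif not previous_sensitive:
            # Emit one placeholder at the start of each sensitive run.
            masked.append(placeholder)
        previous_sensitive = sensitive
    return masked

redacted = mask_tokens(
    ["Call", "Jane", "Doe", "at", "555-0100"],
    ["O", "NAME", "NAME", "O", "PHONE"],
)
```

    Adjacent sensitive tokens collapse into one placeholder even when their labels differ, which is a simplification of this sketch.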
  • 5
    OpenSquilla

    Token-Efficient AI Agent with same budget, higher intelligence density

    OpenSquilla is a token-efficient microkernel AI agent runtime designed for CLI, web UI, and chat-based workflows. It routes each turn through a shared loop that can select lower-cost models when appropriate while preserving tool dispatch, retries, memory, and decision logging. The project supports multiple LLM providers through a pluggable provider layer, making it adaptable to different model ecosystems.
    Downloads: 5 This Week
  • 6
    Claude Cognitive

    Persistent context and multi-instance coordination

    ...It introduces an attention-based context router that prioritizes files and content relevant to the current development discussion — tagging them as HOT, WARM, or COLD based on recency and keyword activation — so Claude Code doesn’t waste token budget rereading irrelevant code. This context routing dramatically reduces redundant token usage and accelerates large codebase interactions by focusing only on what truly matters to the current task. Additionally, Claude-Cognitive includes a pool coordinator to share state across multiple Claude Code instances, preserving what’s been learned or completed and preventing repetitive debugging or redundant exploration.
    Downloads: 5 This Week
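    The HOT/WARM/COLD tagging described above can be illustrated with a small classifier that combines recency and keyword activation. This is a sketch under assumptions, not the project's router: the thresholds, the turn-based notion of recency, and the `classify` function are all hypothetical.

```python
def classify(keywords, last_touched_turn, current_turn, message,
             hot_turns=3, warm_turns=10):
    """Tag a file as HOT, WARM, or COLD for the current conversation turn."""
    age = current_turn - last_touched_turn
    # Keyword activation: any of the file's keywords appears in the message.
    activated = any(kw.lower() in message.lower() for kw in keywords)
    if activated or age <= hot_turns:
        return "HOT"
    if age <= warm_turns:
        return "WARM"
    return "COLD"

tags = {
    "auth.py": classify(["auth"], 0, 50, "fix the Auth bug"),
    "db.py": classify(["db"], 45, 50, "fix the Auth bug"),
    "legacy.py": classify(["legacy"], 0, 50, "fix the Auth bug"),
}
```

    A router built on such tags might load HOT files in full, summarize WARM files, and omit COLD ones.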
  • 7
    Claw Compactor

    14-stage Fusion Pipeline for LLM token compression

    ...It addresses the challenge of finite context windows in language models by compressing or summarizing historical interactions while preserving essential information. The system works by transforming older conversation data into condensed representations that maintain continuity without exceeding token limits. This approach allows long-running agent sessions to continue operating efficiently without losing critical context. It is especially useful in autonomous workflows where agents accumulate large volumes of interaction history over time. The project aligns with broader strategies in AI systems that balance memory retention with computational constraints. ...
    Downloads: 8 This Week
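    The general idea of condensing older history to stay within a token limit can be sketched as below. This is a single-step illustration, not the project's 14-stage pipeline; the word-based `count_tokens`, the `compact_history` function, and the caller-supplied `summarize` callback are assumptions.

```python
def count_tokens(text):
    # Crude stand-in for a real tokenizer: whitespace-separated words.
    return len(text.split())

def compact_history(messages, budget, summarize):
    """Keep the newest messages verbatim within `budget` tokens and
    replace everything older with one summary message."""
    kept = []
    used = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    older = messages[: len(messages) - len(kept)]
    if older:
        # The summary's own token cost is not counted against the budget here.
        return [summarize(older)] + kept
    return kept

history = ["a b c", "d e", "f g h i", "j"]
compacted = compact_history(
    history, budget=5, summarize=lambda msgs: f"[summary of {len(msgs)} messages]"
)
```

    In practice `summarize` would call a model or a rule-based compressor; here it is a placeholder.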
  • 8
    caveman

    Why use many token when few token do trick

    Caveman is a lightweight and experimental project focused on simplifying backend or full-stack development workflows through minimalistic abstractions and rapid prototyping principles. It is designed to reduce the complexity of modern frameworks by offering a stripped-down approach that prioritizes speed, clarity, and ease of use. The project often serves as a foundation for developers who want to build applications quickly without being constrained by heavy conventions or extensive...
    Downloads: 22 This Week
  • 9
    Phenaki - Pytorch

    Implementation of Phenaki Video, which uses Mask GIT

    Implementation of Phenaki Video, which uses Mask GIT to produce text-guided videos of up to 2 minutes in length, in Pytorch. It will also combine another technique involving a token critic for potentially even better generations. A new paper suggests that instead of relying on the predicted probabilities of each token as a measure of confidence, one can train an extra critic to decide what to iteratively mask during sampling. This repository will also endeavor to allow the researcher to train on text-to-image and then text-to-video. ...
    Downloads: 3 This Week
  • 10
    Claude Code Usage Monitor

    Real-time Claude Code usage monitor with predictions and warnings

    Claude Code Usage Monitor is a developer-focused terminal tool that provides real-time visibility into Claude Code token consumption and session behavior. The project is designed to help users avoid unexpectedly hitting usage caps by continuously tracking token burn rate, message counts, and estimated costs during active sessions. It presents analytics through a visually rich terminal interface built with modern Python tooling, making it easy to interpret usage trends at a glance. ...
    Downloads: 3 This Week
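    Tracking a burn rate and predicting when a cap will be hit comes down to simple arithmetic over usage samples. The sketch below is not the tool's implementation; the sample format, the linear extrapolation, and both function names are assumptions.

```python
def burn_rate(samples):
    """Average tokens per minute between the first and last sample.

    `samples` is a time-ordered list of (minutes_elapsed, cumulative_tokens).
    """
    (start_time, start_tokens), (end_time, end_tokens) = samples[0], samples[-1]
    if end_time <= start_time:
        return 0.0
    return (end_tokens - start_tokens) / (end_time - start_time)

def minutes_until_cap(samples, cap):
    """Linear estimate of minutes left before `cap` tokens are consumed.

    Returns 0.0 if the cap is already reached and None if usage is flat.
    """
    remaining = cap - samples[-1][1]
    if remaining <= 0:
        return 0.0
    rate = burn_rate(samples)
    if rate <= 0:
        return None
    return remaining / rate

session = [(0, 0), (10, 5000), (20, 12000)]
eta = minutes_until_cap(session, cap=30000)
```

    A warning could be raised when the estimate drops below the time remaining in the session window.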
  • 11
    Hermes Web UI

    The best way to use Hermes Agent from the web or from your phone

    ...It is built using simple technologies like Python and vanilla JavaScript, avoiding complex frontend frameworks. The UI supports real-time interaction, context tracking, and visualization of token usage. It connects to a self-hosted agent that continuously learns and evolves over time. The project emphasizes usability, accessibility, and seamless integration with existing workflows.
    Downloads: 24 This Week
  • 12
    Step-Audio-EditX

    LLM-based Reinforcement Learning audio edit model

    ...Rather than treating audio editing as low-level waveform manipulation, this model converts speech into a sequence of discrete “audio tokens” (via a dual-codebook tokenizer) — combining a linguistic token stream and a semantic (prosody/emotion/style) token stream — thereby abstracting audio editing into high-level token operations. This allows users to modify not only what is said (the text) but also how it's said: emotion, tone, speaking style, prosody, accent, even paralinguistic cues. Because the model is trained with a “large-margin learning” objective over many synthesized and natural speech samples, it gains robust control over expressive attributes, and can perform iterative editing: e.g. you could record a line, then ask the model to “make it sadder,” “speak slower,” or “change accent to X.”
    Downloads: 0 This Week
  • 13
    DeepSeek R1

    Open-source, high-performance AI model with advanced reasoning

    ...DeepSeek R1 offers unrestricted access for both commercial and academic use. The model employs a Mixture of Experts (MoE) architecture, comprising 671 billion total parameters with 37 billion active parameters per token, and supports a context length of up to 128,000 tokens. DeepSeek-R1's training centers on large-scale reinforcement learning (RL), which its precursor DeepSeek-R1-Zero applied without a preliminary supervised fine-tuning step, enabling the model to develop advanced reasoning capabilities. This approach has resulted in performance comparable to leading models like OpenAI's o1, while maintaining cost-efficiency. ...
    Downloads: 98 This Week
  • 14
    Gitingest

    Create prompt-friendly codebase digests from any Git repository URL

    ...In addition to producing the code digest, Gitingest also calculates statistics about the extracted content such as repository structure, total size of the extract, and token count. Gitingest can be used as a command line utility or integrated directly into Python applications.
    Downloads: 7 This Week
  • 15
    OpenSpace

    OpenSpace: Make Your Agents: Smarter, Low-Cost, Self-Evolving

    ...The platform emphasizes collective intelligence, enabling multiple agents to share learned behaviors and benefit from each other’s experiences. It also focuses on cost efficiency by reducing redundant computations and reusing successful workflows, significantly lowering token usage in repeated tasks. The framework includes monitoring and evaluation mechanisms to track skill performance and ensure reliability as systems evolve. It supports integration with various agent platforms, making it flexible and extensible across different environments.
    Downloads: 3 This Week
  • 16
    Tiktoken

    tiktoken is a fast BPE tokeniser for use with OpenAI's models

    tiktoken is a high-performance tokenizer library (based on byte-pair encoding, BPE) designed for use with OpenAI’s models. It encodes text to token IDs and decodes token IDs back to text efficiently, with minimal overhead. Because tokenization is a fundamental step in preparing text for models, tiktoken is optimized for speed, memory, and correctness in model contexts (e.g. matching OpenAI’s internal tokenization). The repo supports multiple encodings (e.g. “cl100k_base”) and lets users switch encoding names to match different model contexts. ...
    Downloads: 5 This Week
  • 17
    Headroom

    Compress tool outputs, logs, files, and RAG chunks

    Headroom is a context optimization layer for LLM applications that compresses information before it reaches the model. It sits between an application and an LLM provider, intercepting requests and forwarding a shorter optimized prompt. The project is designed to reduce token usage while preserving the answer quality needed for agent workflows. It can compress tool outputs, logs, RAG chunks, files, and conversation history. Headroom can be used as a transparent proxy, a Python function, a TypeScript SDK, or through integrations with frameworks such as LangChain and LiteLLM. It is useful for teams building AI agents, research tools, or LLM products where context size, cost, and latency matter.
    Downloads: 0 This Week
  • 18
    MiniOneRec

    Minimal reproduction of OneRec

    ...The framework provides an end-to-end pipeline for building generative recommender systems, including semantic identifier construction, supervised fine-tuning, and reinforcement learning-based optimization. Semantic IDs are created using techniques such as quantized variational autoencoders to convert item features into token sequences that can be modeled by transformer architectures. Developers can train and evaluate recommendation models using different backbone language models while benefiting from the generative framework’s parameter efficiency and scalability.
    Downloads: 1 This Week
  • 19
    Claude Code Bridge

    Real-time multi-AI collaboration: Claude, Codex & Gemini

    ...The system allows developers to coordinate interactions between models such as Claude, Codex, and Gemini so that they can work together on programming tasks. By maintaining persistent shared context between these models, the tool reduces redundant prompts and minimizes token usage while allowing each AI system to contribute specialized capabilities. The architecture functions as a unified launcher that manages communication between multiple AI providers and coordinates their responses within the same development session. Developers can run the tool in terminal environments and integrate it with terminal multiplexers such as tmux or advanced terminal emulators.
    Downloads: 2 This Week
  • 20
    MCP Text Editor

    Provides line-oriented text file editing capabilities

    The MCP Text Editor Server provides line-oriented text file editing capabilities through a standardized API, optimized for integration with Large Language Models (LLMs). It enables efficient partial file access, minimizing token usage while ensuring safe concurrent editing.
    Downloads: 6 This Week
  • 21
    InfiAgent

    Build your own Cowork, AI Scientist and other SoTA Agents

    ...Designed as a “Multi-Level Agent” (MLA) system, it externalizes persistent state to the file system so that agents can operate over unlimited runtime without the need for token-intensive context compression, enabling workflows such as research paper drafting, experiments, coding, and document generation to run reliably. The framework uses a serial multi-agent hierarchy where specialized agents coordinate in tree-structured paths for clear task delegation and minimal tool conflicts, while batch file operations and persistent workspaces ensure reproducibility and traceability. ...
    Downloads: 2 This Week
  • 22
    HunyuanImage-3.0

    A Powerful Native Multimodal Model for Image Generation

    ...It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter counts without linear inference cost explosion. The model is intended to be competitive with closed-source image generation systems, aiming for high fidelity, prompt adherence, fine detail, and even “world knowledge” reasoning (i.e. leveraging context, semantics, or common sense in generation). The GitHub repo includes code, scripts, model loading instructions, inference utilities, prompt handling, and integration with standard ML tooling (e.g. ...
    Downloads: 2 This Week
  • 23
    Text Generation Inference

    Large Language Model Text Generation Inference

    Text Generation Inference is a high-performance inference server for text generation models, optimized for Hugging Face's Transformers. It is designed to serve large language models efficiently with optimizations for performance and scalability.
    Downloads: 8 This Week
  • 24
    RecursiveMAS

    Official Implementation for "Recursive Multi-Agent Systems"

    ...It also incorporates an inner–outer loop training approach that optimizes the entire system collectively rather than tuning each agent separately. This design improves efficiency, reduces token usage, and stabilizes learning during iterative reasoning.
    Downloads: 0 This Week
  • 25
    MoBA

    MoBA: Mixture of Block Attention for Long-Context LLMs

    MoBA, short for Mixture of Block Attention, is an open-source research implementation of a novel attention mechanism designed to improve the efficiency of large language models processing extremely long contexts. The architecture adapts ideas from Mixture-of-Experts networks and applies them directly to the attention mechanism of transformer models. Instead of forcing each token to attend to every other token in the sequence, MoBA divides the context into blocks and dynamically routes queries to only the most relevant segments of information. This routing strategy reduces the computational cost associated with traditional attention while preserving performance on reasoning and long-context tasks. The approach allows language models to scale to significantly longer input contexts without the quadratic computational cost normally associated with transformer attention mechanisms.
    Downloads: 0 This Week
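    The block routing described above can be illustrated in a few lines: split the keys into blocks, score each block against the query, and attend only to the best-scoring blocks. This sketch assumes each block is summarized by the mean of its keys and scored by a dot product; it omits causal masking and the attention computation itself, and `select_blocks` is a hypothetical name, not the repository's API.

```python
def select_blocks(query, keys, block_size, top_k):
    """Return the indices of the top_k key blocks most relevant to `query`."""
    blocks = [keys[i:i + block_size] for i in range(0, len(keys), block_size)]
    scores = []
    for block in blocks:
        dim = len(block[0])
        # Summarize the block by the mean of its key vectors.
        centroid = [sum(key[d] for key in block) / len(block) for d in range(dim)]
        # Relevance of the block to the query: dot product with the centroid.
        scores.append(sum(q * c for q, c in zip(query, centroid)))
    ranked = sorted(range(len(blocks)), key=lambda b: scores[b], reverse=True)
    return sorted(ranked[:top_k])

keys = [[1, 0], [1, 0], [0, 1], [0, 1], [-1, 0], [-1, 0]]
chosen = select_blocks([1, 0.5], keys, block_size=2, top_k=2)
```

    The query then attends only to tokens inside the chosen blocks, so the work per query grows with the number of selected blocks rather than with the full sequence length.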