Showing 34 open source projects for "fast linux"

  • 1
    Fast MCP

    A Ruby Implementation of the Model Context Protocol

    Fast MCP is a lightweight framework designed to simplify the development and deployment of servers that implement the Model Context Protocol. The Model Context Protocol enables AI assistants and applications to connect with external tools, services, and data sources through a standardized interface. Fast-mcp provides developers with a streamlined toolkit for building MCP servers that expose application functionality to AI agents. The framework focuses on ease of use, allowing developers to...
    Downloads: 0 This Week
  • 2
    llama.cpp

    Port of Facebook's LLaMA model in C/C++

    The llama.cpp project enables the inference of Meta's LLaMA model (and other models) in pure C/C++ without requiring a Python runtime. It is designed for efficient and fast model execution, offering easy integration for applications needing LLM-based capabilities. The repository focuses on providing a highly optimized and portable implementation for running large language models directly within C/C++ environments.
    Downloads: 152 This Week
  • 3
    vLLM

    A high-throughput and memory-efficient inference and serving engine

    vLLM is a fast and easy-to-use library for LLM inference and serving. It provides high-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.
    Downloads: 43 This Week
  • 4
    Flowise

    Drag & drop UI to build your customized LLM flow

    Open source visual UI tool to build your customized LLM flow using LangchainJS, written in Node TypeScript/JavaScript. Conversational agent for a chat model which utilizes chat-specific prompts and buffer memory. Open source is the core of Flowise, and it will always be free for commercial and personal usage. Flowise supports different environment variables to configure your instance. You can specify these variables in the .env file inside the packages/server folder.
    Downloads: 40 This Week
  • 5
    Extractous

    Fast and efficient unstructured data extraction

    Extractous is a Rust-based unstructured data extraction library focused on fast local parsing of documents and other content-heavy files. Its purpose is to extract text and metadata efficiently from formats such as PDF, Word, HTML, email archives, images, and more, without depending on external APIs or separate parsing servers. The project emphasizes performance and low memory usage, and its maintainers describe it as a local-first alternative to heavier extraction stacks. For broader format...
    Downloads: 0 This Week
  • 6
    RWKV Runner

    A RWKV management and startup tool, full automation, only 8MB

    RWKV (pronounced as RwaKuv) is an RNN with GPT-level LLM performance, which can also be directly trained like a GPT transformer (parallelizable). So it combines the best of RNN and transformer: great performance, fast inference, fast training, VRAM savings, "infinite" ctxlen, and free text embedding. Moreover, it is 100% attention-free. The default configs enable custom CUDA kernel acceleration, which is much faster and consumes much less VRAM. If you encounter possible compatibility...
    Downloads: 3 This Week
  • 7
    GLM-4.5

    GLM-4.5: Open-source LLM for intelligent agents by Z.ai

    GLM-4.5 is a cutting-edge open-source large language model designed by Z.ai for intelligent agent applications. The flagship GLM-4.5 model has 355 billion total parameters with 32 billion active parameters, while the compact GLM-4.5-Air version offers 106 billion total parameters and 12 billion active parameters. Both models unify reasoning, coding, and intelligent agent capabilities, providing two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for...
    Downloads: 76 This Week
  • 8
    mistral.rs

    Fast, flexible LLM inference

    mistral.rs is a fast and flexible LLM inference engine implemented in Rust, designed to run and serve modern language models with an emphasis on performance and practical deployment. It provides multiple entry points for developers, including a CLI for running models locally and an HTTP server that exposes an OpenAI-compatible API surface for easy integration with existing clients. The project includes hardware-aware tooling that can benchmark a system and choose sensible quantization and...
    Downloads: 1 This Week
  • 9
    Rocketnotes

    AI-powered markdown editor - leverage LLMs with your documents

    RocketNotes is an open-source note-taking application designed to combine traditional knowledge management with artificial intelligence features that enhance how users capture and organize information. The project focuses on providing a fast, lightweight environment where users can create structured notes, manage personal knowledge bases, and interact with AI tools to summarize or expand their content. Instead of functioning purely as a document editor, RocketNotes integrates AI capabilities...
    Downloads: 0 This Week
  • 10
    trench

    Open-Source Analytics Infrastructure

    Trench is an open-source analytics infrastructure designed for tracking events and performing real-time analysis of application data at scale. The system is built on top of high-performance data technologies including Apache Kafka and ClickHouse, which allows it to ingest and process very large volumes of events while maintaining fast query performance. It was originally developed to solve scaling challenges in product analytics systems where traditional relational databases become...
    Downloads: 0 This Week
  • 11
    Zep

    Zep: A long-term memory store for LLM / Chatbot applications

    Easily add relevant documents, chat history memory & rich user data to your LLM app's prompts. Understands chat messages, roles, and user metadata, not just texts and embeddings. Zep Memory and VectorStore implementations are shipped with your favorite frameworks: LangChain, LangChain.js, LlamaIndex, and more. Automatically embed texts and messages using state-of-the-art open source models, OpenAI, or bring your own vectors. Zep’s local embedding models and async enrichment ensure a snappy...
    Downloads: 2 This Week
  • 12
    AReal

    Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible

    AReaL is an open source, fully asynchronous reinforcement learning training system designed for large reasoning and agentic models. It works with models that reason over multiple steps and with agents that interact with environments. It is developed by the AReaL Team at Ant Group (inclusionAI) and builds upon the ReaLHF project. The project releases training details, datasets, and models for reproducibility. It is intended to facilitate reproducible RL training on reasoning / agentic tasks,...
    Downloads: 2 This Week
  • 13
    Nano-vLLM

    A lightweight vLLM implementation built from scratch

    Nano-vLLM is a lightweight implementation of the vLLM inference engine designed to run large language models efficiently while maintaining a minimal and readable codebase. The project recreates the core functionality of vLLM in a simplified architecture written in approximately a thousand lines of Python, making it easier for developers and researchers to understand how modern LLM inference systems work. Despite its compact design, nano-vllm incorporates advanced optimization techniques such...
    Downloads: 2 This Week
  • 14
    rwkv.cpp

    INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model

    Besides the usual FP32, rwkv.cpp supports FP16 and quantized INT4, INT5, and INT8 inference. This project is focused on CPU, but cuBLAS is also supported. RWKV is a novel large language model architecture, with the largest model in the family having 14B parameters. In contrast to a Transformer with O(n^2) attention, RWKV requires only the state from the previous step to calculate logits. This makes RWKV very CPU-friendly at large context lengths.
    Downloads: 0 This Week
  • 15
    spacy-llm

    Integrating LLMs into structured NLP pipelines

    Large Language Models (LLMs) feature powerful natural language understanding capabilities. With only a few (and sometimes no) examples, an LLM can be prompted to perform custom NLP tasks such as text categorization, named entity recognition, coreference resolution, information extraction and more. This package integrates Large Language Models (LLMs) into spaCy, featuring a modular system for fast prototyping and prompting, and turning unstructured responses into robust outputs for various...
    Downloads: 0 This Week
  • 16
    mllm

    Fast Multimodal LLM on Mobile Devices

    mllm is an open-source inference engine designed to run multimodal large language models efficiently on mobile devices and edge computing environments. The framework focuses on delivering high-performance AI inference in resource-constrained systems such as smartphones, embedded hardware, and lightweight computing platforms. Implemented primarily in C and C++, it is designed to operate with minimal external dependencies while taking advantage of hardware-specific acceleration technologies...
    Downloads: 1 This Week
  • 17
    Paper2Slides

    From Paper to Presentation in One Click

    Paper2Slides is an automation tool that converts research papers, reports, and other documents into polished slide decks and posters with minimal manual effort. It is designed to replace the repetitive work of turning dense technical documents into presentation-friendly structure by extracting key points, figures, and data into a coherent visual narrative. The system supports multiple input formats, so you can process PDFs and common office documents rather than being locked to a single file...
    Downloads: 1 This Week
  • 18
    llama.vim

    Vim plugin for LLM-assisted code/text completion

    llama.vim is a lightweight Vim plugin that integrates large language model capabilities directly into the Vim text editor. The plugin enables developers to access AI-assisted text and code completion features without leaving their terminal-based development environment. Instead of relying on remote AI services, the plugin is designed to work with locally running LLM inference engines such as llama.cpp. This approach allows developers to benefit from AI-assisted coding features while...
    Downloads: 0 This Week
  • 19
    HN Time Capsule

    Analyzing Hacker News discussions from a decade ago in hindsight

    HN Time Capsule is a creative and nostalgic project that captures and preserves snapshots of Hacker News content over time, providing a historical look at how topics, discussions, and popular threads have evolved. Rather than functioning like a live aggregator, it stores periodic captures of posts and comments, creating a time capsule that lets researchers, enthusiasts, and historians trace changes in sentiment, technology trends, and community priorities across different eras of the Hacker...
    Downloads: 0 This Week
  • 20
    Engram

    A New Axis of Sparsity for Large Language Models

    Engram is a high-performance embedding and similarity search library focused on making retrieval-augmented workflows efficient, scalable, and easy to adopt by developers building search, recommendation, or semantic matching systems. It provides utilities to generate embeddings from text or other structured data, index them using efficient approximate nearest neighbor algorithms, and perform real-time similarity queries even on large corpora. Engineered with speed and memory efficiency in...
    Downloads: 0 This Week
  • 21
    NeMo Curator

    Scalable data preprocessing and curation toolkit for LLMs

    NeMo Curator is a Python library specifically designed for fast and scalable dataset preparation and curation for large language model (LLM) use cases such as foundation model pretraining, domain-adaptive pretraining (DAPT), supervised fine-tuning (SFT), and parameter-efficient fine-tuning (PEFT). It greatly accelerates data curation by leveraging GPUs with Dask and RAPIDS, resulting in significant time savings. The library provides a customizable and modular interface, simplifying pipeline...
    Downloads: 0 This Week
  • 22
    webclaw

    Fast, local-first web content extraction for LLMs

    webclaw is a high-performance web content extraction tool designed specifically for AI agents and large language models, focusing on delivering clean, structured data instead of raw HTML. It is built in Rust and operates without a headless browser, using advanced techniques such as TLS fingerprinting to bypass common scraping barriers and mimic real browser behavior. The tool addresses a major inefficiency in AI workflows by removing irrelevant elements like navigation menus, ads, and...
    Downloads: 0 This Week
  • 23
    OmAgent

    Build multimodal language agents for fast prototype and production

    OmAgent is an open-source Python framework designed to simplify the development of multimodal language agents that can reason, plan, and interact with different types of data sources. The framework provides abstractions and infrastructure for building AI agents that operate on text, images, video, and audio while maintaining a relatively simple interface for developers. Instead of forcing developers to implement complex orchestration logic manually, the system manages task scheduling, worker...
    Downloads: 0 This Week
  • 24
    LightLLM

    LightLLM is a Python-based LLM (Large Language Model) inference

    LightLLM is a high-performance inference and serving framework designed specifically for large language models, focusing on lightweight architecture, scalability, and efficient deployment. The framework enables developers to run and serve modern language models with significantly improved speed and resource efficiency compared to many traditional inference systems. Built primarily in Python, the project integrates optimization techniques and ideas from several leading open-source...
    Downloads: 0 This Week
  • 25
    Infinity

    Low-latency REST API for serving text-embeddings

    Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting all sentence-transformer models and frameworks. Infinity is developed under the MIT License. Infinity powers inference behind Gradient.ai and other embedding API providers.
    Downloads: 0 This Week