Showing 109 open source projects for "performance"

  • 1
    H2O LLM Studio

    Framework and no-code GUI for fine-tuning LLMs

    Welcome to H2O LLM Studio, a framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs). You can also use H2O LLM Studio with the command line interface (CLI) and specify the configuration file that contains all the experiment parameters. To fine-tune using H2O LLM Studio with the CLI, activate the pipenv environment by running make shell. With H2O LLM Studio, training your large language model is easy and intuitive. First, upload your dataset and then start...
    Downloads: 3 This Week
  • 2
    SageAttention

    [NeurIPS2025 Spotlight] Quantized Attention

    SageAttention is an open-source optimization library designed to accelerate the attention mechanism used in transformer-based neural networks. Since attention operations are often the most computationally expensive component of modern AI models, SageAttention introduces quantization techniques that significantly reduce computational overhead while preserving model accuracy. The system achieves this by using low-precision numerical formats such as INT4, FP8, or INT8 to represent key matrices...
    Downloads: 1 This Week
  • 3
    VLMEvalKit

    Open-source evaluation toolkit of large multi-modality models (LMMs)

    ...Instead of requiring complex data preparation pipelines or multiple repositories for each benchmark, the system enables evaluation through simple commands that automatically handle dataset loading, model inference, and metric computation. VLMEvalKit supports generation-based evaluation methods, allowing models to produce textual responses to visual inputs while measuring performance through techniques such as exact matching or language-model-assisted answer extraction.
    Downloads: 1 This Week
  • 4
    OpenCompass

    OpenCompass is an LLM evaluation platform

    ...A one-line command handles task division and distributed evaluation, completing the full evaluation of billion-scale models in just a few hours. It supports zero-shot, few-shot, and chain-of-thought evaluation, combined with standard or dialogue-style prompt templates, to elicit the maximum performance of various models.
    Downloads: 1 This Week
  • 5
    whichllm

    Find the local LLM that actually runs and performs best

    ...The project is useful for users who are unsure which local LLM will perform well on their system. It focuses on real, recency-aware benchmarks so recommendations better reflect current model performance. whichllm is especially helpful for developers, AI hobbyists, and researchers comparing local inference options across NVIDIA, AMD, Apple Silicon, and CPU-only environments. Its main value is reducing guesswork when choosing a local model to download and run.
    Downloads: 0 This Week
  • 6
    Scikit-LLM

    Seamlessly integrate LLMs into scikit-learn

    ...If this is not the case, a label will be selected randomly (label probabilities are proportional to label occurrences in the training set). Note: unlike in a typical supervised setting, the performance of a zero-shot classifier greatly depends on how the label itself is structured. It has to be expressed in natural language, descriptive, and self-explanatory.
    Downloads: 0 This Week
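
The random fallback described in the Scikit-LLM entry above (a label drawn with probability proportional to its frequency in the training set) can be sketched in a few lines. This is an illustration only, not Scikit-LLM's own code: the function name `fallback_label` and its parameters are hypothetical, and the condition that triggers the fallback is elided in the listing, so it is not modeled here.

```python
import random
from collections import Counter


def fallback_label(training_labels, rng=None):
    """Draw one label at random, weighted by its count in training_labels.

    Hypothetical helper for illustration; not part of Scikit-LLM.
    """
    rng = rng or random.Random()
    counts = Counter(training_labels)           # label -> number of occurrences
    labels = list(counts)
    weights = [counts[label] for label in labels]
    # random.choices samples proportionally to the given weights.
    return rng.choices(labels, weights=weights, k=1)[0]


if __name__ == "__main__":
    train = ["positive", "positive", "positive", "negative"]
    print(fallback_label(train, random.Random(0)))
```

With three "positive" examples and one "negative", repeated draws should return "positive" roughly three times out of four.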
  • 7
    MiniOneRec

    Minimal reproduction of OneRec

    MiniOneRec is an open-source framework designed to explore generative approaches to recommendation systems using large language model architectures. Traditional recommender systems typically rely on large embedding tables and ranking models, but MiniOneRec adopts a generative paradigm in which items are represented as sequences of semantic identifiers generated by autoregressive models. The framework provides an end-to-end pipeline for building generative recommender systems, including...
    Downloads: 0 This Week
  • 8
    RAPTOR

    The official implementation of RAPTOR

    ...During inference, the system can navigate this hierarchical representation to retrieve information that best matches the user’s query while preserving broader contextual understanding. This approach improves question-answering performance on complex tasks that require reasoning across long documents or multiple sources.
    Downloads: 0 This Week
  • 9
    MoBA

    MoBA: Mixture of Block Attention for Long-Context LLMs

    ...Instead of forcing each token to attend to every other token in the sequence, MoBA divides the context into blocks and dynamically routes queries to only the most relevant segments of information. This routing strategy reduces the computational cost associated with traditional attention while preserving performance on reasoning and long-context tasks. The approach allows language models to scale to significantly longer input contexts without the quadratic computational cost normally associated with transformer attention mechanisms.
    Downloads: 0 This Week
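
The block routing described in the MoBA entry above can be illustrated with a small pure-Python sketch for a single query. It rests on assumptions that the listing does not state: each block is scored by the dot product of the query with the block's mean key, the `top_k` best blocks are kept, and causal masking is left out for brevity. The function name `block_topk_attention` and its parameters are hypothetical and are not MoBA's API.

```python
import math


def block_topk_attention(query, keys, values, block_size, top_k):
    """Attend from one query to only the top_k highest-scoring key blocks.

    Illustrative sketch; scoring by mean key is an assumption.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Split key positions into contiguous blocks.
    blocks = [range(start, min(start + block_size, len(keys)))
              for start in range(0, len(keys), block_size)]

    def block_score(block):
        # Relevance of a block: query dotted with the block's mean key.
        mean_key = [sum(keys[i][d] for i in block) / len(block)
                    for d in range(len(query))]
        return dot(query, mean_key)

    ranked = sorted(range(len(blocks)),
                    key=lambda b: block_score(blocks[b]), reverse=True)
    chosen = sorted(ranked[:top_k])
    positions = [i for b in chosen for i in blocks[b]]

    # Ordinary scaled softmax attention, restricted to the chosen positions.
    scale = 1.0 / math.sqrt(len(query))
    logits = [dot(query, keys[i]) * scale for i in positions]
    peak = max(logits)
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    output = [sum(w * values[i][d] for w, i in zip(weights, positions)) / total
              for d in range(len(values[0]))]
    return output, chosen
```

Because only `top_k * block_size` positions enter the softmax, the per-query cost of the attention step no longer grows with the full sequence length; scoring the blocks adds a cost proportional to the number of blocks when the mean keys are precomputed.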
  • 10
    LMOps

    General technology for enabling AI capabilities w/ LLMs and MLLMs

    LMOps is a research initiative and open-source toolkit focused on the development and operational management of AI applications built with large language models and generative AI systems. The project explores the technologies and methodologies required to move foundation models from research environments into production-grade AI products. It includes experimental tools and frameworks that help developers optimize prompts, design workflows for generative models, and manage the lifecycle of...
    Downloads: 0 This Week
  • 11
    CodeGen

    Open-source model for program synthesis

    CodeGen is a family of open-source large language models designed specifically for program synthesis and code generation tasks. Developed by Salesforce Research, the models are trained on large datasets containing both natural language and programming language content. This allows them to translate natural language descriptions into functional code across a variety of programming languages. CodeGen supports multi-turn program synthesis, meaning it can generate complex programs through a...
    Downloads: 0 This Week
  • 12
    Llama-Chinese

    Llama Chinese community, real-time aggregation

    ...The project aggregates datasets, research resources, tutorials, and tools that help developers train and fine-tune LLaMA-based models with Chinese linguistic capabilities. It also provides optimized versions of LLaMA models trained on large-scale Chinese datasets to improve performance in tasks such as translation, summarization, and conversational AI. The community maintains educational materials and technical documentation that help researchers understand the process of training and deploying Chinese-optimized large language models. In addition to model development, the project collects learning resources and open research contributions related to LLM technology in Chinese environments. ...
    Downloads: 0 This Week
  • 13
    Engram

    A New Axis of Sparsity for Large Language Models

    Engram is a high-performance embedding and similarity search library focused on making retrieval-augmented workflows efficient, scalable, and easy to adopt by developers building search, recommendation, or semantic matching systems. It provides utilities to generate embeddings from text or other structured data, index them using efficient approximate nearest neighbor algorithms, and perform real-time similarity queries even on large corpora.
    Downloads: 0 This Week
  • 14
    Ling

    Ling is a MoE LLM provided and open-sourced by InclusionAI

    Ling is a Mixture-of-Experts (MoE) large language model (LLM) provided and open-sourced by inclusionAI. The project offers different sizes (Ling-lite, Ling-plus) and emphasizes flexibility and efficiency: the ability to scale, adapt expert activation, and perform across a range of natural language and reasoning tasks. The codebase includes inference pipelines, example scripts, models, documentation, and model download infrastructure. As more developers and...
    Downloads: 0 This Week
  • 15
    Youtu-GraphRAG

    Vertically Unified Agents for Graph Retrieval-Augmented Reasoning

    ...The framework also incorporates hierarchical community detection algorithms that organize knowledge into clusters, improving both retrieval efficiency and reasoning performance. In addition to graph construction and retrieval, the system integrates iterative reasoning techniques that refine answers through multiple retrieval and reasoning cycles.
    Downloads: 0 This Week
  • 16
    LongWriter

    Unleashing 10,000+ Word Generation from Long Context LLMs

    LongWriter is an open-source framework and set of large language models designed to enable ultra-long text generation that can exceed 10,000 words while maintaining coherence and structure. Traditional large language models can process large inputs but often struggle to generate long outputs due to limitations in training data and alignment strategies. LongWriter addresses this challenge by introducing a specialized dataset and training approach that encourages models to produce longer...
    Downloads: 0 This Week
  • 17
    xLSTM

    Neural Network architecture based on ideas of the original LSTM

    ...By introducing innovations such as matrix-based memory and improved normalization techniques, xLSTM improves the ability of recurrent networks to capture long-range dependencies in sequential data. The architecture aims to provide competitive performance with transformer-based models while maintaining advantages such as linear computational scaling and efficient memory usage for long sequences. Researchers have demonstrated that xLSTM models can scale to billions of parameters and large training datasets while maintaining efficient inference speeds.
    Downloads: 0 This Week
  • 18
    TigerBot

    TigerBot: A multi-language multi-task LLM

    TigerBot is an open-source family of large language models designed to support multilingual and multi-task natural language processing applications. The project focuses on building high-performance models capable of handling both English and Chinese tasks while maintaining strong reasoning and conversational abilities. TigerBot models are based on modern transformer architectures and are trained on large datasets that cover multiple domains and languages. The project provides both base models and chat-optimized variants that can be used for dialogue systems, question answering, and general language understanding tasks. ...
    Downloads: 0 This Week
  • 19
    MatMul-Free LM

    Implementation for MatMul-free LM

    ...Since matrix multiplication is one of the most computationally expensive components of modern language models, the project explores alternative computational strategies that reduce hardware requirements while maintaining comparable performance. The architecture relies on quantization-aware training and lightweight operations to replace conventional dense matrix multiplications with more efficient alternatives. These optimizations can significantly reduce memory consumption and potentially improve computational efficiency during both training and inference. The repository provides implementations of models at several parameter scales and includes tools for experimenting with the architecture using modern machine learning frameworks.
    Downloads: 0 This Week
  • 20
    Skywork-R1V4

    Skywork-R1V is an advanced multimodal AI model series

    ...Instead of retraining both language and vision models from scratch, the framework uses a lightweight visual projection layer that connects a pretrained vision backbone with a reasoning-capable language model. This design allows the model to analyze images while maintaining strong textual reasoning performance, enabling tasks such as solving visual math problems, interpreting scientific diagrams, and answering questions about images.
    Downloads: 0 This Week
  • 21
    Torch Pruning

    DepGraph: Towards Any Structural Pruning

    Torch-Pruning is an open-source toolkit designed to optimize deep neural networks by performing structural pruning directly within PyTorch models. The library focuses on reducing the size and computational cost of neural networks by removing redundant parameters and channels while maintaining model performance. It introduces a graph-based algorithm called DepGraph that automatically identifies dependencies between layers, allowing parameters to be pruned safely across complex architectures. This dependency analysis makes it possible to prune large networks such as transformers, convolutional networks, and diffusion models without breaking the computational graph. ...
    Downloads: 0 This Week
  • 22
    Chat with LLMs Everywhere

    Run PyTorch LLMs locally on servers, desktop and mobile

    ...TorchChat supports running models through Python interfaces as well as integrating them directly into native applications written in languages such as C or C++. The project also demonstrates how modern LLMs like LLaMA-style models can be deployed locally while maintaining good performance across different hardware platforms.
    Downloads: 0 This Week
  • 23
    LightLLM

    LightLLM is a Python-based LLM (Large Language Model) inference

    LightLLM is a high-performance inference and serving framework designed specifically for large language models, focusing on lightweight architecture, scalability, and efficient deployment. The framework enables developers to run and serve modern language models with significantly improved speed and resource efficiency compared to many traditional inference systems.
    Downloads: 0 This Week
  • 24
    Huatuo-Llama-Med-Chinese

    Instruction-tuning LLM with Chinese Medical Knowledge

    Huatuo-Llama-Med-Chinese is an open-source project that develops medical-domain large language models by instruction-tuning existing models using Chinese medical knowledge. The project builds specialized models by fine-tuning architectures such as LLaMA, Alpaca-Chinese, and Bloom with curated medical datasets. These datasets are constructed from medical knowledge graphs, academic literature, and question-answer pairs designed to teach models how to respond accurately to healthcare-related...
    Downloads: 0 This Week
  • 25
    Agents 2.0

    An Open-source Framework for Data-centric Language Agents

    Agents is an open-source framework designed to build and train autonomous language agents through a data-centric and learning-oriented architecture. The project introduces a concept known as agent symbolic learning, which treats an agent pipeline similarly to a neural network computational graph. In this framework, each node in the pipeline represents a step in the reasoning or action process, while prompts and tools act as adjustable parameters analogous to neural network weights. During...
    Downloads: 0 This Week