Showing 109 open source projects for "performance"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • 1
    Intel LLM Library for PyTorch

    Intel LLM Library for PyTorch

    Accelerate local LLM inference and finetuning

    ...Built as an extension of the PyTorch ecosystem, the library enables developers to run modern transformer models efficiently on Intel CPUs, GPUs, and specialized AI accelerators. The framework provides hardware-aware optimizations and low-precision computation techniques that significantly improve the performance of large language models while reducing memory consumption. IPEX-LLM supports a wide range of popular models, including architectures such as LLaMA, Mistral, Qwen, and other transformer-based systems. The library can integrate with common AI frameworks and serving tools such as Hugging Face Transformers, LangChain, and vLLM, allowing developers to incorporate optimized inference into existing pipelines.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    Advanced RAG Techniques

    Advanced RAG Techniques

    Advanced techniques for RAG systems

    ...It includes hands-on Jupyter notebooks and runnable scripts that show how to implement ideas like optimizing chunk sizes, proposition chunking, HyDE/HyPE query transformations, fusion retrieval, reranking, and ensemble retrieval. There is also an evaluation section that demonstrates how to measure RAG performance and compare different configurations in a systematic way.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Ring

    Ring

    Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI

    ...It applies reinforcement learning/reasoning optimization techniques. Its architectures and training approaches are tuned to enable efficient and capable reasoning performance. Reasoning-optimized model with reinforcement learning enhancements. Efficient architecture and memory design for large-scale reasoning. If you are located in mainland China, we also provide the model on ModelScope.cn to speed up the download process.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Qwen3-Omni

    Qwen3-Omni

    Qwen3-omni is a natively end-to-end, omni-modal LLM

    Qwen3-Omni is a natively end-to-end multilingual omni-modal foundation model that processes text, images, audio, and video and delivers real-time streaming responses in text and natural speech. It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and 10 speech output languages. It achieves state-of-the-art results: across 36 audio and audio-visual benchmarks, it hits open-source SOTA on 32 and overall SOTA on 22, outperforming or matching strong closed-source models such as Gemini-2.5 Pro and GPT-4o. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Host LLMs in Production With On-Demand GPUs Icon
    Host LLMs in Production With On-Demand GPUs

    NVIDIA L4 GPUs. 5-second cold starts. Scale to zero when idle.

    Deploy your model, get an endpoint, pay only for compute time. No GPU provisioning or infrastructure management required.
    Try Free
  • 5
    Google Workspace MCP Server

    Google Workspace MCP Server

    Control Gmail, Google Calendar, Docs, Sheets, Slides, Chat, Forms

    Google Workspace MCP is an open-source server that connects AI assistants to Google Workspace services through the Model Context Protocol (MCP), allowing large language models to interact directly with productivity tools. The project exposes a wide set of Google services including Gmail, Google Drive, Docs, Sheets, Slides, Calendar, Chat, and other Workspace components as structured tools that an AI system can call programmatically. By acting as a bridge between AI clients and the Google...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    tiny-llm

    tiny-llm

    A course of learning LLM inference serving on Apple Silicon

    ...Rather than relying on high-level machine learning frameworks, the codebase uses mostly low-level array and matrix manipulation APIs so that developers can understand exactly how model inference works internally. The project demonstrates how to load and run models such as Qwen-style architectures while progressively implementing performance improvements like KV caching, request batching, and optimized attention mechanisms. It also introduces concepts behind modern LLM serving systems that resemble simplified versions of production inference engines such as vLLM.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    CogVLM2

    CogVLM2

    GPT4V-level open-source multi-modal model based on Llama3-8B

    CogVLM2 is the second generation of the CogVLM vision-language model series, developed by ZhipuAI and released in 2024. Built on Meta-Llama-3-8B-Instruct, CogVLM2 significantly improves over its predecessor by providing stronger performance across multimodal benchmarks such as TextVQA, DocVQA, and ChartQA, while introducing extended context length support of up to 8K tokens and high-resolution image input up to 1344×1344. The series includes models for both image understanding and video understanding, with CogVLM2-Video supporting up to 1-minute videos by analyzing keyframes. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    SentenceTransformers

    SentenceTransformers

    Multilingual sentence & image embeddings with BERT

    ...The framework is based on PyTorch and Transformers and offers a large collection of pre-trained models tuned for various tasks. Further, it is easy to fine-tune your own models. Our models are evaluated extensively and achieve state-of-the-art performance on various tasks. Further, the code is tuned to provide the highest possible speed.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    MiniMax-M1

    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    MiniMax-M1 is presented as the world’s first open-weight, large-scale hybrid-attention reasoning model, designed to push the frontier of long-context, tool-using, and deeply “thinking” language models. It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 10
    TAME LLM

    TAME LLM

    Traditional Mandarin LLMs for Taiwan

    ...These models are designed to support applications such as conversational AI, knowledge retrieval, and domain-specific reasoning in fields like manufacturing, law, healthcare, and electronics. The training pipeline leverages high-performance computing infrastructure and frameworks such as NVIDIA NeMo and Megatron to enable large-scale model training. Taiwan-LLM aims to improve language understanding and generation for Traditional Mandarin users by incorporating region-specific datasets and evaluation benchmarks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    LlamaGen

    LlamaGen

    Autoregressive Model Beats Diffusion

    LlamaGen is an open-source research project that introduces a new approach to image generation by applying the autoregressive next-token prediction paradigm used in large language models to visual generation tasks. Instead of relying on diffusion models, the framework treats images as sequences of tokens that can be generated progressively using transformer architectures similar to those used for text generation. The project explores how scaling autoregressive models and improving image...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    PKU Beaver

    PKU Beaver

    Constrained Value Alignment via Safe Reinforcement Learning

    ...The framework introduces techniques that separate helpfulness and harmlessness signals during training, allowing models to optimize for useful responses while minimizing harmful behavior. To support this process, the project provides datasets containing human-labeled examples that encode both performance preferences and safety constraints across multiple dimensions. These annotations include categories such as harmful language, unethical behavior, privacy violations, and other sensitive topics. By incorporating constraint-based optimization methods, Safe-RLHF trains models that balance reward objectives with safety requirements, ensuring that harmful outputs are penalized during training.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    OpenAI Forward

    OpenAI Forward

    An efficient forwarding service designed for LLMs

    ...The project can proxy both local and cloud-hosted language model services, which makes it useful for teams that want a single control layer regardless of whether they are using something like LocalAI or a hosted provider compatible with OpenAI-style APIs. A major emphasis of the repository is asynchronous performance, using tools such as uvicorn, aiohttp, and asyncio to support high-throughput forwarding workloads.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    llama2.c

    llama2.c

    Inference Llama 2 in one file of pure C

    ...The goal of llama2.c is to demonstrate how a compact and transparent implementation can perform meaningful inference even with small models, emphasizing simplicity, clarity, and accessibility. The project builds upon lessons from nanoGPT and takes inspiration from llama.cpp, focusing instead on minimalism and educational value over large-scale performance.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    CodeLlama

    CodeLlama

    Inference code for CodeLlama models

    ...The repo documents the sizes and capabilities (e.g., 7B, 13B, 34B) and highlights features like infilling and large input context to support real IDE workflows. It targets both general software synthesis and language-specific productivity, offering strong performance among open models at release time. Typical usage includes prompt-driven generation, function or class completion, and zero-shot adherence to natural-language instructions about code changes. The ecosystem provides multiple distributions (e.g., HF format) so developers can integrate with standard toolchains and serving stacks. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    FinGLM

    FinGLM

    Committed to building an open, public welfare

    ...By combining large language model architectures with financial datasets such as corporate annual reports and structured financial records, FinGLM aims to improve AI performance on tasks that require domain expertise. The repository also provides educational materials and tutorials that help developers learn how to build and fine-tune financial AI systems using the GLM model ecosystem. In addition to model development, the project promotes collaboration between researchers, companies, and developers interested in applying AI to financial analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Qwen2.5-Coder

    Qwen2.5-Coder

    Qwen2.5-Coder is the code version of Qwen2.5, the large language model

    ...It includes multiple model sizes—ranging from 0.5B to 32B parameters—providing solutions for a wide array of coding needs. The model supports over 92 programming languages and offers exceptional performance in generating code, debugging, and mathematical problem-solving. Qwen2.5-Coder, with its long context length of 128K tokens, is ideal for a variety of use cases, from simple code assistants to complex programming scenarios, matching the capabilities of models like GPT-4o.
    Downloads: 28 This Week
    Last Update:
    See Project
  • 18
    InternVL

    InternVL

    A Pioneering Open-Source Alternative to GPT-4o

    InternVL is a large-scale multimodal foundation model designed to integrate computer vision and language understanding within a unified architecture. The project focuses on scaling vision models and aligning them with large language models so that they can perform tasks involving both visual and textual information. InternVL is trained on massive collections of image-text data, enabling it to learn representations that capture both visual patterns and semantic meaning. The model supports a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Grok-1

    Grok-1

    Open-source, high-performance Mixture-of-Experts large language model

    Grok-1 is a 314-billion-parameter Mixture-of-Experts (MoE) large language model developed by xAI. Designed to optimize computational efficiency, it activates only 25% of its weights for each input token. In March 2024, xAI released Grok-1's model weights and architecture under the Apache 2.0 license, making them openly accessible to developers. The accompanying GitHub repository provides JAX example code for loading and running the model. Due to its substantial size, utilizing Grok-1...
    Leader badge
    Downloads: 23 This Week
    Last Update:
    See Project
  • 20
    Qwen-Audio

    Qwen-Audio

    Chat & pretrained large audio language model proposed by Alibaba Cloud

    ...There is also an instruction-tuned version called Qwen-Audio-Chat which supports conversational interaction (multi-round), audio + text input, creative tasks and reasoning over audio. It uses multi-task training over many different audio tasks (30+), and achieves strong multi-benchmarks performance without task-specific fine‐tuning. It includes features such as flexible multi-run chat, audio understanding/reasoning, music appreciation, and also tool usage (e.g. voice editing).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    RAG-Retrieval

    RAG-Retrieval

    Unify Efficient Fine-tuning of RAG Retrieval, including Embedding

    ...Retrieval-augmented generation combines large language models with external knowledge retrieval to improve factual accuracy and domain-specific reasoning. This repository provides end-to-end infrastructure for training retrieval models, performing inference, and distilling embedding models for improved performance. It includes implementations of modern embedding architectures designed to map documents and queries into vector spaces for efficient similarity search. The framework also supports reranking models that refine retrieved results using large language models or lightweight transformer architectures. Additional training techniques such as preference-based supervised fine-tuning and embedding distillation are included to improve retrieval quality.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Qwen-VL

    Qwen-VL

    Chat & pretrained large vision language model

    Qwen-VL is Alibaba Cloud’s vision-language large model family, designed to integrate visual and linguistic modalities. It accepts image inputs (with optional bounding boxes) and text, and produces text (and sometimes bounding boxes) as output. The model variants (VL-Plus, VL-Max, etc.) have been upgraded for better visual reasoning, text recognition from images, fine-grained understanding, and support for high image resolutions / extreme aspect ratios. Qwen-VL supports multilingual inputs...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 23
    Chinese-LLaMA-Alpaca 2

    Chinese-LLaMA-Alpaca 2

    Chinese LLaMA-2 & Alpaca-2 Large Model Phase II Project

    ...These models expand and optimize the Chinese vocabulary on the basis of the original Llama-2, use large-scale Chinese data for incremental pre-training, and further improve the basic semantics and command understanding of Chinese. Performance improvements. The related model supports FlashAttention-2 training, supports 4K context and can be extended up to 18K+ through the NTK method.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Aviary

    Aviary

    Ray Aviary - evaluate multiple LLMs easily

    Aviary is an LLM serving solution that makes it easy to deploy and manage a variety of open source LLMs. Providing an extensive suite of pre-configured open source LLMs, with defaults that work out of the box. Supporting Transformer models hosted on Hugging Face Hub or present on local disk. Aviary has native support for autoscaling and multi-node deployments thanks to Ray and Ray Serve. Aviary can scale to zero and create new model replicas (each composed of multiple GPU workers) in...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    DB-GPT-Hub

    DB-GPT-Hub

    A repository that contains models, datasets, and fine-tuning

    DB-GPT-Hub is an open-source repository that provides datasets, models, and training tools designed to improve large language models for database interaction tasks, particularly Text-to-SQL. The project serves as a specialized extension of the broader DB-GPT ecosystem, focusing on the preparation and evaluation of models capable of translating natural language questions into structured database queries. It offers a modular framework that supports data preparation, model fine-tuning,...
    Downloads: 0 This Week
    Last Update:
    See Project
Auth0 Logo