Showing 34 open source projects for "state-thread"

View related business solutions
  • Build Agents and Models on One Platform Icon
    Build Agents and Models on One Platform

    Everything you need to build production-ready agents and models. Access 200+ Google and third-party AI models and tools.

    Gemini Enterprise Agent Platform is Google Cloud's comprehensive platform for developers to build, scale, govern, and optimize agents and models. Choose from Google's most advanced models and third-party models like Anthropic's Claude Model Family.
    Try It Free
  • Error to trace to log to deploy. One click. No SSH. Icon
    Error to trace to log to deploy. One click. No SSH.

    Catch the cause before the pager goes off.

    AppSignal links every error to the trace, the trace to the log, the log to the deploy that shipped it.
    Free 30 days.
  • 1
    vLLM

    vLLM

    A high-throughput and memory-efficient inference and serving engine

    vLLM is a fast and easy-to-use library for LLM inference and serving. High-throughput serving with various decoding algorithms, including parallel sampling, beam search, and more.
    Downloads: 16 This Week
    Last Update:
    See Project
  • 2
    SentenceTransformers

    SentenceTransformers

    Multilingual sentence & image embeddings with BERT

    ...The framework is based on PyTorch and Transformers and offers a large collection of pre-trained models tuned for various tasks. Further, it is easy to fine-tune your own models. Our models are evaluated extensively and achieve state-of-the-art performance on various tasks. Further, the code is tuned to provide the highest possible speed.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 3
    GLM-V

    GLM-V

    GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning

    ...The repository provides both GLM-4.5V and GLM-4.1V models, designed to advance beyond basic perception toward higher-level reasoning, long-context understanding, and agent-based applications. GLM-4.5V builds on the flagship GLM-4.5-Air foundation (106B parameters, 12B active), achieving state-of-the-art results on 42 benchmarks across image, video, document, GUI, and grounding tasks. It introduces hybrid training for broad-spectrum reasoning and a Thinking Mode switch to balance speed and depth of reasoning. GLM-4.1V-9B-Thinking incorporates reinforcement learning with curriculum sampling (RLCS) and Chain-of-Thought reasoning, outperforming models much larger in scale (e.g., Qwen-2.5-VL-72B) across many benchmarks.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    PEFT

    PEFT

    State-of-the-art Parameter-Efficient Fine-Tuning

    ...Fine-tuning large-scale PLMs is often prohibitively costly. In this regard, PEFT methods only fine-tune a small number of (extra) model parameters, thereby greatly decreasing the computational and storage costs. Recent State-of-the-Art PEFT techniques achieve performance comparable to that of full fine-tuning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • $300 Free Credits to Build on Google Cloud Icon
    $300 Free Credits to Build on Google Cloud

    New to Google Cloud? Get $300 in credits to explore Compute Engine, BigQuery, Cloud Run, Gemini Enterprise Agent Platform, and more.

    Start your next project with $300 in free Google Cloud credit. Spin up VMs, run containers, query petabytes in BigQuery, or build agents with Gemini Enterprise Agent Platform. Once your credits are used, keep building with 20+ always-free tier products including Compute Engine, Cloud Storage, GKE, and Cloud Run functions. No commitment required—just sign up and start building.
    Claim $300 Free
  • 5
    OM1

    OM1

    Modular AI runtime for robots

    ...Developers can extend the system by connecting new tools, services, or data sources to the agent architecture. The platform also includes mechanisms for coordinating workflows and managing the state of ongoing tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    MobileLLM

    MobileLLM

    MobileLLM Optimizing Sub-billion Parameter Language Models

    ...The framework integrates several architectural innovations—SwiGLU activation, deep and thin network design, embedding sharing, and grouped-query attention (GQA)—to achieve a superior trade-off between model size, inference speed, and accuracy. MobileLLM demonstrates remarkable performance, with the 125M and 350M variants outperforming previous state-of-the-art models of the same scale by up to 4.3% on zero-shot commonsense reasoning tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    OpenLLM

    OpenLLM

    Operating LLMs in production

    An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease. With OpenLLM, you can run inference with any open-source large-language models, deploy to the cloud or on-premises, and build powerful AI apps. Built-in supports a wide range of open-source LLMs and model runtime, including Llama 2, StableLM, Falcon, Dolly, Flan-T5, ChatGLM, StarCoder, and more. Serve LLMs over RESTful API or gRPC with one command, query via...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    CodeGeeX

    CodeGeeX

    CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)

    ...Developed with MindSpore and later made PyTorch-compatible, it is capable of multilingual code generation, cross-lingual code translation, code completion, summarization, and explanation. It has been benchmarked on HumanEval-X, a multilingual program synthesis benchmark introduced alongside the model, and achieves state-of-the-art performance compared to other open models like InCoder and CodeGen. CodeGeeX also powers IDE plugins for VS Code and JetBrains, offering features like code completion, translation, debugging, and annotation. The model supports Ascend 910 and NVIDIA GPUs, with optimizations like quantization and FasterTransformer acceleration for faster inference.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 9
    Qwen3-Coder

    Qwen3-Coder

    Qwen3-Coder is the code version of Qwen3

    ...Its flagship version, Qwen3-Coder-480B-A35B-Instruct, features a massive 480 billion-parameter Mixture-of-Experts architecture with 35 billion active parameters, delivering top-tier performance on coding and agentic tasks. This model sets new state-of-the-art benchmarks among open models for agentic coding, browser-use, and tool-use, matching performance comparable to leading models like Claude Sonnet. Qwen3-Coder supports an exceptionally long context window of 256,000 tokens, extendable to 1 million tokens using Yarn, enabling repository-scale code understanding and generation. ...
    Downloads: 18 This Week
    Last Update:
    See Project
  • Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure Icon
    Stop Cyber Threats with VM-Series Next-Gen Firewall on Azure

    Native application identity and user-based security for your Azure cloud

    Gain integrated visibility across all traffic in a single pass. Deploy Palo Alto Networks VM-Series to determine application identity and content while automating security policy updates via rich APIs.
    Get a free trial
  • 10
    LangChain

    LangChain

    ⚡ Building applications with LLMs through composability ⚡

    Large language models (LLMs) are emerging as a transformative technology, enabling developers to build applications that they previously could not. But using these LLMs in isolation is often not enough to create a truly powerful app - the real power comes when you can combine them with other sources of computation or knowledge. This library is aimed at assisting in the development of those types of applications.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 11
    Gemma

    Gemma

    Gemma open-weight LLM library, from Google DeepMind

    ...Through included tutorials and Colab notebooks, users can explore examples covering sampling, multi-modal interactions, and fine-tuning workflows. By providing accessible open-weight models, Gemma enables researchers and developers to experiment with state-of-the-art LLM architectures.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 12
    Xorbits Inference

    Xorbits Inference

    Replace OpenAI GPT with another LLM in your app

    ...Xorbits Inference(Xinference) is a powerful and versatile library designed to serve language, speech recognition, and multimodal models. With Xorbits Inference, you can effortlessly deploy and serve your or state-of-the-art built-in models using just a single command. Whether you are a researcher, developer, or data scientist, Xorbits Inference empowers you to unleash the full potential of cutting-edge AI models.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 13
    SkillOpt

    SkillOpt

    Text-space optimizer that trains reusable natural-language skills

    SkillOpt is a Microsoft research project for improving frozen LLM agents by optimizing reusable natural-language skill documents. Instead of changing model weights, it treats a compact skill file as the trainable state of the agent. The system learns from agent rollouts, reflection, bounded edits, and validation gates to produce better instructions over time. Its output is a deployable best_skill.md artifact that can be reused across agent tasks. The project is focused on making agents more effective through text-space optimization rather than traditional fine-tuning. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    Qwen2.5-Math

    Qwen2.5-Math

    A series of math-specific large language models of our Qwen2 series

    ...Unlike its predecessor Qwen2-Math, Qwen2.5-Math supports both Chain-of-Thought (CoT) reasoning and Tool-Integrated Reasoning (TIR) for solving math problems, and works in both Chinese and English. It is optimized for solving mathematical benchmarks and exams; the 72B-Instruct model achieves state-of-the-art results among open source models on many English and Chinese math tasks.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 15
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16
    Context Engineering

    Context Engineering

    A frontier, first-principles handbook

    ...It takes inspiration from thought leaders like Andrej Karpathy and bridges theory with practical examples, offering structured guidance on context orchestration, memory, retrieval, and state control within AI workflows. With extensive materials drawn from research, surveys, and visual explanations, the project acts as both a learning resource and a reference for practitioners looking to improve model behavior by engineering richer inputs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    SimpleMem

    SimpleMem

    SimpleMem: Efficient Lifelong Memory for LLM Agents

    ...Unlike monolithic systems where memory management is ad-hoc, SimpleMem formalizes a memory lifecycle—write, index, retrieve, refine—so applications can handle user history, document collections, or dynamic contextual state systematically. It supports customizable embedding models, efficient vector indexes, and relevance weighting, making it practical for building assistants, personal agents, or domain-specific retrieval systems that need persistent knowledge.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Ludwig AI

    Ludwig AI

    Low-code framework for building custom LLMs, neural networks

    ...Ludwig is a low-code framework for building custom AI models like LLMs and other deep neural networks. Declarative YAML configuration file is all you need to train a state-of-the-art LLM on your data. Support for multi-task and multi-modality learning. Comprehensive config validation detects invalid parameter combinations and prevents runtime failures. Automatic batch size selection, distributed training (DDP, DeepSpeed), parameter efficient fine-tuning (PEFT), 4-bit quantization (QLoRA), and larger-than-memory datasets. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    H2O LLM Studio

    H2O LLM Studio

    Framework and no-code GUI for fine-tuning LLMs

    Welcome to H2O LLM Studio, a framework and no-code GUI designed for fine-tuning state-of-the-art large language models (LLMs). You can also use H2O LLM Studio with the command line interface (CLI) and specify the configuration file that contains all the experiment parameters. To finetune using H2O LLM Studio with CLI, activate the pipenv environment by running make shell. With H2O LLM Studio, training your large language model is easy and intuitive.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 20
    KVCache-Factory

    KVCache-Factory

    Unified KV Cache Compression Methods for Auto-Regressive Models

    ...KVCache-Factory provides a platform for implementing and evaluating multiple compression strategies that reduce memory usage while preserving model performance. The framework integrates several state-of-the-art methods such as PyramidKV, SnapKV, H2O, and StreamingLLM, allowing researchers to compare and experiment with different approaches within the same environment. It also supports advanced inference configurations such as Flash Attention v2 and multi-GPU inference setups for very large models.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    LLM Colosseum

    LLM Colosseum

    Benchmark LLMs by fighting in Street Fighter 3

    LLM-Colosseum is an experimental benchmarking framework designed to evaluate the capabilities of large language models through gameplay interactions rather than traditional text-based benchmarks. The system places language models inside the environment of the classic video game Street Fighter III, where they must interpret the game state and decide which actions to perform during combat. This setup creates a dynamic environment that tests reasoning, situational awareness, and decision-making abilities in real time. Instead of relying purely on reward signals as in reinforcement learning agents, the models analyze contextual information and generate strategic actions based on the game environment. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Code World Model (CWM)

    Code World Model (CWM)

    Research code artifacts for Code World Model (CWM)

    ...It is explicitly trained on execution traces, action-observation trajectories, and agentic interactions in controlled environments. It has been developed to better capture how code, actions, and state interact over time. The repository provides inference code, reproducibility scripts, prompt guides, and more. It has model cards, utilities, demos, and evaluation artifacts. Inference scripts and utilities for code generation tasks. Evaluation benchmarks on code, mathematics, and reasoning tasks. Demos, serving code, and evaluation pipelines.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Qwen3-Omni

    Qwen3-Omni

    Qwen3-omni is a natively end-to-end, omni-modal LLM

    ...It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and 10 speech output languages. It achieves state-of-the-art results: across 36 audio and audio-visual benchmarks, it hits open-source SOTA on 32 and overall SOTA on 22, outperforming or matching strong closed-source models such as Gemini-2.5 Pro and GPT-4o. To reduce latency, especially in audio/video streaming, Talker predicts discrete speech codecs via a multi-codebook scheme and replaces heavier diffusion approaches.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    ROSA

    ROSA

    I Agent designed to interact with ROS1- and ROS2-based robotics system

    ROSA, short for Robot Operating System Agent, is an AI-powered software assistant developed by NASA’s Jet Propulsion Laboratory to simplify interaction with robotic systems that use the Robot Operating System (ROS). The project provides a natural language interface that allows developers and operators to interact with robots by issuing commands or queries in conversational language. Built on top of frameworks such as LangChain and modern large language models, ROSA translates user...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Qwen3 Embedding

    Qwen3 Embedding

    Designed for text embedding and ranking tasks

    ...It builds upon the Qwen3 base/dense models and offers several sizes (0.6B, 4B, 8B parameters), for both embedding and reranking, with high multilingual capability, long‐context understanding, and reasoning. It achieves state-of-the-art performance on benchmarks like MTEB (Multilingual Text Embedding Benchmark) and supports instruction-aware embedding (i.e. embedding task instructions along with queries) and flexible embedding/vector dimension definitions. It is meant for tasks such as text retrieval, classification, clustering, bitext mining, and code retrieval.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
Auth0 Logo