Showing 88 open source projects for "safety"

  • 1
    gpt-oss-safeguard

    Safety reasoning models built upon gpt-oss

    gpt-oss-safeguard is an open-weight reasoning model family released by OpenAI designed specifically for content safety and moderation tasks. Rather than just outputting a numeric “safety score,” it is trained to reason about content with respect to a user-provided policy, allowing flexible, customizable moderation definitions rather than fixed rules — ideal when different platforms have different safety standards. The model comes in at least two variants: a large 120B-parameter version for heavy-duty, high-accuracy reasoning, and a 20B-parameter version optimized for lower latency or smaller compute resources. ...
    Downloads: 0 This Week
  • 2
    PKU Beaver

    Constrained Value Alignment via Safe Reinforcement Learning

    PKU Beaver is an open-source research project focused on improving the safety alignment of large language models through reinforcement learning from human feedback under explicit safety constraints. The framework introduces techniques that separate helpfulness and harmlessness signals during training, allowing models to optimize for useful responses while minimizing harmful behavior. To support this process, the project provides datasets containing human-labeled examples that encode both performance preferences and safety constraints across multiple dimensions. ...
    Downloads: 0 This Week
  • 3
    Purple Llama

    Set of tools to assess and improve LLM security

    Purple Llama is an umbrella safety initiative that aggregates tools, benchmarks, and mitigations to help developers build responsibly with open generative AI. Its scope spans input and output safeguards, cybersecurity-focused evaluations, and reference shields that can be inserted at inference time. The project evolves as a hub for safety research artifacts like Llama Guard and Code Shield, along with dataset specs and how-to guides for integrating checks into applications. ...
    Downloads: 0 This Week
  • 4
    FuzzyAI Fuzzer

    A powerful tool for automated LLM fuzzing

    ...FuzzyAI provides testing tools, datasets, and evaluation workflows that help researchers measure how well models resist harmful instructions or attempts to bypass safety mechanisms.
    Downloads: 1 This Week
  • 5
    System Prompts Leaks

    Collection of extracted System Prompts from popular chatbots

    System Prompts Leaks is a curated repository that collects known leaked or publicly exposed system prompts used by large language models, organized so researchers, developers, and AI safety advocates can analyze them in one place. The project highlights how system prompts — instructions that strongly influence model behavior — have been inadvertently shared in forums, datasets, and open repositories, illustrating common patterns and potential vulnerabilities in prompt design and deployment. By aggregating these prompts, the repository serves as a valuable resource for understanding how widely different models are being guided in the wild, which helps with comparative analysis across architectures and service providers. ...
    Downloads: 7 This Week
  • 6
    Safety-Prompts

    Chinese safety prompts for evaluating and improving the safety of LLMs

    Safety-Prompts is an open-source repository that provides a curated collection of prompts designed to evaluate and improve the safety behavior of large language models. The project focuses primarily on safety testing scenarios relevant to Chinese language models, though the concepts can be applied to other languages and systems. The prompts are structured to test whether models generate outputs that align with human values and safety guidelines when faced with potentially harmful or sensitive requests. ...
    Downloads: 0 This Week
  • 7
    In-The-Wild Jailbreak Prompts on LLMs

    A dataset of 15,140 ChatGPT prompts from Reddit

    In-The-Wild Jailbreak Prompts on LLMs is an open-source research repository that provides datasets and analytical tools for studying jailbreak prompts used to bypass safety restrictions in large language models. The project is part of a research effort to understand how users attempt to circumvent alignment and safety mechanisms built into modern AI systems. The repository includes a large collection of prompts gathered from real-world platforms such as Reddit, Discord, prompt-sharing communities, and other public sources. ...
    Downloads: 0 This Week
  • 8
    Simple MCP

    A simple TypeScript library for creating MCP servers

    simple-mcp is a TypeScript library designed to facilitate the creation of Model Context Protocol (MCP) servers. It offers a straightforward API, enabling developers to set up MCP servers with minimal code. The library emphasizes type safety through full TypeScript integration and incorporates parameter validation using Zod. It fully implements the MCP, ensuring compatibility with MCP clients.
    Downloads: 0 This Week
  • 9
    Swift Concurrency Agent Skill

    Add expert Swift Concurrency guidance to your AI coding tool

    ...Rather than teaching basic Swift, it targets the nuanced behaviors of concurrency primitives, actor isolation, and safety annotations like @MainActor and Sendable. It also clarifies how to reason about structured tasks, cancellation, and performance trade-offs.
    Downloads: 0 This Week
  • 10
    Phantasm

    Toolkits to create a human-in-the-loop approval layer

    Phantasm offers toolkits to create a human-in-the-loop approval layer to monitor and guide AI agents' workflows in real-time, ensuring safety and reliability in AI operations.
    Downloads: 0 This Week
  • 11
    Claude Code Tools

    Practical productivity tools for Claude Code, Codex-CLI

    ...Some components enable Claude Code to interact with terminal multiplexers such as tmux so that it can run programs, debug applications, and interact with scripts that require user input. The toolkit also provides safety mechanisms that prevent potentially dangerous shell commands from being executed automatically by AI agents.
    Downloads: 6 This Week
  • 12
    Eino

    LLM application development framework for Go with agents and flows

    ...Eino also offers orchestration capabilities that allow components to be connected into chains, graphs, or workflows for complex AI pipelines. These orchestration features handle concerns such as concurrency, streaming responses, and type safety so developers can focus on application logic.
    Downloads: 3 This Week
  • 13
    GitHub Agentic Workflows

    ...By writing intent in markdown files, a developer can quickly generate .yml Actions workflows that perform tasks such as summarizing issues, automating triage, generating reports, or maintaining documentation, all without manually crafting YAML logic from scratch. The system emphasizes safety and guardrails, running agents in sandboxed environments with minimal permissions by default, and using “safe outputs” to constrain what the workflow can write back into the repository. It includes tooling for compiling, testing, and iterating on agentic workflows locally and integrates with GitHub’s existing Actions ecosystem.
    Downloads: 3 This Week
  • 14
    xiaohongshu-mcp

    MCP for xiaohongshu.com

    xiaohongshu-mcp is a Model Context Protocol (MCP) server that equips AI assistants with first-class tools for working on Xiaohongshu (Little Red Book), focusing on day-to-day creator and operator workflows rather than generic browsing. The project centers on authenticated actions and data access that matter to content operations, such as checking login state, publishing or scheduling content, fetching recommendations and search results, reading post details, and acting on comments. It’s...
    Downloads: 85 This Week
  • 15
    ZAPI

    ZAPI by Adopt AI is an open-source Python library

    ZAPI is a developer-centric API framework that streamlines building, testing, and deploying APIs with strong type safety and minimal boilerplate, helping teams deliver backend services faster with fewer errors. It emphasizes a declarative router and schema model that uses types to define request and response formats, providing clear contracts for frontend and backend teams while automatically generating documentation. ZAPI abstracts many repetitive tasks such as validation, authentication flows, and error handling so developers can focus on business logic instead of infrastructure plumbing. ...
    Downloads: 1 This Week
  • 16
    Agentic Coding Flywheel Setup

    System tool for beginners wanting agentic engineering capabilities

    ...With a single shell installer, ACFS transforms a fresh compute environment into a ready-to-use development setup that includes modern shells, language runtimes, AI coding agents (like Claude Code, Codex CLI, and Gemini CLI), and a coordinated toolchain for orchestration and safety. The system is designed for developers who want to run multi-agent coding assistants on personal or VPS hosts with minimal manual configuration. It comes with a battle-tested suite of utilities for agent coordination, orchestration, and developer productivity enhancements, such as named tmux panes, agent mail coordination layers, and cloud CLI integrations.
    Downloads: 1 This Week
  • 17
    Anthropic SDK TypeScript

    Access to Anthropic's safety-first language model APIs

    ...Example usage shows how to instantiate the Anthropic client, call client.messages.create(...), and obtain responses. It supports streaming endpoints as well. Because TypeScript provides type safety, it helps avoid common errors in JSON interplay. The repo also includes documentation (API spec in api.md) and examples (e.g. streaming examples).
    Downloads: 1 This Week
  • 18
    AI Agents Masterclass

    Follow along with my AI Agents Masterclass videos

    ...The project includes structured lessons, code examples, and practical exercises that cover foundational concepts like prompt engineering, chaining agents, tool usage, plan execution, evaluation, and safety considerations. It breaks down how autonomous agents interact with external systems, handle iterative reasoning, and integrate with third-party services or APIs to perform real tasks — for example, web search, browsing, scheduling, or coding assistance. Students of the masterclass can follow written modules or Jupyter notebooks that illustrate concepts step by step and progressively build more capable agents. ...
    Downloads: 1 This Week
  • 19
    Ralph AI Agent

    AI agent loop that runs repeatedly until all PRD items are complete

    ...It provides a reactive loop where agents can repeatedly assess the current context, reason about the next best action using large language models, and execute actions across integrated tools and services. The runtime emphasizes safety boundaries by sandboxing operations, enforcing time and token limits, and isolating execution layers to prevent unpredictable side effects. Ralph also includes a built-in plugin system that lets developers attach custom tools, environment connectors, or monitoring hooks without modifying core logic. Designed for extensibility, the framework supports multi-model providers so agents can switch between models or fall back based on task needs. ...
    Downloads: 1 This Week
  • 20
    OpenHands

    Open-source autonomous AI software engineer

    ...So we're building all our agents in the open on GitHub, under the MIT license. Our agents can do anything a human developer can: they write code, run commands, and use the web. We're partnering with AI safety experts like Invariant Labs to balance innovation with security.
    Downloads: 18 This Week
  • 21
    Claw Code

    AI agent harness for AI coding agents

    ...It emphasizes harness engineering—how agents are structured, how they interact with tools, and how they maintain context during execution. The system is being actively expanded, with a Rust-based runtime in development to improve performance and memory safety. Overall, Claw Code serves as a research-driven platform for advancing agent-based software development systems.
    Downloads: 25 This Week
  • 22
    vLLM Semantic Router

    System Level Intelligent Router for Mixture-of-Models at Cloud

    ...The router operates as an intelligent layer between users and model infrastructure, capturing signals from prompts, responses, and contextual data to improve decision-making. It can also integrate safety and monitoring mechanisms that detect issues such as jailbreak attempts, hallucinations, or sensitive information exposure.
    Downloads: 0 This Week
  • 23
    WorkAny

    Desktop Agent for Any Task

    ...It acts as a unified environment where users can ask the AI to generate documents, presentations, websites, spreadsheets, organize files, or write code — all with real-time streaming outputs directly in the app, so you see results as the AI produces them. Powered by a combination of Claude Code as the primary runtime agent and a sandbox execution environment for safety, WorkAny integrates an agent SDK, MCP (Model Context Protocol) support, and custom skills to handle diverse tasks with contextual understanding. Users can connect multiple model providers, including OpenAI, OpenRouter, or custom endpoints, and WorkAny supports parallel task execution with asynchronous result viewing, enhancing productivity.
    Downloads: 0 This Week
  • 24
    Superagent

    Superagent protects your AI applications

    Superagent is an open-source AI safety platform built to protect applications from prompt injections, data leaks, and harmful outputs. It embeds real-time safety directly into AI workflows, helping teams secure models before threats cause damage. Superagent provides guardrails that block jailbreaks, prompt manipulation, and sensitive data exfiltration. It includes redaction tools to remove PII, PHI, and secrets automatically from text.
    Downloads: 0 This Week
  • 25
    Heretic

    Fully automatic censorship removal for language models

    Heretic is an open-source Python tool that automatically removes the built-in censorship or “safety alignment” from transformer-based language models so they respond to a broader range of prompts with fewer refusals. It works by applying directional ablation techniques and a parameter optimization strategy to adjust internal model behaviors without expensive post-training or altering the core capabilities. Designed for researchers and advanced users, Heretic makes it possible to study and experiment with uncensored model responses in a reproducible, automated way. ...
    Downloads: 12 This Week