Showing 17 open source projects for "test"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Cloud tools for web scraping and data extraction Icon
    Cloud tools for web scraping and data extraction

    Deploy pre-built tools that crawl websites, extract structured data, and feed your applications. Reliable web data without maintaining scrapers.

    Automate web data collection with cloud tools that handle anti-bot measures, browser rendering, and data transformation out of the box. Extract content from any website, push to vector databases for RAG workflows, or pipe directly into your apps via API. Schedule runs, set up webhooks, and connect to your existing stack. Free tier available, then scale as you need to.
    Explore 10,000+ tools
  • 1
    Empirical

    Empirical

    Test and evaluate LLMs and model configurations

    Empirical is the fastest way to test different LLMs and model configurations, across all the scenarios that matter for your application.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    promptfoo

    promptfoo

    Evaluate and compare LLM outputs, catch regressions, improve prompts

    ...Use built-in metrics, LLM-graded evals, or define your own custom metrics. Compare prompts and model outputs side-by-side, or integrate the library into your existing test/CI workflow. Use OpenAI, Anthropic, and open-source models like Llama and Vicuna, or integrate custom API providers for any LLM API.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    Opik

    Opik

    Open-source end-to-end LLM Development Platform

    ...Run experiments with different prompts and evaluate against a test set. Choose and run pre-configured evaluation metrics or define your own with our convenient SDK library. Consult built-in LLM judges for complex issues like hallucination detection, factuality, and moderation.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 4
    Langflow

    Langflow

    Low-code app builder for RAG and multi-agent AI applications

    Langflow is a low-code app builder for RAG and multi-agent AI applications. It’s Python-based and agnostic to any model, API, or database.
    Downloads: 6 This Week
    Last Update:
    See Project
  • Axe Credit Portal - ACP- is axefinance’s future-proof AI-driven solution to digitalize the loan process from KYC to servicing, available as a locally hosted or cloud-based software. Icon
    Axe Credit Portal - ACP- is axefinance’s future-proof AI-driven solution to digitalize the loan process from KYC to servicing, available as a locally hosted or cloud-based software.

    Banks, lending institutions

    Founded in 2004, axefinance is a global market-leading software provider focused on credit risk automation for lenders looking to provide an efficient, competitive, and seamless omnichannel financing journey for all client segments (FI, Retail, Commercial, and Corporate.)
    Learn More
  • 5
    langrocks

    langrocks

    Tools like web browser, computer access and code runner for LLMs

    Langrocks is a programming language experimentation toolkit that enables developers to create, test, and optimize custom programming languages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    MiniMax-M1

    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    MiniMax-M1 is presented as the world’s first open-weight, large-scale hybrid-attention reasoning model, designed to push the frontier of long-context, tool-using, and deeply “thinking” language models. It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to support a native context length of 1 million tokens while using far fewer FLOPs than comparable reasoning models for very long generations. The team emphasizes efficient scaling of test-time compute: at 100K-token generation lengths, M1 reportedly uses only about 25 percent of the FLOPs of some competing models, making extended “think step” traces more feasible. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Automated Interpretability

    Automated Interpretability

    Code for Language models can explain neurons in language models paper

    ...It includes a “neuron explainer” component that, given a target neuron or latent feature, proposes natural language explanations or heuristics (e.g. “this neuron activates when the input has property X”) and then simulates activation behavior across example inputs to test whether the explanation holds. The project also contains a “neuron viewer” web component for browsing neurons, explanations, and activation patterns, making it more interactive and exploratory.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    Pezzo

    Pezzo

    Open-source, developer-first LLMOps platform

    Pezzo enables you to build, test, monitor and instantly ship AI all in one platform, while constantly optimizing for cost and performance. Packed with powerful features to streamline your workflow, so you can focus on what matters. Pezzo is a fully cloud-native and open-source LLMOps platform. Seamlessly observe and monitor your AI operations, troubleshoot issues, save up to 90% on costs and latency, collaborate and manage your prompts in one place, and instantly deliver AI changes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Streamline Analyst

    Streamline Analyst

    AI agent that streamlines the entire process of data analysis

    ...This Data Analysis Agent effortlessly automates all the tasks such as data cleaning, preprocessing, and even complex operations like identifying target objects, partitioning test sets, and selecting the best-fit models based on your data. With Streamline Analyst, results visualization and evaluation become seamless.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Smart Business Texting that Generates Pipeline Icon
    Smart Business Texting that Generates Pipeline

    Create and convert pipeline at scale through industry leading SMS campaigns, automation, and conversation management.

    TextUs is the leading text messaging service provider for businesses that want to engage in real-time conversations with customers, leads, employees and candidates. Text messaging is one of the most engaging ways to communicate with customers, candidates, employees and leads. 1:1, two-way messaging encourages response and engagement. Text messages help teams get 10x the response rate over phone and email. Business text messaging has become a more viable form of communication than traditional mediums. The TextUs user experience is intentionally designed to resemble the familiar SMS inbox, allowing users to easily manage contacts, conversations, and campaigns. Work right from your desktop with the TextUs web app or use the Chrome extension alongside your ATS or CRM. Leverage the mobile app for on-the-go sending and responding.
    Learn More
  • 10
    Cake

    Cake

    Distributed LLM and StableDiffusion inference

    ...Unlike many simple proxies, Cake can act as a full connection broker: it can bind to arbitrary interfaces, handle simultaneous upstream/downstream sessions, and apply traffic rules on the fly. This makes it suitable for troubleshooting tricky network behavior, simulating network conditions, or chaining services in a modular test environment.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    BIG-bench

    BIG-bench

    Beyond the Imitation Game collaborative benchmark for measuring

    BIG-bench (Beyond the Imitation Game Benchmark) is a large, collaborative benchmark suite designed to probe the capabilities and limitations of large language models across hundreds of diverse tasks. Rather than focusing on a single metric or domain, it aggregates many hand-authored tasks that test reasoning, commonsense, math, linguistics, ethics, and creativity. Tasks are intentionally heterogeneous: some are multiple-choice with exact scoring, others are free-form generation judged by model-based or human evaluation. The suite provides a common JSON task format and an evaluation harness so research groups can contribute new tasks and reproduce results consistently. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Aviary

    Aviary

    Ray Aviary - evaluate multiple LLMs easily

    Aviary is an LLM serving solution that makes it easy to deploy and manage a variety of open source LLMs. Providing an extensive suite of pre-configured open source LLMs, with defaults that work out of the box. Supporting Transformer models hosted on Hugging Face Hub or present on local disk. Aviary has native support for autoscaling and multi-node deployments thanks to Ray and Ray Serve. Aviary can scale to zero and create new model replicas (each composed of multiple GPU workers) in...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    unit-minions

    unit-minions

    AI R&D Efficiency Improvement Research: Do-It-Yourself Training LoRA

    "AI R&D Efficiency Improvement Research: Do-It-Yourself Training LoRA", including Llama (Alpaca LoRA) model, ChatGLM (ChatGLM Tuning) related Lora training. Training content: user story generation, test code generation, code-assisted generation, text to SQL, text generation code.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    LLaMA.go

    LLaMA.go

    llama.go is like llama.cpp in pure Golang

    llama.go is like llama.cpp in pure Golang. The code of the project is based on the legendary ggml.cpp framework of Georgi Gerganov written in C++ with the same attitude to performance and elegance. Both models store FP32 weights, so you'll needs at least 32Gb of RAM (not VRAM or GPU RAM) for LLaMA-7B. Double to 64Gb for LLaMA-13B.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 15
    PromptCraft-Robotics

    PromptCraft-Robotics

    Community for applying LLMs to robotics and a robot simulator

    The PromptCraft-Robotics repository serves as a community for people to test and share interesting prompting examples for large language models (LLMs) within the robotics domain. We also provide a sample robotics simulator (built on Microsoft AirSim) with ChatGPT integration for users to get started. We currently focus on OpenAI's ChatGPT, but we also welcome examples from other LLMs (for example open-sourced models or others with API access such as GPT-3 and Codex).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    DomE

    DomE

    Implements a reference architecture for creating information systems

    DomE Experiment is an implementation of a reference architecture for creating information systems from the automated evolution of the domain model. The architecture comprises elements that guarantee user access through automatically generated interfaces for various devices, integration with external information sources, data and operations security, automatic generation of analytical information, and automatic control of business processes. All these features are generated from the domain...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Grade School Math

    Grade School Math

    8.5K high quality grade school math problems

    The grade-school-math repository (sometimes called GSM8K) is a curated dataset of 8,500 high-quality grade school math word problems intended for evaluating mathematical reasoning capabilities of language models. It is structured into 7,500 training problems and 1,000 test problems. These aren’t trivial exercises — many require multi-step reasoning, combining arithmetic operations, and handling intermediate steps (e.g. “If she sold half as many in May… how many in total?”). The problems are written by human authors (not automatically generated) to ensure linguistic variety and realism. The repository maintains strict formatting (e.g. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next