Ensure high-quality LLM outputs with automatic evals. Tune prompts against a representative sample of user inputs to reduce subjectivity, score outputs with built-in metrics, LLM-graded evals, or your own custom metrics, and compare prompts and model outputs side-by-side or integrate the library into your existing test/CI workflow. Works with OpenAI, Anthropic, and open-source models such as Llama and Vicuna, or with custom API providers for any LLM API.

Features

  • Create a list of test cases
  • Set up evaluation metrics
  • Select the best prompt & model
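The steps above (test cases, metrics, prompt/model selection) map onto a single declarative config file. A minimal sketch of a `promptfooconfig.yaml`, where the prompt text, model names, and assertion values are illustrative placeholders:

```yaml
# Prompts to compare; {{variables}} are filled in from each test case
prompts:
  - "Summarize in one sentence: {{text}}"
  - "TL;DR: {{text}}"

# Providers to run each prompt against (IDs here are examples)
providers:
  - openai:gpt-4o-mini
  - anthropic:messages:claude-3-5-haiku-latest

# Test cases: representative user inputs plus evaluation metrics
tests:
  - vars:
      text: "The quarterly report shows revenue grew 12% year over year."
    assert:
      - type: contains       # built-in deterministic metric
        value: "revenue"
      - type: llm-rubric     # LLM-graded eval
        value: "Is a faithful one-sentence summary of the input"
```

Running `npx promptfoo eval` executes every prompt × provider × test combination, and `npx promptfoo view` opens the side-by-side comparison; the eval command's exit code can gate a CI job.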


License

MIT License


Additional Project Details

  • Programming Language: TypeScript
  • Related Categories: TypeScript, Large Language Models (LLM)
  • Registered: 2023-08-25