DeepEval is a simple-to-use, open-source LLM evaluation framework, for evaluating and testing large-language model systems. It is similar to Pytest but specialized for unit testing LLM outputs. DeepEval incorporates the latest research to evaluate LLM outputs based on metrics such as G-Eval, hallucination, answer relevancy, RAGAS, etc., which uses LLMs and various other NLP models that run locally on your machine for evaluation. Whether your application is implemented via RAG or fine-tuning, LangChain, or LlamaIndex, DeepEval has you covered. With it, you can easily determine the optimal hyperparameters to improve your RAG pipeline, prevent prompt drifting, or even transition from OpenAI to hosting your own Llama2 with confidence.

Features

  • Large variety of ready-to-use LLM evaluation metrics (all with explanations) powered by ANY LLM of your choice
  • Red team your LLM application for 40+ safety vulnerabilities in a few lines of code
  • Documentation available
  • Examples available
  • Evaluate your entire dataset in bulk in under 20 lines of Python code in parallel. Do this via the CLI in a Pytest-like manner, or through our evaluate() function
  • Create your own custom metrics that are automatically integrated with DeepEval's ecosystem by inheriting DeepEval's base metric class

Project Samples

Project Activity

See All Activity >

License

Apache License V2.0

Follow DeepEval

DeepEval Web Site

Other Useful Business Software
Enterprise-grade ITSM, for every business Icon
Enterprise-grade ITSM, for every business

Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity.

Freshservice is an intuitive, AI-powered platform that helps IT, operations, and business teams deliver exceptional service without the usual complexity. Automate repetitive tasks, resolve issues faster, and provide seamless support across the organization. From managing incidents and assets to driving smarter decisions, Freshservice makes it easy to stay efficient and scale with confidence.
Try it Free
Rate This Project
Login To Rate This Project

User Reviews

Be the first to post a review of DeepEval!

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Artificial Intelligence Software

Registered

2024-11-08