Opik download | SourceForge.net

Confidently evaluate, test, and monitor LLM applications. Opik is an open-source platform for evaluating, testing, and monitoring LLM applications. Built by Comet. Record, sort, search, and understand each step your LLM app takes to generate a response. Manually annotate, view, and compare LLM responses in a user-friendly table. Log traces during development and in production. Run experiments with different prompts and evaluate against a test set. Choose and run pre-configured evaluation metrics or define your own with our convenient SDK library. Consult built-in LLM judges for complex issues like hallucination detection, factuality, and moderation.

Features

Track all LLM calls and traces during development and production
Annotate your LLM calls by logging feedback scores using the Python SDK or the UI
Automate the evaluation process of your LLM application
Store test cases and run experiments
Use Opik's LLM as a judge metric for complex issues like hallucination detection, moderation and RAG evaluation
Run evaluations as part of your CI/CD pipeline using our PyTest integration

Project Samples

Project Activity

See All Activity >

License

Apache License V2.0

Follow Opik

Opik Web Site

Other Useful Business Software

Our Free Plans just got better! | Auth0

With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.

Try free now

Rate This Project

User Reviews

Be the first to post a review of Opik!

Additional Project Details

Programming Language

Java

Related Categories

Java Artificial Intelligence Software, Java Large Language Models (LLM), Java Observability Tool

Registered

2024-11-08

Similar Business Software

Opik

Confidently evaluate, test, and ship LLM applications with a suite of observability tools to calibrate language model outputs across your dev and production lifecycle. Log traces and spans, define and compute evaluation metrics, score LLM outputs, compare performance across app versions, and...

See Software
StackAI

StackAI is an enterprise AI automation platform to build end-to-end internal tools and processes with AI agents in a fully compliant and secure way. Designed for large, regulated organizations, it enables teams to automate complex workflows across operations, compliance, finance, IT, and support...

See Software
Pipefy

Pipefy is the AI-driven Business Orchestration and Automation Technologies (BOAT) platform that delivers enterprise results in days, not months. Designed as a secure orchestration layer, Pipefy bridges the gap between rigid legacy systems (ERPs/CRMs) and agile business needs. It allows IT...

See Software
Retool

Retool is the AI-native enterprise app development platform where teams build and ship production-ready apps — at AI speed, with enterprise governance built in. Describe what you need and get a working app, import React-based apps from Lovable, Replit, or Claude Code, or connect your AI agent...

See Software
Grafana Cloud

Grafana Labs delivers the leading AI-powered observability platform, built around Grafana—the world’s most widely adopted open source technology for dashboards and visualization. Recognized as a Leader in the 2026 Gartner® Magic Quadrant™ for Observability Platforms (3x) and furthest in...

See Software
New Relic

There are an estimated 25 million engineers in the world across dozens of distinct functions. As every company becomes a software company, engineers are using New Relic to gather real-time insights and trending data about the performance of their software so they can be more resilient and...

See Software