Showing 292 open source projects for "scoring"

  • 1
    ImageReward

    [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences

    ImageReward is the first general-purpose human preference reward model (RM) designed for evaluating text-to-image generation, introduced alongside the NeurIPS 2023 paper ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation. Trained on 137k expert-annotated image pairs, ImageReward significantly outperforms existing scoring methods like CLIP, Aesthetic, and BLIP in capturing human visual preferences. It is provided as a Python package (image-reward) that enables quick scoring of generated images against textual prompts, with APIs for ranking, scoring, and filtering outputs. Beyond evaluation, ImageReward supports Reward Feedback Learning (ReFL), a method for directly fine-tuning diffusion models such as Stable Diffusion using human-preference feedback, leading to demonstrable improvements in image quality.
    Downloads: 2 This Week
  • 2
    Retro Bowl

    Retro Bowl is an American football game in retro style

    ...The project focuses on delivering a simplified yet engaging sports simulation where players take on the role of both coach and general manager, making decisions about roster management, player upgrades, and in-game strategy. It recreates core mechanics such as passing, scoring, and play execution while maintaining a retro aesthetic that emphasizes clarity and responsiveness over realism. The codebase is structured to allow developers to modify gameplay systems, tweak mechanics, or expand features, making it suitable for experimentation and learning. It also aims to be portable and easy to run across environments, lowering the barrier for both players and contributors.
    Downloads: 23 This Week
  • 3
    X For You Feed Algorithm

    Algorithm powering the For You feed on X

    X For You Feed Algorithm is the open-sourced core recommendation system that powers the For You feed on X (the social network formerly known as Twitter), and it represents one of the first times a major social platform has published production-level ranking code for public review and experimentation. The repository contains the full pipeline that ingests user engagement and content candidate data, processes it through retrieval, hydration, filtering, scoring, and selection layers, and ultimately ranks posts to show what appears in a user’s feed. At its heart, the system uses a transformer-based model adapted from xAI’s Grok architecture to predict probabilities for various user actions (such as likes, replies, reposts, clicks, and negative signals), then combines those into a weighted final score that drives ranking.
    Downloads: 0 This Week
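The weighted combination described for the X For You Feed Algorithm can be sketched roughly as follows. This is a minimal illustration, not code from the repository: the action names, the weight values, and the helpers `final_score` and `rank_posts` are all hypothetical, and the per-action probabilities are assumed to come from some upstream model.

```python
from typing import Dict, List

# Hypothetical action weights, chosen only for illustration.
# They are NOT the values used by X.
ACTION_WEIGHTS: Dict[str, float] = {
    "like": 1.0,
    "reply": 2.0,
    "repost": 1.5,
    "click": 0.5,
    "report": -5.0,  # a negative signal lowers the score
}


def final_score(action_probs: Dict[str, float],
                weights: Dict[str, float] = ACTION_WEIGHTS) -> float:
    """Combine predicted per-action probabilities into one weighted score."""
    return sum(weights.get(action, 0.0) * p for action, p in action_probs.items())


def rank_posts(candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Return post ids sorted by descending final score."""
    return sorted(candidates, key=lambda pid: final_score(candidates[pid]), reverse=True)


# Made-up probabilities standing in for model output.
candidates = {
    "post_a": {"like": 0.30, "reply": 0.05, "report": 0.00},
    "post_b": {"like": 0.10, "reply": 0.20, "report": 0.01},
    "post_c": {"like": 0.40, "reply": 0.01, "report": 0.10},
}
ranking = rank_posts(candidates)
print(ranking)
```

In this toy data the post with the most likes ranks last, because its higher probability of a negative signal outweighs the positive actions.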
  • 4
    NGBoost

    Natural Gradient Boosting for Probabilistic Prediction

    ngboost is a Python library that implements Natural Gradient Boosting, as described in "NGBoost: Natural Gradient Boosting for Probabilistic Prediction". It is built on top of Scikit-Learn and is designed to be scalable and modular with respect to the choice of proper scoring rule, distribution, and base learner. A didactic introduction to the methodology underlying NGBoost is available in this slide deck.
    Downloads: 0 This Week
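To make "proper scoring rule" concrete, here is a small sketch of one well-known example, the log score (negative log-likelihood), for a Gaussian predictive distribution. It uses only the standard library and does not call the ngboost package; the function name `gaussian_log_score` is an assumption made for illustration.

```python
import math


def gaussian_log_score(y: float, mu: float, sigma: float) -> float:
    """Negative log-likelihood of y under N(mu, sigma^2). Lower is better."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2 * sigma ** 2)


# The same observation scored against three predictive distributions.
y_observed = 1.0
sharp_and_right = gaussian_log_score(y_observed, mu=1.0, sigma=0.5)
vague_but_right = gaussian_log_score(y_observed, mu=1.0, sigma=2.0)
sharp_and_wrong = gaussian_log_score(y_observed, mu=3.0, sigma=0.5)

print(sharp_and_right, vague_but_right, sharp_and_wrong)
```

A confident, well-centred forecast gets the best (lowest) score, while a confident but wrong forecast is penalised more heavily than a vague one.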
  • 5
    Prometheus-Eval

    Evaluate your LLM's response with Prometheus and GPT4

    ...It also provides training data and utilities for fine-tuning evaluator models so they can assess outputs according to custom scoring rubrics such as helpfulness, accuracy, or style.
    Downloads: 0 This Week
  • 6
    CyberStrikeAI

    CyberStrikeAI is an AI-native security testing platform built in Go

    ...It supports role-based testing, letting teams define security roles with tailored tool access and prompts, and includes a skills system that encapsulates specialized testing strategies that the AI can incorporate into its planning. Through comprehensive lifecycle management, results are tracked, aggregated, and visualized, with support for versioned persistence, search, and risk severity scoring.
    Downloads: 2 This Week
  • 7
    darwin-skill

    Autoresearch-inspired autonomous skill optimization for Claude Code

    ...Instead of treating prompts or skill definitions as static assets, the system applies a continuous improvement cycle that evaluates performance, proposes changes, tests outcomes, and either retains or reverts modifications. The framework introduces a scoring system across multiple dimensions, enabling quantitative assessment of skill quality and ensuring that only improvements are preserved over time. It incorporates a “ratchet mechanism” similar to version control workflows, guaranteeing that performance never degrades as iterations progress. The system also separates the agents responsible for editing and evaluating skills to avoid bias, which improves the reliability of optimization results.
    Downloads: 1 This Week
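The "ratchet" idea in the darwin-skill description (keep a change only if the score improves, otherwise revert) can be sketched as a simple loop. This is a generic illustration under assumptions, not the project's implementation: `ratchet_optimize`, `propose`, and `evaluate` are hypothetical names, and the "skill" is reduced to a single integer so the example stays runnable.

```python
from typing import Callable, List, Tuple


def ratchet_optimize(initial: int,
                     propose: Callable[[int], int],
                     evaluate: Callable[[int], float],
                     iterations: int) -> Tuple[int, List[float]]:
    """Accept a proposed change only when it scores strictly higher."""
    best = initial
    best_score = evaluate(best)
    history = [best_score]
    for _ in range(iterations):
        candidate = propose(best)       # the "editor" role suggests a change
        score = evaluate(candidate)     # a separate "evaluator" role scores it
        if score > best_score:          # retain improvements, revert the rest
            best, best_score = candidate, score
        history.append(best_score)
    return best, history


# Toy setup: the score peaks when the value reaches 10.
deltas = iter([4, -3, 5, 3, -1])
best, history = ratchet_optimize(
    initial=0,
    propose=lambda x: x + next(deltas),
    evaluate=lambda x: -float((x - 10) ** 2),
    iterations=5,
)
print(best, history)
```

Because rejected candidates are discarded, the recorded best score never decreases from one iteration to the next.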
  • 8
    TreeQuest

    A Tree Search Library with Flexible API for LLM Inference-Time Scaling

    TreeQuest, developed by SakanaAI, is a versatile Python library implementing adaptive tree search algorithms, such as AB‑MCTS, for enhancing inference-time performance of large language models (LLMs). It allows developers to define custom state-generation and scoring functions (e.g., via LLMs), and then efficiently explores possible answer trees during runtime. With support for multi-LLM collaboration, checkpointing, and mixed policies, TreeQuest enables smarter, trial‑and‑error question answering by leveraging both breadth (multiple attempts) and depth (iterative refinement) strategies to find better outputs dynamically.
    Downloads: 0 This Week
  • 9
    Job Recommend

    The basics of building a job recommendation workflow

    ...You can study how to transform raw text into features and how to evaluate simple heuristics or baseline models. The code encourages experimentation, inviting you to swap scoring rules, adjust weights, or plug in alternative representations. It serves as a starting point for understanding recommendation pipelines before moving to production-grade systems.
    Downloads: 0 This Week
  • 10
    AutoAgent AI

    Autonomous harness engineering

    ...Instead of manually tuning prompts or workflows, developers define high-level goals in a configuration file, and the system continuously modifies its own tools, orchestration, and logic based on benchmark performance. It operates through a loop of testing, analyzing failures, and refining the agent’s configuration to maximize a scoring metric. The framework uses a single-file agent harness combined with structured tasks and evaluation suites to guide optimization. It runs inside Docker for safe execution and reproducibility. This approach shifts agent development from manual design to automated optimization. The system is particularly useful for building domain-specific agents that need continuous performance improvement.
    Downloads: 1 This Week
  • 11
    Output

    TypeScript framework for building AI workflows and agents

    ...It eliminates reliance on fragmented SaaS tools by providing all necessary components locally, ensuring better transparency and control over data and processes. Output includes built-in evaluation systems, such as LLM-as-a-judge scoring, and integrates workflow orchestration tools like Temporal to handle retries, parallel execution, and state management. It also supports multiple model providers through a unified API.
    Downloads: 1 This Week
  • 12
    Stop Slop

    A skill file for removing AI tells from prose

    ...The project targets common AI habits such as filler openings, overused contrasts, unnecessary adverbs, vague language, passive phrasing, and metronomic sentence rhythm. It also includes a scoring rubric that rates drafts across dimensions such as directness, rhythm, trust, authenticity, and density. The skill is useful for drafting, editing, polishing, and quality-checking prose before publication. Its main value is giving writers and AI assistants a practical checklist for making text feel less synthetic and more intentional.
    Downloads: 0 This Week
  • 13
    ChainForge

    An open-source visual programming environment

    ...The platform enables rapid experimentation by generating permutations of prompts and inputs, making it possible to test hundreds of variations in parallel and analyze performance trends more effectively. It also includes evaluation nodes that allow developers to define scoring functions, enabling automated benchmarking of outputs based on custom criteria such as accuracy, formatting, or relevance.
    Downloads: 0 This Week
  • 14
    what-to-eat

    An AI-based intelligent recipe generation platform

    ...It supports a wide range of cuisines, including traditional Chinese regional styles and international dishes, making it versatile for different cultural preferences. The system goes beyond simple recipe suggestions by including features such as wine pairing recommendations, sauce design, and health scoring, providing a more holistic cooking experience. It also includes a dynamic configuration system that allows users to switch between AI models and adjust parameters in real time without restarting the application.
    Downloads: 0 This Week
  • 15
    Cinephage

    The AIO solution to your self-hosted media gathering needs

    ...The platform boasts built-in support for multiple media acquisition paths (including torrent and Usenet indexers, streaming fallback mechanisms, and metadata discovery), smart quality and format scoring, and automated subtitle fetching, all while feeding results to compatible players and media servers.
    Downloads: 0 This Week
  • 16
    AI Marketing Skills

    Open-source AI marketing skills for Claude Code

    AI Marketing Skills is a comprehensive open-source framework designed to transform AI agents into fully operational marketing and sales systems by equipping them with structured, reusable “skills” that automate real business workflows. Instead of simple prompts, the project provides complete operational modules that include scripts, scoring systems, and decision-making logic, allowing AI tools like Claude Code to execute complex marketing tasks end-to-end. The system is organized into multiple domains such as growth experimentation, sales pipeline generation, content production, outbound marketing, SEO optimization, and financial analysis, effectively covering the entire revenue lifecycle of a business. ...
    Downloads: 1 This Week
  • 17
    CTFd

    CTFs as you need them

    ...It comes with everything you need to run a CTF and it's easy to customize with plugins and themes. Create your own challenges, categories, hints, and flags from the Admin Interface. Dynamic Scoring Challenges. Unlockable challenge support. Challenge plugin architecture to create your own custom challenges. Static & Regex-based flags. Custom flag plugins. Unlockable hints. File uploads to the server or an Amazon S3-compatible backend. Limit challenge attempts & hide challenges. Automatic bruteforce protection. Individual and Team-based competitions. ...
    Downloads: 1 This Week
  • 18
    WebGLM

    An Efficient Web-enhanced Question Answering System

    ...WebGLM introduces several components that coordinate this process, including a retrieval module that selects relevant web documents, a generator that produces answers, and a scoring system that evaluates the quality of generated responses. The architecture aims to improve the reliability and usefulness of AI systems that answer questions about current or external knowledge sources.
    Downloads: 0 This Week
  • 19
    TKD Scoring Wi-Fi

    TKD Scoring Wi-Fi Server supporting Android and iPhone clients

    ...This looks even more professional and easier to use than any of the programs I’ve ever seen used at comps.” and “Hey... just installed everything and it’s unreal! Everything works perfectly, looks great!” encouraged us to make it public for everybody. Like any scoring system, the “TKD Scoring Wi-Fi” system has a few basic components:
    * TKD Scoring Wi-Fi Server (PC or Android)
    * TKD Scoring Wi-Fi Client (Android or iPhone)
    * TKD Scoring Wi-Fi Remote Score Display (Android)
    Downloads: 53 This Week
  • 20
    Elfeed Emacs Web Feed Reader

    An Emacs web feeds client

    Elfeed is an extensible web feed reader for Emacs, supporting both Atom and RSS. It requires Emacs 24.3 and is available for download from MELPA or el-get.
    Downloads: 0 This Week
  • 21
    StabilityMatrix

    Multi-Platform Package Manager for Stable Diffusion

    ...It provides a framework to run experiments systematically—capturing inputs, model configurations, outputs, and metrics—so researchers and practitioners can reason about differences in quality, robustness, and failure modes. The repository often bundles tooling for automated prompt sweeping, scoring heuristics (such as diversity, coherence, or task-specific metrics), and visualization helpers to make comparisons interpretable. This approach is useful for model selection, prompt engineering, and benchmarking new checkpoints against baseline models under reproducible conditions. By turning ad-hoc tests into tracked experiments, StabilityMatrix reduces bias, surfaces subtle regressions, and accelerates iteration when tuning generative systems.
    Downloads: 83 This Week
  • 22
    Agent Zero

    Agent Zero AI framework

    ...There is a lot of freedom in this framework. You can instruct your agents to regularly report back to superiors asking for permission to continue. You can instruct them to use point-scoring systems when deciding when to delegate subtasks. Superiors can double-check subordinates' results and disputes.
    Downloads: 27 This Week
  • 23
    ZomboDB

    Making Postgres and Elasticsearch work together like it's 2023

    ZomboDB is a PostgreSQL extension that integrates Elasticsearch directly into Postgres, allowing for powerful full-text search and analytics capabilities. It manages Elasticsearch indices transparently, ensuring transactional consistency and enabling complex queries through SQL.
    Downloads: 0 This Week
  • 24
    Memory OS

    A 7-layer memory operating system for Hermes Agent

    Memory OS is a local memory operating system for Hermes Agent. It is designed to help an AI agent retain project context, decisions, structured facts, reasoning patterns, and prior conversations across sessions. The system uses seven memory layers that combine flat files, SQLite, full-text search, structured facts, semantic recall, Qdrant vector storage, and a self-curating wiki pipeline. It injects only relevant context back into the agent so memory remains useful without wasting tokens....
    Downloads: 3 This Week
  • 25
    Supermemory

    Memory engine and app that is extremely fast, scalable

    ...It often incorporates clustering, semantic search, and summarization modules to reduce cognitive load and surface key ideas, which makes it useful for research, study, writing, and long-term project tracking. Users can interact with the system via conversational queries or traditional search interfaces, and the system leverages vector embeddings and memory scoring to prioritize the most relevant results.
    Downloads: 3 This Week