Showing 180 open source projects for "reasoning models"

  • 1
    LaVague

    Framework for building AI agents that automate complex web tasks

    ...LaVague is centered around a World Model that analyzes the current webpage state and determines the next set of instructions, combined with an Action Engine that converts those instructions into executable automation code. It can use browser automation tools such as Selenium or Playwright to interact with websites programmatically. Developers can integrate various language models and configure the agent’s reasoning and execution behavior to suit different automation scenarios.
    Downloads: 3 This Week
  • 2
    Vulnhuntr

    AI tool for detecting complex vulnerabilities in Python codebases

    Vulnhuntr is an open source security tool that uses large language models to analyze codebases and identify remotely exploitable vulnerabilities. It focuses on Python projects and applies static code analysis combined with LLM reasoning to trace how user input flows through an application. Instead of scanning entire repositories at once, it builds call chains step by step, allowing deeper inspection of complex, multi-stage issues that traditional tools may miss.
    Downloads: 2 This Week
  • 3
    deepclaude

    Use Claude Code's agent loop with DeepSeek V4 Pro, OpenRouter & more

    deepclaude is a lightweight proxy tool that enables developers to run Claude Code’s autonomous coding agent loop using alternative AI backends like DeepSeek V4 Pro, OpenRouter, or other Anthropic-compatible models. It preserves the full Claude Code experience—including file editing, terminal execution, and multi-step agent workflows—while dramatically reducing operational costs. By swapping out the underlying model instead of the interface, deepclaude delivers the same familiar UX with significantly cheaper token pricing. The platform supports seamless backend switching in real time, allowing users to choose between cost efficiency and higher reasoning power when needed. ...
    Downloads: 3 This Week
  • 4
    Agentic Data Scientist

    An end-to-end Data Scientist

    Agentic Data Scientist is an experimental AI-driven research framework that orchestrates data science workflows through autonomous agents that can reason, plan, and execute complex analytics tasks. Unlike traditional scripted pipelines, this project lets AI agents break down high-level research goals into sub-tasks such as data acquisition, cleaning, modeling, evaluation, and reporting, with minimal human direction. Each agent is designed to independently call functions, interact with data...
    Downloads: 0 This Week
  • 5
    Qwen-Agent

    Agent framework and applications built upon Qwen>=3.0

    Qwen-Agent is a framework for building applications and agents on Qwen models (version 3.0+). It provides components for instruction following, tool use (function calling), planning, memory, retrieval-augmented generation (RAG), and a code interpreter, among others. It ships with example applications (Browser Assistant, Code Interpreter, Custom Assistant) and supports GUI front ends, backends, and server setups. An agent workflow can maintain context and memory to carry out multi-turn or more complex logic over time....
    Downloads: 1 This Week
  • 6
    Matrix

    Multi-Agent daTa geneRation Infra and eXperimentation framework

    ...That design makes Matrix particularly well-suited for large-batch inference, model benchmarking, data curation, augmentation, or generation — whether for language, code, dialogue, or multimodal tasks. It supports both open-source LLMs and proprietary models (via integration with model backends), and works with containerized or sandboxed environments for safe tool execution or external code runs.
    Downloads: 0 This Week
  • 7
    GLM-4-32B-0414

    Open Multilingual Multimodal Chat LMs

    GLM-4-32B-0414 is a powerful open-source large language model featuring 32 billion parameters, designed to deliver performance comparable to leading models like OpenAI’s GPT series. It supports multilingual and multimodal chat capabilities with an extensive 32K token context length, making it ideal for dialogue, reasoning, and complex task completion. The model is pre-trained on 15 trillion tokens of high-quality data, including substantial synthetic reasoning datasets, and further enhanced with reinforcement learning and human preference alignment for improved instruction-following and function calling. ...
    Downloads: 1 This Week
  • 8
    Alpaca-CoT

    We unified the interfaces of instruction-tuning data

    ...This chain-of-thought supervision helps models perform better on tasks requiring structured reasoning, such as mathematics, logic puzzles, and analytical problem solving. The repository includes datasets, training scripts, and examples demonstrating how chain-of-thought data can be used to fine-tune language models. It also explores how reasoning traces generated by larger models can be distilled into smaller models.
    Downloads: 0 This Week
  • 9
    MGIE

    Guiding Instruction-based Image Editing via Multimodal Large Language

    MGIE—Guiding Instruction-based Image Editing—demonstrates how a multimodal LLM can parse natural-language editing instructions and then drive image transformations accordingly. The project focuses on making edits explainable and controllable: the model interprets text guidance, reasons over image content, and outputs edits aligned with user intent. It’s positioned as an ICLR 2024 Spotlight work, with code and references that show how to connect language planning to concrete image operations....
    Downloads: 0 This Week
  • 10
    Ailice

    AIlice is a fully autonomous, general-purpose AI agent

    AIlice is an open-source autonomous AI agent framework built to function as a general-purpose assistant that can plan, decompose, and execute complex tasks through a structured multi-agent architecture. The project presents itself as a standalone assistant powered by open-source language models, with an internal design that treats user requests almost like executable programs rather than simple chat prompts. Its core IACT architecture allows the system to break large goals into smaller sub-tasks, assign them to dynamically created agents, and combine the results with a focus on resilience and fault tolerance. AIlice is designed for a wide range of workloads, including coding, thematic research, literature analysis, system management, and mixed workflows that require several reasoning modes at once.
    Downloads: 0 This Week
  • 11
    BIG-bench

    Beyond the Imitation Game collaborative benchmark for measuring

    BIG-bench (Beyond the Imitation Game Benchmark) is a large, collaborative benchmark suite designed to probe the capabilities and limitations of large language models across hundreds of diverse tasks. Rather than focusing on a single metric or domain, it aggregates many hand-authored tasks that test reasoning, commonsense, math, linguistics, ethics, and creativity. Tasks are intentionally heterogeneous: some are multiple-choice with exact scoring, others are free-form generation judged by model-based or human evaluation. ...
    Downloads: 2 This Week
  • 12
    InternLM

    Official release of InternLM series

    ...InternLM’s direction includes strong general-purpose capabilities and ongoing iterations that target improved reasoning, coding, and tool-use behaviors. The broader InternLM ecosystem also includes training tooling and guidance aimed at making fine-tuning and adaptation more accessible across hardware setups, including smaller single-GPU environments and larger multi-node configurations.
    Downloads: 0 This Week
  • 13
    Graph of Thoughts

    Official Implementation of "Graph of Thoughts

    Graph of Thoughts is an open-source framework that implements a novel reasoning paradigm for large language models by organizing reasoning steps as a structured graph instead of a simple linear chain. Traditional reasoning methods such as chain-of-thought generate sequential reasoning steps, but Graph of Thoughts introduces a more flexible structure where multiple reasoning paths can be explored and evaluated simultaneously.
    Downloads: 1 This Week
  • 14
    ReAct Prompting

    Synergizing Reasoning and Acting in Language Models

    ReAct is an open-source research project that demonstrates a prompting and reasoning framework designed to improve the problem-solving capabilities of large language models. The project implements the methodology described in the research paper “ReAct: Synergizing Reasoning and Acting in Language Models,” which combines reasoning traces with action-based interactions. Instead of generating answers in a single step, models using the ReAct approach produce intermediate reasoning steps and perform actions such as searching for information or interacting with external tools. ...
    Downloads: 0 This Week
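
The reason-then-act loop described for ReAct can be pictured with a small, self-contained sketch. It is not the project's code: `scripted_model`, `lookup`, and the `TOOLS` registry are hypothetical stand-ins (a real setup would call a language model and real tools), and the transcript format is only an assumption for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool: a tiny fact table standing in for search or an external API.
def lookup(query: str) -> str:
    facts = {"capital of France": "Paris"}
    return facts.get(query, "unknown")

TOOLS: Dict[str, Callable[[str], str]] = {"lookup": lookup}

def scripted_model(transcript: List[str]) -> Tuple[str, str, str]:
    """Stand-in for an LLM call; returns (thought, action, argument)."""
    prefix = "Observation: "
    observations = [line for line in transcript if line.startswith(prefix)]
    if not observations:
        # Nothing observed yet: reason that a lookup is needed, then act.
        return ("I should look this up.", "lookup", "capital of France")
    # An observation is available: reason that it answers the question.
    return ("I have what I need.", "finish", observations[-1][len(prefix):])

def react_loop(question: str, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Alternate thoughts and actions until the model emits 'finish'."""
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, arg = scripted_model(transcript)
        transcript.append(f"Thought: {thought}")
        transcript.append(f"Action: {action}[{arg}]")
        if action == "finish":
            return arg, transcript
        # Run the chosen tool and feed its result back as an observation.
        transcript.append(f"Observation: {TOOLS[action](arg)}")
    return "no answer", transcript

answer, trace = react_loop("What is the capital of France?")
```

Each pass appends a thought and an action to the transcript, and any tool result is fed back as an observation before the next pass, which is the interleaving the description refers to.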
  • 15
    Nougat

    Implementation of Nougat Neural Optical Understanding

    ...Because it integrates structured layout reasoning, Nougat tends to produce more compositional and controllable results than purely unconstrained generative models.
    Downloads: 0 This Week
  • 16
    ThoughtSource

    A central, open resource for data and tools

    ThoughtSource is a central, open resource and community centered on data and tools for chain-of-thought reasoning in large language models (Wei 2022). Our long-term goal is to enable trustworthy and robust reasoning in advanced AI systems for driving scientific research and medical practice.
    Downloads: 0 This Week
  • 17
    Repo of Tree of Thoughts (ToT)

    Implementation of "Tree of Thoughts

    ...ToT allows LMs to perform deliberate decision-making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices.
    Downloads: 0 This Week
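
As a rough illustration of exploring several reasoning paths and scoring them before committing, here is a simplified breadth-first sketch on a toy problem (pick three steps from 1, 2, 3 that sum to a target). It is not the repository's implementation: `propose` and `score` are hypothetical stand-ins for the model's thought generation and self-evaluation, and pruning to a small frontier replaces explicit backtracking.

```python
from typing import List, Tuple

State = Tuple[int, ...]  # a partial "reasoning path": the steps chosen so far

def propose(state: State) -> List[State]:
    """Generate candidate next thoughts (here: append 1, 2, or 3)."""
    return [state + (step,) for step in (1, 2, 3)]

def score(state: State, target: int) -> float:
    """Evaluate a partial path; higher is better, overshooting is a dead end."""
    total = sum(state)
    if total > target:
        return float("-inf")
    return -abs(target - total)

def tree_of_thoughts(target: int, depth: int, beam: int) -> State:
    """Expand every kept path, score the children, keep the best `beam`."""
    frontier: List[State] = [()]
    for _ in range(depth):
        candidates = [child for s in frontier for child in propose(s)]
        candidates.sort(key=lambda s: score(s, target), reverse=True)
        frontier = candidates[:beam]  # weaker branches are abandoned here
    return frontier[0]

best = tree_of_thoughts(target=7, depth=3, beam=2)
```

Keeping more than one path in the frontier is what lets the search recover when the locally best first step turns out not to lead to a solution.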
  • 18
    CausalNex

    A Python library that helps data scientists to infer causation

    CausalNex is a Python library that uses Bayesian Networks to combine machine learning and domain expertise for causal reasoning. You can use CausalNex to uncover structural relationships in your data, learn complex distributions, and observe the effect of potential interventions.
    Downloads: 0 This Week
  • 19
    Chameleon LLM

    Codes for "Chameleon: Plug-and-Play Compositional Reasoning

    Discover Chameleon, our cutting-edge compositional reasoning framework designed to enhance large language models (LLMs) and overcome their inherent limitations, such as outdated information and lack of precise reasoning. By integrating various tools such as vision models, web search engines, Python functions, and rule-based modules, Chameleon delivers more accurate, up-to-date, and precise responses, making it a game-changer in the natural language processing landscape. ...
    Downloads: 0 This Week
  • 20
    PRM800K

    800,000 step-level correctness labels on LLM solutions to MATH problem

    PRM800K is a process supervision dataset accompanying the paper Let’s Verify Step by Step, providing 800,000 step-level correctness labels on model-generated solutions to problems from the MATH dataset. The repository releases the raw labels and the labeler instructions used in two project phases, enabling researchers to study how human raters graded intermediate reasoning. Data are stored as newline-delimited JSONL files tracked with Git LFS, where each line is a full solution sample that...
    Downloads: 1 This Week
  • 21
    CodeContests

    Large dataset of coding contests designed for AI and ML model training

    CodeContests, developed by Google DeepMind, is a large-scale competitive programming dataset designed for training and evaluating machine learning models on code generation and problem solving. This dataset played a central role in the development of AlphaCode, DeepMind’s model for solving programming problems at a human-competitive level, as published in Science. CodeContests aggregates problems and human-written solutions from multiple programming competition platforms, including AtCoder,...
    Downloads: 1 This Week
  • 22
    Grade School Math

    8.5K high quality grade school math problems

    The grade-school-math repository (sometimes called GSM8K) is a curated dataset of 8,500 high-quality grade school math word problems intended for evaluating mathematical reasoning capabilities of language models. It is structured into 7,500 training problems and 1,000 test problems. These aren’t trivial exercises — many require multi-step reasoning, combining arithmetic operations, and handling intermediate steps (e.g. “If she sold half as many in May… how many in total?”). The problems are written by human authors (not automatically generated) to ensure linguistic variety and realism. ...
    Downloads: 2 This Week
  • 23
    ELI5

    A library for debugging/inspecting machine learning classifiers

    ELI5 is a Python library designed to help developers interpret, debug, and explain the predictions of machine learning models. The project focuses on improving model transparency by providing tools that visualize feature importance and prediction reasoning. It supports several popular machine learning frameworks including scikit-learn, XGBoost, LightGBM, CatBoost, and Keras. The library allows users to inspect model weights, analyze decision trees, and compute permutation feature importance for black-box models.
    Downloads: 0 This Week
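
Permutation feature importance, mentioned above, can be sketched from scratch without the library. The code below does not use or mirror ELI5's API: `toy_model`, `accuracy`, and the small dataset are assumptions made up for illustration. The idea is to shuffle one feature column at a time and measure how much the model's score drops.

```python
import random
from typing import Callable, List

Row = List[float]

def toy_model(row: Row) -> int:
    """Hypothetical black-box model: uses feature 0 only, ignores feature 1."""
    return 1 if row[0] > 0.5 else 0

def accuracy(model: Callable[[Row], int], X: List[Row], y: List[int]) -> float:
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model: Callable[[Row], int], X: List[Row],
                           y: List[int], n_repeats: int = 10,
                           seed: int = 0) -> List[float]:
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances

X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.4, 2.0],
     [0.6, 3.0], [0.7, 0.0], [0.8, 6.0], [0.9, 7.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
importances = permutation_importance(toy_model, X, y)
```

Because the toy model never reads the second feature, shuffling it leaves accuracy unchanged and its importance is zero, while shuffling the first feature degrades accuracy.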
  • 24
    Mistral Small 4

    Model that fuses instruct, reasoning and agentic skills

    The Mistral Small 4 collection is a set of open-weight large language models developed by Mistral AI that aim to unify multiple capabilities, including instruction following, reasoning, and coding, within a single efficient architecture. These models are part of the broader Mistral Small family, which is designed to deliver strong performance across a wide range of everyday AI tasks while maintaining relatively low latency and efficient deployment requirements. ...
    Downloads: 0 This Week
  • 25
    Nemotron 3 Nano

    LLM providing reasoning and conversational capabilities

    NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 is a mid-sized open large language model created by NVIDIA to provide strong reasoning and conversational capabilities while maintaining efficient deployment requirements. The model contains roughly 30 billion parameters and is designed to balance performance and computational efficiency, making it suitable for developers building AI applications that cannot run extremely large models. It is trained from scratch and built using a hybrid architecture that integrates Transformer attention layers with Mamba-style sequence modeling components inside a Mixture-of-Experts framework. ...
    Downloads: 0 This Week