Showing 3 open source projects for "realistic"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    More flexibility. More control.

    Generate interest, access liquidity without selling, and execute trades seamlessly. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 1
    Magicoder

    Magicoder

    Empowering Code Generation with OSS-Instruct

    ...The project focuses on improving the quality and diversity of code generation by training models with a novel dataset construction approach known as OSS-Instruct. This technique uses open-source code repositories as a foundation for generating more realistic and diverse instruction datasets for training language models. By grounding training data in real open-source examples, Magicoder aims to reduce bias and improve the reliability of code generation results compared to models trained solely on synthetic instructions. The project includes model implementations, training resources, and evaluation benchmarks that demonstrate how the approach improves instruction-following and code synthesis capabilities. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 2
    AICGSecEval

    AICGSecEval

    A.S.E (AICGSecEval) is a repository-level AI-generated code security

    ...The project was developed to address concerns that AI-assisted programming tools may produce insecure code containing vulnerabilities such as injection flaws or unsafe logic. The framework constructs evaluation tasks based on real-world software repositories and known vulnerability cases derived from CVE records. By simulating realistic development scenarios, the benchmark assesses how well AI code generation systems handle security-sensitive programming tasks. AICGSecEval combines static and dynamic evaluation techniques to analyze generated code for vulnerabilities and functional correctness. The framework includes datasets, test cases, and evaluation metrics that measure how AI programming tools perform across multiple programming languages and vulnerability categories.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    AgentBench

    AgentBench

    A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)

    ...Unlike traditional language model benchmarks that focus on static text tasks, AgentBench measures how models perform in interactive environments that require planning, reasoning, and decision-making. The benchmark includes multiple environments that simulate realistic scenarios such as web interaction, database querying, and problem solving tasks. These environments require agents to interpret instructions, take actions, and adapt their strategies based on feedback from the environment. AgentBench also includes an evaluation framework that measures success rates, rewards, and task completion performance across different agent implementations. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB