Showing 14 open source projects for "pc benchmark test"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Automate contact and company data extraction Icon
    Automate contact and company data extraction

    Build lead generation pipelines that pull emails, phone numbers, and company details from directories, maps, social platforms. Full API access.

    Generate leads at scale without building or maintaining scrapers. Use 10,000+ ready-made tools that handle authentication, pagination, and anti-bot protection. Pull data from business directories, social profiles, and public sources, then export to your CRM or database via API. Schedule recurring extractions, enrich existing datasets, and integrate with your workflows.
    Explore Apify Store
  • 1
    Meta Agents Research Environments (ARE)

    Meta Agents Research Environments (ARE)

    Meta Agents Research Environments is a comprehensive platform

    ...Unlike static benchmarks, ARE supports environments where agents must adapt to changes over time and reason over sequences of actions. It interacts with applications and faces uncertainty. The included Gaia2 benchmark offers 800 scenarios across multiple “universes”. It can test reasoning, memory, tool use, and adaptability. Integration with simulated applications/agent APIs (email, file system, etc.). Support for multiple AI model backends/providers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    MTEB

    MTEB

    MTEB: Massive Text Embedding Benchmark

    Text embeddings are commonly evaluated on a small set of datasets from a single task not covering their possible applications to other tasks. It is unclear whether state-of-the-art embeddings on semantic textual similarity (STS) can be equally well applied to other tasks like clustering or reranking. This makes progress in the field difficult to track, as various models are constantly being proposed without proper evaluation. To solve this problem, we introduce the Massive Text Embedding...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    BIG-bench

    BIG-bench

    Beyond the Imitation Game collaborative benchmark for measuring

    BIG-bench (Beyond the Imitation Game Benchmark) is a large, collaborative benchmark suite designed to probe the capabilities and limitations of large language models across hundreds of diverse tasks. Rather than focusing on a single metric or domain, it aggregates many hand-authored tasks that test reasoning, commonsense, math, linguistics, ethics, and creativity. Tasks are intentionally heterogeneous: some are multiple-choice with exact scoring, others are free-form generation judged by model-based or human evaluation. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 4
    openbench

    openbench

    Provider-agnostic, open-source evaluation infrastructure

    ...It bundles dozens of evaluation suites — covering knowledge, reasoning, math, code, science, reading comprehension, long-context recall, graph reasoning, and more — so users don’t need to assemble disparate datasets themselves. With a simple CLI interface (e.g. bench eval <benchmark> --model <model-id>), you can quickly evaluate any model supported by Groq or other providers (OpenAI, Anthropic, HuggingFace, local models, etc.). openbench also supports private/local evaluations: you can integrate your own custom benchmarks or data (e.g. internal test suites, domain-specific tasks) to evaluate models in a privacy-preserving way.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Teradata VantageCloud Enterprise is a data analytics platform for performing advanced analytics on AWS, Azure, and Google Cloud. Icon
    Teradata VantageCloud Enterprise is a data analytics platform for performing advanced analytics on AWS, Azure, and Google Cloud.

    Power faster innovation with Teradata VantageCloud

    VantageCloud is the complete cloud analytics and data platform, delivering harmonized data and Trusted AI for all. Built for performance, flexibility, and openness, VantageCloud enables organizations to unify diverse data sources, run complex analytics, and deploy AI models—all within a single, scalable platform.
    Learn More
  • 5
    Codeflash

    Codeflash

    Optimize your code automatically with AI

    Codeflash is a general-purpose optimizer for Python that uses advanced large language models (LLMs) to automatically generate, test, and benchmark multiple optimization ideas, then creates merge-ready pull requests with the best improvements for your code. Optimize an entire existing codebase by running codeflash --all. Automate optimizing all future code you will write by installing Codeflash as a GitHub action. Optimize a Python workflow python myscript.py end-to-end by running codeflash optimize myscript.py. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Diplomacy Cicero

    Diplomacy Cicero

    Code for Cicero, an AI agent that plays the game of Diplomacy

    The project is the codebase for an AI agent named Cicero developed by Facebook Research. It is designed to play the board game Diplomacy by combining open-domain natural language negotiation with strategic planning. The repository includes training code, model checkpoints, and infrastructure for both language modelling (via the ParlAI framework) and reinforcement learning for strategy agents. It supports two variants: Cicero (which handles full “press” negotiation) and Diplodocus (a variant...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 7
    MiniMax-M1

    MiniMax-M1

    Open-weight, large-scale hybrid-attention reasoning model

    MiniMax-M1 is presented as the world’s first open-weight, large-scale hybrid-attention reasoning model, designed to push the frontier of long-context, tool-using, and deeply “thinking” language models. It is built on the MiniMax-Text-01 foundation and keeps the same massive parameter budget, but reworks the attention and training setup for better reasoning and test-time compute scaling. Architecturally, it combines Mixture-of-Experts layers with lightning attention, enabling the model to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Agentex

    Agentex

    Open source codebase for Scale Agentex

    AgentEX is an open framework from Scale for building, running, and evaluating agentic workflows, with an emphasis on reproducibility and measurable outcomes rather than ad-hoc demos. It treats an “agent” as a composition of a policy (the LLM), tools, memory, and an execution runtime so you can test the whole loop, not just prompting. The repo focuses on structured experiments: standardized tasks, canonical tool interfaces, and logs that make it possible to compare models, prompts, and tool...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    MetaCLIP

    MetaCLIP

    ICLR2024 Spotlight: curation/training code, metadata, distribution

    MetaCLIP is a research codebase that extends the CLIP framework into a meta-learning / continual learning regime, aiming to adapt CLIP-style models to new tasks or domains efficiently. The goal is to preserve CLIP’s strong zero-shot transfer capability while enabling fast adaptation to domain shifts or novel class sets with minimal data and without catastrophic forgetting. The repository provides training logic, adaptation strategies (e.g. prompt tuning, adapter modules), and evaluation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • AestheticsPro Medical Spa Software Icon
    AestheticsPro Medical Spa Software

    Our new software release will dramatically improve your medspa business performance while enhancing the customer experience

    AestheticsPro is the most complete Aesthetics Software on the market today. HIPAA Cloud Compliant with electronic charting, integrated POS, targeted marketing and results driven reporting; AestheticsPro delivers the tools you need to manage your medical spa business. It is our mission To Provide an All-in-One Cutting Edge Software to the Aesthetics Industry.
    Learn More
  • 10
    PyTorch Geometric Temporal

    PyTorch Geometric Temporal

    Spatiotemporal Signal Processing with Neural Machine Learning Models

    The library consists of various dynamic and temporal geometric deep learning, embedding, and Spatio-temporal regression methods from a variety of published research papers. Moreover, it comes with an easy-to-use dataset loader, train-test splitter and temporal snaphot iterator for dynamic and temporal graphs. The framework naturally provides GPU support. It also comes with a number of benchmark datasets from the epidemiological forecasting, sharing economy, energy production and web traffic management domains. Finally, you can also create your own datasets. The package interfaces well with Pytorch Lightning which allows training on CPUs, single and multiple GPUs out-of-the-box. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    MMDeploy

    MMDeploy

    OpenMMLab Model Deployment Framework

    MMDeploy is an open-source deep learning model deployment toolset. It is a part of the OpenMMLab project. Models can be exported and run in several backends, and more will be compatible. All kinds of modules in the SDK can be extended, such as Transform for image processing, Net for Neural Network inference, Module for postprocessing and so on. Install and build your target backend. ONNX Runtime is a cross-platform inference and training accelerator compatible with many popular ML/DNN...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    SSD in PyTorch 1.0

    SSD in PyTorch 1.0

    High quality, fast, modular reference implementation of SSD in PyTorch

    This repository implements SSD (Single Shot MultiBox Detector). The implementation is heavily influenced by the projects ssd.pytorch, pytorch-ssd and maskrcnn-benchmark. This repository aims to be the code base for research based on SSD. Multi-GPU training and inference: We use DistributedDataParallel, you can train or test with arbitrary GPU(s), the training schema will change accordingly. Add your own modules without pain. We abstract backbone, Detector, BoxHead, BoxPredictor, etc. You can replace every component with your own code without changing the code base. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Grade School Math

    Grade School Math

    8.5K high quality grade school math problems

    ...JSONL) for problem + answer pairs, and is used broadly in research to benchmark model performance under “word problem” settings. Issues are tracked (people report incorrect problems, ambiguous statements), and contributions are possible for cleaning or expanding the set.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    SMAC

    SMAC

    SMAC: The StarCraft Multi-Agent Challenge

    SMAC (StarCraft II Multi-Agent Challenge) is a benchmark environment for cooperative multi-agent reinforcement learning (MARL), based on real-time strategy (RTS) game scenarios in StarCraft II. It allows researchers to test algorithms where multiple units (agents) must collaborate to win battles against built-in game AI opponents. SMAC provides a controlled testbed for studying decentralized execution and centralized training paradigms in MARL.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next