Showing 12 open source projects for "complex system"

  • 1
    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents with rich spatial structure. ...
    Downloads: 5 This Week
  • 2
    IndexTTS2

    Industrial-level controllable zero-shot text-to-speech system

    IndexTTS is a modern, zero-shot text-to-speech (TTS) system engineered to deliver high-quality, natural-sounding speech synthesis with few requirements and strong voice-cloning capabilities. It builds on state-of-the-art models such as XTTS and other modern neural TTS backbones, improving them with a conformer-based speech conditional encoder and upgrading the decoder to a high-quality vocoder (BigVGAN2), leading to clearer and more natural audio output. The system supports zero-shot voice...
    Downloads: 7 This Week
  • 3
    TRIBE v2

    A multimodal model for brain response prediction

    TRIBE v2 is a multimodal foundation model developed by Meta AI for predicting human brain activity from naturalistic stimuli such as video, audio, and text. It is designed for in-silico neuroscience, enabling researchers to model how the brain responds to complex real-world inputs. The system integrates state-of-the-art encoders—including LLaMA for text, V-JEPA for video, and Wav2Vec-BERT for audio—into a unified Transformer architecture. This combined representation is mapped onto the cortical surface to predict fMRI responses across thousands of brain regions. TRIBE v2 allows researchers to simulate and analyze brain activity without requiring direct human experiments. ...
    Downloads: 10 This Week
  • 4
    MiniMax-M2.5

    State-of-the-art LLM and coding model

    MiniMax-M2.5 is a state-of-the-art foundation model extensively trained with reinforcement learning across hundreds of thousands of real-world environments. It delivers leading performance in coding, agentic tool use, search, and complex office workflows, achieving top benchmark scores such as 80.2% on SWE-Bench Verified and 76.3% on BrowseComp. Designed to reason efficiently and decompose tasks like an experienced architect, M2.5 plans features, structure, and system design before generating code. The model supports full-stack development across web, mobile, and desktop platforms, covering the entire lifecycle from system design to testing and code review. ...
    Downloads: 1 This Week
  • 5
    PokeeResearch-7B

    Pokee Deep Research Model Open Source Repo

    PokeeResearchOSS provides an open-source, agentic “deep research” model centered on a 7B backbone that can browse, read, and synthesize current information from the web. Instead of relying only on static training data, the agent performs searches, visits pages, and extracts evidence before forming answers to complex queries. It is built to operate end-to-end: planning a research strategy, gathering sources, reasoning over conflicting claims, and writing a grounded response. The repository includes evaluation results on multi-step QA and research benchmarks, illustrating how web-time context boosts accuracy. Because the system is modular, you can swap the search component, reader, or policy to fit private deployments or different data domains. ...
    Downloads: 0 This Week
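The modular design described in the PokeeResearch-7B entry (swappable search component, reader, and policy) can be pictured with a small sketch. This is not the project's code or API: `research`, `Evidence`, and the three stub functions are hypothetical names chosen for illustration, and the stubs return canned strings instead of touching the web.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Evidence:
    """One retrieved source: where it came from and what it said."""
    url: str
    text: str


# Each stage is a plain callable, so any one can be swapped independently.
SearchFn = Callable[[str], List[str]]            # query -> list of URLs
ReadFn = Callable[[str], str]                    # URL -> page text
PolicyFn = Callable[[str, List[Evidence]], str]  # question, evidence -> answer


def research(question: str, search: SearchFn, read: ReadFn, policy: PolicyFn) -> str:
    """Hypothetical pipeline: search, read each hit, then let the policy answer."""
    urls = search(question)
    evidence = [Evidence(url=u, text=read(u)) for u in urls]
    return policy(question, evidence)


# Stub components (assumptions for the example; no network access).
def fake_search(query: str) -> List[str]:
    return ["https://example.com/a", "https://example.com/b"]


def fake_read(url: str) -> str:
    return f"contents of {url}"


def fake_policy(question: str, evidence: List[Evidence]) -> str:
    sources = ", ".join(e.url for e in evidence)
    return f"Answer to {question!r} based on: {sources}"


answer = research("What is X?", fake_search, fake_read, fake_policy)
```

Swapping in a private search backend then means passing a different `search` callable while leaving the reader and policy untouched.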
  • 6
    Vidi2

    Large Multimodal Models for Video Understanding and Editing

    ...The system is built with open-source release in mind, giving developers access to model code, inference scripts, and evaluation pipelines so they can reproduce research results or integrate Vidi into their own video-processing workflows.
    Downloads: 0 This Week
  • 7
    Poetiq

    Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1

    poetiq-arc-agi-solver is the open-source codebase from Poetiq that replicates their record-breaking submission to the challenging benchmark suite ARC-AGI (both ARC-AGI-1 and ARC-AGI-2). The project demonstrates a system that orchestrates large language models (LLMs) — like those from major providers — with carefully engineered prompting, reasoning workflows, and dynamic strategies, to tackle the abstract, logic-heavy problems in ARC-AGI. Instead of relying on a single prompt or fixed...
    Downloads: 0 This Week
  • 8
    Watermark Anything

    Official implementation of Watermark Anything with Localized Messages

    Watermark Anything (WAM) is an advanced deep learning framework for embedding and detecting localized watermarks in digital images. Developed by Facebook Research, it provides a robust, flexible system that allows users to insert one or multiple watermarks within selected image regions while maintaining visual quality and recoverability. Unlike traditional watermarking methods that rely on uniform embedding, WAM supports spatially localized watermarks, enabling targeted protection of...
    Downloads: 0 This Week
  • 9
    FireRed-Image-Edit

    General-purpose image editing model that delivers high-fidelity

    FireRed-Image-Edit is an open-source general-purpose image editing model and toolset designed to deliver high-fidelity, visually coherent edits across a wide range of editing tasks, from simple object modifications to complex enhancements like restoration and style preservation. It is built on a flexible text-to-image foundation model that has been extended with training paradigms including pretraining, supervised fine-tuning, and reinforcement learning to imbue the system with strong instruction following and editing consistency. The model excels in maintaining visual and text stylistic fidelity, allowing users to preserve the original artistic qualities of an image while applying creative changes according to natural language instructions. ...
    Downloads: 0 This Week
  • 10
    MuJoCo MPC

    Real-time behaviour synthesis with MuJoCo, using Predictive Control

    MuJoCo MPC (MJPC) is an advanced interactive framework for real-time model predictive control (MPC) built on top of the MuJoCo physics engine, developed by Google DeepMind. It allows researchers and roboticists to design, visualize, and execute complex control tasks for simulated or real robotic systems. MJPC integrates a high-performance GUI and multiple predictive control algorithms, including iLQG, gradient descent, and Predictive Sampling — a competitive, derivative-free method that achieves robust real-time control. The system supports multi-shooting optimization, enabling precise motion planning across diverse domains like quadruped locomotion, humanoid tracking, and dexterous manipulation. ...
    Downloads: 0 This Week
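The Predictive Sampling idea named in the MuJoCo MPC entry can be illustrated with a toy sketch. This is not MJPC code and uses no MuJoCo API: it assumes a hypothetical 1D point mass, a made-up quadratic cost, and Gaussian noise around a nominal control sequence, with all constants chosen arbitrarily. Each planning step samples perturbed control sequences, rolls each one out, and keeps the lowest-cost candidate, so no derivatives are needed.

```python
import random

DT = 0.1          # integration step (assumed)
HORIZON = 20      # planning horizon in steps (assumed)
NUM_SAMPLES = 32  # candidate sequences per planning step (assumed)
NOISE_STD = 0.5   # std of the Gaussian control noise (assumed)


def step(state, u):
    """Hypothetical 1D point mass: state = (position, velocity), u = acceleration."""
    pos, vel = state
    vel = vel + u * DT
    pos = pos + vel * DT
    return (pos, vel)


def rollout_cost(state, controls, target=1.0):
    """Sum a made-up quadratic cost along a simulated rollout."""
    cost = 0.0
    for u in controls:
        state = step(state, u)
        pos, vel = state
        cost += (pos - target) ** 2 + 0.1 * vel ** 2 + 0.001 * u ** 2
    return cost


def predictive_sampling(state, nominal, rng):
    """Return the best of the nominal plan and its noisy perturbations."""
    best, best_cost = nominal, rollout_cost(state, nominal)
    for _ in range(NUM_SAMPLES):
        candidate = [u + rng.gauss(0.0, NOISE_STD) for u in nominal]
        cost = rollout_cost(state, candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


def run_mpc(num_steps=100, seed=0):
    """Receding-horizon loop: plan, apply the first control, shift the plan."""
    rng = random.Random(seed)
    state = (0.0, 0.0)
    nominal = [0.0] * HORIZON
    for _ in range(num_steps):
        nominal, _ = predictive_sampling(state, nominal, rng)
        state = step(state, nominal[0])
        nominal = nominal[1:] + [0.0]  # warm start for the next planning step
    return state
```

Because the nominal plan is always one of the candidates, a planning step never returns a plan that scores worse than the one it started from. `run_mpc()` returns the final `(position, velocity)`; how closely it approaches the target depends on the assumed constants.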
  • 11
    Ministral 3 14B Reasoning 2512

    High-precision 14B multimodal model built for advanced reasoning tasks

    ...It pairs a 13.5B-parameter language model with a 0.4B vision encoder, enabling strong multimodal reasoning across both text and images. This version is specifically post-trained for reasoning tasks, making it highly effective for math, coding, STEM workloads, and complex multi-step problem-solving. Despite its scale, the model is engineered for practical deployment and can run locally on 32GB of VRAM in BF16 or under 24GB when quantized. It maintains robust system-prompt adherence, supports dozens of languages, and provides native function calling with clean JSON output for agentic workflows. The model's architecture also delivers a 256k context window, unlocking large-document analysis and long-form reasoning.
    Downloads: 0 This Week
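The memory figures quoted in the Ministral 3 14B entry can be sanity-checked with back-of-the-envelope arithmetic. A rough sketch, assuming model weights dominate memory use, 2 bytes per parameter in BF16, and a hypothetical 1 byte per parameter for the quantized case (the listing does not name the quantization format). Activations and KV cache are ignored.

```python
GB = 1e9  # decimal gigabytes


def weight_memory_gb(num_params, bytes_per_param):
    """Approximate memory needed for the weights alone."""
    return num_params * bytes_per_param / GB


LANGUAGE_PARAMS = 13.5e9  # language model size, from the listing
VISION_PARAMS = 0.4e9     # vision encoder size, from the listing
TOTAL_PARAMS = LANGUAGE_PARAMS + VISION_PARAMS

bf16_gb = weight_memory_gb(TOTAL_PARAMS, 2)   # BF16: 2 bytes per parameter
quant_gb = weight_memory_gb(TOTAL_PARAMS, 1)  # assumed 8-bit quantization

print(f"BF16 weights:  {bf16_gb:.1f} GB")
print(f"8-bit weights: {quant_gb:.1f} GB")
```

Under these assumptions the BF16 weights come to about 27.8 GB, which is consistent with the 32GB figure once some headroom is left for activations and the KV cache.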
  • 12
    Mistral Large 3 675B Instruct 2512

    Frontier-scale 675B multimodal instruct MoE model for enterprise AI

    ...It incorporates a massive 673B-parameter language MoE backbone and a 2.5B-parameter vision encoder, enabling rich multimodal understanding across text and images. The model supports dozens of languages and maintains strong system-prompt adherence, making it suitable for global and structured enterprise use. Designed for high performance, it runs on a single node of B200 or H200 GPUs in FP8, and can also operate in NVFP4 mode on H100 or A100 hardware. With a 256k context window, it excels at long-document comprehension, deep retrieval workflows, and complex knowledge-intensive tasks.
    Downloads: 0 This Week