Showing 12 open source projects for "build ai"

  • 1
    NVIDIA Earth2Studio

    Open-source deep-learning framework

    NVIDIA Earth2Studio is an open-source Python package and framework designed to accelerate the development and deployment of AI-driven weather and climate science workflows. It provides a unified API that lets researchers, data scientists, and engineers build complex forecasting and analysis pipelines by combining modular prognostic and diagnostic AI models with a diverse range of real-world data sources such as global forecast systems, reanalysis datasets, and satellite feeds. ...
    Downloads: 2 This Week
  • 2
    TADA

    Open Source Speech Language Model

    ...The system focuses on aligning speech and text streams using a dual-alignment mechanism that synchronizes the acoustic signal with its textual representation. By modeling both modalities together, the framework allows developers to build systems capable of generating, understanding, and transforming speech and language simultaneously. This approach can support applications such as conversational AI, speech synthesis, multimodal language modeling, and speech understanding systems. The project explores ways to treat speech and text as integrated data streams rather than separate pipelines, enabling more coherent interactions between language and audio. ...
    Downloads: 2 This Week
  • 3
    FinGPT

    Open-Source Financial Large Language Models

    FinGPT is an open-source, finance-specialized large language model framework that blends the capabilities of general LLMs with real-time financial data feeds, domain-specific knowledge bases, and task-oriented agents to support market analysis, research automation, and decision support. It extends traditional GPT-style models by connecting them to live or historical financial datasets, news APIs, and economic indicators so that outputs are grounded in relevant and recent market conditions...
    Downloads: 14 This Week
  • 4
    Kimi-Audio

    Audio foundation model excelling in audio understanding

    Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one...
    Downloads: 1 This Week
  • 5
    Large Concept Model

    Language modeling in a sentence representation space

    Large Concept Model is a research codebase centered on concept-centric representation learning at scale, aiming to capture shared structure across many categories and modalities. It organizes training around concepts (rather than just raw labels), encouraging models to understand attributes, relations, and compositional structure that transfer across tasks. The repository provides training loops, data tooling, and evaluation routines to learn and probe these concept embeddings, typically...
    Downloads: 0 This Week
  • 6
    xFormers

    Hackable and optimized Transformers building blocks

    xFormers is a modular, performance-oriented library of transformer building blocks, designed to let researchers and engineers compose, experiment with, and optimize transformer architectures more flexibly than monolithic frameworks allow. It abstracts components like attention layers, feedforward modules, normalization, and positional encoding, so you can mix and match or swap optimized kernels easily. One of its key goals is efficient attention: it supports dense, sparse, low-rank, and...
    Downloads: 2 This Week
  • 7
    DeepSeek VL

    Towards Real-World Vision-Language Understanding

    DeepSeek-VL is DeepSeek’s initial vision-language model that anchors their multimodal stack. It enables understanding and generation across visual and textual modalities: it can process an image together with a prompt, answer questions about images, caption, classify, or reason about visuals in context. The model is likely used internally as the visual encoder backbone for agent use cases, to ground perception in downstream tasks (e.g. answering questions about a screenshot). The repository...
    Downloads: 0 This Week
  • 8
    Step-Video-T2V

    State-of-the-art (SoTA) text-to-video pre-trained model

    Step-Video-T2V is a state-of-the-art text-to-video foundation model developed to generate videos from natural-language prompts; its 30B-parameter architecture is designed to produce coherent, temporally extended video sequences — up to around 204 frames — based on input text. Under the hood it uses a compressed latent representation (a Video-VAE) to reduce spatial and temporal redundancy, and a denoising diffusion (or similar) process over that latent space to generate smooth, plausible...
    Downloads: 3 This Week
  • 9
    4M

    4M: Massively Multimodal Masked Modeling

    4M is a training framework for “any-to-any” vision foundation models that uses tokenization and masking to scale across many modalities and tasks. The same model family can classify, segment, detect, caption, and even generate images, with a single interface for both discriminative and generative use. The repository releases code and models for multiple variants (e.g., 4M-7 and 4M-21), emphasizing transfer to unseen tasks and modalities. Training/inference configs and issues discuss things...
    Downloads: 0 This Week
  • 10
    minGPT

    A minimal PyTorch re-implementation of the OpenAI GPT

    minGPT is a minimalist, educational re-implementation of the GPT (Generative Pre-trained Transformer) architecture in PyTorch, written by Andrej Karpathy to expose the core structure of a transformer-based language model in as few lines of code as possible. It strips away extras to show how a sequence of token indices is fed into a stack of transformer blocks and then decoded into next-token probabilities, with both training and inference supported. ...
    Downloads: 0 This Week
  • 11
    Menagerie

    A collection of high-quality models for the MuJoCo physics engine

    MuJoCo Menagerie, developed by Google DeepMind, is a curated collection of high-quality simulation models designed for use with the MuJoCo physics engine. It serves as a comprehensive library of accurate and ready-to-use robotic, biomechanical, and mechanical models, ensuring users can perform reliable simulations without having to build or tune models from scratch. The repository aims to improve reproducibility and quality across robotics research by providing verified models that adhere to...
    Downloads: 4 This Week
  • 12
    MUSE

    A library for Multilingual Unsupervised or Supervised word Embeddings

    MUSE is a framework for learning multilingual word embeddings that live in a shared space, enabling bilingual lexicon induction, cross-lingual retrieval, and zero-shot transfer. It supports both supervised alignment with seed dictionaries and unsupervised alignment that starts without parallel data by using adversarial initialization followed by Procrustes refinement. The code can align pre-trained monolingual embeddings (such as fastText) across dozens of languages and provides standardized...
    Downloads: 0 This Week
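
The Procrustes refinement mentioned in the MUSE description can be illustrated with a small sketch. This is not code from the MUSE repository: the function name `procrustes_align`, the toy random vectors, and the use of NumPy are all assumptions made for illustration, and the adversarial initialization step is omitted. Given paired source and target embeddings, the orthogonal map that best aligns them has a closed form via the singular value decomposition:

```python
import numpy as np


def procrustes_align(src, tgt):
    """Return the orthogonal matrix W minimizing ||src @ W - tgt||_F.

    src, tgt: arrays of shape (n_pairs, dim); row i of src is assumed to be
    a translation pair of row i of tgt (hypothetical seed dictionary).
    """
    # Closed-form solution: SVD of the cross-covariance, then W = U @ V^T.
    u, _, vt = np.linalg.svd(src.T @ tgt)
    return u @ vt


# Toy data (not real embeddings): build a target space, then hide it
# behind a random orthogonal map to simulate a second language.
rng = np.random.default_rng(0)
dim = 8
tgt = rng.normal(size=(50, dim))
true_w, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # hidden orthogonal map
src = tgt @ true_w.T  # constructed so that src @ true_w equals tgt

w = procrustes_align(src, tgt)
aligned = src @ w  # source vectors mapped into the target space
```

Constraining the map to be orthogonal preserves distances and angles within the source embeddings, so monolingual structure is not distorted by the alignment.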