Showing 213 open source projects for "open source project"

View related business solutions
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 1
    GLM-4

    GLM-4

    GLM-4 series: Open Multilingual Multimodal Chat LMs

    GLM-4 is a family of open models from ZhipuAI that spans base, chat, and reasoning variants at both 32B and 9B scales, with long-context support and practical local-deployment options. The GLM-4-32B-0414 models are trained on ~15T high-quality data (including substantial synthetic reasoning data), then post-trained with preference alignment, rejection sampling, and reinforcement learning to improve instruction following, coding, function calling, and agent-style behaviors. The...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    fairseq2

    fairseq2

    FAIR Sequence Modeling Toolkit 2

    fairseq2 is a modern, modular sequence modeling framework developed by Meta AI Research as a complete redesign of the original fairseq library. Built from the ground up for scalability, composability, and research flexibility, fairseq2 supports a broad range of language, speech, and multimodal content generation tasks, including instruction fine-tuning, reinforcement learning from human feedback (RLHF), and large-scale multilingual modeling. Unlike the original fairseq—which evolved into a...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    Core ML Stable Diffusion

    Core ML Stable Diffusion

    Stable Diffusion with Core ML on Apple Silicon

    Run Stable Diffusion on Apple Silicon with Core ML. python_coreml_stable_diffusion, a Python package for converting PyTorch models to Core ML format and performing image generation with Hugging Face diffusers in Python. StableDiffusion, a Swift package that developers can add to their Xcode projects as a dependency to deploy image generation capabilities in their apps. The Swift package relies on the Core ML model files generated by python_coreml_stable_diffusion. Hugging Face ran the...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Stable Virtual Camera

    Stable Virtual Camera

    Stable Virtual Camera: Generative View Synthesis with Diffusion Models

    Stable Virtual Camera is a multi-view diffusion model developed by Stability AI that transforms 2D images into immersive 3D videos with realistic depth and perspective. Unlike traditional methods that require complex reconstruction or scene-specific optimization, this model allows users to generate novel views from any number of input images and define custom camera trajectories, enabling dynamic exploration of scenes. It supports various aspect ratios and can produce 3D-consistent videos up...
    Downloads: 1 This Week
    Last Update:
    See Project
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 5
    GLM-V

    GLM-V

    GLM-4.5V and GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning

    GLM-V is an open-source vision-language model (VLM) series from ZhipuAI that extends the GLM foundation models into multimodal reasoning and perception. The repository provides both GLM-4.5V and GLM-4.1V models, designed to advance beyond basic perception toward higher-level reasoning, long-context understanding, and agent-based applications. GLM-4.5V builds on the flagship GLM-4.5-Air foundation (106B parameters, 12B active), achieving state-of-the-art results on 42 benchmarks across image, video, document, GUI, and grounding tasks. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Clay Foundation Model

    Clay Foundation Model

    The Clay Foundation Model - An open source AI model and interface

    The Clay Foundation Model is an open-source AI model and interface designed to provide comprehensive data and insights about Earth. It aims to serve as a foundational tool for environmental monitoring, research, and decision-making by integrating various data sources and offering an accessible platform for analysis.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    AlphaGenome

    AlphaGenome

    Programmatic access to the AlphaGenome model

    The AlphaGenome API provides access to AlphaGenome, Google DeepMind’s unifying model for deciphering the regulatory code within DNA sequences. This repository contains client-side code, examples, and documentation to help you use the AlphaGenome API. AlphaGenome offers multimodal predictions, encompassing diverse functional outputs such as gene expression, splicing patterns, chromatin features, and contact maps. The model analyzes DNA sequences of up to 1 million base pairs in length and can...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    Pearl

    Pearl

    A Production-ready Reinforcement Learning AI Agent Library

    Pearl is a production-ready reinforcement learning and contextual bandit agent library built for real-world sequential decision making. It is organized around modular components—policy learners, replay buffers, exploration strategies, safety modules, and history summarizers—that snap together to form reliable agents with clear boundaries and strong defaults. The library implements classic and modern algorithms across two regimes: contextual bandits (e.g., LinUCB, LinTS, SquareCB, neural...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    Tencent-Hunyuan-Large

    Tencent-Hunyuan-Large

    Open-source large language model family from Tencent Hunyuan

    Tencent-Hunyuan-Large is the flagship open-source large language model family from Tencent Hunyuan, offering both pre-trained and instruct (fine-tuned) variants. It is designed with long-context capabilities, quantization support, and high performance on benchmarks across general reasoning, mathematics, language understanding, and Chinese / multilingual tasks. It aims to provide competitive capability with efficient deployment and inference.
    Downloads: 2 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 10
    Step1X-Edit

    Step1X-Edit

    A SOTA open-source image editing model

    ...The authors trained it on a large curated dataset and benchmarked it on a newly introduced evaluation suite, showing that Step1X-Edit significantly outperforms previous open-source baselines.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Map-Anything

    Map-Anything

    MapAnything: Universal Feed-Forward Metric 3D Reconstruction

    Map-Anything is a universal, feed-forward transformer for metric 3D reconstruction that predicts a scene’s geometry and camera parameters directly from visual inputs. Instead of stitching together many task-specific models, it uses a single architecture that supports a wide range of 3D tasks—multi-image structure-from-motion, multi-view stereo, monocular metric depth, registration, depth completion, and more. The model flexibly accepts different input combinations (images, intrinsics, poses,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    Qwen3 Embedding

    Qwen3 Embedding

    Designed for text embedding and ranking tasks

    Qwen3-Embedding is a model series from the Qwen family designed specifically for text embedding and ranking tasks. It builds upon the Qwen3 base/dense models and offers several sizes (0.6B, 4B, 8B parameters), for both embedding and reranking, with high multilingual capability, long‐context understanding, and reasoning. It achieves state-of-the-art performance on benchmarks like MTEB (Multilingual Text Embedding Benchmark) and supports instruction-aware embedding (i.e. embedding task...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    HunyuanWorld 1.0

    HunyuanWorld 1.0

    Generating Immersive, Explorable, and Interactive 3D Worlds

    HunyuanWorld-1.0 is an open-source, simulation-capable 3D world generation model developed by Tencent Hunyuan that creates immersive, explorable, and interactive 3D environments from text or image inputs. It combines the strengths of video-based diversity and 3D-based geometric consistency through a novel framework using panoramic world proxies and semantically layered 3D mesh representations.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    CogVLM2

    CogVLM2

    GPT4V-level open-source multi-modal model based on Llama3-8B

    ...The series includes models for both image understanding and video understanding, with CogVLM2-Video supporting up to 1-minute videos by analyzing keyframes. It supports bilingual interaction (Chinese and English) and has open-source versions optimized for dialogue and video comprehension. Notably, the Int4 quantized version allows efficient inference on GPUs with only 16GB of memory. The repository offers demos, API servers, fine-tuning examples, and integration with OpenAI API-compatible endpoints, making it accessible for both researchers and developers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Ling-V2

    Ling-V2

    Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI

    Ling-V2 is an open-source family of Mixture-of-Experts (MoE) large language models developed by the InclusionAI research organization with the goal of combining state-of-the-art performance, efficiency, and openness for next-generation AI applications. It introduces highly sparse architectures where only a fraction of the model’s parameters are activated per input token, enabling models like Ling-mini-2.0 to achieve reasoning and instruction-following capabilities on par with much larger dense models while remaining significantly more computationally efficient. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T N1.5 is the world's first open foundation model

    NVIDIA Isaac‑GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs—such as language and images—and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments. It is designed to be customizable via post-training with real or synthetic data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    LingBot-VLA

    LingBot-VLA

    A Pragmatic VLA Foundation Model

    LingBot-VLA is an open-source Vision-Language-Action (VLA) foundational AI model designed to serve as a general “brain” for real-world robotic manipulation by grounding multimodal perception and language into actionable motions. It has been pretrained on tens of thousands of hours of real robotic interaction data across multiple robot platforms, which enables it to generalize well to diverse morphologies and tasks without needing extensive retraining on each new bot.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    MedGemma

    MedGemma

    Collection of Gemma 3 variants that are trained for performance

    MedGemma is a collection of specialized open-source AI models created by Google as part of its Health AI Developer Foundations initiative, built on the Gemma 3 family of transformer models and trained for medical text and image comprehension tasks that help accelerate the development of healthcare-focused AI applications. It includes multiple variants such as a 4 billion-parameter multimodal model that can process both medical images and text and a 27 billion-parameter text-only (and multimodal) model that offers deeper clinical reasoning and understanding at higher capacity, making it suitable for complex tasks like medical question answering, summarization of clinical notes, or generating reports from radiology images. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    NVIDIA Earth2Studio

    NVIDIA Earth2Studio

    Open-source deep-learning framework

    NVIDIA Earth2Studio is an open-source Python package and framework designed to accelerate the development and deployment of AI-driven weather and climate science workflows. It provides a unified API that lets researchers, data scientists, and engineers build complex forecasting and analysis pipelines by combining modular prognostic and diagnostic AI models with a diverse range of real-world data sources such as global forecast systems, reanalysis datasets, and satellite feeds. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    CogView4

    CogView4

    CogView4, CogView3-Plus and CogView3(ECCV 2024)

    CogView4 is the latest generation in the CogView series of vision-language foundation models, developed as a bilingual (Chinese and English) open-source system for high-quality image understanding and generation. Built on top of the GLM framework, it supports multimodal tasks including text-to-image synthesis, image captioning, and visual reasoning. Compared to previous CogView versions, CogView4 introduces architectural upgrades, improved training pipelines, and larger-scale datasets, enabling stronger alignment between textual prompts and generated visual content. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Step-Audio

    Step-Audio

    Open-source framework for intelligent speech interaction

    Step-Audio is a unified, open-source framework aimed at building intelligent speech systems that combine both comprehension and generation: it integrates large language models (LLMs) with speech input/output to handle not only semantic understanding but also rich vocal characteristics like tone, style, dialect, emotion, and prosody. The design moves beyond traditional separate-component pipelines (ASR → text model → TTS), instead offering a multimodal model that ingests speech or audio and produces speech accordingly, enabling natural dialogue, voice cloning, and expressive speech synthesis. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Phi-3-MLX

    Phi-3-MLX

    Phi-3.5 for Mac: Locally-run Vision and Language Models

    Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    VibeThinker

    VibeThinker

    Diversity-driven optimization and large-model reasoning ability

    VibeThinker is a compact but high-capability open-source language model released by WeiboAI (Sina AI Lab). It contains about 1.5 billion parameters, far smaller than many “frontier” models, yet it is explicitly optimized for reasoning, mathematics, and code generation tasks rather than general open-domain chat. The innovation lies in its training methodology: the team uses what they call the Spectrum-to-Signal Principle (SSP), where a first stage emphasizes diversity of reasoning paths (the “spectrum” phase) and a second stage uses reinforcement techniques (the “signal” phase) to refine toward correctness and strong reasoning. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    PokeeResearch-7B

    PokeeResearch-7B

    Pokee Deep Research Model Open Source Repo

    PokeeResearchOSS provides an open-source, agentic “deep research” model centered on a 7B backbone that can browse, read, and synthesize current information from the web. Instead of relying only on static training data, the agent performs searches, visits pages, and extracts evidence before forming answers to complex queries. It is built to operate end-to-end: planning a research strategy, gathering sources, reasoning over conflicting claims, and writing a grounded response. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Surya

    Surya

    Implementation of the Surya Foundation Model for Heliophysics

    Surya is an opensource, AI‑based foundation model for heliophysics developed collaboratively by NASA (via the IMPACT AI team) and IBM. Named after the Sanskrit word for “sun,” Surya is trained on nine years of high‑resolution solar imagery from NASA’s Solar Dynamics Observatory (SDO). It is designed to forecast solar phenomena—such as flares, solar wind, irradiance, and active region behavior—by predicting future solar images with a sophisticated long–short vision transformer architecture, thereby enabling improved space weather forecasting. ...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB