62 projects for "python code" with 2 filters applied:

  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • 1
    ChatGLM-6B

    ChatGLM-6B

    ChatGLM-6B: An Open Bilingual Dialogue Language Model

    ChatGLM-6B is an open bilingual (Chinese + English) conversational language model based on the GLM architecture, with approximately 6.2 billion parameters. The project provides inference code, demos (command line, web, API), quantization support for lower memory deployment, and tools for finetuning (e.g., via P-Tuning v2). It is optimized for dialogue and question answering with a balance between performance and deployability in consumer hardware settings. Support for quantized inference...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 2
    HunyuanImage-3.0

    HunyuanImage-3.0

    A Powerful Native Multimodal Model for Image Generation

    HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 3
    HunyuanVideo-I2V

    HunyuanVideo-I2V

    A Customizable Image-to-Video Model based on HunyuanVideo

    HunyuanVideo-I2V is a customizable image-to-video generation framework from Tencent Hunyuan, built on their HunyuanVideo foundation. It extends video generation so that given a static reference image plus an optional prompt, it generates a video sequence that preserves the reference image’s identity (especially in the first frame) and allows stylized effects via LoRA adapters. The repository includes pretrained weights, inference and sampling scripts, training code for LoRA effects, and...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 4
    4M

    4M

    4M: Massively Multimodal Masked Modeling

    4M is a training framework for “any-to-any” vision foundation models that uses tokenization and masking to scale across many modalities and tasks. The same model family can classify, segment, detect, caption, and even generate images, with a single interface for both discriminative and generative use. The repository releases code and models for multiple variants (e.g., 4M-7 and 4M-21), emphasizing transfer to unseen tasks and modalities. Training/inference configs and issues discuss things...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Train ML Models With SQL You Already Know Icon
    Train ML Models With SQL You Already Know

    BigQuery automates data prep, analysis, and predictions with built-in AI assistance.

    Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.
    Try Free
  • 5
    DFlash

    DFlash

    Block Diffusion for Ultra-Fast Speculative Decoding

    DFlash is an open-source framework for ultra-fast speculative decoding using a lightweight block diffusion model to draft text in parallel with a target large language model, dramatically improving inference speed without sacrificing generation quality. It acts as a “drafter” that proposes likely continuations which the main model then verifies, enabling significant throughput gains compared to traditional autoregressive decoding methods that generate token by token. This approach has been...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    MedGemma

    MedGemma

    Collection of Gemma 3 variants that are trained for performance

    MedGemma is a collection of specialized open-source AI models created by Google as part of its Health AI Developer Foundations initiative, built on the Gemma 3 family of transformer models and trained for medical text and image comprehension tasks that help accelerate the development of healthcare-focused AI applications. It includes multiple variants such as a 4 billion-parameter multimodal model that can process both medical images and text and a 27 billion-parameter text-only (and...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    FireRedTTS-2

    FireRedTTS-2

    Long-form streaming TTS system for multi-speaker dialogue generation

    FireRedTTS2 is a next-generation open-source text-to-speech (TTS) system focused on long-form, streaming speech synthesis for multi-speaker dialogue, delivering stable natural speech with context-aware prosody and reliable speaker transitions that support real-time and conversational applications. It features a specialized streaming speech tokenizer and a dual-transformer architecture that enables low latency and high-quality synthesis, making it suitable for interactive systems like...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    Gemma in PyTorch

    Gemma in PyTorch

    The official PyTorch implementation of Google's Gemma models

    gemma_pytorch provides the official PyTorch reference for running and fine-tuning Google’s Gemma family of open models. It includes model definitions, configuration files, and loading utilities for multiple parameter scales, enabling quick evaluation and downstream adaptation. The repository demonstrates text generation pipelines, tokenizer setup, quantization paths, and adapters for low-rank or parameter-efficient fine-tuning. Example notebooks walk through instruction tuning and evaluation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    HunyuanDiT

    HunyuanDiT

    Diffusion Transformer with Fine-Grained Chinese Understanding

    HunyuanDiT is a high-capability text-to-image diffusion transformer with bilingual (Chinese/English) understanding and multi-turn dialogue capability. It trains a diffusion model in latent space using a transformer backbone and integrates a Multimodal Large Language Model (MLLM) to refine captions and support conversational image generation. It supports adapters like ControlNet, IP-Adapter, LoRA, and can run under constrained VRAM via distillation versions. LoRA, ControlNet (pose, depth,...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 10
    Vidi2

    Vidi2

    Large Multimodal Models for Video Understanding and Editing

    Vidi is a family of large multimodal models developed for deep video understanding and editing tasks, integrating vision, audio, and language to allow sophisticated querying and manipulation of video content. It’s designed to process long-form, real-world videos and answer complex queries such as “when in this clip does X happen?” or “where in the frame is object Y during that moment?” — offering temporal retrieval, spatio-temporal grounding (i.e. locating objects over time + space), and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    ComfyUI-LTXVideo

    ComfyUI-LTXVideo

    LTX-Video Support for ComfyUI

    ComfyUI-LTXVideo is a bridge between ComfyUI’s node-based generative workflow environment and the LTX-Video multimedia processing framework, enabling creators to orchestrate complex video tasks within a visual graph paradigm. Instead of writing code to apply effects, transitions, edits, and data flows, users can assemble nodes that represent video inputs, transformations, and outputs, letting them prototype and automate video production pipelines visually. This integration empowers...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    MobileCLIP

    MobileCLIP

    Implementation of "MobileCLIP" CVPR 2024

    MobileCLIP is a family of efficient image-text embedding models designed for real-time, on-device retrieval and zero-shot classification. The repo provides training, inference, and evaluation code for MobileCLIP models trained on DataCompDR, and for newer MobileCLIP2 models trained on DFNDR. It includes an iOS demo app and Core ML artifacts to showcase practical, offline photo search and classification on iPhone-class hardware. Project notes highlight latency/accuracy trade-offs, with...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    TimesFM

    TimesFM

    Pretrained time-series foundation model developed by Google Research

    TimesFM is a pretrained time-series foundation model from Google Research built for forecasting tasks, designed to generalize across many domains without requiring extensive per-dataset retraining. It provides a decoder-only model approach to forecasting, aiming for strong performance even in zero-shot or low-data settings where traditional models often struggle. The project includes code and an inference API intended to make it practical to run forecasts programmatically, with options to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Oasis

    Oasis

    Inference script for Oasis 500M

    Open-Oasis provides inference code and released weights for Oasis 500M, an interactive world model that generates gameplay frames conditioned on user keyboard input. Instead of rendering a pre-built game world, the system produces the next visual state via a diffusion-transformer approach, effectively “imagining” the world response to your actions in real time. The project focuses on enabling action-conditional frame generation so developers can experiment with interactive, model-generated...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    MetaCLIP

    MetaCLIP

    ICLR2024 Spotlight: curation/training code, metadata, distribution

    MetaCLIP is a research codebase that extends the CLIP framework into a meta-learning / continual learning regime, aiming to adapt CLIP-style models to new tasks or domains efficiently. The goal is to preserve CLIP’s strong zero-shot transfer capability while enabling fast adaptation to domain shifts or novel class sets with minimal data and without catastrophic forgetting. The repository provides training logic, adaptation strategies (e.g. prompt tuning, adapter modules), and evaluation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    DreamCraft3D

    DreamCraft3D

    Official implementation of DreamCraft3D

    DreamCraft3D is DeepSeek’s generative 3D modeling framework / model family that likely extends their earlier 3D efforts (e.g. Shap-E or Point-E style models) with more capability, control, or expression. The name suggests a “dream crafting” metaphor—users probably supply textual or image prompts and generate 3D assets (point clouds, meshes, scenes). The repository includes model code, inference scripts, sample prompts, and possibly dataset preparation pipelines. It may integrate rendering or...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Poetiq

    Poetiq

    Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1

    poetiq-arc-agi-solver is the open-source codebase from Poetiq that replicates their record-breaking submission to the challenging benchmark suite ARC-AGI (both ARC-AGI-1 and ARC-AGI-2). The project demonstrates a system that orchestrates large language models (LLMs) — like those from major providers — with carefully engineered prompting, reasoning workflows, and dynamic strategies, to tackle the abstract, logic-heavy problems in ARC-AGI. Instead of relying on a single prompt or fixed...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    ChatGPT Retrieval Plugin

    ChatGPT Retrieval Plugin

    The ChatGPT Retrieval Plugin lets you easily find personal documents

    The chatgpt-retrieval-plugin repository implements a semantic retrieval backend that lets ChatGPT (or GPT-powered tools) access private or organizational documents in natural language by combining vector search, embedding models, and plugin infrastructure. It can serve as a custom GPT plugin or function-calling backend so that a chat session can “look up” relevant documents based on user queries, inject those results into context, and respond more knowledgeably about a private knowledge...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Surya

    Surya

    Implementation of the Surya Foundation Model for Heliophysics

    Surya is an open‑source, AI‑based foundation model for heliophysics developed collaboratively by NASA (via the IMPACT AI team) and IBM. Named after the Sanskrit word for “sun,” Surya is trained on nine years of high‑resolution solar imagery from NASA’s Solar Dynamics Observatory (SDO). It is designed to forecast solar phenomena—such as flares, solar wind, irradiance, and active region behavior—by predicting future solar images with a sophisticated long–short vision transformer architecture,...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    GLM-4.6V

    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Qwen2.5-Coder

    Qwen2.5-Coder

    Qwen2.5-Coder is the code version of Qwen2.5, the large language model

    Qwen2.5-Coder, developed by QwenLM, is an advanced open-source code generation model designed for developers seeking powerful and diverse coding capabilities. It includes multiple model sizes—ranging from 0.5B to 32B parameters—providing solutions for a wide array of coding needs. The model supports over 92 programming languages and offers exceptional performance in generating code, debugging, and mathematical problem-solving. Qwen2.5-Coder, with its long context length of 128K tokens, is...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 22
    Grok-1

    Grok-1

    Open-source, high-performance Mixture-of-Experts large language model

    Grok-1 is a 314-billion-parameter Mixture-of-Experts (MoE) large language model developed by xAI. Designed to optimize computational efficiency, it activates only 25% of its weights for each input token. In March 2024, xAI released Grok-1's model weights and architecture under the Apache 2.0 license, making them openly accessible to developers. The accompanying GitHub repository provides JAX example code for loading and running the model. Due to its substantial size, utilizing Grok-1...
    Leader badge
    Downloads: 37 This Week
    Last Update:
    See Project
  • 23
    FLUX.1 Krea

    FLUX.1 Krea

    Powerful open source image generation model

    FLUX.1 Krea [dev] is an open-source 12-billion parameter image generation model developed collaboratively by Krea and Black Forest Labs, designed to deliver superior aesthetic control and high image quality. It is a rectified-flow model distilled from the original Krea 1, providing enhanced sampling efficiency through classifier-free guidance distillation. The model supports generation at resolutions between 1024 and 1280 pixels with recommended inference steps between 28 and 32 for optimal...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24
    DeepSeek MoE

    DeepSeek MoE

    Towards Ultimate Expert Specialization in Mixture-of-Experts Language

    DeepSeek-MoE (“DeepSeek MoE”) is the DeepSeek open implementation of a Mixture-of-Experts (MoE) model architecture meant to increase parameter efficiency by activating only a subset of “expert” submodules per input. The repository introduces fine-grained expert segmentation and shared expert isolation to improve specialization while controlling compute cost. For example, their MoE variant with 16.4B parameters claims comparable or better performance to standard dense models like DeepSeek 7B...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    GLM-4-32B-0414

    GLM-4-32B-0414

    Open Multilingual Multimodal Chat LMs

    GLM-4-32B-0414 is a powerful open-source large language model featuring 32 billion parameters, designed to deliver performance comparable to leading models like OpenAI’s GPT series. It supports multilingual and multimodal chat capabilities with an extensive 32K token context length, making it ideal for dialogue, reasoning, and complex task completion. The model is pre-trained on 15 trillion tokens of high-quality data, including substantial synthetic reasoning datasets, and further enhanced...
    Downloads: 0 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB