112 projects for "generate image" with 1 filter applied:

  • 1
    image-blaster

    An image-to-world skillset for Claude

    ...It can generate dynamic object models, static environment captures, ambient loops, and object-specific physics sounds. The resulting assets can be used in game engines, 3D tools, and web-based graphics projects. image-blaster is best understood as an experimental agent workflow for quickly jumpstarting interactive worlds rather than a full 3D authoring suite.
    Downloads: 1 This Week
  • 2
    FLUX.1

    Official inference repo for FLUX.1 models

    FLUX.1 repository contains inference code and tooling for the FLUX.1 text-to-image diffusion models, enabling developers and researchers to generate and edit images from natural-language prompts using open-weight versions of the model on their own hardware or within custom applications. The project is part of a larger family of FLUX models developed by Black Forest Labs, designed to produce high-quality, detailed visuals from text descriptions with competitive prompt adherence and artistic fidelity. ...
    Downloads: 45 This Week
  • 3
    Text-to-image Playground

    A playground to generate images from any text prompt using SD

    dalle-playground is an open-source web application that allows users to generate images from natural language text prompts using modern text-to-image generative models. Originally built around DALL-E Mini, the project later transitioned to using Stable Diffusion, enabling more detailed and higher-quality image synthesis. The system combines a backend machine learning service with a browser-based frontend interface that lets users experiment interactively with prompt engineering and generative AI. ...
    Downloads: 0 This Week
  • 4
    Cowart

    Local infinite canvas plugin for Codex

    ...Users can create AI image holders on the canvas and have Codex automatically generate images that match the selected frame and aspect ratio. It also supports annotation-driven image refinement, allowing users to mark up screenshots and generate clean revised versions while preserving the original artwork. The result is a collaborative environment that combines vis
    Downloads: 5 This Week
  • 5
    SwarmUI

    Modular AI image and video generation web UI with extensible tools

    SwarmUI is a modular web-based user interface designed for AI-driven image generation, with a strong focus on usability, performance, and extensibility. It serves as a unified environment for working with multiple AI models, including Stable Diffusion and newer image and video generation systems, allowing users to create and manage outputs through a browser interface. SwarmUI is built to accommodate both beginners and advanced users by offering a simple “Generate” interface alongside more advanced workflow tools that expose deeper configuration options. ...
    Downloads: 13 This Week
  • 6
    LTX-2.3

    Official Python inference and LoRA trainer package

    LTX-2.3 is an open-source multimodal artificial intelligence foundation model developed by Lightricks for generating synchronized video and audio from prompts or other inputs. Unlike most earlier video generation systems that only produced silent clips, LTX-2 combines video and audio generation in a unified architecture capable of producing coherent audiovisual scenes. The model uses a diffusion-transformer-based architecture designed to generate high-fidelity visual frames while...
    Downloads: 108 This Week
  • 7
    Watermark-Removal

    Machine learning image inpainting task that removes watermarks

    ...Through these techniques, the model learns to identify regions of the image affected by the watermark and generate realistic replacements for the missing visual information. The repository contains code for preprocessing images, training the model, and running inference on images to automatically remove watermark artifacts.
    Downloads: 4 This Week
  • 8
    HunyuanImage-3.0

    A Powerful Native Multimodal Model for Image Generation

    HunyuanImage-3.0 is a powerful, native multimodal text-to-image generation model released by Tencent’s Hunyuan team. It unifies multimodal understanding and generation in a single autoregressive framework, combining text and image modalities seamlessly rather than relying on separate image-only diffusion components. It uses a Mixture-of-Experts (MoE) architecture with many expert subnetworks to scale efficiently, deploying only a subset of experts per token, which allows large parameter...
    Downloads: 7 This Week
  • 9
    LibreNMS

    Community-based GPL-licensed network monitoring system

    Welcome to LibreNMS, a fully featured network monitoring system that provides a wealth of features and device support. LibreNMS is an auto-discovering PHP/MySQL/SNMP-based network monitoring system that supports a wide range of network hardware and operating systems, including Cisco, Linux, FreeBSD, Juniper, Brocade, Foundry, HP, and many more.
    Downloads: 39 This Week
  • 10
    SAM 3D Objects

    Models for object and human mesh reconstruction

    SAM 3D Objects is a foundation model that reconstructs full 3D geometry, texture, and spatial layout of objects and scenes from a single image. Given one RGB image and object masks (for example, from the Segment Anything family), it can generate a textured 3D mesh for each object, including pose and approximate scene layout. The model is specifically designed to be robust in real-world images with clutter, occlusions, small objects, and unusual viewpoints, where many earlier 3D-from-image systems struggle. ...
    Downloads: 8 This Week
  • 11
    DeepSeek VL2

    Mixture-of-Experts Vision-Language Models for Advanced Multimodal

    DeepSeek-VL2 is DeepSeek’s vision + language multimodal model—essentially the next-gen successor to their first vision-language models. It combines image and text inputs into a unified embedding / reasoning space so that you can query with text and image jointly (e.g. “What’s going on in this scene?” or “Generate a caption appropriate to context”). The model supports both image understanding (vision tasks) and multimodal reasoning, and is likely used as a component in agent systems to process visual inputs as context for downstream tasks. ...
    Downloads: 1 This Week
  • 12
    HunyuanCustom

    Multimodal-Driven Architecture for Customized Video Generation

    HunyuanCustom is a multimodal video customization framework by Tencent Hunyuan, aimed at generating customized videos featuring particular subjects (people, characters) under flexible conditions while maintaining subject and identity consistency. It supports conditioning via image, audio, video, and text, and can perform subject replacement in videos, generate avatars speaking given audio, or combine multiple subject images. The architecture builds on HunyuanVideo, with added modules for identity reinforcement and modality-specific condition injection. It also includes a text-image fusion module based on LLaVA for improved multimodal understanding. ...
    Downloads: 0 This Week
  • 13
    Guizang Social Card Skill

    Claude Code / Codex skill — generate Xiaohongshu carousels

    Guizang Social Card Skill is an AI-agent skill for generating polished social image packages in a Guizang-inspired visual style. It is designed for formats such as Xiaohongshu or Rednote carousels, WeChat Official Account covers, article covers, product update graphics, thumbnails, and screenshot-heavy posts. The skill turns articles, scripts, screenshots, product notes, subtitles, or photos into structured social card outputs. It supports editorial magazine layouts and Swiss-style visual...
    Downloads: 4 This Week
  • 14
    InfiniteYou

    Flexible Photo Recrafting While Preserving Your Identity

    InfiniteYou is an open-source image-generation and “identity-preserving image editing / generation” framework from ByteDance, designed to generate high-fidelity images that preserve a subject’s identity while allowing flexible editing or re-creation according to textual prompts. Using an architecture built around diffusion transformers (DiTs), InfiniteYou introduces a component called InfuseNet that injects identity features derived from reference images into the generation process — via residual connections — so that the output matches the person’s identity closely, without sacrificing visual quality or text-image alignment. ...
    Downloads: 0 This Week
  • 15
    Story Flicks

    Generate high-definition story short videos with one click using AI

    ...Because the project is open and modifiable, developers can customize the generation pipeline: adjust story structure, alter rendering parameters, tweak video quality or resolution, or integrate with other AI models (e.g. for audio, voice-over, or image-to-video). It’s especially useful as a starting template or experimentation ground for developers building automated content-creation tools.
    Downloads: 5 This Week
  • 16
    Mini QR

    Create & scan cute QR codes easily

    Mini QR is a web app focused on making QR codes feel friendly and design-forward, combining a polished QR generator with a built-in scanner so you can both create and decode codes in the same place. It emphasizes customization so the QR you generate can match a brand, event theme, or personal style, including color and styling controls, framed layouts with labels, and the ability to add a logo image. Because QR reliability matters as much as looks, it exposes practical settings like error correction levels so you can balance data density with scannability, especially when adding a logo or encoding larger payloads. ...
    Downloads: 5 This Week
  • 17
    InternLM-XComposer-2.5

    InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System

    InternLM-XComposer is an open-source multimodal AI system designed to generate long-form content that combines text with visual elements such as images and diagrams. The model is built on top of the InternLM language model architecture and extends its capabilities to handle multimodal inputs and outputs. Instead of producing only textual responses, the system can generate visually enriched documents such as illustrated articles, presentations, and educational materials. ...
    Downloads: 0 This Week
  • 18
    AI Logo Generator

    A free + OSS logo generator powered by Flux on Together AI

    AI Logo Generator is an open-source AI logo generator that lets you create professional-looking logos in seconds from a simple text prompt. It uses the Flux Pro 1.1 model hosted on Together AI to generate logos, so the heavy lifting is done by a state-of-the-art image model while the app focuses on UX and workflow. The project is built with Next.js and TypeScript, and it uses shadcn/ui plus Tailwind CSS for a modern, responsive interface that feels like a polished SaaS product rather than a demo. It integrates Clerk for authentication so users can sign in, save their logo history (planned via a dashboard), and potentially manage usage tied to their own API key. ...
    Downloads: 3 This Week
  • 19
    HivisionIDPhoto

    HivisionIDPhotos: a lightweight and efficient AI ID photo tool

    ...It also allows the generation of layout sheets such as six-inch photo arrangements for printing multiple ID photos on a single page. The project focuses on building a practical pipeline for automated ID photo production using AI-based segmentation and image processing techniques.
    Downloads: 4 This Week
  • 20
    vim-ai

    AI-powered code assistant for Vim. OpenAI and ChatGPT plugin for Vim

    vim-ai is an AI-powered assistant plugin for Vim and Neovim that brings language-model features directly into the editor. It allows users to generate code or text, edit selections in place, and carry on interactive chat-style conversations without leaving the terminal editing environment. The plugin is built around OpenAI-compatible APIs, which means it can work not only with OpenAI itself but also with compatible proxies and alternative providers. Its command set covers text completion, editing, chat continuation, image generation, and debugging utilities, making it more versatile than a narrow autocomplete add-on. ...
    Downloads: 0 This Week
  • 21
    ALLWEONE

    AI tool that generates custom presentations with real-time editing

    Presentation AI by ALLWEONE is an open source tool that uses artificial intelligence to generate complete slide decks from a simple prompt. It helps users create professional presentations quickly, with support for customizable themes, layouts, and styles. You can define slide count, language, and tone, then review or edit the AI-generated outline before finalising. Slides are built in real time, allowing you to watch content develop as the system works.
    Downloads: 1 This Week
  • 22
    ML Sharp

    Sharp Monocular View Synthesis in Less Than a Second

    ML Sharp is a research code release that turns a single 2D photograph into a photorealistic 3D representation that can be rendered from nearby viewpoints. Instead of requiring multi-view input, it predicts the parameters of a 3D Gaussian scene representation directly from one image using a single forward pass through a neural network. The core idea is speed: the 3D representation is produced in under a second on a standard GPU, and then the resulting scene can be rendered in real time to generate new views interactively. The representation is metric, meaning it supports camera movements with an absolute scale rather than only relative depth cues, which is useful for consistent viewpoint changes and downstream spatial tasks. ...
    Downloads: 1 This Week
  • 23
    Janus

    Unified Multimodal Understanding and Generation Models

    Janus is a sophisticated open-source project from DeepSeek AI that aims to unify both visual understanding and image generation in a single model architecture. Rather than having separate systems for “look and describe” and “prompt and generate”, Janus uses an autoregressive transformer framework with a decoupled visual encoder—allowing it to ingest images for comprehension and to produce images from text prompts with shared internal representations.
    Downloads: 2 This Week
  • 24
    PDFCraft

    PDFCraft is a free, privacy-focused PDF toolkit

    PDFCraft is an extensible toolkit for creating, editing, and transforming PDF documents with both a graphical interface and a scripting API, making it useful for users ranging from casual editors to automated document processors. At its core, the project provides a clean, modern UI where you can rearrange pages, annotate text, insert images, fill forms, and export to multiple formats, all without needing a heavyweight commercial PDF suite. But beyond manual editing, it also offers a...
    Downloads: 16 This Week
  • 25
    LandPPT

    An LLM-based presentation generation platform

    LandPPT is an open-source AI platform that automatically generates professional presentation slides using large language models. The system allows users to create complete PowerPoint presentations simply by entering a topic or uploading source documents such as PDFs, Word files, or Markdown notes. Using natural language processing and structured content generation, the platform produces presentation outlines and converts them into fully formatted slide decks. The application integrates...
    Downloads: 0 This Week