Showing 8 open source projects for "composition"

  • 1
    Ideogram 4

    Open image model at the forefront of design

    Ideogram 4 is an open-weight text-to-image model focused on high-quality visual generation, design control, and accurate text rendering inside images. It is built for users who need more than generic image generation, especially when layout, typography, composition, color, and language understanding matter. The project introduces a structured JSON prompting workflow that gives creators more explicit control over scene details and visual constraints. It can also accept plain-text prompts, making it accessible to users who prefer a simpler generation style. Ideogram 4 is especially useful for design-heavy outputs such as posters, ads, mockups, branded graphics, and images that include readable text. ...
    Downloads: 19 This Week
  • 2
    Z-Image

    Image generation model with single-stream diffusion transformer

    ...Despite its compact size, Z-Image produces outputs that closely rival those from much larger models — including strong rendering of bilingual (English and Chinese) text inside images, accurate prompt adherence, and good layout and composition.
    Downloads: 30 This Week
  • 3
    HY-World 2.0

    A Multi-Modal World Model for Reconstructing, Generating, Simulation

    ...For text and single-image inputs, it generates high-fidelity 3D Gaussian Splatting scenes through a multi-stage pipeline that includes panorama generation, trajectory planning, world expansion, and world composition. The system also improves reconstruction from multi-view images and video by upgrading its feed-forward 3D prediction components and its memory-aware view generation process. Another major part of the project is WorldLens, a rendering platform designed for interactive exploration with an engine-agnostic architecture, automatic image-based lighting, collision detection, and support for character interaction.
    Downloads: 3 This Week
  • 4
    Wan2.2

    Wan2.2: Open and Advanced Large-Scale Video Generative Model

    ...It introduces a Mixture-of-Experts (MoE) architecture that splits the denoising process across specialized expert models, increasing total model capacity without raising computational costs. Wan2.2 integrates meticulously curated cinematic aesthetic data, enabling precise control over lighting, composition, color tone, and more, for high-quality, customizable video styles. The model is trained on significantly larger datasets than its predecessor, greatly enhancing motion complexity, semantic understanding, and aesthetic diversity. Wan2.2 also open-sources a 5-billion parameter high-compression VAE-based hybrid text-image-to-video (TI2V) model that supports 720P video generation at 24fps on consumer-grade GPUs like the RTX 4090. ...
    Downloads: 105 This Week
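
The idea of splitting denoising across experts can be sketched as follows. This is a minimal illustration, not Wan2.2's actual code: the two expert functions, the fixed noise-level boundary, and the linear noise schedule are all assumptions made for the example. The point it shows is that only one expert runs per step, so per-step compute stays flat while the total parameter count grows.

```python
# Hypothetical stand-ins for two expert networks (assumption: two experts).
def high_noise_expert(latent):
    # Placeholder for an expert specialized in early, very noisy steps.
    return [v * 0.5 for v in latent]


def low_noise_expert(latent):
    # Placeholder for an expert specialized in late, detail-refining steps.
    return [v * 0.9 for v in latent]


def denoise(latent, num_steps=10, boundary=0.5):
    """Run a toy denoising loop that routes each step to exactly one expert."""
    used = []
    for step in range(num_steps):
        # Assumed linear schedule: noise level falls from 1.0 toward 0.0.
        noise_level = 1.0 - step / num_steps
        if noise_level > boundary:
            latent = high_noise_expert(latent)
            used.append("high")
        else:
            latent = low_noise_expert(latent)
            used.append("low")
    return latent, used


final_latent, experts_used = denoise([1.0, -2.0, 0.5])
```

Here `experts_used` records which expert handled each step, which makes it easy to check that every step invoked a single expert.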
  • 5
    GLM-Image

    GLM-Image: Auto-regressive for Dense-knowledge and High-fidelity Image

    GLM-Image is an open-source generative AI model designed to create high-fidelity images from text prompts using a hybrid architecture that combines autoregressive semantic understanding with diffusion-based detail refinement. It excels at generating images that include complex layouts and detailed text content, making it especially useful for posters, diagrams, infographics, social media graphics, and visual content that requires precise text placement and semantic alignment. Because it...
    Downloads: 1 This Week
  • 6
    Style Aligned

    Official code for Style Aligned Image Generation via Shared Attention

    ...Instead of fully re-generating an image—and risking changes to lighting, texture, or rendering choices—the method aligns internal features across denoising steps so the target edit inherits the source style. This alignment acts like a constraint on the model’s evolution, steering composition, palette, and brushwork even as objects or attributes change. The result is more consistent edits across a set, which is crucial for workflows like product variations, character sheets, or brand-coherent art. The repository provides reproducible scripts, reference prompts, and guidance for tuning strengths so users can dial in subtle retouches or bolder substitutions. ...
    Downloads: 0 This Week
  • 7
    Prompt-to-Prompt

    Latent Diffusion and Stable Diffusion Implementation

    ...The method supports gentle edits (e.g., style, color, lighting) as well as stronger semantic substitutions, and it can localize edits to specific words or regions by selectively updating attention. Because edits are steerable via prompt wording and token weighting, creators can iterate quickly, exploring variations without losing composition. The repository includes reference notebooks and scripts that plug into popular latent diffusion backbones, making it practical to try the technique on your own prompts and seeds. It is especially useful for workflows that need consistent framing, such as product shots, illustrations, and concept art.
    Downloads: 0 This Week
  • 8
    SG2Im

    Code for "Image Generation from Scene Graphs", Johnson et al, CVPR 201

    ...The pipeline typically predicts object layouts (bounding boxes and masks) from the graph, then renders a realistic image conditioned on those layouts. This separation lets the model reason about geometry and composition before committing to texture and color, improving spatial fidelity. The repository includes training code, datasets, and evaluation scripts so researchers can reproduce baselines and extend components such as the graph encoder or image generator. In practice, sg2im demonstrates how structured semantics can guide generative models to produce controllable, compositional imagery.
    Downloads: 0 This Week
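
The graph-to-layout-to-image separation described for sg2im can be sketched in a few lines. This is a toy under stated assumptions, not the repository's code: sg2im learns its layout predictor and image generator, whereas the hand-written rules in `toy_layout` and the grid painter in `rasterize` are hypothetical stand-ins chosen only to show geometry being decided before any texture or color.

```python
from dataclasses import dataclass


@dataclass
class Box:
    # Normalized image coordinates; y = 0 is the top edge.
    x0: float
    y0: float
    x1: float
    y1: float


def toy_layout(objects, triples):
    """Assign a box to each object using hand-written spatial rules."""
    # Every object starts in a default centered box.
    boxes = {name: Box(0.25, 0.25, 0.75, 0.75) for name in objects}
    for subj, pred, obj in triples:
        ref = boxes[obj]
        if pred == "above":
            boxes[subj] = Box(ref.x0, 0.0, ref.x1, ref.y0)
        elif pred == "below":
            boxes[subj] = Box(ref.x0, ref.y1, ref.x1, 1.0)
        elif pred == "left of":
            boxes[subj] = Box(0.0, ref.y0, ref.x0, ref.y1)
        elif pred == "right of":
            boxes[subj] = Box(ref.x1, ref.y0, 1.0, ref.y1)
    return boxes


def rasterize(boxes, size=8):
    """Paint each box into a size x size grid of object names (None = background)."""
    grid = [[None] * size for _ in range(size)]
    for name, b in boxes.items():
        for row in range(int(b.y0 * size), int(b.y1 * size)):
            for col in range(int(b.x0 * size), int(b.x1 * size)):
                grid[row][col] = name
    return grid


# Scene graph: objects plus (subject, predicate, object) triples.
objects = ["sky", "grass", "sheep"]
triples = [("sky", "above", "grass"), ("sheep", "left of", "grass")]

layout = toy_layout(objects, triples)   # stage 1: geometry only
label_map = rasterize(layout)           # stage 2 would condition an image generator on this
```

In a real system the `label_map` would be replaced by predicted boxes and masks that a generator network turns into pixels.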