23 projects for "caption" with 1 filter applied:

  • 1
    img2dataset

    Easily turn large sets of image urls to an image dataset

    Easily turn large sets of image URLs into an image dataset. Can download, resize, and package 100M URLs in 20 hours on one machine. Also supports saving captions for URL+caption datasets. Opt-out directives: websites can send the HTTP headers X-Robots-Tag: noai, X-Robots-Tag: noindex, X-Robots-Tag: noimageai, and X-Robots-Tag: noimageindex. By default, img2dataset ignores images served with such headers.
    Downloads: 1 This Week
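The opt-out behaviour described above can be illustrated with a small header check. This is a minimal sketch and not img2dataset's actual implementation: the function name `is_opted_out` and the plain-dict input are assumptions made for illustration, and it ignores the user-agent-scoped form that X-Robots-Tag values can also take.

```python
# Directives named in the project description above.
DISALLOWED_DIRECTIVES = {"noai", "noindex", "noimageai", "noimageindex"}


def is_opted_out(headers):
    """Return True if an X-Robots-Tag header carries an opt-out directive.

    `headers` is a plain mapping of header name to value; this input shape
    is an assumption of the sketch, not part of img2dataset's API.
    """
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() != "x-robots-tag":
            continue
        # One header value may list several comma-separated directives.
        directives = {part.strip().lower() for part in value.split(",")}
        if directives & DISALLOWED_DIRECTIVES:
            return True
    return False
```

A downloader following the default described above would skip any image for which such a check returns True.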
  • 2
    RealtimeSTT

    A robust, efficient, low-latency speech-to-text library

    RealtimeSTT is a Python-based realtime speech-to-text engine emphasizing low latency, wake-word detection, voice activity detection, and automatic speech segmentation. It provides asynchronous callbacks, nanosecond-precision timestamps, and CLI tools, suitable for building voice assistants, meeting transcribers, or live caption systems.
    Downloads: 1 This Week
  • 3
    DeepSeek VL2

    Mixture-of-Experts Vision-Language Models for Advanced Multimodal

    ...It combines image and text inputs into a unified embedding / reasoning space so that you can query with text and image jointly (e.g. “What’s going on in this scene?” or “Generate a caption appropriate to context”). The model supports both image understanding (vision tasks) and multimodal reasoning, and is likely used as a component in agent systems to process visual inputs as context for downstream tasks. The repository includes evaluation results (e.g. image/text alignment scores, common VL benchmarks), configuration files, and model weights (where permitted). ...
    Downloads: 4 This Week
  • 4
    DeepSeek VL

    Towards Real-World Vision-Language Understanding

    DeepSeek-VL is DeepSeek’s initial vision-language model that anchors their multimodal stack. It enables understanding and generation across visual and textual modalities—meaning it can process an image + a prompt, answer questions about images, caption, classify, or reason about visuals in context. The model is likely used internally as the visual encoder backbone for agent use cases, to ground perception in downstream tasks (e.g. answering questions about a screenshot). The repository includes model weights (or pointers to them), evaluation metrics on standard vision + language benchmarks, and configuration or architecture files. ...
    Downloads: 0 This Week
  • 5
    CLIP

    CLIP, Predict the most relevant text snippet given an image

    CLIP (Contrastive Language-Image Pretraining) is a neural model that links images and text in a shared embedding space, allowing zero-shot image classification, similarity search, and multimodal alignment. It was trained on large sets of (image, caption) pairs using a contrastive objective: images and their matching text are pulled together in embedding space, while mismatches are pushed apart. Once trained, you can give it any text labels and ask it to pick which label best matches a given image—even without explicit training for that classification task. The repository provides code for model architecture, preprocessing transforms, evaluation pipelines, and example inference scripts. ...
    Downloads: 0 This Week
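The zero-shot step described above (embed the image, embed each candidate label, pick the closest) can be sketched in plain Python. This is an illustrative sketch, not the CLIP repository's code: the toy vectors stand in for real encoder outputs, and the function names and the temperature value of 0.07 are assumptions chosen for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_classify(image_emb, label_embs, temperature=0.07):
    """Return the best-matching label and a probability for every label."""
    # Scaled similarity between the image and each candidate label.
    logits = {label: cosine_similarity(image_emb, emb) / temperature
              for label, emb in label_embs.items()}
    # Numerically stable softmax over the candidate labels.
    peak = max(logits.values())
    weights = {label: math.exp(v - peak) for label, v in logits.items()}
    total = sum(weights.values())
    probs = {label: w / total for label, w in weights.items()}
    return max(probs, key=probs.get), probs


# Made-up embeddings; a real system would obtain these from the encoders.
image_emb = [0.9, 0.1, 0.0]
label_embs = {
    "a photo of a dog": [1.0, 0.0, 0.0],
    "a photo of a cat": [0.0, 1.0, 0.0],
}
best, probs = zero_shot_classify(image_emb, label_embs)
```

Because the labels are ordinary text, the candidate set can be changed at inference time without retraining, which is the property the description refers to as zero-shot classification.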
  • 6
    4M

    4M: Massively Multimodal Masked Modeling

    4M is a training framework for “any-to-any” vision foundation models that uses tokenization and masking to scale across many modalities and tasks. The same model family can classify, segment, detect, caption, and even generate images, with a single interface for both discriminative and generative use. The repository releases code and models for multiple variants (e.g., 4M-7 and 4M-21), emphasizing transfer to unseen tasks and modalities. Training/inference configs and issues discuss things like depth tokenizers, input masks for generation, and CUDA build questions, signaling active research iteration. ...
    Downloads: 0 This Week
  • 7
    Qwen-Image-Layered

    Qwen-Image-Layered: Layered Decomposition for Inherent Editability

    Qwen-Image-Layered is an extension of the Qwen series of multimodal models that introduces layered image understanding, enabling the model to reason about hierarchical visual structures — such as separating foreground, background, objects, and contextual layers within an image. This architecture allows richer semantic interpretation, enabling use cases such as scene decomposition, object-level editing, layered captioning, and more fine-grained multimodal reasoning than with flat image...
    Downloads: 0 This Week
  • 8
    Krajee bootstrap-star-rating

    A simple yet powerful jQuery star rating plugin with fractional rating

    ...The plugin uses Bootstrap markup and styling by default, but this can be overridden with any other CSS markup. The rating control, including the stars, caption, and clear button, can be scaled to any size. Five prebuilt size templates are available: xl, lg, md, sm, and xs. You can also configure your own size with simple CSS changes. You can use the HTML5 number input as a polyfill, and the plugin will automatically use number attributes such as min, max, and step. However, number inputs have a problem with decimal values in the Chrome browser. ...
    Downloads: 0 This Week
  • 9
    Coppermine Photo Gallery
    Coppermine is an easy-to-set-up, fast, feature-rich photo gallery script with a MySQL database, user management, private galleries, automatic thumbnail creation, an e-card feature, and a template system for easy customization to match the rest of a site.
    Downloads: 7 This Week
  • 10
    Wikimedia Picture of the Day Background

    Uses Wikimedia picture of the day for desktop wallpaper

    Script that downloads the current and future POTD from Wikimedia Commons, adds a summary caption to the bottom, and sets it as the desktop background image for GNOME. It can also be set up on a server to download images at any desired time of day and mirror them for faster retrieval from personal devices on the LAN. Tested on Ubuntu. Could be used on a Raspberry Pi powered digital picture frame. Automatic cleanup of old images still needs to be implemented.
    Downloads: 0 This Week
  • 11
    A Transport Stream analyser for the Brazilian D-TV system (SBTVD / ISDB-Tb). Its GUI shows the SI/PSI structure of the stream in a tree view and bitrate statistics for each ES, and it decodes Closed Captions, EPG, and the DSMCC carousel, plus more.
    Downloads: 0 This Week
  • 12
    java Image Album

    Java Image Album (jIA) is a wizard-style photo album application.

    Java Image Album (jIA) is a free, open-source, easy-to-use, wizard-style Java application that generates HTML photo albums. It automatically resizes your images and produces a set of HTML pages, including index pages with thumbnails and detailed caption pages for each photo. Publishing a new photo album is as simple as copying a directory of images to your web directory. Java Image Album is released under the Mozilla Public License 1.1. See the license agreement for more information. Your feedback is strongly desired and is required to make this product more successful.
    Downloads: 0 This Week
  • 13
    ABC Filesystem

    FUSE-based filesystem reflecting XWindows into files

    An educational FUSE filesystem that represents each window on a system running the X Window System as a directory containing special files. Those files can be used to change window properties such as the caption or the position on screen. Tested on Gentoo Linux.
    Downloads: 0 This Week
  • 14
    Java Image Album (jIA) is an easy-to-use, wizard-style Java application that generates HTML photo albums. It automatically resizes your images and produces a set of HTML pages, including index pages with thumbnails and detailed caption pages for each photo.
    Downloads: 0 This Week
  • 15
    screen_notify.rb - A Ruby script to add messages to the caption field in GNU Screen.
    Downloads: 0 This Week
  • 16
    CCVisualizer is a multiplatform Closed Caption or subtitle visualizer for movies or films (e.g. YouTube, Megavideo). After downloading the movie's SRT file, you can display it, pause it, and manually align video and text.
    Downloads: 0 This Week
  • 17
    A PHP implementation of the Gallery2 Remote Protocol. Supports the following commands: No-op, Login, Create new item, Create new album, Find album, Fetch album images, and Find duplicates by caption. Also includes simple multilevel logging.
    Downloads: 0 This Week
  • 18
    Provides files to display a photo album stored either locally (i.e. on a CD or hard drive) or on the web, using a web browser. Album data (title, caption, etc.) is stored in a single convenient XML file, which can be generated with our Windows GUI or written by hand.
    Downloads: 0 This Week
  • 19
    Finds text in Delphi *.pas files, e.g. button1.caption:='Hello', and replaces it with a function call: button1.caption:=TXT(0) {##Hello##}. Generates a function "TXT" with a string list of all found texts at the beginning of your source.
    Downloads: 0 This Week
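The find-and-replace step in the description above can be sketched with a regular expression. This is a rough illustration in Python rather than the tool's own Delphi code: the function name `extract_texts` is hypothetical, and the sketch assumes every single-quoted literal is replaced and numbered in order of appearance, which the description does not confirm.

```python
import re

# A Delphi string literal: single quotes, with '' as an escaped quote.
STRING_LITERAL = re.compile(r"'((?:[^']|'')*)'")


def extract_texts(source):
    """Replace string literals with TXT(n) calls and collect the texts.

    Returns the rewritten source and the list of extracted texts, where
    the list index matches the number passed to TXT (an assumption).
    """
    texts = []

    def replace(match):
        text = match.group(1)
        index = len(texts)
        texts.append(text)
        # Keep the original text next to the call as a Delphi comment.
        return "TXT(%d) {##%s##}" % (index, text)

    return STRING_LITERAL.sub(replace, source), texts


rewritten, texts = extract_texts("button1.caption:='Hello';")
```

A real tool would also need to skip literals inside comments and handle texts that contain a closing brace, which this sketch does not attempt.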
  • 20
    The Xbae widget set consists of the well known XbaeMatrix widget, and Caption and XbaeInput widgets.
    Downloads: 31 This Week
  • 21
    Closed Caption (CC) decoder for VDR.
    Downloads: 0 This Week
  • 22
    Publish pictures from your digital camera to the web with this easy to use wizard-style Java application. Automatically resize your images and produce a set of HTML pages including index pages with thumbnails, and detailed caption pages for each photo.
    Downloads: 0 This Week
  • 23
    Closed Caption/Extended Data Services Decoder for bttv based video cards.
    Downloads: 0 This Week