Showing 59 open source projects for "image text input"

  • 1
    auto-subtitle

    Automatically generate and overlay subtitles for any video

    auto-subtitle is a Python-based command-line tool that automatically generates and overlays subtitles on video files using AI-driven speech recognition. It combines FFmpeg with OpenAI’s Whisper model to transcribe spoken audio into text and synchronize it with video playback. The tool processes video input, extracts audio, and produces subtitle files that can be either exported separately or burned directly into the final video output. It supports multiple transcription models with varying accuracy and performance, allowing users to balance speed and quality depending on their needs. ...
    Downloads: 0 This Week
  • 2
    Bulk Image Optimizer and Converter

    Imagine having all your images well compressed and optimized :)

    Bulk Image Optimizer and Converter (Portable Executable). It lets users choose the output format (JPEG, PNG, or WebP), set the desired image quality, and remove EXIF data. The optimized images are saved in a separate folder named "optimized" within the input folder. The tool displays progress information, including the number of images processed, the average compression ratio, and the total space saved.
    Downloads: 0 This Week
  • 3

    psgdump

    Dump psg/ym chip tune files to txt and midi format

    PSGDump is a parser and converter for chip tune files. It supports the PSG and YM input file formats, focusing on AY/YM chip tunes from the ZX Spectrum and Atari ST. The tool produces text output of the notes played and creates a multi-track MIDI file.
    Downloads: 0 This Week
  • 4
    AugLy

    A data augmentations library for audio, image, text, and video

    ...We designed AugLy to include many specific data augmentations that users perform in real life on internet platforms like Facebook's -- for example making an image into a meme, overlaying text/emojis on images/videos, reposting a screenshot from social media. While AugLy contains more generic data augmentations as well, it will be particularly useful to you if you're working on a problem like copy detection, hate speech detection, etc.
    Downloads: 0 This Week
  • 5
    VSGAN

    VapourSynth Single Image Super-Resolution Generative Adversarial

    Single Image Super-Resolution Generative Adversarial Network (GAN) which uses the VapourSynth processing framework to handle input and output image data. Transform, Filter, or Enhance your input video, or the VSGAN result with VapourSynth, a Script-based NLE. You can chain models or re-run the model twice-over (or more). Have low VRAM? Don’t worry!
    Downloads: 0 This Week
  • 6
    A set of tools (command line and GUI) to provide a complete digital photo workflow for Unixes. EXIF headers are used as the central information repository, so users may change their software at any time without losing any data.
    Downloads: 0 This Week
  • 7
    TRACER

    Extreme Attention Guided Salient Object Tracing Network

    Extreme Attention Guided Salient Object Tracing Network (AAAI 2022) implementation in PyTorch. The fast inference mode now offers a salient object result with the mask. You can get a clearer salient object by tuning the threshold. We will release initializing TRACER with a version of pre-trained TE-x.
    Downloads: 0 This Week
  • 8

    PythonStarSplitter

    A Python Script I made to split a starfield image into several layers.

    A Python Script I made to split a starfield image into several layers. To be able to use the script, PixInsight with an installed Gaia data catalogue is required, as it needs the exported astrometry data text file.
    Downloads: 0 This Week
  • 9
    3DDFA

    Fast, accurate and stable 3D dense face alignment

    ...A simple 3D renderer written in C++ and Cython is also included. This repo supports onnxruntime, and the latency of regressing 3DMM parameters using the default backbone is about 1.35ms/image on CPU with a single image as input. See requirements.txt; tested on macOS and Linux platforms. Windows users may refer to the FQA for building issues. Note that this repo uses Python 3. The major dependencies are PyTorch, numpy, opencv-python, and onnxruntime.
    Downloads: 0 This Week
  • 10
    Linux-Intelligent-Ocr-Solution

    Easy-OCR solution and Tesseract trainer for GNU/Linux

    Linux-intelligent-ocr-solution (Lios) is free and open source software for converting print into text using either a scanner or a camera. It can also produce text from scanned images from other sources, such as a PDF, an image, a folder containing images, or a screenshot. The program is fully accessible to visually impaired users. A Tesseract Trainer GUI is also shipped with this package. Forum : https://groups.google.com/forum/#!forum/lios Video Tutorial : https://www.youtube.com/playlist?...
    Downloads: 15 This Week
  • 11
    Consistent Depth

    We estimate dense, flicker-free, geometrically consistent depth

    ...The system builds upon traditional structure-from-motion (SfM) techniques to provide geometric constraints while integrating a convolutional neural network trained for single-image depth estimation. During inference, the model fine-tunes itself to align with the geometric constraints of a specific input video, ensuring stable and realistic depth maps even in less-constrained regions. This approach achieves improved geometric consistency and visual stability compared to prior monocular reconstruction methods. ...
    Downloads: 0 This Week
  • 12
    Super-résolution via CNN

    Super resolution using a CNN, based on the work of the DGtal team

    ...This program will generate "model_epoch_<n>.pth" files corresponding to the model at epoch n, in a folder named saved_model_u<t>_bs<bs>_tbs<tbs>_lr<lr>, where <t> corresponds to the scale factor, <bs> to the size of the training batch, <tbs> to the size of the test batch, and <lr> to the learning rate. Low-res images should be located in a "dataset/input" folder, and high-res targets in a "dataset/target" folder, where the low-res and high-res versions of an image have the same name in both folders.
    Downloads: 0 This Week
  • 13
    GIF for CLI

    Takes in a GIF, short video, or a query to the Tenor GIF API

    gif-for-cli is a small, playful utility that brings animated GIFs to the command line by rendering frames directly in a terminal. It takes an input GIF (or a URL) and converts each frame into a terminal-friendly representation, timing updates to approximate the original animation. Depending on terminal capabilities, it can use ANSI color blocks or image protocols to achieve surprisingly faithful playback. The tool includes conveniences such as looping control, scaling to fit your terminal, and caching to avoid repeated downloads. ...
    Downloads: 0 This Week
  • 14

    TimingDrawer

    Text based timing diagram generator

    This tool generates timing diagrams for documenting hardware designs. It reads the description from a text file with a simple syntax and generates vector graphics (EPS, SVG, or EMF format). It can be used in command-line mode or with a GUI. It is written in Python and works on any platform.
    Downloads: 0 This Week
  • 15

    Windows Spotlight Slideshow Update

    App maintains a slideshow image folder using Windows Spotlight images.

    This app maintains a slideshow image folder, using Windows Spotlight images that meet the customizable selection criteria. The app adds or deletes slideshow images based on additions to or deletions from the Windows Spotlight folder by Windows Spotlight. The slideshow folder can be specified in Windows Settings background personalization as the input to a desktop/background slideshow.
    Downloads: 0 This Week
  • 16

    Training Image Operators from Samples

    Tools to train Image Operators automatically from a set of samples.

    TRIOS - Training Image Operators from Samples is a set of tools to bring Image Processing closer to scientists in general. It is capable of estimating an operator between two images using only pairs of samples that contain an input image and the desired output. The operator is saved to a file and can be applied to any image.
    Downloads: 0 This Week
  • 17
    Crazy Eddies GUI System (CEGUI)

    A fast, powerful and adaptable GUI solution

    Crazy Eddie's GUI (CEGUI) system is a graphical user interface C++ library. It was designed particularly for the needs of videogames, but the library is also usable for non-game applications (rendering/visualisation/virtual reality) and tools. It is designed for user flexibility in look-and-feel, as well as being adaptable to the user's choice in tools and operating systems. Established in 2003, CEGUI sees continual, active development and remains one of the...
    Downloads: 48 This Week
  • 18
    GimpPy takes image maps and an image as input and outputs a report.py file used to generate PDFs. The output files may run on their own or be chained together to make more complex multi-page reports. The required input is a dict with values for the fields you have mapped on your image.
    Downloads: 0 This Week
  • 19
    Newspaper3k

    News, full-text, and article metadata extraction in Python 3

    Inspired by requests for its simplicity and powered by lxml for its speed. Newspaper is an amazing Python library for extracting and curating articles. Newspaper delivers Instapaper-style article extraction. Newspaper is a Python 3 library! If you are certain that an entire news source is in one language, go ahead and use the same API. Works in 10+ languages, including English, Chinese, German, and Arabic! On Python 3 you must install newspaper3k, not newspaper. newspaper is our Python 2 library....
    Downloads: 0 This Week
  • 20
    Importer library to import assets from different common 3D file formats such as Collada, Blend, Obj, X, 3DS, LWO, MD5, MD2, MD3, MDL, MS3D, and many others. The data is stored in its own in-memory data format, which can be easily processed. www.open3mod.com/ is a 3D model viewer and exporter based on Assimp that is also open source.
    Downloads: 27 This Week
  • 21
    scrimage

    A unique python-based image editor

    A unique python-based image editor with low-level control. It will be able to apply fairly complex mathematical operations to individual pixels based on the contents of a script or user input at a command line. It will then be able to apply those changes to the image for a unique effect.
    Downloads: 0 This Week
  • 22
    EnKoDeur-Mixeur
    EnKoDeur-Mixeur (EKD) is open source software for video, picture, and audio post-production. It can also be used to convert videos to many formats. It is written in Python and uses the PyQt4 bindings.
    Downloads: 0 This Week
  • 23
    GOFoto is an application for managing large collections of photos. It supports photo refining and generating web galleries and VideoCDs.
    Downloads: 0 This Week
  • 24
    memoree

    Browser interface to your memories

    You have your pictures stored on a server or PC that is accessible from the internet or your intranet. This package makes them available through a web browser.
    Downloads: 0 This Week
  • 25
    L.E.I.C.A. (Long Exposure Internet CAmera) is an application that takes input from your webcam, or v4l2 device, and makes a long-exposure image. So far there is only a Python binding for this app; a C/C++ SDL one will follow soon.
    Downloads: 0 This Week