109 projects for "image text input" with 1 filter applied:

  • 1
    Image Toolbox

    Image Toolbox is a powerful picture editor, which can crop, apply filters, add some drawings, erase background, edit EXIF, or even create a PDF file.
    Downloads: 19 This Week
  • 2
    Mozc

    Mozc - a Japanese Input Method Editor designed for multi-platform

    Mozc is an open source Japanese Input Method Editor (IME) developed by Google, designed to provide Japanese text input across multiple operating systems including Android, macOS, Windows, GNU/Linux, and Chromium OS. The project originated as a subset of Google Japanese Input, released publicly under the BSD 3-Clause license for community use and development. Mozc offers core IME functionality such as text conversion, prediction, and dictionary-based input, enabling users to efficiently type and edit Japanese text. ...
    Downloads: 11 This Week
  • 3
    Reins

    Ollama client that simplifies experimenting with LLMs

    ...The application is built to run across platforms including mobile and desktop environments, making it accessible for a wide range of users who want consistent control over their AI workflows. It also includes features for editing and regenerating messages, enabling iterative refinement of outputs without restarting conversations. Reins extends beyond text by supporting image input and multimodal interactions, which expands its use cases beyond basic chat scenarios. Overall, it is best suited for users who want granular control over model behavior and experimentation while maintaining a clean and intuitive interface.
    Downloads: 6 This Week
  • 4
    IOPaint

    Image inpainting tool powered by SOTA AI Model

    ...Its feature set includes erasing people, watermarks, or defects, adding or replacing objects, applying text-aware edits, and extending images outward (outpainting) to fill contours or expand compositions.
    Downloads: 16 This Week
  • 5
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. It can recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file, show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details), and listen to a microphone in the background, among other useful recognizer features. The easiest way to install this is using...
    Downloads: 21 This Week
  • 6
    Skiko

    Kotlin Multiplatform bindings to Skia

    ...It serves as the low-level rendering backbone for Kotlin UI frameworks like Compose for Desktop and Compose for Web, enabling smooth, GPU-accelerated 2D graphics across Windows, macOS, Linux, and other supported targets without writing native code. Skiko abstracts away platform-specific rendering details while exposing Skia’s powerful features such as high-quality text shaping, image filters, path operations, and hardware accelerated canvases, making it ideal for building rich UI components, animations, games, or custom drawing surfaces. By leveraging Skia’s proven performance and cross-platform consistency, Skiko helps developers write a single graphics pipeline that behaves predictably across environments, simplifying maintenance and reducing platform fragmentation.
    Downloads: 86 This Week
  • 7
    HarfBuzz

    Open source text shaping engine

    ...This shaping depends on a number of factors: the input string, the active font, the script (or writing system) of the string, and the string's language. Various font formats have their own set of standard text-shaping rules. With HarfBuzz, you can properly shape all the major writing systems. HarfBuzz is cross-platform and supports all major software platforms and font formats.
    Downloads: 2 This Week
  • 8
    Open QR Code

    Open QR Code is an open-source, cross-platform app

    Open QR Code is an open-source, cross-platform application built with Flutter as its main framework, alongside C, C++, Dart, Java, Kotlin, Skia (a 2D rendering engine), and Impeller (the default rendering engine on iOS). The app is available on Android, Windows, and the Web. Users can generate QR codes from any text input, save them to their gallery, share them directly from the app, and scan QR codes with a single click to retrieve encoded information.
    Downloads: 19 This Week
  • 9
    Ebitengine

    A dead simple 2D game engine for Go

    Ebitengine (formerly known as Ebiten) is a lightweight, open-source 2D game engine built for the Go programming language. It is designed to be simple and easy to use, allowing developers to build games quickly with a clean and minimal API. Ebitengine supports cross-platform deployment, including desktop, mobile, web, and select console platforms. The engine provides essential features such as 2D graphics rendering, input handling, and audio playback. Developers can work with transformations,...
    Downloads: 3 This Week
  • 10
    ZLPhotoBrowser

    WeChat-like image picker. Supports selecting photos, videos, GIFs, etc.

    ZLPhotoBrowser is a WeChat-like image picker. It supports selecting normal photos, videos, GIFs, and live photos, as well as editing images and cropping video. The image editor provides Draw, Crop, Image sticker, Text sticker, Mosaic, Filter, and Adjust (brightness, contrast, and saturation) tools. The draw color, crop ratio, and filter effect can be customized, and you can choose which editing tools to include.
    Downloads: 0 This Week
  • 11
    Dicio assistant

    Dicio assistant app for Android

    Dicio is a free and open source voice assistant for Android that focuses on strong privacy by running its understanding and response generation directly on the device whenever possible. It supports multiple input and output methods, including hotword-based voice input using the Vosk speech-to-text engine and a graphical interface for users who prefer to tap instead of talk. The assistant is built around a flexible “skills” system that lets it respond to a wide variety of requests such as search, weather, navigation, calculator, timers, media control, and more. ...
    Downloads: 0 This Week
  • 12
    shuyuan

    Reading book source

    ...For learners, researchers, or avid readers, Shuyuan offers a way to bridge from plain text files or eBooks into a manageable, interactive resource — one where notes, references, and reading progress can be tracked. It likely supports different input formats (text, HTML, PDF), and may integrate optional translation or text normalization tools.
    Downloads: 0 This Week
  • 13
    Google AI Edge Gallery

    A gallery that showcases on-device ML/GenAI use cases

    Gallery is a curated collection of on-device machine learning examples, demo apps, and model artifacts designed to help developers experiment with and deploy ML at the edge. The project bundles runnable samples that show how to run TensorFlow Lite/Edge TPU models (and similar lightweight runtimes) on mobile and embedded platforms, demonstrating common tasks like image classification, object detection, audio recognition, and pose estimation. Each sample is intended to be both a learning aid...
    Downloads: 651 This Week
  • 14
    Aidea

    Flutter-based cross-platform app integrating major AI models

    AIdea is a comprehensive Flutter-based cross-platform app integrating major AI models—OpenAI GPT, Chinese models Tongyi Qianwen and Wenxin Yiyan, plus image models like Stable Diffusion for text-to-image, image-to-image, SDXL 1.0, super-resolution, and colorization. It includes a client app, server backend, and Docker deployment scripts for hosted setups.
    Downloads: 0 This Week
  • 15
    Lumo Android App

    Android application for Proton Lumo

    Lumo Android App is the official Android client implementation of Lumo, a privacy-first AI assistant created by Proton that lets users interact with an intelligent chatbot securely and confidentially on mobile devices. Lumo is designed so that every conversation remains encrypted and private, meaning chats are not logged, tracked, or used to train external large language models, and all interactions are protected with zero-access encryption so only the user can read them. The Android app...
    Downloads: 1 This Week
  • 16
    ncnn

    High-performance neural network inference framework for mobile

    ncnn is a high-performance neural network inference computing framework designed specifically for mobile platforms. It brings artificial intelligence to your fingertips with no third-party dependencies, and runs faster than all other known open source frameworks on mobile phone CPUs. ncnn allows developers to easily deploy deep learning models to mobile platforms and create intelligent apps. It is cross-platform and supports most commonly used CNN networks, including...
    Downloads: 26 This Week
  • 17
    Notifee Notifications

    A feature rich notifications library for React Native

    ...Present & handle quick actions alongside your notification content. Actions can be handled in the background or foreground with JavaScript code! Notifee supports many notification styles such as Big Text, Big Picture, Inbox & Messaging on Android and attachments & custom summary text on iOS. Trigger your notifications to display at a certain point in the future, or set up repeating triggers to alert your users regularly! Notifications support displaying remote and local images, with support for requiring React Native image assets.
    Downloads: 0 This Week
  • 18
    Omi

    AI that sees your screen and listens to conversations

    The Omi project is an open-source AI wearable ecosystem developed by Based Hardware that combines hardware, software, and cloud infrastructure to create a persistent “second brain” for capturing and processing real-world interactions. It is designed as a system that continuously listens to conversations and monitors screen activity, converting this input into structured data such as transcripts, summaries, and actionable insights in real time. The platform operates across multiple environments, including wearable devices, mobile apps, and desktop applications, ensuring seamless integration into a user’s daily workflow. At its core, omi uses a pipeline of speech-to-text systems, large language models, and memory storage services to transform raw audio and context into meaningful outputs like tasks and reminders. ...
    Downloads: 2 This Week
  • 19
    Airtest

    UI Automation Framework for Games and Apps

    Airtest provides cross-platform APIs, including app installation, simulated input, assertion and so forth. Airtest uses image recognition technology to locate UI elements so that you can automate games and apps without injecting any code. Airtest cases can be easily run on large device farms, using the command line or Python API. HTML reports with detailed info and screen recording allow you to quickly locate failure points.
    Downloads: 8 This Week
  • 20
    Open-AutoGLM

    An open phone agent model & framework

    Open-AutoGLM is an open-source framework and model designed to empower autonomous mobile intelligent assistants by enabling AI agents to understand and interact with phone screens in a multimodal manner, blending vision and language capability to control real devices. It aims to create an “AI phone agent” that can perceive on-screen content, reason about user goals, and execute sequences of taps, swipes, and text input via automated device control interfaces like ADB, enabling hands-off completion of multi-step tasks such as navigating apps, filling forms, and more. Unlike traditional automation scripts that depend on brittle heuristics, Open-AutoGLM uses pretrained large language and vision-language models to interpret visual context and natural language instructions, giving the agent robust adaptability across apps and interfaces.
    Downloads: 11 This Week
  • 21
    OpenFL

    Open source library for creative expression on the web, desktop, etc.

    ...It builds on the Haxe programming language and offers a familiar display list and event-driven API inspired by classic Adobe Flash and AIR, allowing developers to leverage well-known paradigms while targeting modern platforms. OpenFL supports 2D and limited 3D graphics rendering, audio playback, advanced user input (mouse, touch, keyboard, gamepads), rich text formatting, asset management, networking, and file system access, making it a comprehensive foundation for interactive experiences. Projects written with OpenFL can compile to native C++ executables, JavaScript/WebGL for web, or run through app runtimes like Electron without plugins, enabling high performance and broad reach.
    Downloads: 0 This Week
  • 22
    CogAgent

    An open sourced end-to-end VLM-based GUI Agent

    CogAgent is a 9B-parameter bilingual vision-language GUI agent model based on GLM-4V-9B, trained with staged data curation, optimization, and strategy upgrades to improve perception, action prediction, and generalization across tasks. It focuses on operating real user interfaces from screenshots plus text, and follows a strict input–output format that returns structured actions, grounded operations, and optional sensitivity annotations. The model is designed for agent-style execution rather than freeform chat, maintaining a continuous execution history across steps while requiring a fresh session for each new task. Inference supports BF16 on NVIDIA GPUs, with optional INT8 and INT4 modes available but with noted performance loss at INT4; example CLIs and a web demo illustrate bounding-box outputs and operation categories.
    Downloads: 0 This Week
  • 23
    Texture

    Smooth asynchronous user interfaces for iOS apps

    ...If you've ever dealt with cell reuse bugs, tried to performantly preload data for a page- or scroll-style interface, or even just tried to keep your app from dropping too many frames, you can benefit from integrating Texture. Texture lets you move image decoding, text sizing and rendering, and other expensive UI operations off the main thread, keeping the main thread available to respond to user interaction.
    Downloads: 0 This Week
  • 24
    Zulip

    Powerful open source team chat application

    Zulip is a powerful open source group chat application that combines the immediacy of real-time chat with the productivity benefits of a threaded conversation model. Zulip’s unique threading model allows users to easily catch up on important conversations, helping to save time and increase productivity.
    Downloads: 2 This Week
  • 25
    MiniCPM-o

    A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

    MiniCPM-o 2.6 is a cutting-edge multimodal large language model (MLLM) designed for high-performance tasks across vision, speech, and video. Capable of running on end-side devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports...
    Downloads: 0 This Week