Search Results for "image to text converter"

Showing 113 open source projects for "image to text converter"

Filter: JavaScript

1

Text-to-image Playground

A playground to generate images from any text prompt using SD

dalle-playground is an open-source web application that allows users to generate images from natural language text prompts using modern text-to-image generative models. Originally built around DALL-E Mini, the project later transitioned to using Stable Diffusion, enabling more detailed and higher-quality image synthesis. The system combines a backend machine learning service with a browser-based frontend interface that lets users experiment interactively with prompt engineering and generative AI. ...

Downloads: 1 This Week

Last Update: 2026-03-11
See Project
2

lcd-image-converter

Tool to create bitmaps and fonts for embedded applications.

This program allows you to create bitmaps and fonts, and transform them to "C" source format for embedded applications. The transformation of the images to the source code is made by using templates. Therefore, by modifying the templates, you can change the format of the output within certain limits.
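
The template-driven idea described above can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the `$(name)` placeholder syntax, the function names `packRows`, `renderTemplate`, and `bitmapToC`, and the 1-bit, MSB-first packing are all invented for this sketch and are not lcd-image-converter's actual template format or code.

```javascript
// Hypothetical template: the $(...) placeholders are invented for this sketch
// and do not reproduce lcd-image-converter's real template syntax.
const template = [
  "static const uint8_t $(name)[$(size)] = {",
  "$(data)",
  "};",
].join("\n");

// Pack a 1-bit-per-pixel bitmap (rows of 0/1 values) into bytes, MSB first.
function packRows(rows) {
  const bytes = [];
  for (const row of rows) {
    for (let x = 0; x < row.length; x += 8) {
      let byte = 0;
      for (let bit = 0; bit < 8; bit++) {
        // Pixels past the end of the row are treated as 0 (padding).
        byte = (byte << 1) | (row[x + bit] ? 1 : 0);
      }
      bytes.push(byte);
    }
  }
  return bytes;
}

// Substitute each $(key) placeholder; unknown keys are left untouched.
function renderTemplate(tpl, values) {
  return tpl.replace(/\$\((\w+)\)/g, (match, key) =>
    key in values ? String(values[key]) : match
  );
}

// Convert a bitmap to C source text by filling in the template.
function bitmapToC(name, rows, tpl = template) {
  const bytes = packRows(rows);
  const data = bytes
    .map((b) => "0x" + b.toString(16).padStart(2, "0"))
    .join(", ");
  return renderTemplate(tpl, { name, size: bytes.length, data: "    " + data });
}

// A 8x3 test glyph: a filled top and bottom edge with a hollow middle row.
const glyph = [
  [1, 1, 1, 1, 1, 1, 1, 1],
  [1, 0, 0, 0, 0, 0, 0, 1],
  [1, 1, 1, 1, 1, 1, 1, 1],
];
console.log(bitmapToC("box", glyph));
```

Because the output layout lives entirely in the template string, changing the template (for example, to emit `const unsigned char` instead of `uint8_t`) changes the generated source without touching the packing code, which is the kind of flexibility the description refers to.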

2 Reviews

Downloads: 524 This Week

Last Update: 2025-01-31
See Project
3

Tesseract.js

A pure Javascript Multilingual OCR

Tesseract.js is a pure JavaScript port of the popular Tesseract OCR engine. The library supports more than 100 languages, automatic text orientation and script detection, and a simple interface for reading paragraph, word, and character bounding boxes. Tesseract.js can run either in a browser or on a server with Node.js. It extracts words in almost any language from images. The main Tesseract.js functions (e.g. recognize, detect) take an image parameter, which should be something image-like. ...

Downloads: 15 This Week

Last Update: 2025-12-15
See Project
4

Markdown PDF

Markdown converter for Visual Studio Code

This extension converts Markdown files to PDF, HTML, PNG or JPEG files.

Downloads: 4 This Week

Last Update: 2026-04-13
See Project
5

Easy Diffusion

An easy 1-click way to create beautiful artwork on your PC using AI

Easy Diffusion is a widely used community-driven repository offering a simple, one-click way to install and use Stable Diffusion-based generative AI on a personal computer without advanced technical skills or prior setup. It provides a browser-based user interface that runs locally, allowing users to type text prompts and immediately generate images directly within their web browser, democratizing access to powerful text-to-image models for artists and hobbyists alike. The project abstracts away environment setup, dependencies, and model installation — tasks that can be daunting to beginners — and instead lets users focus on creative experimentation with prompt phrasing, model parameters, and image output settings. ...

Downloads: 47 This Week

Last Update: 2026-03-31
See Project
6

Diffusion Bee

Diffusion Bee is the easiest way to run Stable Diffusion locally

...Users can generate images from text prompts, perform image-to-image transformations, and apply additional features like inpainting, outpainting, and model-based upscaling directly within a clean graphical interface. It’s optimized for Apple hardware performance and can automatically manage features like ControlNet, LoRA models, and advanced prompt options without exposing complexity to the user.

Downloads: 19 This Week

Last Update: 2026-02-03
See Project
7

AUTOMATIC1111 Stable Diffusion web UI

Stable Diffusion web UI

AUTOMATIC1111's stable-diffusion-webui is a powerful, user-friendly web interface built on the Gradio library that allows users to easily interact with Stable Diffusion models for AI-powered image generation. Supporting both text-to-image (txt2img) and image-to-image (img2img) generation, this open-source UI offers a rich feature set including inpainting, outpainting, attention control, and multiple advanced upscaling options. With a flexible installation process across Windows, Linux, and Apple Silicon, plus support for GPUs and CPUs, it caters to a wide range of users—from hobbyists to professionals. ...

1 Review

Downloads: 313 This Week

Last Update: 2025-06-02
See Project
8

Stable Diffusion web UI for AMDGPUs

Stable Diffusion WebUI optimized for AMD GPUs with editing tools

Stable Diffusion WebUI AMDGPU is a browser-based interface for generating images using Stable Diffusion, built with Gradio and adapted for AMD graphics hardware. It provides both text-to-image and image-to-image workflows, allowing users to create, refine, and upscale visuals within a single interface. It includes tools such as inpainting and outpainting for editing specific areas of an image, along with features like prompt matrix generation and attention controls to fine-tune outputs. Users can emphasize or de-emphasize elements in prompts to influence results more precisely. ...

Downloads: 9 This Week

Last Update: 2026-03-19
See Project
9

Scribe.js

JavaScript OCR and text extraction for images and PDFs

Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. ...

Downloads: 1 This Week

Last Update: 2026-03-14
See Project
10

Markdown

WeChat Markdown Editor

WeChat Markdown Editor | A highly concise WeChat Markdown editor that supports Markdown syntax, color palette selection, multi-image upload, one-click document download, custom CSS styles, one-click reset, and other features. Markdown documents are automatically rendered into WeChat image-and-text articles in real time, so you no longer have to worry about the typesetting of WeChat articles! As long as you know basic Markdown syntax, you can make a simple and beautiful WeChat article. ...

Downloads: 2 This Week

Last Update: 2025-10-17
See Project
11

Jodit Editor 3

Best WYSIWYG Editor for You

An excellent WYSIWYG editor written in pure TypeScript without the use of additional libraries. It's a file editor and image editor.

Downloads: 5 This Week

Last Update: 2026-04-01
See Project
12

MCP Server Amazon Bedrock

Model Context Protocol (MCP) server for using Amazon Bedrock

The Amazon Bedrock MCP Server is an MCP server that integrates with Amazon Bedrock's Nova Canvas model for AI image generation. It allows users to generate high-quality images from text descriptions using Amazon's AI capabilities.

Downloads: 0 This Week

Last Update: 2025-04-08
See Project
13

Generative AI for Beginners (Version 3)

21 Lessons, Get Started Building with Generative AI

...Lessons are split into “Learn” modules for core concepts and “Build” modules with hands-on code in Python and TypeScript, so you can jump in at any point that matches your goals. The course covers everything from model selection, prompt engineering, and chat/text/image app patterns to secure development practices and UX for AI. It also walks through modern application techniques such as function calling, RAG with vector databases, working with open source models, agents, fine-tuning, and using SLMs. Each lesson includes a short video, a written guide, runnable samples for Azure OpenAI, the GitHub Marketplace Model Catalog, and the OpenAI API, plus a “Keep Learning” section for deeper study.

Downloads: 14 This Week

Last Update: 6 days ago
See Project
14

DiscordBotClient

A patched version of Discord, with bot login support

A patched version of Discord, with bot login support. Discord Bot Client allows you to use your bot, just like any other user account, except for Friends and Groups.

Downloads: 113 This Week

Last Update: 2026-03-15
See Project
15

mp-html

Mini program rich text component that supports rendering and editing HTML

A powerful rich text component for mini programs. It supports rendering and editing HTML and works on the WeChat, QQ, Baidu, Alipay, Toutiao, and uni-app platforms. Displaying dynamic HTML rich text is a necessary requirement for many applications, but the mini program platform does not support DOM operations, which makes this difficult. The built-in rich-text component supports few tags and blocks all events, making it hard to use in practice. Therefore, there is such a...

Downloads: 1 This Week

Last Update: 2025-12-14
See Project
16

ALLWEONE

AI tool that generates custom presentations with real-time editing

...You can define slide count, language, and tone, then review or edit the AI-generated outline before finalising. Slides are built in real time, allowing you to watch content develop as the system works. Presentation AI by ALLWEONE includes image generation, rich text editing, and drag-and-drop functionality for easy adjustments. It also supports presentation mode, so you can present directly within the app. Built with modern technologies like Next.js, React, and Tailwind CSS, it integrates AI services such as OpenAI for content generation. It is fully open source under the MIT licence, making it suitable for developers who want to customise or extend its capabilities.

Downloads: 4 This Week

Last Update: 14 hours ago
See Project
17

LandPPT

An LLM-based presentation generation platform

LandPPT is an open-source AI platform that automatically generates professional presentation slides using large language models. The system allows users to create complete PowerPoint presentations simply by entering a topic or uploading source documents such as PDFs, Word files, or Markdown notes. Using natural language processing and structured content generation, the platform produces presentation outlines and converts them into fully formatted slide decks. The application integrates...

Downloads: 9 This Week

Last Update: 2026-04-13
See Project
18

comfyui-mixlab-nodes

Workflow and speech recognition app

comfyui-mixlab-nodes is a large collection of custom nodes for ComfyUI that turns workflows into interactive apps and adds real-time multimedia, LLM, and TTS capabilities. It introduces a “Workflow-to-APP” concept, where a ComfyUI graph can be transformed into a Web App through an AppInfo node, complete with categories, batch prompts, and editable configurations. The project also brings Real-time Design features like screen capture and floating video nodes, enabling creative pipelines that...

Downloads: 4 This Week

Last Update: 2025-11-28
See Project
19

Node.js Client For NLP Cloud

NLP Cloud serves high performance pre-trained or custom models

...NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keyword and keyphrase extraction, text generation, image generation, blog post generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic similarity, tokenization, POS tagging, embeddings, and dependency parsing. It is ready for production and served through a REST API. You can use the NLP Cloud pre-trained models, fine-tune your own models, or deploy your own models.

Downloads: 0 This Week

Last Update: 2024-11-27
See Project
20

Fabric.js

Javascript Canvas Library and SVG-to-Canvas Parser

Fabric.js is a simple yet powerful JavaScript HTML5 canvas library that lets you work with the HTML5 canvas element in various ways. It is also an SVG-to-canvas (and vice versa) parser. Fabric provides an interactive object model on top of the canvas element, so you can create and populate objects on canvas; manipulate the size, position and rotation of these objects; and modify properties such as color, transparency and more. You could also group these objects together with just a simple...

Downloads: 7 This Week

Last Update: 3 days ago
See Project
21

PDFCraft

PDFCraft is a free, privacy-focused PDF toolkit

PDFCraft is an extensible toolkit for creating, editing, and transforming PDF documents with both a graphical interface and a scripting API, making it useful for users ranging from casual editors to automated document processors. At its core, the project provides a clean, modern UI where you can rearrange pages, annotate text, insert images, fill forms, and export to multiple formats, all without needing a heavyweight commercial PDF suite. But beyond manual editing, it also offers a...

Downloads: 8 This Week

Last Update: 2026-04-23
See Project
22

canvas-constructor

An ES6 utility for canvas with built-in functions and chained methods

An ES6 utility for canvas with built-in functions and chained methods. Alternatively, you can import canvas-constructor/browser. The accompanying example (code not shown here) creates a canvas 300 pixels wide and 300 pixels high, sets the color to #AEFD54, draws a rectangle in that color covering the pixels from (5, 5) to (290 + 5, 290 + 5), sets the color to #FFAE23, sets the font to 28-pixel Impact, writes the text 'Hello World!' at position (130, 150), and returns a buffer.

Downloads: 0 This Week

Last Update: 2024-05-22
See Project
23

carbon CLI

Beautiful images of your code, from right inside your terminal

carbon.now.sh is a wonderful tool that lets you generate beautiful images of your source code through an intuitive UI, while letting you customize aspects like fonts, themes, window controls and much more. carbon-now-cli gives you the full power of Carbon, right at your fingertips, inside the terminal. Generate beautiful images from a source file, or sections of a source file, by running a single command. Want to customize everything before generating the image? Run it in interactive mode. Downloads the real, high-quality image (no DOM screenshots). Detects file type automatically. Supports all file extensions supported by carbon.now.sh and more. Create and share beautiful images of your source code. Start typing or drop a file into the text area to get started. Displays image directly in supported terminals. ...

Downloads: 0 This Week

Last Update: 2024-12-12
See Project
24

DOCX Document Converter

Convert .docx to .md/.txt and .html. Free, unlimited, fast.

A simple, free, unlimited, secure web-based tool that converts Microsoft Word documents (.docx) into Markdown (.md/.txt) and HTML files. Perfect for developers, writers, and anyone who needs to transform .docx MS Office Word documents into web-friendly or AI context friendly formats. Unlike those other jerks on the web that charge many dollars per month for this, I made it free, unlimited and open source. This is a better version of 'convert docx to txt' since .md files can be opened...

Downloads: 15 This Week

Last Update: 2025-08-11
See Project
25

PyGPT

Open source personal AI Assistant for Linux, Windows and Mac

PyGPT is a desktop application that allows you to talk to OpenAI's LLMs such as GPT-4 and GPT-3 using your own computer and the OpenAI API. It allows you to talk in chat mode and in completion mode, as well as generate images using DALL-E 2. PyGPT also gives GPT access to the Internet via the Google Custom Search API and Wikipedia API, and includes voice synthesis using the Microsoft Azure Text-to-Speech API. Moreover, the application has implemented context memory support, context storage,...

Downloads: 3 This Week

Last Update: 2026-02-06
See Project