image to text free download

Showing 141 open source projects for "image to text"

1

Image to Text

Convert an image to text to spot intelligible words.

The program converts an image, such as a photo, to text in order to analyze it for intelligible words. Use the program with photos of clouds, sea, soil, vegetation, or any other photo of a natural or man-made semi-homogeneous configuration to reveal the hidden universal-philosophical messages of the image. You can also use it on photos of people or art pieces to gain psychological insight into the person portrayed or the image's author. The resulting text will be a long string...

Downloads: 1 This Week

Last Update: 2023-08-23
See Project
2

Minimal text diffusion

A minimal implementation of diffusion models for text generation

A minimal implementation of diffusion models of text: it learns a diffusion model of a given text corpus, so you can generate text samples from the learned model. The main idea was to retain just enough code to allow training a simple diffusion model and generating samples, remove image-related terms, and make it easier to use. To train a model, run scripts/train.sh. By default, this will train a model on the simple corpus. However, you can change this to any text file using the --train_data...

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
3

Image To Text tools

ITTT is a free tool designed to scan and extract text from images.

Image To Text Tools is a 100% free, user-friendly tool designed to scan images and extract the text they contain into editable text formats. Whether you need to extract text from scanned documents, photographs, or other image files, Image To Text Tools provides accurate and reliable Optical Character Recognition (OCR) capabilities to meet your needs.

Downloads: 102 This Week

Last Update: 2024-02-21
See Project
4

Tesseract OCR

Open Source OCR Engine

Tesseract is an open source optical character recognition (OCR) engine and command line program. OCR is a technology that allows for the recognition of text characters within a digital image. The latest version of Tesseract places a greater focus on line recognition; however, it still supports the legacy Tesseract OCR engine, which recognizes character patterns. Tesseract can recognize over 100 languages out-of-the-box, and can be trained to recognize other languages. It supports...

Downloads: 1,323 This Week

Last Update: 2024-06-21
See Project
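
Since Tesseract is described above as a command line program, a minimal Python sketch of calling it might look like the following. It assumes the `tesseract` executable is installed and on the PATH, and it uses the standard `tesseract <image> stdout -l <lang>` invocation; the helper names `build_tesseract_command` and `image_to_text` are hypothetical and are not part of Tesseract itself.

```python
import shutil
import subprocess
from pathlib import Path


def build_tesseract_command(image_path, lang="eng"):
    """Return the argument list for a one-shot Tesseract run (hypothetical helper)."""
    # "stdout" as the output base tells Tesseract to print the recognized text
    # instead of writing it to a file; -l selects the language model.
    return ["tesseract", str(Path(image_path)), "stdout", "-l", lang]


def image_to_text(image_path, lang="eng"):
    """Run Tesseract on an image and return the recognized text (hypothetical helper)."""
    # Assumption: the tesseract binary is installed separately and is on PATH.
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `image_to_text("scan.png")` would return whatever text Tesseract recognizes in that file; the command is built separately so it can be inspected without running the engine.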
5

Tesseract.js

A pure JavaScript multilingual OCR

Tesseract.js is a pure JavaScript port of the popular Tesseract OCR engine. The Tesseract.js library supports more than 100 languages, automatic text orientation and script detection, and a simple interface for reading paragraph, word, and character bounding boxes. Tesseract.js can run either in a browser or on a server with Node.js. Tesseract.js is a JavaScript library that gets words in almost any spoken language out of images. The main Tesseract.js functions (e.g. recognize, detect) take an image...

Downloads: 24 This Week

Last Update: 2024-08-24
See Project
6

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files

OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.

Downloads: 31 This Week

Last Update: 3 days ago
See Project
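
As a rough illustration of the OCRmyPDF workflow described above, the sketch below builds and runs the basic `ocrmypdf input.pdf output.pdf` command from Python. It assumes the `ocrmypdf` command line tool is installed and on the PATH; the helper names `build_ocrmypdf_command` and `add_text_layer` are hypothetical, and only the input file, output file, and `-l` language option are used.

```python
import shutil
import subprocess


def build_ocrmypdf_command(input_pdf, output_pdf, language=None):
    """Return the argument list for a basic OCRmyPDF run (hypothetical helper)."""
    cmd = ["ocrmypdf"]
    if language:
        # -l selects the OCR language(s), e.g. "eng" or "eng+fra".
        cmd += ["-l", language]
    # Options come first, then the scanned input PDF and the searchable output PDF.
    cmd += [str(input_pdf), str(output_pdf)]
    return cmd


def add_text_layer(input_pdf, output_pdf, language=None):
    """Run OCRmyPDF to write a searchable copy of a scanned PDF (hypothetical helper)."""
    # Assumption: the ocrmypdf executable is installed separately and is on PATH.
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf executable not found on PATH")
    subprocess.run(build_ocrmypdf_command(input_pdf, output_pdf, language), check=True)
```

For example, `add_text_layer("scan.pdf", "scan_ocr.pdf", language="eng")` would leave the original file untouched and write the searchable version to the second path.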
7

Stable Diffusion v 2.1 web UI

Lightweight Stable Diffusion v 2.1 web UI: txt2img, img2img, depth2img

Lightweight Stable Diffusion v 2.1 web UI: txt2img, img2img, depth2img, inpaint, and upscale4x. Gradio app for Stable Diffusion 2 by Stability AI. It uses the Hugging Face Diffusers implementation. Currently supported pipelines are text-to-image, image-to-image, inpainting, upscaling, and depth-to-image.

Downloads: 12 This Week

Last Update: 2023-03-22
See Project
8

EasyOCR

Ready-to-use OCR with 80+ supported languages

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc. EasyOCR is a Python module for extracting text from images. It is a general OCR that can read both natural scene text and dense text in documents. We currently support 80+ languages and are expanding. Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first...

Downloads: 21 This Week

Last Update: 2024-09-24
See Project
9

Label Studio

Label Studio is a multi-type data labeling and annotation tool

The most flexible data annotation tool. Quickly installable. Build custom UIs or use pre-built labeling templates. Detect objects in images; bounding boxes, polygons, circles, and keypoints are supported. Partition an image into multiple segments. Use ML models to pre-label and optimize the process. Label Studio is an open-source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can...

Downloads: 11 This Week

Last Update: 2024-08-20
See Project
10

Dream Textures

Stable Diffusion built-in to Blender

Create textures, concept art, background assets, and more with a simple text prompt. Use the 'Seamless' option to create textures that tile perfectly with no visible seam. Texture entire scenes with 'Project Dream Texture' and depth to image. Re-style animations with the Cycles render pass. Run the models on your machine to iterate without slowdowns from a service. Create textures, concept art, and more with text prompts. Learn how to use the various configuration options to get exactly what...

Downloads: 9 This Week

Last Update: 2024-08-26
See Project
11

PyGPT

Open source personal AI Assistant for Linux, Windows and Mac

PyGPT is a desktop application that allows you to talk to OpenAI's LLM models such as GPT4 and GPT3 using your own computer and OpenAI API. It allows you to talk in chat mode and in completion mode, as well as generate images using DALL-E 2. PyGPT also adds access to the Internet for GPT via Google Custom Search API and Wikipedia API and includes voice synthesis using Microsoft Azure Text-to-Speech API. Moreover, the application has implemented context memory support, context storage, history...

Downloads: 16 This Week

Last Update: 2024-08-29
See Project
12

InvokeAI

InvokeAI is a leading creative engine for Stable Diffusion models

InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and runs on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies...

2 Reviews

Downloads: 12 This Week

Last Update: 2 hours ago
See Project
13

Intelligent Java

Integrate with the latest language models, image generation and speech

Intelligent Java (IntelliJava) is the ultimate tool to integrate with the latest language models and deep learning frameworks using Java. The library provides intuitive functions for sending input to models like ChatGPT and DALL·E, and receiving generated text, speech or images. With just a few lines of code, you can easily access the power of cutting-edge AI models to enhance your projects. Access ChatGPT, GPT3 to generate text and DALL·E to generate images. OpenAI is preferred for quality...

Downloads: 7 This Week

Last Update: 2023-04-16
See Project
14

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle

PaddleOCR offers exceptional, multilingual, and practical Optical Character Recognition (OCR) tools that can help users train better models and put them into practice. Inspired by PaddlePaddle, PaddleOCR is an ultra-lightweight OCR system with multilingual recognition, digit recognition, vertical text recognition, and long text recognition. It features a PPOCR series of high-quality pre-trained models, which includes: ultra lightweight ppocr_mobile series models, general ppocr_server...

Downloads: 7 This Week

Last Update: 2024-10-22
See Project
15

Stable-Dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion

A PyTorch implementation of the text-to-3D model Dreamfusion, powered by the Stable Diffusion text-to-2D model. This project is a work-in-progress and differs from the paper in many ways. The current generation quality cannot match the results from the original paper, and many prompts still fail badly! Since the Imagen model is not publicly available, we use Stable Diffusion to replace it (implementation from diffusers). Unlike Imagen, Stable-Diffusion is a latent diffusion...

Downloads: 5 This Week

Last Update: 2023-05-15
See Project
16

StoryTeller

Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.

A multimodal AI story teller, built with Stable Diffusion, GPT, and neural text-to-speech (TTS). Given a prompt as an opening line of a story, GPT writes the rest of the plot; Stable Diffusion draws an image for each sentence; a TTS model narrates each line, resulting in a fully animated video of a short story, replete with audio and visuals. To develop locally, install dev dependencies and install pre-commit hooks. This will automatically trigger linting and code quality checks before each...

Downloads: 11 This Week

Last Update: 2023-08-22
See Project
17

Make-A-Video - Pytorch (wip)

Implementation of Make-A-Video, new SOTA text to video generator

Implementation of Make-A-Video, a new SOTA text-to-video generator from Meta AI, in PyTorch. They combine pseudo-3D convolutions (axial convolutions) and temporal attention and show much better temporal fusion. Pseudo-3D convolutions aren't a new concept. They have been explored before in other contexts, say for protein contact prediction as "dimensional hybrid residual networks". The gist of the paper comes down to this: take a SOTA text-to-image model (here they use DALL-E2, but the same learning...

Downloads: 5 This Week

Last Update: 2024-05-03
See Project
18

Phenaki - Pytorch

Implementation of Phenaki Video, which uses Mask GIT

... on text-to-image and then text-to-video. Similarly, for unconditional training, the researcher should be able to first train on images and then fine tune on video.

Downloads: 3 This Week

Last Update: 2024-07-29
See Project
19

OpenAI DALL·E AsyncImage SwiftUI

OpenAI Swift async text to image for SwiftUI app using OpenAI

SwiftUI views that asynchronously load and display an OpenAI image from open API. You just type in your idea and AI will give you an art solution. DALL-E and DALL-E 2 are deep learning models developed by OpenAI to generate digital images from natural language descriptions, called "prompts". You need to have Xcode 13 installed in order to have access to Documentation Compiler (DocC). OpenAI's text-to-image model DALL-E 2 is a recent example of diffusion models. It uses diffusion models...

Downloads: 1 This Week

Last Update: 2024-09-20
See Project
20

OpenAI Web Application

A web application that allows users to interact with OpenAI's models

A web application that allows users to interact with OpenAI's models through a simple and user-friendly interface. This app is for demo purposes, to test the OpenAI API, and may contain issues/bugs. User-friendly interface for making requests to the OpenAI API. Responses are displayed in a chat-like format. Select Models (Davinci, Codex, DALL·E, Whisper) based on your needs. Create AI Images (DALL·E). Audio-Text Transcribe (Whisper). Highlight code syntax. Type in the input field and press enter...

Downloads: 2 This Week

Last Update: 2023-03-23
See Project
21

DALL-E 2 - Pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch. The main novelty seems to be an extra layer of indirection with the prior network (whether it is an autoregressive transformer or a diffusion network), which predicts an image embedding based on the text embedding from CLIP. Specifically, this repository will only build out the diffusion prior network, as it is the best performing variant (but which incidentally involves a causal transformer...

Downloads: 1 This Week

Last Update: 2023-10-19
See Project
22

Super-PDF-Editor

World's most comprehensive, powerful, process-based PDF editor

World's most comprehensive, powerful, process-based and lightning-fast PDF reader, editor and batch processor. PDF editing with 60+ feature-rich tools and functions, such as OCR for PDFs and images, producing output such as searchable PDF, text, hOCR, Box, and UNLV. It also offers image enhancement before OCR for better OCR performance, PDF imposition, etc. Super PDF Editor is best for bulk PDF processing, especially for the printing industry. Easy PDF imposition, booklet, n-up pages, and more. OCR...

3 Reviews

Downloads: 39 This Week

Last Update: 2023-02-02
See Project
23

LlamaParse

Parse files for optimal RAG

LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured and semi-structured to structured data (APIs, PDFs, documents, SQL, etc.). Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL db providers.

Downloads: 2 This Week

Last Update: 6 days ago
See Project
24

Venom

Venom is the most complete JavaScript library for WhatsApp

Venom is a high-performance system developed with JavaScript to create a bot for WhatsApp, with support for creating any interaction, such as customer service, media sending, sentence recognition based on artificial intelligence and all types of design architecture for WhatsApp. It's a high-performance alternative API to WhatsApp; you can send text messages, files, images, videos and more. Remember, the API was developed on a platform called RESTful Web services, providing interoperability between...

Downloads: 2 This Week

Last Update: 2024-09-26
See Project
25

ChatGPT Discord Bot

Integrate ChatGPT into your own Discord bot

... by modifying the content in system_prompt.txt. All the text in the file will be fired as a prompt to the bot. Get the first message from ChatGPT in your Discord channel!

Downloads: 2 This Week

Last Update: 2024-05-30
See Project