image to text free download

Showing 105 open source projects for "image to text"

1

Image to Text

Convert an image to text to spot intelligible words.

The program converts an image, such as a photo, to text and analyzes the result to spot intelligible words. Use the program with photos of clouds, sea, soil, vegetation or any other photo of a natural or man-made semi-homogeneous configuration to reveal the hidden universal-philosophical messages of the image. You can also use it on photos of people or art pieces to gain psychological insight into the person portrayed or the image's author. The resulting text will be a long string...

Downloads: 1 This Week

Last Update: 2023-08-23
See Project
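The listing does not say how Image to Text maps pixels to characters or how it recognizes words, so the following is only a minimal sketch of the general idea under stated assumptions: raw pixel bytes are folded onto the letters a-z with a modulo, and the resulting string is scanned for entries from a small word list. The function names `bytes_to_letters` and `find_words`, the modulo mapping, and the vocabulary are all hypothetical and are not taken from the project.

```python
import string


def bytes_to_letters(pixels: bytes) -> str:
    """Fold each byte onto a lowercase letter (assumed mapping: value mod 26)."""
    return "".join(string.ascii_lowercase[b % 26] for b in pixels)


def find_words(text: str, vocabulary: set, min_len: int = 3) -> list:
    """Return the vocabulary words (at least min_len long) that occur in text."""
    return sorted(
        word for word in vocabulary
        if len(word) >= min_len and word in text
    )


# Hypothetical pixel data standing in for bytes read from an image file.
pixels = bytes([2, 0, 19, 25, 3, 14, 6])
letters = bytes_to_letters(pixels)
words = find_words(letters, {"cat", "dog", "bird"})
```

With real photos the letter string would be very long, so a production tool would likely need a larger dictionary and a faster search than repeated substring tests.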
2

Minimal text diffusion

A minimal implementation of diffusion models for text generation

A minimal implementation of diffusion models of text: it learns a diffusion model of a given text corpus, so you can generate text samples from the learned model. The main idea was to retain just enough code to allow training a simple diffusion model and generating samples, remove image-related terms, and make it easier to use. To train a model, run scripts/train.sh. By default, this will train a model on the simple corpus. However, you can change this to any text file using the --train_data...

Downloads: 0 This Week

Last Update: 2023-03-23
See Project
3

Text to Image

Turn text into an image to spot hidden shapes (pareidolias)

The app reads content from a text file and converts it to a BMP image. You can have fun trying to spot recognizable shapes (pareidolias) in the resulting image.

Downloads: 0 This Week

Last Update: 2023-06-23
See Project
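For the Text to Image entry above, the text-to-BMP conversion can be sketched with the standard library alone. This is an illustration, not the app's actual method: it assumes one grayscale pixel per UTF-8 byte on a roughly square canvas, and the function name `text_to_bmp` is hypothetical.

```python
import math
import struct


def text_to_bmp(text: str) -> bytes:
    """Encode text as an 8-bit grayscale BMP, one pixel per UTF-8 byte."""
    data = text.encode("utf-8")
    # Roughly square canvas; at least 1x1 so empty input still yields a file.
    width = max(1, math.ceil(math.sqrt(len(data))))
    height = max(1, math.ceil(len(data) / width))
    # Each BMP row is padded to a multiple of 4 bytes.
    row_size = (width + 3) & ~3
    rows = [
        data[y * width:(y + 1) * width].ljust(row_size, b"\x00")
        for y in range(height)
    ]
    # BMP stores rows bottom-up, so reverse to keep the first row on top.
    pixels = b"".join(reversed(rows))
    # 256-entry grayscale palette: blue, green, red, reserved.
    palette = b"".join(bytes((i, i, i, 0)) for i in range(256))
    offset = 14 + 40 + len(palette)
    file_header = struct.pack(
        "<2sIHHI", b"BM", offset + len(pixels), 0, 0, offset
    )
    info_header = struct.pack(
        "<IiiHHIIiiII",
        40, width, height, 1, 8, 0, len(pixels), 2835, 2835, 256, 0,
    )
    return file_header + info_header + palette + pixels


bmp = text_to_bmp("hello world")
```

Writing the returned bytes to a file opened in binary mode (for example `out.bmp`) gives an image that standard viewers should open; other mappings, such as three bytes per RGB pixel, would work equally well.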
4

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files

OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.

Downloads: 31 This Week

Last Update: 3 days ago
See Project
5

EasyOCR

Ready-to-use OCR with 80+ supported languages

Ready-to-use OCR with 80+ supported languages and all popular writing scripts, including Latin, Chinese, Arabic, Devanagari, and Cyrillic. EasyOCR is a Python module for extracting text from images. It is a general OCR that can read both natural scene text and dense text in documents. We are currently supporting 80+ languages and expanding. Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first...

Downloads: 21 This Week

Last Update: 2024-09-24
See Project
6

Label Studio

Label Studio is a multi-type data labeling and annotation tool

The most flexible data annotation tool. Quickly installable. Build custom UIs or use pre-built labeling templates. Detect objects in images; bounding boxes, polygons, circles, and keypoints are supported. Partition an image into multiple segments. Use ML models to pre-label and optimize the process. Label Studio is an open-source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can...

Downloads: 11 This Week

Last Update: 2024-08-20
See Project
7

Dream Textures

Stable Diffusion built-in to Blender

Create textures, concept art, background assets, and more with a simple text prompt. Use the 'Seamless' option to create textures that tile perfectly with no visible seam. Texture entire scenes with 'Project Dream Texture' and depth to image. Re-style animations with the Cycles render pass. Run the models on your machine to iterate without slowdowns from a service. Create textures, concept art, and more with text prompts. Learn how to use the various configuration options to get exactly what...

Downloads: 9 This Week

Last Update: 2024-08-26
See Project
8

PyGPT

Open source personal AI Assistant for Linux, Windows and Mac

PyGPT is a desktop application that lets you talk to OpenAI's LLMs, such as GPT-4 and GPT-3, using your own computer and the OpenAI API. It lets you talk in chat mode and in completion mode, as well as generate images using DALL-E 2. PyGPT also gives GPT access to the Internet via the Google Custom Search API and the Wikipedia API, and includes voice synthesis using the Microsoft Azure Text-to-Speech API. Moreover, the application has implemented context memory support, context storage, history...

Downloads: 16 This Week

Last Update: 2024-08-29
See Project
9

InvokeAI

InvokeAI is a leading creative engine for Stable Diffusion models

InvokeAI is an implementation of Stable Diffusion, the open source text-to-image and image-to-image generator. It provides a streamlined process with various new features and options to aid the image generation process. It runs on Windows, Mac and Linux machines, and runs on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies...

2 Reviews

Downloads: 12 This Week

Last Update: 2 hours ago
See Project
10

PaddleOCR

Awesome multilingual OCR toolkits based on PaddlePaddle

PaddleOCR offers exceptional, multilingual, and practical Optical Character Recognition (OCR) tools that can help users train better models and apply them in practice. Inspired by PaddlePaddle, PaddleOCR is an ultra lightweight OCR system, with multilingual recognition, digit recognition, vertical text recognition, as well as long text recognition. It features a PPOCR series of high-quality pre-trained models, which includes: ultra lightweight ppocr_mobile series models, general ppocr_server...

Downloads: 7 This Week

Last Update: 2024-10-22
See Project
11

Stable-Dreamfusion

Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion

A PyTorch implementation of the text-to-3D model Dreamfusion, powered by the Stable Diffusion text-to-2D model. This project is a work in progress and differs from the paper in many ways. The current generation quality cannot match the results from the original paper, and many prompts still fail badly! Since the Imagen model is not publicly available, we use Stable Diffusion to replace it (implementation from diffusers). Unlike Imagen, Stable-Diffusion is a latent diffusion...

Downloads: 5 This Week

Last Update: 2023-05-15
See Project
12

StoryTeller

Multimodal AI Story Teller, built with Stable Diffusion, GPT, etc.

A multimodal AI story teller, built with Stable Diffusion, GPT, and neural text-to-speech (TTS). Given a prompt as an opening line of a story, GPT writes the rest of the plot; Stable Diffusion draws an image for each sentence; a TTS model narrates each line, resulting in a fully animated video of a short story, replete with audio and visuals. To develop locally, install dev dependencies and install pre-commit hooks. This will automatically trigger linting and code quality checks before each...

Downloads: 11 This Week

Last Update: 2023-08-22
See Project
13

Make-A-Video - Pytorch (wip)

Implementation of Make-A-Video, new SOTA text to video generator

Implementation of Make-A-Video, the new SOTA text-to-video generator from Meta AI, in PyTorch. They combine pseudo-3D convolutions (axial convolutions) and temporal attention and show much better temporal fusion. Pseudo-3D convolutions aren't a new concept. They have been explored before in other contexts, say for protein contact prediction as "dimensional hybrid residual networks". The gist of the paper comes down to this: take a SOTA text-to-image model (here they use DALL-E2, but the same learning...

Downloads: 5 This Week

Last Update: 2024-05-03
See Project
14

Phenaki - Pytorch

Implementation of Phenaki Video, which uses Mask GIT

... on text-to-image and then text-to-video. Similarly, for unconditional training, the researcher should be able to first train on images and then fine tune on video.

Downloads: 3 This Week

Last Update: 2024-07-29
See Project
15

Extract TOTP/HOTP secrets

Extract one time password (OTP) secrets from QR codes

The Python script extract_otp_secrets.py extracts one-time password (OTP) secrets from QR codes exported by two-factor authentication (2FA) apps such as "Google Authenticator".

Downloads: 4 This Week

Last Update: 6 days ago
See Project
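As context for the OTP entry above: once a QR code has been decoded to a string, a plain `otpauth://` Key URI can be split into its parts with the standard library. This sketch assumes that simple URI form only; it does not decode the export payloads produced by authenticator apps, which is what extract_otp_secrets.py is described as handling, and the function name `parse_otpauth_uri` and the sample URI are hypothetical.

```python
from urllib.parse import parse_qs, unquote, urlparse


def parse_otpauth_uri(uri: str) -> dict:
    """Split a plain otpauth:// URI into type, label, secret and issuer."""
    parsed = urlparse(uri)
    if parsed.scheme != "otpauth":
        raise ValueError("not an otpauth:// URI")
    params = parse_qs(parsed.query)
    if "secret" not in params:
        raise ValueError("URI has no secret parameter")
    return {
        "type": parsed.netloc,  # "totp" or "hotp"
        "label": unquote(parsed.path.lstrip("/")),
        "secret": params["secret"][0],  # Base32-encoded shared secret
        "issuer": params.get("issuer", [None])[0],
    }


# Made-up example URI; the secret is a commonly used documentation value.
entry = parse_otpauth_uri(
    "otpauth://totp/Example:alice%40example.com"
    "?secret=JBSWY3DPEHPK3PXP&issuer=Example"
)
```

Treat any extracted secret like a password: anyone who holds it can generate valid codes for the account.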
16

DALL-E 2 - Pytorch

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch. The main novelty seems to be an extra layer of indirection with the prior network (whether it is an autoregressive transformer or a diffusion network), which predicts an image embedding based on the text embedding from CLIP. Specifically, this repository will only build out the diffusion prior network, as it is the best performing variant (but which incidentally involves a causal transformer...

Downloads: 1 This Week

Last Update: 2023-10-19
See Project
17

LlamaParse

Parse files for optimal RAG

LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured and semi-structured to structured data (APIs, PDFs, documents, SQL, etc.). Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL db providers.

Downloads: 2 This Week

Last Update: 6 days ago
See Project
18

Django MarkdownX

Comprehensive Markdown plugin built for Django

Django MarkdownX is a comprehensive Markdown plugin built for Django, the renowned high-level Python web framework, with flexibility, extensibility, and ease-of-use at its core.

Downloads: 0 This Week

Last Update: 2024-09-25
See Project
19

LinkChecker

Check links in web documents or full websites

LinkChecker is a free, GPL-licensed website validator. LinkChecker checks links in web documents or full websites. It requires Python 3.8 or later. The version in the pip repository may be old. To find out how to get the latest code, plus platform-specific information and other advice, see doc/install.txt in the source code archive. If you do not want to install any additional libraries or dependencies, you can use the Docker image, which is published on GitHub Packages.

Downloads: 2 This Week

Last Update: 2024-09-03
See Project
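The first step of any link check, collecting the URLs a page refers to, can be sketched with the standard library. This is not LinkChecker's implementation: the class `LinkCollector`, the helper `collect_links`, and the choice to follow only `href` and `src` attributes are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href/src attribute values, resolved against a base URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative references such as "/docs" or "logo.png".
                self.links.append(urljoin(self.base_url, value))


def collect_links(html: str, base_url: str) -> list:
    """Return every link found in html, in document order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


page = (
    '<a href="/docs">Docs</a>'
    '<img src="logo.png">'
    '<a href="https://example.org/x">X</a>'
)
links = collect_links(page, "https://example.com/site/index.html")
```

A checker would then request each collected URL and record the response status; that step is omitted here because it needs network access.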
20

ChatGPT Discord Bot

Integrate ChatGPT into your own discord bot

... by modifying the content in system_prompt.txt. All the text in the file will be fired as a prompt to the bot. Get the first message from ChatGPT in your discord channel!

Downloads: 2 This Week

Last Update: 2024-05-30
See Project
21

OpenCLIP

An open source implementation of CLIP

The goal of this repository is to enable training models with contrastive image-text supervision and to investigate their properties such as robustness to distribution shift. Our starting point is an implementation of CLIP that matches the accuracy of the original CLIP models when trained on the same dataset. Specifically, a ResNet-50 model trained with our codebase on OpenAI's 15 million image subset of YFCC achieves 32.7% top-1 accuracy on ImageNet. OpenAI's CLIP model reaches 31.3% when...

Downloads: 1 This Week

Last Update: 3 days ago
See Project
22

Karlo

Text-conditional image generation model based on OpenAI's unCLIP

Karlo is a text-conditional image generation model based on OpenAI's unCLIP architecture, with an improved version of the standard super-resolution model that upscales from 64px to 256px and recovers high-frequency details in only a small number of denoising steps. We train all components from scratch on 115M image-text pairs including COYO-100M, CC3M, and CC12M. In the case of Prior and Decoder, we use ViT-L/14 provided by OpenAI’s CLIP repository. Unlike the original implementation of unCLIP, we replace...

Downloads: 0 This Week

Last Update: 2023-06-08
See Project
23

Stable Diffusion in Docker

Run the Stable Diffusion releases in a Docker container

... a suitable GPU you can set the options --device cpu and --onnx instead. Since it uses the model, you will need to create a user access token in your Huggingface account. Save the user access token in a file called token.txt and make sure it is available when building the container. Create an image from an existing image and a text prompt. Modify an existing image with its depth map and a text prompt.

Downloads: 0 This Week

Last Update: 2023-09-22
See Project
24

Imagen - Pytorch

Implementation of Imagen, Google's Text-to-Image Neural Network

Implementation of Imagen, Google's Text-to-Image Neural Network that beats DALL-E2, in Pytorch. It is the new SOTA for text-to-image synthesis. Architecturally, it is actually much simpler than DALL-E2. It consists of a cascading DDPM conditioned on text embeddings from a large pre-trained T5 model (attention network). It also contains dynamic clipping for improved classifier-free guidance, noise level conditioning, and a memory-efficient unit design. It appears neither CLIP nor prior network...

Downloads: 0 This Week

Last Update: 2024-10-07
See Project
25

Quote2Image

A Python library for turning text quotes into graphical images

A Python library for turning text quotes into graphical images. Generate an image using an RGB background and foreground. The package comes with a built-in GenerateColors function that generates a fg and bg color with the correct amount of luminosity and returns them in tuples. Generate an image using a custom background image. We can generate...

Downloads: 0 This Week

Last Update: 2023-03-23
See Project