Search Results for "ocr and data exporter"

Showing 85 open source projects for "ocr and data exporter"

1

Node exporter

Exporter for machine metrics

Power your metrics and alerting with a leading open-source monitoring solution. Prometheus implements a highly dimensional data model. Time series are identified by a metric name and a set of key-value pairs. PromQL allows slicing and dicing of collected time series data in order to generate ad-hoc graphs, tables, and alerts. Prometheus has multiple modes for visualizing data: a built-in expression browser, Grafana integration, and a console template language. Prometheus stores time series...

Downloads: 12 This Week

Last Update: 2024-07-14
2

Rapid LaTeX OCR

Formula recognition based on LaTeX-OCR and ONNXRuntime

Formula recognition based on LaTeX-OCR and ONNXRuntime. rapid_latex_ocr is a tool that converts formula images to LaTeX format. The inference code in the repo is modified from LaTeX-OCR; the models have all been converted to ONNX format and the inference code has been simplified, so inference is faster and easier to deploy. The repo contains only code for ONNXRuntime- or OpenVINO-based inference on ONNX-format models and does not contain model training code. If you want to train your own model, please move...

Downloads: 4 This Week

Last Update: 2024-09-09
3

OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files

OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.

Downloads: 28 This Week

Last Update: 5 days ago
4

Super-PDF-Editor

World's most comprehensive, powerful, process-based PDF editor

... performs in PDF files, scanned PDF files, and any PDF files. OCR works on image files and supports multiple image formats. Auto and manual image enhancement for better OCR accuracy and quality. Supports 165+ languages with three language data sets. Use multiple languages at once. International languages: 127 languages; High, Medium, and Fast quality. Scanned images (JPG, PNG, GIF, TIFF, BMP), multi-page TIFF and GIF, scanned PDFs.

3 Reviews

Downloads: 40 This Week

Last Update: 2023-02-02
5

Texify

Math OCR model that outputs LaTeX and markdown

Texify is an OCR model that converts images or pdfs containing math into markdown and LaTeX that can be rendered by MathJax ($$ and $ are delimiters). It can run on CPU, GPU, or MPS.

Downloads: 2 This Week

Last Update: 3 days ago
6

Super-PDF-Editor-Lite

World's most comprehensive, powerful, process-based PDF editor

... Easy PDF imposition, booklet, N-up pages, and more. OCR works on PDF files, scanned PDF files, and any PDF files. OCR works on image files and supports multiple image formats. Auto and manual image enhancement for better OCR accuracy and quality. Supports 165+ languages with three language data sets. Use multiple languages at once. International languages: 127 languages; High, Medium, and Fast quality. Scanned images (JPG, PNG, GIF, TIFF, BMP), multi-page TIFF and GIF, scanned PDFs.

3 Reviews

Downloads: 23 This Week

Last Update: 2023-02-02
7

Paperless-ngx

A community-supported supercharged version of paperless

Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.

Downloads: 3 This Week

Last Update: 4 days ago
8

Doctor Dok

Doctor Dok is an AI based medical data framework

... - digitalized - accessible anywhere from mobile or desktop. Using AI, you can translate your health records into one of 50+ languages, making health services abroad more accessible. Doctor Dok uses AI to OCR even a barely readable photo of your health documents. It then stores the result in the cloud with a Zero Trust Security architecture (nobody but you can decrypt the data).

Downloads: 2 This Week

Last Update: 2024-10-18
9

docconv

Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text

A Go wrapper library to convert PDF, DOC, DOCX, XML, HTML, RTF, ODT, Pages documents and images (see optional dependencies below) to plain text. See go help install for details on the installation location of the installed docd executable. Make sure that the full path to the executable is in your PATH environment variable. To add image support to the docconv library you first need to install and build gosseract. Now you can add -tags ocr to any go command when building/fetching/testing docconv...

Downloads: 1 This Week

Last Update: 2023-10-30
10

Amplify

Automatic enrichment, enhancement, and explanation of your data

Amplify attaches afterburners to your data. Amplify explains metadata extraction, classification, tagging, and reporting. Enriches derivative data generation like thumbnails, previews, conversions, etc. Enhances batteries-included value-adds like data quality reports, image augmentation, OCR, translations, etc. Amplify leverages the decentralized compute provided by Bacalhau to magically enrich your data. A built-in suite of pipelines decides what your data is and how to best improve upon...

Downloads: 0 This Week

Last Update: 2023-07-28
11

Tarsier

Vision utilities for web interaction agents

... as buttons, links, or input fields that are visible on the page; Tarsier can also tag all textual elements if you pass tag_text_elements=True. Furthermore, we've developed an OCR algorithm to convert a page screenshot into a whitespace-structured string (almost like ASCII art) that an LLM even without vision can understand. Since current vision-language models still lack fine-grained representations needed for web interaction tasks, this is critical.

Downloads: 1 This Week

Last Update: 2024-09-20
12

EcoPaste

Open source clipboard management tools for Windows, macOS, and Linux

Open source clipboard management tools for Windows, macOS, and Linux. Built with Tauri, the application is lightweight and refined, consuming minimal resources. It also delivers a uniform user experience across Windows, macOS, and Linux. The application stays resident in the background, wakes up with one click through custom shortcut keys, saves time, and improves efficiency. Allows you to bookmark clipboard content for easy and fast access. Whether it's crucial data for work...

Downloads: 1 This Week

Last Update: 2 days ago
13

DeepDetect

Deep Learning API and Server in C++14 with support for Caffe, PyTorch

... of image tagging, object detection, segmentation, OCR, Audio, Video, Text classification, CSV for tabular data and time series. Neural network templates for the most effective architectures for GPU, CPU, and Embedded devices. Training in a few hours and with small data thanks to 25+ pre-trained models. Full Open Source, with an ecosystem of tools (API clients, video, annotation, ...) Fast Server written in pure C++, a single codebase for Cloud, Desktop & Embedded.

Downloads: 0 This Week

Last Update: 2024-01-10
14

DeckTape

PDF exporter for HTML presentations

DeckTape is a high-quality PDF exporter for HTML presentation frameworks. DeckTape is built on top of Puppeteer which relies on Google Chrome for laying out and rendering Web pages and provides a headless Chrome instance scriptable with a JavaScript API. DeckTape currently supports the following presentation frameworks out of the box. DeckTape also provides a generic command that works by emulating the end-user interaction, allowing it to be used to convert presentations from virtually any kind...

Downloads: 0 This Week

Last Update: 2024-08-16
15

Data Entry System v2

Framework with web data entry, OCR & designer

Framework with web data entry, verification, OCR & project designer. It works with Docker or a dedicated Debian server. Fast and optimized version.

Downloads: 0 This Week

Last Update: 2023-03-12
16

TagUI

Free RPA tool by AI Singapore

Write flows in simple TagUI language and automate away repetitive time-consuming tasks on your computer. Tasks include those on websites (native support for Chrome and Edge), desktop apps, or the command line. The TagUI project is open-source and free forever. It's easy to set up and use, and works on Windows, macOS and Linux. Besides English, flows can be written in 22 other languages, so you can do RPA using your native language. Check out this demo video automating data collection in 4...

Downloads: 3 This Week

Last Update: 2023-06-12
17

MySQL 2 Excel Exporter 3-105 [I.S.A]

MySQL 2 Excel: Exporter 3-105 [Improved.Simplified.Alternative]

'MySQL2Excel_Exporter' is a desktop application developed using Python 3.6.8 and other add-on libraries. The application exports MySQL tables as an Excel file. MySQL2Excel_Exporter has two parts: 1) Export - converts all records in a MySQL table into an Excel file 2) Export Filter - converts selected records in a MySQL table into an Excel file. Compatible only with Windows.

Downloads: 0 This Week

Last Update: 2023-06-15
18

Ox-Hugo

A carefully crafted Org exporter back-end for Hugo

ox-hugo is an Org exporter backend that exports Org to Hugo-compatible Markdown (Blackfriday) and also generates the front-matter (in TOML or YAML format). The ox-hugo backend extends from a parent backend ox-blackfriday.el. The latter is the one that primarily does the Blackfriday-friendly Markdown content generation. The main job of ox-hugo is to generate the front-matter for each exported content file, and then append that generated Markdown to it. There are, though, few functions that ox...

Downloads: 0 This Week

Last Update: 2022-10-12
19

tabtoy

High performance tabular data exporter

High-performance tabular data export tool. Supports mixed XLSX/CSV input of tabular data. Supports JSON/Golang/C#/Java/Lua/binary source, data, and type output. Automatic cell data format checking, with errors reported down to the cell. Supports predefined enumerations, including Chinese enumeration types. Supports table splitting and multi-person collaboration. Supports KV configuration tables, making it convenient to use a table as a configuration file. Multi-core concurrent export, cache acceleration, export...

Downloads: 0 This Week

Last Update: 2022-06-09
20

NAPS2 - Not Another PDF Scanner

Scan documents to PDF and other file types, as simply as possible.

Visit NAPS2's home page at www.naps2.com. NAPS2 is a document scanning application with a focus on simplicity and ease of use. Scan your documents from WIA- and TWAIN-compatible scanners, organize the pages as you like, and save them as PDF, TIFF, JPEG, PNG, and other file formats. Available on Windows, Mac, and Linux. NAPS2 is currently available in over 40 different languages. Want to see NAPS2 in your preferred language? Help translate! See the wiki for more details.

145 Reviews

Downloads: 822 This Week

Last Update: 2024-10-06
21

Form OCR Testing Tool

A set of tools to use in Microsoft Azure Form Recognizer

An open source labeling tool for Form Recognizer, part of the Form OCR Test Toolset (FOTT). This is the MAIN branch of the tool. It contains all the newest features available. This is NOT the most stable version since this is a preview. The purpose of this repo is to allow customers to test the tools available when working with Microsoft Forms and OCR services. Currently, the labeling tool is the first tool we present here. Users can provide feedback, and make customer-specific changes to meet...

Downloads: 0 This Week

Last Update: 2023-06-12
22

Tile Pattern Exporter

Tile large format PNG patterns into print-at-home PDF pages

You can tile large format PNG patterns into print-at-home PDF pages. Created for LearnMYOG. This set of scripts automates the tiling of large format PNG files into letter (A4), tabloid (A3), and A0 sized PDF pages, adding print margins, alignment and cut guides, page numbers, and a copyright stamp to each page. For best results, input an exported PNG with size in multiples of 7.5 inches wide and 10 inches tall @ 300dpi.

Downloads: 0 This Week

Last Update: 2023-02-02
23

Super PDF Editor Lite

Create, Edit, Delete, Organize, Convert, Export, Secure & Sign.

Super PDF Editor Lite is a robust and versatile PDF management software designed to streamline your document handling needs. Whether you're an individual, student, or professional, this software offers a comprehensive suite of tools to create, edit, and manage your PDFs with ease. Key Features: Extract Page: Easily extract specific pages from a PDF document. Split Page: Divide a single PDF page into multiple smaller pages. Rotate Page: Rotate pages to adjust their orientation. Merge...

6 Reviews

Downloads: 17 This Week

Last Update: 2024-08-11
24

LayoutParser

A Unified Toolkit for Deep Learning Based Document Image Analysis

With the help of state-of-the-art deep learning models, Layout Parser enables extracting complicated document structures using only several lines of code. This method is also more robust and generalizable as no sophisticated rules are involved in this process. A complete instruction for installing the main Layout Parser library and auxiliary components. Learn how to load DL Layout models and use them for layout detection. The full list of layout models currently available in Layout Parser....

Downloads: 0 This Week

Last Update: 2022-08-04
25

DocWire SDK

Award-winning modern data processing in C++17/20

DocWire SDK, a standout C++17/20 data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. The upcoming integration of C++17 and C++20 will bring advanced functionalities, particularly in areas like HTTP capabilities and web data extraction. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document...

2 Reviews

Downloads: 6 This Week

Last Update: 2023-11-13