Showing 85 open source projects for "ocr and data exporter"

View related business solutions
  • Bright Data - All in One Platform for Proxies and Web Scraping Icon
    Bright Data - All in One Platform for Proxies and Web Scraping

    Say goodbye to blocks, restrictions, and CAPTCHAs

    Bright Data offers the highest quality proxies with automated session management, IP rotation, and advanced web unlocking technology. Enjoy reliable, fast performance with easy integration, a user-friendly dashboard, and enterprise-grade scaling. Powered by ethically-sourced residential IPs for seamless web scraping.
    Get Started
  • Top-Rated Free CRM Software Icon
    Top-Rated Free CRM Software

    216,000+ customers in over 135 countries grow their businesses with HubSpot

    HubSpot is an AI-powered customer platform with all the software, integrations, and resources you need to connect your marketing, sales, and customer service. HubSpot's connected platform enables you to grow your business faster by focusing on what matters most: your customers.
    Get started free
  • 1
    Node exporter

    Node exporter

    Exporter for machine metrics

    Power your metrics and alerting with a leading open-source monitoring solution. Prometheus implements a highly dimensional data model. Time series are identified by a metric name and a set of key-value pairs. PromQL allows slicing and dicing of collected time series data in order to generate ad-hoc graphs, tables, and alerts. Prometheus has multiple modes for visualizing data: a built-in expression browser, Grafana integration, and a console template language. Prometheus stores time series...
    Downloads: 12 This Week
    Last Update:
    See Project
  • 2
    Rapid LaTeX OCR

    Rapid LaTeX OCR

    Formula recognition based on LaTeX-OCR and ONNXRuntime

    Formula recognition based on LaTeX-OCR and ONNXRuntime. rapid_latex_ocr is a tool to convert formula images to latex format. The reasoning code in the repo is modified from LaTeX-OCR, the model has all been converted to ONNX format, and the reasoning code has been simplified, Inference is faster and easier to deploy. The repo only has codes based on ONNXRuntime or OpenVINO inference in onnx format and does not contain training model codes. If you want to train your own model, please move...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 3
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 28 This Week
    Last Update:
    See Project
  • 4
    Super-PDF-Editor

    Super-PDF-Editor

    World's most comprehensive, powerful, process-based PDF editor

    ... performs in pdf files, scanned pdf files and any pdf files. OCR performs in image files, and supports multiple image formats. Auto and manual image enhancement for better OCR accuracy and quality. Supports 165+ languages with three languages data set. Use Multiple Languages at once. International Languages: 127 Languages, High, Medium, and Fast Quality. Scanned Images (jpg, png, gif, tiff, bmp) Multi-Page and TIFF and GIF, Scanned PDFs.
    Downloads: 40 This Week
    Last Update:
    See Project
  • Red Hat Ansible Automation Platform on Microsoft Azure Icon
    Red Hat Ansible Automation Platform on Microsoft Azure

    Red Hat Ansible Automation Platform on Azure allows you to quickly deploy, automate, and manage resources securely and at scale.

    Deploy Red Hat Ansible Automation Platform on Microsoft Azure for a strategic automation solution that allows you to orchestrate, govern and operationalize your Azure environment.
    Learn More
  • 5
    Texify

    Texify

    Math OCR model that outputs LaTeX and markdown

    Texify is an OCR model that converts images or pdfs containing math into markdown and LaTeX that can be rendered by MathJax ($$ and $ are delimiters). It can run on CPU, GPU, or MPS.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    Super-PDF-Editor-Lite

    Super-PDF-Editor-Lite

    World's most comprehensive, powerful, process-based PDF editor

    .... Easy pdf imposition, booklet, n ups pages, and more. OCR performs in pdf files, scanned pdf files and any pdf files. OCR performs in image files, and supports multiple image formats. Auto and manual image enhancement for better OCR accuracy and quality. Supports 165+ languages with three languages data set. Use Multiple Languages at once. International Languages: 127 Languages, High, Medium, and Fast Quality. Scanned Images (jpg, png, gif, tiff, bmp) Multi-Page and TIFF and GIF, Scanned PDFs.
    Downloads: 23 This Week
    Last Update:
    See Project
  • 7
    Paperless-ngx

    Paperless-ngx

    A community-supported supercharged version of paperless

    Paperless-ngx is a community-supported open-source document management system that transforms your physical documents into a searchable online archive so you can keep, well, less paper.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 8
    Doctor Dok

    Doctor Dok

    Doctor Dok is an AI based medical data framework

    ... - digitalized - accessible anywhere from Mobile or Desktop. Using AI you may translate your health records to one of 50+ languages - making abroad health services more accessible. Doctor Dok uses AI to OCR even a hardly readable photo of your health documents. Then stores it in the cloud with Zero Trust Security architecture (nobody but You can decrypt the data).
    Downloads: 2 This Week
    Last Update:
    See Project
  • 9
    docconv

    docconv

    Converts PDF, DOC, DOCX, XML, HTML, RTF, etc to plain text

    A Go wrapper library to convert PDF, DOC, DOCX, XML, HTML, RTF, ODT, Pages documents and images (see optional dependencies below) to plain text. See go help install for details on the installation location of the installed docd executable. Make sure that the full path to the executable is in your PATH environment variable. To add image support to the docconv library you first need to install and build gosseract. Now you can add -tags ocr to any go command when building/fetching/testing docconv...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Save hundreds of developer hours with components built for SaaS applications. Icon
    Save hundreds of developer hours with components built for SaaS applications.

    The #1 Embedded Analytics Solution for SaaS Teams.

    Whether you want full self-service analytics or simpler multi-tenant security, Qrvey’s embeddable components and scalable data management remove the guess work.
    Try Developer Playground
  • 10
    Amplify

    Amplify

    Automatic enrichment, enhancement, and explanation of your data

    Amplify attaches afterburners to your data. Amplify explains metadata extraction, classification, tagging, and reporting. Eriches derivative data generation like thumbnails, previews, conversions, etc. Enhances batteries-included value-adds like data quality reports, image augmentation, OCR, translations, etc. Amplify leverages the decentralized compute provided by Bacalhau to magically enrich your data. A built-in suite of pipelines decides what your data is and how to best improve upon...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Tarsier

    Tarsier

    Vision utilities for web interaction agents

    ... as buttons, links, or input fields that are visible on the page; Tarsier can also tag all textual elements if you pass tag_text_elements=True. Furthermore, we've developed an OCR algorithm to convert a page screenshot into a whitespace-structured string (almost like ASCII art) that an LLM even without vision can understand. Since current vision-language models still lack fine-grained representations needed for web interaction tasks, this is critical.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    EcoPaste

    EcoPaste

    Open source clipboard management tools for Windows, Macos and Linux

    Open source clipboard management tools for Windows, macOS, and Linux. Built with Tauri, the application is lightweight and refined, consuming minimal resources. It also delivers a uniform user experience across both Windows, MacOS, and Linux platforms. The application is resident in the background, wakes up with one click through custom shortcut keys, saves time, and improves efficiency. Allows you to bookmark clipboard content for easy and fast access. Whether it's crucial data for work...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    DeepDetect

    DeepDetect

    Deep Learning API and Server in C++14 support for Caffe, PyTorch

    ... of image tagging, object detection, segmentation, OCR, Audio, Video, Text classification, CSV for tabular data and time series. Neural network templates for the most effective architectures for GPU, CPU, and Embedded devices. Training in a few hours and with small data thanks to 25+ pre-trained models. Full Open Source, with an ecosystem of tools (API clients, video, annotation, ...) Fast Server written in pure C++, a single codebase for Cloud, Desktop & Embedded.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    DeckTape

    DeckTape

    PDF exporter for HTML presentations

    DeckTape is a high-quality PDF exporter for HTML presentation frameworks. DeckTape is built on top of Puppeteer which relies on Google Chrome for laying out and rendering Web pages and provides a headless Chrome instance scriptable with a JavaScript API. DeckTape currently supports the following presentation frameworks out of the box. DeckTape also provides a generic command that works by emulating the end-user interaction, allowing it to be used to convert presentations from virtually any kind...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Data Entry System v2

    Data Entry System v2

    Framework with web data entry, OCR & designer

    Framework with web data entry,, verification, OCR & project designer. It works with Docker or Debian dedicated server. Fast and Optimized version.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    TagUI

    TagUI

    Free RPA tool by AI Singapore

    Write flows in simple TagUI language and automate away repetitive time-consuming tasks on your computer. Tasks include those on websites (native support for Chrome and Edge), desktop apps, or the command line. The TagUI project is open-source and free forever. It's easy to setup and use, and works on Windows, macOS and Linux. Besides English, flows can be written in 22 other languages, so you can do RPA using your native language. Check out this demo video automating data collection in 4...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    MySQL 2 Excel Exporter 3-105 [I.S.A]

    MySQL 2 Excel Exporter 3-105 [I.S.A]

    MySQL 2 Excel: Exporter 3-105 [Improved.Simplified.Alternative]

    'MySQL2Excel_Exporter' is an desktop application developed using python 3.6.8 and other add-on libaries. The application exports MySql tables as a excel file. MySQL2Excel_Exporter has two parts: 1) Export - converts all records in mySQL table into excel file 2) Export Filter - converts selected recorerds in mySQL table into excel file Compatible only for windows OS.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Ox-Hugo

    Ox-Hugo

    A carefully crafted Org exporter back-end for Hugo

    ox-hugo is an Org exporter backend that exports Org to Hugo-compatible Markdown (Blackfriday) and also generates the front-matter (in TOML or YAML format). The ox-hugo backend extends from a parent backend ox-blackfriday.el. The latter is the one that primarily does the Blackfriday-friendly Markdown content generation. The main job of ox-hugo is to generate the front-matter for each exported content file, and then append that generated Markdown to it. There are, though, few functions that ox...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    tabtoy

    tabtoy

    High performance tabular data exporter

    High-performance tabular data export tool. Support Xlsx/CSV as mixed input of tabular data. Support JSON/Golang/C#/Java/Lua/binary source, data, and type output. Automatic cell data format checking, accurate to cell errors. Support predefined enumeration can use Chinese enumeration type. Support table splitting, support multi-person collaboration. Support KV configuration table, convenient to use the table as a configuration file. Multi-core concurrent export, cache acceleration, export...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    NAPS2 - Not Another PDF Scanner

    NAPS2 - Not Another PDF Scanner

    Scan documents to PDF and other file types, as simply as possible.

    Visit NAPS2's home page at www.naps2.com. NAPS2 is a document scanning application with a focus on simplicity and ease of use. Scan your documents from WIA- and TWAIN-compatible scanners, organize the pages as you like, and save them as PDF, TIFF, JPEG, PNG, and other file formats. Available on Windows, Mac, and Linux. NAPS2 is currently available in over 40 different languages. Want to see NAPS2 in your preferred language? Help translate! See the wiki for more details.
    Leader badge
    Downloads: 822 This Week
    Last Update:
    See Project
  • 21
    Form OCR Testing Tool

    Form OCR Testing Tool

    A set of tools to use in Microsoft Azure Form Recognizer

    An open source labeling tool for Form Recognizer, part of the Form OCR Test Toolset (FOTT). This is a MAIN branch of the Tool. It contains all the newest features available. This is NOT the most stable version since this is a preview. The purpose of this repo is to allow customers to test the tools available when working with Microsoft Forms and OCR services. Currently, Labeling tool is the first tool we present here. Users could provide feedback, and make customer-specific changes to meet...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Tile Pattern Exporter

    Tile Pattern Exporter

    Tile large format PNG patterns into print-at-home PDF pages

    You can tile large format PNG patterns into print-at-home PDF pages. Created for LearnMYOG. This set of scripts automates the tiling of large format PNG files into letter(A4), tabloid(A3), and A0 sized PDF pages with print margins, alignment and cut guides, page numbers, and a copyright stamp to each page. For best results, input an exported PNG with size in multiples of 7.5 inches wide and 10 inches tall @ 300dpi.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Super PDF Editor Lite

    Super PDF Editor Lite

    Create, Edit, Delete, Organize , Convert, Export, Secure & Sign.

    Super PDF Editor Lite is a robust and versatile PDF management software designed to streamline your document handling needs. Whether you're an individual, student, or professional, this software offers a comprehensive suite of tools to create, edit, and manage your PDFs with ease. Key Features: Extract Page: Easily extract specific pages from a PDF document. Split Page: Divide a single PDF page into multiple smaller pages. Rotate Page: Rotate pages to adjust their orientation. Merge...
    Leader badge
    Downloads: 17 This Week
    Last Update:
    See Project
  • 24
    LayoutParser

    LayoutParser

    A Unified Toolkit for Deep Learning Based Document Image Analysis

    With the help of state-of-the-art deep learning models, Layout Parser enables extracting complicated document structures using only several lines of code. This method is also more robust and generalizable as no sophisticated rules are involved in this process. A complete instruction for installing the main Layout Parser library and auxiliary components. Learn how to load DL Layout models and use them for layout detection. The full list of layout models currently available in Layout Parser....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    DocWire SDK

    DocWire SDK

    Award-winning modern data processing in C++17/20

    DocWire SDK, a standout C++17/20 data processing tool, has received award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. The upcoming integration of C++17 and C++20 will bring advanced functionalities, particularly in areas like HTTP capabilities and web data extraction. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document...
    Downloads: 6 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • Next