Showing 154 open source projects for "text pdf extrator"

View related business solutions
  • Our Free Plans just got better! | Auth0 by Okta Icon
    Our Free Plans just got better! | Auth0 by Okta

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your secuirty. Auth0 now, thank yourself later.
    Try free now
  • Bright Data - All in One Platform for Proxies and Web Scraping Icon
    Bright Data - All in One Platform for Proxies and Web Scraping

    Say goodbye to blocks, restrictions, and CAPTCHAs

    Bright Data offers the highest quality proxies with automated session management, IP rotation, and advanced web unlocking technology. Enjoy reliable, fast performance with easy integration, a user-friendly dashboard, and enterprise-grade scaling. Powered by ethically-sourced residential IPs for seamless web scraping.
    Get Started
  • 1
    PDFsam

    PDFsam

    PDFsam, a desktop application to split, merge, mix, rotate PDF files

    PDFsam Basic is our free and open-source desktop application to split, merge, extract pages, rotate and mix PDF files. PDFsam Visual is a powerful tool to visually compose PDF files, reorder pages, delete pages, split, merge, rotate, encrypt, decrypt, extract text, convert to grayscale, crop PDF files. PDFsam Basic is written using JavaFX. Since version 4 it is released as a self-contained application and bundles a jlinked JDK while version 3 requires a Java Runtime Environment 8 with JavaFx...
    Downloads: 25 This Week
    Last Update:
    See Project
  • 2
    Teedy

    Teedy

    Lightweight document management system

    ...-oriented document management system, the user interface is not cluttered with buttons and menus and works both on desktop and mobile. Document searching has never been easier thanks to the powerful full-text search engine in Teddy. You can search in images (embedded OCR), DOCX, ODT, TXT, PDF, and more. Verify or validate your documents with people of your organization using workflows.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    GROBID

    GROBID

    A machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and re-structuring raw documents such as PDF into structured XML/TEI encoded documents with a particular focus on technical and scientific publications. First developments started in 2008 as a hobby. In 2011 the tool has been made available in open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from article in PDF format. The extraction...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4

    pdf-highlight-extractor

    PDF Highlight Extractor

    Java Swing based PDF highlight extraction utility. You can also export highlights to a regular text file. Requires Java 8 or higher installed. You can download Java setup by this link: https://java.com To use this utility right click on pdfhex.jar -> Open With -> Java
    Downloads: 24 This Week
    Last Update:
    See Project
  • Total Network Visibility for Network Engineers and IT Managers Icon
    Total Network Visibility for Network Engineers and IT Managers

    Network monitoring and troubleshooting is hard. TotalView makes it easy.

    This means every device on your network, and every interface on every device is automatically analyzed for performance, errors, QoS, and configuration.
    Learn More
  • 5
    java-pdf-table-extractor-lib

    java-pdf-table-extractor-lib

    Java Pdf Table extraction library

    The command line application is an example of usage of the Java library. The library is based on pdfbox library and works by looking for the layout of each selected pdf page, and looking for table structure patterns. After calling the library (passing the pdf filename, and the page range), the result is a List<PdfTextElement>. PdfTextElement is an interface that has two implementations. * A basic text (outside the tables) * And PdfTextTabulaElement, for table structures...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    OmegaT - multiplatform CAT tool

    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Leader badge
    Downloads: 2,102 This Week
    Last Update:
    See Project
  • 7
    OpenKM Document Management - DMS

    OpenKM Document Management - DMS

    Document Management System and Content Management System

    .... Due to its technological architecture design, OpenKM meets the document management needs of businesses of all sizes (from SMEs to big corporations). Thanks to its elegant and intuitive interface, OpenKM transforms complex operations into easy tasks. The most relevant functions of OpenKM is the indexing of the most common types of files: text, Office, Office 2007, OpenOffice, PDF, HTML, XML, MP3, JPEG, etc. For a complete feature list take a look at http://goo.gl/au8cQy
    Leader badge
    Downloads: 961 This Week
    Last Update:
    See Project
  • 8
    Provides optical character recognition (OCR) solutions for Vietnamese language.
    Leader badge
    Downloads: 687 This Week
    Last Update:
    See Project
  • 9
    iSphere

    iSphere

    The iSphere Project for and RDi 9.5.1.3+

    The iSphere Source Code has been moved to GitHub (https://github.com/taskforce-it/isphere) on January 3rd, 2024. The update site is still available on SourceForge, as is ticket management. iSphere is an open source plug-in for IBM's Rational Developer for i 9.5.1.3+. It delivers high quality extensions to improve developer productivity. IBM's current Eclipse based Integrated Development Environment (IDE) is a huge step beyond SEU, but it still lacks features available only on the green...
    Leader badge
    Downloads: 200 This Week
    Last Update:
    See Project
  • Free and Open Source HR Software Icon
    Free and Open Source HR Software

    OrangeHRM provides a world-class HRIS experience and offers everything you and your team need to be that HR hero you know that you are.

    Give your HR team the tools they need to streamline administrative tasks, support employees, and make informed decisions with the OrangeHRM free and open source HR software.
    Learn More
  • 10
    OCR Manga Reader for Android

    OCR Manga Reader for Android

    Android Manga reader with Japanese OCR and dictionary capabilities

    OCR Manga Reader is a free and open source Android app that allows you to quickly OCR and lookup Japanese words in real-time. It does not have ads or telemetry/spyware and does not require an Internet connection. Supports both EDICT and EPWING dictionaries. Requires Android 4.0 (Ice Cream Sandwich) or higher. See http://ocrmangareaderforandroid.sourceforge.net/ for details.
    Leader badge
    Downloads: 24 This Week
    Last Update:
    See Project
  • 11

    MegaMekLab

    Create meks and more!

    MegaMekLab is a Mek creator program. Currently it can create Meks, vehicles, battlearmor, and conventional infantry. It can also print PDF record sheets for almost all Battletech units. Created units can also be exported to formatted text or html for posting online.
    Leader badge
    Downloads: 20 This Week
    Last Update:
    See Project
  • 12
    jPicEdt

    jPicEdt

    Another drawing editor for LaTeX with PSTricks & TikZ

    jPicEdt is an extensible internationalized vector-based drawing editor for LaTeX and related packages (TikZ, PsTricks,...), written in Java. It is also a library of reusable high-level graphic primitives.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 13
    Delphi : VRCalc++ OOSL (Script) and more

    Delphi : VRCalc++ OOSL (Script) and more

    Delphi : VRCalc++ OOSL & + (Paged List, TextEditor, VRAstroVision ...)

    Vincent Radio {Adrix.NT} Sources Library & Applications : Delphi C++ Java VRCalc++ C# VRCalc++ Object Oriented Scripting Language - Engine Source Pascal Code - Delphi Packages Build Prjs - VRCalc++ Scripted System Std RT Library - Guides & Docs (CHM, PDF, DOCX) - VCL & FMX (FireMonkey) Support - Script Test Code (Lang RTL VCL FMX) - Visual Stage Project : VCL & FMX Paged Lists & Iterators : Delphi C++ Java C# Multi-Dim Arrays & Direct Graph Classes : Delphi C++ Java VRCalc++ C...
    Leader badge
    Downloads: 6 This Week
    Last Update:
    See Project
  • 14
    MyBox

    MyBox

    Easy Tools of PDF, Image, File, Network, Data, and Medias

    javafx-desktop-apps pdf image ocr icc barcode color-palette text bytes markdown html archive compress digest video audio editor converter media https://github.com/Mararsh/MyBox Self-contain packages need not java env nor installation. Jar packages need Java 16 or higher.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    Delphi : VRCalc++ and more Binary Exec

    Delphi : VRCalc++ and more Binary Exec

    Delphi Java - VRCalc++ OOSL (Script) and + (Binary Exec Distro)

    Vincent Radio {Adrix.NT} Embarcadero : Delphi : Executable Binaries Delphi : VRCalc++ Object Oriented Scripting Language : Engine + Ext Libraries VRCalc++ OOSL Visual Stage Project : VCL & FMX (FireMonkey) VRCalc++ Script Executor: - VCL Console - Terminal Console - FMX Console + VRCalc++ OOSL : VR System Scripted Standard Runtime Library Delphi Applics - VR Multi Editor : Smart Text Editor - VR Lazy Code Editor : Smart RTF Multi Lang Code Text Editor - VR Astro Vision...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    Orion Viewer

    Orion Viewer

    Pdf, djvu, xps and cbz file viewer for Android devices

    Orion Viewer - pdf, djvu, xps and cbz file viewer for Android devices based on mupdf and DjVuLibre libraries. IMPORTANT: project moved to GitHub https://github.com/max-kammerer/orion-viewer Google Play: https://play.google.com/store/apps/details?id=universe.constellation.orion.viewer F-Droid: https://f-droid.org/en/packages/universe.constellation.orion.viewer/ Discussion groups: 4PDA (Russian): https://4pda.to/forum/index.php?showtopic=362925 Mobile Read: https://www.mobileread.com...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    Common Resource Grep - crgrep

    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    ant4docbook

    ant4docbook

    ANT4DOCBOOK is an ANT task for DOCBOOK

    ANT4DOCBOOK is an ANT task for DOCBOOK, a semantic markup language for technical documentation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19

    FOray

    Modular XSL-FO Implementation for Java.

    FOray is an open-source XSL-FO publishing system that is suitable for converting XML content into PDF and other document formats. Although not yet fully conformant with the XSL-FO standard, it is very useful for many applications.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    PdfInspector1

    PdfInspector1

    Application to inspect text and images of pdf books.

    With the application you can open and browse pdf books. In addition you will be able to inspect the codes for characters or locate the images. May be an example of basic use of pdfbox. JDK-17 compatibility
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    DocSearcher
    DocSearcher is a search tool for indexing and searching files on a personal computer. It uses API's to provide search functionality for common document formats. currently: Word, Excel, PDF, Libre/Open/StarOffice, RTF, Text, and HTML
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    RTextDoc

    RTextDoc

    An editor for structured documents

    RTextDoc is an editor for structured text documents such as LaTeX, AsciiDoc, DocBook. RTextDoc has proofreading capabilities: on-the-fly spelling, instant grammar checking and built-in free dictionaries. RTextDoc has syntax highlighting, bracket matching, folding, document structure browser for sections and labels, bookmarks, manager for LaTeX symbols, an editor for mathematical equations,integrated BibTeX database manager and several tools to convert LaTeX to HTML and back. AsciiDoc files...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    MagicalPdfEditor

    MagicalPdfEditor

    This is a small PDF editor based on OpenPdf core and AndroidPdfViewer

    This is a small PDF editor based on OpenPDF Core and AndroidPdfViewer. As there is not many open-source easy working PDF editors and PDF wizards, I decided to create a simple directory to resolve my issues. Here I have worked on two separate cores, add some functionality to them, and combined them together to achieve my target. I am working on this repo, any help will be appreciated. Just clone the project and trance the source code, It's really easy and clear. All functionality...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    PDFLayoutTextStripper

    PDFLayoutTextStripper

    Converts a pdf file into a text file while keeping the layout

    Converts a PDF file into a text file while keeping the layout of the original PDF. Useful to extract the content from a table or a form in a PDF file. PDFLayoutTextStripper is a subclass of PDFTextStripper class (from the Apache PDFBox library).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Next Generation Programming

    Next Generation Programming

    Compose Software Without Writing Any Programing Code

    "Next Generation Programming - Programming Without Coding Software" is a drag-drop wizard for creating simple or complex applications without writing any programming language code The Software is coded/designed with "Java Programming Language" for novice/expert programmers; Programmers can write softwares with visual tools : drag-drop components;visual editors... Programmers can use the software to compose of simple/complex applications : Database programs, circuit design, generate...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next