Showing 20 open source projects for "pdf parser"

  • 1
    pdf-extractor

    Node.js module for rendering pdf pages to images, svgs and HTML files

    Pdf-extractor is a wrapper around pdf.js to generate images, svgs, html files, text files and json files from a pdf on node.js. A DOM Canvas is used to render and export the graphical layer of the pdf. Canvas exports *.png as a default but can be extended to export to other file types like .jpg. Pdf objects are converted to svg using the SVGGraphics parser of pdf.js. Pdf text is converted to HTML. This can be used as a (transparent) layer over the image to enable text selection. Pdf text...
    Downloads: 1 This Week
  • 2
    Jupyter Notebook Tools for Sphinx

    Sphinx source parser for Jupyter notebooks

    nbsphinx is a Sphinx extension that provides a source parser for *.ipynb files. Custom Sphinx directives are used to show Jupyter Notebook code cells (and of course their results) in both HTML and LaTeX output. Un-evaluated notebooks – i.e. notebooks without stored output cells – will be automatically executed during the Sphinx build process.
    Downloads: 0 This Week
  • 3
    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured and semi-structured to structured data (APIs, PDFs, documents, SQL, etc.). Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL database providers.
    Downloads: 0 This Week
  • 4
    pdf-editor

    Edit your PDFs without needing a subscription or creating accounts

    Edit your PDFs without needing a subscription or creating accounts. Add a GUI/Turn it into a web application. Add a parser for the command line to do multiple commands at once e.g. merge (cut pdf1) pdf2. Tested working with Python 3.8.5. Install venv (py -3.8 -m pip install virtualenv). PDF and Word documents are binary files, which makes them much more complex than plaintext files. In addition to text, they store lots of font, color, and layout information. If you want your programs to read...
    Downloads: 3 This Week
  • 5
    Publish.jl

    A universal document authoring package for Julia

    A universal document authoring package for Julia. It provides a general framework for writing prose. Technical documentation is its focus, though it is general enough to be applied to any kind of written document.
    Downloads: 0 This Week
  • 6
    Delphi : VRCalc++ OOSL (Script) and more

    Delphi : VRCalc++ OOSL & + (Paged List, TextEditor, VRAstroVision ...)

    Vincent Radio {Adrix.NT} Sources Library & Applications : Delphi C++ Java VRCalc++ C# VRCalc++ Object Oriented Scripting Language - Engine Source Pascal Code - Delphi Packages Build Prjs - VRCalc++ Scripted System Std RT Library - Guides & Docs (CHM, PDF, DOCX) - VCL & FMX (FireMonkey) Support - Script Test Code (Lang RTL VCL FMX) - Visual Stage Project : VCL & FMX Paged Lists & Iterators : Delphi C++ Java C# Multi-Dim Arrays & Direct Graph Classes : Delphi C++ Java VRCalc++ C...
    Downloads: 6 This Week
  • 7
    Swagger2Markup

    Swagger to AsciiDoc or Markdown converter

    ... file into several AsciiDoc or GitHub Flavored Markdown documents which can be combined with hand-written documentation. The Swagger source file can be located locally or remotely via HTTP. Swagger2Markup supports the Swagger 1.2 and 2.0 specifications. Internally it uses the official swagger-parser and my markup-document-builder. You can use Swagger2Markup to convert your contract-first Swagger YAML file into a human-readable format.
    Downloads: 1 This Week
  • 8
    Estimate

    Web based Cost Estimation, Material Takeoff and Reconciliation Tool

    "Estimate" is an Open Source web based Construction Cost Estimating Software designed for medium and large Civil Construction and EPC (Engineering Procurement and Construction) companies. Features include Management of Schedule of Rates, Analysis of Rates, Project Estimation (Definitive and Control), Tender Evaluation, Cost Sheet preparation, BOQ Generation, Audit and Projection. Estimate is suitable for a wide variety of trades and businesses, including but not limited to:...
    Downloads: 68 This Week
  • 9

    pdfsummary

    Summarize PDF file contents by page.

    Uses a modified form of Didier Stevens' PDF parser to get object descriptions by page and then summarizes them.
    Downloads: 0 This Week
  • 10
    CaLi2CoPi is a multiplatform PDF parser library programmed in PostScript. It works with several specialized switches to verify, add, extract, or change any PDF content. It also supports online execution through a web-based user interface via Ghostscript.
    Downloads: 2 This Week
  • 11
    HoneyDrive

    Honeypots in a box! HoneyDrive is the premier honeypot bundle distro.

    HoneyDrive is the premier honeypot Linux distro. It is a virtual appliance (OVA) with Xubuntu Desktop 12.04.4 LTS edition installed. It contains over 10 pre-installed and pre-configured honeypot software packages such as Kippo SSH honeypot, Dionaea and Amun malware honeypots, Honeyd low-interaction honeypot, Glastopf web honeypot and Wordpot, Conpot SCADA/ICS honeypot, Thug and PhoneyC honeyclients and more. Additionally it includes many useful pre-configured scripts and utilities to...
    Downloads: 29 This Week
  • 12
    phpShare&Search

    Group file share with advanced text parsing capability for easy search

    Originally created as a church resource sharing system, phpShare&Search allows users to create accounts, share documents, search documents, and like or report documents. phpShare&Search's power comes from its advanced document parser, which extracts text from .PDF, .TXT, .DOC, and .DOCX files, and from its community features of liking resources and reporting them as inappropriate or spam. Users can also subscribe to weekly updates of new content. Users may choose to download and host/install...
    Downloads: 0 This Week
  • 13

    QueLang

    QueLang is a designing tool to use for Questionnaire Design.

    This is the first implementation of QueLang. QueLang is a language I designed for Questionnaire Design and Implementation. This software can compile your code (written in .ql text files) into a special .qlc format (a kind of database). It can then read those .qlc files, open them in a viewer, and export them to PDF format. It can also be used for designing exams and tests! Tested on: Linux Ubuntu 12.04, Windows 7 64-bit. QueLang can run by double clicking the .jar (or .exe) file...
    Downloads: 0 This Week
  • 14

    Andoffline

    A toolkit for some Android sms/call Apps, base64 encoder, vcf parser a

    MOVED TO: https://github.com/fulvio999/Andoffline Feature: Browser for exported SMS, CALL and CONTACT from Android Phone Save to PDF file for exported SMS, CALL and CONTACT, VCF parser Support tool for: http://android.riteshsahu.com/apps/sms-backup-restore http://android.riteshsahu.com/apps/call-logs-backup-restore Image base64 encoder/decoder ** Allow to execute job/script execution from SMS sent from remote phone (without internet connection): - connect the phone to PC...
    Downloads: 0 This Week
  • 15

    cextools

    Command line helpers for Conexp files.

    Some small command line programs and a file parser for Concept Explorer (conexp) written in C++. Currently features include: Converters from concept explorer into PDF, PostScript, SVG and PovRay, a modified 3D Freese layout.
    Downloads: 0 This Week
  • 16

    ScientificPdfParser

    Parses scientific articles from PDF and marks the meta data.

    .... The project contains three runnable classes that can work on given PDFs in batch mode via threading: a) BatchHeuristic: A parser that uses defined heuristics and rules. Especially applicable for articles with a broad set of layouts (e.g. PeDocs, http://www.pedocs.de/). b) BatchHybrid: A parser that uses machine learning (Naive Bayes) to find the correct element. Useful for e.g. ACL. c) ModelGenerator: Generates a training model, used by BatchHybrid, from given PDF and XML file
    Downloads: 0 This Week
  • 17
    A full LR(1) parser generator system with many advanced features.
    Downloads: 0 This Week
  • 18
    ** Guys I have built a much more powerful Fully Featured CMS system at: https://github.com/MacdonaldRobinson/FlexDotnetCMS Macs CMS is a Flat File ( XML and SQLite ) based AJAX Content Management System. It focuses mainly on the Edit In Place editing concept. It comes with a built in blog with moderation support, user manager section, roles manager section, SEO / SEF URL
    Downloads: 0 This Week
  • 19
    QuickDoc is a Java document parser that reads documents from plain text files using a simple language and exports the document to other formats like PDF, HTML, Java Help and XML.
    Downloads: 0 This Week
  • 20
    Automatic generation of documentation on Delphi projects from source code. Distinctive features are exact parsing gathering lots of information and a division of the parser and configurable generators (HTML, Win- & HTML-Help, PDF, LaTeX, XMI export)
    Downloads: 0 This Week