Showing 23 open source projects for "html pdf"

  • 1
    Tesseract OCR

    Open Source OCR Engine

    ...Tesseract can recognize over 100 languages out of the box and can be trained to recognize other languages. It supports various output formats, including plain text, HTML, PDF, and more. It also has Unicode (UTF-8) support.
    Downloads: 3,656 This Week
  • 2
    LandPPT

    An LLM-based presentation generation platform

    LandPPT is an open-source AI platform that automatically generates professional presentation slides using large language models. The system allows users to create complete PowerPoint presentations simply by entering a topic or uploading source documents such as PDFs, Word files, or Markdown notes. Using natural language processing and structured content generation, the platform produces presentation outlines and converts them into fully formatted slide decks. The application integrates...
    Downloads: 16 This Week
  • 3
    DB-GPT

    Revolutionizing Database Interactions with Private LLM Technology

    DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.
    Downloads: 2 This Week
  • 4
    LlamaParse

    Parse files for optimal RAG

    LlamaParse is a GenAI-native document parser that can parse complex document data for any downstream LLM use case (RAG, agents). Load in 160+ data sources and data formats, from unstructured and semi-structured to structured data (APIs, PDFs, documents, SQL, etc.). Store and index your data for different use cases. Integrate with 40+ vector stores, document stores, graph stores, and SQL database providers.
    Downloads: 1 This Week
  • 5
    Super PDF Editor (a Batch PDF Processor)

    Create, Edit, Delete, Organize, Convert, Export, Secure & Sign PDF.

    Super PDF Editor is a powerful, fast, lightweight PDF processor: an all-in-one PDF solution with 80+ editing tools and functions. The easy-to-use software includes editing tools for modifying PDF files your way, along with a comprehensive, process-based batch processor. Features include OCR, PDF Imposition, Reverse Pages, Resize Page, Scale Page, Booklet, N-up Pages, Merge, Split by page, Extract Page, Rotate Page, Replace Page, Insert Page, Delete Page....
    Downloads: 18 This Week
  • 6
    Extractous

    Fast and efficient unstructured data extraction

    Extractous is a Rust-based unstructured data extraction library focused on fast local parsing of documents and other content-heavy files. Its purpose is to extract text and metadata efficiently from formats such as PDF, Word, HTML, email archives, images, and more, without depending on external APIs or separate parsing servers. The project emphasizes performance and low memory usage, and its maintainers describe it as a local-first alternative to heavier extraction stacks. For broader format support, the system combines its Rust core with ahead-of-time compiled Apache Tika shared libraries, which allows it to extend parsing coverage while still avoiding traditional server-based overhead. ...
    Downloads: 0 This Week
  • 7
    DocsGPT

    Private AI platform for agents, enterprise search and RAG pipelines

    DocsGPT is an open-source AI platform for deploying private RAG pipelines, AI agents, and enterprise search on your own infrastructure. Connect any data source (PDFs, DOCX, CSV, Excel, HTML, audio, GitHub, databases, URLs) and get accurate, hallucination-free answers with source citations. Choose your LLM: OpenAI, Anthropic, Google Gemini, or local models. Works with Qdrant, MongoDB, Elasticsearch, and more. Deploy via Docker or Kubernetes with full data sovereignty. Build...
    Downloads: 0 This Week
  • 8
    LangChain Extract

    Did you say you like data?

    LangChain Extract is an open-source reference application designed to demonstrate how large language models can be used to extract structured data from unstructured text and document files. The project implements a lightweight web service that allows developers to define extraction schemas and apply them to various sources such as plain text, HTML, or PDF documents. Built using FastAPI and the LangChain framework, the application exposes a REST API that can process documents and return structured outputs that match user-defined JSON schemas. Developers can create reusable “extractors” that define what type of information should be pulled from a document, along with example prompts that improve extraction quality through in-context learning.
    Downloads: 0 This Week
  • 9
    AnyTXT Searcher

    A Powerful Desktop Full-Text Search Engine, Just Like Local Google.

    ...You can find any text in any file on your disk with AnyTXT in about 0.1 second. It works on Windows 11, 10, 8, 7, Vista, XP, 2008, 2012, 2016, 2022... AnyTXT Searcher supports the following file formats: plain text (txt, cpp, py, html, etc.); Microsoft OneNote (one); Microsoft Word (doc, docx); Microsoft Excel (xls, xlsx); Microsoft PowerPoint (ppt, pptx); PDF; WPS Office (wps, et, dps); EBook (epub, mobi, azw3, fb2, etc.); Mind Map Format (lighten, mmap, mm, xmind, etc.); OFD .....
    Downloads: 4,932 This Week
  • 10
    chessPDFBrowser

    Chess application which allows working with chess PDF books and PGNs.

    ...JDK-17 compatibility. You can find out more at these websites: https://chesspdfbrowser.com?origin=sourceforge and https://www.frojasg1.com:8443/downloads_web/web/html/chessPdfBrowser.html?origin=sourceforge
    Downloads: 43 This Week
  • 11
    MyBox

    Easy Tools for PDF, Image, File, Network, Data, and Media

    javafx-desktop-apps pdf image ocr icc barcode color-palette text bytes markdown html archive compress digest video audio editor converter media https://github.com/Mararsh/MyBox Self-contained packages need neither a Java environment nor installation. Jar packages need Java 16 or higher.
    Downloads: 3 This Week
  • 12
    LexiFinder

    AI-powered semantic indexing: automating the creation of book indexes

    ...LexiFinder works in two ways: as a command-line tool for scripting, automation, and batch processing, and as a graphical application for a guided, point-and-click experience. Both interfaces share the same underlying engine and support the same features. Supported input formats are PDF, DOCX, and ODT. The index can be exported as plain text, JSON, CSV, or HTML.
    Downloads: 2 This Week
  • 13
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, enabling efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 6 This Week
  • 14
    pdf-extractor

    Node.js module for rendering PDF pages to images, SVGs, and HTML files

    pdf-extractor is a wrapper around pdf.js that generates images, SVGs, HTML files, text files, and JSON files from a PDF on Node.js. A DOM Canvas is used to render and export the graphical layer of the PDF. Canvas exports .png by default but can be extended to export other file types such as .jpg. PDF objects are converted to SVG using the SVGGraphics parser of pdf.js.
    Downloads: 1 This Week
  • 15
    Super-PDF-Editor-Lite

    World's most comprehensive, powerful, process-based PDF editor

    World's most comprehensive, powerful, process-based, and lightning-fast PDF reader, editor, and batch processor. Features include creating PDFs from images, HTML, and text files; creating a processing log file; Extract Page, Split Page, Rotate Page, Merge Page, Duplicate Page, Move Page, Printing, and Compress Page; image enhancement before OCR for better OCR performance; PDF imposition; etc.
    Downloads: 1 This Week
  • 16
    Sklearn TensorFlow

    Sklearn and TensorFlow: A Practical Guide to Machine Learning

    ...It aims to make practical machine learning education more accessible to Chinese-speaking learners by translating the technical explanations, examples, and exercises from the original English material. The repository organizes the content as structured documentation that can be compiled into multiple formats such as HTML, PDF, EPUB, and MOBI, allowing users to read the material both online and offline. It focuses on teaching core machine learning concepts using Python while demonstrating practical workflows with popular libraries like Scikit-Learn and TensorFlow. The material covers topics ranging from basic machine learning theory to deep learning techniques and model evaluation, enabling learners to build and experiment with models step by step.
    Downloads: 0 This Week
  • 17
    MIT Deep Learning Book

    MIT Deep Learning Book in PDF format by Ian Goodfellow

    The Deep Learning textbook is a resource intended to help students and practitioners enter the field of machine learning in general and deep learning in particular. The online version of the book is now complete and will remain available online for free. This is the MIT Deep Learning Book in PDF format (complete and in parts), an MIT Press book by Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Written by three experts in the field, Deep Learning is the only comprehensive book on the subject. It is not available as a PDF download, so I have printed the HTML content and bound it into a flawless PDF version of the book, as suggested by the website itself. ...
    Downloads: 4 This Week
  • 18
    NASH OS

    Nash Operating System for Modern Ecommerce

    The all-in-one, automatic, ready-to-go, out-of-the-box, easy-to-use, state-of-the-art, and really awesome NASH OS! Over 25,000 flexible features and controls, all scalable! The most powerful solution ever built to instantly bring your online ecommerce enterprise to new heights.
    Downloads: 6 This Week
  • 19
    DoAllWithPDF_servicemenu

    KDE service menu for PDF

    Lets KDE users perform many operations by right-clicking a PDF file.
    Downloads: 0 This Week
  • 20
    Engrisi:English Sinhala Learn Dictionary

    Best English Sinhala Translator

    En Dictionary is a free, efficient dictionary and English learning tool. It helps you improve your English knowledge with its amazing functionality. (See the features list below.) Awards for the invention of En Dictionary: ICTA - e-Swabhimani 2012, SLIC - Dhasis 2012. For more details, visit http://namalyaya.blogspot.com/2014/01/free-download-english-learning-tool-en-dictionary-2014.html Ease all of your language problems with En Dictionary. ICTA Recommendation....
    Downloads: 0 This Week
  • 21

    En:Best Alternative to MADURA Dictionary

    En Dictionary is a Free, Useful Dictionary and English Learning Tool

    En Dictionary is a free, efficient dictionary and English learning tool. It helps you improve your English knowledge with its many functions. (See below.) Awards for the invention of En Dictionary: ICTA - e-Swabhimani 2012, SLIC - Dhasis 2012. For more details, visit http://namalyaya.blogspot.com/2014/01/free-download-english-learning-tool-en-dictionary-2014.html Ease all of your language problems with En Dictionary. ICTA Recommendation. http://tinyurl.com/be5tcze "Real-time...
    Downloads: 1 This Week
  • 22
    ANts P2P
    ANts P2P implements a third-generation P2P network. It protects your privacy while you are connected and makes you untraceable by hiding your identity (IP) and encrypting everything you send to or receive from others.
    Downloads: 3 This Week
  • 23
    Optex Analyzer is software for analyzing and comparing algorithms that approximately solve optimization problems. Its GUI lets you select a set of input files containing raw algorithm results. The analysis is presented in tables and charts.
    Downloads: 0 This Week