Showing 178 open source projects for "pdf data mining"

View related business solutions
  • Ship Agents Faster Icon
    Ship Agents Faster

    Transform your applications and workflows into powerful agentic systems at global scale.

    Gemini Enterprise Agent Platform lets you rapidly build, scale, govern and optimize production-ready agents grounded in your organization's data. The platform enables developers to build custom or pre-built agents for virtually any use case. New customers get $300 in free credits.
    Get Started Free
  • Error to trace to log to deploy. One click. No SSH. Icon
    Error to trace to log to deploy. One click. No SSH.

    Catch the cause before the pager goes off.

    AppSignal links every error to the trace, the trace to the log, the log to the deploy that shipped it.
    Free 30 days.
  • 1
    DocsGPT

    DocsGPT

    Private AI platform for agents, enterprise search and RAG pipelines

    DocsGPT is an open-source AI platform for deploying private RAG pipelines, AI agents, and enterprise search on your own infrastructure. Connect any data source (PDFs, DOCX, CSV, Excel, HTML, audio, GitHub, databases, URLs) and get accurate, hallucination-free answers with source citations. Choose your LLM: OpenAI, Anthropic, Google Gemini, or local models. Works with Qdrant, MongoDB, and Elasticsearch and more. Deploy via Docker or Kubernetes with full data sovereignty. Build...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 2
    Jina

    Jina

    Build cross-modal and multimodal applications on the cloud

    ...Jina handles the infrastructure complexity, making advanced solution engineering and cloud-native technologies accessible to every developer. Build applications that deliver fresh insights from multiple data types such as text, image, audio, video, 3D mesh, PDF with Jina AI’s DocArray. Polyglot gateway that supports gRPC, Websockets, HTTP, GraphQL protocols with TLS. Intuitive design pattern for high-performance microservices. Seamless Docker container integration: sharing, exploring, sandboxing, versioning and dependency control via Jina Hub. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Kimai

    Kimai

    Kimai is a web-based multi-user time-tracking application

    Kimai is an open-source time-tracking solution. It tracks work time and prints out a summary of your activities on demand. Yearly, monthly, daily, by the customer, by the project. Its simplicity is its strength. Due to Kimai’s browser-based interface, it runs cross-platform, even on your mobile device. With Kimai, the boring process of feeding Excel spreadsheets with your working hours is not only simplified, it also offers dozens of other exciting features that you don't even know you're...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 4
    Admidio

    Admidio

    Admidio is a free open source user management system for websites

    Admidio is a free open source user management system for websites of organizations and groups. The system has a flexible role model so that it’s possible to reflect the structure and permissions of your organization. You can create an individual profile for your members by adding or removing profile fields. Additional to these functions the system contains several modules like member lists, event manager, messages, photo album or a documents & files area. Admidio is a free online membership...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Secure File Transfer for Windows with Cerberus by Redwood Icon
    Secure File Transfer for Windows with Cerberus by Redwood

    Protect and share files over FTP/S, SFTP, HTTPS and SCP with the #1 rated Windows file transfer server.

    Cerberus supports unlimited users and connections on a single IP, with built-in encryption, 2FA, and a browser-based web client — all deployable in under 15 minutes with a 25-day free trial.
    Try for Free
  • 5
    Gerber2PDF

    Gerber2PDF

    Gerber to PDF converter

    Gerber2PDF is a command-line tool to convert Gerber files to PDF for proofing and hobbyist printing purposes. It converts multiple Gerber files at once, placing the resulting layers each on it's own page within the PDF. Each layer has a PDF bookmark for easy reference. Layers can optionally be combined onto a single page and rendered with custom colours and transparency. There is a Drill to Gerber converter available from the downloads page.
    Leader badge
    Downloads: 19 This Week
    Last Update:
    See Project
  • 6

    toPDF

    Online service for PDF conversion (to PDF)

    A simple online service for PDF conversion. This project is a simple library and also a web application. It offers a REST service and a simple upload service for synchronous conversion. This library/application doesn't contain conversion libraries because it's a wrapper for existing tools. toPDF currently supports the open source tool PDF Creator (http://www.pdfforge.org) and the commercial solution, easy PDF, from BCL (http://www.pdfonline.com/easypdf/sdk/).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 7

    Candid PDF Table

    CandidPDFTable – Deterministic TCPDF Table Builder

    CandidPDFTable (Candid PDF Table Builder) is a deterministic, colspan-aware table builder designed specifically for TCPDF. It provides a clean and predictable API to construct HTML tables for TCPDF::writeHTML() using explicit, cell-owned borders and late-stage layout computation. The library is built for programmatic table generation where precise control over rows, columns, colspans, borders, and serial numbering is essential. Building complex tables directly in TCPDF becomes difficult...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    WebHarvest - web data extraction tool
    Web data extraction (web data mining, web scraping) tool. It leverages well proved XML and text processing techologies in order to easely extract useful data from arbitrary web pages.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    JasperReports Library

    JasperReports Library

    Free Java Reporting Library

    JasperReports Library is the world's most popular open source business intelligence and reporting engine. It is entirely written in Java and it is able to use data coming from any kind of data source and produce pixel-perfect documents that can be viewed, printed or exported in a variety of document formats including HTML, PDF, Excel, OpenOffice and Word. The project is also available at: https://github.com/TIBCOSoftware/jasperreports Jaspersoft Studio is the open source report designer for the JasperReports Library. ...
    Leader badge
    Downloads: 1,363 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 10
    FastReport Open Source

    FastReport Open Source

    Free Open Source Reporting tool for .NET

    Free Open Source Reporting tool for .NET Core/.NET Framework that helps your application generate document-like reports.
    Downloads: 68 This Week
    Last Update:
    See Project
  • 11
    DocWire SDK

    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20AI driven data processing tool, has received award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 12
    AuroreNR
    Software developed for the analysis of Neutron Reflectivity data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    ant4docbook

    ant4docbook

    ANT4DOCBOOK is an ANT task for DOCBOOK

    ANT4DOCBOOK is an ANT task for DOCBOOK, a semantic markup language for technical documentation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    jPicEdt

    jPicEdt

    Another drawing editor for LaTeX with PSTricks & TikZ

    jPicEdt is an extensible internationalized vector-based drawing editor for LaTeX and related packages (TikZ, PsTricks,...), written in Java. It is also a library of reusable high-level graphic primitives.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 15
    Msc-generator

    Msc-generator

    Draws signalling charts, block diagrams and graphs from text input.

    NOTE! We have moved to https://gitlab.com/msc-generator/msc-generator All development happens there. Also, download new releases & submit issues there. A tool to draw various charts from textual descriptions. Currently, three types of charts are supported: Message Sequence Charts, generic Graphs, and Block Diagrams, with more to be added in the future. There is a command-line version for Linux and Mac (replacing mscgen), which now sports a GUI, as well. Msc-generator allows fine...
    Leader badge
    Downloads: 46 This Week
    Last Update:
    See Project
  • 16
    iSphere

    iSphere

    The iSphere Project for and RDi 9.5.1.3+

    The iSphere Source Code has been moved to GitHub (https://github.com/rdi-open-source/isphere-plugin) on January 3rd, 2024. Important: The update site and ticket management has been moved to GitHub as well. iSphere is an open source plug-in for IBM's Rational Developer for i 9.5.1.3+. It delivers high quality extensions to improve developer productivity. IBM's current Eclipse based Integrated Development Environment (IDE) is a huge step beyond SEU, but it still lacks features...
    Leader badge
    Downloads: 126 This Week
    Last Update:
    See Project
  • 17
    Ada Bar Codes

    Ada Bar Codes

    Bar Code (1D or 2D) generator in pure Ada

    The project Ada Bar Codes provides a framework for generating various types of bar codes (1D, or 2D, like QR codes) on different output formats and devices. Alire crate: https://alire.ada.dev/crates/bar_codes Mirror: https://github.com/zertovitch/ada-bar-codes
    Downloads: 10 This Week
    Last Update:
    See Project
  • 18
    stkpp

    stkpp

    C++ Statistical ToolKit

    ...At a convenience, we propose the source packages on sourceforge. The library offers a dense set of (mostly) template classes in C++ and is suitable for projects ranging from small one-off projects to complete data mining application suites.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    To give users the full control over the running application. This means that an application is working according to its purpose but the control over the whole interface is taken from developer and given to users. While an application is running, users can move, resize, and tune all the screen objects through which the communication with an application is going. Set of files includes the book (both in DOC and PDF formats), a big demonstration project with all its files available (all the...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    dktools - Dirk Krauses tools

    dktools - Dirk Krauses tools

    Drawing, graphics conversion, software development, administration.

    GUI and command line tools for advanced users and administrators: wxdkdraw - Minimalistic drawing application for use with LaTeX, wxd2lat - Convert wxdkdraw files to LaTeX, bitmap2pp - Convert PNG/JPEG/TIFF/NetPBM to (E)PS or PDF, fig2lat - Convert XFig files to LaTeX, htmlbook - publish HTML like a book, dkcpre - C debugging and tracing preprocessor, itadmin - manage your IT using a MySQL/MariaDB database, dk-fic - file integrity checker, dk-ls - list files, output column order is configurable, dk-cat, dk-sort, dk-lines - text tools for *x and Windows, dk-send, dk-recv - transmit data stream, dk-t2h, dk-t2l - text to HTML or LaTeX conversion.
    Leader badge
    Downloads: 12 This Week
    Last Update:
    See Project
  • 21
    Elementary Algorithms

    Elementary Algorithms

    Book of elementary algorithms and data structures

    This book introduces elementary algorithms and data structure. It includes side-by-side comparison of purely functional realization and their imperative counterpart. From 2020/12, I started re-writing this book. The PDF can be downloaded for preview (EN, 中文). The 1st edition in Chinese (中文) was published in 2017. I recently switched my focus to the Mathematics of programming, the new book is also available in (github).
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    LangChain Apps on Production with Jina

    LangChain Apps on Production with Jina

    Langchain Apps on Production with Jina & FastAPI

    Jina is an open-source framework for building scalable multi-modal AI apps on Production. LangChain is another open-source framework for building applications powered by LLMs. long-chain-serve helps you deploy your LangChain apps on Jina AI Cloud in a matter of seconds. You can benefit from the scalability and serverless architecture of the cloud without sacrificing the ease and convenience of local development. And if you prefer, you can also deploy your LangChain apps on your own...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    html-pdf-chrome

    html-pdf-chrome

    HTML to PDF or image (jpeg, png, webp) converter via Chrome/Chromium

    HTML to PDF or image (jpeg, png, webp) converter via Chrome/Chromium. This library is NOT meant to accept untrusted user input. Doing so may have serious security risks such as Server-Side Request Forgery (SSRF). If you run into CORS issues, try using the --disable-web-security Chrome flag, either when you start Chrome externally, or in options.chromeFlags. This option should only be used if you fully trust the code you are executing during a print job. It is strongly recommended that you...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 24

    BitMagic Library

    Compressed bit-sets, sparse bit matrices and algorithms

    BitMagic - C and C++ library implementing dynamic bitvectors and bit-set algorithms with several types of on-the-fly, adaptive compression. Designed for use in databases, search systems, data-mining algorithms, scientific projects. The core of the library is C++, but it provides C-compatibility wrappers and can be compiled without C++ runtime. Optimizations for Intel SSE2, SSE4.2 and AVX2.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 25
    Common Resource Grep - crgrep

    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you will find binary downloads and discussion (https://sourceforge.net/p/crgrep/discussion/) . ...
    Downloads: 5 This Week
    Last Update:
    See Project
Auth0 Logo