Showing 87 open source projects for "pdf data mining"

  • 1

    XmlDoclet

    A JavaDoc doclet that outputs source code structure in XML format.

    XmlDoclet is a JavaDoc doclet that outputs the source code structure of packages, classes, etc. in XML format. The XML data can then be processed by standard tools such as XSLT to produce HTML, PDF, dot graphs, etc. Technically, this is done by wrapping the classes and interfaces of the com.sun.javadoc packages in JAXB-annotated classes, which allows for easy serialization.
    Downloads: 0 This Week
  • 2

    CosmoFile

    Convert your files, edit PDF files, edit images, download files

    Looking for free software to convert your files? CosmoFile is free software that converts your files to many different formats. CosmoFile is simple and fast and supports many formats: PDF, HTML, JPG, PNG, ICO, SVG, XLSX, PPTX... Edit PDF files with CosmoFile: looking for free software to modify PDF documents? Sometimes you need to make minor changes to a PDF file. For instance, you may want to hide your personal phone number from a PDF...
    Downloads: 1 This Week
  • 3

    mcf2pdf

    mcf2pdf converts files of the "My CEWE Photobook" software to PDF

    mcf2pdf converts .mcf files of the "My CEWE Photobook" software (see http://www.cewe-photobook.co.uk/ or http://www.cewe-fotobuch.de (German)) to PDF files, so you can better preview the results and even send them to others by e-mail. This project has moved to GitHub. Please visit https://github.com/albrechtf/mcf2pdf/releases to download the latest version.
    Downloads: 9 This Week
  • 4
    S1000D Transformation Toolkit
    The S1000D Transformation Toolkit provides a reference implementation supporting the transformation, packaging and viewing of S1000D data into a SCORM 2004 3rd Edition Content Package, Mobile Web Application and PDF.
    Downloads: 0 This Week
  • 5

    JBerd

    Java tools for decoding and manipulating BER encoded ASN.1 Files

    A simple Java ASN.1 BER decoder and profiler. A tool for easy manipulation of BER encoded files. An "awk" for ASN.1 BER (for Unix people) or maybe a "notepad" for ASN.1 BER (for Windows people). JBerd (Java BER decoder) is a lightweight BER decoder and associated tools for interpreting and processing BER encoded ASN.1 files. The following facilities are provided: • JBerd Profiler. A tool for profiling the contents of BER encoded files • JBerd Flattener. A tool for converting BER...
    Downloads: 0 This Week
  • 6

    OpenSearchServer Extractor

    A RESTful/JSON web service for text and metadata extraction

    An open source RESTful web service for text and metadata extraction and analysis. oss-text-extractor supports various binary formats: word processor (doc, docx, odt, rtf), spreadsheet (xls, xlsx, ods), presentation (ppt, pptx, odp), publishing (pdf, pub), web (rss, html/xhtml), media (audio, images), and others (vsd, text).
    Downloads: 0 This Week
  • 7
    The goal of this project is to provide a reusable library that transforms common file formats into content objects, together with ContentProvider plugins for common file repositories such as Filesystem, CMIS and others, for the iQser GIN Semantic Middleware (www.iqser.com).
    Downloads: 0 This Week
  • 8
    Regain is a Java search engine based on Jakarta Lucene. It indexes and searches files in many formats (HTML, XML, doc(x), xls(x), ppt(x), oo, PDF, RTF, mp3, mp4, Java). A TagLibrary makes it easier to integrate search results into your JSP-based web page.
    Downloads: 7 This Week
  • 9
    Texlipse is a plugin that adds LaTeX editing support to the popular Eclipse Java IDE. Key features include syntax highlighting, command completion, bibliography completion, outline navigation and automatic building.
    Downloads: 0 This Week
  • 10

    Detexter

    Detexter is an app designed to extract text from PDF files.

    Detexter lets you extract text from multiple PDF files. Detexter uses the PDFBox library for its text extraction.
    Downloads: 0 This Week
  • 11
    jPod is a rich PDF manipulation and rendering framework. A complete rendering library based on jPod is available here at "jPodRenderer". To see jPod & jPodRenderer at work, have a look at www.cabaret-solutions.com
    Downloads: 5 This Week
  • 12
    jPod Renderer is based on the jPod library, also hosted here at "jpodlib". This is the long-awaited release of the platform-specific rendering code, for both AWT and SWT. To see jPod and jPod Renderer at work, have a look at www.cabaret-solutions.com
    Downloads: 1 This Week
  • 13

    Wordpress PDF Blog Export

    Java application that generates a PDF from a WordPress XML file

    A small application developed in Java that converts an XML file generated from WordPress into a PDF file. I wanted to make a book of my blog, but the tools I found did not include the comments in the generated document. So, while learning to use the Java libraries iText and jSoup, I developed this utility as an executable jar file. To use it, you simply need to have Java installed on your PC. The generated PDF file can be used...
    Downloads: 0 This Week
  • 14
    File Type Checker checks the file data to determine the actual file type. As of this writing, filetypechecker supports doc, rtf, xls, pdf, jpg, jpeg, and gif. Support for more file types will be added soon.
    Downloads: 1 This Week
  • 15

    stymaker

    Create your own LaTeX style.

    Stymaker is a GUI application that helps LaTeX users create their own style packages. After filling in a simple form, one gets a new package file corresponding to the chosen settings. This package may be included in the preamble of a LaTeX document with: \usepackage{mystyle} The new package, based on standard LaTeX packages, allows changing the document layout or the appearance of standard environments such as lists. While testing new settings one may instantly view actual changes in...
    Downloads: 0 This Week
  • 16
    jPod is a PDF manipulation and rendering framework. This release contains the documented features, including reading, manipulating and writing. More features will be released as the API matures. To see jPod at work, have a look at www.cabaret-solutions.com
    Downloads: 0 This Week
  • 17
    eLML - eLesson Markup Language
    eLML (eLesson Markup Language) is an XML framework for creating structured eLessons based on a pedagogical model. eLML consists of an XMLSchema and XSLT files to create XHTML, PDF, LaTeX, IMS CP and SCORM versions, standards supported by most LMS.
    Downloads: 0 This Week
  • 18
    Copperhead is a small and simple library providing a Swing user interface that automatically generates PDF documents from annotated objects using the iText PDF library. Copperhead is developed under GPLv3. Please download Copperhead 0.1b for iText 2 and 0.2b for iText 5. Read more on http://byteality.ch/blog. Enjoy!
    Downloads: 0 This Week
  • 19
    Kabeja is a Java library for parsing DXF and converting it to SVG (dxf2svg). The library supports the SAX API and can be integrated into other applications (Cocoon, Batik). Tools for converting SVG to JPEG, TIFF, PNG and PDF are included.
    Downloads: 73 This Week
  • 20
    An image-to-PDF converter.
    Downloads: 0 This Week
  • 21
    Calenco XML CMS
    Calenco is a collaborative web platform that enables remote teams of writers, proofreaders, graphic designers, translators, etc. to produce XML documents together, such as user guides, security procedures, etc.
    Downloads: 0 This Week
  • 22
    Foxon is a FO emitter/indenter to be used with Saxon. It can indent and prettify XSL-FO output, making it suitable for human inspection and editing, without introducing artefacts that change the layout of the PDF file.
    Downloads: 0 This Week
  • 23
    This project provides a toolkit and framework, based on PDFBox, for analysing PDF documents and performing custom conversion tasks. It is published under the Apache licence. A GUI is also included and is published under the GPL licence.
    Downloads: 0 This Week
  • 24
    xccdf2pdf renders XCCDF documents in PDF and other formats.
    Downloads: 0 This Week
  • 25
    This library provides tools to run a PDF/A preflight on a PDF document. It is heavily based on Apache PDFBox. Conformance to the ISO 19005 (PDF/A) standard is checked. The goal is to pass the Isartor test completely.
    Downloads: 0 This Week