Showing 39 open source projects for "extract java"

  • 1
    PDFsam

    PDFsam, a desktop application to split, merge, mix, rotate PDF files

    PDFsam Basic is our free and open-source desktop application to split, merge, extract pages, rotate and mix PDF files. PDFsam Visual is a powerful tool to visually compose PDF files, reorder pages, delete pages, split, merge, rotate, encrypt, decrypt, extract text, convert to grayscale, and crop PDF files. PDFsam Basic is written using JavaFX. Since version 4 it has been released as a self-contained application that bundles a jlinked JDK, while version 3 requires a Java Runtime Environment 8 with JavaFX installed in order to run.
    Downloads: 83 This Week
  • 2
    TextExtractor

    Extracts plain text from a variety of different file types

    TextExtractor extracts plain text from hundreds of different file types, storing the text extracted in suitably named text files. TextExtractor 1.10 works in six different modes:
    - Instant Mode: Just select any file and extract the text from it.
    - Batch Mode: Select a group of files and extract the text from all of them in one go.
    - Polling Mode: Watch a folder location, processing new files as they appear there.
    - Hierarchical Mode: Extract text from files in a directory...
    Downloads: 6 This Week
  • 3
    Stirling-PDF

    #1 Locally hosted web application that allows you to work on PDFs

    This is a robust, locally hosted web-based PDF manipulation tool using Docker. It enables you to carry out various operations on PDF files, including splitting, merging, converting, reorganizing, adding images, rotating, compressing, and more. This locally hosted web application has evolved to encompass a comprehensive set of features, addressing all your PDF requirements. Stirling PDF does not initiate any outbound calls for record-keeping or tracking purposes. All files and PDFs...
    Downloads: 81 This Week
  • 4
    PDF Split and Merge

    Split and merge PDF files on any platform

    Split and merge PDF files with PDFsam, an easy-to-use desktop tool with graphical, command line and web interface.
    Downloads: 225 This Week
  • 5
    AvaSattva

    Search replace files or pipe

    ...And for ETL-like work such as Load and filter files -> Extract -> Transform output. When replacing text, you can preview and back up changes across multiple directories and files or a pipe, using plain-text matching or the general regex syntax shared by C++, C#, Java, and Scala. msr is therefore a good tool for learning and testing regex, since it shows the groups captured by a regex pattern in different colors.
    Downloads: 7 This Week
  • 6
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Downloads: 6 This Week
  • 7
    Orbit

    ORBIT : Operating Business Intelligence Tool

    Making data accessible through centralized database access: ORBIT is a business intelligence tool designed to make data accessible to a broad audience within your company by centralizing access to databases. With this application, users can easily create reports, perform interactive analyses, and extract insights from raw data. The application simplifies data handling by providing easy-to-use features for non-technical users while maintaining...
    Downloads: 3 This Week
  • 8
    MyBox

    Easy Tools of PDF, Image, File, Network, Data, and Medias

    javafx-desktop-apps pdf image ocr icc barcode color-palette text bytes markdown html archive compress digest video audio editor converter media
    https://github.com/Mararsh/MyBox
    Self-contained packages need neither a Java environment nor installation. Jar packages need Java 16 or higher.
    Downloads: 3 This Week
  • 9
    PDFLayoutTextStripper

    Converts a pdf file into a text file while keeping the layout

    Converts a PDF file into a text file while keeping the layout of the original PDF. Useful to extract the content from a table or a form in a PDF file. PDFLayoutTextStripper is a subclass of PDFTextStripper class (from the Apache PDFBox library).
    Downloads: 0 This Week
  • 10
    QuickNoteCLI

    QNC is a command line interface app for creating quick notes

    usage: QNC
      -a,--append         Append text to the last note
      -c,--clear          Clear all notes
      -d,--delete <arg>   Delete note by index or name
      -e,--erase          Erase last note
      -h,--help           Print help
      -l,--list           Print note list
      -n,--name <arg>     Specify note name
      -N,--nano           Open in nano editor (If installed)
      -p,--print          Print last note text
      -r,--rename <arg>   Rename last note
      -s,--show <arg>     Show note by name
      -S,--dbs            Start DB server
    On Windows you can also use WIN + R, qnc <args>
    Manual start: java -jar QNC-0.1.0.jar
    Installation with scripts: Extract files from zip to any directory
      WINDOWS: run install.bat
      UNIX: run install.sh or execute command in terminal: alias qnc="java -jar /path/to/jar/QNC-0.1.0.jar"
    https://github.com/DeMmAge/QuickNoteCLI
    Downloads: 0 This Week
  • 11
    RapidMiner -- Data Mining, ETL, OLAP, BI
    ETL, data warehousing, data mining, OLAP, business intelligence (BI) in Java. 500+ modules: extract, transform, load (ETL), data mining, data analysis + Weka, statistical forecasting, preprocessing, validation, visualization, OLAP, business intelligence.
    Downloads: 14 This Week
  • 12
    Dexter is a little Java program to interactively or semi-automatically extract data from scanned graphs. In its applet incarnation it is used by the Astrophysics Data System. A rudimentary standalone version is provided as well.
    Downloads: 1 This Week
  • 13
    Alfresco Audit Analysis and Reporting
    Alfresco Audit Analysis and Reporting (A.A.A.R.) provides a solution to extract, store and query audit data together with document/folder information at a very detailed level, with the goal of being useful to the end user in a very easy way. To reach that goal and make the data friendlier for the end user, the data are published as reports in well-known formats (pdf, Microsoft Excel, csv, etc.) and stored directly in Alfresco as static documents organized in folders, versioned,...
    Downloads: 0 This Week
  • 14
    JPDF Viewer

    Your Java Swing PDF Viewer/Reader cross platform

    A simple PDF viewer that lets you view, print and extract the contents of your PDF file in just a few clicks. You can export the contents of the PDF in SVG or TXT format. The viewer is also equipped with a handy utility panel with search functions, thumbnails and annotations.
    Get your PDF Reader for Android: https://play.google.com/store/apps/details?id=com.ulm.pdfreader
    Get your word processor in pure Java: https://sourceforge.net/projects/jwordprocessor/
    See my web project extensions for browsers: Chrome extension: http://bit.ly/2cELWLs | http://bit.ly/1PWKVdu Add-on Firefox: http://mzl.la/1Wn51hg
    My mobile applications: http://bit.ly/1MrlgKk
    Visit this web site to get a JavaScript library for PDF generation: https://ulmdevice.altervista.org/pdfapihtml5/
    Downloads: 1 This Week
  • 15

    Adele

    Adhoc Data Exploration - Live & Easy

    Adele was developed to simplify daily work with data. Use it as a Swiss Army knife to fill the gap between spreadsheet applications like MS Excel and enterprise servers like SAP ERP. It is not meant to replace specialized tools like Rapid Miner, KNIME or similar; rather, Adele is designed for business people who analyse their data in spreadsheet applications. Many technical concepts are included in an easier form, for example realtime OLAP, transformations,...
    Downloads: 0 This Week
  • 16

    Convert HTML to PDF in .NET with C#

    Convert HTML to PDF in .NET with C# using EVO HTML to PDF for .NET

    EVO HTML to PDF Converter for .NET is a library that can be easily integrated and distributed in your ASP.NET and MVC web sites, desktop applications, Windows services and Azure cloud services to convert web pages, HTML strings and streams to PDF, to images or to SVG and to create nicely formatted and easily maintainable PDF reports and documents. The converter has full support for HTML5, CSS3, SVG, Canvas, Web Fonts and JavaScript. Does not require installation or any third party tools. ...
    Downloads: 0 This Week
  • 17
    Rapidminer Onomastics Extension

    Extract Gender and Origin from Personal Names

    Guessing the gender of a name is not as simple as it seems:
    - Andrea is a male name in Italy, a female name in the US. Laurence is a female name in France and a male name in the UK or in the US
    - name demographics evolve, some names are genderless
    - in Chinese or Korean, guessing the gender is almost impossible in Latin script, truly difficult even with the original script
    - in most cultures, the gender is 'encoded' in the first name, in others it is encoded in the last name as well (for...
    Downloads: 0 This Week
  • 18
    jdbc4sapnw

    JDBC driver for accessing SAP NetWeaver based systems

    jdbc4sapnw is a read-only JDBC driver for SAP NetWeaver ABAP stack systems. The driver uses the SAP Java Connector 3 (sapjco3) middleware to call the SAP system. No extensions are required on the SAP systems by default. Only available remote functions will be used to extract data from SAP (RFC technology). A valid SAP user is required for logon. The major goal for this project is to provide SAP data to other Java-based tools in a direct way (e.g. ...
    Downloads: 0 This Week
  • 19

    PDF Comaprision JINI

    This project is forged to compare two PDFs

    This project is designed to compare two PDFs. It uses the following approach for the comparison:
    1. Extract all text from both PDFs and compare it page by page.
    2. Extract all images from both PDFs, save them in folders, compare them one by one, and save the differences in a Difference folder.
    3. Convert the pages of PDF 1 and PDF 2 to JPG and compare them one by one.
    Downloads: 0 This Week
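The page-by-page text comparison in step 1 of the PDF comparison project above could look roughly like the sketch below. It is not taken from the project: the class `PageTextDiff` and the method `differingPages` are hypothetical names, and the sketch assumes the text of each page has already been extracted into one string per page (for example with a PDF library), so it only covers the comparison itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the project's own code: compares page texts that
// were already extracted from two PDFs (one string per page).
class PageTextDiff {

    /** Returns the 1-based numbers of the pages whose text differs. */
    static List<Integer> differingPages(List<String> first, List<String> second) {
        List<Integer> diffs = new ArrayList<>();
        int pages = Math.max(first.size(), second.size());
        for (int i = 0; i < pages; i++) {
            // A page that exists in only one document counts as a difference.
            String a = i < first.size() ? first.get(i) : null;
            String b = i < second.size() ? second.get(i) : null;
            // Leading and trailing whitespace is ignored when comparing.
            if (a == null || b == null || !a.strip().equals(b.strip())) {
                diffs.add(i + 1);
            }
        }
        return diffs;
    }

    public static void main(String[] args) {
        List<String> pdf1 = List.of("Title page", "Chapter 1", "Appendix");
        List<String> pdf2 = List.of("Title page", "Chapter one");
        System.out.println(differingPages(pdf1, pdf2)); // prints [2, 3]
    }
}
```

Returning page numbers rather than a single yes/no result keeps the output usable for a per-page difference report, which is what the image and JPG steps also produce.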
  • 20
    GeoKettle
    GeoKettle is a powerful, metadata-driven spatial ETL (Extract, Transform and Load) tool dedicated to the integration of different data sources for building and updating geospatial databases, data warehouses and services.
    Downloads: 8 This Week
  • 21
    CMIS Input plugin for Pentaho

    Allows querying Content Management Systems that use the CMIS.

    Imagine being able to extract from your Enterprise Content Management System, all the metadata of your documents using simple queries with a query language very close to the traditional SQL. Imagine using the information extracted for statistical purposes, for creating reports and, more generally, to analyse your document archives in a way unthinkable until now with the current tools available. All this is possible within the Pentaho Suite, the Open Source Business Intelligence platform,...
    Downloads: 0 This Week
  • 22
    iGi Reporter

    Write reports, read multiple Notepad, PDF and Word files, design logos and pictures

    First problem: many programs need an internet connection to convert Word to PDF. With this program you can convert Word to PDF without internet.
    Second problem: many students don't know how to write a CV. With this program you can create a CV in one minute.
    Third problem: many students don't know how to write a report. With this program you can...
    Downloads: 1 This Week
  • 23
    webStraktor is a programmable World Wide Web data extraction client. Its purpose is to scrape HTML based content via the HTTP protocol and extract relevant information. webStraktor features a scripting language to facilitate the collection, the extraction and the storage of information available on the web, including images. The scripting language uses elements of the Regular Expression and xPath syntax. The webStraktor scripting language has a small instruction set and its syntax is easy...
    Downloads: 0 This Week
  • 24
    ServiceNow Data Mart Loader

    ServiceNow Data Mart Loader (a.k.a. ServiceNow Data Pump)

    The ServiceNow Data Mart Loader (a.k.a. ServiceNow DataPump) is a Java application which uses ServiceNow’s Direct Web Services (SOAP) API to extract meta-data and data from your Service-now ITSM instance. The application automatically creates and maintains tables in an Oracle or MySQL database. Please view the Wiki Quick Start Guide for instructions. NOTE: This project has been rehosted on github (see https://github.com/gflewis/sndml).
    Downloads: 0 This Week
  • 25
    Toolsverse ETL Framework

    Open source Extract Transform Load engine written in Java

    ETL Framework is a standalone Extract Transform Load engine written in Java. It includes executables for all major platforms and can be easily integrated into other applications. Key Features:
    * embeddable, open source and free
    * fast and scalable
    * uses target database features to do transformations and loads
    * manual and automatic data mapping
    * data streaming
    * bulk data loads
    * data quality features using SQL, JavaScript?
    Downloads: 0 This Week