Showing 306 open source projects for "java html parser"

View related business solutions
  • Grafana: The open and composable observability platform Icon
    Grafana: The open and composable observability platform

    Faster answers, predictable costs, and no lock-in built by the team helping to make observability accessible to anyone.

    Grafana is the open source analytics & monitoring solution for every database.
    Learn More
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 1

    ConcatPDF

    PDF Concatenation Tool

    ConcatPDF is the tool to concatenate PDF files. It can concatenate, extract, encrypt, decrypt, configure PDF files, convert image files to PDF. GUI version and CUI version are both available. iText.NET is iText porting on .NET Framework by J#. This library allows you to generate PDF, (X)HTML, XML, RTF files on Microsoft.NET Framework including ASP.NET.
    Leader badge
    Downloads: 42 This Week
    Last Update:
    See Project
  • 2
    iText®, a JAVA PDF library

    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. ...
    Leader badge
    Downloads: 272 This Week
    Last Update:
    See Project
  • 3
    HTML Creator

    HTML Creator

    An easy-to-use tool to create html pages.

    This Java application allow users to easyly create web pages using a block structure to rapresent the HTML components.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Cool Reader

    Cool Reader

    A cross-platform XML/CSS based eBook reader

    CoolReader is fast and small cross-platform XML/CSS based eBook reader for desktops and handheld devices. Supported formats: FB2, TXT, RTF, DOC, TCR, HTML, EPUB, CHM, PDB, MOBI. Platforms: Win32, Linux, Android. Ported on some eInk based devices.
    Leader badge
    Downloads: 557 This Week
    Last Update:
    See Project
  • G-P - Global EOR Solution Icon
    G-P - Global EOR Solution

    Companies searching for an Employer of Record solution to mitigate risk and manage compliance, taxes, benefits, and payroll anywhere in the world

    With G-P's industry-leading Employer of Record (EOR) and Contractor solutions, you can hire, onboard and manage teams in 180+ countries — quickly and compliantly — without setting up entities.
    Learn More
  • 5
    crawler4j

    crawler4j

    Open source web crawler for Java

    crawler4j is an open source web crawler for Java which provides a simple interface for crawling the Web. Using it, you can setup a multi-threaded web crawler in few minutes. You need to create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles the downloaded page. shouldVisit function decides whether the given URL should be crawled or not.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Web Widget Toolkit (WTK): Server-side components for easily creating web-based user interfaces with complex navigation.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    JuniCoder is a Java project that uses unicode as a base for decoding and encoding formats that invented workarounds to express characters not covered by ASCII. Decoders translate those inventions to unicode. Encoders encode to these inventions.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8

    StoryParser

    A set of tools and libraries to help with writing eBooks

    A set of tools and libraries (available for C# and Java) that help with writing fiction and non-fiction drafts and then produce ePUB and Kindle eBooks. With these tools/libraries, drafts, written in HTML, can be analyzed to help with writing. such as generating outlines and associating scenes with keywords. When done writing, the tools/libraries can be used to make publishable eBook, automatically producing additional material, such as Table of Contents and Title Pages.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    Cross-platform visual XSLT generator
    Downloads: 0 This Week
    Last Update:
    See Project
  • Deliver trusted data with dbt Icon
    Deliver trusted data with dbt

    dbt Labs empowers data teams to build reliable, governed data pipelines—accelerating analytics and AI initiatives with speed and confidence.

    Data teams use dbt to codify business logic and make it accessible to the entire organization—for use in reporting, ML modeling, and operational workflows.
    Learn More
  • 10
    Xml2Json Converter

    Xml2Json Converter

    Simple tool for converting large XML-files to JSON or JSON to XML

    Simple converter tool with GUI (written on JavaFX) for converting large XML-files to JSON and JSON to XML with indicating progress and uses small amount of memory for converting. Starting from 1.2.0 application supports batch converting files from directory by pattern. Uses Java 1.8+ (http://www.oracle.com/technetwork/java/javase/downloads/jre8-downloads-2133155.html). Distributions for Mac, Linux and Windows already have embedded JRE, so just download appropriate distribution and start application.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 11

    JHOVE

    File validation and characterization

    JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects. JHOVE should not be confused with JHOVE2, a product with similar aims but a completely separate code base.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    VTD-XML is the next generation XML parser/indexer/editor/slicer/assembler/xpath-engine that goes beyond DOM, SAX and PULL in performance, memory usage, and ease of use.
    Downloads: 12 This Week
    Last Update:
    See Project
  • 13
    Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Grotag
    Grotag views Amigaguide documents or converts them to HTML and DocBook XML. Additionally it can validate and pretty print such documents.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogenous document collections.

    NOTICE: This code repository is deprecated. Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 16

    JSONjuicer

    JSON parser and encoder

    A Java open-source library which makes encoding and decoding Java data-structures to and from JSON text easy and intuitive.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    XmlDoclet

    A JavaDoc doclet that outputs source code structure in XML format.

    XmlDoclet is a JavaDoc doclet that outputs the source code structure of the packages, classes etc. in XML format. Later, the XML data may easily be processed by standard tools such as XSLT to produce HTML, PDF, dot graphs etc. Technically, this is done by wrapping the class and interfaces of the com.sun.javadoc packages into JAXB annotated classes, which allows for an easy serialization.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Jericho HTML Parser is a java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 19
    thymeleaf
    Thymeleaf is a java web template engine designed for XML/XHTML/HTML5.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    CosmoFile

    CosmoFile

    Convert your files,Edit pdf Files,Edit Images,Download files

    Looking for free software to convert your files ?CosmoFile is created for you ,a great software absolutely free for users to convert your files to many different formats.CosmoFile is very Simple and very fast and support many formats PDF,HTML,JPG,PNG,JPG,ICO,SVG,XLSX,PPTX... Edit Pdf Files with CosmoFile Looking for free software to modify PDF documents? Sometimes you need to make minor changes to a PDF file. For instance, you may want to hide your personal phone number from a PDF...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21

    rdi-exchange-prototype

    Rights Exchange prototype from the Rights Data Integration project

    Implementation in Java of a prototypical Rights Exchange as defined in the standards of the Linked Content Coalition (http://linkedcontentcoalition.org) and the Rights Data Integration project (http://www.rdi-project.org). Shows how to parse CRF-XML data containing Creation, Rights, RightsOffer, and Party data, to pose and respond to queries of a Hub, and to convert CRF-XML data into HTML for display.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    ISO SAX

    ISO SAX

    Callback-based ISO media file parser

    ISO SAX is a callback-based parser for ISO container files (ISO/IEC 14496-12), e.g. MPEG-4. The libraries that are out there either won't run on Android, have many megabytes of dependent JARs, or will fail to parse your favorite media file due to a "technicality" it thinks it is mal-formed. For example, a perfectly good M4B file gets declared "invalid" because it had a video track in it (the album art), along with the sound track. Really!? Don't let these libraries "judge" the format...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Chunk, an HTML Template Engine for Java

    Chunk, an HTML Template Engine for Java

    Clean, powerful templates for Java

    A powerful Java Template Engine, great for building HTML or XML docs. Chunk can handle many other needs and situations as well. In-tag filters & default values, multiple snippets per file, layered themes, macros, conditional includes, localization & more.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces.
    Downloads: 25 This Week
    Last Update:
    See Project
  • 25

    tinyWSDL

    JAXB-based WSDL 2.0 parser

    The main idea is to create a very small and simple WSDL 2.0 parser library. The library supports WSDL 2.0 Adjunct extensions (SOAP, HTTP and RPC) and an extension mechanism is also provided.
    Downloads: 0 This Week
    Last Update:
    See Project