Showing 62 open source projects for "java html parser"

  • 1
    ANTLR

    Parser generator to read, process, or translate structured text

    ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. From a grammar, ANTLR generates a parser that can build and walk parse trees. It is widely used in academia and industry to build all sorts of languages, tools, and frameworks. Twitter search uses ANTLR for query parsing, with over 2 billion queries a day. The languages for...
    Downloads: 10 This Week
  • 2
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Downloads: 2,024 This Week
  • 3
    ant4docbook

    ANT4DOCBOOK is an ANT task for DOCBOOK

    ANT4DOCBOOK is an ANT task for DOCBOOK, a semantic markup language for technical documentation.
    Downloads: 0 This Week
  • 4
    RTextDoc

    An editor for structured documents

    RTextDoc is an editor for structured text documents such as LaTeX, AsciiDoc, and DocBook. RTextDoc has proofreading capabilities: on-the-fly spelling, instant grammar checking, and built-in free dictionaries. RTextDoc has syntax highlighting, bracket matching, folding, a document structure browser for sections and labels, bookmarks, a manager for LaTeX symbols, an editor for mathematical equations, an integrated BibTeX database manager, and several tools to convert LaTeX to HTML and back. AsciiDoc...
    Downloads: 2 This Week
  • 5
    SimplyHTML is an application and a Java component for rich text processing. It stores documents as HTML files in combination with Cascading Style Sheets (CSS). SimplyHTML is not intended to be used as an editor for web pages.
    Downloads: 3 This Week
  • 6
    Leseratte is a Java parser for written German. Currently, it contains a German lexicon (based on the Wiktionary), inflexion rules, a grammar, and a parser. (A semantics component is planned.) It is usable as a Java library and also provides a graphical UI.
    Downloads: 0 This Week
  • 7
    Web Book Downloader

    Download websites as e-book: pdf, txt, epub.

    This application lets the user download chapters from a website in three ways: from a table of contents; from a range (first chapter address to last chapter address); or by crawling from the first chapter to the nth. In the settings you can customize the language and the input (website) encoding; for simplicity, the output uses the same encoding. To add your language, add a new class to the strings package, and new fields to the Settings class and the GUI menu (initialize method).
    Downloads: 6 This Week
  • 8

    ConcatPDF

    PDF Concatenation Tool

    ConcatPDF is a tool for concatenating PDF files. It can concatenate, extract, encrypt, decrypt, and configure PDF files, and convert image files to PDF. Both GUI and CUI versions are available. iText.NET is a port of iText to the .NET Framework using J#. This library allows you to generate PDF, (X)HTML, XML, and RTF files on the Microsoft .NET Framework, including ASP.NET.
    Downloads: 42 This Week
  • 9
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. ...
    Downloads: 272 This Week
  • 10

    Ghawwas_V4

    An open source system for Arabic corpora processing

    Ghawwas (previously known as Khawas) is an open source system for Arabic corpora processing. Ghawwas V4.0 provides the following main functions: a. Frequency list for single word and N-Grams b. Concordance c. Collocation (MI, CHI Squared, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient) d. Lexical patterns search e. Two corpora frequency profile comparison based on MI, CHI, LL, T-Score, Z Score, Dice, Log Dice, Weirdness Coefficient f. Accept Windows and UTF-8 character...
    Downloads: 0 This Week
  • 11
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data source (such as a database). Can generate complete homepages (trees of HTML files, images, etc.).
    Downloads: 105 This Week
  • 12
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techniques. A command-line executable is shipped that allows sorting documents by codepage.
    Downloads: 7 This Week
  • 13
    Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 8 This Week
  • 14
    A Java-based parser for parsing/grabbing web sites and other text or XML documents, based on a nondeterministic parser language, creating XML output. Also contains a few utility classes for HTML, CSV, and text parsing, and additional character sets.
    Downloads: 0 This Week
  • 15
    Jaxe
    Jaxe is a free Java XML editor with a configurable GUI, using XML schemas for validation and XSL for exports in HTML or XML.
    Downloads: 7 This Week
  • 16
    Simple Java delimited and fixed width file parser. Handles CSV, Excel CSV, Tab, Pipe delimiters, just to name a few. Maps column positions in the file to user friendly names via XML. See "FlatPack Feature List" under News for complete feature list.
    Downloads: 1 This Week
  • 17

    Spock Text Editor

    Simple text editor for PHP, JSP, HTML, etc.

    This app is written in Java, so it is fully compatible with Windows, Mac, and Linux (32- or 64-bit). It's a simple and fast text editor and supports: *.txt, *.jsp, *.php, *.c, *.h (headers for the C language), *.java, and *.htm/html. This is the first version and my first application in Java. I hope you like it! See you in version 2! ;-)
    Downloads: 0 This Week
  • 18

    SutraReader

    Arranges a Sutra text in the traditional layout

    This is an application designed to arrange / lay out a Chinese or Japanese Sutra text in the traditional layout (from top to bottom, from right to left). The input can be any file (the application can pick out the relevant parts) and the output is the layout (arranged in HTML file(s)) and the content (a plain text file with the content). Besides this, you can get statistics about the ideograms and can exclude certain characters or ideograms from the content.
    Downloads: 0 This Week
  • 19
    The DocBook Publishing Utilities are tools that make creating and publishing DocBook easier. The tools are: a Maven plug-in to transform HTML into XML (use after docbkx); an Eclipse DocBook table editor; and Eclipse wizards for initial DocBook files.
    Downloads: 0 This Week
  • 20
    Drag-and-drop files/directories/HTML-URLs into a Java GUI. Perform text operations on the files into output files. Operations include concatenation, text and regex editing, and other file/string/row/column/script operations.
    Downloads: 0 This Week
  • 21
    TagParser is a Java parser based on CSS formulas (like jQuery) that can parse any tag-based document, such as XML or HTML. Furthermore, it doesn't require documents to be well formed and can parse complex documents with embedded scripts or CSS parts.
    Downloads: 0 This Week
  • 22
    ApexText

    Efficient and lightweight text editor with rich functionalities.

    ApexText is a general-purpose text editor for developers and non-developers. It supports syntax highlighting for Java, C, C++, Perl, SQL, JSP, HTML, etc., and tooling for Java. Many UI features are configurable.
    Downloads: 0 This Week
  • 23
    PODR is a PHP mail-merging and converting library mostly designed to parse and convert ODT templates to DOC/PDF. Templating is based on Savant; conversion uses a JODConverter web service. A filter is available to include runtime-generated images.
    Downloads: 0 This Week
  • 24
    A stand-alone editor using MediaWiki markup language to generate HTML code. You can create and preview pages written using MediaWiki markup (i.e. Wikipedia pages) while off-line.
    Downloads: 0 This Week
  • 25
    HTMLtools includes several Java HTML tools for preparing Web pages. The HTMLtools program automates batch conversion of tab-delimited spreadsheet text files to HTML Web-page files, file & table editing, keyword mapping, templates, and more.
    Downloads: 0 This Week