Showing 126 open source projects for "java html parser"

  • 1
    html-to-markdown

    Convert HTML to Markdown. Even works with entire websites

    Converts HTML into Markdown with Go. It uses an HTML parser to avoid regular expressions as much as possible, which should prevent some weird edge cases and lets it handle input whose structure is totally unknown.
    Downloads: 3 This Week
  • 2
    ANTLR

    Parser generator to read, process, or translate structured text

    ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. From a grammar, ANTLR generates a parser that can build and walk parse trees. It’s widely used in academia and industry to build all sorts of languages, tools, and frameworks. Twitter search uses ANTLR for query parsing, with over 2 billion queries a day. The languages for...
    Downloads: 10 This Week
  • 3
    markdown-rs

    CommonMark compliant markdown parser in Rust with ASTs and extensions

    markdown-rs is an open-source markdown parser written in Rust. It’s implemented as a state machine (#![no_std] + alloc) that emits concrete tokens, so that every byte is accounted for, with positional info. The API then exposes this information as an AST, which is easier to work with, or it compiles directly to HTML. While most markdown parsers work towards compliance with CommonMark (or GFM), this project goes further by following how the reference parsers (cmark, cmark-gfm) work, which is confirmed with thousands of extra tests. ...
    Downloads: 0 This Week
  • 4
    Markdig

    A fast, powerful, CommonMark compliant, extensible Markdown processor

    A fast, powerful, CommonMark compliant, extensible Markdown processor for .NET. Very fast parser and HTML renderer (no regexp), very lightweight in terms of GC pressure. Abstract syntax tree with precise source code locations, useful when building a Markdown editor. Check out MarkdownEditor for Visual Studio powered by Markdig! Even the core Markdown/CommonMark parsing is pluggable, so you can disable built-in Markdown/CommonMark parsing (e.g. disable HTML parsing) or change its behavior. ...
    Downloads: 2 This Week
  • 5
    Geany

    A fast and lightweight IDE

    Geany is a powerful, stable and lightweight programmer's text editor that provides tons of useful features without bogging down your workflow. It runs on Linux, Windows and macOS, is translated into over 40 languages, and has built-in support for more than 50 programming languages.
    Downloads: 59 This Week
  • 6
    Quarkdown

    Markdown with superpowers, from ideas to papers, and presentations

    Quarkdown is a lightweight Markdown processor and static site generator written in Java. It converts Markdown files into styled HTML pages with customizable themes, supporting blog creation and simple documentation websites. Quarkdown emphasizes simplicity and speed, providing an out-of-the-box experience for minimal personal sites.
    Downloads: 0 This Week
  • 7
    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools or pip. To use this library, you need to have Java 7+ installed on your system, as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 0 This Week
  • 8
    Centaur Emacs

    A Fancy and Fast Emacs Configuration

    ...It is compatible ONLY with GNU Emacs 26.1 and above. In general you’re advised to always run the latest stable release, currently 28.2. Supports multiple programming languages: C/C++/Objective-C/C#/Java, Python/Ruby/Perl/PHP/Shell/PowerShell/Bat, JavaScript/TypeScript/JSON/YAML, HTML/CSS/XML, and Golang/Swift/Rust/Dart/Elixir.
    Downloads: 0 This Week
  • 9
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Leader badge
    Downloads: 1,988 This Week
  • 10
    EditPlus

    Text editor for Windows with built-in FTP, FTPS and sftp

    EditPlus is a lightweight text editor designed for Windows that caters to programmers, web developers, and anyone working with code or text. It offers powerful features like syntax highlighting, code folding, and a customizable interface, making it an excellent alternative to more complex Integrated Development Environments (IDEs). EditPlus supports a wide range of programming languages, including HTML, CSS, PHP, JavaScript, C++, and more. It also integrates tools for FTP, SFTP, and...
    Downloads: 15 This Week
  • 11

    RecordEditor

    Editor for fixed-width, CSV and existing XML files.

    The RecordEditor is a data file editor for flat files (delimited and fixed field position). It supports Unix / PC / legacy (e.g. mainframe) file formats, both text and binary files. The editor uses a record-layout description to format the files. This is ideal for fixed-width (text or binary) files, COBOL data files, mainframe files and complicated CSV files. COBOL copybooks can be used to format COBOL data files. As well as an editor, the following utilities are supplied * Formatted...
    Leader badge
    Downloads: 71 This Week
  • 12
    ant4docbook

    ANT4DOCBOOK is an ANT task for DOCBOOK

    ANT4DOCBOOK is an ANT task for DOCBOOK, a semantic markup language for technical documentation.
    Downloads: 0 This Week
  • 13
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, a standout C++20, AI-driven data processing tool, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types, empowering efficient text extraction, web data extraction, and document analysis. For businesses, the shift to DocWire SDK signifies a leap forward. It promises comprehensive document format support and the ability to extract valuable insights from email boxes, databases, and websites using cutting-edge AI. DocWire SDK aims to...
    Leader badge
    Downloads: 8 This Week
  • 14
    myScite

    The allRound pocket sized CodeEditor.

    .... -- Features -- - Full MinGW and GTK SDKs Autocomplete.(190+) - Do system scripting (bash, applescript, cmd, powershell, perl, j/vbscript, awk) - Examine all sorts of data files (sql, regedit, mib, xml, yaml, json, vcard ...) - Review difference and patch files - Create makefiles (gnu make / cmake) - Edit html, css and config files (with calltips) - Describe circuits in vhdl and spice. ... - And finally; read & write source code: - [ Syntax highlighted ] - go, vala, pike, swift, flash, ch, rust - [ Calltip assisted ] - c/cpp11, js&jQuery, python, php, ruby, lua, c#, java, perl --Others-- - Restructured config files with inline docs - Scriptable via lua Extension...
    Leader badge
    Downloads: 6 This Week
  • 15
    wxMEdit

    wxMEdit, Cross-platform Text/Hex Editor, Improved Version of MadEdit

    Added automatic update checks; added bookmark support; added a right-click context menu for each tab; added support for purging histories; added line selection by triple click; added a FreeBASIC syntax file; added an option to place configuration files in the %APPDATA% directory under Windows; improved Find/Replace support; improved Mac OS X support; improved system integration under Windows; improved encoding detection results; improved Hex editing support; added more...
    Leader badge
    Downloads: 185 This Week
  • 16
    Writer2LaTeX and Writer2xhtml are a collection of converters from OpenDocument Format (ODF) to LaTeX/BibTeX, HTML+MathML and EPUB. They are delivered as a standalone Java library, as a command line application and as extensions for LibreOffice.
    Leader badge
    Downloads: 31 This Week
  • 17
    Madedit-Mod

    MadEdit-Mod is a cross platform Text/Hex editor based on MadEdit

    Madedit-Mod is a cross-platform text/hex editor based on MadEdit, with a lot of critical bug fixes from me and other developers. A lot of new features were added, such as Drag-Drop Edit (cross-platform), Highlight word, etc. I maintain this project because the author of MadEdit had not worked on it for a long time, and I really like it and need more features. Find more information on the Wiki pages. Currently supported Languages: English Chinese Simplified...
    Leader badge
    Downloads: 61 This Week
  • 18
    RTextDoc

    An editor for structured documents

    RTextDoc is an editor for structured text documents such as LaTeX, AsciiDoc and DocBook. RTextDoc has proofreading capabilities: on-the-fly spelling, instant grammar checking and built-in free dictionaries. RTextDoc has syntax highlighting, bracket matching, folding, a document structure browser for sections and labels, bookmarks, a manager for LaTeX symbols, an editor for mathematical equations, an integrated BibTeX database manager and several tools to convert LaTeX to HTML and back. AsciiDoc...
    Leader badge
    Downloads: 2 This Week
  • 19
    SimplyHTML is an application and a Java component for rich text processing. It stores documents as HTML files in combination with Cascading Style Sheets (CSS). SimplyHTML is not intended to be used as an editor for web pages.
    Downloads: 3 This Week
  • 20
    XML Editor/Validator/Designer with CAMV

    CAM XML Editor for XML+JSON+Hibernate+SQL Open-XDX sponsored by Oracle

    Java/Eclipse +Saxon/XSL
    Downloads: 13 This Week
  • 21
    Leseratte is a Java parser for German written language. Currently, it contains a German lexicon (based on the Wiktionary), inflexion rules, a grammar and a parser. (Semantics component planned.) Usable as a Java library, also provides a graphical UI.
    Downloads: 0 This Week
  • 22
    remarkable

    Markdown parser, done right

    Markdown parser, done right. CommonMark support, extensions, syntax plugins, high speed, all in one. Gulp and metalsmith plugins are available. Used by Facebook, Docusaurus, and many others! Supports the CommonMark spec + syntax extensions + sugar (URL auto-linking, typographer). Configurable syntax! You can add new rules and even replace existing ones.
    Downloads: 0 This Week
  • 23
    rest-dev-vnc-docker

    Restful / SOAP API Development with common tools in VNC/noVNC Docker

    The idea is to use Docker with VNC/noVNC to aggregate all the needed and related development tools/IDEs within a single Docker container, as an agile way to quickly stand up specific collections of tools for quick computing needs. REST development (this Git repository) covers end-to-end needs: JSON/XML, REST connections, Swagger, MongoDB, testing, etc. The use cases for this kind of VNC/noVNC Docker container are limited only by your imagination and your device or network limitations. Virtually...
    Downloads: 0 This Week
  • 24
    Web Book Downloader

    Download websites as e-book: pdf, txt, epub.

    This application lets the user download chapters from a website in 3 ways: from the table of contents; from a range (first chapter address to last chapter address); or by crawling from the first chapter to chapter n. In the settings you can customize the language and the input (website) encoding; for simplicity, the output uses the same encoding. If you want to add your language, add a new class to the strings package and new fields to the Settings class and the GUI menu (initialize method).
    Downloads: 10 This Week
  • 25

    ConcatPDF

    PDF Concatenation Tool

    ConcatPDF is a tool for concatenating PDF files. It can concatenate, extract, encrypt, decrypt and configure PDF files, and convert image files to PDF. Both a GUI version and a CUI version are available. iText.NET is a port of iText to the .NET Framework using J#. This library allows you to generate PDF, (X)HTML, XML and RTF files on the Microsoft .NET Framework, including ASP.NET.
    Leader badge
    Downloads: 46 This Week