Showing 396 open source projects for "java html parser"

View related business solutions
  • Secure remote access solution to your private network, in the cloud or on-prem. Icon
    Secure remote access solution to your private network, in the cloud or on-prem.

    Deliver secure remote access with OpenVPN.

    OpenVPN is here to bring simple, flexible, and cost-effective secure remote access to companies of all sizes, regardless of where their resources are located.
    Get started — no credit card required.
  • Top-Rated Free CRM Software Icon
    Top-Rated Free CRM Software

    216,000+ customers in over 135 countries grow their businesses with HubSpot

    HubSpot is an AI-powered customer platform with all the software, integrations, and resources you need to connect your marketing, sales, and customer service. HubSpot's connected platform enables you to grow your business faster by focusing on what matters most: your customers.
    Get started free
  • 1
    html-metadata

    html-metadata

    MetaData html scraper and parser for Node.js (supports Promises

    The aim of this library is to be a comprehensive source for extracting all HTML-embedded metadata. Currently, it supports Schema.org microdata using a third-party library, a native BEPress, Dublin Core, Highwire Press, JSON-LD, Open Graph, Twitter, EPrints, PRISM, and COinS implementation, and some general metadata that doesn't belong to a particular standard (for instance, the content of the title tag, or meta description tags). Planned is support for RDFa, AGLS, and other yet unheard...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    jsoup

    jsoup

    Java library for working with real-world HTML

    jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make every...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    WebMagic

    WebMagic

    A scalable web crawler framework for Java

    WebMagic is a scalable crawler framework. It covers the whole lifecycle of crawler, downloading, url management, content extraction and persistent. It can simplify the development of a specific crawler. WebMagic is a simple but scalable crawler framework. You can develop a crawler easily based on it. WebMagic has a simple core with high flexibility, a simple API for html extracting. It also provides annotation with POJO to customize a crawler, and no configuration is needed. Some other features...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    The CSS Parser is implemented as a package of Java classes, that inputs Cascading Style Sheets source text and outputs a Document Object Model Level 2 Style tree. Alternatively, applications can use SAC: The Simple API for CSS. Its purpose is to allow developers working with Java to incorporate Cascading Style Sheet information, primarily in conjunction with XML application developments. The CSS Parser package includes parsers for: * Cascading Style Sheets Level 3, * Cascading...
    Downloads: 9 This Week
    Last Update:
    See Project
  • Red Hat Enterprise Linux on Microsoft Azure Icon
    Red Hat Enterprise Linux on Microsoft Azure

    Deploy Red Hat Enterprise Linux on Microsoft Azure for a secure, reliable, and scalable cloud environment, fully integrated with Microsoft services.

    Red Hat Enterprise Linux (RHEL) on Microsoft Azure provides a secure, reliable, and flexible foundation for your cloud infrastructure. Red Hat Enterprise Linux on Microsoft Azure is ideal for enterprises seeking to enhance their cloud environment with seamless integration, consistent performance, and comprehensive support.
    Learn More
  • 5
    Lobo Evolution - Java Web Browser

    Lobo Evolution - Java Web Browser

    Lobo Evolution is an extensible all-Java web browser and RIA platform

    Lobo Evolution is a fork of Lobo Browser. The project continuing the work of Lobo Browser(lobochief). Lobo Evolution is an extensible all-Java web browser and RIA platform. It supports HTML 4, HTML5 Javascript, CSS 3 and Java (Swing) rendering. CobraEvolution is the web browser's renderer API; also a Javascript-aware HTML parser. Lobo Evolution 3.1 relesed CHANGELOG: https://github.com/LoboEvolution/LoboEvolution/releases Read wiki: https://loboevolution.github.io/LoboEvolution/project...
    Leader badge
    Downloads: 3 This Week
    Last Update:
    See Project
  • 6
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export to Excel File - Data Saved into Derby and MySQL Database - Written in Java Cross Platform Also See Free email Sender...
    Leader badge
    Downloads: 74 This Week
    Last Update:
    See Project
  • 7
    ZK - Simply Ajax and Mobile
    Ajax+Mobile Java Web framework. With 200+ Ajax components and event-driven, Ajax/RIA apps are as effortless and rich as desktop apps and HTML/XUL pages. Support JSP/JSF/JavaEE/Spring, Ajax Push and Client-fusion; also Java/Groovy/Python/JavaScript.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 8
    Web Spider, Web Crawler, Email Extractor

    Web Spider, Web Crawler, Email Extractor

    Free Extracts Emails, Phones and custom text from Web using JAVA Regex

    In Files there is WebCrawlerMySQL.jar which supports MySql Connection Please follow this link to get latest version https://sourceforge.net/projects/web-spider-web-crawler-extract/ Free Web Spider & Crawler. Extracts Information from Web by parsing millions of pages. Store data into Derby OR MySQL Database and data are not being lost after force closing the spider. - Free Web Spider , Parser, Extractor, Crawler - Extraction of Emails , Phones and Custom Text from Web - Export...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9

    DocJGenerator

    Wiki generator and Java Help System

    Allows to generate a wiki (interlinked HTML files) from a bunch of XML formatted files. It also allows to add a Help-system to a Swing or JavaFX application. Also it is also possible to generate a PDF, Word (docx), or epub document rather than a wiki. The tool also provides a visual editor to edit the wiki. The project also support both the Mediawiki and Markdown syntax.
    Downloads: 0 This Week
    Last Update:
    See Project
  • The #1 Embedded Analytics Solution for SaaS Teams. Icon
    The #1 Embedded Analytics Solution for SaaS Teams.

    Qrvey saves engineering teams time and money with a turnkey multi-tenant solution connecting your data warehouse to your SaaS application.

    Qrvey’s comprehensive embedded analytics software enables you to design more customizable analytics experiences for your end users.
    Try Developer Playground
  • 10
    FireWeb

    FireWeb

    Java Single Page Web Application Framework

    Java Single Page Application Framework. Using POJO/JavaScript/WebSocket/HTML. Without any scripting - only pure Java classes. Simply and easy to learn, use and extend.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Lots of small projects: games, VST plugins, experimental IRC server, ROM hacking tools, net tools, font tools, html tools, etc. Browse CVS!
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    JSSoup

    JSSoup

    JavaScript + BeautifulSoup = JSSoup

    I'm a fan of Python library BeautifulSoup. It's feature-rich and very easy to use. But when I am working on a small react-native project, and I tried to find a HTML parser library like BeautifulSoup, I failed. So I want to write a HTML parser library that can be so easy to use just like BeautifulSoup in Javascript. JSSoup uses tautologistics/node-htmlparser as HTML dom parser, and creates a series of BeautifulSoup like API on top of it. JSSoup supports both node and react-native. JSSoup tries...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    JDynamiTe, Dynamic Template in Java

    JDynamiTe, Dynamic Template in Java

    Dynamically generate documents from templates

    ... separation between data (content), presentation (container) and content generation code (written in Java). JDynamiTe does not include a specific template language, and it is not a complete framework. It is a simple "brick" in your software architecture, a "glue" between your data model and your presentation model. JDynamiTe is a Java package, which is designed to be flexible and open. For more details and a lot of examples, visit the homepage here: http://jdynamite.sourceforge.net
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    K-Framework
    The KFramework is the first integral SOFEA/SOUI framework for web based business applications using Domain Driven Design. The framework provides a web delivered SWING frontend and a WebServices based backend.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15

    multilanguageServlet

    servlet for multilanguage simple html pages

    Library for tomcat that implements a servlet to deal with multi-language html pages. There is no magic ... You have to create a template of your html, replacing language dependent texts for labels with format: /*LABEL*/ And create one properties file for every language with those LABELs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16

    jmarkdownviewer

    markdown viewer for java

    This is a Markdown viewer for java. Primarily it tries to display github styled markdown scripts. Download the release jar file and run java -jar jmdviewer-x.y.jar where x.y is the version number. You can pass the markdown filename on the command line java -jar jmdviewer-x.y.jar [filename.md]
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    CSSBox

    CSSBox

    Pure Java HTML / CSS rendering engine

    CSSBox is an (X)HTML/CSS rendering engine written in pure Java. Its primary purpose is to provide a complete information about the rendered page suitable for further processing. However, it also allows displaying the rendered document.
    Leader badge
    Downloads: 30 This Week
    Last Update:
    See Project
  • 18

    FreeMarker template engine

    Generates text that depends on changing data (like dynamic HTML).

    FreeMarker is a template engine. That is, it provides an easy way to generate text (HTML, source code, configuration files, emails, etc.) that depends on changing data. It's designed to separate the rendering/formatting logic (like visual design, HTML issues, etc.) from the backing application logic and technical complexity. It has a flexible API so you can integrate it into your application the way that best fits it.
    Downloads: 40 This Week
    Last Update:
    See Project
  • 19
    Cerberus Content Management System

    Cerberus Content Management System

    Cerberus Content Management System

    Cerberus Content Management System is a Monolithic and Modular Content Management System that is written in 100% Pure PHP code with 100% Pure HTML output, and it supports multiple Database Management Systems. Cerberus Content Management System source code is completely handwritten by the author(s). The CerberusCMS project is focused on data security and ease of use, therefore we have decided to make very little use of JavaScript in the PurePHP Releases. The still-secure, and easier...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Web Book Downloader

    Web Book Downloader

    Download websites as e-book: pdf, txt, epub.

    This application allows user to download chapters from website in 3 ways: - from table of contents; - from range: first chapter address, last chapter address; - by crawling from first chapter to n; In settings you can customize language, input(website encoding) for simplicity output is in the same encoding. If you want your language add new class into strings package, and new fields into Settings class and GUI menu(initialize method).
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    iText®, a JAVA PDF library

    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest versions...
    Leader badge
    Downloads: 457 This Week
    Last Update:
    See Project
  • 22
    crawler4j

    crawler4j

    Open source web crawler for Java

    crawler4j is an open source web crawler for Java which provides a simple interface for crawling the Web. Using it, you can setup a multi-threaded web crawler in few minutes. You need to create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles the downloaded page. shouldVisit function decides whether the given URL should be crawled or not. In the above example, this example is not allowing .css, .js and media files and only allows pages within...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data sources (as database). Can generate complete homepages (tree of HTML-s, images, etc.)
    Downloads: 11 This Week
    Last Update:
    See Project
  • 24
    OpenSearchServer Search Engine

    OpenSearchServer Search Engine

    An open source search engine with RESTFul API and crawlers

    OpenSearchServer is a powerful, enterprise-class, search engine program. Using the web user interface, the crawlers (web, file, database, etc.) and the client libraries (REST/API , Ruby, Rails, Node.js, PHP, Perl) you will be able to integrate quickly and easily advanced full-text search capabilities in your application: Full-text with basic semantic, join queries, boolean queries, facet and filter, document (PDF, Office, etc.) indexation, web scrapping,etc. OpenSearchServer runs on...
    Downloads: 21 This Week
    Last Update:
    See Project
  • 25
    Lanproxy

    Lanproxy

    Intranet penetration tool that proxies local area network computers

    Lanproxy is an intranet penetration tool that proxies local area network personal computers and servers to the public network. It supports tcp traffic forwarding and any tcp upper layer protocol (access to intranet websites, local payment interface debugging, ssh access, remote desktop, http proxy) , https proxy, socks5 proxy...). Penetration basic functions, same as the open source version, high performance, can support tens of thousands of penetration connections at the same time. Support...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next