Showing 514 open source projects for "html source extractor"

  • 1
    jsoup

    Java library for working with real-world HTML

    jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do. jsoup is designed to deal with all varieties of HTML found in the wild; from pristine and validating, to invalid tag-soup; jsoup will create a sensible parse tree. The parser will make...
    Downloads: 0 This Week
  • 2
    Asciidoc Editor based on JavaFX 20

    Asciidoc Editor and Toolchain written with JavaFX 19

    AsciidocFX is a WYSIWYG editor for the Asciidoc markup language. You can build PDF, EPUB, and HTML books, documents, and slides. Supported Operating Systems and Builds shows the list of available builds with links for reference. If you are looking for the very latest version, visit the link in the note above to download it. AsciidocFX converts documents via the AsciidoctorJ library.
    Downloads: 9 This Week
  • 3
    Hawtio

    Hawtio web console helps you manage your JVM stuff and stay cool

    Hawtio is a lightweight and modular Web console for managing Java applications. Hawtio has plugins such as: Apache Camel and JMX (Logs, Spring Boot, Quartz, and more will be provided soon). You can dynamically extend Hawtio with your own plugins or automatically discover plugins inside the JVM. The only server-side dependency (other than the static HTML/CSS/JS/images) is the excellent Jolokia library which has a small footprint (around 300KB) and is available as a JVM agent or comes embedded...
    Downloads: 10 This Week
  • 4
    commonmark-java

    Java library for parsing and rendering CommonMark (Markdown)

    Java library for parsing and rendering Markdown text according to the CommonMark specification (and some extensions). Provides classes for parsing input to an abstract syntax tree of nodes (AST), visiting and manipulating nodes, and rendering to HTML. It started out as a port of commonmark.js, but has since evolved into a full library with a nice API.
    Downloads: 0 This Week
  • 5
    JBake

    Java based open source static site/blog generator for developers

    JBake is a Java-based, open source, static site/blog generator for developers & designers. The project uses Gradle 4.9+ as the build system. The Gradle Checkstyle plugin is configured to run with the check task; it does not break the build if convention violations are found, but prints a warning and generates a report. Source available on GitHub, licensed under the MIT License.
    Downloads: 2 This Week
  • 6
    WebMagic

    A scalable web crawler framework for Java

    WebMagic is a simple but scalable crawler framework. It covers the whole crawler lifecycle: downloading, URL management, content extraction, and persistence. It can simplify the development of a specific crawler. WebMagic has a simple core with high flexibility and a simple API for HTML extraction. It also provides annotations with POJOs to customize a crawler, with no configuration needed. Some other...
    Downloads: 7 This Week
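
The four lifecycle stages named in the WebMagic description (downloading, URL management, content extraction, persistence) can be pictured with a toy crawler over an in-memory map of pages. This is only a sketch under stated assumptions: it uses the Java standard library alone, and the class name `CrawlerLifecycleSketch`, the `crawl` method, and the regex-based link extraction are hypothetical illustrations, not WebMagic's API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy crawler; none of these names come from WebMagic.
public class CrawlerLifecycleSketch {
    // Matches href="..." values; adequate for the toy pages below, not for real-world HTML.
    private static final Pattern HREF = Pattern.compile("href=\"([^\"]+)\"");

    // Runs the four stages over an in-memory "web" (URL -> HTML) and
    // returns the URLs that were fetched, in visiting order.
    public static List<String> crawl(Map<String, String> web, String startUrl) {
        Deque<String> frontier = new ArrayDeque<>(); // URL management: pending URLs
        Set<String> seen = new HashSet<>();          // URL management: de-duplication
        List<String> store = new ArrayList<>();      // persistence: stand-in for a file or database

        frontier.add(startUrl);
        seen.add(startUrl);
        while (!frontier.isEmpty()) {
            String url = frontier.poll();
            String html = web.get(url);              // downloading: a map lookup instead of HTTP
            if (html == null) {
                continue;                            // page not found, nothing to extract
            }
            store.add(url);                          // persistence
            Matcher m = HREF.matcher(html);          // content extraction: outgoing links
            while (m.find()) {
                String link = m.group(1);
                if (seen.add(link)) {                // enqueue each link only once
                    frontier.add(link);
                }
            }
        }
        return store;
    }

    public static void main(String[] args) {
        Map<String, String> web = Map.of(
            "/a", "<a href=\"/b\">b</a> <a href=\"/c\">c</a>",
            "/b", "<a href=\"/a\">back</a>",
            "/c", "<p>no links</p>");
        System.out.println(crawl(web, "/a")); // prints [/a, /b, /c]
    }
}
```

A real framework would replace each stand-in with a pluggable component (an HTTP downloader, a scheduler, a selector-based extractor, a storage pipeline); the loop structure is the part this sketch is meant to show.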
  • 7
    Acode

    A powerful text/code editor for Android

    ...Acode lets you build and run websites right in your browser, debug with ease using the built-in console, and edit a wide range of source files from Python and CSS to Java, JavaScript, Dart, and more.
    Downloads: 63 This Week
  • 8
    Java Tablesaw

    Java dataframe and visualization library

    Tablesaw is a dataframe and visualization library that supports loading, cleaning, transforming, filtering, and summarizing data. If you work with data in Java, it may save you time and effort. Tablesaw also supports descriptive statistics and can be used to prepare data for working with machine learning libraries like Smile, Tribuo, H2O.ai, and DL4J. Import data from RDBMS, Excel, CSV, TSV, JSON, HTML, or fixed-width text files, whether they are local or remote (HTTP, S3, etc.). Tablesaw...
    Downloads: 7 This Week
  • 9
    Apache Log4j

    Apache Log4j 2 is a versatile, feature-rich, efficient logging API

    Apache Log4j is a versatile, industrial-grade Java logging framework composed of an API, its implementation, and components to assist the deployment for various use cases. Log4j is used by 8% of the Maven ecosystem and listed as one of the top 100 critical open source software projects. The project is actively maintained by a team of several volunteers and supported by a big community.
    Downloads: 17 This Week
  • 10
    thymeleaf

    Thymeleaf is a modern server-side Java template engine for web

    Thymeleaf is a modern server-side Java template engine for both web and standalone environments. Thymeleaf's main goal is to bring elegant natural templates to your development workflow, HTML that can be correctly displayed in browsers and also work as static prototypes, allowing for stronger collaboration in development teams. With modules for Spring Framework, a host of integrations with your favorite tools, and the ability to plug in your own functionality, Thymeleaf is ideal for...
    Downloads: 0 This Week
  • 11
    capacitor

    Build cross-platform native progressive web apps for iOS and Android

    Capacitor is an open source native runtime for building Web Native apps. Create cross-platform iOS, Android, and Progressive Web Apps with JavaScript, HTML, and CSS. Capacitor’s native plugin APIs make it extremely easy to access and invoke common device functionality across multiple platforms. Build web-based applications that run equally well across iOS, Android, and as Progressive Web Apps.
    Downloads: 42 This Week
  • 12
    Kryo

    Java binary serialization and cloning, fast, efficient, automatic

    Kryo is a fast and efficient binary object graph serialization framework for Java. The goals of the project are high speed, low size, and an easy-to-use API. The project is useful any time objects need to be persisted, whether to a file, database or over the network. Kryo can also perform automatic deep and shallow copying/cloning. This is direct copying from object to object, not object to bytes to object. Kryo has three sets of methods for reading and writing objects. If the concrete class...
    Downloads: 4 This Week
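
The Kryo description distinguishes shallow from deep copying and notes that the copy goes from object to object rather than through bytes. The difference can be illustrated in plain Java. This is a sketch under stated assumptions: it does not use Kryo, and the `CopySketch` class, its `Order` type, and the two copy methods are hypothetical names written by hand for this one type, whereas a library would do the equivalent automatically.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration of shallow vs. deep copying; this is not Kryo's API.
public class CopySketch {
    // Hypothetical example type: a named order holding a mutable list of items.
    public static final class Order {
        public final String name;
        public final List<String> items;

        public Order(String name, List<String> items) {
            this.name = name;
            this.items = items;
        }
    }

    // Shallow copy: a new Order whose fields reference the same objects as the original.
    public static Order shallowCopy(Order o) {
        return new Order(o.name, o.items);
    }

    // Deep copy: a new Order with its own list, built object to object
    // without serializing to bytes first. Strings are immutable, so the
    // list elements themselves can be shared safely.
    public static Order deepCopy(Order o) {
        return new Order(o.name, new ArrayList<>(o.items));
    }

    public static void main(String[] args) {
        Order original = new Order("o-1", new ArrayList<>(List.of("pen")));
        Order shallow = shallowCopy(original);
        Order deep = deepCopy(original);

        original.items.add("ink"); // mutate the original after copying

        System.out.println(shallow.items); // prints [pen, ink]: shares the original's list
        System.out.println(deep.items);    // prints [pen]: has an independent list
    }
}
```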
  • 13
    springdoc-openapi

    Library for OpenAPI 3 with spring-boot

    Extended support for the springdoc-openapi v1 project is now available for organizations that need support beyond 2023. The springdoc-openapi Java library helps automate the generation of API documentation for Spring Boot projects. springdoc-openapi works by examining an application at runtime to infer API semantics based on Spring configurations, class structure, and various annotations. The library automatically generates documentation in JSON/YAML and HTML formatted pages. The generated...
    Downloads: 3 This Week
  • 14
    Karate

    Test automation made simple

    Karate is the only open-source tool to combine API test-automation, mocks, performance-testing and even UI automation into a single, unified framework. The BDD syntax popularized by Cucumber is language-neutral, and easy for even non-programmers. Assertions and HTML reports are built-in, and you can run tests in parallel for speed. There’s also a cross-platform stand-alone executable for teams not comfortable with Java.
    Downloads: 11 This Week
  • 15
    java-pdf-table-extractor-lib

    Java Pdf Table extraction library

    The command line application is an example of usage of the Java library. The library is based on the PDFBox library and works by examining the layout of each selected PDF page and looking for table structure patterns. After calling the library (passing the PDF filename and the page range), the result is a List<PdfTextElement>. PdfTextElement is an interface that has two implementations: a basic text element (outside the tables), and PdfTextTabulaElement, for table structures. That...
    Downloads: 1 This Week
  • 16
    OpenAPI Generator

    OpenAPI Generator allows generation of API client libraries

    With 50+ client generators, you can easily generate code to interact with any server which exposes an OpenAPI document. Maintainers of APIs may also automatically generate and distribute clients as part of official SDKs. Each client supports different options and features, but all templates can be replaced with your own Mustache-based templates. Getting started with server development can be tough, especially if you're evaluating technologies. We can reduce the burden when you bring your own...
    Downloads: 9 This Week
  • 17
    JeecgBoot

    Low-code enterprise web development platform

    JeecgBoot is a low-code platform built on Spring Boot that accelerates enterprise application development with online forms, code generation, and a modern Vue-based frontend. It can generate CRUD screens, data dictionaries, and menu structures from database schemas, producing clean starter code that developers can extend. The platform integrates common enterprise features—RBAC permissions, data scopes, dictionary management, logging, and file/OSS integration—so teams start from a...
    Downloads: 4 This Week
  • 18
    Schema Spy

    SchemaSpy code home

    This is a new code repository for the SchemaSpy tool, initially created and maintained by John Currier. I personally believe that work on SchemaSpy should be continued, and that many still-open issues should be resolved. The last released version of SchemaSpy was in 2010, and I plan to change this. Installation is very simple because SchemaSpy is a single Java .jar application. To learn more, read the installation doc. When your environment is ready, you can start...
    Downloads: 1 This Week
  • 19
    picocli

    Framework for building GraalVM-enabled command line apps

    Picocli is a one-file framework for creating Java command-line applications with almost zero code. It supports a variety of command-line syntax styles including POSIX, GNU, MS-DOS and more. It generates highly customizable usage help messages that use ANSI colors and styles to contrast important elements and reduce the cognitive load on the user. Picocli-based applications can have command line TAB completion showing available options, option parameters, and subcommands, for any level of...
    Downloads: 1 This Week
  • 20
    Swagger Codegen

    Template-driven engine to generate documentation

    Swagger-Codegen contains a template-driven engine to generate documentation, API clients and server stubs in different languages by parsing your OpenAPI / Swagger definition. Simplify API development for users, teams, and enterprises with the Swagger open source and professional toolset. Find out how Swagger can help you design and document your APIs at scale. The power of Swagger tools starts with the OpenAPI Specification, the industry standard for RESTful API design. Individual tools to...
    Downloads: 1 This Week
  • 21
    JasperReports Library

    Free Java Reporting Library

    JasperReports Library is the world's most popular open source business intelligence and reporting engine. It is entirely written in Java and it is able to use data coming from any kind of data source and produce pixel-perfect documents that can be viewed, printed or exported in a variety of document formats including HTML, PDF, Excel, OpenOffice and Word. The project is also available at: https://github.com/TIBCOSoftware/jasperreports Jaspersoft Studio is the open source report designer for the JasperReports Library. ...
    Leader badge
    Downloads: 1,588 This Week
  • 22

    HtmlUnit

    Java GUI-Less browser, supporting JavaScript, to run against web pages

    A Java GUI-less browser that allows high-level manipulation of web pages, such as filling in forms and clicking links; just getPage(url), find a hyperlink, click(), and all the HTML, JavaScript, and Ajax are automatically processed.
    Leader badge
    Downloads: 37 This Week
  • 23

    DocJGenerator

    Wiki generator and Java Help System

    Generates a wiki (interlinked HTML files) from a set of XML-formatted files. It can also add a help system to a Swing or JavaFX application. It is also possible to generate a PDF, Word (docx), or EPUB document rather than a wiki. The tool provides a visual editor for the wiki, and the project supports both MediaWiki and Markdown syntax.
    Downloads: 57 This Week
  • 24
    DownSmith Markdown Editor

    A powerful, feature-rich Markdown editor with real-time HTML preview.

    DownSmith provides an intuitive editing experience with comprehensive formatting tools, syntax highlighting, live preview, table creation, spell checking, footnotes, HTML export, and intelligent image handling. It runs on Windows without Java installed. On macOS and Linux, it requires Java 11 or later. A Java 8 version is provided that has all the functionality of the Java 11 version except footnotes.
    Downloads: 7 This Week
  • 25
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Leader badge
    Downloads: 1,627 This Week