Showing 232 open source projects for "html source extractor"

  • 1
    Cool Reader

    A cross-platform XML/CSS based eBook reader

    CoolReader is a fast, small, cross-platform XML/CSS-based eBook reader for desktops and handheld devices. Supported formats: FB2, TXT, RTF, DOC, TCR, HTML, EPUB, CHM, PDB, MOBI. Platforms: Win32, Linux, Android. Ported to some eInk-based devices.
    Downloads: 383 This Week
  • 2
    crawler4j

    Open source web crawler for Java

    crawler4j is an open source web crawler for Java that provides a simple interface for crawling the Web. With it, you can set up a multi-threaded web crawler in a few minutes. You create a crawler class that extends WebCrawler. This class decides which URLs should be crawled and handles each downloaded page. The shouldVisit function decides whether a given URL should be crawled.
    Downloads: 0 This Week
  • 3
    Web Widget Toolkit (WTK): Server-side components for easily creating web-based user interfaces with complex navigation.
    Downloads: 0 This Week
  • 4
    JuniCoder is a Java project that uses Unicode as the base for decoding and encoding formats that invented workarounds to express characters not covered by ASCII. Decoders translate those workarounds to Unicode; encoders translate Unicode back into them.
    Downloads: 0 This Week
  • 5

    StoryParser

    A set of tools and libraries to help with writing eBooks

    A set of tools and libraries (available for C# and Java) that help with writing fiction and non-fiction drafts and then producing EPUB and Kindle eBooks. Drafts written in HTML can be analyzed to help with writing, for example by generating outlines and associating scenes with keywords. When the writing is done, the tools and libraries can produce a publishable eBook, automatically generating additional material such as a table of contents and title pages.
    Downloads: 0 This Week
  • 6
    Cross-platform visual XSLT generator
    Downloads: 0 This Week
  • 7

    JHOVE

    File validation and characterization

    JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects. JHOVE should not be confused with JHOVE2, a product with similar aims but a completely separate code base.
    Downloads: 0 This Week
  • 8
    Simple-Scrape is a simple web-scraping library that allows for programmatic access to HTML code. No further techniques are needed and the library is very compact and thus easy to use.
    Downloads: 0 This Week
  • 9
    Grotag
    Grotag views AmigaGuide documents or converts them to HTML and DocBook XML. It can also validate and pretty-print such documents.
    Downloads: 1 This Week
  • 10

    eXtensible Text Framework (XTF)

    Framework for search and display of heterogeneous document collections.

    ...Please visit https://github.com/cdlib/xtf for the latest updates. Obsolete Description: The eXtensible Text Framework (XTF) is an architecture that supports searching across collections of heterogeneous textual data (XML, PDF, HTML, text, and more), and the presentation of results and documents in a highly configurable manner. Includes highly customized versions of the proven open-source components Lucene and Saxon.
    Downloads: 1 This Week
  • 11

    XmlDoclet

    A JavaDoc doclet that outputs source code structure in XML format.

    XmlDoclet is a JavaDoc doclet that outputs the source code structure of packages, classes, etc. in XML format. The XML data can then be processed by standard tools such as XSLT to produce HTML, PDF, dot graphs, etc. Technically, this is done by wrapping the classes and interfaces of the com.sun.javadoc packages in JAXB-annotated classes, which allows for easy serialization.
    Downloads: 0 This Week
  • 12

    JSONjuicer

    JSON parser and encoder

    An open-source Java library that makes encoding and decoding Java data structures to and from JSON text easy and intuitive.
    Downloads: 0 This Week
  • 13
    thymeleaf
    Thymeleaf is a Java web template engine designed for XML/XHTML/HTML5.
    Downloads: 0 This Week
  • 14
    CosmoFile

    Convert your files, edit PDF files, edit images, download files

    Looking for free software to convert your files? CosmoFile is free software for converting files to many different formats. It is simple and fast and supports many formats: PDF, HTML, JPG, PNG, ICO, SVG, XLSX, PPTX... Edit PDF files with CosmoFile: looking for free software to modify PDF documents? Sometimes you need to make minor changes to a PDF file. For instance, you may want to hide your personal phone number from a PDF...
    Downloads: 1 This Week
  • 15

    rdi-exchange-prototype

    Rights Exchange prototype from the Rights Data Integration project

    Implementation in Java of a prototypical Rights Exchange as defined in the standards of the Linked Content Coalition (http://linkedcontentcoalition.org) and the Rights Data Integration project (http://www.rdi-project.org). Shows how to parse CRF-XML data containing Creation, Rights, RightsOffer, and Party data, to pose and respond to queries of a Hub, and to convert CRF-XML data into HTML for display.
    Downloads: 0 This Week
  • 16
    Jericho HTML Parser is a Java library that allows analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 1 This Week
  • 17
    Chunk, an HTML Template Engine for Java

    Clean, powerful templates for Java

    A powerful Java Template Engine, great for building HTML or XML docs. Chunk can handle many other needs and situations as well. In-tag filters & default values, multiple snippets per file, layered themes, macros, conditional includes, localization & more.
    Downloads: 0 This Week
  • 18
    OpenSearchServer Extractor

    A RESTful/JSON web service for text and metadata extraction

    An open source RESTful web service for text and metadata extraction and analysis. oss-text-extractor supports various binary formats: word processor (doc, docx, odt, rtf), spreadsheet (xls, xlsx, ods), presentation (ppt, pptx, odp), publishing (pdf, pub), web (rss, html/xhtml), media (audio, images), and others (vsd, text).
    Downloads: 0 This Week
  • 19

    wikihtml

    Converts wikitext documents into HTML documents

    This project is an application that converts wikitext documents into HTML documents. Wiki markup or wikitext is a markup language to write documents in wiki-based systems, such as web sites powered by MediaWiki.
    Downloads: 0 This Week
  • 20
    The project is meant to create tools that will be used to manage and analyze data related to team sports. The data will include things like tournaments, matches with dates, scores and results, individual players information and statistics, etc.
    Downloads: 0 This Week
  • 21
    fsc-hippo

    A professional online exchange platform for casting supply and demand
    Downloads: 0 This Week
  • 22
    NekoHTML is a simple HTML scanner and tag balancer that enables application programmers to parse HTML documents and access the information using standard XML interfaces.
    Downloads: 0 This Week
  • 23
    OpenBEXI HTML Builder
    OpenBEXI is a WYSIWYG HTML builder using the magic of HTML5 and CSS3. By resizing, dragging, and dropping various HTML widgets, it is easy to build a web page. Text written in the DOJO editor, pictures, charts, chart-flows, Dygraphs, timelines, lists, and DOJO widgets edited in your browser look like the HTML page you are going to publish to your web site. OpenBEXI provides a powerful CSS and JavaScript editor to change on the fly the presentation and the behavior of your web...
    Downloads: 21 This Week
  • 24

    WSDLComparator

    WSDL Comparator compares two different WSDL files and gives a report

    WSDL Comparator compares two different WSDL files and gives a report. This is the initial release of the WSDL Comparator. It compares two WSDL files and their related XSD files in depth and analyzes whether the WSDL files are backward compatible. The comparison results can be viewed in the application itself and in an HTML view.
    Downloads: 0 This Week
  • 25
    Observation Manager
    Java-based astronomical logging software that stores its data in a free and open XML-based format (OpenAstronomyLog). Discontinued project: please check out the fork at https://github.com/capape/observation-manager for an updated version.
    Downloads: 3 This Week