Writer2LaTeX and Writer2xhtml are a collection of converters from OpenDocument Format (ODF) to LaTeX/BibTeX, HTML+MathML and EPUB.
They are delivered as a standalone Java library, as a command-line application and as extensions for LibreOffice.
JHOVE provides functions to perform format-specific identification, validation, and characterization of digital objects. JHOVE should not be confused with JHOVE2, a product with similar aims but a completely separate code base.
A powerful Java Template Engine, great for building HTML or XML docs. Chunk can handle many other needs and situations as well. In-tag filters & default values, multiple snippets per file, layered themes, macros, conditional includes, localization & more.
A Java HTML picker and text extractor
Picks up text from a web page using an HTML template. Useful if you regularly extract data from the same site. You may use the same URL each time, or build URLs with parameters. These parameters are fetched from a text file.
VNC for use with the BrowserMob Selenium JavaScript Validator. This tool is made available for users of BrowserMob FREE Website Monitoring and Load Testing. The BrowserMob Local Validation Service can be downloaded from https://browsermob.com/tools.
A toolkit for functional and non-functional (load and performance) HTTP testing, based on Jython and The Grinder (http://grinder.sf.net). It includes capabilities to support SOA services, REST, JSON/XML encoding, AES and WS security, plus a stub to collect requests.
No Latte is an interpreter for a variation of the Latte language (cf. http://www.latte.org/) for writing XHTML documents in a functional-programming style: LaTeX sensibilities with LISP semantics.
SYMPLiK RANGEHOOD is a Javadoc-like tool for Oracle databases. This pure-Java program "sucks up" the data dictionary and object source code from a database and generates documentation for Tables, Views, Triggers, Packages, Procedures, Functions, and others.
A stand-alone editor that uses the MediaWiki markup language to generate HTML code. You can create and preview pages written in MediaWiki markup (i.e. Wikipedia pages) while offline.
Converts HTML pages into LaTeX format. Custom mappings between HTML tags and character entities can be defined. CSS formatting properties are also supported (including colours). Implemented in Java.
HTMLtools includes several Java HTML tools for preparing Web pages. The HTMLtools program automates batch conversion of tab-delimited spreadsheet text files to HTML Web-page files, file & table editing, keyword mapping, templates, and more.
Goldify is a set of tools for automatically adding links to electronic documents. Its main purpose is to add links that point to the IUPAC GoldBook (http://goldbook.iupac.org).
This is a command-line, Java-based utility that combines many CSS files linked via the @import declaration into one, in order to minimize the number of HTTP requests a website makes for CSS files.
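The idea of flattening @import chains can be sketched in a few lines of Java. This is an illustrative sketch only, not the utility's own code: the class name `CssImportInliner`, the `inline` method, and the in-memory map of stylesheet names to contents are all assumptions made for the example. It handles only the simple `@import "file";` and `@import url(file);` forms, without media queries.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: replaces each @import with the imported sheet's text.
final class CssImportInliner {
    // Matches @import "a.css"; and @import url(a.css); (quotes optional).
    // Imports followed by a media query are deliberately not matched.
    private static final Pattern IMPORT = Pattern.compile(
        "@import\\s+(?:url\\(\\s*)?[\"']?([^\"')\\s;]+)[\"']?\\s*\\)?\\s*;");

    static String inline(String name, Map<String, String> sheets) {
        return inline(name, sheets, new HashSet<>());
    }

    private static String inline(String name, Map<String, String> sheets,
                                 Set<String> seen) {
        // A sheet is emitted once; repeats and import cycles yield nothing.
        if (!seen.add(name)) {
            return "";
        }
        String css = sheets.get(name);
        if (css == null) {
            throw new IllegalArgumentException("Unknown stylesheet: " + name);
        }
        Matcher m = IMPORT.matcher(css);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(css, last, m.start());              // text before the import
            out.append(inline(m.group(1), sheets, seen));  // imported sheet, recursively
            last = m.end();
        }
        out.append(css, last, css.length());               // text after the last import
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> sheets = Map.of(
            "main.css", "@import \"base.css\";\nbody{color:red}",
            "base.css", "p{margin:0}");
        System.out.println(inline("main.css", sheets));
    }
}
```

A real tool would read the files from disk and resolve each import path relative to the importing file; the map stands in for that step so the sketch stays self-contained.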
Relaxed is an HTML validation application, together with XHTML 1.0 / HTML 4.01 and WCAG 1.0 schema definitions written in RELAX NG with embedded Schematron. These expressive languages allow automated validation of many additional restrictions that a DTD cannot express.
htmlCharset is a file conversion tool, useful for replacing HTML entities by the actual characters that they represent, or vice versa. As a spin-off, it can also be used as a general charset converter for arbitrary text files.
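As a rough illustration of the two directions described here, the following Java sketch decodes entities to characters and encodes non-ASCII characters back to numeric references. It is not htmlCharset's implementation: the class name `EntityCodec`, its methods, and the deliberately tiny table of named entities are assumptions for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper showing entity <-> character conversion.
final class EntityCodec {
    // Small illustrative subset; the full HTML entity table is much larger.
    private static final Map<String, Integer> NAMED = new HashMap<>();
    static {
        NAMED.put("amp", 0x26);
        NAMED.put("lt", 0x3C);
        NAMED.put("gt", 0x3E);
        NAMED.put("quot", 0x22);
        NAMED.put("nbsp", 0xA0);
        NAMED.put("copy", 0xA9);
        NAMED.put("eacute", 0xE9);
        NAMED.put("uuml", 0xFC);
    }

    // &#xHH; (hex), &#DD; (decimal), or &name;
    private static final Pattern ENTITY =
        Pattern.compile("&(#x[0-9A-Fa-f]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    // Returns the code point for an entity body, or null if unknown or invalid.
    private static Integer codePointOf(String body) {
        try {
            Integer cp;
            if (body.startsWith("#x")) {
                cp = Integer.parseInt(body.substring(2), 16);
            } else if (body.startsWith("#")) {
                cp = Integer.parseInt(body.substring(1));
            } else {
                cp = NAMED.get(body);
            }
            return (cp != null && Character.isValidCodePoint(cp)) ? cp : null;
        } catch (NumberFormatException e) {
            return null; // number too large to be a code point
        }
    }

    // Replaces known entities with their characters; unknown ones stay as-is.
    static String decode(String text) {
        Matcher m = ENTITY.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start());
            Integer cp = codePointOf(m.group(1));
            if (cp == null) {
                out.append(m.group());
            } else {
                out.appendCodePoint(cp);
            }
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    // Replaces every non-ASCII character with a decimal numeric reference.
    // ASCII, including '&' and '<', is left untouched: this converts the
    // character set, it does not escape markup.
    static String encode(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.appendCodePoint(cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("caf&eacute; &#169; &#xFC;"));
        System.out.println(encode("caf\u00E9"));
    }
}
```

Numeric references are used on the encoding side because they need no lookup table and are valid in any HTML or XML document.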
XHTML Doclet is a standards-compliant alternative to the Javadoc standard HTML doclet. It revises the document structure to exclude outdated tags and inline styles, creates valid XHTML markup, and provides better hooks for more flexible CSS manipulation.