Web data extraction (web data mining, web scraping) tool. It leverages well-proven XML and text processing technologies to easily extract useful data from arbitrary web pages.
Writer2LaTeX and Writer2xhtml are a collection of converters from OpenDocument Format (ODF) to LaTeX/BibTeX, HTML+MathML and EPUB.
They are delivered as a standalone Java library, as a command-line application, and as extensions for LibreOffice.
Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
A Java-based parser for parsing or grabbing web sites and other text or XML documents, based on a nondeterministic parser language and producing XML output. It also contains a few utility classes for HTML, CSV and text parsing, and additional character sets.
A Java HTML picker - text extractor
Picks up text from a web page using an HTML template. Useful if you regularly extract data from the same site. You may use the same URL each time or build URLs with parameters. These parameters are fetched from a text file.
This library contains utility classes such as a converter from plain text to HTML (for safe inclusion of user-supplied text in web pages, avoiding XSS attacks, etc.), converters from binary to hex representation, and similar functions.
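The two conversions mentioned above can be illustrated with a minimal sketch. The class and method names here (`TextUtilSketch`, `escapeHtml`, `toHex`) are hypothetical and are not taken from the library itself; the code only shows the general technique of escaping HTML-significant characters and rendering bytes as hexadecimal.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the API of the library described above.
final class TextUtilSketch {

    // Replace the characters that are significant in HTML with entities,
    // so user-supplied text cannot introduce markup.
    static String escapeHtml(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#39;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Render each byte as two lowercase hexadecimal digits.
    static String toHex(byte[] data) {
        StringBuilder out = new StringBuilder(data.length * 2);
        for (byte b : data) {
            out.append(Character.forDigit((b >> 4) & 0xF, 16));
            out.append(Character.forDigit(b & 0xF, 16));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeHtml("<script>alert(\"x\")</script>"));
        System.out.println(toHex("Hi".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

Escaping like this is appropriate for text placed in element content or quoted attribute values; other contexts (URLs, inline scripts) would need different handling.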
Java browser and WYSIWYG/source editor for HTML SFI (Structure-Fragment-Identifier) files: 1) dynamically creates a table of contents; 2) provides a one-to-one mapping between the table of contents and the browser or editors; 3) indexes the words; 4) can be used by developers for their help systems.
A stand-alone editor using the MediaWiki markup language to generate HTML code. You can create and preview pages written in MediaWiki markup (i.e. Wikipedia pages) while offline.
htmlCharset is a file conversion tool, useful for replacing HTML entities with the actual characters that they represent, or vice versa. As a spin-off, it can also be used as a general charset converter for arbitrary text files.
JLoom is a JSP-like template language for text generation, e.g. source code, HTML, XML. JLoom templates are modular and encapsulated. Parameters can be of any Java type, including generics and varargs. There is a plugin for Eclipse and a command-line tool.
An embeddable WYSIWYG HTML editor for Java Swing. It is based on the standard Swing JEditorPane component and provides a rich set of editing features, including paragraph and inline styling, inserting links and images, find/replace functions, etc.
Use Xilize to create XHTML pages or entire websites with just a plain-text editor. The markup is similar to Textile and extensible via BeanShell. Run as a jEdit plugin, from the command line, or embed in a Java program. Small, fast, easy-to-use.
Strip out useless tags and other junk from HTML files. Shrink files, enhance readability of HTML source, promote privacy, and clean HTML exported from Microsoft Word (MS-Word). Run HTMLStrip as-is or customize it with your own regular expressions.
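As a rough illustration of regex-based cleanup of this kind, the sketch below removes HTML comments and `<font>` tags. The two rules and the class name `HtmlStripSketch` are assumptions chosen for the example; they are not HTMLStrip's actual rule set or API, and regular expressions are not a substitute for a full HTML parser on malformed input.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not HTMLStrip's real rules or interface.
final class HtmlStripSketch {

    // Matches HTML comments, including multi-line ones.
    private static final Pattern COMMENT =
            Pattern.compile("<!--.*?-->", Pattern.DOTALL);

    // Matches opening and closing <font> tags, keeping the enclosed text.
    private static final Pattern FONT_TAG =
            Pattern.compile("</?font\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    static String strip(String html) {
        String withoutComments = COMMENT.matcher(html).replaceAll("");
        return FONT_TAG.matcher(withoutComments).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("<p><font face=\"Arial\">Hi</font><!-- note --></p>"));
    }
}
```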
The International Address Formats project provides, for every country in the world, a java.text.MessageFormat-like format for producing a text (or XML/HTML-like) representation of a given address data object appropriate for a given destination country.
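To show what a MessageFormat-style address pattern looks like, here is a minimal sketch using the standard `java.text.MessageFormat` class. The country patterns, the argument order, and the class name `AddressFormatSketch` are simplified assumptions for illustration; they are not the project's actual formats or API.

```java
import java.text.MessageFormat;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; patterns are simplified assumptions.
final class AddressFormatSketch {

    // Argument positions: {0} = name, {1} = street, {2} = postal code, {3} = city.
    private static final Map<String, String> PATTERNS = new HashMap<>();
    static {
        PATTERNS.put("US", "{0}\n{1}\n{3} {2}"); // city before postal code
        PATTERNS.put("DE", "{0}\n{1}\n{2} {3}"); // postal code before city
    }

    static String format(String country, String name, String street,
                         String postalCode, String city) {
        String pattern = PATTERNS.get(country);
        if (pattern == null) {
            throw new IllegalArgumentException("No pattern for country: " + country);
        }
        // Postal codes are passed as strings so they are not formatted as numbers.
        return MessageFormat.format(pattern, name, street, postalCode, city);
    }

    public static void main(String[] args) {
        System.out.println(format("DE", "Erika Mustermann", "Heidestrasse 17", "51147", "Koeln"));
    }
}
```

Keeping the per-country layout in a pattern string means the same address object can be rendered differently by swapping only the pattern, which is the idea the project description points at.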
What is a domain-specific language for constructing web services on top of ordinary web pages, or otherwise automating web-related tasks.
Using a powerful pattern-matching sublanguage, What strives to be for XML/HTML what Perl is for text analysis.