HTMLtools includes several Java HTML tools for preparing Web pages. The HTMLtools program automates batch conversion of tab-delimited spreadsheet text files to HTML Web-page files, file & table editing, keyword mapping, templates, and more.
PDML is an informal markup language, written in PHP, that is similar to HTML. It allows for the creation of complex PDF documents and can also be used in conjunction with PHP to define templates that generate dynamic PDF documents.
Shared Questionnaire System (SQS) is a fully functional Optical Mark Reader (OMR) form processing system implemented in Java Swing, XSL-FO and AJAX with straightforward GUIs. It aims to develop a social platform for sharing knowledge about questionnaires.
Provides a robust and efficient implementation of n-gram-based classifiers for Java. N-gram algorithms have been shown to be surprisingly good at tasks such as guessing the language or encoding of an arbitrary text file, and there are many more applications.
OCR C++ library. Includes: contour recognition; vectorisation; matrix letter feature recognition; automatic page segmentation and rotation detection; SS3 ASM core; XML base; web-based GUI; 99.6% printed Unicode text recognition; letter base of up to 1200 letters.
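As a rough illustration of how n-gram language guessing can work, here is a minimal Python sketch of the well-known rank-order ("out-of-place") comparison of character n-gram profiles. It is not taken from the project above; the function names, the tiny training samples, and the profile size are all assumptions made for the example.

```python
from collections import Counter


def ngram_profile(text, n=3, top=300):
    """Return the `top` most frequent character n-grams of `text`, most frequent first."""
    padded = " " + text.lower() + " "
    counts = Counter(padded[i:i + n] for i in range(len(padded) - n + 1))
    return [gram for gram, _ in counts.most_common(top)]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams absent from the language profile get a fixed penalty."""
    rank = {gram: i for i, gram in enumerate(lang_profile)}
    penalty = len(lang_profile)
    return sum(
        abs(i - rank[gram]) if gram in rank else penalty
        for i, gram in enumerate(doc_profile)
    )


def guess_language(text, profiles):
    """Pick the language whose profile is closest to the profile of `text`."""
    doc = ngram_profile(text)
    return min(profiles, key=lambda lang: out_of_place(doc, profiles[lang]))


# Hypothetical, far-too-small training samples; real profiles need much more text.
SAMPLES = {
    "english": "the quick brown fox jumps over the lazy dog and the cat sat on the mat",
    "german": "der schnelle braune fuchs springt über den faulen hund und die katze sitzt auf der matte",
}
PROFILES = {lang: ngram_profile(text) for lang, text in SAMPLES.items()}
```

Calling `guess_language(some_text, PROFILES)` returns the key of the closest profile. With samples this small the result on unseen text is unreliable, so the sketch only shows the mechanics.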
WebSHi (Web Syntax Highlighter) is just another code syntax highlighting engine. Though it's written in JavaScript, WebSHi is very fast and scalable, and can process 30,000+ lines of code in seconds even on slower browsers like IE6.
A pure Java office suite that is compatible with the MS file formats. Since it doesn't require native libraries, it can be loaded inside any browser on any platform. Notably, it promotes Enterprise 2.0 by combining uEngine BPM with a web office.
PHPSCH is a PHP-based source code syntax highlighter. It supports PHP, C++, C#, Java, Delphi, CSS, HTML, and MySQL. You can replace the default syntax colors with your own. Project page: www.ibonette.com/phpsch-source-code-highlighter-kaynak-kod-renklendirici
Since Azeri Turkish is written in different alphabets throughout the world, this project aims to convert texts between some of the most commonly used alphabets. Specifically, conversion between the Arabic and Latin alphabets is intended.
Likhon is a context-sensitive input method (transliterator) for natural languages. It is designed to analyze the pattern and context of input character sequences and generate output characters based on a predefined map script.
MOStlyCE is a "What You See Is What You Get" (WYSIWYG) editor for the open source Mia Content Management System (MiaCMS). MOStlyCE aims to bring simple, yet powerful, HTML editing capabilities into the hands of the average user.
JCopist is a template-based document generation server based on OpenOffice.org.
Its templates are regular OpenDocuments enhanced with the FreeMarker scripting language.
A wide range of formats is available, e.g. ODT, PDF, RTF, HTML, MS Word, MS Excel.
oEdtk is an open source project for automated printing processing.
It's a toolkit for building applications that prepare flat file data for massive printing of documents.
This project has MOVED to http://savannah.gnu.org/projects/libiconv/. This library provides an iconv() implementation for use on systems which don't have one, or whose implementation cannot convert from/to Unicode.
RexRex is a regexp matching engine based on automata theory principles (a so-called DFA engine), implemented in ANSI C. It currently supports (, ), *, +, ?, character classes ([], \w, \s, etc.), and escaping.
FWIW, Rex is "king" in Latin, so it's RexKing :)
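To show what matching with a DFA looks like in general, here is a minimal Python sketch of a table-driven matcher. It is not RexRex code and does not reflect its internals: the transition table is hand-built for the hypothetical pattern `a(b|c)*d`, and the names `dfa_match` and `TRANSITIONS` are assumptions made for the example. A real engine would construct the table from the pattern.

```python
def dfa_match(transitions, start, accepting, text):
    """Run a DFA over `text`; return True only if it ends in an accepting state."""
    state = start
    for ch in text:
        state = transitions.get((state, ch))
        if state is None:  # no transition defined: the input is rejected
            return False
    return state in accepting


# Hand-built DFA for the example pattern a(b|c)*d (whole-string match).
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 1,
    (1, "d"): 2,
}
START = 0
ACCEPTING = {2}
```

Each input character triggers exactly one table lookup, so matching time grows linearly with the input length and involves no backtracking.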
The DITA Open Platform is a free, open-source project whose goal is to provide an enterprise platform for the editing, management and processing of DITA documents.