HTMLtools includes several Java HTML tools for preparing Web pages. The HTMLtools program automates batch conversion of tab-delimited spreadsheet text files to HTML Web-page files, file & table editing, keyword mapping, templates, and more.
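The tab-delimited-to-HTML conversion mentioned above can be illustrated with a minimal sketch. This is not HTMLtools code (HTMLtools is written in Java); the function name `tsv_to_html_table` is hypothetical, and the sketch assumes the first row of the input is a header row.

```python
import csv
import html
import io


def tsv_to_html_table(tsv_text: str) -> str:
    """Convert tab-delimited text to an HTML table (first row = header).

    Hypothetical helper for illustration; not part of HTMLtools.
    """
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    if not rows:
        return "<table></table>"
    header, *body = rows
    lines = ["<table>"]
    # Escape every cell so characters such as & and < stay valid HTML.
    lines.append("  <tr>" + "".join(f"<th>{html.escape(c)}</th>" for c in header) + "</tr>")
    for row in body:
        lines.append("  <tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in row) + "</tr>")
    lines.append("</table>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(tsv_to_html_table("name\tqty\nA&B\t3\n"))
```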
PDML is an informal markup language, implemented in PHP, that is similar to HTML. It allows the creation of complex PDF documents and can also be used together with PHP to define templates that generate dynamic PDF documents.
Shared Questionnaire System (SQS) is a fully functional Optical Mark Reader (OMR) form processing system implemented in Java Swing, XSL-FO, and AJAX, with straightforward GUIs. It aims to develop a social platform for sharing knowledge about questionnaires.
Provides a robust and efficient implementation of n-gram based classifiers for Java. N-gram algorithms have been shown to be surprisingly good at tasks like guessing the language or encoding of an arbitrary text file, and there are many more applications.
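As a rough illustration of n-gram based language guessing, the sketch below builds character trigram profiles and picks the label whose profile has the highest cosine similarity to the sample. It is not this project's code or API (the project targets Java); the function names and the tiny training sentences are assumptions made for the example, and a real classifier would be trained on far larger corpora.

```python
import math
from collections import Counter


def ngram_profile(text: str, n: int = 3) -> Counter:
    """Count character n-grams, padding with spaces to capture word edges."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def classify(text: str, profiles: dict) -> str:
    """Return the label whose profile is most similar to the text."""
    sample = ngram_profile(text)
    return max(profiles, key=lambda label: cosine(sample, profiles[label]))


# Toy training data, assumed for illustration only.
TRAINING = {
    "english": "the quick brown fox jumps over the lazy dog and then "
               "the cat sat on the mat with the other animals",
    "german": "der schnelle braune fuchs springt über den faulen hund und "
              "die katze sitzt auf der matte mit den anderen tieren",
}
PROFILES = {label: ngram_profile(text) for label, text in TRAINING.items()}

if __name__ == "__main__":
    print(classify("the dog and the cat", PROFILES))
    print(classify("der hund und die katze", PROFILES))
```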
OCR C++ library. Includes: contour recognition; vectorisation; matrix letter feature recognition; automatic page segmentation and rotation detection; SS3 ASM core; XML base; web-based GUI; 99.6% printed Unicode text recognition; letter base of up to 1200 letters.
WebSHi (Web Syntax Highlighter) is just another code syntax highlighting engine. Though it's written in JavaScript, WebSHi is very fast and scalable, and it can process 30,000+ lines of code in seconds even on slower browsers like IE6.
A pure Java office suite that is compatible with the MS file formats. Since it doesn't require native libraries, it can be loaded in any browser and on any platform. Notably, it promotes Enterprise 2.0 by combining uEngine BPM and web office.
PHPSCH is a PHP-based source code syntax highlighter. It supports PHP, C++, C#, Java, Delphi, CSS, HTML, and MySQL. You can replace the default syntax colors with your own. Project page: www.ibonette.com/phpsch-source-code-highlighter-kaynak-kod-renklendirici
Since Azeri Turkish is written in different alphabets throughout the world, this project aims to convert texts between some of the most widely used alphabets. Specifically, conversion between the Arabic and Latin alphabets is intended.
Likhon is a context-sensitive input method (transliterator) for natural languages. It is designed to analyze the pattern and context of input character sequences and generate output characters based on a predefined map script.
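Map-driven transliteration of this kind can be sketched as a greedy longest-match over the input. This is a simplification and an assumption, not Likhon's actual algorithm or map-script format: the function `transliterate` and the toy Latin-to-Cyrillic table `TOY_MAP` are hypothetical, and the only "context" handled here is preferring a longer key (such as "sh") over its shorter prefix ("s").

```python
def transliterate(text: str, mapping: dict) -> str:
    """Greedy longest-match transliteration using a key -> output map.

    Characters with no matching key are copied through unchanged.
    The mapping must be non-empty.
    """
    max_len = max(map(len, mapping))
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so multi-character keys win.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in mapping:
                out.append(mapping[chunk])
                i += size
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


# Toy map, assumed for illustration only.
TOY_MAP = {"sh": "ш", "s": "с", "h": "х", "a": "а"}

if __name__ == "__main__":
    print(transliterate("sasha", TOY_MAP))
```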
MOStlyCE is a "What You See Is What You Get" (WYSIWYG) editor for the open source Mia Content Management System (MiaCMS). MOStlyCE aims to bring simple yet powerful HTML editing capabilities into the hands of the average user.
JCopist is a template-based document generation server based on OpenOffice.org.
Its templates are regular OpenDocuments enhanced with the FreeMarker scripting language.
A wide range of formats is available, e.g. ODT, PDF, RTF, HTML, MS Word, MS Excel.
oEdtk is an open source project for automated print processing.
It's a toolkit for building applications that prepare flat-file data for mass printing of documents.
This project has moved to http://savannah.gnu.org/projects/libiconv/. This library provides an iconv() implementation for use on systems which don't have one, or whose implementation cannot convert from/to Unicode.
RexRex is a regexp matching engine based on automata theory principles (a so-called DFA engine), implemented in ANSI C. Currently it supports (, ), *, +, ?, character classes ([], \w, \s, etc.), and escaping.
FWIW, Rex is "king" in Latin, so it's RexKing :)
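For readers unfamiliar with DFA-based matching, here is a minimal sketch of the idea: the pattern is represented as a transition table, and matching consumes each input character exactly once with no backtracking. This is not RexRex code (RexRex is written in ANSI C). The table below is hand-built for the pattern (a|b)*abb as an assumption for illustration, whereas a real engine would compile the table from the pattern.

```python
# Hand-built DFA for the pattern (a|b)*abb; states are integers.
# A real DFA engine would generate this table from the regexp.
TRANSITIONS = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 3},
    3: {"a": 1, "b": 0},
}
START_STATE = 0
ACCEPTING = {3}


def dfa_match(text: str) -> bool:
    """Return True if the whole text is accepted by the DFA."""
    state = START_STATE
    for ch in text:
        next_state = TRANSITIONS[state].get(ch)
        if next_state is None:
            # No transition for this character: reject immediately.
            return False
        state = next_state
    return state in ACCEPTING


if __name__ == "__main__":
    for sample in ["abb", "aababb", "ab", "abba", "abc"]:
        print(sample, dfa_match(sample))
```

Because each character triggers exactly one table lookup, matching time grows linearly with the input length, which is the usual reason for choosing a DFA design.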
The DITA Open Platform is a free, open-source project whose goal is to provide an enterprise platform for the editing, management, and processing of DITA documents.