A 100% Java client for the DICT protocol (RFC 2229). It provides access to lexicons, translating dictionaries, thesauri and similar databases over TCP/IP.
AnissTranslator is the open source development of Hebrew and spoken Arabic online translation. It is developed by the peace-seeking Aniss organization and intended to be used by web applications or as a standalone PHP package.
otl is a text processor for generating markup from plain text. Much of both the input and output formats can be customized. otl supports structures such as nested ordered lists, headers and footers, and tables.
TextMarker is now developed and hosted at Apache UIMA (http://uima.apache.org/textmarker.html). TextMarker is a UIMA-based tool for information extraction and more. The full-featured editor for the rule language and the build process for UIMA descriptors are complemented with components for visualization, explanation, testing and rule learning.
Camomile is a Unicode library for OCaml. It provides a Unicode character type, UTF-8, UTF-16 and UTF-32 strings, conversion to/from about 200 encodings, collation and locale-sensitive case mappings, and more.
Plugins for Firefox and Google Chrome that automate use of the „Typograf“ service hosted at http://www.artlebedev.ru/tools/typograf/. The plugin takes text from any text area in Firefox and processes it according to typographic rules (e.g. inserts typ
BonGoLipi (Bong-Go-Lipi) is a transliteration tool to convert phonetically typed Bengali (Bangla) into text displayable with Unicode or non-Unicode fonts. It supports different transliteration schemes. The objective is to propagate standardized Bengali.
A simple way to create a syntax highlighting editor for a custom language/grammar and/or create custom grammar parsers. This is a .NET project written in C#. See details here: http://acct001.com/wordpress/?p=190
JPDF Export is a Java library built on the well-known iText library. It provides simple functions that can be used to build complex PDF files. It also provides simple classes to merge, split and convert PDF files.
PregexEval is a tiny, simple regex evaluator. You type your regex, then type a string to match, and for every character entered you'll see whether it matches, along with a full description of the match. It uses NFA-based matching as in Perl 5.
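To illustrate the per-character feedback described above, here is a minimal sketch in Python. It is not PregexEval's code: the function name `evaluate_prefixes` is hypothetical, and it uses Python's standard `re` module (a backtracking engine) as a stand-in for the tool's own matcher. It checks, after each typed character, whether the text entered so far matches the whole pattern.

```python
import re


def evaluate_prefixes(pattern, text):
    """Return (prefix, matched) pairs, one per character typed.

    Hypothetical helper: simulates re-evaluating the regex after
    every keystroke by testing each growing prefix of `text`.
    """
    compiled = re.compile(pattern)
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        # fullmatch succeeds only if the entire prefix matches the pattern.
        results.append((prefix, compiled.fullmatch(prefix) is not None))
    return results


for prefix, matched in evaluate_prefixes(r"ab+", "abb"):
    print(f"{prefix!r}: {'match' if matched else 'no match'}")
```

A real evaluator would presumably also report which part of the pattern consumed which characters; this sketch only shows the match/no-match signal.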
A Ruby file parser/interpreter/preprocessor that comments lines of code based on conditions at the time the file is required. Very handy to implement debugging logs and code that has to be commented (not just dynamically switched off).
Find And Replace Text command line utility. New & improved version of the well-known grep command, with advanced features such as: case-adaption of the replace string; find (& replace) in filenames, auto CVS edit.
Moved to https://github.com/lionello/fart-it
A module for Python/pygame used for typesetting text to the screen.
It provides specialized functions for scrolling text, pages of text, selectable text, and an on-screen text editor.
Filecmp is a command-line application that takes two filenames as arguments and outputs the result of comparing them, e.g. whether they are the same or not. It may look irrelevant, but sometimes it's very useful, especially inside scripts.
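As a rough illustration of why such a comparison is handy in scripts, the sketch below does the same kind of check in Python. It is not the Filecmp tool itself: the function names `same_contents` and `main`, the printed words, and the exit codes are assumptions chosen for the example. It relies on the standard library's `filecmp.cmp`.

```python
import filecmp
import sys


def same_contents(path_a, path_b):
    """Return True when both files hold identical bytes."""
    # shallow=False forces a content comparison instead of
    # trusting file size and modification time alone.
    return filecmp.cmp(path_a, path_b, shallow=False)


def main(argv):
    """Hypothetical command-line entry point: prog FILE_A FILE_B."""
    if len(argv) != 3:
        print("usage: compare.py FILE_A FILE_B", file=sys.stderr)
        return 2
    identical = same_contents(argv[1], argv[2])
    print("same" if identical else "different")
    # 0 for identical files, 1 otherwise, so a calling script can branch on it.
    return 0 if identical else 1
```

To use it as a script, call `sys.exit(main(sys.argv))` at the bottom of the file; a shell script can then test the exit status directly.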
JSnapScreen provides a screen capture service for the Java environment. It exposes interfaces to capture an image and retrieve the captured image, so you can integrate it with your application easily.
Teng is a general purpose templating engine written in C++ (i.e. a library). It is also available as a Python module or PHP extension. The main idea of Teng is to strictly separate application logic from the presentation layer. It is widely used on dynamic web sites.
Track changes in LaTeX documents. The goal is to provide editing facilities as known from word processors like Microsoft Word or OpenOffice Writer for LaTeX. The project comprises a LaTeX package and additional software to accept/reject changes etc.