Crawl websites, sync to vector databases, and power RAG applications. Pre-built integrations for LLM pipelines and AI assistants.
Build data pipelines that feed your AI models and agents without managing infrastructure. Crawl any website, transform content, and push directly to your preferred vector store. Use 10,000+ tools for RAG applications, AI assistants, and real-time knowledge bases. Monitor site changes, trigger workflows on new data, and keep your AIs fed with fresh, structured information. Cloud-native, API-first, and free to start until you need to scale.
Say goodbye to broken revenue funnels and poor customer experiences
Connect and coordinate your data, signals, tools, and people at every step of the customer journey.
LeanData is a Demand Management solution that supports all go-to-market strategies such as account-based sales development, geo-based territories, and more. LeanData features a visual, intuitive workflow native to Salesforce that lets users view their entire lead flow in one interface. Users can route their leads with drag-and-drop. LeanData also features a matching algorithm that uses multiple fields in Salesforce.
FigTeX manages images and their easy inclusion in LaTeX documents. Similar to BibTeX, the image information is stored in an external file and is imported into the document as needed. It comes with a comfortable GUI for managing the image library.
The Information Extraction Plugin allows the use of information extraction techniques within RapidMiner.
It can be seen as an interface between natural language and IE or data mining methods, extracting interesting information from documents.
A tool to modify existing PDF documents in a simple and user-friendly way. The main features are merging, erasing pages, changing the page order and rotating pages in 90° steps.
TagParser is a Java parser based on CSS selectors (like jQuery) that can parse any tag-based document, such as XML or HTML. Furthermore, it doesn't require documents to be well formed and can parse complex documents with embedded scripts or CSS parts.
Inventors: Validate Your Idea, Protect It and Gain Market Advantages
SenseIP is ideal for individual inventors, startups, and businesses
senseIP is an AI innovation platform for inventors, automating every aspect of IP from the moment you have an idea. You can have your idea researched for uniqueness and protected quickly and effortlessly, without expensive attorneys. Built for business success while securing your competitive edge.
Track changes in LaTeX documents. The goal is to provide editing facilities as known from word processors like Microsoft Word or OpenOffice Writer for LaTeX. The project comprises a LaTeX package and additional software to accept/reject changes etc.
LaTeX-Mk is a collection of makefile fragments for managing small to large LaTeX-based documentation projects. The idea is that, especially for large documents, many steps may be required to typeset the document (export modified figures to postscr
PDML is an informal markup language written in PHP that is similar to HTML. It allows for the creation of complex PDF documents and can also be used in conjunction with PHP, to define templates which can generate dynamic PDF documents.
DAT Freight and Analytics operates DAT One truckload freight marketplace
DAT Freight & Analytics operates DAT One, North America’s largest truckload freight marketplace; DAT iQ, the industry’s leading freight data analytics service; and Trucker Tools, the leader in load visibility. Shippers, transportation brokers, carriers, news organizations, and industry analysts rely on DAT for market trends and data insights, informed by nearly 700,000 daily load posts and a database exceeding $1 trillion in freight market transactions. Founded in 1978, DAT is a business unit of Roper Technologies (Nasdaq: ROP), a constituent of the Nasdaq 100, S&P 500, and Fortune 1000. Headquartered in Beaverton, Ore., DAT continues to set the standard for innovation in the trucking and logistics industry.
RTF2HTML is a cross-platform C++ library (DLL, OCX) and command-line utility that converts documents from Rich Text Format (e.g. from Word or OO Writer) to HTML. Its features are tiny size, speed, low memory usage, and compact output.
oEdtk is an open source project for automated printing processing.
It's a toolkit for building applications that prepare flat file data for massive printing of documents.
Contains a LaTeX style file and an associated GUI that allow for the annotation of LaTeX documents. Tracks changes made by multiple editors. This package provides a way for multiple authors to collaboratively edit a LaTeX document.
The DITA Open Platform is a free, open-source project whose goal is to provide an enterprise platform for the editing, management, and processing of DITA documents.
XVCL is a general-purpose language for configuring variants in all sorts of textual documents (including programs). It is based on frame technology. The XVCL processor automates the customization process to produce a system from a specification of variants.
This is a small JSP tag library which allows you to create PDF documents within your JSPs. All you need to do is add the jar file to your lib folder under WEB-INF and the tld file in a tld folder under WEB-INF, and you are ready to use the tags.
An OpenOffice.org add-on that provides basic Subversion support. You can import, commit, check out, and update your documents into or from the repository; it also supports showing history and document differences.
This is a text editor which has the option to save the content as a PDF document. It can also read existing .rtf documents and render them in the editor. These can then be saved as PDF, thereby providing a converter from RTF to PDF format.
Flesh is a Java application designed to analyze a document (plain text, rich text, Word documents, and PDFs) and display how difficult it is to comprehend, using the Flesch-Kincaid Grade Level and the Flesch Reading Ease Score.
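For readers unfamiliar with the two scores named above, here is a minimal Python sketch of the standard published formulas. It is not Flesh's own code: the function names `count_syllables` and `readability` are hypothetical, and the vowel-group syllable counter is a rough assumption that real tools usually replace with a dictionary or a better heuristic.

```python
import re
from typing import Tuple


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def readability(text: str) -> Tuple[float, float]:
    """Return (Flesch Reading Ease, Flesch-Kincaid Grade Level) for English text."""
    # Sentences are split on runs of '.', '!' or '?'; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)

    # Standard coefficients for both scores.
    ease = 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word
    grade = 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59
    return ease, grade


if __name__ == "__main__":
    ease, grade = readability("The cat sat on the mat.")
    print(f"Reading ease: {ease:.2f}, grade level: {grade:.2f}")
```

A higher Reading Ease score means easier text, while the Grade Level approximates the U.S. school grade needed to follow it; very simple sentences can yield a grade below zero.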
SaD Edition Module: Stand-alone Documents Edition Module is a generic module (for CMS/blogs) written in PHP that lets WWW users read and edit rich text documents with attachments stored in one file.
Hyperlatex allows the development of documents that are to be distributed either as printed
matter or as HTML. It is not a general-purpose translator of LaTeX files into HTML, but allows
writers to write for both media simultaneously.