Search-and-replace operations on file content across multiple files, with recursive operation over entire directory trees. FAR supports regular expressions (regex) spanning multiple lines, automatic backups, and various character encodings. Run grep-like extractions to condense or rearrange sources, or perform bulk file renaming.
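As a rough illustration of the kind of operation described above (a multi-line regex replace over a directory tree, with a backup of each changed file), here is a minimal sketch using only the standard Java library. It is not FAR's own code: the class name `BulkReplace`, the method `replaceInTree`, and the `.bak` backup suffix are all assumptions made for this example.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of FAR.
public class BulkReplace {

    /**
     * Replaces every match of regex in all regular files under root.
     * Returns the number of files that were changed.
     */
    public static int replaceInTree(Path root, String regex, String replacement,
                                    Charset charset) throws IOException {
        // DOTALL lets '.' cross line breaks; MULTILINE makes ^ and $ match per line.
        Pattern pattern = Pattern.compile(regex, Pattern.DOTALL | Pattern.MULTILINE);

        // Collect the file list first so backups created below are not revisited.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile)
                        .filter(p -> !p.getFileName().toString().endsWith(".bak"))
                        .collect(Collectors.toList());
        }

        int changed = 0;
        for (Path file : files) {
            String before = new String(Files.readAllBytes(file), charset);
            String after = pattern.matcher(before).replaceAll(replacement);
            if (!after.equals(before)) {
                // Assumed backup convention: keep the original next to the file as <name>.bak.
                Path backup = file.resolveSibling(file.getFileName() + ".bak");
                Files.copy(file, backup, StandardCopyOption.REPLACE_EXISTING);
                Files.write(file, after.getBytes(charset));
                changed++;
            }
        }
        return changed;
    }
}
```

Passing the character set explicitly mirrors the encoding support mentioned in the description; the whole file is read into memory, so this sketch suits source files rather than very large inputs.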
A file-watching facility for Java. Uses native platform support to avoid polling on selected platforms (currently Win32, Mac OS X, Linux, and FreeBSD on x86). Implements JDK 7's WatchService API, but also runs on Java 5 and 6.
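For readers unfamiliar with the WatchService model mentioned above, the following minimal sketch shows the register/poll/reset cycle using the standard `java.nio.file` API from JDK 7 and later. It does not use this project's own classes, whose package names are not given here; the class `WatchSketch` and its two methods are hypothetical names chosen for the example.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

// Hypothetical example class built on the standard JDK API.
public class WatchSketch {

    /** Creates a watcher and registers dir for create and delete events. */
    public static WatchService watch(Path dir) throws IOException {
        WatchService watcher = dir.getFileSystem().newWatchService();
        dir.register(watcher,
                StandardWatchEventKinds.ENTRY_CREATE,
                StandardWatchEventKinds.ENTRY_DELETE);
        return watcher;
    }

    /**
     * Waits up to timeoutSeconds for the next batch of events and returns the
     * name of the first created entry in it, or null if there is none.
     */
    public static String firstCreated(WatchService watcher, long timeoutSeconds)
            throws InterruptedException {
        WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
        if (key == null) {
            return null; // timed out without any event
        }
        try {
            for (WatchEvent<?> event : key.pollEvents()) {
                if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                    // For directory events the context is the entry's relative path.
                    return event.context().toString();
                }
            }
            return null;
        } finally {
            // A key must be reset before it can be queued again.
            key.reset();
        }
    }
}
```

Whether events arrive through native notification or through internal polling depends on the platform's implementation, which is why the sketch takes a generous timeout rather than assuming immediate delivery.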
iMeMex is a dataspace management system. iMeMex is a research prototype. The package also provides several useful components for research such as external sorting, B+-trees, inverted indexes, content converters, query operators, and graph indexes.
XSpace is a globally accessible repository for hierarchically-keyed information. It provides persistence for trees with an elegant tree-navigation API. XSpace also publishes real-time events whenever a persisted tree is updated.
LJ loader: a Java program that extracts postings and comments from the http://www.livejournal.com blog platform into a database, where they can be viewed, classified, and processed. Reusable components: a Perl-like but efficient web page scraper, a tree analyzer, and a concurrent scheduler.