Deploy pre-built tools that crawl websites, extract structured data, and feed your applications. Reliable web data without maintaining scrapers.
Automate web data collection with cloud tools that handle anti-bot measures, browser rendering, and data transformation out of the box. Extract content from any website, push to vector databases for RAG workflows, or pipe directly into your apps via API. Schedule runs, set up webhooks, and connect to your existing stack. Free tier available, then scale as you need to.
Create and run cloud-based virtual machines.
Secure and customizable compute service that lets you create and run virtual machines.
Computing infrastructure in predefined or custom machine sizes to accelerate your cloud transformation. General purpose (E2, N1, N2, N2D) machines provide a good balance of price and performance. Compute optimized (C2) machines offer high-end vCPU performance for compute-intensive workloads. Memory optimized (M2) machines offer the highest memory and are great for in-memory databases. Accelerator optimized (A2) machines are based on the A100 GPU, for very demanding applications.
The application should take a directory, traverse it either recursively or non-recursively, and execute a given executor. Executors should be pluggable and rendered in the application's UI according to strict rules.
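The pluggable-executor idea described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the registry `EXECUTORS`, the `register` decorator, the two sample executors, and `run_executor` are hypothetical names, not part of the project, and the UI rendering rules are not covered.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical plugin registry: executor name -> callable taking a file path.
EXECUTORS: Dict[str, Callable[[Path], object]] = {}


def register(name: str):
    """Decorator that plugs an executor into the registry under `name`."""
    def decorator(func):
        EXECUTORS[name] = func
        return func
    return decorator


@register("size")
def file_size(path: Path) -> int:
    # Sample executor: size of the file in bytes.
    return path.stat().st_size


@register("name")
def file_name(path: Path) -> str:
    # Sample executor: bare file name without its directory.
    return path.name


def run_executor(directory, executor_name: str, recursive: bool = True) -> dict:
    """Apply the named executor to every file under `directory`.

    Returns a dict mapping each file's path (relative to `directory`,
    with forward slashes) to the executor's result.
    """
    executor = EXECUTORS[executor_name]
    root = Path(directory)
    entries = root.rglob("*") if recursive else root.glob("*")
    return {
        p.relative_to(root).as_posix(): executor(p)
        for p in sorted(entries)
        if p.is_file()
    }
```

A new executor is added by decorating one more function with `@register(...)`; the traversal code does not change, which is the point of keeping the executors pluggable.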
iTagged is a Java Swing application that lets users intuitively create tags for files stored locally on their machine. It is an effective tool for organizing and bookmarking the things we deem necessary.
Turn traffic into pipeline and prospects into customers
For account executives and sales engineers looking for a solution to manage their insights and sales data
Docket is an AI-powered sales enablement platform designed to unify go-to-market (GTM) data through its proprietary Sales Knowledge Lake™ and activate it with intelligent AI agents. The platform helps marketing teams increase pipeline generation by 15% by engaging website visitors in human-like conversations and qualifying leads. For sales teams, Docket improves seller efficiency by 33% by providing instant product knowledge, retrieving collateral, and creating personalized documents. Built for GTM teams, Docket integrates with over 100 tools across the revenue tech stack and offers enterprise-grade security with SOC 2 Type II, GDPR, and ISO 27001 compliance. Customers report improved win rates, shorter sales cycles, and dramatically reduced response times. Docket’s scalable, accurate, and fast AI agents deliver reliable answers with confidence scores, empowering teams to close deals faster.
BeanQuery is a Java solution for querying arbitrary collections of arbitrary object types through a criteria-like API in a declarative and type-safe manner.
I AM File Indexing can index files in given folders and make their content searchable. Written in pure Java, it is meant for people who need very basic website search or multi-file search capability for their Java applications.
A searcher and indexer for quickly and easily locating relevant information in a large collection of research papers. A Java backend with a web-based frontend, based on the Lucene indexer and searcher.
Find files inside ZIP, JAR, WAR, and EAR archives. Search recursively for file names or search strings. This Java-based utility can save you time when you wonder which JAR or EAR file defines a particular class.
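As a rough illustration of the recursive archive search described above, here is a minimal Python sketch built on the standard `zipfile` module. It is not the utility's own code: the function name `find_in_archive`, the `!/` separator for nested entries, and the suffix list are assumptions made for the example, and it matches entry names only, not file contents.

```python
import io
import zipfile

# Assumed set of archive types to descend into (all use the ZIP format).
ARCHIVE_SUFFIXES = (".zip", ".jar", ".war", ".ear")


def find_in_archive(archive, needle, prefix=""):
    """Return paths of entries whose name contains `needle`.

    `archive` is a file path or a binary file-like object. Nested
    archives are opened in memory and searched recursively; their
    entries are reported as "outer.jar!/inner/Entry.class".
    """
    matches = []
    with zipfile.ZipFile(archive) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            full_name = f"{prefix}{info.filename}"
            if needle in info.filename:
                matches.append(full_name)
            if info.filename.lower().endswith(ARCHIVE_SUFFIXES):
                nested = io.BytesIO(zf.read(info))
                try:
                    matches.extend(
                        find_in_archive(nested, needle, prefix=f"{full_name}!/")
                    )
                except zipfile.BadZipFile:
                    # Entry has an archive suffix but is not a valid ZIP; skip it.
                    pass
    return matches
```

Calling `find_in_archive("app.ear", "MyService.class")` on a hypothetical `app.ear` would list every location of that class, including copies inside bundled JARs. Nested archives are read fully into memory, which keeps the sketch short but may not suit very large files.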
As a healthcare provider, you should be paid promptly for the services you provide to patients. Slow, inefficient, and error-prone manual coding keeps you from the financial peace you deserve. XpertDox’s autonomous coding solution accelerates the revenue cycle so you can focus on providing great healthcare.
Looks at the file names in a directory and finds their common parts in order to spot similar and repeated files. Useful when you have multiple files that differ in checksum and slightly in name. You can collect them and decide what to do afterwards.
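One way to find common parts in file names is to compare names pairwise and keep the longest shared substring. The sketch below does this with Python's `difflib`. It is an assumption about how such a tool might work, not the project's actual algorithm, and `common_part`, `similar_names`, and the `min_len` threshold are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations
from pathlib import Path


def common_part(a: str, b: str) -> str:
    """Return the longest substring shared by `a` and `b`."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


def similar_names(names, min_len: int = 5):
    """Return (first, second, shared) for each pair of file names whose
    stems (names without extension) share at least `min_len` characters."""
    results = []
    for first, second in combinations(sorted(names), 2):
        shared = common_part(Path(first).stem, Path(second).stem)
        if len(shared) >= min_len:
            results.append((first, second, shared))
    return results
```

To scan a real directory, pass in something like `[p.name for p in Path(folder).iterdir() if p.is_file()]`. The pairwise comparison is quadratic in the number of files, which is acceptable for a single folder but would need a smarter grouping step for large collections.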
InfoSpace is an application which indexes and then allows you to search your personal information space, such as your email, your documents, your music, your videos, your Flickr account, your news feeds, the web pages you've visited and much more.
ScenConnect shows scenarios as networks of situation and event tag sets, for fast comparisons. It links scenarios to tags, scores, and other metadata, creating situationals suitable for search, mining, machine learning, and planning.
Hyper-M is a Bluetooth-based DHT peer-to-peer infrastructure for J2ME (CLDC 1.1/MIDP 2.0) enabled handphones. Hyper-M allows the user to create a peer-to-peer network and to share and retrieve files on this network. It has been tested mainly on Nokia handphones.
The Fast Index Library is an open-source C++ template library for building full-text indexes based on Boolean, vector, extended Boolean, or probabilistic models.
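To make the Boolean model concrete, here is a small, generic inverted-index sketch in Python. It does not reflect the Fast Index Library's C++ API in any way; `build_index`, `boolean_and`, and `boolean_or` are hypothetical helpers, and tokenization is simplified to lowercase word characters.

```python
import re
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(doc_id)
    return index


def boolean_and(index, *terms) -> set:
    """Documents containing every one of the given terms."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


def boolean_or(index, *terms) -> set:
    """Documents containing at least one of the given terms."""
    return set().union(*(index.get(term.lower(), set()) for term in terms))
```

In the Boolean model a document either matches a query or it does not, so the results are plain sets with no ranking. The vector and probabilistic models mentioned above would instead assign each document a score.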
Clucened is a project to build a daemon around CLucene, which is a C++ implementation of the Lucene search engine. This is *not* the CLucene project, but is a separate project to write a generic daemon based on CLucene.
The ultimate goal of this software is to make it easy to trace back from a final output to the workflow that produced it. For example, you can find the original Word file by right-clicking the final output PDF file.