The Open Pervasive Computing Environment facilitates the creation of distributed, context-sensitive systems (including embedded systems). It provides a number of frameworks for creating complex event processing systems, driving the development of ubiquitous technology.
This project aims to create a Linux distribution focused on the IBM z/OS system. It includes a set of mainframe tools, all based on Linux kernel modules, C, and Python.
-MainFrame Access Methods
-ISPF menus
-REXX scripting
-RACF security
-JCL Batch Job
XDAQ is a software platform designed specifically for the development of distributed data acquisition systems. The development is carried out at CERN, the European Organization for Nuclear Research. Please visit http://xdaq.web.cern.ch
A C/C++ based client and server implementation of the OGSA Basic Execution Service, used to provide a Web Services interface to distributed resource managers such as Platform LSF and PBS/PBS Pro. The SOAP stack is provided by the gSOAP toolkit.
SnowFlock enables high performance computing on virtual machine (VM)-based cloud environments. In SnowFlock, a VM transparently becomes a cluster of VMs by cloning in sub-second time into multiple copies executing on different physical hosts.
RT-BOINC stands for Real-Time BOINC. It was designed for managing highly interactive, short-term, and massively parallel real-time applications.
We implemented RT-BOINC on top of the recent BOINC server source code.
Distributed Parallel Programming for Python! This package builds on traditional Python by enabling users to write distributed, parallel programs based on MPI message passing primitives. General Python objects can be passed as messages between processors. Ru
Roomy is a programming language extension for writing parallel disk-based applications. All details of parallelism and disk I/O are hidden within the Roomy library.
AlacrityVM is a hypervisor based on the Linux KVM (http://linux-kvm.org) project. It aims specifically for high performance, targeting HPC and real-time computing in the data center.
Core Balance is a simple TCP Load Balancing proxy designed to balance connections based on node speed and number of cores. The design was intended to balance a distcc cluster. It features a status report in HTTP and an interactive mode.
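As an illustration of balancing by node speed and core count, here is a minimal Python sketch. It is not Core Balance's actual algorithm or code: the Node class, the capacity formula (cores multiplied by clock speed), and the function names are all assumptions made for this example. Each new connection goes to the node with the lowest load relative to its capacity.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical backend node; the fields are illustrative assumptions."""
    host: str
    cores: int
    speed_ghz: float
    active: int = 0  # connections currently assigned to this node

    @property
    def capacity(self) -> float:
        # Assumed capacity model: more cores and a faster clock mean more capacity.
        return self.cores * self.speed_ghz


def pick_node(nodes):
    """Return the node with the lowest load relative to its capacity."""
    return min(nodes, key=lambda n: n.active / n.capacity)


def assign(nodes, n_connections):
    """Assign connections one at a time and return the per-host counts."""
    for _ in range(n_connections):
        pick_node(nodes).active += 1
    return {n.host: n.active for n in nodes}


if __name__ == "__main__":
    cluster = [Node("a", cores=4, speed_ghz=2.0), Node("b", cores=2, speed_ghz=2.0)]
    print(assign(cluster, 12))
```

With this capacity model, a node with twice the cores at the same clock speed ends up with roughly twice the connections. A real proxy would also have to decrement the count when a connection closes.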
CLara is a framework that enables you to access the computing power of many graphics processors in an IP-based network using a kind of Client/Proxy/Server model. Its programming interface conforms to the OpenCL 1.0 standard.
Tpl makes it easy to serialize your C data using just a handful of API functions. The data is stored in its native binary form for maximum efficiency. C, Perl and XML supported. Data is portable across CPU types and OSes from Unix to Mac to Windows.
Meerkat is a distributed programming environment. It consists of a virtual machine which is suited to parallel processing. The data model is based on the concept of actors, although it is much more permissive than the traditional description.
LIME (Less-is-More) is a parallel/concurrent programming environment based on C. Internally, it uses XML technology to describe tasks and their dependencies. Externally, it offers ANSI C99 programming as well as a set of visually oriented interfaces.
xlayout is a terminal-based utility to get and set information about X11 windows and the pointer. It's designed to be easily integrated into bash shell scripts and takes advantage of the X11 protocol to allow it to access remote X11 desktops.
An implementation of the Open Group's Application Response Measurement (ARM) Version 4 standard. The ARM standard describes a means of breaking an application down into its constituent transactions, and measuring response time across multiple tiers.
Cryopid2 is a development of the excellent Cryopid process freezer for Linux developed initially by Bernard Blackham. Cryopid2 adds a host of functionality to the original package.
The CodeTime platform covers every aspect of parallel software from authoring, through distribution, to run-time. Its goals are: high programmer productivity; write once, run high performance anywhere; and wide acceptance.
lysis is working on a feature-rich home automation system that ranges from CAN bus, 1-Wire and RF up to an HTPC. Both hardware and software are addressed to make a really smart home. The domotics will provide energy saving, comfort, flexibility, safety
A novel Grid system which is Python-based and Cell-powered. By extending Namespace into GridSpace, any object is accessible throughout the Grid. Code is executed in a distributed fashion and is automatically JIT-compiled into Cell SPE instructions.
OSN is an open-source, open-protocol distributed social network. Public key cryptography makes the network resilient to spam. User profiles are based on FOAF XML and users can migrate their profile from one site of the federation to another.
FedStage OpenDSP is an open implementation of SOAP Web Service multi-user access and policy-based job control using OGF DRMAA routines supported by distributed resource management systems like Sun Grid Engine (SGE), LSF, PBSPro, Torque or Condor.