About Google Cloud Dataflow

Google Cloud Dataflow is a fully managed, serverless service for unified stream and batch data processing that is fast and cost-effective. It automates the provisioning and management of processing resources to minimize latency and maximize utilization, and it autoscales worker resources horizontally. Pipelines are built with the open source Apache Beam SDK, which benefits from community-driven innovation. Dataflow provides reliable and consistent exactly-once processing and enables fast, simplified streaming data pipeline development with lower data latency. Because its serverless approach removes operational overhead from data engineering workloads, teams can focus on programming instead of managing server clusters.

About Yandex Data Proc

You select the cluster size, node capacity, and a set of services, and Yandex Data Proc automatically creates and configures Spark and Hadoop clusters and other components. Collaborate by using Zeppelin notebooks and other web apps via a UI proxy. You get full control of your cluster with root permissions for each VM, and you can install your own applications and libraries on running clusters without having to restart them. Yandex Data Proc uses instance groups to automatically increase or decrease the computing resources of compute subclusters based on CPU usage indicators. Data Proc also lets you create managed Hive clusters, which can reduce the probability of failures and losses caused by metadata unavailability. Because the Data Proc operator is already built into Apache Airflow, you save time on building ETL pipelines, pipelines for training and developing models, and other iterative tasks.

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Platforms Supported

Windows
Mac
Linux
Cloud
On-Premises
iPhone
iPad
Android
Chromebook

Audience

Teams that want unified stream and batch data processing that's serverless, fast, and cost-effective

Audience

Anyone interested in a solution for processing multi-terabyte data arrays

Support

Phone Support
24/7 Live Support
Online

Support

Phone Support
24/7 Live Support
Online

API

Offers API

API

Offers API

Pricing

No information available.
Free Version
Free Trial

Pricing

$0.19 per hour
Free Version
Free Trial

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

This software hasn't been reviewed yet.

Reviews/Ratings

Overall 0.0 / 5
ease 0.0 / 5
features 0.0 / 5
design 0.0 / 5
support 0.0 / 5

This software hasn't been reviewed yet.

Training

Documentation
Webinars
Live Online
In Person

Training

Documentation
Webinars
Live Online
In Person

Company Information

Google
Founded: 1998
United States
cloud.google.com/dataflow

Company Information

Yandex
Founded: 1997
Russia
cloud.yandex.com/en/services/data-proc

Alternatives

Apache Beam

Apache Software Foundation

Alternatives

Amazon MWAA

Amazon

Streaming Analytics Features

Data Enrichment
Data Wrangling / Data Prep
Multiple Data Source Support
Process Automation
Real-time Analysis / Reporting
Visualization Dashboards

Integrations

Apache Airflow
Apache Flume
Apache Hive
Apache Spark
Apache Zeppelin
CData Connect
Dataplex Universal Catalog
Google Cloud Bigtable
Google Cloud Composer
Google Cloud Profiler
Matplotlib
New Relic
NumPy
Protegrity
Python
Sedai
TensorFlow
Yandex DataSphere
pandas
scikit-image

Integrations

Apache Airflow
Apache Flume
Apache Hive
Apache Spark
Apache Zeppelin
CData Connect
Dataplex Universal Catalog
Google Cloud Bigtable
Google Cloud Composer
Google Cloud Profiler
Matplotlib
New Relic
NumPy
Protegrity
Python
Sedai
TensorFlow
Yandex DataSphere
pandas
scikit-image