Alternatives to DPR

Compare DPR alternatives for your business or organization using the curated list below. SourceForge ranks the best alternatives to DPR in 2024. Compare features, ratings, user reviews, pricing, and more from DPR competitors and alternatives in order to make an informed decision for your business.

  • 1
    Minitab Connect
    The best insights are based on the most complete, most accurate, and most timely data. Minitab Connect empowers data users from across the enterprise with self-serve tools to transform diverse data into a governed network of data pipelines, feed analytics initiatives and foster organization-wide collaboration. Users can effortlessly blend and explore data from databases, cloud and on-premise apps, unstructured data, spreadsheets, and more. Flexible, automated workflows accelerate every step of the data integration process, while powerful data preparation and visualization tools help yield transformative insights. Flexible, intuitive data integration tools let users connect and blend data from a variety of internal and external sources, like data warehouses, data lakes, IoT devices, SaaS applications, cloud storage, spreadsheets, and email.
  • 2
    Pantomath

    Pantomath

    Organizations continuously strive to be more data-driven, building dashboards, analytics, and data pipelines across the modern data stack. Unfortunately, most organizations struggle with data reliability issues that lead to poor business decisions and an organization-wide lack of trust in data, directly impacting their bottom line. Resolving complex data issues is a manual and time-consuming process involving multiple teams, all relying on tribal knowledge to manually reverse engineer complex data pipelines across different platforms to identify the root cause and understand the impact. Pantomath is a data pipeline observability and traceability platform for automating data operations. It continuously monitors datasets and jobs across the enterprise data ecosystem, providing context to complex data pipelines by creating automated cross-platform technical pipeline lineage.
  • 3
    Google Cloud Dataflow
    Unified stream and batch data processing that's serverless, fast, and cost-effective. Fully managed data processing service. Automated provisioning and management of processing resources. Horizontal autoscaling of worker resources to maximize resource utilization. OSS community-driven innovation with Apache Beam SDK. Reliable and consistent exactly-once processing. Streaming data analytics with speed. Dataflow enables fast, simplified streaming data pipeline development with lower data latency. Allow teams to focus on programming instead of managing server clusters as Dataflow’s serverless approach removes operational overhead from data engineering workloads. Dataflow automates provisioning and management of processing resources to minimize latency and maximize utilization.
  • 4
    Trifacta

    Trifacta

    The fastest way to prep data and build data pipelines in the cloud. Trifacta provides visual and intelligent guidance to accelerate data preparation so you can get to insights faster. Poor data quality can sink any analytics project. Trifacta helps you understand your data so you can quickly and accurately clean it up. All the power with none of the code. Trifacta provides visual and intelligent guidance so you can get to insights faster. Manual, repetitive data preparation processes don’t scale. Trifacta helps you build, deploy and manage self-service data pipelines in minutes not months.
  • 5
    CloverDX

    CloverDX

    Design, debug, run and troubleshoot data transformations and jobflows in a developer-friendly visual designer. Orchestrate data workloads that require tasks to be carried out in the right sequence, orchestrate multiple systems with the transparency of visual workflows. Deploy data workloads easily into a robust enterprise runtime environment. In cloud or on-premise. Make data available to people, applications and storage under a single unified platform. Manage your data workloads and related processes together in a single platform. No task is too complex. We’ve built CloverDX on years of experience with large enterprise projects. Developer-friendly open architecture and flexibility lets you package and hide the complexity for non-technical users. Manage the entire lifecycle of a data pipeline from design, deployment to evolution and testing. Get things done fast with the help of our in-house customer success teams.
    Starting Price: $5000.00/one-time
  • 6
    AWS Data Pipeline
    AWS Data Pipeline is a web service that helps you reliably process and move data between different AWS compute and storage services, as well as on-premises data sources, at specified intervals. With AWS Data Pipeline, you can regularly access your data where it’s stored, transform and process it at scale, and efficiently transfer the results to AWS services such as Amazon S3, Amazon RDS, Amazon DynamoDB, and Amazon EMR. AWS Data Pipeline helps you easily create complex data processing workloads that are fault tolerant, repeatable, and highly available. You don’t have to worry about ensuring resource availability, managing inter-task dependencies, retrying transient failures or timeouts in individual tasks, or creating a failure notification system. AWS Data Pipeline also allows you to move and process data that was previously locked up in on-premises data silos.
    Starting Price: $1 per month
  • 7
    DataKitchen

    DataKitchen

    Reclaim control of your data pipelines and deliver value instantly, without errors. The DataKitchen™ DataOps platform automates and coordinates all the people, tools, and environments in your entire data analytics organization – everything from orchestration, testing, and monitoring to development and deployment. You’ve already got the tools you need. Our platform automatically orchestrates your end-to-end multi-tool, multi-environment pipelines – from data access to value delivery. Catch embarrassing and costly errors before they reach the end-user by adding any number of automated tests at every node in your development and production pipelines. Spin-up repeatable work environments in minutes to enable teams to make changes and experiment – without breaking production. Fearlessly deploy new features into production with the push of a button. Free your teams from tedious, manual work that impedes innovation.
  • 8
    Openbridge

    Openbridge

    Uncover insights to supercharge sales growth using code-free, fully-automated data pipelines to data lakes or cloud warehouses. A flexible, standards-based platform to unify sales and marketing data for automating insights and smarter growth. Say goodbye to messy, expensive manual data downloads. Always know what you’ll pay and only pay for what you use. Fuel your tools with quick access to analytics-ready data. As certified developers, we only work with secure, official APIs. Get started quickly with data pipelines from popular sources. Pre-built, pre-transformed, and ready-to-go data pipelines. Unlock data from Amazon Vendor Central, Amazon Seller Central, Instagram Stories, Facebook, Amazon Advertising, Google Ads, and many others. Code-free data ingestion and transformation processes allow teams to realize value from their data quickly and cost-effectively. Data is always securely stored directly in a trusted, customer-owned data destination like Databricks, Amazon Redshift, etc.
    Starting Price: $149 per month
  • 9
    Dagster Cloud

    Dagster Labs

    Dagster is a next-generation orchestration platform for the development, production, and observation of data assets. Unlike other data orchestration solutions, Dagster provides you with an end-to-end development lifecycle. Dagster gives you control over your disparate data tools and empowers you to build, test, deploy, run, and iterate on your data pipelines. It makes you and your data teams more productive, your operations more robust, and puts you in complete control of your data processes as you scale. Dagster brings a declarative approach to the engineering of data pipelines. Your team defines the data assets required, quickly assessing their status and resolving any discrepancies. An assets-based model is clearer than a tasks-based one and becomes a unifying abstraction across the whole workflow.
    Starting Price: $0
  • 10
    QuerySurge
    QuerySurge leverages AI to automate the data validation and ETL testing of Big Data, Data Warehouses, Business Intelligence Reports and Enterprise Apps/ERPs with full DevOps functionality for continuous testing.
    Use Cases
    - Data Warehouse & ETL Testing
    - Hadoop & NoSQL Testing
    - DevOps for Data / Continuous Testing
    - Data Migration Testing
    - BI Report Testing
    - Enterprise App/ERP Testing
    QuerySurge Features
    - Projects: Multi-project support
    - AI: Automatically create data validation tests based on data mappings
    - Smart Query Wizards: Create tests visually, without writing SQL
    - Data Quality at Speed: Automate the launch, execution, and comparison, and see results quickly
    - Test across 200+ platforms: Data Warehouses, Hadoop & NoSQL lakes, databases, flat files, XML, JSON, BI Reports
    - DevOps for Data & Continuous Testing: RESTful API with 60+ calls & integration with all mainstream solutions
    - Data Analytics & Data Intelligence: Analytics dashboard & reports
  • 11
    Qlik Compose
    Qlik Compose for Data Warehouses (formerly Attunity Compose for Data Warehouses) provides a modern approach by automating and optimizing data warehouse creation and operation. Qlik Compose automates designing the warehouse, generating ETL code, and quickly applying updates, all whilst leveraging best practices and proven design patterns. Qlik Compose for Data Warehouses dramatically reduces the time, cost and risk of BI projects, whether on-premises or in the cloud. Qlik Compose for Data Lakes (formerly Attunity Compose for Data Lakes) automates your data pipelines to create analytics-ready data sets. By automating data ingestion, schema creation, and continual updates, organizations realize faster time-to-value from their existing data lake investments.
  • 12
    Kestra

    Kestra

    Kestra is an open-source, event-driven orchestrator that simplifies data operations and improves collaboration between engineers and business users. By bringing Infrastructure as Code best practices to data pipelines, Kestra allows you to build reliable workflows and manage them with confidence. Thanks to the declarative YAML interface for defining orchestration logic, everyone who benefits from analytics can participate in the data pipeline creation process. The UI automatically adjusts the YAML definition any time you make changes to a workflow from the UI or via an API call. Therefore, the orchestration logic is defined declaratively in code, even if some workflow components are modified in other ways.
  • 13
    Unravel

    Unravel Data

    Unravel makes data work anywhere: on Azure, AWS, GCP or in your own data center, optimizing performance, automating troubleshooting, and keeping costs in check. Unravel helps you monitor, manage, and improve your data pipelines in the cloud and on-premises – to drive more reliable performance in the applications that power your business. Get a unified view of your entire data stack. Unravel collects performance data from every platform, system, and application on any cloud, then uses agentless technologies and machine learning to model your data pipelines from end to end. Explore, correlate, and analyze everything in your modern data and cloud environment. Unravel’s data model reveals dependencies, issues, and opportunities, how apps and resources are being used, what’s working and what’s not. Don’t just monitor performance – quickly troubleshoot and rapidly remediate issues. Leverage AI-powered recommendations to automate performance improvements, lower costs, and prepare.
  • 14
    CData Sync

    CData Software

    CData Sync is a universal data pipeline that delivers automated continuous replication between hundreds of SaaS applications & cloud data sources and any major database or data warehouse, on-premise or in the cloud. Replicate data from hundreds of cloud data sources to popular database destinations, such as SQL Server, Redshift, S3, Snowflake, BigQuery, and more. Configuring replication is easy: login, select the data tables to replicate, and select a replication interval. Done. CData Sync extracts data iteratively, causing minimal impact on operational systems by only querying and updating data that has been added or changed since the last update. CData Sync offers the utmost flexibility across full and partial replication scenarios and ensures that critical data is stored safely in your database of choice. Download a 30-day free trial of the Sync application or request more information at www.cdata.com/sync
  • 15
    Astera Centerprise
    Astera Centerprise is a complete on-premise data integration solution that helps extract, transform, profile, cleanse, and integrate data from disparate sources in a code-free, drag-and-drop environment. The software is designed to cater to enterprise-level data integration needs and is used by Fortune 500 companies, like Wells Fargo, Xerox, HP, and more. Through process orchestration, workflow automation, job scheduling, instant data preview, and more, enterprises can easily get accurate, consolidated data for their day-to-day decision making at the speed of business.
  • 16
    Oarkflow

    Oarkflow

    Automate your business pipeline with our flow builder. Use operations that matter to you. Bring your own service providers for email, SMS, and HTTP services. Use our advanced query builder to query and analyze CSV files with any number of fields and rows. We store the CSV files you've uploaded on our platform in a secured vault and account activity logs. We don't store any data records you request for processing.
    Starting Price: $0.0005 per task
  • 17
    Montara

    Montara

    Montara enables BI teams and data analysts, using SQL only, to model and transform their data easily and seamlessly and enjoy benefits such as modular code, CI/CD, versioning and automated testing and documentation. With Montara, analysts can quickly understand how changes to models impact analysis, reports and dashboards with report-level lineage and support for 3rd party visualization tools such as Tableau and Looker. Furthermore, BI teams can perform ad-hoc analysis and create reports and dashboards on Montara directly.
    Starting Price: $100/user/month
  • 18
    Datazoom

    Datazoom

    Improving the experience, efficiency, and profitability of streaming video requires data. Datazoom enables video publishers to better operate distributed architectures through centralizing, standardizing, and integrating data in real-time to create a more powerful data pipeline and improve observability, adaptability, and optimization solutions. Datazoom is a video data platform that continually gathers data from endpoints, like a CDN or a video player, through an ecosystem of collectors. Once the data is gathered, it is normalized using standardized data definitions. This data is then sent through available connectors to analytics platforms like Google BigQuery, Google Analytics, and Splunk and can be visualized in tools such as Looker and Superset. Datazoom is your key to a more effective and efficient data pipeline. Get the data you need in real-time. Don’t wait for your data when you need to resolve an issue immediately.
  • 19
    Castor

    Castor

    Castor is a data catalog designed for mass adoption across the whole company. Have an overview of all your data environment. Search for data instantly thanks to our powerful search engine. Onboard to a new data infrastructure and access data in a breeze. Go beyond your traditional data catalog. Modern data teams now have numerous data sources, build one truth. With its delightful and automated documentation experience, Castor makes it dead simple to trust data. Column-level, cross-system data lineage in minutes. Get a bird’s eye view of your data pipelines to build trust in your data. Troubleshoot data issues, perform impact analyses, comply with GDPR in one tool. Optimize performance, cost, compliance, and security for your data. Keep your data stack healthy with our automated infrastructure monitoring system.
    Starting Price: $699 per month
  • 20
    Informatica Data Engineering
    Ingest, prepare, and process data pipelines at scale for AI and analytics in the cloud. Informatica’s comprehensive data engineering portfolio provides everything you need to process and prepare big data engineering workloads to fuel AI and analytics: robust data integration, data quality, streaming, masking, and data preparation capabilities. Rapidly build intelligent data pipelines with CLAIRE®-powered automation, including automatic change data capture (CDC). Ingest thousands of databases, millions of files, and streaming events. Accelerate time-to-value ROI with self-service access to trusted, high-quality data. Get unbiased, real-world insights on Informatica data engineering solutions from peers you trust. Reference architectures for sustainable data engineering solutions. AI-powered data engineering in the cloud delivers the trusted, high-quality data your analysts and data scientists need to transform business.
  • 21
    Stripe Data Pipeline
    Stripe Data Pipeline sends all your up-to-date Stripe data and reports to Snowflake or Amazon Redshift in a few clicks. Centralize your Stripe data with other business data to close your books faster and unlock richer business insights. Set up Stripe Data Pipeline in minutes and automatically receive your Stripe data and reports in your data warehouse on an ongoing basis–no code required. Create a single source of truth to speed up your financial close and access better insights. Identify your best-performing payment methods, analyze fraud by location, and more. Send your Stripe data directly to your data warehouse without involving a third-party extract, transform, and load (ETL) pipeline. Offload ongoing maintenance with a pipeline that’s built into Stripe. No matter how much data you have, your data is always complete and accurate. Automate data delivery at scale, minimize security risks, and avoid data outages and delays.
    Starting Price: 3¢ per transaction
  • 22
    K2View

    K2View

    At K2View, we believe that every enterprise should be able to leverage its data to become as disruptive and agile as the best companies in its industry. We make this possible through our patented Data Product Platform, which creates and manages a complete and compliant dataset for every business entity – on demand, and in real time. The dataset is always in sync with its underlying sources, adapts to changes in the source structures, and is instantly accessible to any authorized data consumer. Data Product Platform fuels many operational use cases, including customer 360, data masking and tokenization, test data management, data migration, legacy application modernization, data pipelining and more – to deliver business outcomes in less than half the time, and at half the cost, of any other alternative. The platform inherently supports modern data architectures – data mesh, data fabric, and data hub – and deploys in cloud, on-premise, or hybrid environments.
  • 23
    DataOps.live

    DataOps.live

    DataOps.live, the Data Products company, delivers productivity and governance breakthroughs for data developers and teams through environment automation, pipeline orchestration, continuous testing and unified observability. We bring agile DevOps automation and a powerful unified cloud Developer Experience (DX) to modern cloud data platforms like Snowflake. DataOps.live, a global cloud-native company, is used by Global 2000 enterprises including Roche Diagnostics and OneWeb to deliver 1000s of Data Product releases per month with the speed and governance the business demands.
  • 24
    Lyftrondata

    Lyftrondata

    Whether you want to build a governed delta lake or data warehouse, or simply want to migrate from your traditional database to a modern cloud data warehouse, do it all with Lyftrondata. Simply create and manage all of your data workloads on one platform by automatically building your pipeline and warehouse. Analyze it instantly with ANSI SQL and BI/ML tools, and share it without worrying about writing any custom code. Boost the productivity of your data professionals and shorten your time to value. Define, categorize, and find all data sets in one place. Share these data sets with other experts with zero coding and drive data-driven insights. This data sharing ability is perfect for companies that want to store their data once, share it with other experts, and use it multiple times, now and in the future. Define a dataset, apply SQL transformations, or simply migrate your SQL data processing logic to any cloud data warehouse.
  • 25
    Integrate.io

    Integrate.io

    Unify Your Data Stack: Experience the first no-code data pipeline platform and power enlightened decision making. Integrate.io is the only complete set of data solutions & connectors for easy building and managing of clean, secure data pipelines. Increase your data team's output with all of the simple, powerful tools & connectors you’ll ever need in one no-code data integration platform. Empower any size team to consistently deliver projects on-time & under budget. We ensure your success by partnering with you to truly understand your needs & desired outcomes. Our only goal is to help you overachieve yours. Integrate.io's Platform includes:
    - No-Code ETL & Reverse ETL: Drag & drop no-code data pipelines with 220+ out-of-the-box data transformations
    - Easy ELT & CDC: The Fastest Data Replication On The Market
    - Automated API Generation: Build Automated, Secure APIs in Minutes
    - Data Warehouse Monitoring: Finally Understand Your Warehouse Spend
    - FREE Data Observability: Custom
  • 26
    Azure Event Hubs
    Event Hubs is a fully managed, real-time data ingestion service that’s simple, trusted, and scalable. Stream millions of events per second from any source to build dynamic data pipelines and immediately respond to business challenges. Keep processing data during emergencies using the geo-disaster recovery and geo-replication features. Integrate seamlessly with other Azure services to unlock valuable insights. Allow existing Apache Kafka clients and applications to talk to Event Hubs without any code changes—you get a managed Kafka experience without having to manage your own clusters. Experience real-time data ingestion and microbatching on the same stream. Focus on drawing insights from your data instead of managing infrastructure. Build real-time big data pipelines and respond to business challenges right away.
    Starting Price: $0.03 per hour
  • 27
    Hevo

    Hevo Data

    Hevo Data is a no-code, bi-directional data pipeline platform built specifically for modern ETL, ELT, and Reverse ETL needs. It helps data teams streamline and automate org-wide data flows, saving ~10 hours of engineering time per week and enabling 10x faster reporting, analytics, and decision making. The platform supports 100+ ready-to-use integrations across Databases, SaaS Applications, Cloud Storage, SDKs, and Streaming Services. Over 500 data-driven companies spread across 35+ countries trust Hevo for their data integration needs. Try Hevo today and get your fully managed data pipelines up and running in just a few minutes.
    Starting Price: $249/month
  • 28
    Nextflow Tower

    Seqera Labs

    Nextflow Tower is an intuitive centralized command post that enables large-scale collaborative data analysis. With Tower, users can easily launch, manage, and monitor scalable Nextflow data analysis pipelines and compute environments on-premises or on most clouds. Researchers can focus on the science that matters rather than worrying about infrastructure engineering. Compliance is simplified with predictable, auditable pipeline execution and the ability to reliably reproduce results obtained with specific data sets and pipeline versions on demand. Nextflow Tower is developed and supported by Seqera Labs, the creators and maintainers of the open-source Nextflow project. This means that users get high-quality support directly from the source. Unlike third-party frameworks that incorporate Nextflow, Tower is deeply integrated and can help users benefit from Nextflow's complete set of capabilities.
  • 29
    Lightbend

    Lightbend

    Lightbend provides technology that enables developers to easily build data-centric applications that bring the most demanding, globally distributed applications and streaming data pipelines to life. Companies worldwide turn to Lightbend to solve the challenges of real-time, distributed data in support of their most business-critical initiatives. Akka Platform provides the building blocks that make it easy for businesses to build, deploy, and run large-scale applications that support digitally transformative initiatives. Accelerate time-to-value and reduce infrastructure and cloud costs with reactive microservices that take full advantage of the distributed nature of the cloud and are resilient to failure, highly efficient, and operative at any scale. Native support for encryption, data shredding, TLS enforcement, and continued compliance with GDPR. Framework for quick construction, deployment and management of streaming data pipelines.
  • 30
    StreamScape

    StreamScape

    Make use of Reactive Programming on the back-end without the need for specialized languages or cumbersome frameworks. Triggers, Actors and Event Collections make it easy to build data pipelines and work with data streams using simple SQL-like syntax, shielding users from the complexities of distributed system development. Extensible Data Modeling is a key feature that supports rich semantics and schema definition for representing real-world things. On-the-fly validation and data shaping rules support a variety of formats like XML and JSON, allowing you to easily describe and evolve your schema, keeping pace with changing business requirements. If you can describe it, we can query it. Know SQL and JavaScript? Then you already know how to use the data engine. Whatever the format, a powerful query language lets you instantly test logic expressions and functions, speeding up development and simplifying deployment for unmatched data agility.
  • 31
    Google Cloud Composer
    Cloud Composer's managed nature and Apache Airflow compatibility allow you to focus on authoring, scheduling, and monitoring your workflows as opposed to provisioning resources. End-to-end integration with Google Cloud products including BigQuery, Dataflow, Dataproc, Datastore, Cloud Storage, Pub/Sub, and AI Platform gives users the freedom to fully orchestrate their pipeline. Author, schedule, and monitor your workflows through a single orchestration tool, whether your pipeline lives on-premises, in multiple clouds, or fully within Google Cloud. Ease your transition to the cloud or maintain a hybrid data environment by orchestrating workflows that cross between on-premises and the public cloud. Create workflows that connect data, processing, and services across clouds to give you a unified data environment.
    Starting Price: $0.074 per vCPU hour
  • 32
    Pandio

    Pandio

    Connecting systems to scale AI initiatives is complex, expensive, and prone to fail. Pandio’s cloud-native managed solution simplifies your data pipelines to harness the power of AI. Access your data from anywhere at any time in order to query, analyze, and drive to insight. Big data analytics without the big cost. Enable data movement seamlessly. Streaming, queuing and pub-sub with unmatched throughput, latency, and durability. Design, train, and deploy machine learning models locally in less than 30 minutes. Accelerate your path to ML and democratize the process across your organization. And it doesn’t require months (or years) of disappointment. Pandio’s AI-driven architecture automatically orchestrates your models, data, and ML tools. Pandio works with your existing stack to accelerate your ML initiatives. Orchestrate your models and messages across your organization.
    Starting Price: $1.40 per hour
  • 33
    Amazon MWAA

    Amazon

    Amazon Managed Workflows for Apache Airflow (MWAA) is a managed orchestration service for Apache Airflow that makes it easier to set up and operate end-to-end data pipelines in the cloud at scale. Apache Airflow is an open-source tool used to programmatically author, schedule, and monitor sequences of processes and tasks referred to as “workflows.” With Managed Workflows, you can use Airflow and Python to create workflows without having to manage the underlying infrastructure for scalability, availability, and security. Managed Workflows automatically scales its workflow execution capacity to meet your needs, and is integrated with AWS security services to help provide you with fast and secure access to data.
    Starting Price: $0.49 per hour
  • 34
    BettrData

    BettrData

    Our automated data operations platform will allow businesses to reduce or reallocate the number of full-time employees needed to support their data operations. This is traditionally a very manual and expensive process, and our product packages it all together to simplify the process and significantly reduce costs. With so much problematic data in business, most companies cannot give appropriate attention to the quality of their data because they are too busy processing it. By using our product, you automatically become a proactive business when it comes to data quality. With clear visibility of all incoming data and a built-in alerting system, our platform ensures that your data quality standards are met. We are a first-of-its-kind solution that has taken many costly manual processes and put them into a single platform. The BettrData.io platform is ready to use after a simple installation and several straightforward configurations.
  • 35
    datuum.ai

    Datuum

    AI-powered data integration tool that helps streamline the process of customer data onboarding. It allows for easy and fast automated data integration from various sources without coding, reducing preparation time to just a few minutes. With Datuum, organizations can efficiently extract, ingest, transform, migrate, and establish a single source of truth for their data, while integrating it into their existing data storage. Datuum is a no-code product and can reduce up to 80% of the time spent on data-related tasks, freeing up time for organizations to focus on generating insights and improving the customer experience. With over 40 years of experience in data management and operations, we at Datuum have incorporated our expertise into the core of our product, addressing the key challenges faced by data engineers and managers and ensuring that the platform is user-friendly, even for non-technical specialists.
  • 36
    Osmos

    Osmos

    With Osmos, your customers can easily clean their messy data files and import them directly into your operational system without writing a line of code. At the core, we have an AI-powered data transformation engine that enables users to map, validate, and clean data with only a few clicks. Your account will be charged or credited based on the percentage of the billing cycle left at the time the plan was changed. An eCommerce company automates ingestion of product catalog data from multiple distributors and vendors into their database. A manufacturing company automates the data ingestion of purchase orders from email attachments into Netsuite. Automatically clean up and reformat incoming data to match your destination schema. Never deal with custom scripts and spreadsheets again.
    Starting Price: $299 per month
  • 37
    StreamSets

    StreamSets

    StreamSets

    StreamSets DataOps Platform. The data integration platform to build, run, monitor and manage smart data pipelines that deliver continuous data for DataOps, and power modern analytics and hybrid integration. Only StreamSets provides a single design experience for all design patterns for 10x greater developer productivity; smart data pipelines that are resilient to change for 80% fewer breakages; and a single pane of glass for managing and monitoring all pipelines across hybrid and cloud architectures to eliminate blind spots and control gaps. With StreamSets, you can deliver the continuous data that drives the connected enterprise.
    Starting Price: $1000 per month
  • 38
    RudderStack

    RudderStack

    RudderStack

    RudderStack is the smart customer data pipeline. Easily build pipelines connecting your whole customer data stack, then make them smarter by pulling analysis from your data warehouse to trigger enrichment and activation in customer tools for identity stitching and other advanced use cases. Start building smarter customer data pipelines today.
    Starting Price: $750/month
  • 39
    Y42

    Y42

    Datos-Intelligence GmbH

    Y42 is the first fully managed Modern DataOps Cloud. It is purpose-built to help companies easily design production-ready data pipelines on top of their Google BigQuery or Snowflake cloud data warehouse. Y42 provides native integration of best-of-breed open-source data tools, comprehensive data governance, and better collaboration for data teams. With Y42, organizations enjoy increased accessibility to data and can make data-driven decisions quickly and efficiently.
  • 40
    definity

    definity

    definity

    Monitor and control everything your data pipelines do with zero code changes. Monitor data and pipelines in motion to proactively prevent downtime and quickly root cause issues. Optimize pipeline runs and job performance to save costs and keep SLAs. Accelerate code deployments and platform upgrades while maintaining reliability and performance. Data & performance checks in line with pipeline runs. Checks on input data, before pipelines even run. Automatic preemption of runs. definity takes away the effort to build deep end-to-end coverage, so you are protected at every step, across every dimension. definity shifts observability to post-production to achieve ubiquity, increase coverage, and reduce manual effort. definity agents automatically run with every pipeline, with zero footprints. Unified view of data, pipelines, infra, lineage, and code for every data asset. Detect in run-time and avoid async checks. Auto-preempt runs, even on inputs.
  • 41
    BigBI

    BigBI

    BigBI

    BigBI enables data specialists to build their own powerful big data pipelines interactively and efficiently, without any coding. BigBI unleashes the power of Apache Spark, enabling scalable processing of real Big Data (up to 100X faster); integration of traditional data (SQL, batch files) with modern data sources, including semi-structured (JSON, NoSQL DBs, Elastic, Hadoop) and unstructured (text, audio, video) data; and integration of streaming data, cloud data, AI/ML, and graphs.
  • 42
    TIBCO Data Fabric
    More data sources, more silos, more complexity, and constant change. Data architectures are challenged to keep pace—a big problem for today's data-driven organizations, and one that puts your business at risk. A data fabric is a modern distributed data architecture that includes shared data assets and optimized data fabric pipelines that you can use to address today's data challenges in a unified way. Optimized data management and integration capabilities so you can intelligently simplify, automate, and accelerate your data pipelines. Easy-to-deploy and adapt distributed data architecture that fits your complex, ever-changing technology landscape. Accelerate time to value by unlocking your distributed on-premises, cloud, and hybrid cloud data, no matter where it resides, and delivering it wherever it's needed at the pace of business.
  • 43
    Fosfor Spectra
    Spectra is a comprehensive DataOps (data ingestion, transformation, and preparation) platform to build and manage complex, varied data pipelines using a low-code user interface with domain-specific features to deliver data solutions at speed and at scale. Maximize your ROI with faster time-to-market and time-to-value, and reduced cost of ownership. Get access to over 50 pre-built native connectors with readily available data processing functions such as sort, look-up, join, transform, grouping and many more. Process structured, semi-structured, and unstructured data in batch or real-time streaming mode. Optimize and control infrastructure spending for your organization by efficiently managing data processing and data pipelines. Spectra’s pushdown transformation capabilities with Snowflake Data Cloud enable enterprises to leverage Snowflake’s high-performing processing power and scalable architecture.
  • 44
    Actifio

    Actifio

    Google

    Automate self-service provisioning and refresh of enterprise workloads, integrate with existing toolchain. High-performance data delivery and re-use for data scientists through a rich set of APIs and automation. Recover any data across any cloud from any point in time – at the same time – at scale, beyond legacy solutions. Minimize the business impact of ransomware / cyber attacks by recovering quickly with immutable backups. Unified platform to better protect, secure, retain, govern, or recover your data on-premises or in the cloud. Actifio’s patented software platform turns data silos into data pipelines. Virtual Data Pipeline (VDP) delivers full-stack data management — on-premises, hybrid or multi-cloud – from rich application integration, SLA-based orchestration, flexible data movement, and data immutability and security.
  • 45
    Gathr

    Gathr

    Gathr

    The only all-in-one data pipeline platform. Built ground-up for a cloud-first world, Gathr is the only platform to handle all your data integration and engineering needs - ingestion, ETL, ELT, CDC, streaming analytics, data preparation, machine learning, advanced analytics and more. With Gathr, anyone can build and deploy pipelines in minutes, irrespective of skill levels. Create Ingestion pipelines in minutes, not weeks. Ingest data from any source, deliver to any destination. Build applications quickly with a wizard-based approach. Replicate data in real-time using a templatized CDC app. Native integration for all sources and targets. Best-in-class capabilities with everything you need to succeed today and tomorrow. Choose between free, pay-per-use or customize as per your requirements.
  • 46
    Google Cloud Data Fusion
    Open core, delivering hybrid and multi-cloud integration. Data Fusion is built using open source project CDAP, and this open core ensures data pipeline portability for users. CDAP’s broad integration with on-premises and public cloud platforms gives Cloud Data Fusion users the ability to break down silos and deliver insights that were previously inaccessible. Integrated with Google’s industry-leading big data tools. Data Fusion’s integration with Google Cloud simplifies data security and ensures data is immediately available for analysis. Whether you’re curating a data lake with Cloud Storage and Dataproc, moving data into BigQuery for data warehousing, or transforming data to land it in a relational store like Cloud Spanner, Cloud Data Fusion’s integration makes development and iteration fast and easy.
  • 47
    Nextflow

    Nextflow

    Seqera Labs

    Data-driven computational pipelines. Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages. Its fluent DSL simplifies the implementation and deployment of complex parallel and reactive workflows on clouds and clusters. Nextflow is built around the idea that Linux is the lingua franca of data science. Nextflow allows you to write a computational pipeline by making it simpler to put together many different tasks. You may reuse your existing scripts and tools and you don't need to learn a new language or API to start using it. Nextflow supports Docker and Singularity container technologies. This, along with the integration of the GitHub code-sharing platform, allows you to write self-contained pipelines, manage versions, and rapidly reproduce any former configuration. Nextflow provides an abstraction layer between your pipeline's logic and the execution layer.
    Starting Price: Free
  • 48
    Etleap

    Etleap

    Etleap

    Etleap was built from the ground up on AWS to support Redshift and Snowflake data warehouses and S3/Glue data lakes. Their solution simplifies and automates ETL by offering fully-managed ETL-as-a-service. Etleap's data wrangler and modeling tools let users control how data is transformed for analysis, without writing any code. Etleap monitors and maintains data pipelines for availability and completeness, eliminating the need for constant maintenance, and centralizes data from 50+ disparate sources and silos into your data warehouse or data lake.
  • 49
    Upsolver

    Upsolver

    Upsolver

    Upsolver makes it incredibly simple to build a governed data lake and to manage, integrate and prepare streaming data for analysis. Define pipelines using only SQL on auto-generated schema-on-read. Easy visual IDE to accelerate building pipelines. Add Upserts and Deletes to data lake tables. Blend streaming and large-scale batch data. Automated schema evolution and reprocessing from previous state. Automatic orchestration of pipelines (no DAGs). Fully-managed execution at scale. Strong consistency guarantee over object storage. Near-zero maintenance overhead for analytics-ready data. Built-in hygiene for data lake tables including columnar formats, partitioning, compaction and vacuuming. 100,000 events per second (billions daily) at low cost. Continuous lock-free compaction to avoid “small files” problem. Parquet-based tables for fast queries.
  • 50
    Tarsal

    Tarsal

    Tarsal

    Tarsal's infinite scalability means as your organization grows, Tarsal grows with you. Tarsal makes it easy for you to switch where you're sending data - today's SIEM data is tomorrow's data lake data; all with one click. Keep your SIEM and gradually migrate analytics over to a data lake. You don't have to rip anything out to use Tarsal. Some analytics just won't run on your SIEM. Use Tarsal to have query-ready data on a data lake. Your SIEM is one of the biggest line items in your budget. Use Tarsal to send some of that data to your data lake. Tarsal is the first highly scalable ETL data pipeline built for security teams. Easily exfil terabytes of data in just a few clicks, with instant normalization, and route that data to your desired destination.