Compare the Top Data Lake Solutions in Brazil as of September 2024

What are Data Lake Solutions in Brazil?

Data lakes are centralized repositories that store high volumes of raw data in object storage using a flat architecture, rather than a hierarchical structure like a data warehouse. Compare and read user reviews of the best Data Lake solutions in Brazil currently available using the list below. This list is updated regularly.
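
To make the "flat architecture" idea concrete, here is a minimal Python sketch. It is only an illustration: `flat_store`, `put_object`, and `list_objects` are hypothetical names and an in-memory dictionary stands in for a real object store. It shows that objects live in a single namespace keyed by strings, and that "folders" are only key prefixes.

```python
# Hypothetical in-memory stand-in for an object store:
# one flat mapping of key -> raw bytes, with no directory tree.
flat_store: dict[str, bytes] = {}


def put_object(key: str, data: bytes) -> None:
    """Store raw bytes under a key; no directories are created."""
    flat_store[key] = data


def list_objects(prefix: str) -> list[str]:
    """Return the sorted keys that start with the given prefix.

    The "folder" is only a naming convention applied to the key.
    """
    return sorted(k for k in flat_store if k.startswith(prefix))


# Raw data of different kinds sits side by side in the same namespace.
put_object("raw/sales/2024-09-01.json", b'{"order": 1}')
put_object("raw/sales/2024-09-02.json", b'{"order": 2}')
put_object("raw/logs/app.log", b"started")

sales_keys = list_objects("raw/sales/")
all_raw_keys = list_objects("raw/")
```

Listing by prefix is how tools typically present a folder-like view over flat storage; the data itself is never reorganized into a hierarchy.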

  • 1
    Scalytics Connect
    Scalytics Connect enables AI and ML to process and analyze data, and makes it easier and more secure to use different data processing platforms at the same time. Built by the inventors of Apache Wayang, Scalytics Connect is the most enhanced data management platform, reducing the complexity of ETL data pipelines dramatically. Scalytics Connect is a data management and ETL platform that helps organizations unlock the power of their data, regardless of where it resides. It empowers businesses to break down data silos, simplify access, and gain valuable insights through a variety of features, including:
    - AI-powered ETL: Automates tasks like data extraction, transformation, and loading, freeing up your resources for more strategic work.
    - Unified Data Landscape: Breaks down data silos and provides a holistic view of all your data, regardless of its location or format.
    - Effortless Scaling: Handles growing data volumes with ease, so you never get bottlenecked by information overload.
    Starting Price: $0
  • 2
    DataLakeHouse.io

    DataLakeHouse.io

    DataLakeHouse.io (DLH.io) Data Sync provides replication and synchronization of operational systems (on-premise and cloud-based SaaS) data into destinations of their choosing, primarily Cloud Data Warehouses. Built for marketing teams and really any data team at any size organization, DLH.io enables business cases for building single source of truth data repositories, such as dimensional data warehouses, data vault 2.0, and other machine learning workloads. Use cases are technical and functional including: ELT, ETL, Data Warehouse, Pipeline, Analytics, AI & Machine Learning, Data, Marketing, Sales, Retail, FinTech, Restaurant, Manufacturing, Public Sector, and more. DataLakeHouse.io is on a mission to orchestrate data for every organization, particularly those desiring to become data-driven or those that are continuing their data-driven strategy journey. DataLakeHouse.io (aka DLH.io) enables hundreds of companies to manage their cloud data warehousing and analytics solutions.
    Starting Price: $99
  • 3
    Snowflake

    Snowflake

    Your cloud data platform. Secure and easy access to any data with infinite scalability. Get all the insights from all your data by all your users, with the instant and near-infinite performance, concurrency and scale your organization requires. Seamlessly share and consume shared data to collaborate across your organization, and beyond, to solve your toughest business problems in real time. Boost the productivity of your data professionals and shorten your time to value in order to deliver modern and integrated data solutions swiftly from anywhere in your organization. Whether you’re moving data into Snowflake or extracting insight out of Snowflake, our technology partners and system integrators will help you deploy Snowflake for your success.
    Starting Price: $40.00 per month
  • 4
    Cloudera

    Cloudera

    Manage and secure the data lifecycle from the Edge to AI in any cloud or data center. Operates across all major public clouds and the private cloud with a public cloud experience everywhere. Integrates data management and analytic experiences across the data lifecycle for data anywhere. Delivers security, compliance, migration, and metadata management across all environments. Open source, open integrations, extensible, & open to multiple data stores and compute architectures. Deliver easier, faster, and safer self-service analytics experiences. Provide self-service access to integrated, multi-function analytics on centrally managed and secured business data while deploying a consistent experience anywhere—on premises or in hybrid and multi-cloud. Enjoy consistent data security, governance, lineage, and control, while deploying the powerful, easy-to-use cloud analytics experiences business users require and eliminating their need for shadow IT solutions.
  • 5
    Narrative

    Narrative

    Create new streams of revenue using the data you already collect with your own branded data shop. Narrative is focused on the fundamental principles that make buying and selling data easier, safer, and more strategic. Ensure that the data you access meets your standards, whatever they may be. Know exactly who you’re working with and how the data was collected. Easily access new supply and demand for a more agile and accessible data strategy. Own your data strategy entirely with end-to-end control of inputs and outputs. Our platform simplifies and automates the most time- and labor-intensive aspects of data acquisition, so you can access new data sources in days, not months. With filters, budget controls, and automatic deduplication, you’ll only ever pay for the data you need, and nothing that you don’t.
    Starting Price: $0
  • 6
    ChaosSearch

    ChaosSearch

    Log analytics should not break the bank. Because most logging solutions use one or both of these technologies - Elasticsearch database and/or Lucene index - the cost of operation is unreasonably high. ChaosSearch takes a revolutionary approach. We reinvented indexing, which allows us to pass along substantial cost savings to our customers. ChaosSearch is a fully managed SaaS platform that allows you to focus on search and analytics in AWS S3 rather than spend time managing and tuning databases. Leverage your existing AWS S3 infrastructure and let us do the rest. Our unique approach and architecture allow ChaosSearch to address the challenges of today’s data and analytics requirements. ChaosSearch indexes your data as-is, for log, SQL and ML analytics, without transformation, while auto-detecting native schemas. ChaosSearch is an ideal replacement for the commonly deployed Elasticsearch solutions.
    Starting Price: $750 per month
  • 7
    Sprinkle

    Sprinkle Data

    Businesses today need to adapt faster to ever-evolving customer requirements and preferences. Sprinkle helps you manage these expectations with an agile analytics platform that meets changing needs with ease. We started Sprinkle with the goal of simplifying end-to-end data analytics for organisations, so that they don’t have to worry about integrating data from various sources, changing schemas and managing pipelines. We built a platform that empowers everyone in the organisation to browse and dig deeper into the data without any technical background. Our team has worked extensively with data while building analytics systems for companies like Flipkart, Inmobi, and Yahoo. These companies succeed by maintaining dedicated teams of data scientists, business analysts and engineers churning out reports and insights. We realized that most organizations struggle with simple self-serve reporting and data exploration. So we set out to build a solution that will help all companies leverage data.
    Starting Price: $499 per month
  • 8
    iomete

    iomete

    Modern lakehouse built on top of Apache Iceberg and Apache Spark. Includes: serverless lakehouse, serverless Spark jobs, SQL editor, advanced data catalog and built-in BI (or connect a third-party BI tool, e.g. Tableau or Looker). iomete has an extreme value proposition, with compute prices equal to AWS on-demand pricing. No mark-ups. AWS users get our platform basically for free.
    Starting Price: Free
  • 9
    Lyzr

    Lyzr AI

    Lyzr is an enterprise Generative AI company that offers private and secure AI Agent SDKs and an AI Management System. Lyzr helps enterprises build, launch and manage secure GenAI applications, in their AWS cloud or on-prem infra. No more sharing sensitive data with SaaS platforms or GenAI wrappers. And no more reliability and integration issues of open-source tools. Differentiating from competitors such as Cohere, Langchain, and LlamaIndex, Lyzr.ai follows a use-case-focused approach, building full-service yet highly customizable SDKs, simplifying the addition of LLM capabilities to enterprise applications. AI Agents:
    - Jazon - The AI SDR
    - Skott - The AI digital marketer
    - Kathy - The AI competitor analyst
    - Diane - The AI HR manager
    - Jeff - The AI customer success manager
    - Bryan - The AI inbound sales specialist
    - Rachelz - The AI legal assistant
    Starting Price: $0 per month
  • 10
    Utilihive

    Greenbird Integration Technology

    Utilihive is a cloud-native big data integration platform, purpose-built for the digital data-driven utility, offered as a managed service (SaaS). Utilihive is the leading enterprise iPaaS purpose-built for energy and utility usage scenarios. Utilihive provides both the technical infrastructure platform (connectivity, integration, data ingestion, data lake, API management) and pre-configured integration content or accelerators (connectors, data flows, orchestrations, utility data model, energy data services, monitoring and reporting dashboards) to speed up the delivery of innovative data-driven services and simplify operations. Utilities play a vital role in achieving the Sustainable Development Goals and now have the opportunity to build universal platforms to facilitate the data economy in a new world including renewable energy. Seamless access to data is crucial to accelerate the digital transformation.
  • 11
    Sesame Software

    Sesame Software

    Sesame Software specializes in secure, efficient data integration and replication across diverse cloud, hybrid, and on-premise sources. Our patented scalability ensures comprehensive access to critical business data, facilitating a holistic view in the BI tools of your choice. This unified perspective empowers your own robust reporting and analytics, enabling your organization to regain control of your data with confidence. At Sesame Software, we understand what’s at stake when you need to move a massive amount of data between environments quickly, while keeping it protected, maintaining centralized access, and ensuring compliance with regulations. Over the past 23+ years, we’ve helped hundreds of organizations like Procter & Gamble, Bank of America, and the U.S. government connect, move, store, and protect their data.
  • 12
    IBM Storage Scale
    IBM Storage Scale is software-defined file and object storage that enables organizations to build a global data platform for artificial intelligence (AI), high-performance computing (HPC), advanced analytics, and other demanding workloads. Unlike traditional applications that work with structured data, today’s performance-intensive AI and analytics workloads operate on unstructured data, such as documents, audio, images, videos, and other objects. IBM Storage Scale software provides global data abstraction services that seamlessly connect multiple data sources across multiple locations, including non-IBM storage environments. It’s based on a massively parallel file system and can be deployed on multiple hardware platforms including x86, IBM Power, IBM zSystem mainframes, ARM-based POSIX client, virtual machines, and Kubernetes.
    Starting Price: $19.10 per terabyte
  • 13
    Mozart Data

    Mozart Data

    Mozart Data is the all-in-one modern data platform that makes it easy to consolidate, organize, and analyze data. Start making data-driven decisions by setting up a modern data stack in an hour - no engineering required.
  • 14
    Dataleyk

    Dataleyk

    Dataleyk is the secure, fully-managed cloud data platform for SMBs. Our mission is to make Big Data analytics easy and accessible to all. Dataleyk is the missing link in reaching your data-driven goals. Our platform makes it quick and easy to have a stable, flexible and reliable cloud data lake with near-zero technical knowledge. Bring all of your company data from every single source, explore with SQL and visualize with your favorite BI tool or our advanced built-in graphs. Modernize your data warehousing with Dataleyk. Our state-of-the-art cloud data platform is ready to handle your scalable structured and unstructured data. Data is an asset. Dataleyk is a secure cloud data platform that encrypts all of your data and offers on-demand data warehousing. Zero maintenance, as an objective, may not be easy to achieve. But as an initiative, it can be a driver for significant delivery improvements and transformational results.
    Starting Price: €0.1 per GB
  • 15
    ELCA Smart Data Lake Builder
    Classical Data Lakes are often reduced to basic but cheap raw data storage, neglecting significant aspects like transformation, data quality and security. These topics are left to data scientists, who end up spending up to 80% of their time acquiring, understanding and cleaning data before they can start using their core competencies. In addition, classical Data Lakes are often implemented by separate departments using different standards and tools, which makes it harder to implement comprehensive analytical use cases. Smart Data Lakes solve these various issues by providing architectural and methodical guidelines, together with an efficient tool to build a strong high-quality data foundation. Smart Data Lakes are at the core of any modern analytics platform. Their structure easily integrates prevalent Data Science tools and open source technologies, as well as AI and ML. Their storage is cheap and scalable, supporting both unstructured data and complex data structures.
    Starting Price: Free
  • 16
    Openbridge

    Openbridge

    Uncover insights to supercharge sales growth using code-free, fully-automated data pipelines to data lakes or cloud warehouses. A flexible, standards-based platform to unify sales and marketing data for automating insights and smarter growth. Say goodbye to messy, expensive manual data downloads. Always know what you’ll pay and only pay for what you use. Fuel your tools with quick access to analytics-ready data. As certified developers, we only work with secure, official APIs. Get started quickly with data pipelines from popular sources. Pre-built, pre-transformed, and ready-to-go data pipelines. Unlock data from Amazon Vendor Central, Amazon Seller Central, Instagram Stories, Facebook, Amazon Advertising, Google Ads, and many others. Code-free data ingestion and transformation processes allow teams to realize value from their data quickly and cost-effectively. Data is always securely stored directly in a trusted, customer-owned data destination like Databricks, Amazon Redshift, etc.
    Starting Price: $149 per month
  • 17
    BigLake

    Google

    BigLake is a storage engine that unifies data warehouses and lakes by enabling BigQuery and open-source frameworks like Spark to access data with fine-grained access control. BigLake provides accelerated query performance across multi-cloud storage and open formats such as Apache Iceberg. Store a single copy of data with uniform features across data warehouses & lakes. Fine-grained access control and multi-cloud governance over distributed data. Seamless integration with open-source analytics tools and open data formats. Unlock analytics on distributed data regardless of where and how it’s stored, while choosing the best analytics tools, open source or cloud-native over a single copy of data. Fine-grained access control across open source engines like Apache Spark, Presto, and Trino, and open formats such as Parquet. Performant queries over data lakes powered by BigQuery. Integrates with Dataplex to provide management at scale, including logical data organization.
    Starting Price: $5 per TB
  • 18
    Hydrolix

    Hydrolix

    Hydrolix is a streaming data lake that combines decoupled storage, indexed search, and stream processing to deliver real-time query performance at terabyte-scale for a radically lower cost. CFOs love the 4x reduction in data retention costs. Product teams love 4x more data to work with. Spin up resources when you need them and scale to zero when you don’t. Fine-tune resource consumption and performance by workload to control costs. Imagine what you can build when you don’t have to sacrifice data because of budget. Ingest, enrich, and transform log data from multiple sources including Kafka, Kinesis, and HTTP. Return just the data you need, no matter how big your data is. Reduce latency and costs, and eliminate timeouts and brute-force queries. Storage is decoupled from ingest and query, allowing each to independently scale to meet performance and budget targets. Hydrolix’s high-density compression (HDX) typically reduces 1TB of stored data to 55GB.
    Starting Price: $2,237 per month
  • 19
    Databricks Data Intelligence Platform
    The Databricks Data Intelligence Platform allows your entire organization to use data and AI. It’s built on a lakehouse to provide an open, unified foundation for all data and governance, and is powered by a Data Intelligence Engine that understands the uniqueness of your data. The winners in every industry will be data and AI companies. From ETL to data warehousing to generative AI, Databricks helps you simplify and accelerate your data and AI goals. Databricks combines generative AI with the unification benefits of a lakehouse to power a Data Intelligence Engine that understands the unique semantics of your data. This allows the Databricks Platform to automatically optimize performance and manage infrastructure in ways unique to your business. The Data Intelligence Engine understands your organization’s language, so search and discovery of new data is as easy as asking a question like you would to a coworker.
  • 20
    Upsolver

    Upsolver

    Upsolver makes it incredibly simple to build a governed data lake and to manage, integrate and prepare streaming data for analysis. Define pipelines using only SQL on auto-generated schema-on-read. Easy visual IDE to accelerate building pipelines. Add Upserts and Deletes to data lake tables. Blend streaming and large-scale batch data. Automated schema evolution and reprocessing from previous state. Automatic orchestration of pipelines (no DAGs). Fully-managed execution at scale. Strong consistency guarantee over object storage. Near-zero maintenance overhead for analytics-ready data. Built-in hygiene for data lake tables including columnar formats, partitioning, compaction and vacuuming. 100,000 events per second (billions daily) at low cost. Continuous lock-free compaction to avoid “small files” problem. Parquet-based tables for fast queries.
  • 21
    Qubole

    Qubole

    Qubole is a simple, open, and secure Data Lake Platform for machine learning, streaming, and ad-hoc analytics. Our platform provides end-to-end services that reduce the time and effort required to run Data pipelines, Streaming Analytics, and Machine Learning workloads on any cloud. No other platform offers the openness and data workload flexibility of Qubole while lowering cloud data lake costs by over 50 percent. Qubole delivers faster access to petabytes of secure, reliable and trusted datasets of structured and unstructured data for Analytics and Machine Learning. Users conduct ETL, analytics, and AI/ML workloads efficiently in end-to-end fashion across best-of-breed open source engines, multiple formats, libraries, and languages adapted to data volume, variety, SLAs and organizational policies.
  • 22
    Lyftrondata

    Lyftrondata

    Whether you want to build a governed delta lake or data warehouse, or simply want to migrate from your traditional database to a modern cloud data warehouse, do it all with Lyftrondata. Simply create and manage all of your data workloads on one platform by automatically building your pipeline and warehouse. Analyze it instantly with ANSI SQL and BI/ML tools, and share it without worrying about writing any custom code. Boost the productivity of your data professionals and shorten your time to value. Define, categorize, and find all data sets in one place. Share these data sets with other experts with zero coding and drive data-driven insights. This data sharing ability is perfect for companies that want to store their data once, share it with other experts, and use it multiple times, now and in the future. Define datasets, apply SQL transformations, or simply migrate your SQL data processing logic to any cloud data warehouse.
  • 23
    Datametica

    Datametica

    At Datametica, our birds with unprecedented capabilities help eliminate business risks, cost, time, frustration, and anxiety from the entire process of data warehouse migration to the cloud. Migrate your existing data warehouse, data lake, ETL, and enterprise business intelligence to the cloud environment of your choice using Datametica’s automated product suite. Architect an end-to-end migration strategy, with workload discovery, assessment, planning, and cloud optimization. Starting from discovery and assessment of your existing data warehouse to planning the migration strategy, Eagle gives clarity on what needs to be migrated and in what sequence, how the process can be streamlined, and what the timelines and costs are. The holistic view of the workloads and planning reduces the migration risk without impacting the business.
  • 24
    Qlik Data Integration
    The Qlik Data Integration platform for managed data lakes automates the process of providing continuously updated, accurate, and trusted data sets for business analytics. Data engineers have the agility to quickly add new sources and ensure success at every step of the data lake pipeline from real-time data ingestion, to refinement, provisioning, and governance. A simple and universal solution for continually ingesting enterprise data into popular data lakes in real-time. A model-driven approach for quickly designing, building, and managing data lakes on-premises or in the cloud. Deliver a smart enterprise-scale data catalog to securely share all of your derived data sets with business users.
  • 25
    Huawei Cloud Data Lake Governance Center
    Simplify big data operations and build intelligent knowledge libraries with Data Lake Governance Center (DGC), a one-stop data lake operations platform that manages data design, development, integration, quality, and assets. Build an enterprise-class data lake governance platform with an easy-to-use visual interface. Streamline data lifecycle processes, utilize metrics and analytics, and ensure good governance across your enterprise. Define and monitor data standards, and get real-time alerts. Build data lakes quicker by easily setting up data integrations, models, and cleaning rules, to enable the discovery of new reliable data sources. Maximize the business value of data. With DGC, end-to-end data operations solutions can be designed for scenarios such as smart government, smart taxation, and smart campus. Gain new insights into sensitive data across your entire organization. DGC allows enterprises to define business catalogs, classifications, and terms.
    Starting Price: $428 one-time payment
  • 26
    NewEvol

    Sattrix Software Solutions

    NewEvol is a technologically advanced product suite that uses data science for advanced analytics to identify abnormalities in the data itself. Supported by visualization, rule-based alerting, automation, and responses, NewEvol becomes a more compelling proposition for any small to large enterprise. Machine Learning (ML) and security intelligence feeds make NewEvol a more robust system to cater to challenging business demands. NewEvol Data Lake is super easy to deploy and manage. You don’t require a team of expert data administrators. As your company’s data needs grow, it automatically scales and reallocates resources accordingly. NewEvol Data Lake has extensive data ingestion to perform enrichment across multiple sources. It helps you ingest data from multiple formats such as delimited, JSON, XML, PCAP, Syslog, etc. It offers enrichment with the help of a best-of-breed contextually aware event analytics model.
  • 27
    Onehouse

    Onehouse

    The only fully managed cloud data lakehouse designed to ingest from all your data sources in minutes and support all your query engines at scale, for a fraction of the cost. Ingest from databases and event streams at TB-scale in near real-time, with the simplicity of fully managed pipelines. Query your data with any engine, and support all your use cases including BI, real-time analytics, and AI/ML. Cut your costs by 50% or more compared to cloud data warehouses and ETL tools with simple usage-based pricing. Deploy in minutes without engineering overhead with a fully managed, highly optimized cloud service. Unify your data in a single source of truth and eliminate the need to copy data across data warehouses and lakes. Use the right table format for the job, with omnidirectional interoperability between Apache Hudi, Apache Iceberg, and Delta Lake. Quickly configure managed pipelines for database CDC and streaming ingestion.
  • 28
    Talend Data Fabric
    Talend Data Fabric’s suite of cloud services efficiently handles all your integration and integrity challenges, on-premises or in the cloud, any source, any endpoint. Deliver trusted data at the moment you need it, for every user, every time. Ingest and integrate data, applications, files, events and APIs from any source or endpoint to any location, on-premises and in the cloud, easier and faster with an intuitive interface and no coding. Embed quality into data management and guarantee ironclad regulatory compliance with a thoroughly collaborative, pervasive and cohesive approach to data governance. Make the most informed decisions based on high quality, trustworthy data derived from batch and real-time processing and bolstered with market-leading data cleaning and enrichment tools. Get more value from your data by making it available internally and externally. Extensive self-service capabilities make building APIs easy and help improve customer engagement.
  • 29
    BryteFlow

    BryteFlow

    BryteFlow builds the most efficient automated environments for analytics ever. It converts Amazon S3 into an awesome analytics platform by leveraging the AWS ecosystem intelligently to deliver data at lightning speeds. It complements AWS Lake Formation and automates the Modern Data Architecture, providing performance and productivity. You can completely automate data ingestion with BryteFlow Ingest’s simple point-and-click interface, while BryteFlow XL Ingest is great for the initial full ingest of very large datasets. No coding is needed! With BryteFlow Blend you can merge data from varied sources such as Oracle, SQL Server, Salesforce and SAP, and transform it to make it ready for Analytics and Machine Learning. BryteFlow TruData reconciles the data at the destination with the source continually or at a frequency you select. If data is missing or incomplete you get an alert so you can fix the issue easily.
  • 30
    Hadoop

    Apache Software Foundation

    The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. A wide variety of companies and organizations use Hadoop for both research and production. Users are encouraged to add themselves to the Hadoop PoweredBy wiki page. Apache Hadoop 3.3.4 incorporates a number of significant enhancements over the previous major release line (hadoop-3.2).