Compare the Top Data Lineage Tools in Brazil as of September 2024 - Page 2

  • 1
    Datakin

    Datakin

    Datakin

    Instantly reveal the order hidden within your complex data world, and always know exactly where to look for answers. Datakin automatically traces data lineage, showing your entire data ecosystem in a rich visual graph. It clearly illustrates the upstream and downstream relationships for each dataset. The Duration tab summarizes a job’s performance in a Gantt-style chart along with its upstream dependencies, making it easy to find bottlenecks. When you need to pinpoint the exact moment of a breaking change, the Compare tab shows how your jobs and datasets have changed between runs. Sometimes jobs that run successfully produce bad output. The Quality tab surfaces critical data quality metrics, showing how they change over time so anomalies become obvious. Datakin helps you find the root cause of issues quickly – and prevent new ones from occurring.
    Starting Price: $2 per month
  • 2
    Blindata

    Blindata

    Blindata

    Blindata covers all the functions of a Data Governance program: Business Glossary, Data Catalog & Data Lineage build an integrated and complete view on your Data. Data Classification module gives a semantic meaning to the data while the Data Quality, Issue Management & Data Stewardship modules improve the reliability and trust on data. Moreover, privacy compliance can leverage specific features: registry of processing activities, centralized privacy note management, consent registry with Blockchain integrated notarization. Blindata Agent can connect to different data sources, collecting metadata such data structures (Tables, Views, Fields, …), data quality metrics, reverse lineage, etc. Blindata has a modular and entirely API based architecture allowing systematic integration with the most critical business systems (DBMS, Active Directory, e-commerce, Data Platforms). Blindata is available as SaaS, can be installed “on Premise” or purchased on AWS Marketplace.
    Starting Price: $2000/year/user
  • 3
    Montara

    Montara

    Montara

    Montara enables BI teams and data analysts, using SQL only, to model and transform their data easily and seamlessly and enjoy benefits such as modular code, CI/CD, versioning and automated testing and documentation. With Montara, analysts can quickly understand how changes to models impact analysis, reports and dashboards with report-level lineage and support for 3rd party visualization tools such as Tableau and Looker. Furthermore, BI teams can perform ad-hoc analysis and create reports and dashboards on Montara directly.
    Starting Price: $100/user/month
  • 4
    Foundational

    Foundational

    Foundational

    Identify code and optimization issues in real-time, prevent data incidents pre-deploy, and govern data-impacting code changes end to end—from the operational database to the user-facing dashboard. Automated, column-level data lineage, from the operational database all the way to the reporting layer, ensures every dependency is analyzed. Foundational automates data contract enforcement by analyzing every repository from upstream to downstream, directly from source code. Use Foundational to proactively identify code and data issues, find and prevent issues, and create controls and guardrails. Foundational can be set up in minutes with no code changes required.
  • 5
    Collibra

    Collibra

    Collibra

    With a best-in-class catalog, flexible governance, continuous quality, and built-in privacy, the Collibra Data Intelligence Cloud is your single system of engagement for data. Support your users with a best-in-class data catalog that includes embedded governance, privacy and quality. Raise the grade, by ensuring teams can quickly find, understand and access data across sources, business applications, BI and data science tools in one central location. Give your data some much-needed privacy. Centralize, automate and guide workflows to encourage collaboration, operationalize privacy and address global regulatory requirements. Get the full story around your data with Collibra Data Lineage. Automatically map relationships between systems, applications and reports to provide a context-rich view across the enterprise. Hone in on the data you care about most and trust that it is relevant, complete and trustworthy.
  • 6
    Trifacta

    Trifacta

    Trifacta

    The fastest way to prep data and build data pipelines in the cloud. Trifacta provides visual and intelligent guidance to accelerate data preparation so you can get to insights faster. Poor data quality can sink any analytics project. Trifacta helps you understand your data so you can quickly and accurately clean it up. All the power with none of the code. Trifacta provides visual and intelligent guidance so you can get to insights faster. Manual, repetitive data preparation processes don’t scale. Trifacta helps you build, deploy and manage self-service data pipelines in minutes not months.
  • 7
    IBM Databand
    Monitor your data health and pipeline performance. Gain unified visibility for pipelines running on cloud-native tools like Apache Airflow, Apache Spark, Snowflake, BigQuery, and Kubernetes. An observability platform purpose built for Data Engineers. Data engineering is only getting more challenging as demands from business stakeholders grow. Databand can help you catch up. More pipelines, more complexity. Data engineers are working with more complex infrastructure than ever and pushing higher speeds of release. It’s harder to understand why a process has failed, why it’s running late, and how changes affect the quality of data outputs. Data consumers are frustrated with inconsistent results, model performance, and delays in data delivery. Not knowing exactly what data is being delivered, or precisely where failures are coming from, leads to persistent lack of trust. Pipeline logs, errors, and data quality metrics are captured and stored in independent, isolated systems.
  • 8
    IBM DataStage
    Accelerate AI innovation with cloud-native data integration on IBM Cloud Pak for data. AI-powered data integration, anywhere. Your AI and analytics are only as good as the data that fuels them. With a modern container-based architecture, IBM® DataStage® for IBM Cloud Pak® for Data delivers that high-quality data. It combines industry-leading data integration with DataOps, governance and analytics on a single data and AI platform. Automation accelerates administrative tasks to help reduce TCO. AI-based design accelerators and out-of-the-box integration with DataOps and data science services speed AI innovation. Parallelism and multicloud integration let you deliver trusted data at scale across hybrid or multicloud environments. Manage the data and analytics lifecycle on the IBM Cloud Pak for Data platform. Services include data science, event messaging, data virtualization and data warehousing. Parallel engine and automated load balancing.
  • 9
    ASG Data Intelligence

    ASG Data Intelligence

    ASG Technologies

    The demand for data-driven insights and innovation has never been higher. Maintaining a competitive edge in today’s global enterprises hinges on the ability to leverage trusted data to make informed and strategic business decisions. Unfortunately, while most companies collect a wealth of data, it too often remains unused because business leaders either can’t find it or simply don’t understand and trust it. ASG Data Intelligence (ASG DI) is the solution for data distrust. It is a metadata-driven platform that makes technical data “smarter” with end-to-end views of the data and its movements (data lineage) combined with business meaning and usage guardrails. Data value is unleashed by making it available, understood and trusted to users across your organization – data scientists, analysts, marketers and more. Build trust in data with improved understanding of its origins, processes and business context.
  • 10
    Kylo

    Kylo

    Teradata

    Kylo is an open source enterprise-ready data lake management software platform for self-service data ingest and data preparation with integrated metadata management, governance, security and best practices inspired by Think Big's 150+ big data implementation projects. Self-service data ingest with data cleansing, validation, and automatic profiling. Wrangle data with visual sql and an interactive transform through a simple user interface. Search and explore data and metadata, view lineage, and profile statistics. Monitor health of feeds and services in the data lake. Track SLAs and troubleshoot performance. Design batch or streaming pipeline templates in Apache NiFi and register with Kylo to enable user self-service. Organizations can expend significant engineering effort moving data into Hadoop yet struggle to maintain governance and data quality. Kylo dramatically simplifies data ingest by shifting ingest to data owners through a simple guided UI.
  • 11
    Tokern

    Tokern

    Tokern

    Open source data governance suite for databases and data lakes. Tokern is a simple to use toolkit to collect, organize and analyze data lake's metadata. Run as a command-line app for quick tasks. Run as a service for continuous collection of metadata. Analyze lineage, access control and PII datasets using reporting dashboards or programmatically in Jupyter notebooks. Tokern is an open source data governance suite for databases and data lakes. Improve ROI of your data, comply with regulations like HIPAA, CCPA and GDPR and protect critical data from insider threats with confidence. Centralized metadata management of users, datasets and jobs. Powers other data governance features. Track Column Level Data Lineage for Snowflake, AWS Redshift and BigQuery. Build lineage from query history or ETL scripts. Explore lineage using interactive graphs or programmatically using APIs or SDKs.
  • 12
    Apache Atlas

    Apache Atlas

    Apache Software Foundation

    Atlas is a scalable and extensible set of core foundational governance services – enabling enterprises to effectively and efficiently meet their compliance requirements within Hadoop and allows integration with the whole enterprise data ecosystem. Apache Atlas provides open metadata management and governance capabilities for organizations to build a catalog of their data assets, classify and govern these assets and provide collaboration capabilities around these data assets for data scientists, analysts and the data governance team. Pre-defined types for various Hadoop and non-Hadoop metadata. Ability to define new types for the metadata to be managed. Types can have primitive attributes, complex attributes, object references; can inherit from other types. Instances of types, called entities, capture metadata object details and their relationships. REST APIs to work with types and instances allow easier integration.
  • 13
    Truedat

    Truedat

    Bluetab Solutions

    Truedat is an open source data governance business solution tool developed by Bluetab Solutions in order to help our clients become data-driven companies. We help to define business processes, roles & responsibilities. We also help putting processes into practice. Integration and customization of truedat´s open source components to support the data governance processes. We guarantee the support and maintenance of the process & software of our solution modules installed by us. Based on our experience in, we have developed a solution that covers the need for Data Governance, allowing to manage and control highly complex and changing data architectures. The highly increasing migration of enterprise IT platforms to cloud, multi-cloud and hybrid architectures, increases the sources, complexity and types of data and therefore rises the need for truedat. Our solution comes from more than 8 years of experience in Data Governance consulting and development projects.
  • 14
    Talend Data Catalog
    Talend Data Catalog gives your organization a single, secure point of control for your data. With robust tools for search and discovery, and connectors to extract metadata from virtually any data source, Data Catalog makes it easy to protect your data, govern your analytics, manage data pipelines, and accelerate your ETL processes. Data Catalog automatically crawls, profiles, organizes, links, and enriches all your metadata. Up to 80% of the information associated with the data is documented automatically and kept up-to-date through smart relationships and machine learning, continually delivering the most current data to the user. Make data governance a team sport with a secure single point of control where you can collaborate to improve data accessibility, accuracy, and business relevance. Support data privacy and regulatory compliance with intelligent data lineage tracing and compliance tracking.
  • 15
    Data360 Govern

    Data360 Govern

    Precisely

    Your organization knows the value of data and the need to get it into the hands of business users for maximum impact, but without enterprise data governance, that data might be hard to find, understand, and trust. Data360 Govern is an enterprise data governance, catalog, and metadata management solution that gives you confidence in the quality, value, and trustworthiness of your data. It automates governance and stewardship tasks to help you answer essential questions about your data’s source, use, meaning, ownership, and quality. With Data360 Govern, you can make faster decisions on data usage and management, build collaboration across your entire organization, and allow users to get the answers they need – when they need them. Transparency into your organization’s data landscape gives you the power to track the critical data aligned with your most important business outcomes.
  • 16
    Global IDs

    Global IDs

    Global IDs

    Find out some of the best platform features provided by Global IDs, which brings in a set of Enterprise Data Solutions like data governance, data compliance, cloud migration, rationalization, privacy, analytics & much more! Global IDs EDA Platform feature comprises a set of core functions: automated discovery and profiling, data classification, data lineage, data quality and more – that render data transparent, trustworthy, and explainable across the ecosystem. Global IDs EDA platform architecture is designed for integration from the ground up with all platform functionality accessible via APIs. Global IDs EDA platform automates data management for enterprises of any size or data ecosystem.
  • 17
    DataHawk

    DataHawk

    We-Bridge

    Visualize data lineage by automatically extracting data flow from data source to target. A data lineage management solution that automatically collects and analyzes data lineage of mission-critical data, visualizing data flow and derivation rule from data source to target. Data Lineage is the flow of data from the source to the target. Tracking Data Lineage means understanding what flow and derivation rules the data processed, transformed and used. Multi-tier column level data lineage graph and list from source to target. Drill down data lineage – business system, table and column level. Provide parsers for various environment analysis and support analysis of Big Data technologies. Path sensitive dynamic string analysis and data flow analysis inside programs with our patented technology.
  • 18
    Sifflet

    Sifflet

    Sifflet

    Automatically cover thousands of tables with ML-based anomaly detection and 50+ custom metrics. Comprehensive data and metadata monitoring. Exhaustive mapping of all dependencies between assets, from ingestion to BI. Enhanced productivity and collaboration between data engineers and data consumers. Sifflet seamlessly integrates into your data sources and preferred tools and can run on AWS, Google Cloud Platform, and Microsoft Azure. Keep an eye on the health of your data and alert the team when quality criteria aren’t met. Set up in a few clicks the fundamental coverage of all your tables. Configure the frequency of runs, their criticality, and even customized notifications at the same time. Leverage ML-based rules to detect any anomaly in your data. No need for an initial configuration. A unique model for each rule learns from historical data and from user feedback. Complement the automated rules with a library of 50+ templates that can be applied to any asset.
  • 19
    Aggua

    Aggua

    Aggua

    Aggua is a data fabric augmented AI platform that enables data and business teams Access to their data, creating Trust and giving practical Data Insights, for a more holistic, data-centric decision-making. Instead of wondering what is going on underneath the hood of your organization's data stack, become immediately informed with a few clicks. Get access to data cost insights, data lineage and documentation without needing to take time out of your data engineer's workday. Instead of spending a lot of time tracing what a data type change will break in your data pipelines, tables and infrastructure, with automated lineage, your data architects and engineers can spend less time manually going through logs and DAGs and more time actually making the changes to infrastructure.
  • 20
    DataGalaxy

    DataGalaxy

    DataGalaxy

    DataGalaxy’s all-in-one data catalog offers out-of-the-box actionability with fully-customizable attributes, visualization tools, and AI integration to give business teams the ability to document, link, and track all their metadata assets. The Data Catalog 360°’s user-centric platform is dedicated to metadata mapping, management, and knowledge sharing to help your organization manage data your way. A data catalog enables employees from all teams to collaborate using centralized, homogeneous data sets. Our data catalog provides clarity on data definitions, synonyms, and essential business attributes with a semantic layer so all users can understand and leverage their data as an asset. When you need answers about specific metadata, turn to the data catalog that identifies a topic’s 360° data experts, owners, and stewards empowering your team through streamlined collaboration.
  • 21
    Validio

    Validio

    Validio

    See how your data assets are used: popularity, utilization, and schema coverage. Get important insights about your data assets such as popularity, utilization, quality, and schema coverage. Find and filter the data you need based on metadata tags and descriptions. Get important insights about your data assets such as popularity, utilization, quality, and schema coverage. Drive data governance and ownership across your organization. Stream-lake-warehouse lineage to facilitate data ownership and collaboration. Automatically generated field-level lineage map to understand the entire data ecosystem. Anomaly detection learns from your data and seasonality patterns, with automatic backfill from historical data. Machine learning-based thresholds are trained per data segment, trained on actual data instead of metadata only.
  • 22
    IBM Manta Data Lineage
    IBM Manta Data Lineage is a data lineage platform that increases data pipeline transparency so businesses can determine data accuracy throughout their models and systems. As businesses integrate AI into their workflows and data becomes more complex, data quality, provenance, and lineage are increasingly important. In fact, IBM’s 2023 CEO study found the number one barrier to generative AI adoption is concerns about the lineage of data.  IBM offers an automated data lineage platform that automatically scans your applications to build a powerful map of all data flows. The platform then delivers the info through a native user interface (UI) and other channels to both technical and nontechnical users. With IBM Manta Data Lineage, data operations teams get comprehensive visibility and control of their data pipeline. By improving your understanding and use of dynamic metadata, you can ensure that data is managed efficiently and accurately across complex systems.
  • 23
    Datalogz

    Datalogz

    Datalogz

    Data knowledge management platform that enables teams to streamline data discovery and understanding with the ultimate goal of being able to trust their data. Prevent misreporting analytics and costly mistakes today!
  • 24
    Dremio

    Dremio

    Dremio

    Dremio delivers lightning-fast queries and a self-service semantic layer directly on your data lake storage. No moving data to proprietary data warehouses, no cubes, no aggregation tables or extracts. Just flexibility and control for data architects, and self-service for data consumers. Dremio technologies like Data Reflections, Columnar Cloud Cache (C3) and Predictive Pipelining work alongside Apache Arrow to make queries on your data lake storage very, very fast. An abstraction layer enables IT to apply security and business meaning, while enabling analysts and data scientists to explore data and derive new virtual datasets. Dremio’s semantic layer is an integrated, searchable catalog that indexes all of your metadata, so business users can easily make sense of your data. Virtual datasets and spaces make up the semantic layer, and are all indexed and searchable.