Tokern
Open source data governance suite for databases and data lakes. Tokern is a simple-to-use toolkit to collect, organize, and analyze a data lake's metadata. Run it as a command-line app for quick tasks, or as a service for continuous metadata collection. Analyze lineage, access control, and PII datasets using reporting dashboards or programmatically in Jupyter notebooks. Improve the ROI of your data, comply with regulations like HIPAA, CCPA, and GDPR, and protect critical data from insider threats. Centralized metadata management of users, datasets, and jobs powers the other data governance features. Track column-level data lineage for Snowflake, AWS Redshift, and BigQuery. Build lineage from query history or ETL scripts. Explore lineage using interactive graphs or programmatically using APIs or SDKs.
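To make "build lineage from query history" concrete, here is a minimal sketch in Python. It is not Tokern's API or implementation: the function `build_lineage`, the two regular expressions, and the sample table names are all hypothetical, and the regexes only handle simple `INSERT INTO ... SELECT` and `CREATE TABLE ... AS SELECT` statements. A real tool would use a full SQL parser and resolve lineage down to columns.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately naive patterns (assumption: simple statements only).
TARGET_RE = re.compile(r"\b(?:insert\s+into|create\s+table)\s+([\w.]+)", re.IGNORECASE)
SOURCE_RE = re.compile(r"\b(?:from|join)\s+([\w.]+)", re.IGNORECASE)


def build_lineage(queries):
    """Map each target table to the set of source tables that feed it."""
    lineage = defaultdict(set)
    for sql in queries:
        target = TARGET_RE.search(sql)
        if target is None:
            continue  # a plain SELECT writes to no table, so it adds no edge
        sources = {name.lower() for name in SOURCE_RE.findall(sql)}
        lineage[target.group(1).lower()].update(sources)
    return dict(lineage)


# Illustrative query history; the table names are made up.
query_history = [
    "INSERT INTO analytics.daily_orders "
    "SELECT o.id, c.region FROM raw.orders o JOIN raw.customers c ON o.cust_id = c.id",
    "CREATE TABLE reports.revenue AS "
    "SELECT region, SUM(amount) FROM analytics.daily_orders GROUP BY region",
    "SELECT * FROM reports.revenue",
]

lineage = build_lineage(query_history)
for table, sources in sorted(lineage.items()):
    print(table, "<-", sorted(sources))
```

The result is a table-level lineage map; the same idea extends to column level once each selected expression is traced back to its source columns.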
MANTA
Manta is an automated approach to visualizing, optimizing, and modernizing how data moves through your organization, based on code-level lineage. By automatically scanning your data environment with 50+ out-of-the-box scanners, Manta builds a map of all data pipelines to drive efficiency and productivity. Visit manta.io to learn more.
With the Manta platform, you can make your data a truly enterprise-wide asset, bridge the understanding gap, enable self-service, and easily:
• Increase productivity
• Accelerate development
• Shorten time-to-market
• Reduce costs and manual effort
• Run instant and accurate root cause and impact analyses
• Scope and perform effective cloud migrations
• Improve data governance and regulatory compliance (GDPR, CCPA, HIPAA, and more)
• Increase data quality
• Enhance data privacy and data security
Manta
Manta is an automated data lineage platform that helps organizations record, track, visualize, and optimize how data flows from its origin through transformation to consumption across their entire data environment, delivering full visibility and control of data pipelines that manual methods can’t match. It automatically scans metadata, SQL code, ETL workflows, BI/report definitions, and other data sources with support for dozens of technologies to build detailed, end-to-end lineage maps showing where data comes from, how it’s transformed, and where it’s used, enabling accurate impact analysis, root-cause tracing, and error detection. It provides rich visualizations with dynamic filtering, granular lineage at table and column levels, and APIs for integration with metadata catalogs, CI/CD workflows, and governance systems, reducing manual effort and accelerating DataOps, migrations, compliance, and governance initiatives.
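Impact analysis and root-cause tracing over a lineage map can be pictured as graph traversal. The sketch below is an illustration only, not Manta's API: the functions `downstream` and `upstream`, the edge-list representation, and the column names are assumptions made for the example. Each edge reads "source column feeds target column".

```python
from collections import deque


def downstream(edges, start):
    """Impact analysis: every node reachable from `start` by following edges."""
    children = {}
    for src, dst in edges:
        children.setdefault(src, []).append(dst)
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in children.get(node, []):
            if nxt not in seen:  # guard against revisiting nodes (and cycles)
                seen.add(nxt)
                queue.append(nxt)
    return seen


def upstream(edges, start):
    """Root-cause tracing: the same walk over reversed edges."""
    return downstream([(dst, src) for src, dst in edges], start)


# Hypothetical column-level lineage edges.
edges = [
    ("raw.orders.amount", "analytics.daily_orders.amount"),
    ("analytics.daily_orders.amount", "reports.revenue.total"),
    ("raw.customers.region", "analytics.daily_orders.region"),
    ("analytics.daily_orders.region", "reports.revenue.region"),
]

impacted = downstream(edges, "raw.orders.amount")
causes = upstream(edges, "reports.revenue.region")
print(sorted(impacted))
print(sorted(causes))
```

Breadth-first search is used here for simplicity; any traversal that visits each reachable node once gives the same sets.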
DataHawk
Visualize data lineage by automatically extracting data flow from source to target. DataHawk is a data lineage management solution that automatically collects and analyzes the lineage of mission-critical data, visualizing data flows and derivation rules from source to target. Data lineage is the flow of data from source to target; tracking it means understanding which flows and derivation rules were applied as the data was processed, transformed, and used. Multi-tier, column-level data lineage graphs and lists from source to target. Drill down through data lineage at the business system, table, and column levels. Parsers for analyzing various environments, including Big Data technologies. Path-sensitive dynamic string analysis and data flow analysis inside programs with our patented technology.