
Column-Level Data Lineage in Action: 5 Real-World Examples

Nicole Mitich
November 17, 2023


From agriculture to SaaS, companies in all industries are already capitalizing on column-level lineage to inform decision-making and drive innovation. Whether they're detecting the root causes of data quality issues or streamlining compliance processes, the applications are as varied as they are impactful. 

Here’s how organizations are leveraging column-level lineage to address their unique challenges, transform their data ecosystems, and achieve their goals.

Bowery Farming Makes Data Accessible and Manageable at Scale 

Bowery Farming, a pioneer in indoor agriculture, encountered scalability issues as their operations grew to serve over 1,000 retailers throughout the Northeast and Mid-Atlantic. With a data science team of 11 managing a warehouse of 900 tables with billions of data points, they faced challenges in data visibility and dashboard reliability, which led to frequent operational disruptions and an increasing number of support tickets.

Automating the detection of column-level lineage transformed Bowery Farming's data management. Now, they can ascertain dependencies between data tables and dashboards quickly, clearly, and comprehensively. When they need to fix broken dashboards—which happens much less frequently now—they spend 1-2 minutes looking at the lineage instead of 30 minutes to an hour reading through the SQL queries and dbt models underneath. As a result, the data team significantly cut down on support tickets.

Supplying the business stakeholders with the exact numbers they need when they need them — and keeping that pipeline flowing without fail — helped Bowery Farming turn their ambitious agricultural concept into a sustainable and scalable business. With the upcoming expansion to two more facilities, column-level lineage ensures that Bowery Farming's growth remains on track with streamlined operations and trusted data.

🔑 Read the Case Study

Xometry Saves Millions of Dollars from Data Inaccuracies

Xometry, a global connector of manufacturing suppliers and partners, uses an AI-based algorithm for calculating manufacturing runs, prices, and lead times. 

Although the algorithm speeds up their operations, it is also complex: as the data moves through Xometry’s systems, it has to be matched, cleansed, transformed, and verified.


Lacking context on how their data moved through their systems, they experienced data outages, long decision-making times, and increased human error. These inaccuracies were costing Xometry millions each month, and they needed visibility into their pipeline process.

After evaluating every column-level lineage tool on the market, Xometry adopted Select Star to gain visibility into how changes in upstream objects could affect downstream processes. Column-level lineage provided clear visibility into how data was derived and transformed across different tables. This context allowed them to track data flow end-to-end, pinpointing the root cause of issues and significantly reducing data outages. It also allowed Xometry to proactively manage changes by setting up error messages that link directly to Confluence pages which detail the implications of each change.
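To make the idea of upstream changes affecting downstream processes concrete, here is a minimal sketch of impact analysis over a column-level lineage graph. It is not Xometry's or Select Star's implementation; the `LINEAGE` mapping, the column names, and the `downstream_columns` helper are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical column-level lineage: each key is an upstream column and each
# value lists the columns or dashboard fields derived directly from it.
LINEAGE = {
    "raw.orders.unit_price": ["core.quotes.price"],
    "raw.orders.quantity": ["core.quotes.price", "core.quotes.lead_time_days"],
    "core.quotes.price": ["dash.revenue.total_quoted"],
    "core.quotes.lead_time_days": ["dash.ops.avg_lead_time"],
}


def downstream_columns(column, lineage):
    """Return every column reachable from `column`, in breadth-first order."""
    visited = {column}
    ordered = []
    queue = deque([column])
    while queue:
        current = queue.popleft()
        for child in lineage.get(current, []):
            if child not in visited:
                visited.add(child)
                ordered.append(child)
                queue.append(child)
    return ordered


# Everything that could be affected by a change to raw.orders.quantity.
impacted = downstream_columns("raw.orders.quantity", LINEAGE)
print(impacted)
```

Running the same traversal on a reversed mapping would answer the opposite question: which upstream columns feed a broken downstream field.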

As a result, Xometry's data engineering team saved over 200 hours annually by reducing the number of outages in the first place and debugging the issues that did happen 36x faster. This efficiency gain improved internal operations and enhanced the accuracy of customer-facing information, reinforcing stakeholder trust in Xometry's data and services.

🦸 Read the Case Study

AlphaSense Transforms their BigQuery Data Warehouse

AlphaSense, an AI-driven insights provider, faced challenges typical of high-growth companies with their core data model. As they scaled rapidly, the data team prioritized quick solutions, leading to a fragmented and inefficient data warehouse. This approach soon increased maintenance time, hindered efficiency, and delayed the delivery of accurate insights. The data team was overwhelmed with managing thousands of tables and dashboards and struggled to maintain data quality and governance.

In response, AlphaSense onboarded Select Star for its holistic data governance with column-level lineage and automated data documentation capabilities. Because of column-level lineage, AlphaSense gained much-needed visibility into which tables were a part of their critical data pipelines, such as revenue reporting. This provided clear insights into their data's journey and enabled effective impact analysis and change management.

This clarity led to rapid insights, improved downstream visibility, a 43% reduction in data tables, and a 66% decluttering of dashboards. They completed a significant data governance project, focusing on the most relevant datasets and the underlying tables that were identified via column-level lineage, and saved over 100 hours in the process. This efficiency enabled self-serve access to critical data, reducing reliance on the engineering team and empowering other business teams with accurate and reliable data.

🪄 Read the Case Study

Block Automates Data at Exabyte Scale

Block, a global finance company, has to manage and secure billions of data points while complying with stringent regulations like GDPR, CCPA, and PCI DSS. Their data ecosystem—including dbt, Fivetran, Amazon S3, Snowflake, and Looker—was pushed to its limits with over 6,600 monthly active users and millions of Snowflake database tables. 

This decentralized, organic operating model led to inconsistencies and fragmented communication, which made it challenging to ensure data accuracy and manage sensitive information securely. 


To resolve these challenges, Block opted for Select Star’s column-level lineage to manage dependencies and handle sensitive data. Since its adoption, Block has centralized their metadata, gaining clarity on data models and enabling efficient impact analysis using lineage. This helped identify risks associated with the spread of sensitive data, particularly PII columns, and streamlined issue debugging and schema updates.
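One way to picture how lineage surfaces the spread of sensitive data is to propagate a PII tag along lineage edges. The sketch below is an illustration under stated assumptions, not Block's tooling or Select Star's API; the `EDGES` list, the column names, and the `propagate_pii` helper are hypothetical.

```python
# Hypothetical lineage edges: (upstream column, downstream column).
EDGES = [
    ("raw.customers.email", "core.users.email"),
    ("core.users.email", "marts.marketing.contact_email"),
    ("raw.customers.signup_date", "core.users.signup_date"),
    ("core.users.signup_date", "marts.marketing.cohort_month"),
]


def propagate_pii(tagged, edges):
    """Return all columns that are tagged PII or derive from a tagged column."""
    flagged = set(tagged)
    changed = True
    # Repeat until a full pass over the edges adds nothing new.
    while changed:
        changed = False
        for upstream, downstream in edges:
            if upstream in flagged and downstream not in flagged:
                flagged.add(downstream)
                changed = True
    return flagged


# Start from one column known to hold PII and find where it spreads.
pii_columns = propagate_pii({"raw.customers.email"}, EDGES)
print(sorted(pii_columns))
```

This approach is deliberately conservative: it flags every derived column, even ones where a transformation such as aggregation may have removed the sensitive value, so the result is a review list rather than a final classification.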

The implementation has saved countless hours for Block's teams, automating access control management and reducing manual efforts in tracking dependencies. They developed Bellhop, an internal tool integrated with Select Star's API, to automate security requests and manage access permissions. With over 11,000 dashboards and 25,000 Looks used by 6,600+ users, column-level lineage bridges the gap between Looker and Snowflake, providing a clear view of table usage so the Block team can deprecate redundancies. 

🚀 Read the Case Study

Faire Slashes Data Pipeline Costs by 70%

Faire started as a platform to connect boutique brands and small retailers, but they have quickly evolved beyond a wholesale marketplace. This growth, from a few brands and retailers to over 100,000 brands and over 700,000 retailers globally, complicated their machine learning models and data queries. Their growing data volume caused frequent cluster halts, which led to downtime and blocked critical business analytics and engineering initiatives. They needed to redesign their ETL and core data model to enable data democratization across the business.

To address these challenges, Faire chose Select Star’s column-level lineage. With automatic popularity rankings, the data engineering team could easily distinguish essential columns for regular analysis from those used in outdated or ad hoc queries. 

The popularity ranking for each column reflects how frequently it is used; the higher the ranking, the more important the column.


Using Select Star's automated popularity ranking, Faire identified underused columns, logically grouped columns commonly used together, and consolidated common joins into core tables. This resulted in a 70% reduction in core data pipeline costs, an 80% decrease in debugging hours for analytics engineering, and a 20% increase in user engagement. Column-level lineage has been pivotal for understanding and navigating Faire’s data platform, and has, in turn, allowed them to streamline all their data management processes.
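As a rough illustration of how usage-based ranking can separate essential columns from underused ones, the sketch below counts how many queries reference each column. It assumes a simple per-query count, which is not necessarily how Select Star computes popularity; the `QUERY_LOG` data and the `popularity` and `underused` helpers are hypothetical.

```python
from collections import Counter

# Hypothetical query log: each entry lists the columns one query referenced.
QUERY_LOG = [
    ["orders.id", "orders.total", "retailers.id"],
    ["orders.id", "orders.total"],
    ["orders.id", "retailers.id", "retailers.region"],
    ["orders.legacy_code"],
]


def popularity(query_log):
    """Count how many queries reference each column."""
    counts = Counter()
    for columns in query_log:
        counts.update(set(columns))  # count a column at most once per query
    return counts


def underused(counts, threshold):
    """Return columns referenced by fewer than `threshold` queries, sorted by name."""
    return sorted(col for col, n in counts.items() if n < threshold)


counts = popularity(QUERY_LOG)
candidates = underused(counts, threshold=2)
print(counts.most_common())
print(candidates)
```

Columns that fall below the threshold become candidates for review or deprecation, while columns that frequently appear together in the same queries are natural candidates for consolidation into a core table.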

💰 Read the Case Study

Column-level Lineage Changes How Companies Operate

Column-level lineage transforms how modern companies manage, understand, and utilize their data. As seen in the real-world examples of Bowery Farming, Xometry, AlphaSense, Block, and Faire, implementing column-level lineage has streamlined their data management, improved their operational efficiency, and fostered better decision-making and innovation. 

Companies in all industries are leveraging column-level lineage to enhance their data quality—even as they scale. Tracing data from its source to its destination with precision ensures that all organizations can adapt to their growing data needs and maintain a competitive edge in their respective industries. In essence, column-level lineage is not just a tool for data teams; it's a strategic asset that empowers entire organizations to operate more effectively and confidently in the data-driven world.

Want to see if Select Star is the right column-level lineage tool for your organization? Book a demo with one of our experts.
