Faire Slashes Data Pipeline Costs by 70% with Snowflake and Select Star

Faire connects local retailers with boutique brands and wholesalers across the globe, allowing them to easily find the right products for their shop. Over the last five years, Faire's mission and platform have resonated with brands and retailers alike. The marketplace has grown to over 100,000 brands in more than 100 countries and 700,000 retailers in North America and Europe alone.

70% Cost Reduction in Core Data Pipeline Management
80% Decrease in Debugging Hours for Analytics Engineering

Industry: B2B E-commerce
Company size: 1500 Employees
Integrations: Snowflake, Mode, Looker

Challenge

Rapid growth complicated access to self-serve data and impacted data quality

Faire started as a platform to connect boutique brands and small retailers, but they have quickly evolved beyond a wholesale marketplace. They are now a major growth partner for retail brands, managing digital e-commerce storefronts, order fulfillment, customer relationships, and transactions.

But along with their success came difficulties. As the platform expanded from a few brands to 100,000, and from a few retailers to 700,000 globally, the volume of data exploded, complicating machine learning models and data queries.

“Sometime around 2021, our data and workload volume had grown enough that week-to-week changes in peak capacity meant frequent cluster halts that would require downtime. Much of the business data analytics and engineering initiatives like democratizing data access and expanding data capabilities were all blocked,” explained Ben Thompson, Staff Analytics Engineer at Faire.

Faire’s previous data warehouse couldn't keep up with the company's accelerated growth, halting critical business processes. This is where Snowflake stepped in, offering scalable solutions and efficient peak load handling.

After the transition, the improvements were clear: Snowflake not only eradicated downtime for cluster scaling but also reduced the average runtime for BI queries by 75%. 

Now on the new data platform, Faire needed to rethink their ETL design and their “core data” model to truly enable data democratization across the business. 

The data team had a new mission: to accommodate the company's growth, especially as Faire continued to launch new features for its customers. Everyone at Faire needed access to data at any time to make informed decisions, take calculated bets, and quantify value without barriers to access or understanding. To achieve this, the data team needed to balance scalability, self-serve access, and data quality.

“There was friction in getting into the data, and using it for day-to-day work was a big inefficiency for us,” Ben explained. “Often we'd have slightly different permutations of metrics or attributes within our tables and it would take a while for users to learn which one should be used in which scenario. Or worse, they would just pick one at random and often get the wrong value.”

Faire recognized the need for a data catalog that would consolidate knowledge about hundreds of columns and make it discoverable for everyone at the company.

Select Star allows us to engage users early with exactly how each upstream change impacts their downstream workflows. This has built a lot of trust in the team and helps iron out thorny issues quickly.

Ben Thompson
Senior Staff Analytics Engineer at Faire

Solution

Intuitive UI/UX and automatic popularity rankings bring immediate data insights

Faire’s data team evaluated several open-source tools and vendor products. Select Star was chosen for two main reasons:

  1. Ease of use. Select Star’s intuitive user interface makes it easy for everyone at Faire to navigate and make sense of the core data that the data team produces. Because it’s designed for more than just engineers and analysts (Ops, PMs, and business stakeholders can use it too), Faire experienced faster ramp times and higher adoption among technical and non-technical users alike.
  2. Fast time to insights. Select Star ranks everything by popularity automatically, so even tables without explicit documentation give the user some sense of which column is the right one to use in a given situation. This popularity ranking helped Faire distinguish columns used only in outdated dashboards or accidental ad hoc queries from those essential for regular analysis and reporting. It gave Faire a consolidated overview of data knowledge and context, making clear which columns could be deprecated and which needed to be retained.

With Select Star as the single source of truth, Faire managed to harmonize usage across tables.

Taking on the task of synthesizing usage in dozens of tables with hundreds of columns was a very daunting task, but Select Star's Platform made it possible for us.

Using Select Star, the data team was able to build their core data model by answering three questions:

  1. What columns aren’t used? Although tables may have had hundreds of columns, sometimes only a handful were seeing heavy engagement, and others weren’t relied on anywhere. Select Star took it a step further by separating columns that were only referenced by outdated dashboards, or pulled accidentally into ad hoc queries, from those that were genuinely relied on.
The popularity ranking on these columns reveals how frequently they are used (the higher the popularity, the more heavily a column is used). This functionality helped Faire identify which columns are crucial for operations and decision-making.
  2. What columns are often used together? Select Star's column-level lineage allowed Faire to answer this quickly by flipping through columns and checking their common downstream tables and reports. Finding columns that are generally used together, and not with others, helped Faire group them logically into distinct tables instead of one single monolith. Because Select Star annotates these lineage links with the type of operation performed on them, the process is even more efficient.
Select Star’s column-level lineage enabled Faire to see how their data is used at a glance. Currently, no other competitors are providing this level of granularity.
  3. What are common joins that could be consolidated into a core table? Before Select Star, Faire’s core tables had over 100 columns. While most of the focus was on removing columns, it was also important to look at what could be added. Making schemas more approachable improves the user experience, but Faire also wanted to make sure the core tables became more valuable. Select Star aggregates common joins to specific core data tables and points the team toward new columns that might be valuable to include.
Select Star shows the popular joins within Faire’s columns, emphasizing those linked by the same 'retailer ID.' This was key for identifying interconnected columns.
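
To make the first two questions concrete, here is a minimal sketch of how column popularity, unused columns, and co-usage could be derived from a query log. It is an illustration only, not Select Star's implementation or API: the column names, the `QUERY_LOG` format (one set of referenced columns per query), and all function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical schema and query log; neither comes from Faire or Select Star.
TABLE_COLUMNS = ["retailer_id", "brand_id", "order_total", "legacy_flag"]
QUERY_LOG = [
    {"retailer_id", "order_total"},
    {"retailer_id", "order_total", "brand_id"},
    {"retailer_id", "brand_id"},
]


def column_popularity(query_log):
    """Count how many logged queries reference each column."""
    counts = Counter()
    for columns in query_log:
        counts.update(columns)
    return counts


def unused_columns(table_columns, popularity):
    """Return schema columns that no logged query references."""
    return [c for c in table_columns if popularity[c] == 0]


def co_usage(query_log):
    """Count how often each pair of columns appears in the same query."""
    pairs = Counter()
    for columns in query_log:
        # Sort so each pair is always counted under the same key.
        pairs.update(combinations(sorted(columns), 2))
    return pairs


popularity = column_popularity(QUERY_LOG)
print("Popularity:", popularity.most_common())
print("Unused:", unused_columns(TABLE_COLUMNS, popularity))
print("Co-usage:", co_usage(QUERY_LOG).most_common())
```

In this toy log, a column with a count of zero is a deprecation candidate, and pairs with high co-usage counts are candidates for living in the same core table. A real catalog would also weigh dashboards, lineage, and recency, which this sketch ignores.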

By quickly identifying interconnected columns, Faire added new columns to their data tables without having to worry about the usual change management issues when new data models are introduced.

“Select Star helped us answer all of these questions about each table and also about who to talk to,” Ben said.

Result

More self-service data users and empowered internal teams

By incorporating Select Star into their daily workflows, Faire has successfully democratized access to core data, fostering a culture of transparency and informed decision-making. Connecting tables, dashboards, and columns to specific users or teams enabled user-focused insights and facilitated early feedback.

All this led to the following results:

  • A 70% overall reduction in core data pipeline costs. Faire’s data team was able to implement a horizontally scaling architecture, which minimizes complex queries and redundant data handling steps. In addition to more efficient use of their warehouses, they reduced the time and resources required to manage and extract value from the data. This efficiency not only speeds up data access but also significantly cuts down on Faire’s operational costs.
  • An 80% decrease in debugging hours for analytics engineering. Select Star facilitated user engagement from the early stages, providing users with a transparent view of alterations in their workflow. This early and clear engagement enabled the swift resolution of complex issues, fostering trust in the team and optimizing the workflow. “Engaging users early with a clear picture of how their workflow will change built a lot of trust in the team and helped iron out thorny issues quickly,” said Ben.
  • A 20% boost in user engagement. Before Select Star, internal data users were either not leveraging data in their day-to-day work, or they were relying on raw data sources that led to incorrect results. With a single source of truth that they can trust, more team members are able to leverage data in their business and operations work.

“As stewards of the company's data, we will continue to invest in these advanced features of Select Star, including linking tables, not just core ones, to their associated Airflow and SQL files in the code base, making it simple to audit pipelines and understand, as a user, how the data you're using is generated and where it comes from,” explained Ben.

When Faire initially integrated Select Star, the expectation was to primarily use it for documentation and search. However, as the platform became more integrated into their overall processes, it became a vital asset for understanding and navigating Faire's internal data platform.

The journey with Select Star has set Faire on a path of continuous excellence in data management and utilization. Moving forward, Faire plans to continue leveraging Select Star's capabilities to further streamline data management processes. 

Want to learn more about how other companies are leveraging advanced data management tools like Select Star? Talk to us.
