Since 2013, Xometry (NYSE: XMTR) has helped engineers and product teams meet their custom manufacturing needs with an industrial marketplace supporting products from aerospace to consumer goods. Artificial intelligence enables the 5,000+ Xometry manufacturing partners to connect with customers around the world, making better matches and determining optimal pricing.
Data Outages and Losing Trust in Data
In order to connect manufacturing suppliers and partners across the globe, Xometry built an AI-based algorithm to show potential manufacturing runs with prices and lead times.
However, the data required to make the AI pricing platform work tends to be unwieldy and chaotic: it takes in the manufacturer's capabilities, previous work, evolving customer reviews, preferences, locations, lead times, and more. Also, whenever that data moves between systems, it must be matched, cleansed, transformed, and verified.
“We use our data to predict seller and buyer behavior accurately. The first part of the problem was sorting out where each dataset was coming from. If we don't know where the data is coming from, we can't use it in our prediction. This is why we started looking for data lineage solutions,” says Jisan.
But when Jisan and the team started evaluating data lineage tools, they realized that there wasn’t a fully automated column-level lineage solution on the market. “There are many data modeling tools for table-level lineage, but the truly challenging part of data is in column-level lineage, where it discerns where that data is coming from, what it means, and what it represents.”
At this point, the data team was starting to experience data outages related to reporting, revealing cracks in their data pipeline. The business was now pressing for accurate data to be available when needed. Data outages were a major issue because they prevented informed business decisions, lengthened turnaround times, and increased human error.
“In our applications, data accuracy is the most important thing. We estimate that data inaccuracies cost Xometry millions of dollars every month,” says Jisan.
“We chose Select Star because it automatically detects and displays column-level data lineage, so it’s easy to see where data comes from and flag issues in real time.”
Jisan, Sr Data Engineer at Xometry
Column-level Data Lineage Integrated into CI Workflow
Looking for a solution that could identify issues quickly and protect their data pipeline, Xometry decided their most important need was to understand the impact of column changes across tables. “In order to eliminate data outages, we needed to track our end-to-end data flow back to the root cause of the issue - where the data is being captured or transformed,” says Jisan.
After a year of trying out every data lineage tool on the market, Xometry chose Select Star.
“We chose Select Star because it automatically detects and displays column-level data lineage, so it’s easy to see where data comes from and flag issues in real time,” says Jisan.
“Select Star eliminated the data quality issues we had before and brought transparency in our data. Our engineers can now understand their impacts on downstream data easily since Select Star highlights and reports on differences right away.”
Select Star gave Xometry instant access to their data flows and showed where issues threatened the data pipelines. “Before Select Star, we would dig for answers. Now, we are just able to see that a column, say column X in our data, was derived from column F in another table,” says Jisan. “And that column F had been derived from columns A and B somewhere else. This eliminates all the time spent searching for where the problem is coming from, and locating the root cause of it.”
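The tracing Jisan describes amounts to walking a lineage graph upstream from the affected column. The following is a minimal sketch of that idea in Python, assuming a hypothetical in-memory mapping from each column to the columns it was derived from. The `LINEAGE` dictionary, the table names, and both helper functions are illustrative assumptions; they are not Select Star's API or Xometry's schema.

```python
from collections import deque

# Hypothetical lineage edges: each column maps to the columns it is derived from.
LINEAGE = {
    "report.x": ["staging.f"],
    "staging.f": ["raw.a", "raw.b"],
}

def upstream_columns(column, lineage):
    """Return every column that `column` depends on, nearest first (breadth-first)."""
    seen = []
    queue = deque(lineage.get(column, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; also guards against cycles
        seen.append(current)
        queue.extend(lineage.get(current, []))
    return seen

def root_sources(column, lineage):
    """Return the upstream columns that have no recorded parents."""
    return [c for c in upstream_columns(column, lineage) if not lineage.get(c)]

print(upstream_columns("report.x", LINEAGE))
print(root_sources("report.x", LINEAGE))
```

With the sample edges, the walk from `report.x` passes through `staging.f` and ends at `raw.a` and `raw.b`, mirroring the X, F, A and B chain in the quote.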
Furthermore, Jisan and his team have integrated Select Star’s column-level lineage API into their CI workflow, which put an end to their data outages. “We still have updates every week on our data pipeline, but nobody has run into data outage issues anymore, which is pretty impressive,” says Jisan. With no incidents of erroneous reporting, Jisan and his team have raised the level of data trust at Xometry.
Building Trust in Data with Select Star
“Since we integrated Select Star in our CI pipeline, data quality issues are visible to the data producers, and they’re handled long before they get into production. That saves time, since we never need to hunt for where the data issues are coming from,” says Jisan. “More importantly, our stakeholders can trust the numbers are correct and our customers benefit from more accurate quotes.”
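The source does not describe how Xometry's CI check is implemented. As a rough illustration of the general idea, the sketch below takes the columns touched by a change, follows lineage edges downstream, and reports any reporting columns that would be affected. The `DOWNSTREAM` mapping, the `report.` prefix convention, and the function names are all hypothetical; a real setup would fetch lineage from the lineage tool rather than a hard-coded dictionary.

```python
# Hypothetical downstream edges: each column maps to the columns derived from it.
DOWNSTREAM = {
    "raw.a": ["staging.f"],
    "raw.b": ["staging.f"],
    "staging.f": ["report.x"],
}

def impacted_columns(changed, downstream):
    """Return every column reachable downstream of the changed columns, sorted."""
    impacted = set()
    stack = list(changed)
    while stack:
        current = stack.pop()
        for child in downstream.get(current, []):
            if child not in impacted:
                impacted.add(child)
                stack.append(child)
    return sorted(impacted)

def check_change(changed, downstream, protected_prefix="report."):
    """Return (exit_code, affected protected columns) for use in a CI step."""
    hits = [c for c in impacted_columns(changed, downstream)
            if c.startswith(protected_prefix)]
    return (1 if hits else 0), hits

code, hits = check_change(["raw.a"], DOWNSTREAM)
for column in hits:
    print(f"change affects reporting column: {column}")
print(f"suggested exit code: {code}")
```

A CI job could pass the returned code to its exit status so that a change touching a reporting column is flagged to its author before merge.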
Xometry’s data engineering team has saved over 200 hours this year (over 30 hours per month) by using Select Star, allowing the team to channel their time into higher-value tasks. In the past, they would have spent those hours tracking down the origins of data downtime.
“Because of the time we’ve saved, we’re being more proactive about our data needs instead of being reactive to internal requests,” says Jisan.
Want to learn more about how other companies improve their data quality and reduce data outages? Talk to us.