AlphaSense Transforms their BigQuery Data Warehouse with Select Star

AlphaSense is a market intelligence and search platform used by the world’s leading companies and financial institutions. Today, AlphaSense has over 4,000 enterprise customers – including more than 85% of the S&P 100.

6,000+ unused tables deprecated
66% of dashboards decluttered
Industry: Software Development
Company size: 1000+ employees

Challenge

Adapting a core data model to hyper-growth demands

AlphaSense leverages proprietary AI and language models to help professionals extract relevant insights from an extensive universe of public and private content, including equity research, earnings calls, company filings, news, trade journals, and expert interviews. Like many high-growth companies, AlphaSense faced some common challenges central to their data model.

To support a large and fast-growing base of company stakeholders, the team often had to prioritize agility and immediate results, resorting to shortcuts and temporary solutions (for example, responding to requests with more and more dashboards and tables).

While these ad-hoc solutions solved problems quickly in the beginning, this approach restricted the data team’s performance. Maintenance time increased, posing a risk to their continued efficiency and agility in enabling accurate and complete insights for the company stakeholders.  

“We were trying to add value as quickly as possible,” said Mallory Principe, Staff Technical PM at AlphaSense. “We ended up creating table after table and model after model without thinking holistically or agnostically about data warehouse management and dimensional modeling best practices.”

With an ever-increasing data volume and more users accessing that data, there was a need for more data contracts and stewardship to reduce the chances of errors and inconsistencies within their data. This is where table- and column-level definitions and table ownership capabilities would make the biggest difference.

To add another layer of complexity, the data team was responsible for managing thousands of tables, dashboards, Looks, and Explores, in addition to different reporting requirements for every team across the organization. With so much responsibility, there was limited bandwidth for unexpected issues.

When issues did arise, dedicating time to resolving them intensified the data team’s daily workload. And with AlphaSense’s continued growth, it was clear they needed to step up and refactor their core data model, especially given how much was on their data team’s plate.

“We ended up in a situation where we had a data warehouse that looked more like a data lake because everyone was just creating more tables, joining those tables, and enabling dashboards,” explained Stefka Antonova, Analytics Engineer at AlphaSense. “We didn't have a well-organized data warehouse that followed any sort of dimensional modeling best practices.”

This reactive approach, coupled with an inability to track data lineage, resulted in long resolution times and signaled the pressing need for improved data governance and quality. It was essential to monitor who was accessing the data and make sure that sensitive information was viewed only by authorized personnel.

“Ultimately, we understood that stewardship has to be there in order to enable data quality,” said Stephanie Loesche, Director of Data Governance, Compliance & Process at AlphaSense. “In order to get to the level of data quality we need, you have to have data governance. You have to have data owners and make sure that they have consistent rules across all the different platforms where that data travels.”

It wasn’t just about restriction; it was also about understanding the data’s relevance and adjusting access based on the data's importance. The end game? Aligning perfectly with leading industry standards by mastering their data management.

With Select Star’s column-level lineage, we were able to see where we might have waste and how things were connected.

Stephanie Loesche

Director of Data Governance, Compliance & Process at AlphaSense

Solution

Flexible and scalable platform for evolving data needs

AlphaSense aimed to find a holistic solution that would address all their underlying issues, streamline their data management processes, and ensure scalability for their future growth.

Their criteria for the ideal data governance tool included the following “must-haves”:

  • A user-friendly tool that wouldn't strain their data team.
  • Quick setup time.
  • Minimal resources required for implementation.
  • Native integrations with BigQuery, dbt, Looker, and Tableau.
  • Ability to see data models across all business domains such as product and content.
  • Consistent data representation across all business segments.
  • Data management capabilities, including data governance, metadata management, and data lineage.

After evaluating all the data catalog and data governance tool options on the market, they ultimately chose Select Star for two main reasons:

  1. Best-in-class column-level lineage. AlphaSense saw immense value in Select Star’s detailed column-level lineage because it offers clear insights into their data's journey and origins. This level of granularity enabled impact analysis and change management, so they’d have a good sense of potential fallout and be better equipped to handle consequences before making a change to their tables or queries.
  2. Automated data documentation and tagging. Instead of relying solely on manual input where human error could sneak in, they wanted a system where data was documented and tagged automatically. This would ensure that their data documentation would remain consistent, up-to-date, and, most importantly, accurate.
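
The impact analysis described above can be pictured as a walk over a lineage graph. The following is a minimal sketch, not Select Star's implementation: the `LINEAGE` mapping, the column names, and the `downstream_impact` helper are all hypothetical and exist only to illustrate how column-level lineage answers "what breaks if I change this column?".

```python
from collections import deque

# Hypothetical column-level lineage (an assumption for illustration):
# each key feeds the columns or dashboards listed as its values.
LINEAGE = {
    "raw.orders.amount": ["staging.orders.amount_usd"],
    "staging.orders.amount_usd": [
        "marts.revenue.total_usd",
        "marts.revenue.avg_order_usd",
    ],
    "marts.revenue.total_usd": ["dashboard:Revenue Overview"],
    "marts.revenue.avg_order_usd": [],
}


def downstream_impact(lineage, changed):
    """Return every node reachable from `changed`, in breadth-first order."""
    seen = {changed}  # guards against cycles and repeated visits
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


impacted = downstream_impact(LINEAGE, "raw.orders.amount")
for name in impacted:
    print(name)
```

In this toy graph, a change to `raw.orders.amount` reaches one staging column, two mart columns, and one dashboard, which is the kind of list a team would review before altering a table or query.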

These specific features were especially powerful for a tech giant like AlphaSense. They empowered AlphaSense to identify all the tables and datasets containing sensitive PII, a crucial part of meeting the regulatory requirements that come with an IPO.

After adding Select Star to their data stack, AlphaSense could see the column-level lineage of their revenue data and isolate which data they needed to control. AlphaSense’s data team finally had the necessary visibility: they could understand where and how their data was being used and implement more stringent processes where needed.

“Any company that's looking to scale and potentially IPO needs a data catalog. We need to know what our data is, where it's coming from, how we're labeling it and defining it, and work with the upstream data owners,” Mallory said.

Result

Streamlined insights, table reductions, and data prioritization

When AlphaSense embarked on their journey to reshape their data infrastructure, they were aiming for more than just operational improvements. After all, that’s why they chose Select Star. 

Since the very first day, they’ve seen:

  1. Rapid insights. Within the first 24 hours of connecting their BigQuery, Looker, and Tableau instances, Select Star computed their data lineage, pinpointed frequently used datasets, and identified inconsistencies in the data models that should be further optimized. This swift clarity on their data landscape showcased how quickly they could move from questions to actionable decisions.
  2. Improved downstream visibility. The transition wasn't just about moving data. It was about achieving a standardized data model that brought a level of consistency across the board. Select Star pinpoints every table or dashboard affected by an upstream change in data. This allows for precise adjustments and provides a clear understanding of potential time and cost implications, enhancing efficiency throughout the dbt migration process.
  3. 43% reduction in data tables. The sheer volume of tables was daunting, but with Select Star’s usage analysis, AlphaSense’s data team was able to reduce their overall number of tables. That resulted in massive savings in time and cost.
  4. 66% of dashboards decluttered. To effectively consolidate their dashboards, AlphaSense leveraged insights into dashboard usage patterns and data origins, helping them identify and eliminate redundancies and focus on the most impactful and frequently accessed reports. This data-driven approach enabled a strategic reorganization around only the essential dashboards.
  5. Completed their data governance project. AlphaSense started by taking into account regulatory adherence, internal evaluations, and a strategic understanding of the information investors and the public market would focus on. With Select Star’s automated cataloging, metadata management, and column-level data lineage features, AlphaSense quickly pinpointed and focused on the datasets that fit the bill: those that were most relevant and essential to their business operations. It turns out that only 10% of their existing data required such a high level of scrutiny. In the long run, that realization was a game-changer. It translated to 100+ hours saved, letting the team zero in on the most business-critical data without compromising their overall speed or agility.
  6. Increased self-serve access to data. Out of the hundreds of data fields that they started with, AlphaSense spotlighted business-critical ones based on Select Star’s analysis of popularity and downstream object counts. The impact? A reduced need to lean on the engineering team. All data consumers at AlphaSense now have a clearer path to key data, leading to sharper, more informed decisions and the ability to build their own dashboards.
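
The usage analysis behind the table reduction can be sketched in a few lines. This is an illustrative assumption rather than a description of how Select Star works: the `TableUsage` record, the sample tables, and the 90-day idle threshold are hypothetical. The idea is simply that a table with no recent queries and no downstream dependents is a candidate for deprecation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class TableUsage:
    name: str
    last_queried: Optional[date]  # None means no query was ever recorded
    downstream_count: int         # tables or dashboards that read from this one


def deprecation_candidates(tables, today, idle_days=90):
    """Return names of tables with no dependents and no queries since the cutoff."""
    cutoff = today - timedelta(days=idle_days)
    return [
        t.name
        for t in tables
        if t.downstream_count == 0
        and (t.last_queried is None or t.last_queried < cutoff)
    ]


# Hypothetical sample data, for illustration only.
tables = [
    TableUsage("marts.revenue", date(2024, 5, 30), 4),
    TableUsage("tmp.revenue_copy_v2", date(2023, 11, 2), 0),
    TableUsage("scratch.adhoc_join", None, 0),
    TableUsage("staging.orders_old", date(2023, 1, 15), 2),
]

candidates = deprecation_candidates(tables, today=date(2024, 6, 1))
for name in candidates:
    print(name)
```

Note that `staging.orders_old` is kept even though it has not been queried recently, because two downstream objects still depend on it. Combining usage with downstream counts is what makes it safe to drop a table rather than merely suspect it is unused.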


By democratizing access to essential metadata insights, various business teams at AlphaSense found themselves with unhindered access and even more trust in the data's accuracy and reliability. This trust and accessibility combined meant the data team was no longer a potential bottleneck. Empowered with confidence in the data, teams could focus on growth, and the data specialists were free to pursue new, more pivotal projects.

Thanks to Select Star, AlphaSense has a data-informed workspace where every team has the self-service information they need to excel. “Select Star improved our data quality and culture around governance. Now, everyone can see how much data actually exists and how we're using it,” said Mallory.
