AlphaSense is a market intelligence and search platform used by the world’s leading companies and financial institutions. Today, AlphaSense has over 4,000 enterprise customers – including more than 85% of the S&P 100.
Adapting a core data model to hyper-growth demands
AlphaSense leverages proprietary AI and language models to help professionals extract relevant insights from an extensive universe of public and private content, including equity research, earnings calls, company filings, news, trade journals, and expert interviews. Like many high-growth companies, AlphaSense faced some common challenges central to their data model.
To support a large and fast-growing group of company stakeholders, the team often had to prioritize agility and immediate results, resorting to shortcuts and temporary solutions (i.e., responding to requests with more and more dashboards and tables).
While these ad-hoc solutions solved problems quickly in the beginning, this approach restricted the data team’s performance. Maintenance time increased, posing a risk to their continued efficiency and agility in enabling accurate and complete insights for the company stakeholders.
“We were trying to add value as quickly as possible,” said Mallory Principe, Staff Technical PM at AlphaSense. “We ended up creating table after table and model after model without thinking holistically or agnostically about data warehouse management and dimensional modeling best practices.”
With an ever-increasing data volume and more users accessing that data, there was a need for more data contracts and stewardship to reduce the chances of errors and inconsistencies within their data. This is where table- and column-level definitions and table ownership capabilities would make the biggest difference.
To add another layer of complexity to the mix, the data team was responsible for managing thousands of tables, dashboards, looks, and explores, in addition to different reporting requirements for every team across the organization. With so much responsibility, there was limited bandwidth for unexpected issues.
When issues did arise, dedicating time to resolving them intensified the data team’s daily workload. And with AlphaSense’s continued growth, it was clear they needed to step up and refactor their core data model, especially given how much was on their data team’s plate.
“We ended up in a situation where we had a data warehouse that looked more like a data lake because everyone was just creating more tables, joining those tables, and enabling dashboards,” explained Stefka Antonova, Analytics Engineer at AlphaSense. “We didn't have a well-organized data warehouse that followed any sort of dimensional modeling best practices.”
This reactive approach, coupled with an inability to track data lineage, resulted in long resolution times and signaled the pressing need for improved data governance and quality. It was essential to monitor who was accessing the data and make sure that sensitive information was viewed only by authorized personnel.
“Ultimately, we understood that stewardship has to be there in order to enable data quality,” said Stephanie Loesche, Director of Data Governance, Compliance & Process at AlphaSense. “In order to get to the level of data quality we need, you have to have data governance. You have to have data owners and make sure that they have consistent rules across all the different platforms where that data travels.”
It wasn’t just about restriction; it was also about understanding the data’s relevance and adjusting access based on the data's importance. The end game? Aligning perfectly with leading industry standards by mastering their data management.
“With Select Star’s column-level lineage, we were able to see where we might have waste and how things were connected.”
Stephanie Loesche, Director of Data Governance, Compliance & Process at AlphaSense
Flexible and scalable platform for evolving data needs
AlphaSense aimed to find a holistic solution that would address all their underlying issues, streamline their data management processes, and ensure scalability for their future growth.
Their criteria for the ideal data governance tool included the following “must-haves”:
After evaluating all the data catalog and data governance tool options on the market, they ultimately chose Select Star for two main reasons:
These specific features were especially powerful for a tech giant like AlphaSense. They enabled AlphaSense to identify all the tables and datasets containing sensitive PII, a crucial part of ensuring compliance with the regulatory requirements of a potential IPO.
After adding Select Star to their data stack, AlphaSense could see the column-level lineage of their revenue data and isolate which data they needed to control. The data team finally had the visibility it needed: they could understand where and how their data was being used and implement more stringent processes where needed.
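To make the idea of column-level lineage concrete, here is a minimal sketch of how downstream columns and tables can be traced from a single sensitive source column. It is an illustration only, not Select Star's API or AlphaSense's actual setup: the `LINEAGE` mapping, the column names, and the helper functions are all hypothetical.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the columns listed in its value.
# Names follow a 'schema.table.column' convention chosen for this example.
LINEAGE = {
    "crm.contacts.email": ["staging.contacts.email"],
    "staging.contacts.email": [
        "marts.customers.email",
        "marts.revenue.billing_contact",
    ],
    "billing.invoices.amount": ["marts.revenue.amount"],
}


def downstream_columns(lineage, source):
    """Return every column reachable from `source` via breadth-first search."""
    seen = set()
    queue = deque([source])
    while queue:
        column = queue.popleft()
        for child in lineage.get(column, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def tables_of(columns):
    """Collapse 'schema.table.column' names to their 'schema.table' prefix."""
    return {column.rsplit(".", 1)[0] for column in columns}


# Trace where a (hypothetical) PII column ends up.
pii_columns = downstream_columns(LINEAGE, "crm.contacts.email")
pii_tables = tables_of(pii_columns)
```

Here `pii_tables` lists the tables that inherit data from the sensitive column, which is the kind of list a team could use to decide where stricter access controls belong.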
“Any company that's looking to scale and potentially IPO needs a data catalog. We need to know what our data is, where it's coming from, how we're labeling it and defining it, and work with the upstream data owners,” Mallory said.
Streamlined insights, table reductions, and data prioritization
When AlphaSense embarked on their journey to reshape their data infrastructure, they were aiming for more than just operational improvements. After all, that’s why they chose Select Star.
Since the very first day, they’ve seen:
By democratizing access to essential metadata insights, various business teams at AlphaSense found themselves with unhindered access and even more trust in the data's accuracy and reliability. This trust and accessibility combined meant the data team was no longer a potential bottleneck. Empowered with confidence in the data, teams could focus on growth, and the data specialists were free to pursue new, more pivotal projects.
Thanks to Select Star, AlphaSense has a data-informed workspace where every team has the self-service information they need to excel. “Select Star improved our data quality and culture around governance. Now, everyone can see how much data actually exists and how we're using it,” said Mallory.