From a cultural standpoint, generative artificial intelligence (GenAI) has made a big splash. AI assistants can write your emails, answer your questions, plan your diet – the use cases seem endless.
AI has made a stunning impact in software development, where it’s helping developers code faster and spend more energy on high-value activities. But what does it mean for data analytics?
In all the excitement around new technology, the future possibilities can overshadow the baby steps between here and there – though those steps are no less exciting.
Select Star founder and CEO Shinji Kim sat down with Scott Breitenother, Chief Data Officer at Velir and founder and CEO of Brooklyn Data, to talk about the future of data in an AI-powered world.
The Near Future of GenAI in Data
The first wave of AI in data will be machines acting in a “copilot” role, Scott predicted, taking over tedious tasks like autocompleting syntax.
From there, Shinji believes it won’t be long before technology takes a second, more exciting step – a system that can provide business intelligence without human-generated SQL.
The problem is getting the system to deliver reliable, high-quality data. Data analysts know that the business users requesting information often aren’t sure what they’re looking for. Only after getting data back and interacting with it can they refine their question.
Current AI models, without being trained on rich, well-structured context, may hallucinate false answers to a query. Unless there is a human in the loop to verify accuracy, these answers could lead users down the wrong path.
The Select Star approach to the context conundrum is to leverage existing queries and dashboards. Instead of generating new SQL, the AI is tasked with pointing users in the right direction.
“Then, underneath those assets – depending on how things are being queried – we can put together a custom SQL based on the joins and queries and tables and columns that are actually being used,” Shinji explained.
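To make the idea concrete, here is a minimal sketch of leaning on existing queries rather than generating SQL from scratch. It is not Select Star's implementation. The `LoggedQuery` record, the `popular_joins` helper, and the sample log are all hypothetical, and the sketch assumes a query log that has already been parsed into the tables and join pairs each query used.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple

JoinPair = Tuple[str, str]  # (left_column, right_column)


@dataclass(frozen=True)
class LoggedQuery:
    """One entry from a hypothetical, already-parsed query log."""
    tables: Tuple[str, ...]     # tables the query read from
    joins: Tuple[JoinPair, ...]  # join conditions the query used


def popular_joins(log: List[LoggedQuery], table: str, top_n: int = 3):
    """Return the join pairs most often used in queries that touch `table`."""
    counts: Counter = Counter()
    for query in log:
        if table in query.tables:
            counts.update(query.joins)
    return counts.most_common(top_n)


# Illustrative data only; table and column names are made up.
EXAMPLE_LOG = [
    LoggedQuery(("orders", "customers"),
                (("orders.customer_id", "customers.id"),)),
    LoggedQuery(("orders", "customers"),
                (("orders.customer_id", "customers.id"),)),
    LoggedQuery(("orders", "products"),
                (("orders.product_id", "products.id"),)),
    LoggedQuery(("customers", "regions"),
                (("customers.region_id", "regions.id"),)),
]

if __name__ == "__main__":
    for join, uses in popular_joins(EXAMPLE_LOG, "orders"):
        print(join, uses)
```

Ranking by how often a join actually appears is one simple way to ground suggestions in real usage. A production system would likely weigh recency, query authorship, and dashboard popularity as well.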
Right now, Shinji estimated, AI data tools are about halfway to being able to deliver the business intelligence people crave. The human-influenced approach used at Select Star gets closer to 90% of the way there.
AI and Data Security
Recent years have seen widespread adoption of data democratization, a practice that gives people throughout the organization access to data.
There are a lot of benefits to this approach: it frees up engineering time to work on innovating and optimizing rather than building queries, and it allows people throughout the company to work faster, better, and with greater insight.
But friction, Scott pointed out, isn’t always a bad thing. When there’s easy access to data, people can stray into areas they don’t belong. And when spend is based on consumption, a user can accidentally trigger a lot of spend with a single query.
With role-based and policy-based access control, AI can help businesses protect data while still giving non-data teams some freedom.
“We gave people a key to the car; now we have to give them a curfew,” Scott said. “When all the assets are in one place and anybody has access, there’s a risk for leaks and for people seeing things they shouldn’t see.”
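The "key" and the "curfew" can be pictured as two checks run before a query executes: does the role have access to the schema, and does the estimated scan fit the role's budget? The sketch below is a simplified illustration, not any vendor's API. The roles, schemas, and limits are invented, and the checks are ordinary rules; where AI might help (for example, proposing the grants) is left out.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

# Hypothetical role-to-schema grants (the "key to the car").
ROLE_GRANTS: Dict[str, FrozenSet[str]] = {
    "analyst": frozenset({"sales", "marketing"}),
    "data_engineer": frozenset({"sales", "marketing", "finance", "hr"}),
}

# Hypothetical per-query scan budgets in GB (the "curfew").
ROLE_SCAN_LIMIT_GB: Dict[str, float] = {
    "analyst": 50.0,
    "data_engineer": 500.0,
}


@dataclass(frozen=True)
class QueryRequest:
    role: str
    schema: str
    estimated_scan_gb: float  # assumed to come from the warehouse's planner


def check_request(req: QueryRequest) -> str:
    """Apply the access check first, then the consumption guardrail."""
    allowed_schemas = ROLE_GRANTS.get(req.role, frozenset())
    if req.schema not in allowed_schemas:
        return "denied: role has no access to this schema"
    if req.estimated_scan_gb > ROLE_SCAN_LIMIT_GB.get(req.role, 0.0):
        return "denied: estimated scan exceeds the role's budget"
    return "allowed"


if __name__ == "__main__":
    print(check_request(QueryRequest("analyst", "sales", 10.0)))
    print(check_request(QueryRequest("analyst", "hr", 1.0)))
    print(check_request(QueryRequest("analyst", "sales", 200.0)))
```

Unknown roles fall through to an empty grant set and a zero budget, so the default is to deny rather than allow.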
AI data tools may also have a role in next-generation penetration testing, making sure security protocols are sound and personally identifiable information (PII) has been properly anonymized. We can soon expect to see layers of testing algorithms, with main logic and issue resolution overseen by humans.
As companies shore up the weak spots highlighted by AI, Scott predicted, there will be greater enforcement of lapsed PII protections that should have been practiced all along.
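A very small version of a check that PII has been properly anonymized might look like the sketch below. It uses plain regular expressions rather than AI, covers only email addresses and US Social Security number formats, and would miss plenty of real-world PII. The `find_pii` helper and the sample rows are hypothetical.

```python
import re
from typing import Dict, Iterable, List, Set

# Deliberately simple patterns; real PII detection needs far more coverage.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_pii(rows: Iterable[Dict[str, object]]) -> Dict[str, List[str]]:
    """Map each column containing a match to the kinds of PII found in it."""
    found: Dict[str, Set[str]] = {}
    for row in rows:
        for column, value in row.items():
            for kind, pattern in PATTERNS.items():
                if pattern.search(str(value)):
                    found.setdefault(column, set()).add(kind)
    return {column: sorted(kinds) for column, kinds in found.items()}


if __name__ == "__main__":
    # Made-up rows from a table that is supposed to be anonymized.
    sample = [
        {"user": "a1b2c3", "note": "contact jane@example.com"},
        {"user": "d4e5f6", "note": "ssn on file: 123-45-6789"},
    ]
    print(find_pii(sample))
```

A human would still review the flagged columns and decide how to fix them, in line with the human-overseen testing described above.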
AI and Regulatory Compliance
Many of those PII protections are codified into regulations like CCPA and GDPR. AI-powered adaptive governance tools will be essential to medium-sized businesses navigating a patchwork of local and regional privacy regulations.
“Maybe large companies have counsel who can say, ‘This is what CCPA means to you for data.’ But most companies don’t,” Scott said. “An adaptive governance tool that can review CCPA requirements and tell you where you need to remediate is huge. Almost every state is coming up with its own privacy laws. It’s impossible for a medium-sized business to navigate all those regulations.”
AI can also generate data governance policies, clearly documenting what data is collected and for what purpose, in the event of a regulatory audit. This is a tremendous benefit, Shinji said, since for regulations like GDPR, there is no certification process or test audit businesses can take to check their compliance.
AI and the Modern Data Stack
As the modern data stack continues to evolve, leading companies are exploring innovative ways AI can elevate their offerings. As tools have become more sophisticated, users have grown in sophistication alongside them, so the bar for new features is high.
“I think a lot of vendors have done a lot of breadth of features and now they’re adding on top. What I hear from the market from time to time is that it doesn’t always work as intended,” Shinji said. “There is a lot on the technical side that still needs to improve.”
While AI startups have gotten a lot of media coverage, Scott said he believes innovation in the data space will mostly come from the well-funded research and development teams at existing large vendors. These companies have the compute and budgets to build and train machine learning models, and the customer trust to enjoy rapid adoption of new tools they create.
“AI tools are going to be features of SaaS tools companies already use rather than pure, innovative AI startups,” he predicted. “Most companies would rather enable an AI feature in Salesforce or Microsoft, with security and governance they already feel comfortable with.”
What Does the Future Hold for Generative AI in Data?
Modern AI technology like ChatGPT is ANI, or artificial narrow intelligence. It performs a task, and if a human wants more information or context, they must ask for it.
Shinji believes the tech world is very close to the next iteration, AGI, or artificial general intelligence. This AI thinks more like a human – it can provide context to the human requester, and suggest or perform additional tasks the person didn’t ask for, in the interest of achieving the desired result.
At that point, AI tools would graduate from their current role as robot assistants into a higher-level role as robot advisors.
“I don’t think we’re too far from that future,” Shinji said. “It’s going to be the expert at my fingertips that can do thinking and execution. At that leap, we’re going to start seeing things we can’t even imagine right now.”
Ready or Not, the Future is Now
When compared with other tech sectors, AI penetration of the data industry seems slow, due to the complexity of data and the crucial nature of data accuracy.
Nonetheless, there is no question that AI is having a transformative impact on data. The ability of machines to do menial tasks is quickly expanding to an ability to act as guides, guardians, and eventually, advisors.
As AI-powered data products evolve, businesses will enter a new era of rich, informed decision-making at the click of a button. An automated discovery platform like Select Star can help organizations keep the abundance of human- and machine-generated data organized, accurate, and reliable. See for yourself with a demo.