Automated Data Documentation: Entity-Relationship Diagram (ERD)

Shinji Kim
June 15, 2022

Hello from Snowflake Summit 2022!

Today, we are releasing auto-generated ERDs (Entity-Relationship Diagrams) to help data analysts and citizen data scientists understand and use data more effectively.

What is an Entity Relationship Diagram (ERD)?

An Entity Relationship Diagram (ERD) is the starting point for most data analysis. According to TechTarget, Entity Relationship Diagrams provide a visual starting point for database design that can also be used to help determine information system requirements throughout an organization. After a relational database is rolled out, an ERD can still serve as a reference point, should any debugging or business process re-engineering be needed later.

This makes ERDs one of the most essential parts of data modeling and architecture, especially for relational databases queried with SQL.

Select Star’s auto-generated ERD

Today, most ERDs are created at the start of a project, as a blueprint that maps business processes to data. However, as a company grows and data democratization takes hold, keeping ERDs up to date manually becomes almost impossible.

At Select Star, we think ERDs are a core part of working with data, because they let anyone understand how each dataset can be joined with another.

While data lineage maps out the flow of data (where did the data come from, and where is it going?), an ERD maps out how different datasets can be used together.

Automatically Generating ERDs

So how does this work?

First, Select Star looks at any primary key and foreign key definitions in the database. Each relationship it finds there is added to the ERD model.

Select Star will import Primary Key / Foreign Key relationships from your database, or you can define them in the UI.
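To make the first step concrete, here is a minimal sketch of reading declared foreign keys out of database metadata. It is not Select Star's implementation, which this post does not describe. It assumes a SQLite database and uses SQLite's `PRAGMA foreign_key_list`; the `declared_relationships` helper and the sample tables and columns are hypothetical. Other databases expose the same information through their own metadata views.

```python
import sqlite3


def declared_relationships(conn):
    """Return sorted (child_table, child_column, parent_table, parent_column)
    tuples for every foreign key declared in a SQLite database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    edges = []
    for table in tables:
        # Result columns: id, seq, table, from, to, on_update, on_delete, match
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            edges.append((table, fk[3], fk[2], fk[4]))
    return sorted(edges)


# Hypothetical sample schema, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id)
    );
    CREATE TABLE order_items (
        id INTEGER PRIMARY KEY,
        order_id INTEGER REFERENCES orders(id)
    );
    """
)

relationships = declared_relationships(conn)
for child, child_col, parent, parent_col in relationships:
    print(f"{child}.{child_col} -> {parent}.{parent_col}")
```

Each tuple is one edge of the diagram: the child column points at the parent table's key.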

Second, Select Star also looks at the joins each column takes part in, along with their join conditions. In SQL, a join condition is typically an ON clause that equates a column in one table with a column in another.

Looking up the ORDERS table in Select Star: three other tables are related to it, namely ORDER_ITEMS, CUSTOMERS, and ORDER_PAYMENTS.

From these join conditions, we can infer which tables are related to each other and how.
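As a rough illustration of that inference, the sketch below scans a list of SQL strings for equality join conditions and counts how often each pair of columns is joined. It is an assumption-heavy toy, not Select Star's method: it uses regular expressions instead of a real SQL parser, handles only simple `alias.column = alias.column` conditions, and the query log, column names, and helper names are all hypothetical. Only the table names come from the ORDERS example above.

```python
import re
from collections import Counter

# Equality conditions of the form  name.column = name.column
CONDITION = re.compile(r"(\w+)\.(\w+)\s*=\s*(\w+)\.(\w+)")
# "FROM table [AS] alias" or "JOIN table [AS] alias"; the lookahead stops
# SQL keywords that follow a table name from being read as its alias.
TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN)\s+(\w+)"
    r"(?:\s+(?:AS\s+)?(?!(?:ON|JOIN|LEFT|RIGHT|INNER|OUTER|FULL|CROSS|WHERE|USING)\b)(\w+))?",
    re.IGNORECASE,
)


def infer_relationships(queries):
    """Count how often each pair of (table, column) is equated in a join."""
    counts = Counter()
    for sql in queries:
        # Map both table names and aliases to the upper-cased table name.
        aliases = {}
        for table, alias in TABLE_REF.findall(sql):
            aliases[table.lower()] = table.upper()
            if alias:
                aliases[alias.lower()] = table.upper()
        for lt, lc, rt, rc in CONDITION.findall(sql):
            left = (aliases.get(lt.lower(), lt.upper()), lc.upper())
            right = (aliases.get(rt.lower(), rt.upper()), rc.upper())
            if left[0] != right[0]:  # ignore conditions within one table
                # Sort so that A = B and B = A count as the same relationship.
                counts[tuple(sorted((left, right)))] += 1
    return counts


def related_tables(counts, table):
    """Return the set of tables joined to `table` at least once."""
    related = set()
    for (t1, _), (t2, _) in counts:
        if table in (t1, t2):
            related.add(t2 if t1 == table else t1)
    return related


# Hypothetical query log; the column names are made up for illustration.
query_log = [
    "SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id",
    "SELECT * FROM orders JOIN order_items ON order_items.order_id = orders.id",
    "SELECT * FROM orders o LEFT JOIN order_payments p ON p.order_id = o.id",
    "SELECT * FROM customers c JOIN orders o ON c.id = o.customer_id",
]

counts = infer_relationships(query_log)
print(sorted(related_tables(counts, "ORDERS")))
```

Counting occurrences, rather than just recording them, gives a simple way to rank relationships: a join that appears in many queries is more likely to be a real relationship than one that appears once.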

“The automatic ERD feature allows data analysts to more easily discover what table and field to join for a query without having to talk to another analyst who has tribal knowledge.”

Veronica Zhai, Director of Analytics Product and Operation @ Fivetran

As the majority of our user base is data analysts, we’re excited to release this feature to production today and to continue bringing you more insights about your data. Please check it out and let us know what you think!
