When the heck do you actually need a data catalog? The one metric you should know
💥 How to proactively recognize when it’s time to buy a data catalog
This week, I’m excited to hand over Metadata Weekly to our first guest author, Austin Kronz (Director of Data Strategy @ Atlan), to share a deep dive into the “when” of data catalogs!
A question that I hear a lot is, “When is the right time to invest in a data catalog? How do I know I need one?”
Since joining Atlan a few months ago, I’ve spent a lot of time talking with the amazing data teams and leaders that we work with. One of my favorite chats was with Otávio Leite Bastos, Global Data Governance Lead at Contentsquare, who has built an amazing way to think through when you actually need data catalogs and data governance at your organization.
Otávio and I recently collaborated to share his learnings and best practices to help data teams, including how to proactively recognize when you actually need a data catalog. Keep reading for the one metric that will tell you when it’s time, or read the full version here.
✨ Spotlight: How to know it’s time to buy a data catalog
A data governance team ensures the delivery of trust (verification of data sources and protection of PII) and clarity (well-documented and accessible data products) around data to every decision-maker. To do this, there needs to be a centralized place that brings these data governance principles to life.
In the modern data stack, this is the data catalog. Unlike traditional data catalogs, these next-gen catalogs must be able to activate metadata to support all the facets of data governance.
So when is the right time to buy a data catalog?
There will be early signals that your team needs a catalog. Detecting these signals will be a mix of listening for qualitative feedback and quantitative analysis.
Some early qualitative signals to listen for:
Analysts are unsure what data sets they can use, and not sure if they can trust them.
Different teams are calculating the same metric in different ways.
Analysts and business users are unsure what metric definitions even mean.
Quantitatively, the signals that you need a data catalog revolve around the time-to-value metric that data teams often fixate on. Create a baseline time-to-value calculation and monitor how this metric changes over the course of a few weeks to a month. To calculate time to value:
[Project Delivery Date] - [Project Committed Date] = [Time to Value]
As your team grows, any consistent increase in time to value (for example, quarter over quarter) is a sign you need to invest in a data catalog.
Plotted over time, time to value forms a curve, and the bottom of that curve, just before it starts to climb, is the ideal time to buy a data catalog.
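For teams that already log project dates in a spreadsheet or ticketing tool, the time-to-value calculation could be sketched in Python roughly as follows. This is a minimal illustration rather than Otávio's actual method: the function names, the sample project dates, and the choice of "three consecutive rising quarters" as the definition of a consistent increase are all assumptions made for this example.

```python
from datetime import date
from statistics import mean


def time_to_value_days(committed: date, delivered: date) -> int:
    """Time to value = project delivery date - project committed date, in days."""
    return (delivered - committed).days


def average_ttv_by_quarter(projects):
    """Average time to value per (year, quarter), keyed by delivery date.

    `projects` is an iterable of (committed_date, delivered_date) pairs.
    """
    buckets = {}
    for committed, delivered in projects:
        quarter = (delivered.year, (delivered.month - 1) // 3 + 1)
        buckets.setdefault(quarter, []).append(
            time_to_value_days(committed, delivered)
        )
    # Sort by (year, quarter) so the result reads in chronological order.
    return {q: mean(days) for q, days in sorted(buckets.items())}


def is_consistently_increasing(values, min_periods=3):
    """True if the last `min_periods` values are strictly rising.

    The three-period threshold is an assumption for this sketch, not a rule
    from the article; tune it to how noisy your own numbers are.
    """
    values = list(values)
    if len(values) < min_periods:
        return False
    recent = values[-min_periods:]
    return all(a < b for a, b in zip(recent, recent[1:]))


# Hypothetical sample data: (committed date, delivered date) per project.
projects = [
    (date(2022, 1, 10), date(2022, 1, 24)),
    (date(2022, 2, 1), date(2022, 2, 17)),
    (date(2022, 4, 4), date(2022, 4, 25)),
    (date(2022, 5, 2), date(2022, 5, 27)),
    (date(2022, 7, 1), date(2022, 8, 5)),
]

quarterly = average_ttv_by_quarter(projects)
for (year, q), avg_days in quarterly.items():
    print(f"{year} Q{q}: {avg_days} days")

if is_consistently_increasing(quarterly.values()):
    print("Time to value keeps rising: worth evaluating a data catalog.")
```

With the sample dates above, average time to value rises every quarter, so the check flags the trend. In practice you would want more projects per quarter before reading much into the result, and you would pair the number with the qualitative signals listed earlier.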
It’s common for these qualitative and quantitative signals to emerge or increase as your company and team grow. Naturally, demands for data will increase. In the early days, analysts will double as engineers (and vice versa).
However, when your team recognizes the signals mentioned above, you’ll need to divide and specialize into analyst tasks and engineering tasks. This is when you will start to need data governance and when you should start procuring a data catalog, as your next step after splitting data analysts and engineers will likely be creating the first data governance team.
Without a user-friendly data catalog, data governance teams would have to be significantly larger in an attempt to keep up with data demands. This is economically inefficient and not scalable.
The inflection point of growth in data and analytics roles, and its effect on time to value, are tell-tale signs that it is time to formalize data governance efforts and procure a modern data catalog. Without this, organizations would have to spend excessive amounts of money on hiring to manually manage new data products, something that isn’t possible in the economic conditions faced in 2023.
Instead, investing in the right technologies early on in your data governance journey can ultimately save time and money down the road. Using a next-gen catalog centralizes the management of governance rules while democratizing data discovery, leading to an efficient data governance program.
Read the full article on Atlan’s blog. ➡️
Contentsquare just hosted an amazing masterclass on how they use data governance to accelerate analytics and BI.
💫 Last year in Atlan: Our 15 favorite updates from 2022
Almost two years ago, we laid out our idea of modern metadata for the modern data stack. Since then, it’s been a dream seeing our vision come to life. This transformation was driven by real product developments, ones that turned our abstract dream into something that truly can transform the way that data teams work.
In 2022, our engineering team shipped over 200 new features, including a brand new version of our product. These improvements proved the power of active metadata and even got us ranked as a Leader in the Forrester Wave, Q2 2022. Before we take on the next year, we wanted to take a few minutes to look back at our favorite features from 2022.
👉 Enhance collaboration and stay in flow with Slack, Jira, and Chrome integrations
👉 Minimize risk and increase visibility with our deep Fivetran, dbt, and GitHub integrations
👉 Reduce manual work with Atlan Playbooks, Trident suggestions, and AWS EventBridge-driven automations
👉 Create a Netflix-like experience with personalization, custom metadata, and a brand new UI
👉 and much more!
Read our 2022 product roundup here ➡️
📚 More from my reading list
Transform your data team into a performance powerhouse by Blake Burch
Defining ownership and making it actionable by Mikkel Dengsøe
Do data teams have product-market fit? by Benn Stancil
Overcoming some of the worst parts of being a data scientist by Ani Madurkar
Data mesh: concepts and best practices for implementing a product-centric data architecture by Saeed Mohajeryami