Lessons from Forrester’s Michele Goetz on the future of data catalogs
We recently asked 250+ data practitioners, “What comes to mind when you think of a data catalog?”
The responses were split.
There were plenty of words describing the capabilities and components of a data catalog: inventory, governance and compliance, metadata, lineage, data dictionary, business glossary, etc. But there were also a lot of words describing the value of a data catalog: trust, self-service, opportunities, know your data, structure, productizing data, knowledge, etc.
The problem is, that value isn’t universal. Ananth Packkildurai recently ran a poll on LinkedIn in which only 26% of people said they were happy with their data catalog, and 59% said they found their data catalog somewhat useful.
We get it. We started as a data team ourselves, and we failed to implement a data catalog 3x. But when we got it right, our data team became 6x more agile.
This sort of transformation becomes possible when you get a data catalog right. But what does that actually mean? How can great data catalogs help data teams work together better?
🚀 Spotlight: Michele Goetz from Forrester on the future of data catalogs
Michele Goetz, VP Principal Analyst at Forrester Research, recently hosted an insightful masterclass on the past, present, and future of data catalogs.
At one point, data teams were just IT professionals, and their “data catalogs” were simple glossary and inventory management systems. But today, data teams and data people across the organization are incredibly diverse.
Michele explained how modern data teams work today at the intersection of three patterns:
Self-service: Decentralized and democratized data product management
AI: Intelligently automating experiences
Analytics/BI: Making decisions to execute
Working at the intersection of these patterns, with such diverse teams, creates new types of problems for data teams, especially around privacy and security. Data practitioners need to think about these topics early and often, rather than treating them as an afterthought.
This is where data catalogs can play a role, by bringing together metadata and context from across the data stack. But traditional data catalogs don’t cut it anymore. Instead, Michele talked about the importance of leveling up your data catalog to embrace a new way of working.
Data Catalogs for DataOps come into play when companies move from thinking about “governing” their data to actively using metadata to drive value and growth across the company. According to Michele, great data catalogs are the secret to enabling DataOps processes and methods across the data lifecycle.
Here are her big takeaways for modern organizations today:
Build connected intelligence: Data is no longer static, so use a network of dynamic data and intelligence to drive company growth.
Foster a DataOps culture: Establish a model for decentralized data product development and management.
Implement an Enterprise Data Catalog: This will enable and reinforce the set of DataOps best practices, creating a vibrant community of data producers.
To learn more about these ideas, watch the masterclass or read The Forrester Wave™: Enterprise Data Catalogs for DataOps report.
👥 Improve data discovery with persona-driven strategies
By Jacob Frackson
Any data person knows that a generic or superficial data model can be worse than none at all.
While some company-wide metrics can be very powerful and help tie everyone together – such as Customers Served All Time or Monthly Customer Growth – others can be an accident waiting to happen.
With revenue, for example, it’s possible to maintain a single universal definition. But what happens when the finance team wants to start reporting on revenue net of cancellations or refunds? What if sales wants to move the date up and start counting revenue when the contract is signed, not when the payment is collected? Well, now the simple term “revenue” can’t cover these analyses and use cases! How do we decide who gets to use “revenue”, and what should everyone else use?
At the other end of the spectrum, if everyone is left to define revenue on their own, we have either low adoption or more misunderstanding. With less structure, many potential data stakeholders will be pushed out due to their lack of familiarity with the tool or lack of confidence in their skills.
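To make the revenue example concrete, here is a minimal Python sketch of one possible middle ground: each team’s variant gets its own explicitly named metric, so nobody has to compete for the single word “revenue”. The order records, field names, and metric names (collected_revenue, net_revenue, booked_revenue) are hypothetical and made up for illustration; they are not taken from Jacob’s article.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Order:
    """A hypothetical order record, invented for this illustration."""
    amount: float
    signed_on: date            # date the contract was signed
    paid_on: Optional[date]    # None until the payment is collected
    refunded: float = 0.0      # amount refunded after payment


def in_month(d: Optional[date], year: int, month: int) -> bool:
    """True if the date exists and falls in the given calendar month."""
    return d is not None and (d.year, d.month) == (year, month)


def collected_revenue(orders: List[Order], year: int, month: int) -> float:
    """Baseline definition: count revenue when the payment is collected."""
    return sum(o.amount for o in orders if in_month(o.paid_on, year, month))


def net_revenue(orders: List[Order], year: int, month: int) -> float:
    """Finance's variant: collected revenue net of refunds."""
    return sum(o.amount - o.refunded
               for o in orders if in_month(o.paid_on, year, month))


def booked_revenue(orders: List[Order], year: int, month: int) -> float:
    """Sales' variant: count revenue when the contract is signed."""
    return sum(o.amount for o in orders if in_month(o.signed_on, year, month))


# Three made-up orders, all signed in January 2023.
ORDERS = [
    Order(1000.0, date(2023, 1, 10), date(2023, 1, 20), refunded=200.0),
    Order(500.0, date(2023, 1, 25), date(2023, 2, 5)),
    Order(250.0, date(2023, 1, 30), None),
]

for metric in (collected_revenue, net_revenue, booked_revenue):
    print(metric.__name__, metric(ORDERS, 2023, 1))
```

The same three orders give three different January figures depending on which definition is applied, which is why an unqualified “revenue” stops being enough once more than one team relies on it.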
So where does that leave data teams? In this great article, Jacob Frackson (Practice Lead at Montreal Analytics, our partner 💙) talks about how data teams can strike the right balance between those two extremes and design a data stack that really works by defining and leveraging internal personas.
Read more about improving data discovery with persona-driven strategies in his blog.
📚 More from my reading list
9 predictions for data in 2023 by Tomasz Tunguz
People-first data stacks by Ilan Man
How does data drive growth in practice? with Abhi Sivasailam (Growth and Analytics Leader) on the Analytics Engineering Podcast
An engineer’s guide to data contracts by Chad Sanderson and Adrian Kreuziger
The eternal suffering of data practitioners by Pedram Navid
How we enhanced productivity of Zapr’s data platform and saved costs by Sonaiyakarthick Poongavanam
Managing data products in hurricane-like headwinds by Eric Weber
The over-optimization of everything by James Densmore
We haven’t included a big list of links in this newsletter for a few weeks, so here’s making up for the lost time. I’ve also added some more resources to my data stack reading list. If you haven’t checked out the list yet, you can find and bookmark it here.
P.S. Liked reading this edition of the newsletter? We’d love it if you could take a moment and share it with your friends on social.