Traditional data management doesn’t work. Here’s what could.
3 real-life case studies, 2 new data personas, and 1 free ebook
In the last few editions of this newsletter, we’ve talked a lot about DataOps. Honestly, we’re really excited about it!
Today, data is growing faster than any company can keep up with. Data teams are becoming more diverse than ever before. Data stacks and infrastructure are becoming increasingly complicated. The result is data chaos like never before.
While we don’t have all the answers, we’re super excited about the potential of DataOps to bring calm and clarity to this chaos.
Today we’re excited to launch our latest ebook, which puts the spotlight on actually implementing DataOps and making it a reality. It’s an actionable guide to how active metadata helps modern data organizations embrace the DataOps way, based on our experiences working with modern data teams around the world.
We’ve included a snippet here on DataOps personas and active metadata platforms.
You can download the full ebook here. ➡️
✨ Spotlight: How activating metadata holds the key to the DataOps dream
DataOps is a central function that enables the rest of the organization, but what does that actually mean? The best way to think about DataOps is through analogies to other teams: e.g. RevenueOps teams activate revenue data to improve revenue growth, and ProductOps teams activate product data to build better products.
DataOps teams activate “data data” (aka metadata) to help organizations achieve value from data. 🚀
Structuring a DataOps function
There are two key personas in a modern DataOps team:
DataOps Enablement Leads: They understand data and users, and are great at cross-team collaboration and bringing people together. DataOps Enablement Leads often come from backgrounds like Information Architects, Data Governance Managers, Library Sciences, Data Strategists, Data Evangelists, and even extroverted Data Analysts and Engineers.
DataOps Enablement Engineers: They are the automation brain in the DataOps team. Their key strength is a sound knowledge of data and how it flows between systems and teams, which lets them act as both advisors and executors on automation. They are often former Developers, Data Architects, Data Engineers, and Analytics Engineers.
Below is an example of how WeWork structured its DataOps function around two key personas.
Note: Check out the ebook to learn about the real people behind these personas at WeWork.
How active metadata creates the “single source of truth” behind DataOps
Instead of just collecting metadata from the rest of the stack and bringing it back into a passive data catalog, active metadata makes a two-way movement of metadata possible. It sends enriched metadata back into every tool in the data stack, giving the humans of data context wherever and whenever they need it. Active metadata couples this idea with automation to enable powerful programmatic use cases, from automated data deprecation to data quality management.
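To make the two-way idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Asset` and `Catalog` classes, their methods, and the tool names are invented for illustration and are not the API of Atlan or any other product. It shows only the shape of the flow: collect metadata, enrich it in one place, then push the context back out to each tool.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """Hypothetical metadata record for one table or dashboard."""
    name: str
    description: str = ""
    tags: set = field(default_factory=set)


class Catalog:
    """Toy in-memory metadata store (an assumption, not a real platform)."""

    def __init__(self):
        self.assets = {}

    def collect(self, asset):
        # Passive step: bring metadata from a tool into the catalog.
        self.assets[asset.name] = asset

    def enrich(self, name, description, tags):
        # People or automation add context in one central place.
        asset = self.assets[name]
        asset.description = description
        asset.tags |= set(tags)

    def push_back(self, tools):
        # Active step: send the enriched context back to every tool.
        for context_by_asset in tools.values():
            for asset in self.assets.values():
                context_by_asset[asset.name] = {
                    "description": asset.description,
                    "tags": sorted(asset.tags),
                }
        return tools


catalog = Catalog()
catalog.collect(Asset(name="orders"))
catalog.enrich("orders", "One row per customer order", ["finance", "pii"])

# "warehouse" and "bi" stand in for real tools in the stack.
tools = catalog.push_back({"warehouse": {}, "bi": {}})
```

After `push_back`, both hypothetical tools hold the same description and tags for `orders`, so someone working in either one sees the context without switching tools.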
Here are a few examples of how real companies are using active metadata for DataOps:
A leading job site improved collaboration across the enterprise by unifying context in a business glossary. They unlocked embedded collaboration by activating metadata from Atlan into Looker, ensuring that the team always has metadata and context at their fingertips without endless tool-switching.
A multi-billion dollar investment firm used personalization to make its data mesh (which is based on the idea of domain-based personalization) a reality. The firm receives data from 10,000+ external data feeds. They use granular Personas and Purposes, along with detailed data and metadata policy management, to serve the right products to the right users in the right domains.
A leading media analytics platform needed to track and delete data regularly, but this was done manually. As a result, some data was not deleted, which led to contractual breaches and legal and compliance costs for the firm. They used active metadata to automate the deletion process across petabytes of data and improve data-contract compliance.
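As a rough illustration of the last example, the sketch below shows how retention metadata could drive automated deletion. It is not the firm's actual implementation: the `received_on` and `retention_days` fields, the feed names, and the `find_expired` helper are all assumptions made up for this example.

```python
from datetime import date, timedelta


def find_expired(assets, today):
    """Return names of assets whose (hypothetical) retention window has passed."""
    expired = []
    for asset in assets:
        # The deadline comes from metadata, not from someone's memory.
        deadline = asset["received_on"] + timedelta(days=asset["retention_days"])
        if today > deadline:
            expired.append(asset["name"])
    return expired


# Illustrative metadata records; the fields and values are invented.
assets = [
    {"name": "feed_a", "received_on": date(2022, 1, 1), "retention_days": 90},
    {"name": "feed_b", "received_on": date(2022, 9, 1), "retention_days": 90},
]

expired = find_expired(assets, today=date(2022, 10, 10))
```

In a real setup, the list of expired assets would feed a scheduled deletion job instead of a manual checklist, which is the part that removes the risk of missed deletions.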
Read more about DataOps and active metadata in the full ebook. ➡️
📚 More from my reading list
How to build a data product that won’t come back to haunt you by Marian Nodine
Upgrading data warehouse infrastructure at Airbnb by Ronnie Zhu
Data engineering excellency at Netflix by Xinran Waibel
Forget about algorithms and models — Learn how to solve problems first by Ari Joury, PhD
[ICYMI] Forrester changed the way they think about data catalogs, and here’s what you need to know
I’ve also added some more resources to my data stack reading list. If you haven’t checked out the list yet, you can find and bookmark it here.
💙 Next week @ dbt Coalesce in New Orleans
Super excited to be with one of our favorite communities at dbt Coalesce next week to talk about our fav topic: real metadata use cases with Atlan and dbt Labs 🧡
Metadata has been one of the hottest topics of this year! Traditionally, metadata has been used only for a few use cases like static and passive data catalogs. However, metadata can be key to unlocking a variety of use cases, acting as the glue that binds together our diverse modern data stacks (e.g. dbt, Snowflake, Fivetran, Databricks, Looker, and Tableau) and diverse teams (e.g. analytics engineers, data analysts, data engineers, and business users)!
But beyond the buzz, in this session, we’ll cover how some amazing data teams are driving real value by leveraging active metadata for use cases like column-level lineage, programmatic governance, root cause analysis, proactive upstream alerts, dynamic pipeline optimization, cost optimization, data deprecation, automated quality control, metrics management, and more.
If you are at dbt Coalesce in New Orleans this year, we’d love to meet you. Come say hi at booth 208, or join us at one of the fun events we’re going to! :)
P.S. Liked reading this edition of the newsletter? I would love it if you could take a moment and share it with your friends on social.