Agile in data, data team career ladder, and more
✨ Spotlight: Is agile in data the same as agile in software engineering?
Welcome to this week's edition of the ✨ Metadata Weekly ✨ newsletter.
Every week I bring you my recommended reads and share my (meta?) thoughts on everything metadata! ✨ If you’re new here, subscribe to the newsletter and get the latest from the world of metadata and the modern data stack.
Last week, the reverse ETL space was on fire. 🔥 Airbyte announced their acquisition of Grouparoo, an open-source reverse ETL tool, and Rudderstack announced their reverse ETL solution. All in just one week!
Has the bundling season officially begun? I am excited to see how it unfolds for the modern data stack in 2022!
⏩ Is agile in data the same as agile in software engineering?
For the last few years, there’s been a ton of talk about “agile” in data science — how can we take software engineering practices and infuse them into the way data teams work?
Agile, for example, revolutionized the software engineering world and the way software is built and shipped today. But... can we simply take practices like agile that work in software engineering and adopt them in our data teams?
Here’s the thing: software work and data work are fundamentally different.
In software, humans create the code. In data, however, we usually cannot control the raw data that we are working with. This makes scoping a project hard, because it's tough to estimate how much transformation and cleaning effort it is going to take.
Also, in software, you almost always know what you’re going to build. This makes it easier to measure the “velocity” of execution. However, in data, many problems are exploratory. For example, why is our ARR number dropping? That’s an exploratory analytics project — almost a research problem! It’s really difficult to scope a problem like that on day zero.
This is why I caution data leaders against blindly picking “software engineering” practices like agile and implementing them in data teams. There is a TON that we can learn from software engineering, but in my mind, there’s a lot that data teams can learn from other kinds of functions and practices as well.
Here are some teams and functions that I think we have a ton to learn from as we figure out how data teams should work:
→ Product and design teams: One of the biggest challenges in data is how easy it is to forget our end users in data projects. We’ve all heard about the pretty dashboard that never gets used. I believe that there’s a ton that we can learn from product and design teams about how to bring user research and problem scoping into the fundamental DNA of every project.
→ Manufacturing and supply chain teams: These aren’t the sexiest teams of the 2020s, but if you think about it, data pipelines are similar to manufacturing pipelines. That means there is a ton that we can learn from methodologies like lean manufacturing.
→ Sales enablement and SalesOps: Today, running a sales team without the sales enablement and SalesOps functions would be unthinkable! I think that a lot of the practices they drive can inspire better data practices: creating “data 360s” like customer 360s, setting up a data workspace much like the CRM that serves as a collaborative workspace for sales, and adding a “data enablement” function to help everyone speak the same language and create common best practices.
What are the other teams and practices that you think data teams should be learning from? I’d love to hear your thoughts!
P.S. I speak about some of these ideas in my recent chat with Eric Dodds and Kostas Pardalis for the Data Stack Show here.
💙 Special Shoutout: Data Team Career Ladder by Elizabeth Simion
Shoutout to Elizabeth Simion, Director of Data and Analytics at EZ Texting, for sharing her “Data Team Career Ladder” sheet on the dbt Labs Slack channel to help other data leaders looking to scale their data teams. You can check out the public copy here.
The data function is so new and we’re just building out career paths, so resources like this are super valuable to data leaders. I’d love to see more resources like this open-sourced by data practitioners and leaders!
📘 More from my reading list
The end of Big Data by Benn Stancil
The metrics layer has growing up to do by Amit Prakash
Product Sketch: The New Corp Times by Stephen Bailey
Data mesh is not a magic fix to build data product by Nicolas Claudon
Netflix has innovated with data by James Nanscawen
I’ve also added some more resources to my data stack reading list. If you haven’t checked out the list yet, you can find and bookmark it here.
🗓 Event Invite: Metrics Layers & Metadata
Metrics layers have been all the rage in 2022! dbt Labs incorporated a metrics layer into their product, and just last week Transform open-sourced MetricFlow (their metric creation framework).
The metrics layer is just forming in the data stack, and I’m personally super excited about it. So, I’m thrilled to announce our next edition of the Great Data Debate with the two most prolific product thinkers in the space: Drew Banin (Co-founder of dbt Labs) and Nick Handel (Co-founder of Transform).
Some questions I’m personally super excited about deep diving into:
WTF is the metrics layer?
What do most people get wrong about the metrics layer?
What are some real-life use cases of the metrics layer?
What is the interplay between the metrics layer, metadata layer, and other layers of the data stack?
How should data leaders think about incorporating the metrics layer into their data stacks?
Why shouldn’t you adopt a metrics layer?
Atlan’s Great Data Debates are closed-door sessions that are highly interactive and super fun! Sign up here and join us on the 21st of April for fiery takes on all things metrics layer and metadata. 🔥
I'll see you next week with more interesting updates from the modern data stack! 👋 Meanwhile, you can subscribe to the newsletter on Substack and connect with me on LinkedIn here.