dbt Coalesce 2025: What 14,000 Practitioners Learned
Beyond tools and trends, Coalesce 2025 revealed the blueprint for data systems that can think, learn, and earn trust.
Coalesce 2025 in Las Vegas marked a turning point. With more than 2,000 attendees in person and 14,000 online, the event made one thing clear: the future of data and AI belongs to organizations that treat context as infrastructure, not documentation.
This wasn’t a product launch; it was a line in the sand. It marked the transition from an era focused on moving data faster to one centered on understanding data better. The modern data stack is being reimagined as the foundation for AI-ready, interoperable systems where metadata, lineage, and semantics are as essential as the pipelines themselves.
Perhaps nothing captured this shift more clearly than the headline announcement:
dbt Labs and Fivetran coming together under one roof.
Why the Merger Matters
The announcement that dbt Labs is merging with Fivetran, creating a company with nearly $600M in combined ARR, was perhaps the biggest highlight of Coalesce 2025.
As both companies noted in their joint statement, this move brings data movement and transformation under one roof. The logic behind the merger is undeniable: simplifying the path from source to model, from extraction to transformation.
For years, those layers have lived in isolation. Data moved without shared understanding, and transformations ran without inherited context. Each handoff introduced drift: lineage broke, definitions diverged, and trust decayed.
The merger does not erase that fragmentation overnight, but it signals a broader industry correction: an acknowledgment that efficiency without continuity is incomplete.
The next evolution is not just about unifying tools; it is about unifying context. It is about ensuring that meaning travels with data as it moves through every system and persona. The future of the stack is not just about connected infrastructure; it is about connected understanding.
From Pipelines to Intelligence: The Fusion Engine
For years, the data industry optimized for speed. The new dbt Fusion engine, built in Rust, asks a different question:
what if systems could understand what actually needs to run?
State-aware orchestration means the system detects which models have changed and skips recomputation for everything else. It’s not just about efficiency; it’s about systems that understand their own state.
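To make the idea concrete, here is a minimal sketch of change detection by fingerprinting. It is not how Fusion is implemented; the project layout, the model names, and the helper functions `fingerprint`, `snapshot`, and `models_to_run` are all assumptions made for illustration. Each model's fingerprint covers its own SQL plus the fingerprints of its upstream models, so an edit propagates to everything downstream while untouched branches are skipped.

```python
import hashlib

# Hypothetical project: model name -> (SQL text, list of upstream model names).
project = {
    "stg_orders": ("select * from raw.orders", []),
    "stg_customers": ("select * from raw.customers", []),
    "fct_revenue": (
        "select customer_id, sum(amount) as revenue from stg_orders group by 1",
        ["stg_orders"],
    ),
}

def fingerprint(name, models, cache):
    """Hash a model's SQL together with the fingerprints of its upstream models."""
    if name in cache:
        return cache[name]
    sql, parents = models[name]
    digest = hashlib.sha256(sql.encode("utf-8"))
    for parent in sorted(parents):
        digest.update(fingerprint(parent, models, cache).encode("utf-8"))
    cache[name] = digest.hexdigest()
    return cache[name]

def snapshot(models):
    """Record the fingerprint of every model, e.g. after a successful run."""
    cache = {}
    return {name: fingerprint(name, models, cache) for name in models}

def models_to_run(models, previous_state):
    """Return models whose fingerprint differs from the recorded state."""
    current = snapshot(models)
    return sorted(n for n in models if current[n] != previous_state.get(n))

state = snapshot(project)

# Edit one staging model; its downstream model must rerun, the other is skipped.
project["stg_orders"] = ("select * from raw.orders where not is_test", [])
changed = models_to_run(project, state)
print(changed)  # ['fct_revenue', 'stg_orders']
```

This sketch only looks at code changes. A real orchestrator would presumably also account for whether the underlying source data has changed since the last run.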
The results are dramatic. EQT Group reported 60% faster runtimes and 45% lower warehouse costs after adopting Fusion. dbt’s internal projects saw 64% total cost savings through a combination of intelligent reuse and optimization.
But the real innovation goes deeper. Fusion understands your entire project as a connected graph, tracking dependencies, validating syntax before execution, and automatically updating downstream references when you change a column. When you rename a field, it updates every reference and flags potential breaks. When you validate a model, it works across Snowflake, Databricks, BigQuery, and ClickHouse simultaneously. The pipeline hasn’t just gotten faster; it has become perceptive, moving us from pipelines to perception.
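The rename scenario is easier to picture with a small example. The sketch below assumes a hand-written column-level lineage map (the models, columns, and the `downstream_references` helper are hypothetical, not part of dbt) and walks it breadth-first to list every column that would need updating or review if a field were renamed.

```python
from collections import deque

# Hypothetical column-level lineage:
# (model, column) -> list of downstream (model, column) pairs that read it.
lineage = {
    ("stg_orders", "amount"): [
        ("fct_revenue", "revenue"),
        ("fct_order_stats", "avg_amount"),
    ],
    ("fct_revenue", "revenue"): [("rpt_finance", "total_revenue")],
    ("stg_orders", "order_id"): [("fct_order_stats", "order_count")],
}

def downstream_references(lineage, start):
    """Breadth-first walk returning every column that depends on `start`."""
    seen = {start}
    queue = deque([start])
    affected = []
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected

# Which columns are touched if stg_orders.amount is renamed?
affected = downstream_references(lineage, ("stg_orders", "amount"))
for model, column in affected:
    print(f"{model}.{column}")
```

In practice the lineage map would be derived by parsing the SQL of every model rather than written by hand; the traversal itself stays the same.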
Metadata as Operating System
Here’s the fundamental shift: metadata is no longer passive documentation sitting in a catalog. It is the active layer that enables everything else.
When Norges Bank Investment Management deployed conversational analytics through the dbt MCP server, they saw a tenfold increase in catalog adoption compared to their previous system. This improvement did not happen because the interface was more attractive, but because the metadata was alive: continuously updated, deeply connected, and operationally integrated.
Metadata now describes not just what exists, but why it exists, how it relates to other assets, who trusts it, and what it means in business terms. It has become the operating system that allows AI to reason about data rather than guess.
This evolution from documentation to infrastructure came to life with dbt Agents.
Governed AI: Intelligence Within Guardrails
Coalesce 2025 introduced dbt Agents, not as experimental tools but as production-ready systems operating within governance boundaries. Four specialized agents now work across the analytics lifecycle:
Developer Agent explains logic, validates changes, and refactors code directly in VS Code or dbt Studio, making development faster and safer.
Discovery Agent helps users find the right datasets with clear explanations of what makes them trustworthy, surfacing lineage and definitions automatically.
Observability Agent monitors jobs, diagnoses root causes, and proposes fixes, dramatically reducing manual remediation work.
Analyst Agent answers natural language questions using governed metrics, generating and executing SQL from your dbt models with full lineage and definitions attached.
The critical distinction? These aren’t LLMs guessing at SQL. They’re agents working with structured context: the models, metrics, tests, and lineage your team has already defined and governs. When Norges Bank deployed hundreds of production agents, they worked reliably because answers were grounded in governed context, not probabilistic generation.
This is what makes AI trustworthy at scale: structured context that flows from your existing governance framework directly to the agents that need it.
Standardizing Semantics: MetricFlow Goes Open Source
Perhaps the most significant long-term announcement: MetricFlow is now fully open source under Apache 2.0. It powers the Open Semantic Interchange initiative alongside Snowflake, Salesforce, Atlan, and several other well-known names in the space.
Without standardized semantics, every tool interprets “revenue” differently. AI agents guess at calculations, dashboards display conflicting numbers, and teams spend weeks reconciling metrics that should match by definition.
MetricFlow makes metrics deterministic. When you define “monthly recurring revenue” once, every dashboard, agent, and analysis uses the exact same calculation. Not approximations, not variations, but the same governed logic.
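A toy version of "define once, compile everywhere" might look like the sketch below. This is not MetricFlow's API or its YAML spec; the `Metric` dataclass, the `compile_metric` function, and the table and column names are invented for illustration. The point is simply that every consumer calls the same compiler on the same definition, so the SQL cannot drift between tools.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Metric:
    """A hypothetical, minimal metric definition (not MetricFlow's schema)."""
    name: str
    expression: str          # aggregation, e.g. "sum(amount)"
    table: str
    time_column: str
    filters: tuple = ()      # SQL predicates applied to every query

def compile_metric(metric, grain="month"):
    """Render a metric definition into SQL; same input always gives same output."""
    where = ""
    if metric.filters:
        where = "\nwhere " + " and ".join(metric.filters)
    return (
        f"select date_trunc('{grain}', {metric.time_column}) as period, "
        f"{metric.expression} as {metric.name}\n"
        f"from {metric.table}{where}\n"
        f"group by 1"
    )

# The metric is defined exactly once...
mrr = Metric(
    name="monthly_recurring_revenue",
    expression="sum(amount)",
    table="fct_subscriptions",
    time_column="billed_at",
    filters=("is_recurring",),
)

# ...and every consumer compiles from that single definition.
dashboard_sql = compile_metric(mrr)
agent_sql = compile_metric(mrr)
print(dashboard_sql)
```

Because the definition is immutable and the compiler has no hidden state, a dashboard and an AI agent asking for the same metric at the same grain receive identical SQL.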
This is context as infrastructure in action: shared definitions that flow freely across systems, enabling AI to reason consistently rather than hallucinate variably. When metrics are standardized and portable, trust scales across tools and clouds.
The Open Semantic Interchange initiative takes this further by creating vendor-neutral standards for semantic data exchange. As Josh Klahr from Snowflake noted, fragmented data definitions remain one of the largest barriers to AI adoption. MetricFlow provides the engine that compiles metric definitions into provably correct calculations that every system can rely on.
What This Means for Data Leaders
The through line across every announcement, from the merger to Fusion, dbt Agents, and MetricFlow, is context as an architectural principle.
Context as infrastructure means treating lineage, semantics, and governance with the same rigor as the compute layer. When metadata is live and connected, systems self-optimize. When definitions are standardized and portable, AI becomes trustworthy.
Context as a competitive advantage means that organizations able to surface the right context at the right time, for both humans and machines, will move faster than those drowning in static documentation.
This is not theoretical. It shows up in measurable outcomes:
Warehouse costs dropping 45–64% through intelligent orchestration.
AI-enriched catalogs showing 10x higher adoption, translating to four times faster insight delivery and measurable improvements in revenue-impacting decisions.
Production AI agents that teams actually trust because they operate within governed boundaries.
Three Strategic Questions
As you assess your data strategy:
Is your metadata live infrastructure or static documentation? If lineage, ownership, and definitions live in wikis instead of operational systems, you are building AI on quicksand.
Does your context flow across tools? In multi-engine, multi-platform environments, proprietary semantic layers often trap meaning within individual systems. Your context layer should transcend those boundaries and work everywhere your data does.
Can you measure trust at scale? The future belongs to systems that explain their outputs as clearly as they generate them. Observability, lineage, and auditability are not optional features; they are prerequisites for AI that organizations will actually use.
The Bigger Picture
Coalesce 2025 was not defined by its announcements, although they mattered. It was defined by a collective realization that the question is no longer how fast we move data, but how well our systems understand it.
Data systems are becoming reflexive. They understand their own state, adapt to change, and operate within governed boundaries. The modern stack is not just open; it is aware.
The tools exist. The standards are emerging. The agents demonstrate that governed AI works. The open-source commitment ensures portability.
The only question is execution. Will your organization lead this transformation or follow it?
The shift from pipelines to perception is not coming; it is already here. And at the center of this transformation lies context: the connective layer that allows data, systems, and intelligence to work in harmony.
Organizations that treat context as infrastructure will move beyond efficiency toward understanding. They will build systems that not only execute but also explain. And in doing so, they will define the next era of data intelligence.
The Insight Index: Your Weekly Data & AI Digest
Top resources and recommended reads, carefully curated for you.
The Human Fabric of Knowledge Architecture — François Rosselet
The House That Data Built: A Blueprint for Modern Architecture — Alex Posar
Why Asking AI “What Would You Ask Me?” Gets You Better Results — Donabel Santos
Why Your Agentic Enterprise Needs an Ontology — Vin Vashishta
That’s all for this edition. Stay curious, stay contextual. See you all soon in the next one!
About Metadata Weekly
Metadata Weekly isn’t just a newsletter. It’s a shared community space where practitioners, builders, and thinkers come together to share stories, lessons, and ideas about what truly matters in the world of data and AI: trust, governance, context, discovery, and the human side of doing meaningful work.
Our goal is simple: to create a space that cuts through the noise and celebrates the people behind the amazing things that are happening in the data & AI domain.
Whether you’re solving messy problems, experimenting with AI, or figuring out how to make data more human, Metadata Weekly is your place to learn, reflect, and connect.
Got something on your mind? We’d love to hear from you. Hit Reply!





Exciting times!
Exactly. This whole idea of context as infrastructure rather than just documentation really hits home; you nailed why understanding data better is the real game now. It kinda reminds me of my Pilates practice: you can do all the fancy moves, but without a strong core and knowing how everything connects, it's just not sustainable or truly effective.