Your Data Catalog Doesn't Govern Anything

March 22, 2026 • By Billy Newport

I've built data platforms where regulatory compliance wasn't optional—where getting it wrong meant fines, audit failures, and career-limiting events. In that world, you learn fast that there are two fundamentally different approaches to governance.

One approach watches what happened and tells you about it. The other prevents it from happening in the first place.

Most enterprises today are investing heavily in the first approach and calling it governance. It isn't.

The Catalog Trap

Data catalogs are useful tools. They crawl your databases, index your tables, profile your data quality, and generate dashboards. Some even do lineage. Products like DataHub, Amundsen, and Atlan give you visibility into what exists, where it came from, and whether it looks healthy.

But here's what a catalog actually does: it observes the system after the fact and reports what it found. If someone pushed a breaking schema change, the catalog tells you about it—after your downstream consumers are already broken. If PII data ended up in a region it shouldn't be in, the catalog flags it—after it's already there. If an unauthorized team modified a dataset they don't own, the catalog might notice—eventually.

This is procedural governance. It's compliance reporting. It has the same relationship to actual governance that a security camera has to a lock.

You wouldn't secure a building with cameras alone. Why are we securing data platforms that way?

Structural Governance: Prevention Over Detection

Structural governance means non-compliance is architecturally impossible. Not unlikely. Not flagged-for-review. Impossible.

DataSurface implements this by making governance a property of the system itself, not a layer observing the system from outside. Every change to the data ecosystem flows through a model that is validated before it can be merged:

  • Ownership is enforced, not documented. Every datastore, dataset, and workspace maps to an owning Git repository. If a team modifies something they don't own, the merge is blocked. Not flagged. Blocked.
  • Schema compatibility is checked before deployment, not after breakage. Column removals, type narrowing, and other breaking changes are caught at PR time. Your consumers never see the problem because the problem never reaches them.
  • Data classification policies are merge gates, not dashboard widgets. If a governance zone says PII data must stay in EU infrastructure, that rule is checked every time the model changes. A dataset that violates the policy can't be deployed—it can't even be merged into the model.
  • Deprecation requires bilateral agreement. A producer can't silently remove data that consumers depend on. Consumers must explicitly acknowledge the deprecation or the model won't build.
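
To make the four rules above concrete, here is a minimal sketch of what a merge gate could look like. It is not the DataSurface API: the `Dataset` and `Model` classes, the `validate` function, the `ALLOWED_WIDENINGS` set, and the "eu-" region prefix are all assumptions made up for illustration. The point is only that each rule is a check run against the proposed model before merge, and any violation blocks it.

```python
from dataclasses import dataclass, field

# All names below are hypothetical, chosen for illustration only.


@dataclass
class Dataset:
    name: str
    owner_repo: str            # Git repository that owns this dataset
    columns: dict[str, str]    # column name -> type name
    region: str
    contains_pii: bool = False
    deprecated: bool = False


@dataclass
class Model:
    datasets: dict[str, Dataset] = field(default_factory=dict)
    consumers: dict[str, set[str]] = field(default_factory=dict)  # dataset -> consumers
    acks: set[tuple[str, str]] = field(default_factory=set)       # (dataset, consumer)


# Assumed set of safe type changes; any other type change counts as breaking.
ALLOWED_WIDENINGS = {("int32", "int64"), ("float32", "float64")}


def validate(current: Model, proposed: Model, author_repo: str) -> list[str]:
    """Return the violations that would block a merge; empty means mergeable."""
    errors: list[str] = []
    for name, new in proposed.datasets.items():
        old = current.datasets.get(name)

        # Ownership and schema checks apply to existing datasets that changed.
        if old is not None and old != new:
            # Ownership: only the owning repository may change a dataset.
            if old.owner_repo != author_repo:
                errors.append(f"{name}: owned by {old.owner_repo}, not {author_repo}")
            # Schema compatibility: no removed columns, no unsafe type changes.
            for col, old_type in old.columns.items():
                new_type = new.columns.get(col)
                if new_type is None:
                    errors.append(f"{name}: column {col} removed")
                elif new_type != old_type and (old_type, new_type) not in ALLOWED_WIDENINGS:
                    errors.append(f"{name}: column {col} changed {old_type} -> {new_type}")

        # Classification policy, checked on every model change: PII stays in the EU.
        if new.contains_pii and not new.region.startswith("eu-"):
            errors.append(f"{name}: PII not allowed in region {new.region}")

        # Deprecation: every consumer must have acknowledged it.
        if new.deprecated:
            for consumer in sorted(proposed.consumers.get(name, set())):
                if (name, consumer) not in proposed.acks:
                    errors.append(f"{name}: {consumer} has not acknowledged deprecation")
    return errors


current = Model(datasets={"customers": Dataset(
    "customers", "repo-a", {"id": "int64", "email": "string"}, "eu-west-1", contains_pii=True)})
proposed = Model(datasets={"customers": Dataset(
    "customers", "repo-a", {"id": "int64"}, "us-east-1", contains_pii=True)})

# A non-owner drops a column and moves PII out of the EU: three blocking errors.
for problem in validate(current, proposed, author_repo="repo-b"):
    print(problem)
```

In a real setup a function like this would run as a required check on the pull request, so a non-empty result means the merge button stays disabled.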

The difference isn't subtle. A catalog tells you "this PII data is in the wrong region." Structural governance makes it so the PII data could never have gotten there.

The Control Plane, Not the Platform

Here's the part that surprises people: DataSurface doesn't replace your existing data platform. It sits on top of it.

Running Snowflake? Fine. DataSurface governs what goes into Snowflake, how it gets there, who can consume it, and what rules apply. Your Snowflake investment is protected. Your catalog can still crawl Snowflake and generate its dashboards. Nothing changes for those tools—except now the data arriving in Snowflake is already governed, already compliant, already consistent.

This is the control plane pattern. The platform underneath handles storage and compute. The control plane above handles intent, governance, and orchestration. They're different jobs.

And because the control plane is platform-independent, you're not locked in. When the business needs a different engine—Trino/Iceberg for analytics, PostgreSQL for operational workloads, a second cloud provider for regulatory reasons—you add it to the model. The producers and consumers don't change. The governance doesn't change. The pipelines don't break. It's a configuration change, not a rewrite.
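
A rough sketch of why this ends up being a configuration change: if producers declare intent in engine-neutral terms, supporting another engine means adding a mapping, not touching the declarations. Everything here is hypothetical (`DatasetIntent`, `ENGINE_TYPES`, `render_ddl`), and the type mappings are simplified examples, not a statement of how DataSurface translates schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetIntent:
    """Engine-neutral declaration of a dataset (hypothetical type)."""
    name: str
    columns: tuple[tuple[str, str], ...]   # (column name, logical type)


# Logical type -> engine type. Illustrative mappings only.
ENGINE_TYPES: dict[str, dict[str, str]] = {
    "snowflake": {"string": "VARCHAR", "int64": "NUMBER(19,0)"},
    "postgres": {"string": "TEXT", "int64": "BIGINT"},
}


def render_ddl(intent: DatasetIntent, engine: str) -> str:
    """Render the same intent as DDL for one target engine."""
    types = ENGINE_TYPES[engine]
    cols = ", ".join(f"{col} {types[logical]}" for col, logical in intent.columns)
    return f"CREATE TABLE {intent.name} ({cols})"


orders = DatasetIntent("orders", (("id", "int64"), ("note", "string")))

# Adding an engine is one more entry; the intent above is untouched.
ENGINE_TYPES["trino"] = {"string": "VARCHAR", "int64": "BIGINT"}

for engine in ENGINE_TYPES:
    print(engine, "->", render_ddl(orders, engine))
```

The producer's declaration of `orders` never mentions an engine, which is what keeps the move from one platform to another out of the producer's and consumer's code.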

Compare that to building directly on a vendor platform. Every pipeline is written to that vendor's APIs. Every governance check assumes that vendor's metadata format. When you need to move—and you will—you're rewriting everything. That's not technical debt. That's technical mortgage.

Near-Zero Pipeline Cost

The other thing structural governance enables is radical simplification of pipeline creation.

When governance is procedural, every new pipeline is a project. Someone writes the ingestion code, someone else reviews it, someone configures the quality checks, someone updates the catalog, someone documents the lineage. Each pipeline is bespoke. Each one accumulates its own technical debt.

When governance is structural, a new pipeline is a model change. You declare the datastore—source, schema, ingestion strategy. You declare the workspace—what data it needs, retention, latency, regulatory constraints. The platform builds the pipeline, provisions the jobs, and enforces all the governance rules you've already defined. The marginal cost of the next pipeline approaches zero.
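
As a sketch of what "a new pipeline is a model change" can mean in practice, the example below derives ingestion jobs from declarations alone. The `Datastore` and `Workspace` classes, the `plan_jobs` function, the "cdc" strategy name, and the rule of taking the tightest latency and longest retention across consumers are all assumptions for illustration, not the DataSurface model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    name: str
    source: str                  # where the data comes from
    datasets: tuple[str, ...]
    ingestion: str               # assumed strategy names: "snapshot" or "cdc"


@dataclass(frozen=True)
class Workspace:
    name: str
    needs: tuple[tuple[str, str], ...]   # (datastore name, dataset name)
    retention_days: int
    max_latency_minutes: int


def plan_jobs(stores: list[Datastore], workspaces: list[Workspace]) -> list[dict]:
    """Derive one ingestion job per dataset that some workspace needs."""
    by_name = {s.name: s for s in stores}
    jobs: dict[tuple[str, str], dict] = {}
    for ws in workspaces:
        for store_name, dataset in ws.needs:
            store = by_name.get(store_name)
            if store is None or dataset not in store.datasets:
                raise ValueError(f"{ws.name} needs undeclared {store_name}.{dataset}")
            job = jobs.setdefault((store_name, dataset), {
                "dataset": f"{store_name}.{dataset}",
                "source": store.source,
                "ingestion": store.ingestion,
                "interval_minutes": ws.max_latency_minutes,
                "retention_days": ws.retention_days,
                "consumers": [],
            })
            # The tightest latency and the longest retention win.
            job["interval_minutes"] = min(job["interval_minutes"], ws.max_latency_minutes)
            job["retention_days"] = max(job["retention_days"], ws.retention_days)
            job["consumers"].append(ws.name)
    return [jobs[key] for key in sorted(jobs)]


stores = [Datastore("sales", "sales-db", ("orders", "customers"), "cdc")]
workspaces = [
    Workspace("finance", (("sales", "orders"),), retention_days=365, max_latency_minutes=60),
    Workspace("ops", (("sales", "orders"),), retention_days=30, max_latency_minutes=15),
]

for job in plan_jobs(stores, workspaces):
    print(job)
```

Adding the next consumer is one more `Workspace` entry. Nobody writes ingestion code for it, which is where the near-zero marginal cost comes from.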

For enterprises running hydration projects—migrating legacy systems into modern data platforms—this is the difference between a multi-year program and a model definition exercise.

Catalogs Still Have a Job

I'm not arguing against data catalogs. They're genuinely useful for discovery, data quality profiling, search, and operational dashboards. A good catalog helps people find data and understand what's available.

But a catalog is not a governance system. It's a reporting system. And treating it as governance creates a dangerous illusion: the dashboards are green, the reports look clean, and meanwhile the underlying system has no structural guarantees about any of it.

Use your catalog for what it's good at. Use a control plane for governance. They're complementary—but they are not the same thing.

The Question to Ask

Next time a vendor tells you their product "provides governance," ask one question:

Does it prevent non-compliance, or does it report non-compliance?

If the answer is "report," you're buying a camera. You still need the lock.