Resolving circular dependencies in spatial models

This page shows you how to eliminate a "Found a cycle" compilation error in a geometry-heavy dbt project by replacing bidirectional ref() relationships with a single materialized anchor layer that both models read from.

Circular dependencies can stall an entire pipeline once spatial transformations enter a modern data stack. Spatial predicates such as ST_Intersects, ST_DWithin, and ST_Contains, along with nearest-neighbor lookups (the <-> operator in PostGIS), invite bidirectional lookups, recursive catchment calculations, and cross-referenced geometry columns. Modeled naively, these patterns make Model A ref() Model B while Model B ref()s Model A. The dbt compiler detects the cycle and halts, and the stall cascades into every downstream reporting mart. The fix is part architectural refactor, part materialization strategy, sequenced so CI keeps running while you do it.

When to use this approach

Reach for the anchor-layer decoupling described here when:

The cycle is structural, not accidental. Both models genuinely need the same geometry: a bidirectional spatial join or a metric that feeds back into a filter. If the cycle is a stray ref() left in by mistake, just delete it; you do not need this pattern. Tracing the directional flow is easier once you have read the parent reference on spatial model dependency graphs.
You want the shared geometry computed once. If the same ST_Transform / ST_Centroid / ST_Envelope work is being repeated in two places, materializing it in an anchor model is both the cycle fix and a performance win. Prefer a recursive CTE inside one model only when the loop is graph-traversal (network routing) rather than a model-to-model reference.
The loop spans engines or schemas. When one side runs against PostGIS and the other validates in CI, the engine choice itself can be the constraint — see choosing the right spatial adapter before committing to a refactor that bakes in one warehouse.

Prerequisites

dbt-core ≥ 1.7 for stable tags, enabled, and incremental unique_key semantics used below.
A spatial-capable adapter: dbt-postgres against PostGIS ≥ 3.1 (see setting up PostGIS with dbt), or dbt-duckdb ≥ 1.7 with the DuckDB spatial extension integration loaded for local and CI validation.
Grants to run CREATE INDEX / DDL in the target schema, since the anchor table is index-backed.
A single agreed source SRID (this guide uses EPSG:4326) so the anchor materializes one canonical geometry column.
Connection values supplied through dbt’s env_var() pattern, never hard-coded.

Step-by-step instructions

Step 0 (optional): Triage to unblock CI first

If the cycle is currently breaking every CI run and you are at risk of a freshness SLA breach, restore execution before refactoring. Temporarily break the cycle at the compilation layer without touching production data:

# Map the exact ref() chain causing the deadlock
dbt ls --select +fct_parcels

# Unblock the run by excluding the downstream side of the cycle
dbt build --exclude dim_zoning

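Verify the run now compiles: an exit code of 0 from dbt build confirms the bypass works. dbt validates the full project graph before it applies selectors, so if the cycle error persists with --exclude in place, temporarily set enabled: false on dim_zoning and comment out the ref() that points at it. Do not leave either bypass in place: a skipped model means stale downstream data, and spatial cycles compound planner inefficiencies that eventually hit warehouse memory limits. Treat this as triage only and continue to Step 1.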

Step 1: Map and diagnose the dependency loop

Generate the lineage graph and trace the exact ref() chain where two spatial models each depend on the other.

dbt docs generate
dbt ls --select +fct_parcels+ --output path

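Document the spatial predicate forcing the bidirectional requirement. Typically Model A reads a geometry column from Model B while Model B reads an aggregated spatial metric from Model A. In this guide's example, fct_parcels calls ref('dim_zoning') for the geometry and dim_zoning calls ref('fct_parcels') for the metric, which closes the loop fct_parcels --> dim_zoning --> fct_parcels.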

Verification: running dbt compile at this point still fails with Found a cycle: model.fct_parcels --> model.dim_zoning --> model.fct_parcels. That error message naming both nodes is your confirmation you have found the right loop.

Step 2: Decouple with an intermediate anchor layer

Replace the direct cross-references with one dedicated staging model that materializes the shared geometry and a precomputed spatial index once. This collapses the bidirectional relationship into a strict unidirectional flow.

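-- models/staging/stg_spatial_anchor.sql
{{ config(materialized='table', tags=['spatial', 'anchor']) }}

-- Reproject once, then derive every shared column from the same EPSG:4326 geometry
WITH valid_parcels AS (
    SELECT
        parcel_id,
        ST_Transform(geom, 4326) AS geom_4326
    FROM {{ ref('raw_parcels') }}
    WHERE ST_IsValid(geom)  -- keep invalid geometries out of every downstream join
)

SELECT
    parcel_id,
    geom_4326,
    ST_Centroid(geom_4326) AS centroid,
    ST_Envelope(geom_4326) AS bbox,
    TRUE                   AS is_valid  -- always true after the filter above
FROM valid_parcels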

Both dependent models now ref('stg_spatial_anchor') instead of referencing each other. Materializing this layer as a table (or incremental) forces the engine to compute bounding boxes and centroids one time, sharply reducing planner overhead on every downstream join.

Verification: dbt run --select stg_spatial_anchor should build a table whose row count equals the count of valid input geometries — confirm with SELECT count(*) FROM stg_spatial_anchor against the warehouse.
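
The row-count check above can also be kept as a dbt singular test so it runs on every build. This is a minimal sketch, assuming the anchor filters on ST_IsValid(geom) exactly as in Step 2; the file name is hypothetical.

```sql
-- tests/assert_anchor_row_count_parity.sql  (hypothetical file name)
-- A dbt singular test fails when the query returns at least one row.
WITH expected AS (
    -- Valid geometries in the source, using the same filter as the anchor
    SELECT count(*) AS n
    FROM {{ ref('raw_parcels') }}
    WHERE ST_IsValid(geom)
),

actual AS (
    SELECT count(*) AS n
    FROM {{ ref('stg_spatial_anchor') }}
)

SELECT
    expected.n AS expected_rows,
    actual.n   AS actual_rows
FROM expected
CROSS JOIN actual
WHERE expected.n <> actual.n
```

Run it with dbt test --select stg_spatial_anchor; the test passes when the query returns nothing.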

Step 3: Refactor the bidirectional spatial logic

With the anchor in place, rewrite the downstream models to consume precomputed metrics rather than recalculating geometry on the fly. Split the original feedback loop into two unidirectional models:

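-- models/marts/fct_parcels.sql  (Model A: proximity calculation)
{{ config(materialized='table', tags=['spatial']) }}

SELECT
    a.parcel_id,
    -- Cast to geography so 500 and distance_m are meters, not degrees of EPSG:4326
    -- (assumes station_geom is also stored in EPSG:4326)
    ST_DWithin(a.centroid::geography, s.station_geom::geography, 500) AS near_transit,
    ST_Distance(a.centroid::geography, s.station_geom::geography)     AS distance_m
FROM {{ ref('stg_spatial_anchor') }} a
CROSS JOIN LATERAL (
    -- Nearest station per parcel; <-> can use a GiST index on station_geom
    SELECT t.station_geom
    FROM {{ ref('dim_transit') }} t
    ORDER BY a.centroid <-> t.station_geom
    LIMIT 1
) s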

-- models/marts/dim_zoning.sql  (Model B: business logic)
{{ config(materialized='table', tags=['spatial']) }}

SELECT
    a.parcel_id,
    a.bbox,
    z.zone_code,
    f.near_transit
FROM {{ ref('stg_spatial_anchor') }} a          -- reads geometry from the anchor
JOIN {{ ref('fct_parcels') }} f USING (parcel_id) -- reads the metric, one direction only
JOIN {{ ref('zoning_polygons') }} z               -- assumes zoning_polygons is a model or seed; use source() for a raw table
  ON ST_Intersects(a.geom_4326, z.geom)

Model B now reads the metric from Model A and the geometry from the anchor — it never calls back into Model A’s geometry, so no edge points upstream.

Verification: dbt compile succeeds with no cycle error. The PostGIS documentation on spatial indexing with GiST explains why a bounding-box index on the anchor keeps these ST_Intersects joins off a full-table scan.

Step 4: Validate and optimize execution plans

dbt compile            # DAG must now be acyclic
dbt build --select +dim_zoning

Inspect the warehouse execution plan and confirm:

Index scans, not nested-loop seq scans on the spatial predicates — EXPLAIN ANALYZE should show a GiST index scan against the anchor.
The anchor is not recomputed unnecessarily by downstream selects.
Tests pass: add dbt test assertions for ST_IsValid, row-count parity, and referential integrity on parcel_id.
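
The validity and referential-integrity assertions in the checklist above could be written as one singular test. This is a sketch under the assumption that every parcel_id in fct_parcels should exist in the anchor; the file name and the failure labels are illustrative.

```sql
-- tests/assert_spatial_integrity.sql  (hypothetical file name)
-- Each returned row is one failure; an empty result means the test passes.

-- 1. Geometries that are invalid after reprojection
SELECT
    parcel_id,
    'invalid geometry in anchor' AS failure
FROM {{ ref('stg_spatial_anchor') }}
WHERE NOT ST_IsValid(geom_4326)

UNION ALL

-- 2. Parcels in the mart that have no matching row in the anchor
SELECT
    f.parcel_id,
    'parcel missing from anchor' AS failure
FROM {{ ref('fct_parcels') }} f
LEFT JOIN {{ ref('stg_spatial_anchor') }} a
  ON f.parcel_id = a.parcel_id
WHERE a.parcel_id IS NULL
```

Keeping both checks in one file is a convenience choice; splitting them gives clearer failure names in the dbt test output.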

Verification: a green dbt build plus EXPLAIN output naming the GiST index confirms the cycle is gone and the rewrite did not regress performance.
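
To inspect the plan described above, a query along these lines can be run directly against PostGIS. It is a sketch only: the analytics schema is a placeholder for your target schema, and zoning_polygons is assumed to live in the same schema.

```sql
-- Run in psql or another SQL client, not through dbt.
-- "analytics" is a placeholder for the schema dbt builds into.
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    a.parcel_id,
    z.zone_code
FROM analytics.stg_spatial_anchor AS a
JOIN analytics.zoning_polygons AS z
  ON ST_Intersects(a.geom_4326, z.geom);
```

In the output, look for an Index Scan node that names a GiST index. A Seq Scan on both sides of the join usually means the index is missing or the table statistics are stale. Because ANALYZE executes the query, run it on a representative subset if the tables are large.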

Configuration reference

Parameter	Where	Accepted values	Default	Spatial-specific note
`materialized`	anchor `config()`	`table`, `incremental`	`view`	Use `table`/`incremental` — a `view` recomputes geometry per query and re-introduces the planner cost you are trying to remove.
`tags`	model `config()`	list of strings	`[]`	Tag anchors (`['spatial','anchor']`) so `dbt build --select tag:spatial` isolates the geometry subgraph.
`enabled`	model `config()`	`true`, `false`	`true`	Only for temporary triage (Step 0); never a permanent cycle fix.
`unique_key`	incremental anchor	column name(s)	—	Required when the anchor is `incremental`, e.g. `parcel_id`, to avoid duplicate geometry rows.
`+post-hook`	`dbt_project.yml`	SQL string	—	`CREATE INDEX ... USING GIST (geom_4326)` then `ANALYZE` so downstream `ST_Intersects` uses an index scan.
project SRID	macro/var	integer EPSG code	—	Pin one source SRID (4326 here) so the anchor materializes a single canonical geometry column.

Gotchas & edge cases

The cycle is a graph traversal, not a model loop. Network routing or recursive catchments need a recursive CTE inside one model; do not try to model each hop as a separate dbt node — you will recreate the cycle. The anchor pattern is for model-to-model references only.
Enforce layering to prevent recurrence. Adopt a bronze/silver/gold topology: raw geometries cleaned in staging, relationships computed in intermediate models, metrics aggregated in marts. Never let a downstream model write back to an upstream geometry source. This convention is covered in Core Fundamentals & Architecture for dbt Geospatial.
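Dynamic ref() from macros hides cycles. When a macro generates ref() calls conditionally, dbt may not capture every edge when it parses the project, so a loop can stay hidden until a later compile. Add a -- depends_on: {{ ref('model_name') }} comment at the top of the model so the parser records the edge explicitly.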
SRID drift across the loop. If Model A and Model B held geometry in different SRIDs, folding them into one anchor with a single ST_Transform is what actually fixes silent metric corruption — re-check distance/area results after the refactor.
enabled: false left in production. An excluded model silently stops refreshing. Grep your project for stray enabled: false before merging the permanent fix.
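
For the SRID drift case above, one way to catch mixed inputs before they reach the anchor is a singular test on the source geometry. This sketch assumes the agreed source SRID is 4326, as stated in the prerequisites; the file name is hypothetical.

```sql
-- tests/assert_single_source_srid.sql  (hypothetical file name)
-- Returns every parcel whose geometry is not in the agreed source SRID.
-- An SRID of 0 (unknown) is also reported, since it cannot be reprojected.
SELECT
    parcel_id,
    ST_SRID(geom) AS srid
FROM {{ ref('raw_parcels') }}
WHERE ST_SRID(geom) <> 4326
```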

FAQ

Why does dbt report "Found a cycle" only at compile, not when I save the model?

dbt resolves ref() edges by parsing the whole project into a graph, which happens at dbt compile/dbt build time — not on file save. A bidirectional ref() is valid SQL in isolation, so the cycle only surfaces once dbt tries to topologically sort the DAG. Run dbt compile after every ref() change to catch it early.

Can I break the cycle with an ephemeral model instead of a table?

Not usefully. An ephemeral anchor would still remove the cycle, because what breaks the loop is the direction of the ref() edges rather than the materialization. But an ephemeral model is inlined as a CTE into each consumer, so the geometry is recomputed inside every consumer and there is no physical table to carry a GiST index. Materialize the anchor as table or incremental so the shared geometry exists once and can be indexed.

The loop is a recursive catchment calculation. Does the anchor pattern apply?

Not directly. Genuine graph traversal — walking a road network or nested catchments — belongs in a single model using a recursive CTE, where the recursion lives inside one node and never crosses a ref() boundary. Use the anchor layer only when two distinct models reference each other’s geometry.
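
As an illustration of keeping the recursion inside one node, the sketch below walks a road network with a PostgreSQL recursive CTE. The model name, the stg_road_edges edge list (integer from_node and to_node columns), the start node 1, and the 10-hop cap are all assumptions made for the example.

```sql
-- models/intermediate/int_network_reach.sql  (hypothetical model name)
{{ config(materialized='table', tags=['spatial']) }}

WITH RECURSIVE reach AS (
    -- Anchor term: edges leaving the start node (node 1 is a placeholder)
    SELECT
        e.to_node,
        1 AS hops,
        ARRAY[e.from_node, e.to_node] AS path
    FROM {{ ref('stg_road_edges') }} e      -- hypothetical edge list
    WHERE e.from_node = 1

    UNION ALL

    -- Recursive term: extend each path by one edge, skipping visited nodes
    SELECT
        e.to_node,
        r.hops + 1,
        r.path || e.to_node
    FROM reach r
    JOIN {{ ref('stg_road_edges') }} e
      ON e.from_node = r.to_node
    WHERE e.to_node <> ALL (r.path)
      AND r.hops < 10                       -- hard stop so the walk always terminates
)

SELECT
    to_node   AS node_id,
    min(hops) AS min_hops
FROM reach
GROUP BY to_node
```

The model references stg_road_edges twice but never references itself, so dbt sees a single node with one upstream dependency and no cycle.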

Why is dim_zoning still slow after I removed the cycle?

Removing the cycle fixes compilation, not the planner. If ST_Intersects against the anchor still runs a sequential scan, the GiST index is missing or stale. Add a +post-hook that runs CREATE INDEX ... USING GIST on the anchor’s geometry column followed by ANALYZE, then re-check the plan.
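
The hook described above can also be declared in the anchor model's own config block instead of dbt_project.yml. This is a sketch of that in-model form for the table materialization on dbt-postgres.

```sql
-- models/staging/stg_spatial_anchor.sql  (config header only; the SELECT is unchanged)
{{ config(
    materialized='table',
    tags=['spatial', 'anchor'],
    post_hook=[
        "CREATE INDEX ON {{ this }} USING GIST (geom_4326)",
        "ANALYZE {{ this }}"
    ]
) }}
```

The index is left unnamed so PostgreSQL generates a name for it. A fixed name combined with IF NOT EXISTS may collide with the index on the previous table while dbt is swapping relations, which would skip the new index. On an incremental anchor this hook would add another index on every run, so it suits the table materialization only.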

Does materializing the anchor as a table cost more storage than it saves?

Usually no. The anchor holds one geometry column plus a centroid and bbox per row, and it replaces repeated ST_Transform/ST_Centroid compute in two or more downstream models. For very large geometry sets, switch the anchor to incremental with a unique_key so only changed parcels recompute.
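
An incremental version of the anchor might look like the sketch below. It assumes raw_parcels carries a timestamp column named updated_at that changes whenever a parcel's geometry changes; that column is hypothetical and not part of the models shown earlier.

```sql
-- models/staging/stg_spatial_anchor.sql  (incremental variant, sketch)
{{ config(
    materialized='incremental',
    unique_key='parcel_id',
    tags=['spatial', 'anchor']
) }}

WITH changed_parcels AS (
    SELECT
        parcel_id,
        updated_at,                          -- hypothetical change-tracking column
        ST_Transform(geom, 4326) AS geom_4326
    FROM {{ ref('raw_parcels') }}
    WHERE ST_IsValid(geom)
    {% if is_incremental() %}
      -- On incremental runs, only reprocess parcels changed since the last build
      AND updated_at > (
          SELECT coalesce(max(updated_at), '-infinity'::timestamp)
          FROM {{ this }}
      )
    {% endif %}
)

SELECT
    parcel_id,
    updated_at,
    geom_4326,
    ST_Centroid(geom_4326) AS centroid,
    ST_Envelope(geom_4326) AS bbox
FROM changed_parcels
```

With unique_key set, rows whose parcel_id already exists are replaced rather than duplicated. Parcels deleted from the source, or whose geometry later becomes invalid, are not removed by this filter, so schedule an occasional dbt run --full-refresh if that matters.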

Related guides

Spatial Model Dependency Graphs: how to order geometry-heavy models into a reliable acyclic DAG in the first place.
Choosing the Right Spatial Adapter: pick the engine whose constraints shape your dependency layering.
Setting Up PostGIS with dbt: provisioning and GiST index hooks the anchor layer relies on.
