Building Custom Spatial Macros

Spatial operations rarely conform to a single SQL template. Analytics engineers, GIS backend developers, and spatial data scientists work across warehouses whose spatial extensions diverge in syntax, default units, and precision handling — ST_DWithin means metres in one engine and degrees in another, ST_Transform exists in three dialects with three signatures, and a geometry tagged SRID = 0 silently corrupts every downstream join. Hardcoding those calls throughout a dependency graph is how spatial pipelines accumulate maintenance debt and silent inaccuracy.

This guide covers building custom spatial macros: compile-time abstractions that standardize coordinate reference system resolution, enforce accuracy thresholds, and harmonize execution across heterogeneous engines. It is the implementation companion to Advanced Spatial Macros & UDF Patterns — where that overview sets the architectural boundaries, this page delivers the macro code, the dispatch wiring, and the tests that keep the abstraction honest.

Prerequisites

Before writing dispatched spatial macros, confirm the following are in place:

dbt Core 1.5+ (or dbt Cloud on a recent version) — adapter.dispatch and namespace search-order config are assumed.
A spatial adapter installed and verified: dbt-postgres against a PostGIS 3.x database, and/or dbt-duckdb with the spatial extension for lightweight runs. Adapter trade-offs are weighed in choosing the right spatial adapter; local-engine setup is covered in configuring the DuckDB spatial extension in dbt projects.
Database grants: CREATE on the target schema plus permission to run the spatial functions (USAGE on the PostGIS extension schema where it is isolated).
Environment variables wired through dbt’s env_var() so the same macros run identically in CI and production — never hardcode hosts, credentials, or the default SRID.
A canonical project SRID decided up front (commonly EPSG:4326 for storage, a metric CRS for distance work). Every macro defaults to it.

# dbt_project.yml — project-wide defaults the macros read
vars:
  project_srid: 4326          # canonical storage CRS
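  metric_srid: 3857           # planar CRS for metre-based math (Web Mercator inflates distances away from the equator; prefer a local projected CRS where accuracy matters)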
  spatial_tolerance: 0.0001   # acceptable floating-point drift in tests

dispatch:
  - macro_namespace: dbt_geo
    search_order: ['my_project', 'dbt_geo']

Where Custom Macros Fit in the Spatial DAG

A spatial macro is not a convenience wrapper; it is the contract enforced at every layer boundary. Raw geometry enters at staging with an unknown or inconsistent SRID, passes through a normalization macro that sets and transforms it to the canonical CRS, and only then reaches the intermediate joins and mart aggregations that assume a clean, single-projection input. The same dispatched interface that runs against PostGIS in production runs against the DuckDB spatial extension in CI, so a pull request validates spatial logic without a live PostGIS instance.

Configuration Walkthrough

Dispatched macros need three things configured: a namespace search order (above), an on-run-start hook that fails fast if the spatial extension is missing, and per-target profiles so the same code resolves to the right engine.

# dbt_project.yml — verify the spatial extension before any model runs
on-run-start:
  - "{{ assert_spatial_ready() }}"

-- macros/assert_spatial_ready.sql
{% macro assert_spatial_ready() %}
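  {# Only the PostGIS target is checked here; other targets pass through untouched. #}
  {% if execute and target.type == 'postgres' %}
    {# If PostGIS is absent, run_query itself raises a database error, which also stops the run. #}
    {% set result = run_query("SELECT PostGIS_Version()") %}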
    {% if result.rows | length == 0 %}
      {{ exceptions.raise_compiler_error("PostGIS not available on target " ~ target.name) }}
    {% endif %}
  {% endif %}
{% endmacro %}
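The hook above only inspects PostGIS targets. A comparable check for the CI engine could look like the sketch below. The macro name assert_duckdb_spatial_loaded is hypothetical, and the sketch assumes DuckDB's duckdb_extensions() table function reports a loaded flag for the spatial extension.

```sql
-- macros/assert_duckdb_spatial_loaded.sql (hypothetical companion macro)
{% macro assert_duckdb_spatial_loaded() %}
  {% if execute and target.type == 'duckdb' %}
    {% set result = run_query("SELECT loaded FROM duckdb_extensions() WHERE extension_name = 'spatial'") %}
    {% set loaded = result.columns[0].values() %}
    {# No row, or loaded = false, both mean spatial functions are unavailable #}
    {% if loaded | length == 0 or not loaded[0] %}
      {{ exceptions.raise_compiler_error("DuckDB spatial extension not loaded on target " ~ target.name) }}
    {% endif %}
  {% endif %}
{% endmacro %}
```

If adopted, it would be registered as a second on-run-start entry next to assert_spatial_ready().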

# profiles.yml — same macros, two engines, env_var() everywhere
dbt_geo:
  target: ci
  outputs:
    ci:
      type: duckdb
      path: "{{ env_var('DBT_DUCKDB_PATH', 'ci.duckdb') }}"
      extensions: ["spatial"]
    prod:
      type: postgres
      host: "{{ env_var('DBT_PG_HOST') }}"
      user: "{{ env_var('DBT_PG_USER') }}"
      password: "{{ env_var('DBT_PG_PASSWORD') }}"
      dbname: "{{ env_var('DBT_PG_DATABASE') }}"
      schema: "{{ env_var('DBT_PG_SCHEMA', 'analytics') }}"

Core Implementation

Deterministic CRS handling and validation gates

The foundational macro establishes a deterministic coordinate reference system workflow. Spatial precision degrades fast when planar and geographic systems mix, so a production macro accepts raw geometry, inspects its SRID, and applies a transformation only when strictly necessary — avoiding both floating-point drift and redundant CPU cycles.

-- macros/ensure_crs.sql
{% macro ensure_crs(geometry_col, target_srid=none, source_srid=3857) %}
  {%- set target_srid = target_srid or var('project_srid', 4326) -%}
  CASE
    WHEN ST_SRID({{ geometry_col }}) IS NULL OR ST_SRID({{ geometry_col }}) = 0 THEN
      ST_Transform(ST_SetSRID({{ geometry_col }}, {{ source_srid }}), {{ target_srid }})
    WHEN ST_SRID({{ geometry_col }}) != {{ target_srid }} THEN
      ST_Transform({{ geometry_col }}, {{ target_srid }})
    ELSE
      {{ geometry_col }}
  END
{% endmacro %}

Calling {{ ensure_crs('geom') }} in a staging model is a strict gate: every geometry entering a proximity calculation or spatial partition adheres to one documented projection. Note the order of operations in the unknown-SRID branch: ST_SetSRID labels the geometry with its true source SRID first, then ST_Transform reprojects it. Reversing those fails, because ST_Transform has no source CRS to project from (PostGIS raises an "unknown (0) SRID" error rather than guessing). The source_srid default of 3857 is only correct when untagged inputs really are Web Mercator, so pass the true value for any source that differs. This same canonical-CRS discipline is applied at scale in automating CRS conversions in dbt pipelines.
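As a usage sketch, a staging model might call the macro like this. The source raw.parcels and the parcel_id column are hypothetical; only ensure_crs comes from the macro above.

```sql
-- models/staging/stg_parcels.sql (hypothetical model; source and column names are assumptions)
SELECT
    parcel_id,
    -- untagged rows are treated as EPSG:3857 unless source_srid is overridden
    {{ ensure_crs('geom') }} AS geom
FROM {{ source('raw', 'parcels') }}
```

For a feed known to arrive in another CRS, passing it explicitly, for example {{ ensure_crs('geom', source_srid=27700) }}, keeps that assumption visible in the model.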

Cross-engine abstraction with adapter.dispatch

Warehouses implement the same spatial concept with different signatures. PostGIS wants ST_DWithin(geom1, geom2, distance_in_meters) on a geography cast; Snowflake wants TO_GEOGRAPHY wrapping; BigQuery’s ST_DWITHIN operates strictly on spherical geography. The adapter.dispatch mechanism isolates that vendor syntax behind one interface — choosing an adapter is discussed in choosing the right spatial adapter.

-- macros/spatial_proximity_check.sql
{% macro spatial_proximity_check(geom_a, geom_b, distance, units='meters') %}
  {{ return(adapter.dispatch('spatial_proximity_check', 'dbt_geo')(geom_a, geom_b, distance, units)) }}
{% endmacro %}

{% macro default__spatial_proximity_check(geom_a, geom_b, distance, units) %}
  -- PostGIS: geography cast gives true metre distances on the spheroid
  ST_DWithin({{ geom_a }}::geography, {{ geom_b }}::geography, {{ distance }})
{% endmacro %}

{% macro snowflake__spatial_proximity_check(geom_a, geom_b, distance, units) %}
  ST_DWITHIN(TO_GEOGRAPHY({{ geom_a }}), TO_GEOGRAPHY({{ geom_b }}), {{ distance }})
{% endmacro %}

{% macro duckdb__spatial_proximity_check(geom_a, geom_b, distance, units) %}
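  -- DuckDB spatial works in projected units; reproject to a metric CRS first.
  -- Assumes inputs are EPSG:4326. EPSG:3857 distances are only approximately metres
  -- away from the equator, and if coordinates are stored lon/lat, DuckDB's ST_Transform
  -- may need always_xy := true to avoid swapped axes.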
  ST_DWithin(
    ST_Transform({{ geom_a }}, 'EPSG:4326', 'EPSG:3857'),
    ST_Transform({{ geom_b }}, 'EPSG:4326', 'EPSG:3857'),
    {{ distance }}
  )
{% endmacro %}

With the function signature abstracted, models swap compute engines without refactoring. This pattern is decisive when optimizing proximity joins, because the macro can inject a bounding-box pre-filter per adapter. For distance filtering specifically, the standardized interface in writing reusable ST_DWithin macros in dbt keeps tolerance thresholds consistent across development, staging, and production.
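A model consuming the dispatched predicate might look like the following sketch. The model and column names (stg_stores, stg_customers, store_id, customer_id) are hypothetical, and the 500 is a distance in metres chosen only for illustration.

```sql
-- models/intermediate/int_customers_near_stores.sql (hypothetical model and ref names)
SELECT
    s.store_id,
    c.customer_id
FROM {{ ref('stg_stores') }} AS s
JOIN {{ ref('stg_customers') }} AS c
    -- compiles to the engine-specific predicate chosen by adapter.dispatch
    ON {{ spatial_proximity_check('s.geom', 'c.geom', 500) }}
```

The model text is identical on every target; only the compiled SQL differs.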

Validation & Testing

A spatial macro is only as reliable as the tests around it. Extend dbt’s generic-test framework to assert topology validity, SRID consistency, and distance tolerances rather than trusting the abstraction blindly. First, verify the environment itself:

-- Confirm the extension and version in an analysis or scratch model
SELECT PostGIS_Version();              -- PostGIS path
-- SELECT * FROM duckdb_extensions() WHERE extension_name = 'spatial';  -- DuckDB path

Then assert that geometries leaving a macro-driven model are valid and carry the canonical SRID:

# models/staging/_staging.yml
models:
  - name: stg_parcels
    tests:
      # Declared at model level: on a column, dbt_utils prepends the column name to the expression
      - dbt_utils.expression_is_true:
          expression: "ST_SRID(geom) = 4326"
    columns:
      - name: geom
        tests:
          - is_valid_geometry          # custom generic test wrapping ST_IsValid

-- macros/tests/is_valid_geometry.sql
{% test is_valid_geometry(model, column_name) %}
  SELECT {{ column_name }}
  FROM {{ model }}
  WHERE NOT ST_IsValid({{ column_name }})
{% endtest %}

A passing test returns zero rows. Seed a handful of deterministic WKT fixtures — interior, boundary, and degenerate (self-intersecting) cases — so the suite asserts against known-correct topology instead of live production data. Distance macros should be checked against a hand-computed expected value within var('spatial_tolerance').
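A hand-computed check can be written as a singular test. The sketch below uses a 3-4-5 right triangle in unitless planar coordinates, so the expected distance is exactly 5. The file name is hypothetical, and plain ST_Distance stands in for whichever distance macro is under test.

```sql
-- tests/assert_planar_distance_fixture.sql (hypothetical singular test)
WITH fixture AS (
    SELECT
        ST_GeomFromText('POINT(0 0)') AS a,
        ST_GeomFromText('POINT(3 4)') AS b,
        5.0 AS expected_distance      -- hypotenuse of a 3-4-5 triangle
)
SELECT *
FROM fixture
-- any row returned means the result drifted beyond the project tolerance
WHERE ABS(ST_Distance(a, b) - expected_distance) > {{ var('spatial_tolerance', 0.0001) }}
```

Because the fixture is inline WKT, the same test runs against either engine without touching production data.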

Advanced Patterns

Compile-time EPSG routing. For multi-source ingestion, extend ensure_crs to read EPSG codes from a metadata seed and route each source through the correct source_srid at compile time, eliminating per-source copy-paste.
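One way the routing could be sketched is a lookup macro backed by a seed. Everything named here is an assumption: the helper source_srid_for, the seed source_srid_map, and its source_name and srid columns.

```sql
-- macros/source_srid_for.sql (hypothetical helper; seed and column names are assumptions)
{% macro source_srid_for(source_name) %}
  {# ref() sits outside the execute guard so dbt records the seed dependency at parse time #}
  {% set srid_map = ref('source_srid_map') %}
  {% if not execute %}
    {{ return(var('project_srid', 4326)) }}
  {% endif %}
  {% set lookup %}
    SELECT srid FROM {{ srid_map }} WHERE source_name = '{{ source_name }}'
  {% endset %}
  {% set srids = run_query(lookup).columns[0].values() %}
  {% if srids | length != 1 %}
    {{ exceptions.raise_compiler_error("Expected exactly one SRID for source " ~ source_name) }}
  {% endif %}
  {{ return(srids[0] | int) }}
{% endmacro %}
```

A staging model would then call {{ ensure_crs('geom', source_srid=source_srid_for('county_a')) }}, where county_a is a placeholder source label. The seed has to be built before the models that depend on it.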

Incremental spatial models. Wrap heavy reprojection in an incremental materialization so only changed geometries are transformed on each run. Pair the macro with a unique_key and an is_incremental() predicate to keep reruns idempotent.
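A minimal sketch of that pairing, assuming a hypothetical raw.parcels source with a parcel_id key and an updated_at change timestamp:

```sql
-- models/staging/stg_parcels_incremental.sql (hypothetical model; column names are assumptions)
{{ config(materialized='incremental', unique_key='parcel_id') }}

SELECT
    parcel_id,
    {{ ensure_crs('geom') }} AS geom,
    updated_at
FROM {{ source('raw', 'parcels') }}
{% if is_incremental() %}
-- only reproject rows that changed since the last successful run
WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}
```

On the first run the predicate is skipped and every geometry is transformed; later runs touch only new or updated rows.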

Index injection. Macros are natural injection points for planner control. A post_hook macro can build or rebuild the GiST index after a full refresh; bounding-box pre-filters (&& in PostGIS) can be added inside the dispatch implementation to cut the candidate set before precise geometry math runs. Deeper planner steering is covered in index hints for spatial queries.
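As an illustration of the post-hook variant on a PostGIS target, the sketch below creates an unnamed GiST index so PostgreSQL picks a unique name on each rebuild. The model and column names are hypothetical.

```sql
-- models/marts/parcels_indexed.sql (hypothetical model; PostGIS target assumed)
{{ config(
    materialized='table',
    post_hook="CREATE INDEX ON {{ this }} USING GIST (geom)"
) }}

SELECT
    parcel_id,
    geom
FROM {{ ref('stg_parcels') }}
```

The index name is left out on purpose. With a table materialization, a fixed name combined with IF NOT EXISTS can collide with the index on the previous copy of the table and leave the new table unindexed.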

Engine-portable validation. Because the same dispatched macros resolve against DuckDB in CI and PostGIS in production, the full transformation chain documented in Geometry Transformation Pipelines can be validated on every pull request before promotion.

Troubleshooting

Symptom	Root cause	Fix
`ST_Transform` fails with an unknown-SRID error	Geometry tagged `SRID = 0`; `ST_Transform` ran before `ST_SetSRID`	Label the source SRID with `ST_SetSRID` before transforming (as in the macro above)
`adapter.dispatch` raises “macro not found”	`search_order` namespace missing from `dbt_project.yml`, or implementation named without the `default__`/`adapter__` prefix	Add the `dispatch` config and name implementations `default__`, `duckdb__`, etc.
Distances wrong by orders of magnitude	Planar `geometry` math used where metric distance expected	Cast to `::geography` (PostGIS) or reproject to a metric CRS before `ST_DWithin`
Macro passes in CI, fails in production	Function-signature drift between DuckDB and PostGIS	Route every dialect difference through a dispatched implementation, not shared SQL
`is_valid_geometry` test fails on ingest	Self-intersecting or unclosed input polygons	Run `ST_MakeValid` inside the staging macro; quarantine rows that still fail
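The last fix in the table could be sketched as a small repair macro. The name repair_geometry is hypothetical; ST_IsValid and ST_MakeValid are assumed to be available on the target engine.

```sql
-- macros/repair_geometry.sql (hypothetical helper name)
{% macro repair_geometry(geometry_col) %}
  CASE
    WHEN {{ geometry_col }} IS NULL THEN NULL
    WHEN ST_IsValid({{ geometry_col }}) THEN {{ geometry_col }}
    -- attempt a repair; rows still invalid afterwards are caught by is_valid_geometry
    ELSE ST_MakeValid({{ geometry_col }})
  END
{% endmacro %}
```

Repaired output can change geometry type (a polygon may come back as a multipolygon or collection), so downstream type assumptions are worth testing as well.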

FAQ

Why does my macro need both ST_SetSRID and ST_Transform?

ST_SetSRID only labels a geometry with a spatial reference identifier; it changes no coordinates. ST_Transform reprojects coordinates from the geometry’s current SRID to a target. A geometry tagged SRID = 0 has no known source, so transforming it fails (PostGIS raises an unknown-SRID error). Always ST_SetSRID to the true source first, then ST_Transform to the canonical SRID.
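The difference is easy to see in a PostGIS scratch query. The sample point is arbitrary and the query is only an illustration:

```sql
-- PostGIS scratch query; ST_MakePoint yields a geometry with SRID 0 (no CRS label)
WITH p AS (
    SELECT ST_MakePoint(-122.4, 37.8) AS geom
)
SELECT
    -- same coordinates, now tagged as EPSG:4326
    ST_AsText(ST_SetSRID(geom, 4326)) AS labelled,
    -- coordinates change: degrees become Web Mercator metres
    ST_AsText(ST_Transform(ST_SetSRID(geom, 4326), 3857)) AS reprojected
FROM p;
```

The first column still reads POINT(-122.4 37.8); only the second carries different numbers.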

When should I use adapter.dispatch instead of a Jinja if/else on target.type?

Use adapter.dispatch whenever a third party (or another team) might need to override your spatial logic for an engine you do not maintain — dispatch respects the search_order, so an override slots in without editing your macro. Reserve a plain if target.type branch for one-off, project-private differences that no one else will extend.
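To illustrate the override path, suppose dbt_geo is an installed package and my_project is the root project, as the search_order shown earlier implies. A root-project macro with an adapter prefix would then take precedence without any edit to the packaged code. The file path is hypothetical.

```sql
-- macros/overrides/spatial_proximity_check.sql in the root project (hypothetical path)
{% macro postgres__spatial_proximity_check(geom_a, geom_b, distance, units) %}
  {# use_spheroid = false trades a little accuracy for a faster spherical calculation #}
  ST_DWithin({{ geom_a }}::geography, {{ geom_b }}::geography, {{ distance }}, false)
{% endmacro %}
```

Models keep calling spatial_proximity_check(...) unchanged; dispatch resolves to this implementation on postgres targets because my_project is searched first.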

How do I test spatial macros without a live PostGIS database?

Run the same dispatched macros against the DuckDB spatial extension in CI. Seed deterministic WKT fixtures, materialize the macro-driven model, and assert validity and SRID with generic tests. Promote to PostGIS only after the DuckDB run is green.

Can one macro return both a geometry and a distance result?

Keep them separate. A CRS-normalization macro returns a geometry expression; a proximity macro returns a boolean predicate. Mixing concerns makes dispatch implementations harder to override per engine and complicates testing.

Related

Writing reusable ST_DWithin macros in dbt — a standardized distance-filter interface with consistent tolerances.
Geometry Transformation Pipelines — deterministic projection and topology chains the macros plug into.
Optimizing Proximity Joins — bounding-box and partition strategies the dispatch layer can inject.
Index Hints for Spatial Queries — steering the planner toward spatial access paths.
Choosing the Right Spatial Adapter — PostGIS vs. DuckDB vs. BigQuery trade-offs behind the dispatch.
