dbt Model Layers

TL;DR

Organize dbt models into 3 layers: Staging (1:1 with sources, rename/cast/clean), Intermediate (join/transform between staging and marts), and Marts (business-ready tables for end users). This is the dbt-recommended structure used by most teams.

Explain Like I'm 12

Think of a restaurant kitchen. Staging = the prep station where you wash vegetables, peel potatoes, dice onions. One station per ingredient, output is clean and consistent. Intermediate = the cooking station where you combine prepped ingredients into sauces, fillings, and bases. Marts = the plated dishes served to customers. Each dish (mart) is designed for a specific customer (business team).

The 3 Layers

Staging (stg_)

1:1 with source tables. Rename, cast, clean.

Intermediate (int_)

Join, aggregate, transform. Bridge staging to marts.

Marts (fct_ / dim_)

Business-ready. Consumed by BI tools and analysts.

Staging Layer

Purpose: Create a clean, consistent version of each raw source table. One staging model per source table.

Rule	Example
Prefix: `stg_`	`stg_stripe__payments.sql`
1:1 with source	One `stg_` model per raw table
Rename columns	`id` → `payment_id`
Cast types	`CAST(amount AS numeric)`
Filter deleted rows	`WHERE _fivetran_deleted = false`
Materialization	`view` (lightweight, always fresh)
Only ref'd by	Intermediate and mart models

-- models/staging/stripe/stg_stripe__payments.sql
WITH source AS (
    SELECT * FROM {{ source('stripe', 'payments') }}
),

renamed AS (
    SELECT
        id AS payment_id,
        order_id,
        amount::numeric(10,2) AS amount,
        currency,
        status,
        created_at::timestamp AS created_at
    FROM source
    WHERE _fivetran_deleted = false
)

SELECT * FROM renamed

Intermediate Layer

Purpose: Complex transformations that bridge staging to marts. Joins, aggregations, business logic that doesn't belong in staging or marts alone.

Rule	Example
Prefix: `int_`	`int_orders__pivoted_by_status.sql`
Only ref staging/other int models	Never ref raw source tables
Materialization	`ephemeral` or `view` (not exposed to end users)
Not queried by BI tools	Internal to dbt project only

-- models/intermediate/int_orders__joined.sql
SELECT
    o.order_id,
    o.customer_id,
    c.customer_name,
    c.customer_segment,
    o.order_date,
    o.status,
    p.amount AS payment_amount,
    p.currency
FROM {{ ref('stg_shopify__orders') }} o
LEFT JOIN {{ ref('stg_shopify__customers') }} c ON o.customer_id = c.customer_id
LEFT JOIN {{ ref('stg_stripe__payments') }} p ON o.order_id = p.order_id

When to create intermediate models: (1) A join is used by 2+ mart models (DRY). (2) A mart query is getting too long/complex. (3) You need to pivot, window-function, or aggregate before the final mart.

Marts Layer

Purpose: Business-ready tables consumed by BI tools, analysts, and other consumers. Organized by business domain (finance, marketing, product).

Type	Prefix	What It Contains
Fact	`fct_`	Events/transactions: orders, clicks, payments. Grain = one row per event.
Dimension	`dim_`	Entities/attributes: customers, products, regions. Grain = one row per entity.
Metric	`mrt_` or `rpt_`	Pre-aggregated report tables: daily_revenue, weekly_retention.

-- models/marts/finance/fct_orders.sql
{{
  config(materialized='table')
}}

SELECT
    order_id,
    customer_id,
    customer_name,
    customer_segment,
    order_date,
    status,
    payment_amount,
    currency,
    CASE WHEN status = 'returned' THEN -payment_amount ELSE payment_amount END AS net_amount
FROM {{ ref('int_orders__joined') }}

Directory structure:

models/
├── staging/
│   ├── stripe/
│   │   ├── _stripe__sources.yml
│   │   └── stg_stripe__payments.sql
│   └── shopify/
│       ├── _shopify__sources.yml
│       ├── stg_shopify__orders.sql
│       └── stg_shopify__customers.sql
├── intermediate/
│   └── int_orders__joined.sql
└── marts/
    ├── finance/
    │   ├── fct_orders.sql
    │   └── dim_customers.sql
    └── marketing/
        └── fct_campaigns.sql

Naming Conventions

Pattern	Example	When
`stg_{source}__{table}`	`stg_stripe__payments`	All staging models
`int_{entity}__{verb}`	`int_orders__pivoted_by_status`	Intermediate models
`fct_{event/process}`	`fct_orders`	Fact tables (events)
`dim_{entity}`	`dim_customers`	Dimension tables (entities)

Double underscore (__) separates source from table in staging names. Single underscore separates words. This makes it instantly clear where data comes from.

Test Yourself

Q: What belongs in a staging model?

Renaming columns, casting types, filtering deleted rows. One model per source table. No joins, no business logic. Materialized as views. The goal is a clean, consistent interface to raw data.

Q: When should you create an intermediate model?

When a join/transformation is used by 2+ mart models (DRY principle), when a mart query is too complex, or when you need aggregations/window functions before the final mart.

Q: What is the difference between fct_ and dim_ models?

fct_ = fact tables with events/transactions (orders, clicks). One row per event. Contain measures (amounts, counts). dim_ = dimension tables with entities/attributes (customers, products). One row per entity. Contain descriptive attributes used for filtering and grouping.

Interview Questions

Q: Walk me through how you'd structure a dbt project for an e-commerce company with Stripe (payments) and Shopify (orders, customers) as data sources.

Staging: stg_stripe__payments, stg_shopify__orders, stg_shopify__customers — rename, cast, filter each source. Views.
Intermediate: int_orders__joined (join orders + customers + payments). Ephemeral or view.
Marts: fct_orders (one row per order with all details), dim_customers (one row per customer with lifetime metrics), fct_daily_revenue (aggregated by day). Tables.
Tests: unique + not_null on all primary keys, relationships between staging models, accepted_values on status fields, custom test for negative amounts.

Q: Why materialize staging models as views and marts as tables?

Staging views are lightweight (no storage cost), always reflect latest source data, and build instantly. Marts as tables are pre-computed — BI tools query them directly and need fast response times. If staging models were tables, you'd double your storage and rebuild time for simple rename/cast operations.

Q: How do you decide between an incremental model and a full table rebuild?

Incremental when: (1) table is large (millions+ rows), (2) you can reliably identify new/changed rows (timestamp or ID), (3) full rebuild takes too long. Full table when: (1) data is small, (2) logic is complex and hard to make incremental, (3) source data can be updated retroactively (backfills). Default to table until rebuild time becomes a problem.

dbt Model Layers

The 3 Layers

Staging Layer

Intermediate Layer

Marts Layer

Naming Conventions

Test Yourself

Interview Questions

Related Topics