Microsoft Fabric Core Concepts

TL;DR

Fabric has 7 building blocks: OneLake (unified storage), Workspaces (project folders), Capacities (compute you pay for), Lakehouse (files + Delta tables), Warehouse (T-SQL analytics), Pipelines & Dataflows (ETL), and Power BI (reports). Everything lives in OneLake and shares one security model.

[Diagram: Fabric building blocks. OneLake at the center, connected to Lakehouse, Warehouse, Notebooks, Pipelines, and Power BI, all within a Workspace powered by Capacity Units.]

Explain Like I'm 12

Think of Fabric as a huge shared Google Drive for data. OneLake is the drive itself. Workspaces are folders — each team gets one. Capacities are like how much electricity the building gets (more capacity = more people can work at once). Inside each folder you have a Lakehouse (a messy inbox that auto-organizes), a Warehouse (a perfectly organized filing cabinet), Pipelines (mail carriers that deliver data), and Power BI (a printer that turns data into pretty charts).

Cheat Sheet

| Concept | What It Does | Plain English |
| --- | --- | --- |
| OneLake | Unified ADLS Gen2 storage for all Fabric items | “One shared hard drive for all your data” |
| Workspace | Container for items (lakehouses, warehouses, reports) | “A project folder with access controls” |
| Capacity | Compute resources measured in CUs (Capacity Units) | “The engine that powers everything — bigger engine = faster” |
| Lakehouse | Schema-on-read store with files + Delta tables + SQL endpoint | “Dump data in, query it with Spark or SQL” |
| Warehouse | Full T-SQL data warehouse with DML support | “A traditional SQL warehouse, fully managed” |
| Pipeline | ADF-like data orchestration (copy, transform, schedule) | “Automated data delivery — move data from A to B on a schedule” |
| Dataflow Gen2 | Power Query Online for no-code ETL into lakehouses | “Excel-like drag-and-drop data cleaning that writes to OneLake” |
| Notebook | PySpark/Spark SQL notebook for data engineering + ML | “Jupyter notebook connected to your lakehouse” |
| Semantic Model | Power BI data model (formerly “dataset”) with DAX measures | “The brain behind your Power BI reports” |
| Direct Lake | Power BI reads Delta tables in OneLake directly — no import/copy | “Reports load from the lake at import speed without importing” |

1. OneLake

OneLake is Fabric’s single storage layer. It’s built on Azure Data Lake Storage Gen2 but abstracted away so you never manage storage accounts. Every Fabric tenant gets exactly one OneLake — every workspace, lakehouse, and warehouse writes to it.

Key point: OneLake stores all tabular data as Delta Parquet (the Files section can hold any format). This open format means any engine (Spark, T-SQL, Power BI) can read the same data without copying.

Shortcuts

OneLake shortcuts let you reference external data (ADLS Gen2, S3, GCS, Dataverse) without moving it. The shortcut appears as a folder in your lakehouse, but the data stays where it is. This is Fabric’s answer to “I don’t want to copy terabytes.”

```sql
-- A table shortcut (here pointing at an S3 bucket) surfaces as a regular
-- table on the lakehouse SQL endpoint, so you query it like local data
SELECT * FROM lakehouse1.dbo.sales_2024
WHERE region = 'US';
```
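Shortcut data is also reachable through a OneLake URI, which any ABFS-capable reader (such as Spark) can load. A minimal sketch of the path layout, assuming placeholder workspace and item names:

```python
# Sketch of the OneLake URI pattern: one storage account ("onelake"), one
# filesystem per workspace, items addressed as <name>.<type>. All names
# below (sales_ws, lakehouse1, the shortcut folder) are placeholders.
def onelake_table_path(workspace: str, lakehouse: str, table: str) -> str:
    return (
        f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
        f"{lakehouse}.Lakehouse/Tables/{table}"
    )

path = onelake_table_path("sales_ws", "lakehouse1",
                          "shortcut_to_s3_bucket/sales_2024")
# In a notebook, a Spark reader could then load it:
# spark.read.format("delta").load(path)
```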

2. Workspaces

A workspace is a container for Fabric items — lakehouses, warehouses, notebooks, pipelines, reports. Think of it as a project folder with role-based access control (Admin, Member, Contributor, Viewer).

| Role | Can Do |
| --- | --- |
| Admin | Everything + manage workspace settings and access |
| Member | Create, edit, delete all items + share items |
| Contributor | Create, edit, and delete items, but cannot share or manage access |
| Viewer | View items only (read reports, browse data) |

Best practice: Use separate workspaces for Dev, Test, and Prod. Fabric supports deployment pipelines to promote items between them.
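The role hierarchy above can be summarized as a simple lookup. This is only an illustration mirroring the table — the action names and helper are not a Fabric API:

```python
# Toy permission model mirroring the workspace-role table (illustrative only;
# these action strings and this helper are not part of any Fabric SDK).
ROLE_ACTIONS = {
    "Viewer":      {"view"},
    "Contributor": {"view", "create", "edit", "delete"},
    "Member":      {"view", "create", "edit", "delete", "share"},
    "Admin":       {"view", "create", "edit", "delete", "share", "manage_access"},
}

def can(role: str, action: str) -> bool:
    """Return whether a workspace role permits an action."""
    return action in ROLE_ACTIONS.get(role, set())
```

For example, `can("Contributor", "share")` is `False`: Contributors can work on items but cannot share them or manage access.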

3. Capacities

A capacity is the compute engine that powers your Fabric workloads. You buy a capacity SKU (F2, F4, F8 … F2048), and all workspaces assigned to that capacity share its resources. Measured in CUs (Capacity Units).

| SKU | CUs | Typical Use |
| --- | --- | --- |
| F2 | 2 | Dev/test, small team exploration |
| F64 | 64 | Production workloads, department-level |
| F256+ | 256+ | Enterprise, heavy Spark/warehouse workloads |

Watch out: All Fabric experiences share the same capacity pool. A runaway Spark notebook can starve your Power BI reports. Use capacity settings to throttle or isolate workloads.
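The shared-pool risk is easy to see with rough arithmetic. A toy check, ignoring Fabric's real smoothing and bursting behavior (the workload numbers are made up):

```python
# Illustrative only: does summed CU demand fit the SKU? Real Fabric capacities
# smooth and burst usage, so this is a simplification for intuition.
SKU_CUS = {"F2": 2, "F4": 4, "F8": 8, "F64": 64, "F256": 256, "F2048": 2048}

def over_capacity(sku: str, workload_cus: list[float]) -> bool:
    """True when concurrent CU demand exceeds what the SKU provides."""
    return sum(workload_cus) > SKU_CUS[sku]

# A runaway Spark job (50 CUs) plus normal BI traffic (20 CUs) on an F64:
over_capacity("F64", [50, 20])   # True: 70 CUs demanded, 64 available
```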

4. Lakehouse

The lakehouse is Fabric’s core data store. It combines unstructured files and structured Delta tables in one place. You get:

  • Files section — drop CSVs, Parquet, JSON, images, anything
  • Tables section — Delta tables with ACID transactions
  • SQL endpoint — auto-generated read-only T-SQL interface to your Delta tables
  • Default semantic model — auto-created Power BI model for Direct Lake reporting
```python
# PySpark in a Fabric notebook — read a CSV from the Files section,
# write it to the Tables section as a managed Delta table
df = spark.read.csv("Files/raw/sales_2024.csv", header=True, inferSchema=True)
df.write.format("delta").mode("overwrite").saveAsTable("sales_2024")
```

5. Warehouse

The Fabric warehouse is a full T-SQL data warehouse. Unlike the lakehouse SQL endpoint (read-only), the warehouse supports INSERT, UPDATE, DELETE, stored procedures, and cross-database queries.

| Feature | Lakehouse SQL Endpoint | Warehouse |
| --- | --- | --- |
| Read (SELECT) | Yes | Yes |
| Write (INSERT/UPDATE/DELETE) | No (read-only) | Yes |
| Stored procedures | No | Yes |
| Cross-database queries | No | Yes |
| Storage format | Delta Parquet (OneLake) | Delta Parquet (OneLake) |
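A sketch of the DML the warehouse supports (the `dbo.orders` table and its columns are hypothetical). Running the same statements against a lakehouse SQL endpoint fails, because the endpoint is read-only:

```sql
-- Full T-SQL DML works in a warehouse (hypothetical table for illustration)
INSERT INTO dbo.orders (order_id, region, amount)
VALUES (1001, 'US', 250.00);

UPDATE dbo.orders SET amount = 275.00 WHERE order_id = 1001;

DELETE FROM dbo.orders WHERE order_id = 1001;
```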

6. Pipelines & Dataflows Gen2

Pipelines are Fabric’s orchestration engine (based on Azure Data Factory). Use them to copy data from 100+ sources, run notebooks, trigger dataflows, and schedule everything.

Dataflows Gen2 are Power Query Online mashups that land data directly into lakehouses or warehouses. They’re the no-code/low-code option for data transformation.

When to use which: Use Pipelines for orchestration and data movement at scale. Use Dataflows Gen2 for citizen-developer ETL (business users who know Excel/Power Query but not code). Use Notebooks for complex Spark transformations.

7. Power BI & Direct Lake

Power BI is built into Fabric as a first-class experience. The game-changer is Direct Lake mode — Power BI reads Delta tables directly from OneLake without importing a copy. You get import-level speed with DirectQuery-level freshness.

| Mode | How It Works | Speed | Data Freshness |
| --- | --- | --- | --- |
| Import | Copies data into Power BI model | Fast | Stale until refresh |
| DirectQuery | Queries source on every interaction | Slow | Always live |
| Direct Lake | Reads Delta Parquet from OneLake directly | Fast | Near real-time |

Auto-generated semantic model: Every lakehouse and warehouse automatically creates a default semantic model. You can build Power BI reports on it immediately — no manual dataset setup required.

8. Medallion Architecture

Fabric encourages the medallion architecture (Bronze → Silver → Gold) for organizing data in lakehouses:

| Layer | Purpose | Data Quality |
| --- | --- | --- |
| Bronze | Raw ingestion — land data as-is from sources | Raw, unvalidated |
| Silver | Cleaned, deduplicated, joined — “single source of truth” | Validated, conformed |
| Gold | Business-level aggregates — ready for reports and dashboards | Curated, optimized |

Implementation: Create separate lakehouses for each layer (e.g., bronze_lakehouse, silver_lakehouse, gold_lakehouse), or use schemas within a single lakehouse. Notebooks or pipelines move data between layers.
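The flow between layers can be sketched in plain Python. In Fabric this work would be Spark notebooks or pipelines writing Delta tables between lakehouses; the rows here are made up:

```python
# Minimal, pure-Python sketch of medallion-style refinement (toy data).
bronze = [  # Bronze: raw ingestion, duplicates and bad rows land as-is
    {"order_id": 1, "region": "US", "amount": 100.0},
    {"order_id": 1, "region": "US", "amount": 100.0},   # duplicate
    {"order_id": 2, "region": "EU", "amount": None},    # fails validation
    {"order_id": 3, "region": "US", "amount": 50.0},
]

# Silver: drop invalid rows, deduplicate on the business key.
silver = list({r["order_id"]: r for r in bronze
               if r["amount"] is not None}.values())

# Gold: business-level aggregate (revenue per region), ready for a report.
gold: dict[str, float] = {}
for r in silver:
    gold[r["region"]] = gold.get(r["region"], 0.0) + r["amount"]
```

With this toy input, `silver` keeps two valid, distinct orders and `gold` is `{"US": 150.0}`.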

Test Yourself

Q: What is OneLake and how is it different from ADLS Gen2?

OneLake is Fabric’s built-in unified storage layer, built on top of ADLS Gen2 but fully managed. You don’t create storage accounts — every tenant gets one OneLake automatically. All Fabric items (lakehouses, warehouses, reports) store data in OneLake as Delta Parquet.

Q: What’s the difference between a lakehouse SQL endpoint and a warehouse?

The lakehouse SQL endpoint is read-only — you can SELECT but not INSERT/UPDATE/DELETE. The warehouse supports full T-SQL DML (INSERT, UPDATE, DELETE), stored procedures, and cross-database queries. Both store data as Delta Parquet in OneLake.

Q: What is Direct Lake mode in Power BI?

Direct Lake lets Power BI read Delta tables directly from OneLake without importing data. It combines the speed of Import mode with the freshness of DirectQuery. No data copy needed — the report always reflects the latest Delta table state.

Q: When would you use a pipeline vs a Dataflow Gen2 vs a notebook?

Pipeline: orchestration, scheduling, and data movement at scale (copy activity). Dataflow Gen2: no-code ETL for business users using Power Query Online. Notebook: complex PySpark transformations, data science, or anything requiring code-level control.

Q: What are the three layers of the medallion architecture?

Bronze: raw data as-is from sources. Silver: cleaned, deduplicated, validated data. Gold: business-level aggregates and curated datasets ready for dashboards and reports.