Microsoft Fabric (DP-600)
1. What Microsoft Fabric Actually Is
After passing PL-300, the natural next step was Microsoft Fabric and the DP-600: Fabric Analytics Engineer Associate certification. So before the exam details, let me explain what Fabric really is.
Microsoft Fabric is an all-in-one analytics platform that bundles data engineering, data warehousing, data science, real-time analytics, and Power BI into a single SaaS product. Instead of stitching together separate services, Fabric gives you one environment where every tool reads and writes to the same shared storage.
The core idea that makes this work is OneLake:
┌──────────────────────── OneLake (one copy of data) ─────────────────────────┐
│                                                                             │
│  Data Engineering    Data Warehouse   Data Science   Real-Time   Power BI   │
│  (Spark/Notebooks)   (SQL)            (ML models)    Analytics   (Reports)  │
└─────────────────────────────────────────────────────────────────────────────┘
- OneLake is a single, unified data lake for the whole organization — think “OneDrive for data.” Every workload stores its data here in open Delta/Parquet format.
- Because there’s one copy, a table created by a data engineer can be queried by SQL, modeled in Power BI, and used for ML — without copying it around.
Where PL-300 is about analyzing data, DP-600 is about engineering the analytics solution that feeds those reports.
2. The Lakehouse vs. the Warehouse
What they are: Fabric gives you two main ways to store and serve analytical data, and a big part of DP-600 is knowing when to use which.
- Lakehouse: combines a data lake (files, unstructured data) with table-like structure on top. You work with it using Spark notebooks (Python/PySpark, Spark SQL), and it also exposes a SQL analytics endpoint for read-only T-SQL queries. Best when you have raw, large, or semi-structured data and want flexibility.
- Warehouse: a traditional relational data warehouse with full T-SQL support, including writes (INSERT, UPDATE, DELETE). Best when your team is SQL-first and you want classic warehouse semantics.
Lakehouse → files + tables, Spark + SQL, schema-on-read, data-engineer friendly
Warehouse → tables only, T-SQL, schema-on-write, SQL-developer friendly
The concept to understand: Both store data in OneLake as Delta tables, so they’re interoperable. The choice is about the team and the workload, not about locking your data into one format. The exam tests whether you can recommend the right one for a given scenario.
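To make the trade-off concrete, here is a minimal sketch of the rule of thumb described above, written as a small Python helper. The `Scenario` fields and the `recommend_store` function are hypothetical names invented for this illustration; they are not part of any Fabric API, and real exam scenarios involve more factors than these three.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical description of a workload (field names are illustrative)."""
    has_unstructured_files: bool  # raw files, JSON, logs, images
    team_prefers_spark: bool      # notebooks / PySpark are the main tool
    needs_tsql_writes: bool       # INSERT/UPDATE/DELETE through T-SQL


def recommend_store(s: Scenario) -> str:
    """Simplified rule of thumb, not an official decision guide."""
    needs_lake = s.has_unstructured_files or s.team_prefers_spark
    if needs_lake and s.needs_tsql_writes:
        # Both items keep Delta tables in OneLake, so they can be combined.
        return "Lakehouse + Warehouse"
    if needs_lake:
        return "Lakehouse"
    return "Warehouse"


print(recommend_store(Scenario(True, True, False)))   # Lakehouse
print(recommend_store(Scenario(False, False, True)))  # Warehouse
```

The combined answer is there on purpose: because both items sit on the same storage, picking one does not rule out the other.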
3. Loading Data — The Medallion Architecture
What it is: A common pattern Fabric promotes for organizing data into quality layers, named after medals.
Bronze  →  Silver     →  Gold
(raw)      (cleaned)     (business-ready)
- Bronze — raw data ingested exactly as it arrives, untouched.
- Silver — cleaned, deduplicated, validated, and conformed.
- Gold — aggregated, business-level tables ready for reporting.
Why it matters: Separating layers means you never lose the raw source, each transformation step is auditable, and reports always read from clean Gold tables. Data pipelines, Dataflows Gen2, and Spark notebooks are the tools you use to move and transform data between these layers.
The concept to understand: Fabric isn’t just storage — it’s about building a governed flow from messy source to trusted analytics, with each layer adding quality.
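The three layers can be sketched in a few lines of plain Python. This is only a toy illustration of the pattern: in Fabric the layers would be Delta tables and the steps would run in a notebook, pipeline, or Dataflow Gen2. The sample rows and the function names `to_silver` and `to_gold` are made up for the example.

```python
from collections import defaultdict

# Bronze: raw records exactly as they arrived, including a duplicate and a bad row.
bronze = [
    {"order_id": "1", "region": " east ", "amount": "100.0"},
    {"order_id": "1", "region": " east ", "amount": "100.0"},       # duplicate
    {"order_id": "2", "region": "West", "amount": "250.5"},
    {"order_id": "3", "region": "east", "amount": "not-a-number"},  # invalid
]


def to_silver(rows):
    """Validate, deduplicate, and conform the rows; bronze is never modified."""
    seen, silver = set(), []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows that fail validation
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # drop duplicates
        seen.add(order_id)
        silver.append({
            "order_id": order_id,
            "region": row["region"].strip().lower(),
            "amount": amount,
        })
    return silver


def to_gold(rows):
    """Aggregate to a business-level table: total amount per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'east': 100.0, 'west': 250.5}
```

Bronze still holds all four original rows after the run, which is what makes each later step auditable.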
4. Querying and Modeling
What it is: Once data lands in a Lakehouse or Warehouse, you query it with SQL and build a semantic model on top.
The key feature (Direct Lake): This is the headline capability DP-600 cares about. Traditionally, Power BI either imports data (fast, but the model holds a copy that goes stale between refreshes) or uses DirectQuery (always live, but every visual sends a query to the source, which is usually slower). Direct Lake is a third mode: the semantic model reads the Delta tables in OneLake directly, giving close to import-level speed with no separate data copy and near real-time freshness.
Import      → copy data into the model (fast, but stale + duplicated)
DirectQuery → query the source on every visual (fresh, but slow)
Direct Lake → read OneLake Delta files directly (fast AND fresh, no copy)
The concept to understand: A semantic model in Fabric still uses the same DAX and star-schema skills from PL-300 — Fabric just changes where the data lives and how fast it connects. Your modeling knowledge carries straight over.
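The copy versus no-copy difference can be shown with a toy in-memory model. This is a deliberate simplification and not how the Power BI engine works internally: the classes `DeltaTable`, `ImportModel`, and `DirectLakeModel` are hypothetical stand-ins, and query latency (the DirectQuery trade-off) is not modeled at all.

```python
import copy


class DeltaTable:
    """Stand-in for a Delta table in OneLake (hypothetical, in-memory)."""

    def __init__(self, rows):
        self.rows = rows


class ImportModel:
    """Import mode: the model keeps its own copy, updated only on refresh."""

    def __init__(self, table):
        self.table = table
        self.refresh()

    def refresh(self):
        self.cache = copy.deepcopy(self.table.rows)

    def total(self):
        return sum(r["amount"] for r in self.cache)


class DirectLakeModel:
    """Direct Lake (simplified): reads the shared table, keeps no copy."""

    def __init__(self, table):
        self.table = table

    def total(self):
        return sum(r["amount"] for r in self.table.rows)


sales = DeltaTable([{"amount": 10}, {"amount": 20}])
imported = ImportModel(sales)
direct = DirectLakeModel(sales)

sales.rows.append({"amount": 5})  # new data lands in the lake

print(imported.total())  # 30 (stale until the next refresh)
print(direct.total())    # 35 (sees the new row immediately)
imported.refresh()
print(imported.total())  # 35
```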
5. Securing and Governing
What it is: Because Fabric centralizes all data in OneLake, governing access is critical, and DP-600 tests it heavily.
- Workspace roles — Admin, Member, Contributor, Viewer control who can do what in a workspace.
- Row-Level Security (RLS) and Object-Level Security (OLS) — restrict which rows or which tables/columns a user can see.
- Sensitivity labels — tag data (e.g., “Confidential”) and have the classification follow the data wherever it flows.
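As a rough sketch of what RLS and OLS do conceptually, the snippet below filters rows by a user's allowed regions and hides columns the user may not see. In Fabric these rules are defined on the semantic model or in the SQL layer rather than in Python; the user names, the mappings, and the `visible_rows` function are invented for illustration.

```python
# Hypothetical data and security mappings, for illustration only.
rows = [
    {"region": "East", "customer": "A", "amount": 100, "margin": 0.30},
    {"region": "West", "customer": "B", "amount": 250, "margin": 0.25},
]

# RLS idea: which rows (here, regions) each user is allowed to see.
user_regions = {
    "alice@example.com": {"East"},
    "bob@example.com": {"East", "West"},
}

# OLS idea: which columns are hidden from each user.
hidden_columns = {
    "alice@example.com": {"margin"},
}


def visible_rows(user, data):
    """Apply the row filter first, then drop hidden columns."""
    allowed = user_regions.get(user, set())  # unknown users see nothing
    hidden = hidden_columns.get(user, set())
    return [
        {col: val for col, val in row.items() if col not in hidden}
        for row in data
        if row["region"] in allowed
    ]


print(visible_rows("alice@example.com", rows))
# [{'region': 'East', 'customer': 'A', 'amount': 100}]
```

Defaulting unknown users to an empty set mirrors the least-privilege idea: access is granted explicitly, never assumed.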
The concept to understand: One shared copy of data is powerful but risky — governance is what makes it safe to centralize. The exam expects you to apply the least-privilege role and the right security layer for each scenario.
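Choosing the least-privilege role can be thought of as picking the first role, from weakest to strongest, that covers what the person needs to do. The capability sets below are a heavily simplified assumption made for this sketch (the real permission matrix in the Fabric documentation is more detailed), and `least_privilege_role` is a hypothetical helper.

```python
# Workspace roles ordered from least to most privileged.
# The capability names are simplified assumptions, not official permission names.
ROLES = [
    ("Viewer", {"view"}),
    ("Contributor", {"view", "edit_items"}),
    ("Member", {"view", "edit_items", "share_items"}),
    ("Admin", {"view", "edit_items", "share_items", "manage_workspace"}),
]


def least_privilege_role(required):
    """Return the weakest role whose capabilities cover everything required."""
    for name, capabilities in ROLES:
        if set(required) <= capabilities:
            return name
    raise ValueError(f"No role grants: {sorted(required)}")


print(least_privilege_role({"view"}))                # Viewer
print(least_privilege_role({"view", "edit_items"}))  # Contributor
print(least_privilege_role({"share_items"}))         # Member
```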
6. How DP-600 Compares to PL-300
To put it simply:
- PL-300 (Data Analyst) — consume and analyze. Power Query, modeling, DAX, building reports.
- DP-600 (Analytics Engineer) — build the platform. Lakehouse/Warehouse design, Spark and T-SQL, pipelines, Direct Lake, and governance across OneLake.
Source → [ DP-600 territory: ingest, clean, model, secure ] → [ PL-300 territory: analyze, visualize ]
DP-600 sits one layer earlier in the pipeline. If PL-300 taught me to make sense of data, DP-600 taught me to engineer the system that delivers clean, fast, governed data in the first place.
7. Final Thoughts
Microsoft Fabric is best understood as one platform on one copy of data (OneLake), with specialized tools — Lakehouse, Warehouse, pipelines, and Power BI — all working over that shared foundation. DP-600 tests whether you can choose the right tool, organize data with the medallion architecture, connect it efficiently with Direct Lake, and govern it safely.
My advice for anyone moving from PL-300 to DP-600: your DAX and modeling skills transfer directly, so focus your study on the engineering side — Lakehouse vs. Warehouse trade-offs, pipelines, and the OneLake/Direct Lake model. Once you see Fabric as a single lake with many doors into it, the whole platform clicks.