data-infra / field-guide v1.0 · 12 lessons · ~3h · IC5 / staff

Data infrastructure, at system-design depth.

A field guide to the data stack at the level of an IC5 / staff system-design interview. Storage internals, streaming semantics, lakehouse formats, and the operational craft — built around interactive simulators of every concept that's usually drawn on a whiteboard.

12 lessons · 40+ live simulations · ~3h total · 1 mock interview at the end
the data stack · top → bottom
01 source

Where data is born.

App backends, mobile clients, IoT sensors, third-party APIs. Every event has a creator.

Postgres · iOS SDK · Stripe
02 log

The append-only spine.

An ordered, durable, partitioned log. Decouples producers from consumers. The cleanest abstraction in this whole stack.

Kafka · Kinesis · Pub/Sub
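
A minimal sketch of that abstraction, assuming a toy in-memory model rather than any real broker's API. `PartitionedLog` and `Consumer` are hypothetical names invented for illustration. Records are routed to a partition by key hash, appended at the next offset, and never mutated. Each consumer keeps its own offsets, which is what lets producers and consumers move independently.

```python
from collections import defaultdict
from zlib import crc32


class PartitionedLog:
    """Toy in-memory log (hypothetical): append-only partitions, per-key ordering."""

    def __init__(self, num_partitions: int = 3) -> None:
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key: str, value: str) -> tuple:
        # The same key always hashes to the same partition,
        # so records for one key keep their relative order.
        partition = crc32(key.encode()) % len(self.partitions)
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset

    def read(self, partition: int, offset: int, max_records: int = 10) -> list:
        # Reads never remove or change records; they only slice the log.
        return self.partitions[partition][offset:offset + max_records]


class Consumer:
    """Tracks its own position per partition, independent of other consumers."""

    def __init__(self, log: PartitionedLog) -> None:
        self.log = log
        self.offsets = defaultdict(int)

    def poll(self, partition: int) -> list:
        records = self.log.read(partition, self.offsets[partition])
        self.offsets[partition] += len(records)  # advance only this consumer
        return records
```

Two `Consumer` instances polling the same partition each see every record once, because neither one's progress affects the other. Durability, replication, and retention are deliberately left out of this sketch.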
03 process

Where shape changes.

Stream jobs filter, enrich, window, aggregate. Batch jobs do the same, just on bounded data.

Flink · Spark · dbt
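
The four verbs above can be sketched in a few lines, assuming a bounded list of events and a tumbling (fixed, non-overlapping) window. The function name, the event tuple layout, and the `countries` lookup are all hypothetical. This is the batch form; a real stream engine applies the same logic to unbounded input and additionally needs watermarks to decide when a window is complete.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds, countries):
    """Count purchases per (window_start, country).

    events: iterable of (timestamp_seconds, user_id, event_type) tuples.
    countries: dict mapping user_id to a country code (hypothetical lookup).
    """
    counts = Counter()
    for ts, user_id, event_type in events:
        if event_type != "purchase":                 # filter
            continue
        country = countries.get(user_id, "unknown")  # enrich
        window_start = ts - ts % window_seconds      # window
        counts[(window_start, country)] += 1         # aggregate
    return dict(counts)
```

With 60-second windows, events at seconds 0 and 30 land in the window starting at 0, and an event at second 61 lands in the window starting at 60.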
04 store

Bytes that survive.

Object storage holds the raw. Open table formats give it ACID. Indexes give it speed.

S3 · Iceberg · Parquet
05 serve

Sub-second answers.

OLAP engines for BI, vector stores for ML, key-value stores for online features.

Snowflake · Trino · DynamoDB
06 consume

The whole point.

Dashboards, ML features, billing, fraud, the recommender. The stack only matters because of this row.

Looker · Feature store · API
12 lessons · 40+ live simulations · 4 tracks · IC5 target depth
/ the course

Four tracks. Twelve lessons. One mock interview.

Linear the first time — each lesson assumes the last. Reference order after that. The capstone (lesson 12) is a 45-minute IC5 system-design walkthrough you can step through one move at a time.

/ progress

One badge per lesson. XP for every drill.

Progress is tracked in your browser. No accounts, no servers. Reset anytime from the footer.

0 XP · 0 / 12 lessons

Open lesson 01.

The stack, top to bottom — in twelve minutes. Then the rest of the course is just zooming in.

$ start →