About
DriftCity: Statistics for MLOps
An interactive, narrative-driven educational platform that teaches production ML statistical concepts through hands-on visualizations, real-world case studies, and runnable code examples.
The Problem
Machine learning teams face a critical knowledge gap in production model operations. Concepts like data drift, A/B testing, and variance reduction are essential for maintaining reliable ML systems, yet these topics are typically:
- Scattered across dense textbooks and academic papers — inaccessible to practitioners
- Taught through static equations — abstract and hard to internalize
- Disconnected from real-world implementation — theory without production context
The result? ML Engineers ship models to production without fully understanding how to detect when they fail, how to run experiments correctly, or how to build monitoring systems that actually work.
The Solution
DriftCity transforms how teams learn MLOps statistics by combining three powerful forces:
Narrative Cohesion
A fictional "DriftCity" story where algorithms power urban transportation, making abstract concepts tangible through metaphor.
Interactive Exploration
Live Plotly visualizations with sliders, comparisons, and real-time calculations that let learners experiment and discover.
Production Reality
Code patterns and case studies from Uber, Airbnb, Netflix, and DoorDash showing exactly how these concepts work in practice.
Statistical Concepts Covered
Chapter 1: The City That Learned Too Fast
Baseline Distributions & Drift Detection
- Population Stability Index (PSI) — Quantifies distribution shift between reference and current windows
- Kolmogorov-Smirnov Test — Non-parametric test comparing empirical CDFs
- Reference Windows — Establishing baseline P(X) for feature monitoring
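The two drift statistics above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the project's implementation: the function names, the equal-width binning over the reference range, and the add-alpha (Laplace) smoothing constant are all assumptions chosen for readability.

```python
import math
from bisect import bisect_right


def psi(reference, current, n_bins=10, alpha=1.0):
    """Population Stability Index over equal-width bins of the reference range.

    alpha is an add-alpha (Laplace) smoothing constant, assumed here so that
    empty bins never produce log(0) or a division by zero.
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def smoothed_shares(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        total = len(values) + alpha * n_bins
        return [(c + alpha) / total for c in counts]

    ref = smoothed_shares(reference)
    cur = smoothed_shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    for x in ref + cur:
        f_ref = bisect_right(ref, x) / len(ref)
        f_cur = bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(f_ref - f_cur))
    return gap


baseline = [i / 100 for i in range(1000)]   # synthetic reference window
shifted = [v + 3 for v in baseline]         # same shape, shifted upward
print(psi(baseline, shifted), ks_statistic(baseline, shifted))
```

Both functions return 0 when the two windows are identical and grow as the current window moves away from the reference. Turning the KS statistic into a p-value needs the KS distribution, which this sketch leaves out.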
Chapter 2: The Weather Event
Covariate Drift (P(X) Changes)
- Covariate Shift — Input distributions change while P(Y|X) remains stable
- Distribution Overlay Analysis — Visual comparison of baseline vs. current histograms
- Trend Monitoring — Tracking PSI over time windows to detect sustained shifts
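Trend monitoring can be illustrated with a small sketch that smooths a PSI series and looks for a sustained breach rather than a single spike. The 0.25 threshold is a commonly cited rule of thumb and the three-window requirement is an arbitrary choice; both are assumptions, as are the function names and the sample data.

```python
def rolling_mean(values, window):
    """Trailing moving average; early entries average over what is available."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def sustained_shift(psi_series, threshold=0.25, min_consecutive=3):
    """Return the index at which PSI has exceeded the threshold for
    min_consecutive windows in a row, or None if that never happens."""
    run = 0
    for i, value in enumerate(psi_series):
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return i
    return None


# Hypothetical daily PSI values: one isolated spike, then a sustained shift.
daily_psi = [0.02, 0.03, 0.30, 0.04, 0.27, 0.31, 0.35, 0.40]
print(sustained_shift(daily_psi))      # index of the third consecutive breach
print(rolling_mean(daily_psi, 3)[-1])  # smoothed PSI for the latest window
```

Requiring consecutive breaches trades detection speed for fewer false alarms, which is usually the right trade for noisy daily windows.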
Chapter 3: The Vanishing Commuter
Concept Drift (P(Y|X) Changes)
- Concept Drift — The relationship between inputs and outputs breaks down
- RMSE/MAE Trend Analysis — Tracking prediction error over time as a drift signal
- Residual Analysis — Identifying spatial/temporal patterns in model failures
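As a rough sketch of error-trend tracking, the snippet below computes RMSE and MAE per time window and flags windows whose RMSE has grown well beyond an early baseline. The 1.5x tolerance, the two-window baseline, and all names are illustrative assumptions, not thresholds taken from the chapters.

```python
import math


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def error_trend(windows):
    """windows: list of (y_true, y_pred) pairs, one per time window, oldest first."""
    return [{"rmse": rmse(t, p), "mae": mae(t, p)} for t, p in windows]


def degraded(trend, baseline_windows=2, tolerance=1.5):
    """Indices of windows whose RMSE exceeds tolerance x the mean baseline RMSE."""
    base = sum(w["rmse"] for w in trend[:baseline_windows]) / baseline_windows
    return [i for i, w in enumerate(trend) if w["rmse"] > tolerance * base]


# Hypothetical windows: the third one shows a clear jump in prediction error.
windows = [
    ([10, 12, 14], [11, 12, 13]),
    ([10, 12, 14], [10, 13, 14]),
    ([10, 12, 14], [15, 18, 20]),
]
print(degraded(error_trend(windows)))
```

Because concept drift changes P(Y|X) without necessarily moving the inputs, an error trend like this can fire even when input-side metrics such as PSI stay flat.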
Chapter 4: The Great Experiment
A/B Testing & Controlled Experiments
- Sample Ratio Mismatch (SRM) — Chi-square test detecting randomization failures
- Statistical Power Analysis — Determining sample sizes to detect meaningful effects
- Type I/II Errors — Understanding false positive and false negative trade-offs
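The checks in this chapter reduce to two short calculations, sketched here with the standard library only. The function names are hypothetical, the SRM test assumes a two-arm experiment, and the sample-size formula is the usual normal approximation for a two-sided comparison of means.

```python
import math
from statistics import NormalDist


def srm_p_value(n_control, n_treatment, expected_ratio=0.5):
    """Chi-square goodness-of-fit test (1 degree of freedom) for a two-arm split."""
    total = n_control + n_treatment
    exp_c = total * expected_ratio
    exp_t = total - exp_c
    chi2 = (n_control - exp_c) ** 2 / exp_c + (n_treatment - exp_t) ** 2 / exp_t
    # With 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def sample_size_per_arm(sigma, min_effect, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-sided, two-sample test of means."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # controls the Type I (false positive) rate
    z_beta = z.inv_cdf(power)           # power = 1 - Type II (false negative) rate
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / min_effect ** 2)


print(srm_p_value(50_000, 51_500))        # very small p-value: suspect the split
print(sample_size_per_arm(1.0, 0.1))      # users needed per arm
```

A very small SRM p-value (many teams use a cutoff around 0.001, an assumption here) suggests the randomization or logging is broken, so the experiment's results should not be trusted until the cause is found. The sample-size function makes the trade-off explicit: halving the detectable effect roughly quadruples the required sample.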
Chapter 5: The CUPED Control Tower
Variance Reduction & Sequential Testing
- CUPED — Controlled-experiment Using Pre-Experiment Data
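- Variance Reduction — With the optimal adjustment coefficient, the metric's variance shrinks by a fraction equal to rho-squared, where rho is the correlation between the pre-experiment covariate and the experiment metric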
- Sequential Testing — O'Brien-Fleming boundaries for early stopping
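A minimal sketch of the CUPED adjustment, using synthetic data, shows why the variance reduction matches the squared correlation. The helper names and the simulated users are assumptions for illustration. In a real experiment the coefficient theta is typically estimated on data pooled across arms.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def variance(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def covariance(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def cuped_adjust(metric, pre_metric):
    """Return CUPED-adjusted values: y - theta * (x - mean(x))."""
    theta = covariance(pre_metric, metric) / variance(pre_metric)
    x_bar = mean(pre_metric)
    return [y - theta * (x - x_bar) for x, y in zip(pre_metric, metric)]


# Synthetic users: the in-experiment metric is correlated with the pre-period one.
rng = random.Random(0)
pre = [rng.gauss(10, 2) for _ in range(5000)]
post = [x + rng.gauss(0, 1) for x in pre]

adjusted = cuped_adjust(post, pre)
rho_sq = covariance(pre, post) ** 2 / (variance(pre) * variance(post))
reduction = 1 - variance(adjusted) / variance(post)
print(round(rho_sq, 3), round(reduction, 3))  # the two values agree
```

The adjustment leaves the metric's mean unchanged, so treatment effects are estimated on the same scale with a smaller standard error.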
Chapter 6: The City Restored
Continuous Monitoring & Guardrails
- Closed Feedback Loop — Detect, Diagnose, Retrain, Revalidate, Redeploy
- Dual-Metric Correlation — Tracking PSI against RMSE to quantify drift impact
- Automated Guardrails — Threshold-based triggers for SLA breaches
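One way to picture the dual-metric check and a threshold guardrail is the sketch below. The limits, the action names, and the metric history are hypothetical; real guardrails would be tuned to each model's SLA.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def guardrail_action(psi, rmse, psi_limit=0.25, rmse_limit=5.0):
    """Map the latest metrics to a step in the feedback loop (hypothetical rules)."""
    if rmse > rmse_limit:
        return "retrain"   # error already breaches the assumed SLA
    if psi > psi_limit:
        return "diagnose"  # inputs moved, but error is still within the SLA
    return "ok"


# Hypothetical history: error rises as the input distribution drifts.
psi_history = [0.02, 0.05, 0.12, 0.28, 0.40]
rmse_history = [3.1, 3.2, 3.9, 5.4, 6.8]
print(round(pearson(psi_history, rmse_history), 3))
print(guardrail_action(psi_history[-1], rmse_history[-1]))
```

A strong positive correlation between the two series suggests that the drift is actually hurting accuracy. A weak one hints that the shifted features matter less to the model than the PSI alone implies.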
Technical Implementation
| Layer | Technology | Purpose |
|---|---|---|
| Framework | Next.js 14 | App Router, static generation, TypeScript strict mode |
| Content | MDX | Markdown with embedded React components |
| Visualization | Plotly.js | Interactive charts with client-only rendering |
| Styling | CSS Variables | Design tokens for consistent theming |
| Deployment | Vercel | Auto-deploy on main branch push |
Dynamic Imports
Plotly charts use dynamic imports with SSR disabled to prevent hydration errors and optimize bundle size.
Lazy Loading
Charts defer CSV loading until they scroll into the viewport, using an IntersectionObserver with a 120px rootMargin.
CSV Pipeline
CSV files are fetched in parallel and parsed with D3-DSV using type coercion. PSI calculations apply Laplace smoothing so that empty histogram bins do not produce a division by zero or a log of zero.
Teaching Methodology
Narrative-Driven Learning
Each chapter uses metaphor to make abstract concepts concrete:
| Chapter | Metaphor | Statistical Concept |
|---|---|---|
| 1 | City establishing equilibrium | Baseline distributions |
| 2 | Weather event | Covariate drift |
| 3 | Commuter behavior change | Concept drift |
| 4 | Engine competition | A/B testing |
| 5 | Control tower precision | Variance reduction |
| 6 | City recovery | Monitoring & feedback loops |
Three Evidence Layers
Every concept is taught through multiple lenses:
- Mathematical — Formulas and statistical tests (PSI, KS, CUPED)
- Visual — Interactive charts with threshold indicators
- Operational — Decision rules and production thresholds
Industry Case Studies
The content references real implementations from leading tech companies:
Uber Michelangelo
- Nightly feature monitoring computing PSI/KS for all continuous features
- Residual analysis flagging zones where error exceeds two standard deviations (2 sigma)
- Automatic traffic draining on drift or SLA breach
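The 2-sigma residual rule can be sketched generically as follows. This is not Uber's code: the zone names, the baseline residuals, and the choice to compare each zone's mean residual against the baseline spread are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


def flag_zones(baseline_residuals, current, n_sigma=2.0):
    """current: iterable of (zone, residual) pairs from the latest window.

    A zone is flagged when its mean residual is more than n_sigma baseline
    standard deviations away from the baseline mean residual.
    """
    mu, sigma = mean(baseline_residuals), stdev(baseline_residuals)
    by_zone = defaultdict(list)
    for zone, residual in current:
        by_zone[zone].append(residual)
    return sorted(
        zone for zone, values in by_zone.items()
        if abs(mean(values) - mu) > n_sigma * sigma
    )


# Hypothetical residuals (actual minus predicted) from a healthy reference period.
baseline = [-1.0, 0.5, 0.0, 1.0, -0.5, 0.2, -0.2]
latest = [
    ("downtown", 0.3), ("downtown", -0.4),
    ("airport", 2.5), ("airport", 3.1),
    ("harbor", -0.1),
]
print(flag_zones(baseline, latest))
```

Grouping residuals by zone before applying the threshold is what surfaces spatial patterns. A model can look healthy in aggregate while failing badly in one area.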
Airbnb Experimentation
- CUPED on booking conversion achieving ~40% sample reduction
- Guardrail blocking for metric regressions
- XP Guards preventing concurrent test interference
Netflix XP
- Thousands of concurrent A/B tests daily
- Auto-checks for SRM, power, and guardrail violations
- Sequential testing ending ~10% of experiments early
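To make early stopping concrete, here is a small sketch of O'Brien-Fleming-style boundaries, the boundary family named in Chapter 5. It is a generic illustration and not a description of Netflix's platform. The constant 2.040 is the commonly tabulated final critical value for five equally spaced looks at two-sided alpha of 0.05; treat it as an assumption and confirm it against your own design tables.

```python
import math


def obrien_fleming_bounds(n_looks, final_critical):
    """Z-score boundary at each of n_looks equally spaced looks.

    The boundary at look k is final_critical * sqrt(n_looks / k), so early
    looks need overwhelming evidence and the last look uses final_critical.
    """
    return [final_critical * math.sqrt(n_looks / k) for k in range(1, n_looks + 1)]


def first_stopping_look(z_scores, bounds):
    """Return the 1-based look at which |z| first crosses its boundary, else None."""
    for look, (z, bound) in enumerate(zip(z_scores, bounds), start=1):
        if abs(z) >= bound:
            return look
    return None


bounds = obrien_fleming_bounds(5, 2.040)  # assumed constant, see the note above
print([round(b, 3) for b in bounds])

# Hypothetical interim z-scores from one experiment.
print(first_stopping_look([1.1, 2.0, 2.9, 3.0, 3.2], bounds))
```

Because the early boundaries are so strict, the final look uses a critical value only slightly above the fixed-sample 1.96, which is why this family costs little power compared with a single end-of-test analysis.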
DoorDash Feature Store
- Streaming feature store with 7-day moving PSI average
- Drift detection combined with volume metrics
- Upstream ETL failure detection via missing-data drift
Key Features
Skills Demonstrated
Statistical Analysis
- PSI implementation
- Kolmogorov-Smirnov test
- CUPED variance reduction
- Sequential testing
- A/B test power analysis
Educational Design
- Narrative framing
- Visual-first learning
- Progressive complexity
- Industry-grounded examples
UX & Interaction
- Interactive sliders & toggles
- Real-time calculations
- Color-coded thresholds
- Two-column layout
What Makes This Project Distinctive
- Narrative Cohesion — Unlike fragmented tutorials, DriftCity weaves statistical concepts into a consistent story where readers understand the "why" behind each metric
- Hands-On Interactivity — Sliders, comparisons, and live simulations let learners explore concepts, not just read about them
- Production-Grade Examples — Code snippets follow patterns used in production at Uber, Airbnb, Netflix, and DoorDash rather than textbook exercises
- Accessibility — WCAG AA compliance and simple visual metaphors make MLOps accessible to non-statisticians
- Extensible Architecture — Modular MDX + component design allows rapid chapter additions without framework changes
Ready to Learn?
Start with Chapter 1 to understand baseline distributions and drift detection fundamentals.