What you'll learn
- How to detect covariate drift—changes in input distributions P(X)
- How to compute the Population Stability Index (PSI) and run Kolmogorov–Smirnov (KS) tests
- How to interpret PSI thresholds and visual patterns of shift
- Why covariate drift is often a precursor to concept drift
1. What is Covariate Drift?
Covariate drift occurs when the distribution of features P(X) changes, but the underlying relationship between features and target P(Y | X) remains the same — for now. The model's learned patterns still apply, but its inputs no longer represent the world it trained on.
In DriftCity, a sudden rainstorm altered rider behavior:
- Fewer people took short trips (walking became safer than waiting for a ride)
- More people took long trips (public transit was disrupted)
- Surge pricing responded to demand shifts
The relationships between distance and fare still held, but the input distribution moved.
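The distinction can be made concrete with a small simulation. This is a minimal sketch, not the chapter's actual data: the dry-weather mean of 4.5 km is an assumed value, while the rainstorm mean and the fare rule are borrowed from the simulation in Section 5. Both datasets use the same fare rule, so P(Y | X) is identical by construction and only P(X) moves.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_rides(mean_km, n=5000):
    """Draw trip distances, then fares from one fixed fare rule (the same P(Y | X))."""
    distance = np.clip(rng.normal(mean_km, 2.5, n), 0.3, None)
    fare = 38 + 3.5 * distance + rng.normal(0, 6, n)
    return distance, fare

dry_km, dry_fare = simulate_rides(mean_km=4.5)    # assumed dry-weather mean
rain_km, rain_fare = simulate_rides(mean_km=7.8)  # rainstorm mean from Section 5

# Fit fare ~ slope * distance + intercept separately on each dataset.
dry_slope, dry_intercept = np.polyfit(dry_km, dry_fare, deg=1)
rain_slope, rain_intercept = np.polyfit(rain_km, rain_fare, deg=1)

print(f"mean distance: dry {dry_km.mean():.1f} km, rain {rain_km.mean():.1f} km")
print(f"fitted slope:  dry {dry_slope:.2f}, rain {rain_slope:.2f}")
```

The mean distance differs clearly between the two datasets, while the fitted slopes should agree up to sampling noise. That combination is the signature of covariate drift without concept drift.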
2. Compare Baseline vs Rainstorm
Below are baseline histograms (blue) overlaid with rainstorm data (amber). Toggle between features to see where the shift is most pronounced.
Observations:
- trip_distance_km: Rainstorm shows fewer short rides (< 2 km) and a heavier right tail of long trips (> 10 km)
- surge_multiplier: Rain increased surge pricing; higher mean and wider spread
- fare_amount: Shifted right due to both distance and surge changes
The model trained on dry-weather patterns now encounters more extreme inputs, so its point predictions and prediction intervals may become less accurate.
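The overlay comparison can also be done numerically. The sketch below bins two samples on shared edges and reports the share of short (< 2 km) and long (> 10 km) trips. The two arrays are synthetic stand-ins with assumed parameters, not the DriftCity datasets, and `shared_histograms` and `tail_shares` are hypothetical helper names.

```python
import numpy as np

def shared_histograms(baseline, current, bins=20):
    """Bin both samples on the same edges so their bars are directly comparable."""
    combined = np.concatenate([baseline, current])
    edges = np.histogram_bin_edges(combined, bins=bins)
    base_counts, _ = np.histogram(baseline, bins=edges)
    curr_counts, _ = np.histogram(current, bins=edges)
    # Use proportions so that different sample sizes do not distort the overlay.
    return edges, base_counts / base_counts.sum(), curr_counts / curr_counts.sum()

def tail_shares(distances_km, short_km=2.0, long_km=10.0):
    """Fraction of trips shorter than short_km and longer than long_km."""
    d = np.asarray(distances_km, dtype=float)
    return float(np.mean(d < short_km)), float(np.mean(d > long_km))

rng = np.random.default_rng(0)
baseline_km = np.clip(rng.normal(4.5, 2.5, 5000), 0.3, None)   # assumed baseline
rainstorm_km = np.clip(rng.normal(7.8, 2.5, 5000), 0.3, None)  # assumed rainstorm

edges, base_share, rain_share = shared_histograms(baseline_km, rainstorm_km)
base_short, base_long = tail_shares(baseline_km)
rain_short, rain_long = tail_shares(rainstorm_km)
print(f"short trips (< 2 km): baseline {base_short:.1%}, rainstorm {rain_short:.1%}")
print(f"long trips (> 10 km): baseline {base_long:.1%}, rainstorm {rain_long:.1%}")
```

Sharing the bin edges matters: if each sample is binned on its own range, the bars no longer line up and the overlay can hide or exaggerate the shift.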
3. Quantifying Shift with PSI and KS
Population Stability Index (PSI) measures the difference between a reference (baseline) distribution and a current distribution. It's widely used in production monitoring because:
- Non-parametric: doesn't assume any specific distribution shape
- Symmetric: treats both directions of change equally
- Interpretable: thresholds are easy to operationalize
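For reference, PSI is commonly defined over a set of bins as PSI = Σ (c_i − r_i) · ln(c_i / r_i), where r_i and c_i are the proportions of reference and current values in bin i. The following is a minimal sketch of that formula, not necessarily how this chapter's figures were computed. The choices of ten reference-quantile bins and a small floor `eps` for empty bins are assumptions, and the sample arrays are synthetic.

```python
import numpy as np

def psi(reference, current, n_bins=10, eps=1e-6):
    """PSI = sum_i (c_i - r_i) * ln(c_i / r_i) over bins i."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Interior bin edges from reference quantiles; np.unique drops edges duplicated by ties.
    quantiles = np.quantile(reference, np.linspace(0.0, 1.0, n_bins + 1))
    interior = np.unique(quantiles)[1:-1]
    # searchsorted assigns each value a bin index. The outer bins are open-ended,
    # so current values outside the reference range are still counted.
    n = interior.size + 1
    ref_idx = np.searchsorted(interior, reference, side="right")
    cur_idx = np.searchsorted(interior, current, side="right")
    ref_p = np.bincount(ref_idx, minlength=n) / reference.size
    cur_p = np.bincount(cur_idx, minlength=n) / current.size
    # Floor the proportions so empty bins do not produce log(0) or division by zero.
    ref_p = np.clip(ref_p, eps, None)
    cur_p = np.clip(cur_p, eps, None)
    return float(np.sum((cur_p - ref_p) * np.log(cur_p / ref_p)))

rng = np.random.default_rng(7)
baseline_km = np.clip(rng.normal(4.5, 2.5, 5000), 0.3, None)  # assumed baseline
same_km = np.clip(rng.normal(4.5, 2.5, 5000), 0.3, None)      # fresh sample, no drift
rain_km = np.clip(rng.normal(7.8, 2.5, 5000), 0.3, None)      # shifted sample

print(f"PSI, no drift:  {psi(baseline_km, same_km):.3f}")
print(f"PSI, rainstorm: {psi(baseline_km, rain_km):.3f}")
```

The formula is symmetric for a fixed set of bins. Because this sketch derives its bins from the reference sample, swapping the two arguments can give a slightly different value.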
PSI Interpretation:
- PSI < 0.10: Stable distribution (✅ no action)
- 0.10 ≤ PSI < 0.25: Moderate drift (⚠️ watch and log)
- PSI ≥ 0.25: Major shift (🚨 investigate, possibly retrain)
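These bands translate directly into code. The sketch below is one possible encoding; `psi_status` is a hypothetical helper, and the per-feature PSI values are made up for illustration rather than taken from the DriftCity data.

```python
def psi_status(psi_value):
    """Map a PSI value to the three bands listed above."""
    if psi_value < 0.10:
        return "stable"      # no action
    if psi_value < 0.25:
        return "moderate"    # watch and log
    return "major"           # investigate, possibly retrain

# Hypothetical per-feature PSI values, for illustration only
feature_psi = {"trip_distance_km": 0.41, "surge_multiplier": 0.17, "fare_amount": 0.06}
report = {name: psi_status(value) for name, value in feature_psi.items()}
print(report)
```

Keeping the cut-offs in one function makes it easier to adjust them later without touching the monitoring code that calls it.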
The two-sample Kolmogorov–Smirnov (KS) test provides a formal hypothesis test of whether two samples were drawn from the same distribution:
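import pandas as pd
from scipy.stats import ks_2samp

# Load the two samples to compare (the files used in Section 5)
baseline = pd.read_csv("rides_baseline.csv")
rainstorm = pd.read_csv("rides_rainstorm.csv")

# Two-sample KS test; H0: both samples come from the same distribution
ks_stat, p_value = ks_2samp(
    baseline["trip_distance_km"],
    rainstorm["trip_distance_km"],
)
# Example result: statistic ≈ 0.36, p < 0.001
# Interpretation: reject H0 (same distribution)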
Use KS for statistical rigor; use PSI for operational dashboards with human-friendly thresholds.
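To make the KS statistic less of a black box: it is the largest vertical gap between the empirical CDFs of the two samples. The sketch below computes only that statistic with NumPy, not the p-value. `ks_statistic` is a hypothetical helper, and the sample arrays are synthetic stand-ins with assumed parameters.

```python
import numpy as np

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a = np.sort(np.asarray(sample_a, dtype=float))
    b = np.sort(np.asarray(sample_b, dtype=float))
    # The gap can only change at observed values, so evaluating there is sufficient.
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

rng = np.random.default_rng(3)
baseline_km = np.clip(rng.normal(4.5, 2.5, 5000), 0.3, None)   # assumed baseline
rainstorm_km = np.clip(rng.normal(7.8, 2.5, 5000), 0.3, None)  # assumed rainstorm

print(f"KS statistic: {ks_statistic(baseline_km, rainstorm_km):.2f}")
```

A value of 0 means the two empirical CDFs coincide; a value of 1 means the samples do not overlap at all.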
4. Monitoring Drift Over Time
A single PSI comparison is a snapshot. Real value emerges when you track PSI daily or weekly and watch the trend.
In this example, the rainstorm event (day 10) caused a sudden spike. Without monitoring, the shift might go unnoticed until model performance degrades weeks later.
A sustained upward trend in PSI warns that inputs are diverging from the baseline. That is a signal for:
- Data engineering: investigate upstream changes
- Retraining: incorporate recent data into model updates
- Monitoring: set tighter alerting thresholds
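A daily tracking loop might look like the sketch below. It is an illustration under stated assumptions: the data are synthetic, 20 days of 1,000 rides each with a rainstorm on day 10 only, and the compact `psi` helper uses ten reference-quantile bins, which is an assumed binning choice.

```python
import numpy as np

def psi(reference, current, n_bins=10, eps=1e-6):
    """Compact PSI with reference-quantile bins and open-ended outer bins."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    interior = np.unique(np.quantile(reference, np.linspace(0.0, 1.0, n_bins + 1)))[1:-1]
    n = interior.size + 1
    ref_p = np.bincount(np.searchsorted(interior, reference, side="right"), minlength=n) / reference.size
    cur_p = np.bincount(np.searchsorted(interior, current, side="right"), minlength=n) / current.size
    ref_p, cur_p = np.clip(ref_p, eps, None), np.clip(cur_p, eps, None)
    return float(np.sum((cur_p - ref_p) * np.log(cur_p / ref_p)))

rng = np.random.default_rng(42)
baseline_km = np.clip(rng.normal(4.5, 2.5, 5000), 0.3, None)  # assumed reference window

daily_psi = {}
for day in range(1, 21):
    mean_km = 7.8 if day == 10 else 4.5  # synthetic rainstorm on day 10 only
    todays_km = np.clip(rng.normal(mean_km, 2.5, 1000), 0.3, None)
    daily_psi[day] = psi(baseline_km, todays_km)

# Days whose PSI reaches the "major shift" band
alert_days = [day for day, value in daily_psi.items() if value >= 0.25]
print("alert days:", alert_days)
```

Storing the full `daily_psi` series rather than only the alerts makes it possible to examine the trend later, for example to tell a one-day spike from a gradual climb.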
5. Run It Yourself
import numpy as np
import pandas as pd
rng = np.random.default_rng(9)
# Load baseline
df0 = pd.read_csv("rides_baseline.csv")
N = len(df0)
# Simulate rain: fewer short trips, more long ones
trip = np.clip(rng.normal(7.8, 2.5, N), 0.3, None)
surge = np.clip(rng.lognormal(mean=0.12, sigma=0.20, size=N), 1.0, None)
fare = np.clip(38 + trip*3.5 + rng.normal(0, 6, N), 5, None)
# Create rainstorm dataset
df1 = df0.copy()
df1["trip_distance_km"] = trip
df1["surge_multiplier"] = surge
df1["fare_amount"] = fare
df1.to_csv("rides_rainstorm.csv", index=False)
print("Wrote rides_rainstorm.csv")
6. Real-World Practice: Drift Monitoring in Production
Major tech companies use PSI and KS as part of their model monitoring stacks:
| Company | Implementation | Key Insight |
|---|---|---|
| Uber Michelangelo | Nightly feature monitoring jobs compute PSI/KS for all continuous features | Automate drift alerts before training pipelines run |
| DoorDash | Streaming feature store includes drift detectors with 7-day moving PSI average | Smooth short-term noise; alert only on sustained shifts |
| Pinterest Ads | Combine PSI with volume metrics to catch missing-data drift | Detect upstream ETL failures that change feature sparsity |
| Airbnb | Per-segment drift tracking (by geography, user cohort) | Understand where and whom shift affects most |
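The moving-average idea in the table can be sketched in a few lines. This illustrates the general technique only and is not any company's actual implementation; `trailing_mean` is a hypothetical helper, and both PSI series are made-up values.

```python
import numpy as np

def trailing_mean(values, window=7):
    """Trailing moving average; the first window-1 positions are NaN (not enough history)."""
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape, np.nan)
    if values.size >= window:
        out[window - 1:] = np.convolve(values, np.ones(window) / window, mode="valid")
    return out

# Hypothetical daily PSI series (20 days each)
spike_psi = [0.02] * 20
spike_psi[9] = 1.2                           # one-day spike
sustained_psi = [0.02] * 10 + [0.40] * 10    # shift that persists

# Alert only when the 7-day average reaches the major-shift threshold.
spike_alert = bool(np.any(trailing_mean(spike_psi) >= 0.25))
sustained_alert = bool(np.any(trailing_mean(sustained_psi) >= 0.25))
print("one-day spike alerts:", spike_alert)
print("sustained shift alerts:", sustained_alert)
```

The trade-off is latency: smoothing suppresses one-day noise, but a genuine shift takes several days to push the average over the threshold.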
Common thresholds:
- PSI < 0.1 → Log and proceed
- 0.1 ≤ PSI < 0.25 → Create ticket, annotate in experiment logs
- PSI ≥ 0.25 → Page on-call engineer, trigger retraining job
7. Key Takeaways
Covariate Drift Checklist
- Establish reference histograms per feature before deployment
- Compute PSI & KS regularly (daily or weekly) between live and baseline windows
- Alert if PSI ≥ 0.25 or KS p < 0.01
- Track drift trends — one spike can be noise; sustained growth = real issue
- Covariate drift is often a precursor to concept drift → monitor closely
- Use overlay histograms to identify where the shift is (center vs tails)
8. Where This Connects
This chapter covered the case where the inputs change while the relationship P(Y | X) stays the same, so the model's learned mapping remains valid in theory. In Chapter 3: The Vanishing Commuter, we'll see what happens when P(Y | X) itself changes and model errors surge despite stable inputs.