Multi-Model Regime Divergence
As of April 21, 2026 (end-of-day snapshot). Pages update daily after the market close.
Where the 8-model calibration suite disagrees most about how to fit the implied-volatility surface. The dispersion score (MAD / median of iv_rmse) measures how much the models disagree on fit quality, while the displayed median RMSE shows how well the typical model fits in absolute terms. Low dispersion with low median RMSE means all 8 models agree it's a clean surface; low dispersion with high median RMSE means they all agree it's a hard surface to fit. High dispersion means specific models are capturing features (jumps, stochastic vol, heavy tails) that others miss, which is the regime-transition signal.
Top 50 by MAD / Median
The live model-divergence leaderboard loads after the page hydrates. Rows are ranked by MAD / median of per-model iv_rmse across 8 calibrated pricing models. Symbols are drawn from the regime universe (~124 symbols spanning single stocks, sector ETFs, and bond ETFs).
Methodology
Cross-model score = MAD(iv_rmse) / max(median(iv_rmse), ε) across 8 models (bates, bs, essvi, heston, kou, merton, sabr, vg). Per-fit filters: is_fallback = false, iv_rmse finite and positive, n_options ≥ 10. A symbol is included only when at least 6 valid fits exist. MAD (median absolute deviation) is robust to single-model outliers, and the ε floor guards against division by a near-zero median. Median RMSE is displayed as the absolute fit-quality column; read it alongside the dispersion score to distinguish "all models agree well" from "all models agree poorly." Sourced daily from regime_model_fits.
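As a rough illustration, the score and its filters can be sketched in a few lines of Python. This is not the production pipeline: the dictionary layout of a fit, the function names, the unscaled median-absolute-deviation definition of MAD, and the value of ε are all assumptions made for the example, since the page does not specify them.

```python
import math
from statistics import median

MIN_OPTIONS = 10      # per-fit filter from the methodology: n_options >= 10
MIN_VALID_FITS = 6    # a symbol needs at least 6 valid fits
EPSILON = 1e-8        # assumed value; the page does not state epsilon


def is_valid_fit(fit):
    """Per-fit filters: not a fallback, finite positive iv_rmse, enough options."""
    rmse = fit["iv_rmse"]
    return (
        not fit["is_fallback"]
        and isinstance(rmse, (int, float))
        and math.isfinite(rmse)
        and rmse > 0
        and fit["n_options"] >= MIN_OPTIONS
    )


def cross_model_score(fits):
    """Return (score, median_rmse) for one symbol, or None if too few valid fits."""
    rmses = [f["iv_rmse"] for f in fits if is_valid_fit(f)]
    if len(rmses) < MIN_VALID_FITS:
        return None
    med = median(rmses)
    # Assumed definition: unscaled median absolute deviation around the median.
    mad = median(abs(r - med) for r in rmses)
    return mad / max(med, EPSILON), med


# Hypothetical fits for one symbol: six valid, one fallback, one with too few options.
example_fits = [
    {"model": "bs", "iv_rmse": 0.01, "is_fallback": False, "n_options": 40},
    {"model": "heston", "iv_rmse": 0.02, "is_fallback": False, "n_options": 40},
    {"model": "sabr", "iv_rmse": 0.03, "is_fallback": False, "n_options": 40},
    {"model": "merton", "iv_rmse": 0.04, "is_fallback": False, "n_options": 40},
    {"model": "kou", "iv_rmse": 0.05, "is_fallback": False, "n_options": 40},
    {"model": "bates", "iv_rmse": 0.06, "is_fallback": False, "n_options": 40},
    {"model": "vg", "iv_rmse": 0.02, "is_fallback": True, "n_options": 40},
    {"model": "essvi", "iv_rmse": 0.02, "is_fallback": False, "n_options": 5},
]
example_result = cross_model_score(example_fits)
```

In this made-up example only six fits survive the filters, so the symbol just meets the inclusion threshold; dropping one more fit would make the function return None.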
Frequently Asked Questions
Why 8 models?
Each captures a different slice of market dynamics. Black-Scholes is the baseline. Heston adds stochastic vol. SABR parameterizes the smile. Merton and Kou add jumps. Bates combines Heston + jumps. Variance Gamma uses a subordinated Brownian motion for heavy tails. eSSVI parameterizes the surface itself. High agreement + good fit = clean surface. High agreement + poor fit = hard surface no model handles well. Disagreement = specific features some models capture and others miss.
Why MAD/median instead of standard deviation?
MAD is robust — unlike stddev, it is not inflated by one or two outlier fits. If 7 of 8 models fit similarly and one fails badly, stddev spikes but MAD barely moves, which matches the regime question we are asking.
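A small numeric check makes the robustness claim concrete. The iv_rmse values below are invented for illustration, and MAD is assumed to be the unscaled median absolute deviation.

```python
from statistics import median, pstdev


def mad(values):
    """Unscaled median absolute deviation around the median."""
    med = median(values)
    return median(abs(v - med) for v in values)


# Hypothetical iv_rmse values for 8 models that fit similarly.
clean = [0.010, 0.011, 0.012, 0.012, 0.013, 0.013, 0.014, 0.015]
# Same surface, but the last model fails badly.
one_failure = clean[:-1] + [0.150]

mad_clean, mad_failure = mad(clean), mad(one_failure)
std_clean, std_failure = pstdev(clean), pstdev(one_failure)
```

With these numbers the MAD is the same in both cases, because the failed fit only moves the largest deviation and leaves the middle of the sorted deviations untouched. The standard deviation, by contrast, grows by more than an order of magnitude.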
What does the best-fit model tell me?
It tells you which model currently explains the surface best. When jump models (Kou, Bates) fit best, the surface has jump-like features; when smooth models (BS, eSSVI) fit best, the surface is smoothly behaved. Changes in which model fits best over time are themselves a regime signal.
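One way to track this is sketched below, assuming "best fit" means the lowest iv_rmse among a symbol's valid fits (the page does not define it explicitly). The function names and the input layout are hypothetical.

```python
def best_fit_model(rmse_by_model):
    """Return the model name with the lowest iv_rmse (assumed meaning of 'best fit')."""
    return min(rmse_by_model, key=rmse_by_model.get)


def best_fit_changes(daily):
    """Given [(date, {model: iv_rmse})] in date order, list (date, old, new) switches."""
    changes = []
    previous = None
    for date, rmse_by_model in daily:
        current = best_fit_model(rmse_by_model)
        if previous is not None and current != previous:
            changes.append((date, previous, current))
        previous = current
    return changes


# Invented history for one symbol: heston fits best until kou takes over.
history = [
    ("2026-04-17", {"bs": 0.020, "heston": 0.015, "kou": 0.018}),
    ("2026-04-20", {"bs": 0.022, "heston": 0.016, "kou": 0.017}),
    ("2026-04-21", {"bs": 0.050, "heston": 0.030, "kou": 0.012}),
]
switches = best_fit_changes(history)
```

Here the only switch is on the last date, from heston to kou, which in the FAQ's terms would hint at jump-like features appearing in the surface.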
Why ~124 symbols?
The regime pipeline runs a full 8-model calibration per symbol, which is computationally expensive. We currently run it across a curated universe of ~124 names covering major single stocks, sector ETFs (XLF, XLY, XLE, etc.), and bond ETFs (TLT, IEF, LQD). Each symbol is categorized into one scope (bellwether / sector / fixed_income / etc.) in regime_daily, but the screener reads across all scopes so coverage is uniform. Expanding the universe is an ops decision, not a data-availability issue.