The Divergence Is the Signal
The previous post argued that Black-Scholes is the coordinate origin of model space: the zero-information baseline from which every other pricing model deviates.
This post makes the practical claim that follows from it: those deviations are not noise to be minimized. They are state variables to be measured.
If BSM is the zero-information price, then the gap between BSM and any calibrated alternative is a priced expression of what the market believes beyond no-arbitrage. Track that gap over time, across model pairs, and you have something no individual model can give you: a direct readout of the market's collective beliefs about the character of uncertainty itself.
The Number Everyone Ignores
Walk onto any trading desk and you'll hear two conversations happening in parallel.
The first is about prices. What's the bid on the 30-delta put? Where's the straddle? Is the skew rich or cheap?
The second is about models. Heston says the tail is underpriced. Local vol says the barrier's too cheap. The Monte Carlo disagrees with both.
These conversations never connect. Traders use one model's output as a price, then switch to another when the first one "doesn't feel right." The divergence between models gets treated as a calibration problem - a number to be minimized, an embarrassment to be papered over.
This is backwards.
The divergence is where the information lives.
What Divergence Actually Encodes
BSM is the maximum-entropy solution: the price you get when you assume continuous paths, constant volatility, and nothing else. A calibrated Heston model is the price you get when you add a specific belief: volatility is stochastic, with particular vol-of-vol, mean-reversion, and spot-vol correlation parameters. The difference between these two prices is the dollar expression of that belief.
This generalizes across model pairs, with an important caveat: the decomposition is not perfectly clean. Calibrated models absorb surface features into their parameters in overlapping ways. But each projection loads most heavily on a specific feature of market dynamics:
- BSM vs. Heston loads primarily on stochastic volatility: vol-of-vol risk, spot-vol correlation, and the surface curvature that BSM's flat-vol assumption cannot represent.
- BSM vs. Merton (or Kou) loads primarily on jump risk: the premium the market pays for discrete, sudden dislocations that continuous-path models structurally exclude.
- BSM vs. Variance Gamma loads primarily on tail heaviness and path asymmetry beyond what lognormality accommodates.
- Heston vs. Merton is the most diagnostic pairing. It isolates the market's view on how tail risk arrives. Grinding volatility expansion (Heston's world) and sudden discrete repricing (Merton's world) are different catastrophes with different hedge implications.
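To make one of these projections concrete, here is a minimal sketch of the BSM vs. Merton pairing for a European put. It uses the standard series form of Merton's jump-diffusion price with lognormal jumps. The function names and every input value (spot, strike, rate, vol, jump intensity `lam`, jump mean and standard deviation) are illustrative assumptions, not calibrated parameters. The sketch also keeps the diffusive vol identical in both models for simplicity, whereas a real calibration would fit Merton's parameters to the surface.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bsm_put(spot, strike, t, rate, vol):
    """Black-Scholes price of a European put (no dividends)."""
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * t) / (vol * math.sqrt(t))
    d2 = d1 - vol * math.sqrt(t)
    return strike * math.exp(-rate * t) * norm_cdf(-d2) - spot * norm_cdf(-d1)

def merton_put(spot, strike, t, rate, vol, lam, jump_mean, jump_std, n_terms=40):
    """Merton jump-diffusion put: a Poisson-weighted sum of BSM prices.

    Log jump sizes are assumed normal with mean jump_mean and std jump_std.
    The series is truncated at n_terms jumps.
    """
    k = math.exp(jump_mean + 0.5 * jump_std ** 2) - 1.0  # expected relative jump size
    lam_adj = lam * (1.0 + k)                             # jump-adjusted intensity
    price = 0.0
    for n in range(n_terms):
        weight = math.exp(-lam_adj * t) * (lam_adj * t) ** n / math.factorial(n)
        vol_n = math.sqrt(vol ** 2 + n * jump_std ** 2 / t)     # variance added by n jumps
        rate_n = rate - lam * k + n * math.log(1.0 + k) / t     # drift compensated for jumps
        price += weight * bsm_put(spot, strike, t, rate_n, vol_n)
    return price

# Hypothetical inputs: six-month put struck 20% below spot.
bsm = bsm_put(100.0, 80.0, 0.5, 0.03, 0.25)
merton = merton_put(100.0, 80.0, 0.5, 0.03, 0.25, lam=0.5, jump_mean=-0.10, jump_std=0.15)
print(f"BSM {bsm:.4f}  Merton {merton:.4f}  gap {merton - bsm:+.4f}")
```

With the jump intensity set to zero the series collapses to the BSM price, which is a handy sanity check: the gap exists only because of the jump belief that was added.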
Anatomy of a Divergence
Consider a long-dated, out-of-the-money put on a mega-cap equity, expiring in six months and struck 20% below spot:
| Model | Price | Divergence from BSM |
|---|---|---|
| BSM (ATM vol) | $4.10 | - |
| Heston | $5.95 | +45% |
| Merton Jump-Diffusion | $6.40 | +56% |
That gap is not merely model error. It is an accounting of what BSM leaves out: for Heston, the stochastic vol, vol-of-vol, and spot-vol correlation premia; for Merton, the jump premium.
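The percentages in the table are each model's dollar gap to BSM divided by the BSM price. A short sketch of that arithmetic, using the table's prices (the helper name `divergence_from_bsm` is made up for illustration):

```python
def divergence_from_bsm(prices, baseline="BSM (ATM vol)"):
    """Return {model: (dollar_gap, fractional_gap)} relative to the baseline price."""
    base = prices[baseline]
    return {
        name: (price - base, (price - base) / base)
        for name, price in prices.items()
        if name != baseline
    }

table = {"BSM (ATM vol)": 4.10, "Heston": 5.95, "Merton Jump-Diffusion": 6.40}
for name, (dollars, frac) in divergence_from_bsm(table).items():
    print(f"{name}: {dollars:+.2f} ({frac:+.0%})")
# Heston: +1.85 (+45%)
# Merton Jump-Diffusion: +2.30 (+56%)
```

On the same numbers, the Heston vs. Merton gap is $6.40 - $5.95 = $0.45.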
Divergence expanding means the market is injecting new information. Divergence compressing means the market is shedding beliefs. Divergence inverting - BSM pricing higher than a calibrated model - signals a market belief state calmer than the naïve assumption.
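These three readings can be written down as a small rule. The sketch below takes two consecutive observations of signed divergence (as a fraction of the BSM price) and labels the move. The function name, the tolerance `tol`, and the extra "stable" label for moves inside the tolerance are assumptions added for illustration.

```python
def classify_move(prev_div, curr_div, tol=0.01):
    """Label a change in signed divergence, measured as a fraction of the BSM price.

    prev_div, curr_div: (model_price - bsm_price) / bsm_price at two dates.
    tol: hypothetical dead band below which a change is treated as noise.
    """
    if curr_div < 0:
        return "inverted"      # BSM now prices above the calibrated model
    change = curr_div - prev_div
    if change > tol:
        return "expanding"     # the market is injecting new information
    if change < -tol:
        return "compressing"   # the market is shedding beliefs
    return "stable"

print(classify_move(0.45, 0.60))   # expanding
print(classify_move(0.45, 0.30))   # compressing
print(classify_move(0.10, -0.05))  # inverted
```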
Cross-Model Triangulation
Heston wide, Merton narrow: grinding-fear environment. Late-cycle macro deterioration, not flash crash.
Merton wide, Heston narrow: the market expects a sudden discrete event. Pre-earnings, pre-FOMC signature.
Both wide: genuine tail-risk regime. Late 2008, March 2020.
Both narrow: low-vol, low-skew, low-conviction calm.
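The four quadrants map onto a simple lookup. In this sketch, "wide" is decided by a single fixed threshold on divergence from BSM; that threshold value and the function name are assumptions, and in practice "wide" is probably better defined relative to each pairing's own history.

```python
def triangulate(heston_div, merton_div, wide=0.25):
    """Map two divergences from BSM (as fractions) onto the four regimes above.

    wide: hypothetical cutoff separating "wide" from "narrow".
    """
    heston_wide = heston_div >= wide
    merton_wide = merton_div >= wide
    if heston_wide and merton_wide:
        return "tail-risk regime"
    if heston_wide:
        return "grinding fear"
    if merton_wide:
        return "discrete event"
    return "calm"

print(triangulate(0.45, 0.10))  # grinding fear
print(triangulate(0.10, 0.56))  # discrete event
```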
The Operational Framework
Step 1: Price in BSM. Always.
Step 2: Price in at least two calibrated models.
Step 3: Compute the divergence profile.
Step 4: Read the divergence.
Step 5: Trade the divergence, not the story.
Step 6: Monitor for regime breaks.
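Steps 1 through 3 can be sketched as one function that prices each strike in every model and records the gap for every model pair, normalized by the BSM price. The function name, the `pricers` interface (a dict of callables taking a strike), and the constant stand-in pricers in the usage lines are all hypothetical; real pricers would be calibrated models.

```python
from itertools import combinations

def divergence_profile(pricers, strikes, baseline="BSM"):
    """For each strike, return pairwise price gaps as a fraction of the baseline price.

    pricers: dict mapping model name -> function(strike) -> price.
    """
    profile = {}
    for strike in strikes:
        prices = {name: price_fn(strike) for name, price_fn in pricers.items()}
        base = prices[baseline]
        profile[strike] = {
            f"{a} vs. {b}": (prices[b] - prices[a]) / base
            for a, b in combinations(prices, 2)
        }
    return profile

# Stand-in pricers that simply return the example table's prices.
pricers = {
    "BSM": lambda strike: 4.10,
    "Heston": lambda strike: 5.95,
    "Merton": lambda strike: 6.40,
}
for pair, gap in divergence_profile(pricers, [80.0])[80.0].items():
    print(f"{pair}: {gap:+.1%}")
```

Keeping every pair in the profile, not only the gaps to BSM, is what makes the Heston vs. Merton comparison available in Step 4.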
The Meta-Lesson
You don't need any single model to be "right." You need them to be different from each other in informative ways. The distance between them - denominated in dollars, tracked over time, decomposed across model pairs - tells you what the market believes about the regime it's operating in.
Model divergence is not a nuisance to minimize. It is a state variable. Monitor it like you monitor skew, term structure, or realized vol.
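One way to monitor divergence as a state variable is to compare the latest reading with its own recent history. The sketch below uses a plain rolling z-score. The function names, the window length, the threshold of 3, and the sample series are all assumptions for illustration, and a production monitor would likely want something more robust.

```python
import math
import statistics

def latest_zscore(history, window=20):
    """Z-score of the last observation against up to `window` preceding ones."""
    if window < 2 or len(history) < 3:
        raise ValueError("need window >= 2 and at least three observations")
    *past, latest = history
    recent = past[-window:]
    mean = statistics.fmean(recent)
    sd = statistics.stdev(recent)
    if sd == 0.0:
        # Flat history: any move at all is infinitely unusual by this measure.
        return 0.0 if latest == mean else math.copysign(math.inf, latest - mean)
    return (latest - mean) / sd

def is_regime_break(history, window=20, threshold=3.0):
    """Flag the latest divergence reading if it sits far outside recent history."""
    return abs(latest_zscore(history, window)) > threshold

# Made-up daily divergence readings (fraction of BSM price), ending in a jump.
series = [0.40, 0.42, 0.41, 0.39, 0.40, 0.41, 0.80]
print(is_regime_break(series))
```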
The divergence is the signal.