Skill

monte-carlo-pricing-engine

March 30, 2026

quant-finance, ml, pattern, breakthrough

Trigger

User needs probabilistic pricing for an asset affected by geopolitical disruption scenarios

Version: 260428

Changelog

260428: reclassify pattern label

Reclassified self-improvement pattern P3+P5 (Metric Ratchet + Predictive Substrate) to P3 only per consolidated vault audit 2026-04-27.

260420: multiple edits

v_migrate: Changelog migrated from table to YYMMDD H3 format per versioning-standard rule 2 (V1.6 of skills upgrade plan)
v6: Added license, sources, skill_path per V6.1/V6.2 of skills upgrade plan.
v1.5: Added ## Quality Checks section per V1.5 of ~/vault/plans/2026-04-20-vault-skills-upgrade-plan.md

260403: Added Visual Enrichment section + self-improving-agent-patterns cross-reference

Use this when you need probabilistic pricing for an asset whose value is driven by discrete geopolitical disruption scenarios rather than continuous market drift. It solves the “regime-jump” pricing problem: standard Brownian motion models fail to capture the fat-tailed price behavior that occurs when a strait gets mined, sanctions are lifted, or OPEC cuts production. The model needs discrete events with explicit probability and duration distributions.

The engine runs 5,000 Monte Carlo paths using jump-diffusion: continuous Brownian drift punctuated by Poisson-arrival disruption events with Student-t(5) magnitude draws. An AR(1) error correction layer eliminates autocorrelation in forecast residuals. Sigmoid calibration maps raw probability estimates to calibrated threshold-exceedance probabilities. The 8-layer consensus pipeline refreshes parameters hourly using a tiered validation system (6 parallel sub-agents, each monitoring a geopolitical data source).
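The simulation and error-correction steps described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the function names, the daily time step, and every numeric default (`sigma`, `jump_rate`, `jump_scale`) are placeholder assumptions, and scaling a single Student-t draw by the event count is a simplification of summing independent jumps.

```python
import numpy as np

def simulate_jump_diffusion(base_price, n_paths=5_000, n_steps=30, dt=1 / 252,
                            mu=0.0, sigma=0.35, jump_rate=2.0,
                            jump_scale=0.08, t_dof=5, seed=0):
    """Simulate price paths under a jump-diffusion (placeholder parameters)."""
    rng = np.random.default_rng(seed)
    # Continuous part: Brownian log-return increments with the Ito drift correction.
    diffusion = ((mu - 0.5 * sigma**2) * dt
                 + sigma * np.sqrt(dt) * rng.standard_normal((n_paths, n_steps)))
    # Discrete part: Poisson count of disruption events arriving in each step.
    n_jumps = rng.poisson(jump_rate * dt, size=(n_paths, n_steps))
    # Fat-tailed jump magnitude: Student-t with t_dof degrees of freedom.
    jump_size = jump_scale * rng.standard_t(t_dof, size=(n_paths, n_steps))
    log_paths = np.cumsum(diffusion + n_jumps * jump_size, axis=1)
    # Prepend a zero log-return so column 0 equals base_price on every path.
    log_paths = np.hstack([np.zeros((n_paths, 1)), log_paths])
    return base_price * np.exp(log_paths)

def ar1_correct(raw_forecast, past_residuals):
    """Shift a forecast by the AR(1)-predicted next residual (actual - forecast)."""
    e = np.asarray(past_residuals, dtype=float)
    # Least-squares AR(1) coefficient without intercept: e_t ~ phi * e_{t-1}.
    phi = float(e[1:] @ e[:-1] / (e[:-1] @ e[:-1]))
    return raw_forecast + phi * e[-1], phi
```

For example, `simulate_jump_diffusion(90.0)` returns a 5,000 × 31 array whose first column is the base price, and `ar1_correct(100.0, residuals)` nudges a raw forecast in the direction the recent residuals have been persisting.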

Applied to Hormuz Strait disruption pricing in projects/oil/_index, the engine beat Polymarket’s market-implied probabilities by +9.93 Brier score points (R² rose from 0.9498 to 0.9752; MAE fell from $1.633 to $0.833, a 49% reduction). The sell model reached 100% accuracy using an 11-trigger conviction hold: P > 90%, edge > 25pp, daysLeft > 5.

The key architectural decision: 8 active parameters against 33 frozen ones (4.2:1 ratio). Unfreezing parameters without cause is the fastest way to overfit to recent history and break the model’s out-of-sample (OOS) performance.
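One way to enforce the frozen/active split is to reject any parameter update that touches a key outside the active set. The sketch below is an assumption about how such a guard might look; `apply_update` is a hypothetical helper, and only the 8 active parameter names come from this skill.

```python
# The 8 active parameter names listed in this skill; everything else is frozen.
ACTIVE_PARAMS = frozenset(
    {"pmCeil", "supEl", "demTh1", "insM", "mnPr", "akFr", "supPow", "ynMax"}
)

def apply_update(params, proposed):
    """Return a copy of params with proposed changes, refusing frozen keys."""
    frozen_touched = sorted(k for k in proposed if k not in ACTIVE_PARAMS)
    if frozen_touched:
        # Unfreezing is a structural change, so it should never happen silently.
        raise ValueError(f"refusing to modify frozen parameters: {frozen_touched}")
    return {**params, **proposed}
```

Raising instead of silently dropping the frozen keys makes an accidental unfreeze visible at the point where it is attempted.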

Interface

Trigger: Probabilistic pricing under geopolitical uncertainty with discrete disruption scenarios.

Inputs:

base_price: current market price (anchor for relative disruption sizing)
disruption_parameters: 8 active parameters: pmCeil, supEl, demTh1, insM, mnPr, akFr, supPow, ynMax
duration_probabilities: scenario duration weight distribution (short/medium/extended/prolonged)
historical_analogues: past disruption events for calibrating jump magnitude and frequency

Outputs:

price_distribution: 5,000-path MC distribution with 10th/50th/90th percentile bands
calibrated_probabilities: sigmoid-calibrated threshold-exceedance probabilities for key price levels
brier_score: vs Polymarket market-implied probabilities (tracks model edge)
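The three outputs could be derived from the simulated terminal prices roughly as shown below. This is a sketch under assumptions: the function names are illustrative, the calibration is written as a Platt-style sigmoid on the logit of the raw probability, and the coefficients `a` and `b` are hypothetical fitted values (the defaults leave the probability unchanged).

```python
import numpy as np

def percentile_bands(terminal_prices):
    """10th/50th/90th percentile bands of the terminal price distribution."""
    p10, p50, p90 = np.percentile(terminal_prices, [10, 50, 90])
    return {"p10": float(p10), "p50": float(p50), "p90": float(p90)}

def exceedance_probability(terminal_prices, threshold):
    """Raw probability estimate: share of paths ending above the threshold."""
    return float(np.mean(np.asarray(terminal_prices) > threshold))

def sigmoid_calibrate(raw_prob, a=1.0, b=0.0):
    """Map a raw probability through sigmoid(a * logit(p) + b)."""
    # Clip away from 0 and 1 so the logit stays finite.
    p = np.clip(np.asarray(raw_prob, dtype=float), 1e-9, 1 - 1e-9)
    z = a * np.log(p / (1 - p)) + b
    return 1.0 / (1.0 + np.exp(-z))

def brier_score(probs, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    probs = np.asarray(probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    return float(np.mean((probs - outcomes) ** 2))
```

Model edge can then be tracked by computing `brier_score` once for the calibrated model probabilities and once for the market-implied probabilities on the same resolved outcomes; lower is better.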

Provenance

5 major model iterations (v13-v17.2) across experiments/oil/2026-03-16-v13-tail-risk-recalibration, experiments/oil/2026-03-17-v14-ar-error-correction, experiments/oil/2026-03-17-v14b-sigmoid-calibration, experiments/oil/2026-03-18-v16-sell-model-deesc-tuning, and experiments/oil/2026-03-18-v17-realtime-consensus-pipeline. Each iteration addressed a specific failure mode: v13 corrected tail risk underestimation, v14 eliminated autocorrelated residuals, v14b fixed probability calibration bias, v16 built the sell signal, v17 automated parameter refresh.

Related patterns: skills/karpathy-ratchet (the ratchet methodology used across all 5 iterations), skills/self-improving-agent-patterns (Pattern 3: Metric Ratchet). Forecasting is the product, not meta-prediction of optimization yield, so the label was reclassified from P3+P5 to P3 per the consolidated vault audit of 2026-04-27.

Usage Notes

Do not unfreeze parameters without cause. The 4.2:1 frozen-to-active ratio protects OOS performance. Unfreezing is a structural change, not a tuning step.
Hourly refresh runs via 6 parallel sub-agents, each monitoring one geopolitical data source tier.
Sell model conviction hold: P > 90%, edge > 25pp, daysLeft > 5. All three conditions must hold simultaneously.
Portfolio circuit breaker at -15% total P&L. Hard stop, no override.
The geopolitical consensus pipeline uses tier-based validation: Tier 1 sources (official government announcements) override Tier 2-3 (news, market signals).
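The conviction hold and the circuit breaker from the notes above reduce to two small predicates. The sketch below assumes `p` is a probability in [0, 1], `edge_pp` is the model-minus-market gap in percentage points, and that the breaker trips at exactly -15% as well as below it; these conventions and the function names are assumptions, not taken from the sell model's code.

```python
def conviction_hold(p, edge_pp, days_left):
    """True only when all three hold conditions are met simultaneously."""
    return p > 0.90 and edge_pp > 25 and days_left > 5

def circuit_breaker_tripped(total_pnl_pct, limit_pct=-15.0):
    """Hard stop once total portfolio P&L reaches the limit (no override path)."""
    return total_pnl_pct <= limit_pct
```

Because the thresholds are strict inequalities, a position with P of exactly 90% or an edge of exactly 25pp does not qualify for the hold.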

Quality Checks

≥ 10,000 paths simulated. Configurable floor; below 10k the CI widens unacceptably. grep 'n_paths' config.yaml shows the value.
No NaN/Inf in simulated prices. Output array passes is.finite() for all elements; a single NaN signals an upstream div-by-zero.
Jump-diffusion parameters in documented ranges. λ (jump intensity), μ_J (jump mean), σ_J (jump stddev) within the literature-backed bounds; document the band.
Scenario probabilities sum to 1.0 ±ε. Geopolitical scenario weights must be a valid probability distribution. Assert abs(sum - 1.0) < 1e-6.
Backtest MAPE < 3%. On the 12-month holdout, the model must beat this threshold; otherwise flag the drift.
R² > 0.9 on holdout. Checked alongside MAPE; both gate model promotion.
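Most of these checks can be automated as a single gate. The sketch below is illustrative only: `run_quality_checks` and its argument names are hypothetical, and the jump-parameter range check is left out because the bounds are not documented here.

```python
import numpy as np

def run_quality_checks(prices, scenario_weights, actual, predicted,
                       min_paths=10_000):
    """Return a dict of pass/fail flags for the checks listed above."""
    prices = np.asarray(prices, dtype=float)        # shape: (n_paths, n_steps)
    actual = np.asarray(actual, dtype=float)        # holdout observations
    predicted = np.asarray(predicted, dtype=float)  # holdout forecasts
    # Mean absolute percentage error on the holdout, in percent.
    mape = float(np.mean(np.abs((actual - predicted) / actual)) * 100)
    # Coefficient of determination on the holdout.
    ss_res = float(np.sum((actual - predicted) ** 2))
    ss_tot = float(np.sum((actual - actual.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {
        "enough_paths": prices.shape[0] >= min_paths,
        "all_finite": bool(np.isfinite(prices).all()),
        "weights_sum_to_one": abs(float(np.sum(scenario_weights)) - 1.0) < 1e-6,
        "mape_below_3pct": mape < 3.0,
        "r2_above_0_9": r2 > 0.9,
    }
```

A promotion gate would then require `all(checks.values())` before a new model version replaces the current one.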

Visual Enrichment

When this skill produces output that benefits from visualization:

| Finding Type | Tool | Specification |
| --- | --- | --- |
| MC price distributions | R viz (skills/r-visualization-pipeline) | Family: `DST`, Template: Journal |
| Price forecast with confidence intervals | R viz (skills/r-visualization-pipeline) | Family: `TS`, Template: Journal |
| Calibration scatter + trend | R viz (skills/r-visualization-pipeline) | Family: `COR`, Template: Journal |

See topics/visual-output-routing for the full routing decision framework.

Self-improvement context: This skill relates to Pattern 3 (Metric Ratchet) from skills/self-improving-agent-patterns. 7 model versions ratcheted. Forecasting is the product (not meta-prediction of optimization yield); the P5 label was removed per audit 2026-04-27.