monte-carlo-pricing-engine
User needs probabilistic pricing for an asset affected by geopolitical disruption scenarios
Changelog
260428: reclassify pattern label
- Reclassified self-improvement pattern P3+P5 (Metric Ratchet + Predictive Substrate) to P3 only per consolidated vault audit 2026-04-27.
260420: multiple edits
- v_migrate: Changelog migrated from table to YYMMDD H3 format per versioning-standard rule 2 (V1.6 of skills upgrade plan)
- v6: Added license, sources, skill_path per V6.1/V6.2 of skills upgrade plan.
- v1.5: Added `## Quality Checks` section per V1.5 of ~/vault/plans/2026-04-20-vault-skills-upgrade-plan.md
260403: Added Visual Enrichment section + self-improving-agent-patterns cross-reference
260331: Initial creation
Description
Use this when you need probabilistic pricing for an asset whose value is driven by discrete geopolitical disruption scenarios rather than continuous market drift. It addresses the “regime-jump” pricing problem: standard Brownian motion models fail to capture the fat-tailed price behavior that occurs when a strait gets mined, sanctions are lifted, or OPEC cuts production. The model needs discrete events with explicit probability and duration distributions.
The engine runs 5,000 Monte Carlo paths using jump-diffusion: continuous Brownian drift punctuated by Poisson-arrival disruption events with Student-t(5) magnitude draws. An AR(1) error correction layer eliminates autocorrelation in forecast residuals. Sigmoid calibration maps raw probability estimates to calibrated threshold-exceedance probabilities. The 8-layer consensus pipeline refreshes parameters hourly using a tiered validation system (6 parallel sub-agents, each monitoring a geopolitical data source).
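The jump-diffusion step described above can be sketched in plain Python. This is a minimal illustration, not the engine's implementation: the function names and every default (`sigma`, `jump_rate`, `jump_scale`, `n_steps`, `dt`) are assumed values for demonstration only. The 5,000 paths and Student-t(5) jump draws come from the description.

```python
import math
import random


def student_t(rng, df):
    """Draw from Student-t(df) as a normal divided by sqrt(chi-square / df)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) == Gamma(df/2, scale=2)
    return z / math.sqrt(chi2 / df)


def poisson(rng, lam):
    """Knuth's multiplication method; adequate for the small per-step rates used here."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def simulate_paths(base_price, n_paths=5000, n_steps=30, dt=1.0 / 252.0,
                   mu=0.0, sigma=0.35, jump_rate=2.0, jump_scale=0.08,
                   t_df=5, seed=0):
    """Return terminal prices; all numeric defaults are illustrative assumptions."""
    rng = random.Random(seed)
    terminal = []
    for _ in range(n_paths):
        log_price = math.log(base_price)
        for _ in range(n_steps):
            # Continuous Brownian drift and diffusion over one step.
            log_price += (mu - 0.5 * sigma ** 2) * dt
            log_price += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            # Poisson-arrival disruption events with fat-tailed magnitudes.
            for _ in range(poisson(rng, jump_rate * dt)):
                log_price += jump_scale * student_t(rng, t_df)
        terminal.append(math.exp(log_price))
    return terminal


if __name__ == "__main__":
    prices = simulate_paths(base_price=100.0)
    print(len(prices), min(prices), max(prices))
```

Working in log-price keeps every simulated price strictly positive, and a fixed seed makes a run reproducible. A production version would likely vectorize the inner loops.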
Applied to Hormuz Strait disruption pricing in projects/oil/_index, the engine beat Polymarket’s market-implied probabilities by +9.93 Brier score points (R² 0.9498 to 0.9752; MAE $1.633 to $0.833, a 49% reduction). The sell model reached 100% accuracy using an 11-trigger conviction hold: P > 90%, edge > 25pp, daysLeft > 5.
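For reference, a Brier score comparison against market-implied probabilities can be computed as below. This is a hedged sketch: the probabilities and outcomes are made-up illustration values, not the project's data, and the source does not state how its "+9.93 points" figure is scaled, so the raw 0-to-1 difference is shown.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes (lower is better)."""
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must be non-empty and equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical forecasts for four resolved threshold questions.
model_probs = [0.9, 0.2, 0.7, 0.1]
market_probs = [0.6, 0.4, 0.5, 0.3]
resolved = [1, 0, 1, 0]

# Positive edge means the model's Brier score is lower (better) than the market's.
brier_edge = brier_score(market_probs, resolved) - brier_score(model_probs, resolved)
```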
The key architectural decision: 8 active parameters against 33 frozen ones (4.2 ratio). Unfreezing parameters without cause is the fastest way to overfit to recent history and break the model’s OOS performance.
Interface
Trigger: Probabilistic pricing under geopolitical uncertainty with discrete disruption scenarios.
Inputs:
- `base_price`: current market price (anchor for relative disruption sizing)
- `disruption_parameters`: the 8 active parameters (pmCeil, supEl, demTh1, insM, mnPr, akFr, supPow, ynMax)
- `duration_probabilities`: scenario duration weight distribution (short/medium/extended/prolonged)
- `historical_analogues`: past disruption events for calibrating jump magnitude and frequency
Outputs:
- `price_distribution`: 5,000-path MC distribution with 10th/50th/90th percentile bands
- `calibrated_probabilities`: sigmoid-calibrated threshold-exceedance probabilities for key price levels
- `brier_score`: vs Polymarket market-implied probabilities (tracks model edge)
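The first two outputs can be derived from a list of simulated terminal prices roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and the calibration form (a logistic function applied to the logit of the raw probability, with fitted `slope` and `intercept`) is one common way to do sigmoid calibration. The source does not specify the exact form used.

```python
import math
import statistics


def percentile_bands(terminal_prices):
    """10th/50th/90th percentile bands from simulated terminal prices."""
    deciles = statistics.quantiles(terminal_prices, n=10)  # 9 cut points
    return {"p10": deciles[0], "p50": deciles[4], "p90": deciles[8]}


def exceedance_probability(terminal_prices, threshold):
    """Raw fraction of paths that finish above the threshold."""
    hits = sum(1 for price in terminal_prices if price > threshold)
    return hits / len(terminal_prices)


def sigmoid_calibrate(raw_prob, slope, intercept, eps=1e-6):
    """Map a raw probability through a fitted sigmoid (assumed Platt-style form)."""
    clipped = min(max(raw_prob, eps), 1.0 - eps)  # keep the logit finite
    logit = math.log(clipped / (1.0 - clipped))
    return 1.0 / (1.0 + math.exp(-(slope * logit + intercept)))
```

With `slope=1` and `intercept=0` the mapping is the identity, which makes a convenient sanity check before fitting real coefficients.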
Provenance
5 major model iterations (v13-v17.2) across experiments/oil/2026-03-16-v13-tail-risk-recalibration, experiments/oil/2026-03-17-v14-ar-error-correction, experiments/oil/2026-03-17-v14b-sigmoid-calibration, experiments/oil/2026-03-18-v16-sell-model-deesc-tuning, and experiments/oil/2026-03-18-v17-realtime-consensus-pipeline. Each iteration addressed a specific failure mode: v13 corrected tail risk underestimation, v14 eliminated autocorrelated residuals, v14b fixed probability calibration bias, v16 built the sell signal, v17 automated parameter refresh.
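As an illustration of the kind of correction v14 introduced, an AR(1) error-correction step can be sketched as below. It assumes residuals are defined as actual minus forecast and estimates the lag-1 coefficient by least squares through the origin. The function names are hypothetical and the actual v14 estimator may differ.

```python
def estimate_ar1_coefficient(residuals):
    """Least-squares estimate of phi in e_t = phi * e_{t-1} + noise."""
    previous = residuals[:-1]
    current = residuals[1:]
    denominator = sum(e * e for e in previous)
    if denominator == 0.0:
        return 0.0  # no usable signal in the residual history
    return sum(c * p for c, p in zip(current, previous)) / denominator


def corrected_forecast(raw_forecast, last_residual, phi):
    """Shift the next forecast by the portion of the last error expected to persist."""
    return raw_forecast + phi * last_residual
```

If the previous forecast was too low (positive residual) and `phi` is positive, the next forecast is nudged upward, which is what removes the lag-1 autocorrelation from the residual series.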
Related patterns: skills/karpathy-ratchet (the ratchet methodology used across all 5 iterations), skills/self-improving-agent-patterns (Pattern 3: Metric Ratchet). Forecasting is the product, not meta-prediction of optimization yield, so the skill was reclassified from P3+P5 to P3 per the consolidated vault audit of 2026-04-27.
Usage Notes
- Do not unfreeze parameters without cause. The 4.2 frozen-to-active ratio protects OOS performance. Unfreezing is a structural change, not a tuning step.
- Hourly refresh runs via 6 parallel sub-agents, each monitoring one geopolitical data source tier.
- Sell model conviction hold: P > 90%, edge > 25pp, daysLeft > 5. All three conditions must hold simultaneously.
- Portfolio circuit breaker at -15% total P&L. Hard stop, no override.
- The geopolitical consensus pipeline uses tier-based validation: Tier 1 sources (official government announcements) override Tier 2-3 (news, market signals).
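The three decision rules in these notes can be expressed as small guard functions. This is a minimal sketch with hypothetical names. It assumes probabilities are given on a 0-to-1 scale, edge in percentage points, P&L in percent, and that the circuit breaker trips at or below the limit.

```python
from typing import NamedTuple


class Signal(NamedTuple):
    tier: int      # 1 = official government announcement, 2-3 = news, market signals
    source: str
    value: float


def conviction_hold(prob, edge_pp, days_left):
    """All three conditions must hold simultaneously."""
    return prob > 0.90 and edge_pp > 25.0 and days_left > 5


def circuit_breaker_tripped(total_pnl_pct, limit_pct=-15.0):
    """Hard stop once total portfolio P&L reaches the limit; no override path."""
    return total_pnl_pct <= limit_pct


def resolve_by_tier(signals):
    """Keep only signals from the highest-priority (lowest-numbered) tier present."""
    if not signals:
        raise ValueError("at least one signal is required")
    best_tier = min(signal.tier for signal in signals)
    return [signal for signal in signals if signal.tier == best_tier]
```

For example, `conviction_hold(0.93, 30.0, 4)` returns `False` because the days-left condition fails even though the other two pass.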
Quality Checks
- ≥ 10,000 paths simulated. Configurable floor; below 10k the CI widens unacceptably. `grep 'n_paths' config.yaml` shows the value.
- No NaN/Inf in simulated prices. Output array passes `is.finite()` for all elements: a single NaN signals an upstream div-by-zero.
- Jump-diffusion parameters in documented ranges. λ (jump intensity), μ_J (jump mean), σ_J (jump stddev) within the literature-backed bounds; document the band.
- Scenario probabilities sum to 1.0 ±ε. Geopolitical scenario weights must be a valid probability distribution. Assert `abs(sum - 1.0) < 1e-6`.
- Backtest MAPE < 3%. On the 12-month holdout, model beats this threshold or flag the drift.
- R² > 0.9 on holdout. Along with MAPE; both gate a model promotion.
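Several of these checks lend themselves to automated assertions. The sketch below uses hypothetical function names and treats MAPE as a fraction, so 3% is `0.03`. The thresholds are taken from the checklist above.

```python
import math


def all_finite(prices):
    """True when no simulated price is NaN or infinite."""
    return all(math.isfinite(price) for price in prices)


def is_valid_distribution(weights, eps=1e-6):
    """Scenario weights must be non-negative and sum to 1.0 within eps."""
    return all(w >= 0.0 for w in weights) and abs(sum(weights) - 1.0) < eps


def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def r_squared(actual, forecast):
    """Coefficient of determination on the holdout set."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def passes_promotion_gate(actual, forecast, max_mape=0.03, min_r2=0.9):
    """Both the MAPE and R² thresholds must pass before a model is promoted."""
    return mape(actual, forecast) < max_mape and r_squared(actual, forecast) > min_r2
```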
Visual Enrichment
When this skill produces output that benefits from visualization:
| Finding Type | Tool | Specification |
|---|---|---|
| MC price distributions | R viz (skills/r-visualization-pipeline) | Family: DST, Template: Journal |
| Price forecast with confidence intervals | R viz (skills/r-visualization-pipeline) | Family: TS, Template: Journal |
| Calibration scatter + trend | R viz (skills/r-visualization-pipeline) | Family: COR, Template: Journal |
See topics/visual-output-routing for the full routing decision framework.
Self-improvement context: This skill relates to Pattern 3 (Metric Ratchet) from skills/self-improving-agent-patterns. 7 model versions ratcheted. Forecasting is the product (not meta-prediction of optimization yield), so the P5 label was removed per the audit of 2026-04-27.