# Forecasts & accuracy
Probabilistic 30/60/90-day risk forecasts per AI vendor. We publish them here. We Brier-score them against subsequent reality. We publish that too.
A model that forecasts and never grades itself is a guru. A model that forecasts and grades itself in public is a forecaster. We are the latter.
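Brier scoring is just the mean squared difference between a forecast probability and the binary outcome (1 if the change happened, 0 if it did not). A minimal sketch, assuming a simplified resolved-forecast shape rather than our actual schema:

```typescript
// Brier score: mean squared error between forecast probabilities and
// binary outcomes. Lower is better; always guessing 50% scores 0.25.
interface Resolved {
  p: number;        // forecast probability, 0..1
  happened: boolean // did the predicted change land in the window?
}

function brier(resolved: Resolved[]): number {
  const sum = resolved.reduce(
    (acc, r) => acc + (r.p - (r.happened ? 1 : 0)) ** 2,
    0
  );
  return sum / resolved.length;
}
```

A perfect call (`p = 1`, happened) scores 0; a confident miss (`p = 0.9`, didn't happen) scores 0.81, which is why publishing the score keeps us honest.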
## Live forecasts (sorted by 30-day risk)
| Vendor | Band | 30d | 60d | 90d | Most likely | Severity |
|---|---|---|---|---|---|---|
| GitHub Copilot | on watch | 73% | 93% | 98% | billing-model-shift | major |
| ChatGPT | on watch | 67% | 89% | 96% | billing-model-shift | major |
| Cursor | clean | 59% | 83% | 93% | billing-model-shift | critical |
| Claude (Anthropic) | on watch | 31% | 52% | 67% | feature-gated | major |
| Claude Code | shrinking | 25% | 44% | 58% | tier-removed | critical |
## Public scoreboard

| Metric | Value |
|---|---|
| Forecasts published | 20 |
| Forecasts resolved | 0 (awaiting outcomes) |
| Brier score | — (needs ≥10 resolutions) |
| Calibration | — (needs ≥30 resolutions) |
We've just published our first 20 forecasts (one per tracked vendor). As prediction windows close, we'll grade each forecast against what actually happened. The board updates automatically. No cherry-picking. We will publish the wrong calls in red.
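Calibration, once we have enough resolutions, means bucketing forecasts by predicted probability and checking that each bucket's realized hit rate matches its mean prediction. A sketch under assumed names (this is not the live scoreboard code):

```typescript
// Calibration check: bucket resolved forecasts by predicted probability,
// then compare each bucket's mean prediction to its realized hit rate.
// A calibrated forecaster's ~70% bucket resolves "yes" ~70% of the time.
interface CalPoint {
  p: number;        // forecast probability, 0..1
  happened: boolean // outcome
}

function calibration(resolved: CalPoint[], buckets = 10) {
  const bins = Array.from({ length: buckets }, () => ({ n: 0, pSum: 0, hits: 0 }));
  for (const r of resolved) {
    const i = Math.min(buckets - 1, Math.floor(r.p * buckets));
    bins[i].n += 1;
    bins[i].pSum += r.p;
    bins[i].hits += r.happened ? 1 : 0;
  }
  return bins
    .filter(b => b.n > 0)
    .map(b => ({ predicted: b.pSum / b.n, observed: b.hits / b.n, n: b.n }));
}
```

With only a handful of resolutions each bucket is nearly empty and the observed rates are noise, which is why the board demands ≥30 resolutions before showing a calibration number.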
## How forecasts work
- Probability of any tracked change landing in the next 30/60/90 days, per vendor.
- v0 model: exponential rate from observed cadence, with a recency multiplier (×1.3 if the last receipt was within 30 days).
- Probabilities are capped at 95/97/98% for the 30/60/90-day windows, so we never claim certainty no matter how dense a vendor's cadence gets.
- Likely change kind = mode of observed kinds. Expected severity = severity-weighted average.
- v1 model (planned): gradient boost over funding events, exec hires, ToS commit velocity, GitHub repo activity, hiring spikes.
- Source code: lib/forecast.ts. PRs welcome.
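The v0 model above can be sketched in a few lines. The names, the receipt shape, and the minimum-data guard here are illustrative assumptions; the real implementation lives in `lib/forecast.ts`:

```typescript
// Sketch of the v0 model: estimate a daily change rate from the observed
// cadence of receipts, boost it when the last receipt is recent, then
// apply the exponential "at least one change in the window" formula
// with hard per-horizon caps.
interface Receipt { date: Date }

const DAY_MS = 86_400_000;
const CAPS: Record<number, number> = { 30: 0.95, 60: 0.97, 90: 0.98 };

function forecast(
  receipts: Receipt[],
  horizonDays: 30 | 60 | 90,
  now = new Date()
): number {
  if (receipts.length < 2) return 0; // not enough cadence data (assumed fallback)
  const times = receipts.map(r => r.date.getTime()).sort((a, b) => a - b);

  // Mean gap between observed changes, in days -> rate in changes/day.
  const spanDays = (times[times.length - 1] - times[0]) / DAY_MS;
  const meanGapDays = spanDays / (times.length - 1);
  let rate = 1 / meanGapDays;

  // Recency multiplier: 1.3 if the last receipt landed within 30 days.
  const daysSinceLast = (now.getTime() - times[times.length - 1]) / DAY_MS;
  if (daysSinceLast <= 30) rate *= 1.3;

  // Exponential model: P(at least one change within the horizon).
  const p = 1 - Math.exp(-rate * horizonDays);
  return Math.min(p, CAPS[horizonDays]);
}
```

For example, a vendor with receipts roughly every 30 days and a fresh receipt last week comes out around 73% at 30 days, which is the neighborhood the table above sits in for the most active vendors.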