measured study · 2026-07-02

The scaling law: how many days until a true twin?

The look-alike finder’s promise is a historical day that looks like today’s. So how big does the history have to be before everyday has a near-perfect twin? We measured it — on the finder’s own distance and its own 0–100 score — and the answer comes down to one idea: how many independent ways a trading day can differ.

Days differ in ~15 independent ways

Think of every knob a session can turn independently of the others: which way it gaps, how hard the open drives, whether the first pullback holds, what lunch does, whether the afternoon extends or reverses, where it closes in its range, how wide the bars run, where the wicks cluster… If days only varied in two or three ways, near-clones would be everywhere. Measured from how fast best-match quality improves as the pool grows, a full RTH session behaves like it has about 15 independent knobs — its effective dimension. A twin has to agree with today on all of them at once, which is why each doubling of data buys a smaller step: variety multiplies, matches only add.

Solid = measured (200 seeded query days, nested pool subsets, the engine’s exact distance and score). Dashed = the fitted power law extended — a prediction, not a measurement.

What the curve says

Today’s pool (180,591 full sessions, 117 symbols): the median day’s best twin scores 82/100, and 26% of days find a 90+ twin.
At 2.28M days — the census-counted, gate-passing universe of US common stocks — the median best twin is predicted near 87, with 90+ coverage roughly doubling.
A true 1:1 copy of every full day is unreachable by data volume alone: at ~15 effective dimensions, each halving of the remaining gap costs about 2¹⁵≈ 33,000× more data. No corpus that will ever exist closes that by brute force.
Shorter windows change the game: the first two hours behave like ~13 dimensions and improve faster per doubling; a single breakout leg is lower-dimensional still. 1:1 mornings and 1:1 phases are findable where 1:1 full days are not — they just need a yardstick calibrated to their own scale, because a per-bar score undersells how identical two short windows look.

What we do with this

Three levers, in order of what the curve says they’re worth: grow the pool to the full gated universe (the biggest single step available); align smarter, so a day that printed the same path on a different clock counts as the clone it visually is; and match at the scale where clones actually live — mornings and phases, each scored against its own calibration. And when a day genuinely has no twin, the finder says so. “Nothing in millions of days looks like this” is information, not failure.

Method: engine-native distances (z-normalized multi-channel banded DTW, the matcher’s weights) over nested seeded pool subsets; medians over 200 query days; effective dimension from the log-log slope of best-twin distance vs pool size. Reports: twin_quality_curve.json, twin_quality_curve_24.json. Extrapolations are labeled predictions and will be re-measured as the pool actually grows.