research narrative · 2026-07-03

Shape math: how a chart read becomes a number

Everything the look-alike engine does leans on one move: turn the thing a price-action trader sees on an intraday chart into a number, so ~167,000 historical sessions can be searched for precedent. This page is the story of that math — what ships today, what’s landing now, and what comes after — with every method tied back to a read you already make on the tape.

The bet: a chart read is geometry

When you read a 5-minute chart you’re doing geometry. The drive off the open, the first pullback that holds or doesn’t, the wedge into lunch, the afternoon that never comes back — each of those reads is a statement about the shape of a path. The site’s core bet is that this is measurable: a chart read is geometry, geometry can be measured, and measured shapes can be searched. Get the measurement right and the archive stops being a pile of old charts and becomes an index of precedent. (How far search alone can carry it is itself measured — the scaling law: days vary in ~15 independent ways, so every doubling of data buys a smaller step.)

The honest part comes first, because it frames everything else. The engine is a visual-similarity tool, not a predictor. The site’s own 5-fold cross-validation showed the morning does not predict the afternoon — and the engine’s spec says so in its second sentence instead of burying it. When a match card shows what the closest days did next, that’s a labeled picture of the past, shown beside an any-day baseline so you can see when the matched days are doing nothing a random day wouldn’t. Every method below inherits that framing: instruments for describing the tape, not oracles.

What ships today: twins in warped time · live today

Two days almost never make the same move at the same minute. Yesterday’s twin printed its pullback at 10:50 where today’s came at 11:00; compared bar-for-bar they read as strangers, while your eye calls them the same day. Dynamic time warping (DTW) fixes exactly this: it lets the two paths stretch a little in time so the pullbacks line up, then scores whatever difference remains. The stretch is capped at a ±4-bar band — about 20 minutes of timing slack on 5-minute bars — so a pullback can slide, but a morning can’t masquerade as an afternoon.
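The band is easiest to see in code. What follows is a minimal single-channel sketch, not the engine's implementation: the shipping distance is multi-channel and weighted, and the names `znorm` and `banded_dtw`, the root-mean-square scaling, and the toy dips are all assumptions made for illustration. Its output is not on the engine's ~0.17 / ~0.24 scale.

```python
import numpy as np

def znorm(x):
    """Rescale a path to zero mean and unit variance so only its shape remains."""
    x = np.asarray(x, dtype=float)
    sd = x.std()
    return (x - x.mean()) / sd if sd > 0 else x - x.mean()

def banded_dtw(a, b, band=4):
    """DTW distance where bar i may pair only with bars i-band..i+band of the other day."""
    a, b = znorm(a), znorm(b)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        # Cells outside the band stay at infinity, so the path cannot use them.
        for j in range(max(1, i - band), min(m, i + band) + 1):
            step = (a[i - 1] - b[j - 1]) ** 2
            cost[i, j] = step + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Root-mean-square cost per bar: an illustrative scale, not the engine's.
    return float(np.sqrt(cost[n, m] / max(n, m)))

# Toy 78-bar sessions: the same pullback at bar 16 (10:50) and bar 18 (11:00).
bars = np.arange(78)
day = -np.exp(-((bars - 16) / 3.0) ** 2)
twin = -np.exp(-((bars - 18) / 3.0) ** 2)

lockstep = banded_dtw(day, twin, band=0)  # bar-for-bar comparison
warped = banded_dtw(day, twin, band=4)    # up to 20 minutes of timing slack
print(f"lockstep={lockstep:.3f}  warped={warped:.3f}")
```

With `band=0` the two toy days read as strangers; with `band=4` the two-bar slide is absorbed and the distance collapses toward zero. A pullback that arrived more than four bars late would stay outside the band and keep its cost.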

All of it collapses to one distance. On its scale, ~0.17 reads as a tight twin and ~0.24 as a generic shape hundreds of days share; when today is the latter, the engine says “generic morning” out loud and caps its own grades. The pool it searches: 104 tickers and ~167k sessions (indexes, mega-caps, sector ETFs, FX majors, crypto), with every day sliced to the same 78-bar regular-hours window so the comparison is apples to apples.

Shape stays the oracle; three opt-in lenses re-rank its shortlist with context a path can’t carry. Structure: do both days print the same named Brooks structures (wedge, head & shoulders, opening reversal, climax)? Levels: z-norm deliberately erases location, so this lens restores the memory you carry between sessions (yesterday’s high/low/close, the week’s open, unfilled gaps, all in ATR units from today’s open) and favors twins that started the day in the same neighborhood. Volume: did participation arrive the same way, an opening surge vs a dead tape? Each lens is a nudge on top of the pinned distance, not a new oracle, and each is checked against a frozen tail-agreement rig rather than trusted on vibes.

The map: what your eye sees → what the math measures

Everything on this page in one table. Each row is a read you already make with your eye, and the measurement that turns it into a number you can search, rank, or filter on.

What your eye sees | What the math measures | Status
These two days look the same | Banded multi-channel DTW distance (the shipping engine) | live today
Twins that never really diverged | Fréchet distance: the worst point on the path | landing now
Same day, except one spike | LCSS / EDR: matching that skips bad prints | landing now
Same move, shifted in time | SBD: best whole-path alignment by cross-correlation | landing now
THE opening-reversal day | Soft-DTW barycenter: align first, then average | live today
The market keeps drawing this shape | Matrix profile motifs | live today
Tape like nothing I’ve seen | Matrix profile discords | live today
The day changed character right there | FLUSS: finding the seam in the session | live today
Trend day vs range day | Hurst exponent / DFA | live today
Orderly grind vs chop | Permutation entropy | live today
Stretches keep snapping back | OU half-life: minutes for a stretch to decay | live today
Price is being accepted at the level | RQA laminarity: time spent stuck | live today
P-day vs b-day volume profile | Wasserstein distance between volume-at-price histograms | live today
A day’s fingerprint on a card | Path signature: the path as a fixed vector | roadmap
Which timescale is doing the work | Wavelets / EMD: energy by timescale | roadmap
Grep the archive for a chart | SAX symbols + learned shapelets | roadmap
The regime just changed | HMM day-states / BOCPD change-point probability | roadmap
These days rhyme, don’t ask why | Compression distance (NCD) | roadmap
A double top that won’t die | Persistent homology: long-lived loops | roadmap

Status chips are edited by hand as work lands: live today is serving on the site, landing now is being built and benchmarked in parallel with this page, roadmap is not started.

Stricter twins, different questions · landing now

What your eye sees. When you call two days “the same day,” you often mean something stricter than DTW checks: they never stopped agreeing — no ten-minute stretch where one panicked and the other didn’t. Other times you mean something looser: “same day except for that one news spike,” or “the same move, it just started an hour late.” Those are three different questions, and one distance can’t answer all of them.

What the math measures. Four metrics, four definitions of “same.” Fréchet distance is the dog-leash test: walk both paths start to finish and record the longest leash you ever need — the score is the worst point, so a twin that diverged once, anywhere, is caught. LCSS asks how much of the two days can be paired up within a tolerance and simply skips what can’t — one bad print doesn’t poison the match. EDR counts how many bars you’d have to patch to turn one day into the other, an edit distance for tape. SBD slides one whole path against the other (cross-correlation) and scores the best alignment — pure time-shift matching, no local warping at all.
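The four definitions above are short enough to write out. This is a hedged sketch on plain value sequences, not the code being benchmarked: the function names, the tolerance `eps=0.25`, the time window `delta=4`, and the toy paths are assumptions chosen for illustration.

```python
import numpy as np

def frechet(a, b):
    """Discrete Fréchet distance: the longest leash needed on the best joint walk."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    n, m = len(a), len(b)
    gap = np.abs(a[:, None] - b[None, :])
    leash = np.full((n, m), np.inf)
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                best_prev = 0.0
            else:
                best_prev = min(
                    leash[i - 1, j] if i > 0 else np.inf,
                    leash[i, j - 1] if j > 0 else np.inf,
                    leash[i - 1, j - 1] if i > 0 and j > 0 else np.inf,
                )
            leash[i, j] = max(best_prev, gap[i, j])
    return float(leash[-1, -1])

def lcss_similarity(a, b, eps=0.25, delta=4):
    """Share of bars that pair up within eps in value and delta bars in time."""
    n, m = len(a), len(b)
    table = np.zeros((n + 1, m + 1), dtype=int)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps and abs(i - j) <= delta:
                table[i, j] = table[i - 1, j - 1] + 1
            else:
                table[i, j] = max(table[i - 1, j], table[i, j - 1])
    return table[n, m] / min(n, m)

def edr(a, b, eps=0.25):
    """Edit distance on real sequences: bars to insert, delete, or replace."""
    n, m = len(a), len(b)
    table = np.zeros((n + 1, m + 1), dtype=int)
    table[:, 0] = np.arange(n + 1)
    table[0, :] = np.arange(m + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            patch = 0 if abs(a[i - 1] - b[j - 1]) <= eps else 1
            table[i, j] = min(table[i - 1, j - 1] + patch,
                              table[i - 1, j] + 1, table[i, j - 1] + 1)
    return int(table[n, m])

def sbd(a, b):
    """Shape-based distance: 1 minus the best normalized cross-correlation over all shifts."""
    a = np.asarray(a, float) - np.mean(a)
    b = np.asarray(b, float) - np.mean(b)
    cross = np.correlate(a, b, mode="full")
    return float(1.0 - cross.max() / (np.linalg.norm(a) * np.linalg.norm(b)))

bars = np.arange(78)
base = np.sin(2 * np.pi * bars / 78)
spiked = base.copy()
spiked[40] += 3.0                               # one bad print
dip = -np.exp(-((bars - 20) / 3.0) ** 2)
late_dip = -np.exp(-((bars - 32) / 3.0) ** 2)   # same move, an hour late

print(frechet(base, spiked), lcss_similarity(base, spiked), edr(base, spiked))
print(sbd(dip, late_dip))
```

On the toy data the single spike dominates the Fréchet score, while LCSS still pairs 77 of 78 bars and EDR reports one patch. SBD scores the hour-late dip as nearly identical, which is the question the ±4-bar band refuses to ask.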

How it applies here. These run as second opinions on the DTW shortlist. A match that scores well on DTW and Fréchet is a certified never-diverged twin — a stronger claim than DTW alone can make. LCSS and EDR rescue twins that one spike would have cost. And SBD catches “the same open, an hour late,” which the ±4-bar band deliberately refuses — a different question the engine currently can’t ask.

The canonical curve of a family · live today

What your eye sees. You carry an idealized template of each day type in your head — the opening-reversal day, the trend-from-the-open bull. No single date is it; every real instance is a noisy copy with its own timing. The template is the thing you actually compare against, and until now it only existed as intuition.

What the math measures. Averaging 200 opening-reversal days point-by-point destroys the pattern: one reverses at 10:00, another at 10:20, and the Euclidean average cancels the out-of-phase moves — the turn that defines the family flattens into mush. A soft-DTW barycenter aligns first and averages second: it finds the one curve whose total warped distance to all 200 members is smallest, so the reversals get lined up before they’re averaged and the turn survives.
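The cancellation is easy to reproduce on toy data. The sketch below uses plain event alignment (shift each day so its low lands on a shared anchor bar, then average) as a simpler stand-in for the "align first, average second" idea. It is not a soft-DTW barycenter, and the V-shaped toy days and all function names are assumptions made for illustration.

```python
import numpy as np

def toy_reversal_day(turn, n=78):
    """Toy opening-reversal day: slide to a low of 0 at bar `turn`, then grind higher."""
    bars = np.arange(n)
    return np.abs(bars - turn) / n

def shift(path, k):
    """Shift a path right by k bars (left if k is negative), padding the gap with NaN."""
    out = np.full(len(path), np.nan)
    if k >= 0:
        out[k:] = path[:len(path) - k]
    else:
        out[:k] = path[-k:]
    return out

def event_aligned_mean(days, anchor):
    """Move each day's low onto the anchor bar, then average what overlaps."""
    lows = [int(np.argmin(d)) for d in days]
    shifted = [shift(d, anchor - low) for d, low in zip(days, lows)]
    return np.nanmean(np.vstack(shifted), axis=0)

# Five members of one family, each turning at a different time.
turns = [6, 10, 14, 18, 22]
days = [toy_reversal_day(t) for t in turns]

pointwise = np.mean(days, axis=0)               # average first: the turn blurs
aligned = event_aligned_mean(days, anchor=14)   # align first: the turn survives

print(f"pointwise low = {pointwise.min():.4f}, aligned low = {np.nanmin(aligned):.4f}")
```

Every toy member touches zero at its turn, yet the pointwise average never gets there because the lows are out of phase. The aligned average keeps the sharp low at the anchor bar. A soft-DTW barycenter goes further by letting the whole path stretch, not only slide.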

How it applies here. Take every session the archive files under a pattern family and render the canonical opening-reversal day as an actual curve: not a description, a path. Archetypes become objects today can be scored against (“today sits 0.19 from the canonical opening reversal”), and each family in the day-type gallery gets a definitive picture at the top. The first five are below, each computed from the sessions the archive’s own detectors and day-type labels file under the family, event-aligned on the family’s defining event (the reversal extreme, for the opening reversals) so the turn survives the averaging, and shown beside the three most textbook real members so the average can be judged against its inputs. An archetype describes its family; it forecasts nothing.

[Figure: archetype gallery. Each panel shows the family's aligned average (computed, not drawn) with a dashed line at the alignment anchor, the family's defining event, beside the three most textbook real examples.]

The opening reversal (up)
Sell-off from the open into a reversal extreme that holds all session, then a grind higher.
Aligned average of the 116 closest of 230 dominant members.
Three most textbook real examples: 2020-03-30, 2021-11-24, 2025-01-28.

The opening reversal (down)
Pop from the open into a reversal extreme that holds all session, then a slide.
Aligned average of the 77 closest of 153 dominant members.
Three most textbook real examples: 2020-04-03, 2021-03-04, 2024-04-12.

The buy-the-close climax
A run that accelerates INTO the bell and finishes at the session high; the event anchor is the close itself.
Aligned average of the 23 closest of 45 dominant members.
Three most textbook real examples: 2022-07-05, 2023-03-24, 2020-03-04.

Trend from the open (up)
One-directional from the first bar; membership from the day-type labels, aligned on the open itself.
Aligned average of the 99 closest of 196 dominant members.
Three most textbook real examples: 2020-05-18, 2025-07-10, 2020-04-29.

Trend from the open (down)
The bear mirror, from the day-type labels.
Aligned average of the 92 closest of 183 dominant members.
Three most textbook real examples: 2025-04-23, 2023-05-19, 2025-09-24.

The shapes the market keeps drawing · live today

What your eye sees. Three related reads. The market re-drawing the same picture — the flag that keeps printing, the midday coil you’ve seen a hundred times. Tape that looks like nothing you’ve seen. And the bar where the day changed character — where the morning’s two-sided chop became the afternoon’s one-way trend.

What the math measures. The matrix profile computes, for every window of the tape, the distance to its nearest neighbor everywhere else — in one pass. Windows with unusually close neighbors are motifs: the shapes the market keeps drawing. Windows whose nearest neighbor is still far away are discords: the weirdest stretches of tape in the archive. FLUSS then reads where each window’s neighbors live in time: while a session stays one regime, neighbors sit nearby; count the neighbor-arcs crossing each bar and the deepest valley marks the bar where the day changed character.
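A brute-force version makes the motif and discord definitions concrete. This is a sketch under stated assumptions, not the mining code: it builds the full pairwise distance table instead of using a fast matrix-profile algorithm, the exclusion zone of half a window is an arbitrary choice to rule out trivial self-matches, FLUSS is left out, and the synthetic tape and function names are invented for the example.

```python
import numpy as np

def znorm_windows(x, m):
    """Every length-m window of x, each z-normalized so only its shape is compared."""
    w = np.lib.stride_tricks.sliding_window_view(np.asarray(x, dtype=float), m)
    mu = w.mean(axis=1, keepdims=True)
    sd = w.std(axis=1, keepdims=True)
    sd[sd == 0] = 1.0  # flat windows: avoid dividing by zero
    return (w - mu) / sd

def matrix_profile(x, m):
    """For each window, the distance to and index of its nearest non-overlapping neighbor."""
    w = znorm_windows(x, m)
    sq = (w ** 2).sum(axis=1)
    dist = np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2.0 * (w @ w.T), 0.0))
    idx = np.arange(len(w))
    # Exclusion zone (assumed m // 2): a window may not match its own near-copies.
    dist[np.abs(idx[:, None] - idx[None, :]) < m // 2] = np.inf
    nearest = dist.argmin(axis=1)
    return dist[idx, nearest], nearest

# Synthetic tape: a repeating wave with light noise, plus one stretch unlike the rest.
rng = np.random.default_rng(0)
t = np.arange(400)
tape = np.sin(2 * np.pi * t / 20) + rng.normal(scale=0.05, size=400)
tape[200:224] = np.linspace(-2.0, 2.0, 24)  # planted anomaly

profile, nearest = matrix_profile(tape, m=24)
motif = int(profile.argmin())      # window with the closest twin elsewhere
discord = int(profile.argmax())    # window whose best match is still far away

print(f"motif at {motif} (twin at {nearest[motif]}), discord at {discord}")
```

The motif is a window of the repeating wave with a near-identical twin elsewhere; the discord lands on a window overlapping the planted ramp, because nothing else on the tape resembles it.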

How it applies here. Motif tables — “this 45-minute shape has printed 312 times across the pool, here are the dates.” Discord surfacing — when today has no precedent, a precedent tool should say that plainly instead of serving its least-bad match. And honest session-splitting: instead of assuming the day flips at noon, FLUSS finds the actual seam, which is where “the morning doesn’t predict the afternoon” stops being a fixed clock boundary and becomes a measured one. The first mined set is below — every two-hour window of every session self-joined, occurrence counts measured, weirdest tape included.

[Figure: mined motifs and discords. Every 24-bar window of each ticker's sessions is self-joined; windows spanning the overnight seam are excluded, and occurrences are measured within 1.25× the pair distance.]

SPY: the 2-hour shapes it keeps drawing (1,668 sessions)
motif #1 · printed ×3 · d=0.531 · 2022-04-28 11:40 ET, 2023-04-17 13:55 ET
motif #2 · printed ×2 · d=0.551 · 2021-09-01 11:25 ET, 2026-04-15 13:25 ET
motif #3 · printed ×2 · d=0.557 · 2021-09-10 14:00 ET, 2026-02-12 9:30 ET
motif #4 · printed ×3 · d=0.598 · 2023-12-20 13:10 ET, 2024-04-04 12:25 ET
The weirdest 2 hours in the archive (discords)
2021-09-17 13:55 ET · d=4.541
2020-08-10 12:40 ET · d=4.523
2026-02-19 9:40 ET · d=4.403

QQQ: the 2-hour shapes it keeps drawing (1,682 sessions)
motif #1 · printed ×2 · d=0.542 · 2022-04-28 12:00 ET, 2025-05-15 10:30 ET
motif #2 · printed ×4 · d=0.55 · 2023-04-17 14:00 ET, 2023-07-07 11:15 ET
motif #3 · printed ×2 · d=0.56 · 2022-01-10 12:50 ET, 2024-01-18 13:10 ET
motif #4 · printed ×3 · d=0.592 · 2022-09-02 11:20 ET, 2022-10-26 11:45 ET
The weirdest 2 hours in the archive (discords)
2022-07-27 12:35 ET · d=4.43
2024-09-23 14:00 ET · d=4.328
2023-04-19 12:40 ET · d=4.278

The character of a day, in single numbers · live today

What your eye sees. Within the first hour you’re already categorizing: trend day or range day; clean grind or chop; whether a stretch away from VWAP fades or extends; whether price is being accepted at a level or just visiting it. It’s the Brooks vocabulary — always-in, breakout mode, level acceptance, trapped traders — and it’s all about the day’s character rather than any one pattern.

What the math measures. One number each. The Hurst exponent (via DFA): do moves tend to extend (above 0.5) or fold back (below 0.5)? That is trend day vs range day as a single number. Permutation entropy: how orderly the sequence of ups and downs is; a grinding microchannel scores low, a chop-fest scores near the maximum. OU half-life: on a range day, fit the pull back toward the mean and read off how many minutes a stretch takes to decay halfway, which turns “fade it, it comes back in ~20 minutes” into a fitted parameter. RQA laminarity: from recurrence analysis, the fraction of time the path spends stuck revisiting the same state, that is, time spent sitting at a level instead of moving through it.
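Two of these numbers fit in a few lines each. The sketch below is illustrative only: the pattern order of 3, the AR(1) regression used to fit the pull toward the mean, the 5-minute bar length, and the toy inputs are all assumptions, and the site's own estimators may differ.

```python
import math
from collections import Counter

import numpy as np

def permutation_entropy(x, order=3):
    """Normalized entropy of ordinal patterns: 0 = perfectly orderly, 1 = maximal chop."""
    x = np.asarray(x, dtype=float)
    patterns = Counter(
        tuple(np.argsort(x[i:i + order]).tolist())
        for i in range(len(x) - order + 1)
    )
    total = sum(patterns.values())
    probs = np.array([count / total for count in patterns.values()])
    entropy = -(probs * np.log(probs)).sum()
    return abs(float(entropy / math.log(math.factorial(order))))

def ou_half_life(x, bar_minutes=5.0):
    """Fit x[t+1] = c + phi * x[t]; return minutes for a stretch to decay halfway."""
    x = np.asarray(x, dtype=float)
    phi, _intercept = np.polyfit(x[:-1], x[1:], 1)
    if not 0.0 < phi < 1.0:
        return math.inf  # no measurable pull back toward the mean
    return -math.log(2.0) / math.log(phi) * bar_minutes

# A clean grind (always up) vs pure chop (independent noise).
grind = np.arange(78, dtype=float)
chop = np.random.default_rng(0).normal(size=500)

# A stretch 2 points above a mean of 100 that halves every 4 bars (20 minutes).
phi_true = 0.5 ** (1 / 4)
stretch = 100.0 + 2.0 * phi_true ** np.arange(40)

print(permutation_entropy(grind), permutation_entropy(chop))
print(ou_half_life(stretch))
```

The grind prints a single ordinal pattern and scores zero; the noise spreads across all six patterns and scores near one. The noiseless toy stretch recovers its 20-minute half-life; on real tape the fitted value carries estimation error and only makes sense where the pull toward the mean is actually present.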

How it applies here. These become the day’s character sheet — labels on every session in the archive, so twin search can be filtered (“range days with fast mean-reversion only”) and a shape twin from a day with the opposite character can be flagged as the weaker precedent it is. Laminarity in particular puts a number on level acceptance: high laminarity pinned under yesterday’s high reads very differently — for who’s paid and who’s trapped — than a quick tag and rejection, even when the two days share an outline. The first join is live: the day-types page now carries these numbers per type over ~180k labeled sessions — reversion half-life separates trend days from ranges cleanly, and Hurst, honestly measured, barely separates them at all.

Volume profiles as earth to move · live today

What your eye sees. The volume profile’s shape: a P-day (rally, then business done up top — short covering), a b-day (liquidation, then business done down low), a double-distribution day (two value areas and a thin seam where price only traveled). Where volume piled up is where traders agreed on value; the profile’s shape is a map of who did business where.

What the math measures. A volume profile is a histogram — volume at price. Comparing histograms bin-by-bin fails the way lockstep path comparison fails: two profiles with humps one bin apart read as totally different. The Wasserstein (earth-mover’s) distance instead asks: how much dirt must move, and how far, to turn one profile into the other? Two P-days with humps at slightly different prices are near; a P-day vs a b-day is far — the full cost of hauling the hump from the top of the range to the bottom.
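For two histograms on the same evenly spaced price grid, the earth-mover's distance reduces to the summed gap between their cumulative distributions, scaled by the bin width. A minimal sketch, assuming a shared grid and a hypothetical `tick` parameter for the bin width; the toy profiles and function names are invented for illustration:

```python
import numpy as np

def profile_distance(vol_a, vol_b, tick=1.0):
    """1-D Wasserstein distance between two volume-at-price histograms on one grid."""
    pa = np.asarray(vol_a, dtype=float)
    pb = np.asarray(vol_b, dtype=float)
    pa, pb = pa / pa.sum(), pb / pb.sum()  # compare shapes, not total volume
    # Dirt moved across each bin boundary equals the gap between the two CDFs there.
    return float(np.abs(np.cumsum(pa) - np.cumsum(pb)).sum() * tick)

def hump(center, n_bins=20):
    """Toy profile: a small hump of volume around one price bin (index 0 = lowest price)."""
    profile = np.zeros(n_bins)
    profile[center - 1:center + 2] = [1.0, 2.0, 1.0]
    return profile

p_day = hump(16)        # business done near the top of the range
p_day_shifted = hump(15)  # same shape, one bin lower
b_day = hump(3)         # business done near the bottom

near = profile_distance(p_day, p_day_shifted)
far = profile_distance(p_day, b_day)
print(near, far)
```

The two P-shaped toys sit one bin apart and score a distance of one bin; the P-day against the b-day scores thirteen, the full haul from the top of the range to the bottom. A bin-by-bin comparison would cap out once the humps stop overlapping and could not tell a near miss from the opposite profile.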

How it applies here. A fourth opt-in lens — volume profile — beside structure, levels, and volume. The existing volume lens compares how volume arrived over time; the profile lens compares where it piled up in price. With it, twins agree not just on the path but on where the day’s business got done — and P-shape / b-shape / double-distribution become searchable labels of their own.

The roadmap · roadmap

None of this is built. Each item earns a slot only if it survives the treatment the engine already gets — frozen benchmarks, honest nulls published when something doesn’t help.

The through-line: every method here formalizes a read you already make. And the boundary never moves — numbers describe the tape, they don’t predict it. What the math buys is honest precedent: measured, benchmarked, labeled a picture of the past. The read is still yours.

Sources: every shipping constant on this page (channel weights, band width, front weight, swing boost, magnitude penalty, twin/generic scale, pool size) comes from the engine’s living spec, src/lib/analogs/LOOKALIKES.md, which changes in the same PR as the code it describes and is pinned by a frozen trust bench in CI. The cross-validation finding (the morning does not predict the afternoon) is the site’s own, and is why every “what happened next” view ships with an any-day baseline beside it.