Whitepaper

Vector-Based Pattern Memory: Preserving Market Complexity Beyond Neural Network Averages

A vector similarity approach to discovering truly analogous market regimes — preserving structural nuance where neural networks blur and statistical models average. Interpretable pattern retrieval and probabilistic scenario planning for professional trading.

Abstract: The "Glass Box" Approach

Modern markets operate in regimes that legacy models fail to recognize. Neural networks ("Black Boxes") provide signals but lack explainability, making them unsuitable for managing large capital due to compliance and risk management issues. We propose a "Glass Box" Pattern Memory that indexes market structure using vector similarity search. Instead of "predicting" the future, our engine instantly retrieves historical precedents for the current situation, providing complete transparency and evidence-based decision making.

Motivation: Regime Detection & Honest Backtesting

Practitioners need a way to recover historical context: “Where have we seen this kind of behavior before?” Our system searches for similar situations in a high‑dimensional structural space. Crucially, we employ a "Time Machine" methodology for backtesting—rigorous Walk-Forward Analysis with recursive lookups that eliminate look-ahead bias, providing a true measure of strategy robustness.

Honest Backtesting: The "Time Machine"

Most backtests are flawed because they train on the entire dataset (including the future). Our engine enforces strict temporal isolation.

Recursive Lookups

For every point in the backtest (e.g., Jan 1, 2020), the engine rebuilds its index using only data available up to that moment. It cannot "see" the crash of March 2020 until it happens.

Walk-Forward Validation

We simulate the exact experience of a trader living through history day by day. This exposes how strategies perform during regime shifts, not just on average.
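The temporal-isolation discipline described above can be sketched in a few lines. The example below is a deliberate simplification: it assumes plain Euclidean shape distance and a single next-step outcome, whereas the production engine uses richer features and multi-metric retrieval. The function name and parameters are ours, chosen for illustration.

```python
import numpy as np

def walk_forward_step(prices: np.ndarray, t: int, window: int = 20, k: int = 3) -> float:
    """One step of a walk-forward loop. The candidate set is rebuilt from
    data strictly before index t, so the query can never 'see' its own
    future. Returns the median next-step move across the k nearest matches."""
    query = prices[t - window:t]
    # Only windows whose forward outcome is already known by time t.
    candidates = range(0, t - window)
    dists = [float(np.linalg.norm(prices[i:i + window] - query)) for i in candidates]
    nearest = np.argsort(dists)[:k]
    # Forward outcome of each match: the move right after the window ends.
    outcomes = [prices[i + window] - prices[i + window - 1] for i in nearest]
    return float(np.median(outcomes))
```

The key property is that the candidate range stops at `t - window`, so every retrieved match and its forward outcome were fully observable before the decision point.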

Method Overview

Our pipeline ingests market data, derives structure‑aware features, indexes windows into a Graph Memory, retrieves analogous cohorts via multi‑metric similarity, and projects forward quantile envelopes for range‑first planning.

Figure: End‑to‑end pipeline for structural retrieval and probabilistic planning (Ingest → Features → Graph → Search → Bands).

Vector-Based Pattern Memory Architecture

Our Pattern Memory encodes time‑localized market structure as high-dimensional vectors stored in a searchable graph. Each node represents a window enriched with features that capture shape, volatility context, and microstructure cues—not compressed into opaque weights that blur details, but preserved as rich, multi-metric representations.

Edges connect windows that are structurally analogous under multiple similarity metrics, enabling nuanced pattern discovery that classical averaging methods miss. The result is a searchable vector database of regimes that supports precise retrieval, clustering, and transfer across instruments while maintaining interpretability.
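As an illustration of the node-and-edge construction, the sketch below uses a toy feature vector (scale-free shape plus two return summaries we chose for the example) and a single cosine-similarity threshold. The production feature set and edge criteria are richer; every name here is an assumption for the sketch.

```python
import numpy as np

def window_features(w: np.ndarray) -> np.ndarray:
    """Toy feature vector for one window: scale-free shape plus two return
    summaries (volatility and cumulative drift). Illustrative only."""
    returns = np.diff(w) / w[:-1]
    shape = (w - w.mean()) / (w.std() + 1e-9)
    return np.concatenate([shape, [returns.std(), returns.sum()]])

def build_edges(windows, threshold: float = 0.99):
    """Connect windows whose feature vectors exceed a cosine-similarity
    threshold: a minimal stand-in for the structural-similarity edges."""
    feats = [window_features(w) for w in windows]
    edges = []
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            a, b = feats[i], feats[j]
            cos = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
            if cos >= threshold:
                edges.append((i, j, cos))
    return edges
```

Because the shape component is normalized, a window and its rescaled copy land on the same node neighborhood, while a structurally different window does not.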

Key Advantages Over Black-Box Models:

  • Preserved complexity: Vector features capture shape, persistence, dispersion, and state transitions without averaging them into a single latent space.
  • Multi‑metric similarity: Composite distances and consensus neighborhoods ensure robustness without over-smoothing.
  • Temporal coherence: Sequence‑level consistency prevents cherry-picked matches while maintaining structural fidelity.
  • Glass Box Transparency: Actual historical windows returned, not opaque black-box predictions. Full auditability.
Figure: Structural regime graph — nodes are enriched windows, edges are structural-similarity links, and the retrieved cohort is highlighted.
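The multi-metric similarity advantage can be made concrete with a composite distance. The two component metrics below (normalized Euclidean shape distance and a correlation term) and the 50/50 weights are illustrative assumptions, not the engine's actual configuration.

```python
import numpy as np

def composite_distance(a: np.ndarray, b: np.ndarray,
                       w_shape: float = 0.5, w_corr: float = 0.5) -> float:
    """Blend a normalized Euclidean term with a correlation term so that
    no single metric dominates. Weights are illustrative placeholders."""
    az = (a - a.mean()) / (a.std() + 1e-9)
    bz = (b - b.mean()) / (b.std() + 1e-9)
    d_shape = float(np.linalg.norm(az - bz)) / np.sqrt(len(a))
    d_corr = 1.0 - float(np.corrcoef(a, b)[0, 1])  # 0 = identical, 2 = inverse
    return w_shape * d_shape + w_corr * d_corr
```

A consensus neighborhood then keeps only candidates that rank highly under every component metric, which is what prevents any single distance from over-smoothing the cohort.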

Similarity Search and Cross‑Market Transfer

The system scales beyond a single instrument. It can retrieve patterns from Bitcoin and test whether analogous structure emerges on Solana, or jointly aggregate matches from BTC, ETH, and SOL to assess whether a cross‑asset consensus edge exists. This enables portability of memory across markets without assuming identical dynamics.

Example 1 — Transfer

“Find situations like those seen in BTC and apply to SOL if the structure reappears.”

Example 2 — Consensus

“Find similar structures on SOL, ETH, and BTC and summarize if a consistent edge persists across assets.”

Figure: Transfer and consensus across assets — a BTC → SOL transfer arrow, and BTC + ETH + SOL cohorts aggregated into a consensus view.
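One simple way to operationalize the consensus example is to check whether the median forward outcome agrees in sign across assets. This sketch assumes each asset's retrieval has already produced a list of forward outcomes; the asset keys and the 0.7 agreement threshold are placeholders.

```python
import numpy as np

def consensus_edge(outcomes_by_asset: dict, min_agree: float = 0.7) -> dict:
    """Report whether the median forward outcome of each asset's cohort
    agrees in sign across assets. Threshold is a placeholder value."""
    medians = {asset: float(np.median(o)) for asset, o in outcomes_by_asset.items()}
    signs = [np.sign(m) for m in medians.values()]
    agreement = float(abs(sum(signs)) / len(signs))
    return {"medians": medians, "agreement": agreement,
            "consensus": bool(agreement >= min_agree)}
```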

Probabilistic Envelopes, Not Points

For each cohort of retrieved matches, we project a forward distribution summarized as quantile bands (e.g., 10–90%, 25–75%, 40–60%) and a dynamic median. Traders get a range‑first view of scenario space, not a single path. This aligns with risk management, sizing, and guardrail design under uncertainty.

Figure: Quantile envelopes (10–90%, 25–75%, 40–60%) with a dynamic median illustrate scenario space versus a single path.
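Computing the bands from a retrieved cohort is a direct quantile projection. Assuming `cohort_paths` holds one forward path per match (a matches × horizon array, our naming), a minimal sketch:

```python
import numpy as np

def quantile_envelope(cohort_paths: np.ndarray) -> dict:
    """Summarize a cohort of forward paths (matches x horizon) as the
    quantile bands used above: 10-90%, 25-75%, 40-60%, plus the median."""
    qs = np.quantile(cohort_paths, [0.10, 0.25, 0.40, 0.50, 0.60, 0.75, 0.90], axis=0)
    return {"p10": qs[0], "p25": qs[1], "p40": qs[2], "median": qs[3],
            "p60": qs[4], "p75": qs[5], "p90": qs[6]}
```

Each band is computed per horizon step, so the envelope can widen or narrow over time as the cohort's outcomes disperse.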

Interpretability: No Black Box Opacity

Unlike neural networks that produce opaque predictions from compressed latent spaces, our vector-based approach returns actual historical windows that structurally match the current regime.

Traders see the matched patterns, their similarity scores, cohort composition, and forward outcome envelopes—a complete, transparent line from evidence to decision. No hidden layers, no averaged-away details, no black-box mystery. Just real patterns that preserve the complexity professional traders need to make informed decisions.

Pilot Scope and Scalability

The current pilot operates on a limited set of instruments and datasets. Graph Memory is designed to scale to broader universes and cross‑market searches as data coverage expands. The underlying graph supports efficient reuse of structure and incremental indexing as new windows arrive.

Roadmap: Towards Semantic Search

We are actively researching algorithmic improvements to deepen structural understanding beyond simple geometry:

  • Dynamic Time Warping (DTW): To recognize patterns that are temporally distorted (stretched or compressed) but structurally identical.
  • Semantic Embeddings: Exploring hybrid models that use contrastive learning to capture non-linear relationships while maintaining the "Glass Box" retrieval paradigm.
  • Multi-Scale Context: Integrating higher-timeframe trends into the local pattern definition for context-aware matching.
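For the DTW roadmap item above, the textbook dynamic-programming formulation shows why it tolerates temporal distortion: a stretched copy of a pattern aligns at zero cost. Production use would typically add a warping-window constraint for speed; this unconstrained version is fine for short windows.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Classic dynamic-programming DTW with absolute-difference local cost
    and no warping-window constraint."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor paths.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])
```

A pattern and its time-stretched copy have DTW distance zero, whereas Euclidean distance would require equal lengths and penalize the stretch.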

Limitations and Disclaimer

Historical analysis does not guarantee future results. Structural similarity may fail under regime breaks, liquidity shocks, or novel catalysts. We emphasize range‑based planning and ongoing accuracy auditing. Outputs are decision support, not signals to execute blindly.