Abstract
Prediction markets on the same event rarely agree uniformly, and most of the disagreement is structural rather than informational. A contract on Polymarket and a contract on Kalshi can reference the same FOMC rate decisions and resolve on the same Federal Reserve announcements, yet their prices behave differently across market conditions: tightly converged on high-liquidity macro questions, diverging by 5 to 14 percentage points on less-liquid or politically charged events. Fragmentation across protocols, resolution criteria, oracles, and settlement currencies means that a single real-world question is represented by many different contracts, none of which are individually authoritative.
This paper describes the Predictive Labs Framework, a system for unifying fragmented prediction market data into canonical event-level abstractions and quantifying the relationships between those abstractions. The framework separates 2 concerns. Identity addresses when 2 contracts refer to the same underlying event, and is resolved through oracle-gated probabilistic entity resolution. Influence addresses when the outcome of one event conditions the probability of another, and is modeled through a hybrid graph combining topic-space similarity with empirical price correlation. The composition uses established statistical and machine-learning methods; the contribution lies in the domain-specific adaptations and in the architectural separation that together produce a system accurate enough to act on.
1. Introduction
The standard framing of prediction markets is that they aggregate dispersed information into a single probability estimate, letting collective betting activity reveal what crowds actually believe. The framing captures a real phenomenon but simplifies it in ways that matter for anyone building analytics on top.
In practice, prediction markets are subject to the same microstructure effects as any other market: liquidity constraints, funding costs, order flow dynamics, and the cost of hedging. Prices reflect all of these alongside genuine belief. When the same question trades on 2 different venues, the 2 prices can diverge for reasons that have nothing to do with belief disagreement: different costs, different counterparty risk, and different settlement constraints. For a system that wants to read probability directly off market prices, this is noise, and it is structural rather than informational.
The fragmentation is compounded by contract design. Polymarket and Kalshi can each list a contract on 'Fed cut in May 2026', but the resolution criteria, oracle sources, and settlement currencies will differ. Two markets, two prices, and a real question about whether they refer to the same event. A system built on top of these venues has to answer that question before it can combine the prices into anything useful.
This paper describes a framework for doing that at scale across many venues and many markets. The framework is built on 2 axes:
- A vertical axis that resolves identity, collapsing multiple contracts into canonical Meta-Markets when they refer to the same underlying event.
- A horizontal axis that represents influence, linking distinct Meta-Markets when the outcome of one conditions the probability of another.
Section 2 introduces a running example drawn from the Fed rate-cut market, used throughout the paper to illustrate each component. Sections 3 through 6 describe the framework in detail. Section 7 covers 2 applications: cross-venue arbitrage and macro signal distillation. Section 8 discusses limitations and open problems.
1.1. Contributions
This paper does not introduce new statistical or machine-learning techniques. The entity resolution core is Fellegi-Sunter (1969) [1]. The semantic similarity is cosine distance on transformer embeddings. The graph is a labeled property graph. The correlation analysis uses rolling window estimators. The contribution sits elsewhere:
- An architectural separation between identity and influence, which has not been applied to prediction market data before in a systematic way.
- Domain-specific adaptations that address failure modes specific to prediction markets, including oracle-compatibility gating, log-odds correlation estimation to handle binary-contract boundary effects, and dual-layer edge labels that distinguish thematic from empirical linkage.
- A worked example that demonstrates the framework end-to-end on a realistic contract pair.
2. Problem Statement and Running Example
Consider the following 2 contracts, both live on April 17, 2026, both referencing the outcome of FOMC rate decisions over the 2026 calendar year. Prices and odds quoted throughout this paper are representative snapshots as of that date; live values will have moved by the time the paper is read.
2.1. Contract A (Polymarket)
- Title: 'How many Fed rate cuts in 2026?' (outcome '0 (0 bps)').
- Resolution rule: Resolves to YES if the total number of 25bp-equivalent rate cuts announced by the FOMC (including emergency cuts) across all 2026 scheduled meetings equals 0. Cuts of 1 to 24bps count as 1 cut; a 50bp cut counts as 2. Market closes December 31, 2026.
- Resolution source: FOMC statements following each 2026 meeting per the official calendar (federalreserve.gov/monetarypolicy/fomccalendars.htm) and the target federal funds range published at federalreserve.gov/monetarypolicy/openmarket.htm [14].
- Oracle: UMA Optimistic Oracle, with dispute resolution handled by UMA token holders.
- Settlement currency: USDC.
- Price (April 17, 2026): 0.40.
2.2. Contract B (Kalshi)
- Title: 'Number of rate cuts in 2026?' (outcome '0'), Kalshi series KXRATECUTCOUNT, event ticker KXRATECUTCOUNT-26DEC31 [15].
- Resolution rule: Resolves to YES if the Federal Reserve announces 0 rate cuts in 2026 (25bp per cut). Resolves December 31, 2026.
- Resolution source: Federal Reserve System.
- Oracle: Kalshi internal resolution committee, operating under CFTC Designated Contract Market status.
- Settlement currency: USD.
- Price (April 17, 2026): 0.40.
These 2 contracts are not identical. They use different oracles, settle in different currencies, and the exact wording of their resolution rules differs (Polymarket treats emergency cuts explicitly; Kalshi's ruleset derives from a standardized Kalshi series format). But they reference the same underlying fact pattern: the count of rate reductions announced by the Federal Reserve across 2026 FOMC meetings. A human reader would call them 'basically the same market'. The framework needs to reach the same conclusion, quantitatively.
Three questions follow:
- Are A and B observations of the same underlying event?
- If yes, what is the canonical probability for that event, combining the 2 prices with appropriate liquidity weighting?
- What other Meta-Markets are influenced by this one, and by how much?
The vertical axis answers question 1 and the first half of question 2. The horizontal axis answers question 3. The streaming layer makes question 2 robust to liquidity noise and validates question 3 empirically.
3. Framework Overview
The framework is organized around 3 data objects:
- Source contract: a single prediction market as listed by a single venue. Contract A and Contract B above are source contracts.
- Meta-Market: a canonical abstraction representing a single real-world event. A Meta-Market aggregates 1 or more source contracts determined to reference the same event.
- Link: a directed, weighted relationship between 2 Meta-Markets, representing either thematic similarity, empirical price correlation, or both.
Processing runs in 4 stages:
- Feature extraction. Each source contract is reduced to a structured feature vector covering resolution time, resolution criteria, oracle, and a semantic embedding of the contract text.
- Identity resolution. The vertical axis uses the feature vectors to decide which source contracts belong to the same Meta-Market. Oracle compatibility acts as a hard gate before probabilistic matching is applied.
- Graph construction. The horizontal axis builds 2 subgraphs. A thematic subgraph uses embedding similarity to identify Meta-Markets in related domains. An empirical subgraph uses price correlation to identify Meta-Markets whose prices move together.
- Streaming validation. A streaming engine continuously updates the empirical subgraph as new price observations arrive, reinforcing or attenuating edges based on sustained correlation behavior.
The key architectural decision is the separation between identity and influence. A Meta-Market's identity is a static, structural property that changes only when new source contracts are added or when resolution occurs. A Meta-Market's influence on other Meta-Markets is dynamic and empirical, subject to ongoing validation. Conflating the 2 leads to systems that rewrite their own identity structure based on noisy price correlations, which is how Meta-Markets end up drifting away from any real-world referent.
4. Identity Resolution (Vertical Axis)
4.1. Feature vector
Each source contract is represented as a 4-tuple, extended at a later stage of the pipeline with a 5th categorical slot for discovered tags (section 4.5):

$$C = (T, L, O, S)$$
Where:
- $T$ is the resolution timestamp in ISO 8601 UTC, the deterministic hard boundary for when the contract resolves.
- $L$ is the resolution rule, normalized to a canonical form that strips platform-specific markup and boilerplate.
- $O$ is the oracle identifier, drawn from a controlled vocabulary of known oracle types (UMA, Chainlink, Kalshi resolution committee, platform-specific committee, and so on).
- $S$ is a dense semantic embedding (typically 768 to 3072 dimensions depending on the chosen model) of the normalized contract title and resolution rule, generated by a sentence-embedding model.
The embedding model is a current-generation sentence encoder capable of producing reliable similarity scores on short domain-specific text (e.g., BGE, E5, Voyage, or OpenAI text-embedding-3). Model choice is an operational decision driven by the tradeoff between quality, latency, and API cost, and by whether domain fine-tuning is warranted for the prediction-market corpus. The original BERT-base encoder from 2018 has been superseded by these newer encoder families, which produce materially better similarity judgments on short-form text with minimal additional latency.
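Pairwise similarity between contract embeddings reduces to a cosine computation. A minimal numpy sketch (the short 3-dimensional vectors below are illustrative stand-ins for real 768-to-3072-dimensional encoder output):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0
    return float(np.dot(a, b) / denom)

# Illustrative only: real vectors come from the chosen sentence encoder.
s_a = np.array([0.8, 0.1, 0.5])
s_b = np.array([0.7, 0.2, 0.6])
sim = cosine_similarity(s_a, s_b)
```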
For Contract A and Contract B from the running example:
- $T_A \approx T_B$: both resolve on December 31, 2026.
- $L_A \neq L_B$: Polymarket treats emergency cuts explicitly and rounds sub-25bp cuts up; Kalshi's standard series rule defines cuts in straightforward 25bp increments. Equivalent outcomes under all realistic FOMC decisions, with edge cases at very small or off-cycle cuts.
- $O_A = \text{UMA}$, $O_B = \text{Kalshi resolution committee}$.
- $\cos(S_A, S_B) \approx 0.91$: very high semantic similarity (representative of the embedding output on near-paraphrased titles and rules).
4.2. Oracle-compatibility gate
Before any probabilistic matching runs, the framework applies an oracle-compatibility predicate: are the 2 oracles compatible in the sense that a single real-world outcome would resolve both contracts the same way?
The predicate does not require oracle identity. UMA and the Kalshi resolution committee are 2 different oracles. For the running example, both will read the same FOMC statements and both will produce the same count of 2026 rate cuts. The oracle-compatibility predicate returns TRUE.
By contrast, consider the family of contracts on whether a US-Iran nuclear deal is announced by a given date. Polymarket and Kalshi both list contracts in this space. Polymarket's resolution rule typically allows 'a consensus of credible reporting' to resolve YES, with disputes routed to UMA token holders. Kalshi names specific authoritative sources (for example, an official US government statement) and routes disputes to its internal team and, if needed, regulators. These oracles can plausibly disagree on the same real-world development: an announced framework that Polymarket treats as a deal might fail Kalshi's stricter requirement for a formal announcement from a named source. The oracle-compatibility predicate returns FALSE, and the contracts are held separate regardless of how similar their titles look.
The predicate is implemented as a lookup table combined with rule-based logic, maintained by the Predictive Labs data team and versioned alongside the codebase. It is explicit about what it does not know: when 2 oracles have no prior joint resolution history, the predicate returns UNKNOWN, and the contracts are held in a pending state until sufficient joint resolutions are observed.
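A minimal sketch of the predicate's TRUE/FALSE/UNKNOWN logic, assuming a hand-maintained table keyed by oracle pair and rule class. The `Compat` enum, table contents, and rule-class names here are hypothetical illustrations, not the production vocabulary:

```python
from enum import Enum

class Compat(Enum):
    TRUE = "TRUE"
    FALSE = "FALSE"
    UNKNOWN = "UNKNOWN"

# Hypothetical lookup table: keys are unordered oracle pairs with prior
# joint resolution history; values record agreement per rule class.
COMPAT_TABLE = {
    frozenset({"UMA", "KalshiCommittee"}): {
        "macro_calendar": Compat.TRUE,    # both read official FOMC statements
        "news_consensus": Compat.FALSE,   # credible-reporting vs named-source rules
    },
}

def oracle_compatible(oracle_a: str, oracle_b: str, rule_class: str) -> Compat:
    """Return TRUE/FALSE when the pair has joint resolution history for this
    rule class; otherwise UNKNOWN, and the contracts are held pending."""
    if oracle_a == oracle_b:
        return Compat.TRUE  # the same oracle trivially resolves both alike
    entry = COMPAT_TABLE.get(frozenset({oracle_a, oracle_b}))
    if entry is None or rule_class not in entry:
        return Compat.UNKNOWN
    return entry[rule_class]
```

For the running example, `oracle_compatible("UMA", "KalshiCommittee", "macro_calendar")` returns `Compat.TRUE`, while the US-Iran deal family would fall under a stricter rule class and return `FALSE`.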
This gating step is the single most important correctness decision in the vertical axis. Probabilistic matching alone, without oracle gating, will produce Meta-Markets that look coherent and resolve incoherently.
4.3. Probabilistic matching
Conditional on oracle compatibility, the framework uses Fellegi-Sunter record linkage [1] to decide whether 2 source contracts refer to the same Meta-Market. For a pair of contracts $a$ and $b$, a comparison vector $\gamma$ is computed from their feature vectors, and the likelihood ratio is evaluated:

$$R = \frac{P(\gamma \mid M)}{P(\gamma \mid U)}$$
Where:
- $M$ denotes the set of matching pairs (same underlying event).
- $U$ denotes the set of non-matching pairs (distinct events).
The conditional probabilities are estimated from a training set of labeled contract pairs curated by the Predictive Labs data team. If $R$ exceeds a threshold $\lambda_{\text{high}}$, the contracts are assigned to the same Meta-Market. If $R$ falls below a threshold $\lambda_{\text{low}}$, they are classified as distinct. Between the thresholds, the pair is held for manual review. The dual-threshold design is deliberate: it prevents the system from silently merging borderline cases and creates a natural queue for human curation.
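The dual-threshold decision can be sketched as follows, assuming conditional independence between comparison fields. The per-field $m$/$u$ probabilities and threshold values below are hypothetical; real values are estimated from the labeled training pairs:

```python
# Hypothetical per-field match (m) and non-match (u) agreement probabilities,
# estimated in practice from labeled contract pairs.
M_PROBS = {"timestamp": 0.95, "rule": 0.80, "embedding": 0.90}
U_PROBS = {"timestamp": 0.10, "rule": 0.05, "embedding": 0.15}

def likelihood_ratio(gamma: dict) -> float:
    """Fellegi-Sunter ratio R = P(gamma|M) / P(gamma|U), assuming
    conditional independence across comparison fields."""
    r = 1.0
    for field, agrees in gamma.items():
        m, u = M_PROBS[field], U_PROBS[field]
        r *= (m / u) if agrees else ((1 - m) / (1 - u))
    return r

def classify(r: float, lam_low: float = 1.0, lam_high: float = 50.0) -> str:
    """Dual-threshold decision: merge, hold for review, or keep distinct."""
    if r >= lam_high:
        return "MERGE"
    if r <= lam_low:
        return "DISTINCT"
    return "REVIEW"

# Running-example-like comparison: all fields agree.
r_ab = likelihood_ratio({"timestamp": True, "rule": True, "embedding": True})
```

With these illustrative parameters, full agreement yields a ratio far above the upper threshold and the pair is merged; full disagreement falls below the lower threshold and the pair is held distinct.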
Applied to the running example, the comparison vector reflects near-identical resolution timestamps, semantically close but not identical resolution rules, compatible oracles, and a cosine similarity of 0.91. The resulting likelihood ratio is well above $\lambda_{\text{high}}$, and Contracts A and B are merged into a single Meta-Market, M_FedCuts2026_Zero.
4.4. Centroid maintenance
Once a Meta-Market contains multiple source contracts, its centroid is maintained as a weighted average over the member feature vectors:

$$\bar{C} = \frac{\sum_i w_i C_i}{\sum_i w_i}$$
Weights $w_i$ reflect the reliability of each source, combining oracle trust, liquidity depth, and data freshness. The centroid does not 'approach perfect information' in any formal sense, because the underlying events are not stationary and the weights themselves evolve over time. It provides the best current estimate of the event's canonical representation given the observations currently in the system. When resolution occurs, the centroid becomes the ground-truth anchor for retrospective analysis of the constituent contracts.
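A minimal sketch of the weighted centroid update over the embedding component (weights and vectors are illustrative):

```python
import numpy as np

def centroid(embeddings: list, weights: list) -> np.ndarray:
    """Reliability-weighted centroid over member embedding vectors."""
    w = np.asarray(weights, dtype=float)
    E = np.stack(embeddings)
    return (w[:, None] * E).sum(axis=0) / w.sum()

# Illustrative: two member contracts, the first more liquid/trusted.
c = centroid([np.array([1.0, 0.0]), np.array([0.0, 1.0])], [3.0, 1.0])
```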
4.5. Latent tag discovery
Latent tag discovery enriches the feature vector defined in 4.1 with thematic categories that are not present in source metadata. Contracts typically arrive with thin categorical labels (e.g., 'Politics', 'Crypto'), which are too coarse for downstream analysis. The framework runs an unsupervised topic-discovery pass over the embedding space to surface finer-grained thematic tags.
The current implementation uses HDBSCAN clustering [13] on the semantic embeddings, followed by class-based TF-IDF to extract descriptive terms for each cluster. This is the BERTopic pipeline [12], which has replaced Latent Dirichlet Allocation as the standard approach for short-text topic discovery. HDBSCAN is preferred over k-means and agglomerative clustering because it does not demand a pre-specified cluster count, because it handles the variable density of real prediction-market taxonomies where some topics are dense and others sparse, and because it assigns a 'noise' label to off-topic contracts instead of forcing them into a cluster they don't belong in.
Discovered tags are appended to each source contract's feature vector as an additional categorical dimension. They contribute soft evidence to the probabilistic matcher in 4.3, and serve as a navigation aid for operators browsing the Meta-Market layer. Tags are recomputed on a weekly schedule; between recomputations, new contracts are assigned to existing tags via nearest-cluster lookup on the embedding space.
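Between weekly recomputations, nearest-cluster assignment is a cosine lookup against the stored cluster centroids. A sketch under stated assumptions: the tag names and centroids are hypothetical, and a similarity floor stands in for HDBSCAN's noise label:

```python
import numpy as np

# Hypothetical cluster centroids produced by the weekly HDBSCAN/c-TF-IDF pass.
TAG_CENTROIDS = {
    "fed-policy": np.array([0.9, 0.1, 0.0]),
    "crypto-etf": np.array([0.1, 0.9, 0.2]),
}

def assign_tag(embedding: np.ndarray, min_sim: float = 0.5) -> str:
    """Nearest-cluster lookup by cosine similarity; below min_sim the
    contract stays untagged ('noise'), mirroring HDBSCAN's noise label."""
    best_tag, best_sim = "noise", min_sim
    for tag, center in TAG_CENTROIDS.items():
        sim = float(np.dot(embedding, center) /
                    (np.linalg.norm(embedding) * np.linalg.norm(center)))
        if sim > best_sim:
            best_tag, best_sim = tag, sim
    return best_tag

tag = assign_tag(np.array([0.85, 0.15, 0.05]))
```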
5. Influence Graph (Horizontal Axis)
With Meta-Markets established, the framework constructs a graph over them. The key design decision is to use 2 parallel edge types, maintained separately and combined only when both are present.
5.1. Thematic edges
A thematic edge between Meta-Markets $M_a$ and $M_b$ reflects content similarity, derived from the semantic embeddings of their constituent contracts. The edge weight is the cosine similarity between the centroid embeddings:

$$w_T(M_a, M_b) = \cos(\bar{S}_a, \bar{S}_b)$$
Thematic edges are cheap to compute, easy to interpret, and useful for navigation and discovery. Their weakness is that topic overlap does not reliably track economic or probabilistic influence. Markets on 2 different football matches will produce high thematic similarity without any useful signal about each other's outcomes. Thematic edges are therefore treated as a hypothesis-generating layer rather than a trading signal in their own right.
5.2. Empirical edges
An empirical edge represents sustained co-movement in prices between 2 Meta-Markets. The naive approach is rolling Pearson correlation, but Pearson has 2 failure modes on prediction market data:
- Binary prediction markets resolve to 0 or 1, which causes prices to compress toward the resolution outcome as the event nears. Any 2 markets resolving toward the same outcome will show high correlation in their terminal windows, even when their underlying events are unrelated.
- Prediction market price series are generally non-stationary, and Pearson assumes stationarity in the compared populations.
The framework addresses both issues. Correlation is estimated on log-odds transformed prices, which expands the resolution-boundary region and reduces the mechanical end-of-contract correlation. Edges are not admitted to the graph based on a single window; an edge is reinforced only when correlation exceeds a threshold across multiple successive windows, and when a permutation test rejects the null hypothesis that the observed correlation arises from contemporaneous trading noise. The combination of these 2 safeguards means empirical edges require both statistical significance and persistence to survive.
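A minimal sketch of the log-odds correlation estimator and the permutation test (the clipping epsilon and shuffle count are illustrative defaults):

```python
import numpy as np

def log_odds(p: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Map prices in (0, 1) to log-odds, clipping away the boundaries."""
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

def log_odds_corr(prices_a, prices_b) -> float:
    """Pearson correlation computed on log-odds-transformed price series."""
    return float(np.corrcoef(log_odds(np.asarray(prices_a)),
                             log_odds(np.asarray(prices_b)))[0, 1])

def permutation_p(prices_a, prices_b, n_shuffles: int = 1000, seed: int = 0) -> float:
    """Permutation test: p-value for the observed absolute correlation
    against shuffles of one series (null: contemporaneous noise)."""
    rng = np.random.default_rng(seed)
    observed = abs(log_odds_corr(prices_a, prices_b))
    b = np.asarray(prices_b, dtype=float)
    hits = sum(abs(log_odds_corr(prices_a, rng.permutation(b))) >= observed
               for _ in range(n_shuffles))
    return (hits + 1) / (n_shuffles + 1)
```

In the production path this runs per window, and an edge is admitted only after the persistence requirement is also met.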
Alternatives to Pearson have been considered and can be swapped in for specific edge types where the default is known to fail. Kendall's tau handles rank-based dependence with less sensitivity to outliers. DCC-GARCH [5] models time-varying correlation structure explicitly. Copula-based measures capture non-linear dependencies that linear correlation misses. Log-odds Pearson is the default because its coefficient is directly interpretable, it computes cheaply in a streaming context, and the persistence-plus-permutation-test combination already delivers the signal quality needed for the edge admission decision.
5.3. Edge combination and labels
An edge between $M_a$ and $M_b$ carries:
- A thematic weight $w_T$ in $[0, 1]$.
- An empirical weight $w_E$ in $[-1, 1]$.
- A label combining the 2: THEMATIC_ONLY, EMPIRICAL_ONLY, CONFIRMED (both present and aligned), or CONFLICTING (both present but pointing in different directions).
The CONFLICTING label is diagnostically useful. A pair of Meta-Markets with high thematic similarity and low empirical correlation typically indicates either a well-known but economically decoupled relationship (same sport, different matches), or an edge that will matter only under specific conditions. A pair with low thematic similarity and high empirical correlation is more interesting: it indicates a hidden economic linkage, often driven by shared macro factors. The influence graph is searchable by edge label, and alerts can be configured on state transitions (e.g., an edge moving from THEMATIC_ONLY to CONFIRMED after several days of sustained correlation).
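The label assignment can be sketched as a small pure function. The thresholds, and the reading of negative empirical weight as CONFLICTING, are illustrative assumptions rather than values fixed by the framework:

```python
def edge_label(w_t, w_e, t_min: float = 0.6, e_min: float = 0.3) -> str:
    """Combine thematic weight (w_t in [0,1]) and empirical weight
    (w_e in [-1,1]) into a dual-layer edge label. None means absent."""
    thematic = w_t is not None and w_t >= t_min
    empirical = w_e is not None and abs(w_e) >= e_min
    if thematic and empirical:
        # Assumption: alignment means positive empirical co-movement.
        return "CONFIRMED" if w_e > 0 else "CONFLICTING"
    if thematic:
        return "THEMATIC_ONLY"
    if empirical:
        return "EMPIRICAL_ONLY"
    return "NONE"
```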
5.4. Graph analytics layer
Once the influence graph is populated and validated, a set of standard graph-theoretic measures becomes available as derived products. The framework supports 3 in particular, each answering a different analytical question.
- Centrality (PageRank and Eigenvector). These identify Meta-Markets whose position in the graph makes their price movements widely consequential. Weighted PageRank [9], computed on the empirical subgraph and filtered by CONFIRMED edges, returns an ordered list of 'anchor' Meta-Markets whose movements propagate through the most reinforced empirical pathways. Eigenvector centrality produces similar rankings with different sensitivity to cluster structure. PageRank is the default. The damping factor keeps scores stable in the presence of dangling nodes and recently-added Meta-Markets without incoming history, convergence is fast on the sparse graphs typical of this domain, and the score has a straightforward interpretation as the stationary distribution of a random walk, which helps when explaining rankings to operators. Eigenvector centrality is retained as a cross-check on the PageRank ordering.
- Betweenness. Betweenness centrality [11] identifies Meta-Markets that sit on the shortest paths between otherwise-disconnected parts of the graph. These are 'bridge' markets. A Fed rate-decision Meta-Market may show high betweenness because it connects bond-market clusters, FX clusters, and equity-market clusters that are only weakly linked to each other directly. Bridge markets are operationally significant because their movements tend to precede correlated moves across multiple clusters at once.
- Community detection (Louvain). The Louvain method [10] partitions the graph into communities of densely-connected nodes. Applied to the empirical subgraph, it surfaces thematic clusters of Meta-Markets that are not necessarily obvious from contract metadata. A community might contain markets on 3 different central bank decisions, 2 commodity markets, and a currency pair, all bound together by observed co-movement. Communities are useful for navigation, correlated-position monitoring, and portfolio construction.
These measures are recomputed on a configurable schedule, because global graph access is required and streaming updates are not feasible at graph-wide scope. A 1-hour cadence is adequate for most use cases; higher frequencies are available at compute cost.
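For the default centrality measure, a self-contained power-iteration sketch of weighted PageRank with uniform redistribution of dangling-node mass. The toy 3-node graph is illustrative; a production system would typically delegate to a graph library:

```python
import numpy as np

def weighted_pagerank(adj: np.ndarray, d: float = 0.85,
                      tol: float = 1e-10, max_iter: int = 200) -> np.ndarray:
    """PageRank on a weighted adjacency matrix (adj[i, j] = weight of
    edge i -> j). Dangling nodes spread their mass uniformly."""
    n = adj.shape[0]
    out = adj.sum(axis=1)
    # Row-stochastic transition matrix; dangling rows stay all-zero.
    P = np.divide(adj, out[:, None], out=np.zeros_like(adj),
                  where=out[:, None] > 0)
    r = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        dangling = r[out == 0].sum()
        r_new = (1 - d) / n + d * (r @ P + dangling / n)
        if np.abs(r_new - r).sum() < tol:
            return r_new
        r = r_new
    return r

# Toy graph: node 0 feeds 1 and 2; node 2 feeds back to 0; node 1 is dangling.
A = np.array([[0.0, 1.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
scores = weighted_pagerank(A)
```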
6. Streaming Validation
The empirical subgraph is maintained by a streaming engine. For each pair of Meta-Markets tagged as 'candidate empirical' (through either thematic proximity or operator-specified interest), the engine maintains a stateful window of synchronized price observations and computes log-odds correlation within that window.
The architecture uses Apache Flink [4] for stream processing, with correlation updates triggered by new price ticks rather than on a fixed clock. Synchronization across venues is handled through a windowed alignment step: price observations within a small time tolerance (default 5 seconds) are treated as contemporaneous, with gaps filled by forward propagation up to a configurable maximum (default 60 seconds). Venues with heterogeneous tick rates are downsampled to a common cadence before correlation is computed: a high-frequency Polymarket stream producing several updates per second is bucketed to match a lower-frequency Kalshi stream, using the latest price within each bucket.
Representative operational defaults, tunable per edge type: window length of 15 minutes for intraday empirical correlation, persistence requirement of $N=4$ successive windows before edge admission, $r_{\text{high}}$ threshold of 0.70, $r_{\text{low}}$ attenuation threshold of 0.30, permutation test with 1,000 shuffles at $p<0.01$. These defaults are starting points; production deployment calibrates them against historical resolution data and against the observed rate of false-positive edge admissions.
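The downsampling step can be sketched as a bucketing pass with bounded forward fill (the cadence and gap values below are illustrative, matching the defaults quoted above):

```python
def bucket_prices(ticks, cadence: float, start: float, end: float,
                  max_gap: float = 60.0):
    """Downsample (timestamp, price) ticks to a fixed cadence: take the
    latest tick at or before each bucket boundary, forward-filling gaps
    up to max_gap seconds; longer gaps yield None (no observation)."""
    ticks = sorted(ticks)
    out, i, last = [], 0, None  # last = newest (timestamp, price) seen
    t = start
    while t <= end:
        while i < len(ticks) and ticks[i][0] <= t:
            last = ticks[i]
            i += 1
        if last is not None and t - last[0] <= max_gap:
            out.append(last[1])
        else:
            out.append(None)
        t += cadence
    return out

# Illustrative: sparse Kalshi-style ticks bucketed to a 15-second cadence.
series = bucket_prices([(0.0, 0.40), (31.0, 0.41)],
                       cadence=15.0, start=0.0, end=60.0)
```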
Edge weights are updated according to 3 rules:
- Reinforcement. A correlation that persists above threshold $r_{\text{high}}$ across $N$ successive windows increases the edge weight, subject to a ceiling that prevents any single edge from dominating the graph.
- Attenuation. A correlation that persists below $r_{\text{low}}$, or that oscillates between high and low without settling, decreases the edge weight. An edge that falls below an attenuation floor is marked ATTENUATED and excluded from signal propagation until it recovers.
- Anomaly. A sudden transition within a single window is flagged for operator review, without triggering automatic weight changes. These sudden transitions often signal that the relationship between 2 Meta-Markets has fundamentally shifted, which tends to be more interesting than steady-state correlation.
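The 3 rules can be sketched compactly; step sizes, floors, ceilings, and the anomaly jump threshold are illustrative assumptions rather than framework-specified constants:

```python
def update_edge(weight: float, window_corrs: list,
                r_high: float = 0.70, r_low: float = 0.30, n_persist: int = 4,
                step: float = 0.1, ceiling: float = 1.0, floor: float = 0.05):
    """Apply reinforcement/attenuation to one edge given the most recent
    per-window correlations. Returns (new_weight, status)."""
    recent = window_corrs[-n_persist:]
    if len(recent) == n_persist and all(abs(c) >= r_high for c in recent):
        weight = min(weight + step, ceiling)   # reinforcement
    elif len(recent) == n_persist and all(abs(c) <= r_low for c in recent):
        weight = max(weight - step, 0.0)       # attenuation
    status = "ATTENUATED" if weight < floor else "ACTIVE"
    return weight, status

def is_anomaly(window_corrs: list, jump: float = 0.5) -> bool:
    """Flag a sudden single-window transition for operator review;
    no automatic weight change is applied."""
    return len(window_corrs) >= 2 and abs(window_corrs[-1] - window_corrs[-2]) >= jump
```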
The streaming layer is designed so that the graph can be read at any time for downstream applications, with a clear indication of the freshness and confidence of each edge. Edge staleness is computed from the age of the most recent window included in the edge's estimate.
7. Applications
7.1. Cross-venue arbitrage
Returning to the running example: Contract A trades at 0.40 on Polymarket, and Contract B trades at 0.40 on Kalshi. The canonical Meta-Market M_FedCuts2026_Zero has both as members, with compatible oracles and equivalent resolution outcomes.
The theoretical anchor for this application is the Law of One Price (LoP): identical claims should trade at identical prices in frictionless markets. Prediction markets are not frictionless, and observed price divergence is the structural cost of that friction. The arbitrage boundary for a Meta-Market is:

$$\Delta P = |P_A - P_B| > \varepsilon$$
Where $\varepsilon$ is the sum of round-trip frictions (transaction costs, slippage, basis risk, counterparty risk). When $\Delta P$ exceeds $\varepsilon$ across the constituent contracts of a Meta-Market, an arbitrage exists, and the price-dispersion reading becomes an empirical measure of how far the venue-level market deviates from LoP equilibrium.
The framework computes a unified price estimate, weighted by liquidity and recent trade volume:

$$P_M = \frac{\sum_i w_i P_i}{\sum_i w_i}$$
For the running pair, $P_M \approx 0.40$ and $\Delta P \approx 0$. For this high-liquidity macro contract, prices have converged tightly, and the framework reports no actionable arbitrage. The $\varepsilon$ components for this specific pair:
- Polymarket: taker fees apply on most categories as of April 2026 under the dynamic taker-fee model, with peak rates of approximately 1.00% on Finance and Politics markets, 1.50% on Economics, and 0% on geopolitics. At $p = 0.40$, the Finance-category fee is approximately $0.04 \cdot p \cdot (1 - p) \approx \$0.0096$, i.e., about 0.96¢ per contract. Maker orders remain free and earn a rebate. Gas cost on Polygon is negligible at typical contract sizes.
- Kalshi: taker fee $0.07 \cdot p \cdot (1 - p)$ per contract, which at $p = 0.40$ is 1.7¢ per contract (contracts settle at $1, so ~1.7% of payout). Maker fees are approximately a quarter of this.
- USDC-USD basis risk: typically within 10bps under normal conditions, wider during stablecoin stress.
- Counterparty and smart contract risk on Polymarket; regulatory execution risk on Kalshi.
Total $\varepsilon$ for this pair is on the order of 3 percentage points under normal conditions for taker-only execution, with maker-side execution reducing that by roughly a third. A spread of 0 is inside $\varepsilon$, and the framework correctly flags no opportunity.
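The friction arithmetic above can be sketched directly. The fee coefficient follows the quoted Kalshi-style formula; the epsilon values are illustrative:

```python
def kalshi_taker_fee(p: float, coeff: float = 0.07) -> float:
    """Kalshi-style taker fee per $1 contract: coeff * p * (1 - p)."""
    return coeff * p * (1 - p)

def arbitrage_edge(price_a: float, price_b: float, epsilon: float) -> float:
    """Net edge per unit notional after round-trip frictions; a value
    <= 0 means the spread sits inside the friction band."""
    return abs(price_a - price_b) - epsilon

# Running pair: both at 0.40, epsilon ~3 percentage points -> no opportunity.
edge_converged = arbitrage_edge(0.40, 0.40, 0.03)
# Hypothetical divergent pair (0.43 vs 0.37): positive edge once
# the pair-specific epsilon is subtracted.
edge_divergent = arbitrage_edge(0.43, 0.37, 0.02)
```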
For illustration of the opposite case, cross-venue divergence of 5 to 14 percentage points has been documented on less-liquid or politically charged contracts. Ng et al. (2026) [16] study common contracts traded across Polymarket, Kalshi, PredictIt, and Robinhood during the 2024 US election and document persistent cross-platform price disparities, with Polymarket leading price discovery during high-liquidity periods and Kalshi exhibiting lags that create exploitable windows. Saguillo et al. (2025) [7] analyze 86 million bids across Polymarket and quantify approximately USD 40 million in extracted arbitrage between April 2024 and April 2025. When a case of this kind arises on a Meta-Market with the same $\varepsilon$ components as above, $\Delta P$ substantially exceeds $\varepsilon$ and the framework surfaces a candidate trade. Suppose, for a different Meta-Market on a politically-charged outcome, Contract A trades at 0.43 and Contract B at 0.37. Then $P_M \approx 0.40$, $\Delta P = 0.06$, and net-of-$\varepsilon$ edge $\approx 0.04$ per unit notional. The framework produces an ordered list of such candidates. Execution remains the responsibility of the downstream user, who has their own risk constraints and capital availability.
At scale, aggregating these opportunities across many Meta-Markets and many venues produces a continuous stream of candidate trades, each tagged with an expected edge and a confidence level. For market makers and systematic traders, this is the core commercial output of the vertical axis.
7.2. Macro signal distillation
The horizontal axis supports a different kind of use case: inferring probability shifts in peripheral Meta-Markets from movements in anchor Meta-Markets. Anchor status is a derived property. Weighted PageRank on the CONFIRMED subgraph returns an ordered list of Meta-Markets whose movements have historically propagated most widely through reinforced empirical pathways. The nodes at the top of the list are the anchors to watch first.
Suppose a scheduled Federal Reserve speech causes the price of M_FedCuts2026_Zero, a persistent high-PageRank node, to move from 0.40 to 0.28 within an hour (a hawkish reassessment: the probability of zero cuts falls as the market re-prices the likelihood of cuts). The framework identifies the relevant neighborhood of peripheral markets through the most recent Louvain partition of the empirical subgraph: rate-sensitive equity and index Meta-Markets, dollar-index Meta-Markets, 10-year Treasury yield Meta-Markets, and others with reinforced empirical edges to the anchor. Each edge has a weight derived from both thematic proximity and sustained empirical correlation with M_FedCuts2026_Zero.
A signal-propagation step uses these edge weights to compute expected probability shifts in the connected Meta-Markets. The result describes the price moves that would keep peripheral Meta-Markets consistent with the anchor move and with the historical relationship; it is a relational output describing expected motion rather than absolute price levels. Traders can compare the predicted shift to the observed shift in each peripheral Meta-Market. A peripheral market that has not yet moved in line with the implied shift is a lagging opportunity. A peripheral market moving in the opposite direction is either an anomaly worth investigating or a signal that the historical relationship has broken.
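A simplified linear sketch of the propagation step, assuming the expected peripheral move is the anchor's log-odds change scaled by the edge weight (the production propagation model may be richer; the prices and weight below are illustrative):

```python
import math

def propagate_shift(anchor_before: float, anchor_after: float,
                    peripheral_price: float, edge_weight: float) -> float:
    """Expected peripheral price implied by an anchor move: scale the
    anchor's log-odds change by the edge weight and apply it to the
    peripheral market's log-odds."""
    lo = lambda p: math.log(p / (1 - p))          # price -> log-odds
    inv = lambda x: 1 / (1 + math.exp(-x))        # log-odds -> price
    d_anchor = lo(anchor_after) - lo(anchor_before)
    return inv(lo(peripheral_price) + edge_weight * d_anchor)

# Anchor falls 0.40 -> 0.28; a peripheral market at 0.55 with edge weight 0.6
# is expected to drift down as well.
expected = propagate_shift(0.40, 0.28, 0.55, 0.6)
```

Comparing `expected` against the observed peripheral price is what distinguishes a lagging opportunity from an anomaly.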
Bridge Meta-Markets identified by high betweenness centrality need separate treatment. A move in a bridge market hits multiple communities simultaneously, with downstream propagation paths spreading across each affected cluster. Operators watching a bridge move have to scan multiple neighborhoods in parallel, and the signal-propagation step runs across each affected community independently.
This use case depends on the empirical validation of the graph. A thematic-only graph would propose propagation paths with no empirical basis, and users would rapidly lose trust in the output. The CONFIRMED edge label is the operational signal that a propagation path is safe to act on.
8. Limitations and Open Problems
Several classes of problem are not fully addressed by the framework as described, and merit explicit acknowledgment.
- Temporal shifts. 2 Meta-Markets can refer to the same event but with different resolution windows (e.g., 'Fed cut by May' vs 'Fed cut by June'). The current framework treats these as distinct Meta-Markets, with an influence edge between them once their correlation is established. This is correct but inefficient. A future extension would add a temporal-nesting relation that expresses containment directly.
- Multi-outcome contracts. The framework is described in terms of binary markets. Categorical and continuous-outcome markets (e.g., 'What rate will the Fed set in May?' with multiple tranches) require a more general treatment. The feature vector extends naturally; the oracle-compatibility predicate and correlation estimator both need adaptation.
- Adversarial listings. A venue could list a near-duplicate contract with subtly different resolution criteria to game any merging system. The framework partially mitigates this through the oracle-compatibility gate and the training data used for the probabilistic matcher, but an adversarial listing that passes both could produce a false merge. Active monitoring of resolution outcomes against pre-resolution Meta-Market membership is the current mitigation.
- Oracle disagreement. When 2 oracles that have historically agreed produce different resolutions for a given event, the Meta-Markets they belonged to must be retrospectively split and the historical graph recomputed. The framework supports this operationally but at significant compute cost for events with many downstream dependencies.
- Cold start. On day 1, the empirical subgraph contains no edges. Bootstrapping relies on thematic edges plus operator-curated seed pairs until sufficient correlation history has accumulated. The time required to reach a useful empirical graph depends on the frequency of the events in question and the diversity of the venues ingested.
- Settlement basis. Contracts denominated in USDC and USD can diverge persistently due to depegs or liquidity stress. Merging across settlement currencies without accounting for this produces a canonical price that is internally inconsistent. The framework currently treats settlement currency as part of the feature vector and adjusts the unified price for basis, but the adjustment model is simple and subject to error during stress periods.
- Latency floor. Streaming correlation estimation introduces a lower bound on how quickly an edge can be reinforced or attenuated, set by window size and persistence requirements. Fast-moving regime shifts may therefore be identified only after a delay. Anomaly flags mitigate this by surfacing transitions earlier, but they do not eliminate the floor.
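The temporal-nesting relation proposed in the first limitation can be expressed as a simple containment predicate. The `MetaMarket` record, field names, and prices below are hypothetical illustrations; a real implementation would operate on the canonical event layer rather than a flat dataclass.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class MetaMarket:
    underlying: str      # canonical event family, e.g. "fed_first_cut"
    deadline: date       # end of resolution window ("by <deadline>")
    price: float         # unified probability

def temporally_nested(a: MetaMarket, b: MetaMarket) -> bool:
    """a is nested in b when both track the same underlying event and
    a's window is contained in b's: a resolving YES forces b YES."""
    return a.underlying == b.underlying and a.deadline <= b.deadline

def nesting_violation(a: MetaMarket, b: MetaMarket) -> bool:
    """Containment implies P(a) <= P(b); a narrower window priced above
    a wider one is an internal inconsistency worth surfacing."""
    return temporally_nested(a, b) and a.price > b.price

may = MetaMarket("fed_first_cut", date(2026, 5, 31), 0.34)
june = MetaMarket("fed_first_cut", date(2026, 6, 30), 0.31)
assert nesting_violation(may, june)   # "by May" priced above "by June"
```

Expressing containment directly, rather than waiting for a correlation edge to form, is what makes the proposed extension more efficient than the current treatment of such pairs as independent Meta-Markets.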
9. Conclusion
Fragmentation is the dominant structural feature of prediction market data today. Different venues list different contracts on the same events, with different resolution criteria, different oracles, and different settlement currencies. Reading the collective signal across venues requires a framework that can determine when 2 contracts refer to the same event, and when 2 events are related enough that one conditions the probability of the other.
The Predictive Labs Framework separates these 2 concerns architecturally. Identity is resolved through oracle-gated probabilistic entity resolution. Influence is represented in a 2-layer graph, with thematic similarity and empirical correlation maintained as independent edge types and combined through explicit labels. Streaming validation keeps the empirical component aligned with current market behavior.
The framework uses established methods throughout. What is specific to this work is the architectural separation between identity and influence, the set of domain-aware adaptations that make each layer correct on prediction-market data (oracle gating, log-odds correlation, dual edge labels), and the engineering decisions that make the system runnable against live venues. Fed with live data, the system produces a canonical, queryable event-level layer over the prediction market ecosystem. The fragmentation problem does not disappear. It becomes something a team can operate against.
This paper describes the framework as designed and partially implemented, ahead of deployed results. The running example draws on real contract metadata and representative pricing snapshots from live Polymarket and Kalshi listings; the derived quantities (cosine similarities, likelihood ratios, unified prices, propagation magnitudes) illustrate the kind of output the pipeline produces and are not measurements from a production system. A companion results paper is planned once the system has run against live data for a meaningful window, covering entity-resolution precision and recall on a human-labeled test set, graph calibration against realized correlations, and the hit rate of surfaced arbitrages net of execution cost.
References
- Fellegi, I. P., & Sunter, A. B. (1969). 'A Theory for Record Linkage.' Journal of the American Statistical Association, 64(328), 1183–1210.
- Reimers, N., & Gurevych, I. (2019). 'Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks.' arXiv:1908.10084.
- Song, K., Tan, X., Qin, T., Lu, J., & Liu, T.-Y. (2020). 'MPNet: Masked and Permuted Pre-training for Language Understanding.' arXiv:2004.09297.
- Carbone, P., et al. (2015). 'Apache Flink: Stream and Batch Processing in a Single Engine.' Bulletin of the IEEE Computer Society Technical Committee on Data Engineering.
- Engle, R. F. (2002). 'Dynamic Conditional Correlation: A Simple Class of Multivariate GARCH Models.' Journal of Business and Economic Statistics, 20(3), 339–350.
- Neo4j Graph Data Science Documentation. 'Labeled Property Graph Modeling.'
- Saguillo, O., Ghafouri, V., Kiffer, L., & Suarez-Tangil, G. (2025). 'Unravelling the Probabilistic Forest: Arbitrage in Prediction Markets.' 7th Conference on Advances in Financial Technologies (AFT 2025). arXiv:2508.03474.
- Lamont, O. A., & Thaler, R. H. (2003). 'Anomalies: The Law of One Price in Financial Markets.' Journal of Economic Perspectives, 17(4), 191–202.
- Page, L., Brin, S., Motwani, R., & Winograd, T. (1999). 'The PageRank Citation Ranking: Bringing Order to the Web.' Stanford InfoLab.
- Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). 'Fast unfolding of communities in large networks.' Journal of Statistical Mechanics, 2008(10), P10008.
- Freeman, L. C. (1977). 'A Set of Measures of Centrality Based on Betweenness.' Sociometry, 40(1), 35–41.
- Grootendorst, M. (2022). 'BERTopic: Neural Topic Modeling with a Class-based TF-IDF Procedure.' arXiv:2203.05794.
- McInnes, L., Healy, J., & Astels, S. (2017). 'hdbscan: Hierarchical Density Based Clustering.' Journal of Open Source Software, 2(11), 205.
- Polymarket. (2026). 'How many Fed rate cuts in 2026?' polymarket.com/event/how-many-fed-rate-cuts-in-2026.
- Kalshi Inc. (2026). 'Contract Terms: KXRATECUTCOUNT, FEDDECISION.' kalshi-public-docs.s3.amazonaws.com/contract_terms.
- Ng, H., Peng, L., Tao, Y., & Zhou, D. (2026). 'Price Discovery and Trading in Modern Prediction Markets.' SSRN Working Paper 5331995. https://ssrn.com/abstract=5331995.