1. Ingestion & classification

The raw stream (financial news plus posts from tracked actors) is classified by explicit rules into typed events: whale transfer, regulation, M&A deal, macro policy, earnings, partnership, upgrade, hack/exploit, listing, sanction, and ETF flow. There is no LLM guessing in the hot path: every type is rule-derived and auditable.
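To make "rule-derived and auditable" concrete, here is a minimal sketch of a first-match rule classifier. The patterns, the RULES table and the classify function are illustrative assumptions, not the production rule set; the point is only that each returned type carries the rule that produced it.

```python
import re
from typing import Optional, Tuple

# Hypothetical rule table: (event type, pattern). Order matters: first match wins.
RULES = [
    ("hack/exploit", re.compile(r"\b(hack(ed)?|exploit(ed)?|drained)\b", re.I)),
    ("ETF flow", re.compile(r"\bETF\b.*\b(inflow|outflow)s?\b", re.I)),
    ("M&A deal", re.compile(r"\b(acquires?|acquisition|merger)\b", re.I)),
    ("earnings", re.compile(r"\b(earnings|EPS|quarterly results)\b", re.I)),
]


def classify(text: str) -> Optional[Tuple[str, str]]:
    """Return (event_type, pattern_that_fired), or None if no rule matches."""
    for event_type, pattern in RULES:
        if pattern.search(text):
            # Returning the pattern alongside the type is what makes the call auditable.
            return event_type, pattern.pattern
    return None
```

An item that matches no rule comes back as None instead of being forced into a type.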

2. Two-stage denoise

A fact survives only if it carries a type, a resolved subject and, where the event implies one, a magnitude. Generic "10 stocks to buy" headlines are dropped at the source. In a recent run this took 28,179 raw items down to 2,402 typed facts, removing about 91%, with zero false positives in review.
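The survival rule and the 91% figure can be sketched as follows. The Fact fields and the set of types that imply a magnitude are assumptions made for illustration; only the three-part rule and the two counts come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: which event types imply a magnitude is illustrative only.
MAGNITUDE_TYPES = {"whale transfer", "M&A deal", "ETF flow"}


@dataclass
class Fact:
    event_type: Optional[str]
    subject_id: Optional[str]
    magnitude_usd: Optional[float] = None


def survives(fact: Fact) -> bool:
    """Keep a fact only if it has a type, a resolved subject and any implied magnitude."""
    if fact.event_type is None or fact.subject_id is None:
        return False
    if fact.event_type in MAGNITUDE_TYPES and fact.magnitude_usd is None:
        return False
    return True


def removed_fraction(raw: int, kept: int) -> float:
    """Share of raw items dropped by the denoise step."""
    return 1 - kept / raw


# 28,179 raw items reduced to 2,402 typed facts: roughly 0.91 removed.
print(round(removed_fraction(28_179, 2_402), 2))
```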

3. Entity resolution

Every subject is resolved to one canonical ID: NVDA, Nvidia, @nvidia and $NVDA all map to the same ticker, and the same holds across equities and crypto. We favor precision over recall: the resolver returns null rather than risk a wrong merge, so your agent never silently conflates two entities.
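A minimal sketch of the precision-first behaviour, assuming a hypothetical exact-match alias table (ALIASES and resolve are illustrative names, not the real resolver). There is deliberately no fuzzy matching: anything not in the table resolves to None, Python's equivalent of null.

```python
from typing import Optional

# Hypothetical alias table; keys are lowercased mentions, values are canonical IDs.
ALIASES = {
    "nvda": "NVDA",
    "$nvda": "NVDA",
    "nvidia": "NVDA",
    "@nvidia": "NVDA",
}


def resolve(mention: str) -> Optional[str]:
    """Return the canonical ID for an exact known alias, else None (no guessing)."""
    key = mention.strip().lower()
    return ALIASES.get(key)
```

A near-miss such as "Nvidia Partners LLC" returns None instead of being merged into NVDA, which is the trade-off the paragraph above describes.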

4. Magnitude & direction

Where the text carries a dollar figure, it is parsed to a USD magnitude; a direction hint is attached. The hint is explicitly a hint, never a graded call: we do not dress a headline up as a signal.
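Parsing a dollar figure to a USD magnitude might look like the sketch below. The regular expression, the suffix table and the parse_usd name are assumptions for illustration; real headlines contain formats this does not handle (ranges, other currencies, spelled-out numbers).

```python
import re
from typing import Optional

# Assumed suffix multipliers; longer suffixes are listed first in the pattern.
_MULTIPLIERS = {
    "k": 1e3, "m": 1e6, "million": 1e6,
    "b": 1e9, "bn": 1e9, "billion": 1e9,
    "t": 1e12, "trillion": 1e12,
}
_USD = re.compile(
    r"\$\s?(\d[\d,]*(?:\.\d+)?)\s?(trillion|billion|million|bn|k|m|b|t)?\b",
    re.I,
)


def parse_usd(text: str) -> Optional[float]:
    """Return the first dollar figure in text as a USD amount, or None if absent."""
    match = _USD.search(text)
    if match is None:
        return None
    number = float(match.group(1).replace(",", ""))
    suffix = (match.group(2) or "").lower()
    return number * _MULTIPLIERS.get(suffix, 1.0)
```

Text with no dollar figure yields None, so a fact never receives an invented magnitude.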

5. Canonical events & the graph

Facts that mark a structural inflection are dated and graded with a p_canonical score; together they form the early-warning layer. The knowledge graph is fed by interpreted, typed edges rather than co-mention noise, so what you traverse is meaning, not coincidence.
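The difference between typed edges and co-mention can be illustrated with a small sketch. The Edge and Graph classes, the relation labels and the entity IDs are all hypothetical; the real graph schema is not described here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Edge:
    source: str    # canonical ID of the acting entity
    relation: str  # typed relation, e.g. "acquires" or "partners_with"
    target: str    # canonical ID of the entity acted upon
    date: str      # ISO date of the underlying fact


class Graph:
    def __init__(self) -> None:
        self._out: Dict[str, List[Edge]] = defaultdict(list)

    def add(self, edge: Edge) -> None:
        self._out[edge.source].append(edge)

    def neighbors(self, node: str, relation: str) -> List[str]:
        """Follow only edges of the requested type; other relations are ignored."""
        return [e.target for e in self._out.get(node, []) if e.relation == relation]


# Placeholder entities, not real data.
g = Graph()
g.add(Edge("ACME", "acquires", "GLOBEX", "2024-01-15"))
g.add(Edge("ACME", "partners_with", "INITECH", "2024-02-01"))
print(g.neighbors("ACME", "acquires"))
```

Because every edge carries a relation, a traversal can ask "whom did ACME acquire?" instead of "who was mentioned near ACME?".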

What we deliberately do not do

We do not claim a price edge, we do not give investment advice, and we do not expose the internal paper-trading books - those are research instrumentation, not the product. Every number on the track record is forward-only, with the control group and global false-discovery rate disclosed.