Data provenance

Forecast trust starts with traceable data and methodology changes

iPulse is built so forecasts can be traced back to the subject, record category, data version, schema, AI Agent configuration, task, prompt assembly, output format, and prediction batch that produced them.

Updated July 3, 2026

Overview

Provenance means the forecast has a memory

In market intelligence, a model answer is not enough. A serious system needs to know which asset was analyzed, which data category was used, when the data snapshot was available, which task was executed, which AI Agent configuration produced the report, and which output schema the result had to satisfy.

This is why iPulse treats data management as part of methodology. The forecast is the visible result, but the trust layer starts earlier: subject classification, record classification, schema control, data quality checks, lineage, and product access rules.

Public docs explain the provenance model without exposing private provider names, production prompt text, pipeline schedules, credentials, infrastructure identifiers, or operational records that would weaken security or vendor confidentiality.

Lineage

Every forecast should be inspectable later

Subject lineage

The asset or broader subject is classified by category, sector, region, market type, ticker or identifier, and the record universe that applies to it.

Data lineage

Market records, fundamentals, events, global context, feature signals, analytics outputs, simulations, and prediction outputs are treated as separate information classes.

Configuration lineage

Forecasts connect back to the AI Agent, model/backend assignment, mode, task configuration, prompt assembly, output schema, and generation batch.

Product lineage

Config Details and prediction time travel let eligible signed-in users inspect prior forecast batches instead of seeing only the latest result.
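
As an illustration only, the four lineage layers above can be pictured as one record stored with each forecast, plus a lookup that answers "which batch was current on a given date". This is a minimal sketch, not the iPulse data model: the class name, field names, identifiers, and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class ForecastLineage:
    """Hypothetical lineage record attached to one forecast batch."""
    subject_id: str          # subject lineage: which asset or subject
    record_category: str     # data lineage: which information class
    data_version: str        # data lineage: which snapshot
    agent_config_id: str     # configuration lineage: AI Agent setup
    prompt_assembly_id: str  # configuration lineage: prompt assembly
    output_schema: str       # configuration lineage: required output shape
    batch_id: str            # generation batch
    generated_at: datetime


def batch_as_of(history: list[ForecastLineage],
                when: datetime) -> Optional[ForecastLineage]:
    """Return the latest record generated at or before `when`, if any."""
    eligible = [rec for rec in history if rec.generated_at <= when]
    return max(eligible, key=lambda rec: rec.generated_at, default=None)


# Made-up history for one subject (all values are placeholders).
history = [
    ForecastLineage("ASSET-1", "market_records", "data-v1", "agent-a",
                    "assembly-1", "report-v1", "batch-001",
                    datetime(2026, 1, 5)),
    ForecastLineage("ASSET-1", "market_records", "data-v2", "agent-a",
                    "assembly-2", "report-v2", "batch-002",
                    datetime(2026, 3, 9)),
]

earlier = batch_as_of(history, datetime(2026, 2, 1))
print(earlier.batch_id if earlier else "no batch yet")
```

Keeping the record immutable (`frozen=True`) reflects the idea that lineage describes what already happened and should not be edited after the fact.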

Schema registry

The platform is heavily schema-registry driven

iPulse uses schema discipline so that records, outputs, access rules, and application surfaces can be checked consistently. Internally, the data platform and application layers currently rely on more than 60 active schemas and tables. The public point is not the private table list; it is that forecast outputs are not treated as loose text blobs.

A schema tells the system what a record is allowed to contain, which fields matter, how validation should behave, and how future versions can be compared against older versions. That makes methodology evolution possible without losing the ability to inspect what happened before a change.
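
To make that concrete, here is a minimal sketch of what "a schema tells the system what a record is allowed to contain" can look like, along with a comparison between two schema versions. The schema format, field names, and version contents are assumptions made up for this example; they do not describe the private iPulse registry.

```python
from typing import Any

# Hypothetical schema format: field name -> (expected type, required?)
Schema = dict[str, tuple[type, bool]]

FORECAST_OUTPUT_V1: Schema = {
    "subject_id": (str, True),
    "horizon_days": (int, True),
    "direction": (str, True),
}

# A later version that adds one required field.
FORECAST_OUTPUT_V2: Schema = {
    **FORECAST_OUTPUT_V1,
    "confidence": (float, True),
}


def validate(record: dict[str, Any], schema: Schema) -> list[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors = []
    for name, (expected, required) in schema.items():
        if name not in record:
            if required:
                errors.append(f"missing required field: {name}")
            continue
        if not isinstance(record[name], expected):
            errors.append(
                f"{name}: expected {expected.__name__}, "
                f"got {type(record[name]).__name__}"
            )
    # Fields the schema does not know about are rejected, not ignored.
    for name in record:
        if name not in schema:
            errors.append(f"unexpected field: {name}")
    return errors


def added_fields(old: Schema, new: Schema) -> set[str]:
    """Fields present in the newer schema version but not the older one."""
    return set(new) - set(old)


record = {"subject_id": "ASSET-1", "horizon_days": 30, "direction": "up"}
print(validate(record, FORECAST_OUTPUT_V1))
print(validate(record, FORECAST_OUTPUT_V2))
print(added_fields(FORECAST_OUTPUT_V1, FORECAST_OUTPUT_V2))
```

The same record passes the older version and fails the newer one, which is the property that lets an older forecast be inspected against the schema that was in force when it was produced.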

Methodology changelog

The architecture changed many times because the ambition grew

Over the last three years, iPulse has been redesigned from the ground up many times. The early product was simpler, but the scope kept expanding: more asset classes, more forecast horizons, richer advisor reports, stricter schema validation, broader global context, historical prediction inspection, and clearer user-facing transparency.

Those changes taught an important lesson: a research product cannot simply add AI on top of messy data and hope the answer is trustworthy. The architecture had to become data-first, schema-aware, versioned, and auditable so the product could keep growing without becoming opaque.

Data model changes

New subject categories, record categories, feature families, governance records, and output stores are added only when they improve traceability or future extensibility.

Prompt and agent changes

AI Agent profiles, modes, model assignments, task rules, and prompt assemblies evolve as the product learns which structures produce clearer and more useful reports.

Scoring changes

Consensus scoring is adjusted only when the change improves interpretability, dividend treatment, volatility handling, or cross-advisor comparison.

Product transparency changes

Features such as Config Details and prediction time travel exist because forecast outputs are more useful when users can inspect the configuration behind them.

Transparency boundary

What is public, gated, and private

Public

Methodology docs, public AI Agent profiles, scoring explanations, glossary material, privacy boundaries, legal identity, and high-level architecture.

Gated in product

Config Details, advisor reports, asset-specific prediction batches, and historical forecast inspection depend on sign-in status, subscription plan, and asset access.

Private

Full production prompt text, provider contracts, private pipeline internals, credentials, security identifiers, and proprietary operational records.
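
The three tiers above can be read as a small decision rule. The sketch below is one hypothetical way to express it: the type names, plan names, and asset identifiers are invented for illustration and are not the product's actual access logic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Visibility(Enum):
    PUBLIC = "public"
    GATED = "gated"
    PRIVATE = "private"


@dataclass(frozen=True)
class User:
    signed_in: bool = False
    plan: str = "none"                       # hypothetical plan name
    asset_ids: frozenset[str] = frozenset()  # assets this user may access


@dataclass(frozen=True)
class Resource:
    visibility: Visibility
    allowed_plans: frozenset[str] = frozenset()
    asset_id: Optional[str] = None           # None: not tied to one asset


def can_view(user: User, resource: Resource) -> bool:
    """Apply the public / gated / private boundary to one request."""
    if resource.visibility is Visibility.PUBLIC:
        return True
    if resource.visibility is Visibility.PRIVATE:
        return False  # never served through the product
    # Gated: sign-in status, subscription plan, and asset access all apply.
    if not user.signed_in:
        return False
    if user.plan not in resource.allowed_plans:
        return False
    return resource.asset_id is None or resource.asset_id in user.asset_ids


methodology_doc = Resource(Visibility.PUBLIC)
config_details = Resource(Visibility.GATED, frozenset({"pro"}), "ASSET-1")
prompt_text = Resource(Visibility.PRIVATE)

subscriber = User(signed_in=True, plan="pro", asset_ids=frozenset({"ASSET-1"}))
visitor = User()

print(can_view(visitor, methodology_doc))
print(can_view(subscriber, config_details))
print(can_view(subscriber, prompt_text))
```

Private material is refused before any user attribute is checked, so no combination of plan and asset access can unlock it.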

Corrections

Public documentation can be corrected

Users, crawlers, and AI systems should not have to guess which page is canonical. Methodology pages link through a single content map, legacy URLs redirect to canonical docs, and material corrections are reflected in the public page content instead of kept only in internal notes.
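
A minimal sketch of the "legacy URLs redirect to canonical docs" idea follows, assuming a simple redirect table. The paths are placeholders, not real iPulse URLs, and the function name is hypothetical.

```python
# Hypothetical redirect table: legacy path -> newer path.
LEGACY_REDIRECTS = {
    "/methodology/provenance": "/docs/old-provenance",
    "/docs/old-provenance": "/docs/data-provenance",
}


def canonical_path(path: str, redirects: dict[str, str] = LEGACY_REDIRECTS) -> str:
    """Follow redirects until reaching a path that has no further redirect."""
    seen = set()
    while path in redirects:
        if path in seen:
            raise ValueError(f"redirect loop at {path}")
        seen.add(path)
        path = redirects[path]
    return path


print(canonical_path("/methodology/provenance"))
print(canonical_path("/docs/data-provenance"))
```

Resolving the whole chain in one step means a reader or crawler arriving at any legacy path ends at a single canonical page, and the loop guard catches a misconfigured table instead of spinning forever.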

To report an issue with the methodology docs, use the contact page and include the page URL, claim, expected correction, and supporting context.