The Discovery Engine

Science Has Parts

A foundation has money for three experiments. A lab finished a failed run yesterday, but the result is still in an instrument export and a thread. A model proposes a trial design this morning against last month’s literature. By Friday, the experiments will be chosen before the failed run is absorbed, and the old assumption will buy another month of work.

That is the failure: the correction arrives after the decision.

Science already has most of the parts: instruments, papers, datasets, code, peer review, clinical records, AI agents, cloud labs, funders, and expert communities. What it lacks is the drivetrain that lets one piece change the next before another decision is made.

Those parts do not share the thing that matters most: the current state of the claim. Papers, datasets, lab traces, model outputs, review notes, funding decisions, and clinical observations all record activity in different places. The diagnosis sits beside older infrastructure efforts: FAIR data principles (Wilkinson et al., 2016), nanopublications (Groth et al., 2010), scientific workflow provenance (W3C PROV), and knowledge systems such as the Open Research Knowledge Graph. The point here is narrower: not better metadata alone, but a governed state transition.

The fragmentation reaches beyond tooling into epistemic state. A paper can tell you what an author claimed. It does not, by itself, tell the next system what changed, where the change applies, what confidence moved, what contradiction appeared, what depends on the claim, who attested it, and what should be tested next.

This is why a scientist still assembles the car by hand. She searches the literature in one place, checks data in another, reads code in a third, reconstructs methods from a supplement, asks a colleague whether a failure was real, opens a model chat that will not remember the correction tomorrow, and then writes a narrative artifact that another person has to reverse-engineer later. Wrong trial assumptions continue. Failed experiments repeat. Funders buy isolated reports. Patients wait while updates remain trapped in local memory. The system contains intelligence and labor. It does not contain a shared transition object.

The engine science needs is a procedure: every artifact should be able to propose a governed change to the shared record, and every accepted change should be able to guide the next task in time to matter.

One business analogy helps, but it is secondary. The useful analogy in Rippling is that workforce operations become more powerful when HR, IT, finance, payroll, identity, devices, permissions, workflows, and reporting can read and write shared employee state. The lesson is shared operational state: many surfaces acting on one underlying object.

Start from a week of use, not the object model. A foundation wants to know what to fund next. A student wants a real task at the frontier. A robotic lab has a failed run to deposit. A model makes a prediction that should either earn calibration or lose it. The engine exists so each of those ordinary actions lands in the same frontier instead of disappearing into four separate systems.

Science needs an equivalent shared object. For science, that object is the scientific state transition, not the person, paper, dataset, project, lab, agent, or grant around it.

01 Record Constellations Sky Science needs a shared record where findings, failures, corrections, and dependencies can be inherited. 02 Engine Discovery Engine Transition loop Activity becomes governed state transitions that can guide the next task before another decision is made. 03 Body Terafactory Age Public fork The engine reaches labs, factories, capital, and government as either public infrastructure or closed bodies.

Fig. 01. The trilogy. The public frame is Record, Engine, Body. In the essay-world, the same movement is Sky, Engine, Body: the shared record, the transition loop, and the physical infrastructure that acts on it.

Between those two arguments sits the engine. Constellations of Borrowed Light argues that science needs a shared record. The Terafactory Age asks whether the engine reaches the physical world as an open public body or as closed private bodies first. This page names the machinery between them.

The public structure is Record, Engine, Body: the record remembers, the engine moves, the body acts. Activity becomes state, state guides action, and action returns to the record.

The claim here is narrower and more mechanical than the vision around it: if scientific work is going to compound across humans, agents, world models, and labs, the basic operating unit has to change from an artifact to a governed state transition.

The Shared Object

The paper becomes an artifact around the primitive. The primitive is the state transition.

In plain English, a scientific state transition is a reviewed update to what a field currently believes or can act on, with the evidence, scope, provenance, confidence movement, and downstream effects attached.

Take a deliberately schematic mechanism claim in Alzheimer’s: a vascular-inflammatory signature predicts early cognitive decline. Use it as a toy frontier for mechanics rather than a live biomedical assertion. Today a new cohort result may narrow such a claim to APOE4-positive patients above sixty-five, but the change disperses through papers, talks, private reviewer judgment, grant summaries, trial-design spreadsheets, and model context windows. In the engine, the new evidence becomes a proposed state transition. The reviewer judges whether that proposed change should merge.

For the reader, the object should feel less like protocol machinery than like three ordinary questions: What is being proposed? What would change if it were accepted? Who is willing to sign that change, and under what scope?

The names are implementation handles. The reader only needs the sequence: proposal, visible diff, reviewed commit, replayable event.

First, someone proposes a change. Then reviewers see what would change if the proposal were accepted. If it passes, the accepted package carries enough structure for another system to replay the effect, audit the evidence, inspect the provenance, recover the review target, and understand the conditions under which the change holds. A replayable event records the signed state change; the living frontier renders the result.

In the reference implementation, the diff is called ScientificDiff, the accepted package is ScientificCommit, the event is Vela Event, and the living frontier is Atlas. Those names matter less than the boundary between proposed change, reviewed acceptance, and replayable update.

The object is small enough to be boring and strong enough to carry the system:

Transition object

scientific state transition =

prior state
proposed change
evidence packet
context and scope
provenance
attestation target
replay effect

That is the equivalent of a commit in software: the thing every surface can read, update, audit, and use as its coordination anchor.

frontiers/alzheimers/events/evt_7c91b2a4.json signed

1 { 2 "type": "vela.event.accepted_commit", 3 "id": "evt_7c91b2a4", 4 "frontier": "alzheimers.neurovascular.bbb", 5 "commit": { 6 "proposal": "split cytokine signature by APOE4 scope", 7 "diff": "diff_2b8e19", 8 "evidence": [ "ev_perturb_441", "ev_cohort_118", "ev_repl_fail_037" ], 9 "context": "APOE4+, age > 65, vascular-inflammatory cohort", 10 "attestation_target": "finding:BBB-PERICYTE-APOE4-017", 11 "replay_effect": "update_status_and_queue_trial_review" 12 }, 13 "signatures": [ "reviewer:m.chao", "reviewer:l.okafor", "clinical:r.sen" ], 14 "recorded_at": "2033-11-18T09:16:22Z" 15 }

Fig. 02. Event file. A real substrate artifact should look inspectable. The event is a small signed object that points to the proposal, diff, evidence, context, attestation target, and replay effect.

This distinction is the load-bearing one. A dataset is not state. A model prediction is not state. A robot run is not state. A review comment is not state. Each can become part of state only when it is converted into a proposed change, reviewed against a frontier, attested under governance, and recorded as an event that future work can inherit.

Scientific work has always had a state machine, but it has been implicit and lossy. A lab runs an experiment. A result changes private belief. A paper turns some of that belief into prose. Reviewers inspect the prose. Readers infer what they think changed. Future labs reconstruct the implied update again. The state transition exists, but it is carried by narrative, memory, status, and citation rather than by an object designed to be replayed.

The Discovery Engine makes the transition explicit. That is what lets software surfaces, lab surfaces, review surfaces, funding surfaces, and learning surfaces remain modular without losing the frontier between them.

The Engine Loop

Think of the engine as an operating loop before you think of it as storage.

Core loop

01 Goal
02 Frontier
03 Task
04 Activity
05 Diff
06 Attestation
07 Event
08 Frontier update
09 Next action

The loop starts with a goal: cure a disease, prove a theorem, build a better material, explain a climate signal, identify a safety risk, or decide which experiment deserves scarce lab time. The goal pressures the frontier. It determines which uncertainty matters enough to become work.

The goal becomes a frontier: what is known, what is unknown, what is contested, what depends on what, and which uncertainties are worth spending effort to reduce. A frontier becomes tasks. Tasks are assigned to humans, agents, models, reviewers, labs, funders, or institutions. Activity produces artifacts: papers, extractions, simulations, protocols, robot runs, code, clinical observations, field measurements, and reviews.

Activity still has to pass through the drivetrain before it becomes state.

The reference implementation can have many named surfaces, but the loop itself is simple. Preserve the artifact. Normalize what it says. Expose the proposed change. Decide whether it merges. Record the accepted transition. Render the frontier that changed. Everything else exists so that loop can be governed rather than merely executed.

A failed run shows the difference. The lab deposits its protocol trace and readout. The diff says a dependent claim should weaken in this cell line but not in the broader mechanism. A reviewer signs the narrow change, an event records it, and the next task queue stops assigning that experiment as if nothing had happened.

01 Goal
02 Frontier
03 Task
04 Activity
05 Diff
06 Attestation
07 Event
08 Atlas

before status: contested

one BBB finding spans incompatible cohorts

after status: scoped

APOE4+ split, replication request, model recalibration

09:16 frontier.rank BBB delivery discord rises
09:18 diff.ready APOE4 subgroup split proposed
09:24 event.commit Atlas update and task rerank

Fig. 03. The engine loop. The operating unit is not the paper. It is a governed transition: work proposes a change, review decides whether it should merge, the event updates the Atlas, and the next task is selected from the new state.

This is where a knowledge graph runs out of surface. A graph can store relationships. It does not, by itself, decide what should change next, who can propose it, who can merge it, what evidence is sufficient, or what physical action follows. A graph is a map. An engine needs a clutch.

Agent runtimes have the same limit. Agents can produce activity. They can extract, summarize, design, critique, and execute. Without a governed transition layer, their work becomes more output for another agent to summarize later.

The test is practical. A better database helps a lab find what happened. A better notebook helps a lab preserve what happened. A better agent helps a lab produce more things that happen. The engine is different only if it changes the next decision: a trial pauses, a replication is routed, a model recalibrates, a funder stops buying a fragile assumption, or a lab avoids repeating a failure another lab already paid for.

Complete loop

01 State
02 Task
03 Model
04 Proposal
05 Diff
06 Review
07 Action
08 Evidence
09 Event
10 State

The loop is the drivetrain. Every product in the ecosystem matters only insofar as it helps the loop run with more fidelity, more throughput, more legitimacy, or less waste.

The design pressure is that the loop must be mundane enough for ordinary work. A graduate student should be able to extract a method, an agent should be able to open a provenance audit, a reviewer should be able to sign a narrow correction, and a lab should be able to write back a failed run without turning the act into ceremonial publication. The engine is only real when the smallest transition can travel.

The Four Planes

A discovery engine has four planes. They are a practical discipline rather than vocabulary for its own sake.

State

known, contested, scoped, actionable

Finding / Evidence / Frontier

without it activity cannot compound

Control

what should happen next

ResearchTask / Queue / SafetyGate

without it agents create noise

Model

what might happen if we act

Simulation / Prediction / Calibration

without it predictions float

Action

what touched reality

Protocol / RobotRun / LabResult

without it experiments vanish

Fig. 04. Four planes. The architecture separates what is known, how work is coordinated, how outcomes are forecast, and what physically changes. The planes share one event spine, but each one has a different shape of work.

The state plane is what is currently known, unknown, contested, scoped, weakened, deprecated, and actionable. Its job is to make scientific state typed, replayable, attestable, and federated. If the state plane fails, the engine becomes another pile of activity.

The control plane coordinates work: what should happen next, who is allowed to do it, and what review is required. This is the lesson from modern software-agent orchestration: serious work needs tasks, isolated workspaces, review output, permissions, and boundaries, not one enormous conversation. OpenAI’s Symphony describes issue trackers, isolated agent workspaces, and human review as the organizing pattern for coding-agent work. The science version needs the same discipline, but with scientific state as the merge target.

The model plane predicts what might happen. Model outputs are powerful, but they do not become state by being generated. A prediction has to record which state it trained on, where it was validated, where it failed, and what evidence later contradicted it.

prediction 72%

lab result 58%

after update 63%

WorldModelCard

BBB transport twin

trained on: Atlas state: 2033-11-17
failure: overpredicted APOE4 transport
gap: sim-to-real +14 points

Fig. 05. Model calibration. The model plane needs its own record. Predictions become useful when they are coupled to the state they read from and recalibrated against the evidence that returns.

The action plane touches the world: experiments, robot runs, clinical observations, simulation jobs, field measurements, and funding decisions that actually change what gets done. This is where the engine meets bodies, instruments, patients, organisms, materials, weather, and capital.

The complete engine is the relationship among the four:

Plane coupling

01 State guides control
02 Control assigns work
03 Models forecast action
04 Action produces evidence
05 Evidence proposes state transitions
06 Governance decides what merges

No plane can replace the others. A state plane without control is a museum. A control plane without state is a task manager. A model plane without state is a prediction market floating above reality. An action plane without writeback is the current lab system with better robots.

Products on the Substrate

Do not start by imagining a suite. Start by imagining a Tuesday morning.

One person searches the frontier. Another reviews a correction. A lab writes back a failed run. A model records that its prediction lost calibration. A funder asks what should stop receiving money. These should not be five disconnected products. They should be five views over one record.

Specialized tools can exist without fragmenting the frontier when they share the substrate.

The first fundable unit is still narrow: one frontier where signed state transitions, a registry, and a review queue work together. The reference implementation can name the roles, but the names are handles, not the thesis. The thesis is that capture, review, action, and learning should point at the same transition record.

Search, review, lab orchestration, model calibration, education, funding, and safety products can compete as long as they preserve that record.

state

Frontier view

cockpit

Operator console

control

Task queue

review

Review surface

forecast

Prediction layer

execution

Lab writeback

shared state substrate one replayable scientific state substrate

open protocol
typed objects
registry
signatures
federation

surfaceProducts propose or read work
substrateProtocol, objects, registry, signatures
effectState replays into every conformant product

Fig. 06. Modular products on one substrate. The discovery engine is not a single app. Products can specialize at the surface because they share lower-level state, schemas, identity, replay, and attestation.

The build order should be narrower than the eventual ecosystem. First make state transitions real in one serious frontier. Then make review, frontier state, task routing, lab write-back, and model write-back read and write the same transition substrate.

The structural analogy becomes useful only at this level: many specialized products can remain separate if they read and write one underlying state. Science should copy that lesson while keeping the registry open.

The shared object lets the ecosystem remain plural. Different teams can build better search, better review surfaces, better lab orchestration, better education products, better safety gates, better model registries, and better funding markets. The products can compete while the state remains inheritable. That is the difference between an ecosystem and a suite.

Four Users

The engine should be concrete enough that a user can ask it for work as well as knowledge.

A disease foundation says: “We want to make progress on neurovascular dysfunction in Alzheimer’s.”

The engine responds with the current frontier, the top unresolved questions, the highest-value discriminating experiments, the claims most fragile to provenance gaps, the agents already assigned, the expert reviews needed, the labs capable of execution, the safety class, the proposed diffs, and what changed this week. The foundation is no longer buying isolated papers or grant reports. It is buying movement in the frontier.

For a foundation, this changes the week. A $5 million program does not begin with a blank call for proposals. It begins with a portfolio frontier: three candidate experiments, two fragile assumptions, one lab-capacity constraint, one safety class, and a reviewer queue. If a failed replication weakens the broad claim on Wednesday, Friday’s funding decision should know.

A student says: “I want to help cure brain tumors.”

The engine routes her to a frontier, then to a task appropriate to her current competence: link evidence for this claim, compare these two papers, check this method, extract the protocol conditions, inspect why a replication failed, draft a proposal, learn from the review. The student does not begin with a textbook/test pathway detached from real work. She begins with a small edge of the frontier that has somewhere for her contribution to land.

A robotic lab says: “Experiment E completed.”

The engine captures the protocol trace, instrument calibration, raw data, measured outcome, uncertainty, and execution context. It creates an evidence object, computes the proposed change, routes it for attestation, updates the shared frontier if accepted, and updates the relevant model calibration records. The lab finishes a run by writing back.

robot run run_2033_11_18_B17

instrument trace cytokine signature

calibration: ok
evidence: ev_perturb_441
diff: confidence -0.13

Fig. 07. Instrument writeback. The lab surface should look like instrumentation, not project management. The useful object is the connection between plate position, signal trace, calibration, raw data, and the state transition it proposes.

A model says: “I predict intervention X will work.”

The engine converts the prediction into a counterfactual proposal, checks the context and assumptions, ranks expected information gain, applies the safety gate, routes to lab or simulation, and writes back the outcome. The prediction earns, loses, or keeps trust through its relationship to state, action, evidence, and calibration.

These scenarios are intentionally ordinary. Their value is mundane: the work of science stops being lost between systems.

Mon
Foundation asks what to fund next
frontier gaps ranked
Tue
Student extracts protocol conditions
evidence object improves
Wed
Lab failed run writes back
claim scoped; model weakened
Thu
Reviewer signs subgroup diff
Atlas state updates
Fri
Scheduler reranks the queue
funding decision changes

Fig. 08. A week in the engine. The operating substrate matters because ordinary actions land in the same frontier. A grant question, student contribution, lab failure, review decision, and scheduler update become one governed week of state movement.

Scaling Intelligence

This is a post-v1 stress test. The first corridor may only have hundreds of deposits a month. The design still has to survive the moment generation outruns review.

Scaling to millions or billions of scientific agents cannot mean every agent talking to every other agent. The literal number is a stress test, not the premise. The engineering analogy is mature open-source governance and software-agent orchestration: broad proposal access, bounded workspaces, scarce maintainer review, CI, reputation, and merge authority. Science has a harsher version because a bad merge can move trials, lab work, funding, or safety decisions rather than code.

You scale with structure.

Frontiers become shards: pediatric high-grade glioma, blood-brain-barrier delivery, protein binder design, a disputed theorem, climate attribution, drought-tolerant cultivars, direct-air-capture sorbents. Agents operate inside frontiers, not in one global chat.

Tasks become the coordination primitive. Check an evidence span. Extract a method. Propose a discriminating experiment. Review a contradiction. Verify a proof. Audit a provenance chain. Draft a lab protocol. Inspect a model’s sim-to-real gap.

A task contract carries more than a prompt: objective, frontier, input state, required capability, workspace boundary, allowed tools, safety class, evidence standard, expected output, reviewer requirement, expiry, budget, and downstream dependencies. The scheduler can deduplicate near-identical tasks, lease work to qualified actors, expire stale tasks, retry failed work, archive dead ends, and route high-risk outputs to heavier review.

proposal intake 18,421/day merge authority scarce

Fig. 09. Frontier sharding. At scale, agents do not enter one global room. They attach to frontiers, pick up bounded tasks, and route proposed changes through scarce merge authority.

Agents specialize. One is excellent at PubMed extraction. Another at Lean proofs. Another at protein design papers. Another at trial protocol critique. Another at statistical power checks. Another at lab safety. Another at reconciling contradictory clinical cohorts. Specialization matters because scientific work is a chain of jobs with different failure modes.

Every agent has a reliability record: accepted proposal rate, rejection reasons, later contradiction rate, citation hallucination rate, calibration, domain competence, review usefulness, and safety violations. The record does not have to be punitive to be essential. Reliability affects routing; routing produces work; review and later contradiction update reliability; reliability decays by domain and time. In a world of abundant generation, reliability is infrastructure.

Merge authority remains scarce. Millions of agents can propose. Few can merge.

The open-source law carries over:

Open-source law

Proposal access is broad.

Merge authority is governed.

At large scale, planning becomes hierarchical:

Planning hierarchy

01 Global goals
02 Domain goals
03 Frontier goals
04 Task contracts
05 Agent workspaces
06 Proposed diffs
07 Review queues
08 Accepted events

Markets and schedulers appear because scarce resources remain scarce: human review, lab time, funding, model recalibration, safety approval, clinical access, material transfer, regulatory attention. The engine does not abolish politics or allocation. It makes the allocation surface visible enough to govern.

Review capacity has to be engineered as deliberately as agent capacity. Low-risk transitions can be deduplicated, sampled, auto-rejected, or routed to narrow credential pools. Safety-relevant, clinical, animal, manufacturing, or high-dependency transitions require named human signers, conflict checks, escalation paths, and service-level expectations. A first corridor can be staffed like a serious study section rather than a social feed: two domain maintainers, one statistical reviewer, one provenance auditor, rotating external signers, and a weekly merge window with written rejection reasons. As a design stress test, a queue that receives ten thousand proposals a week but can merge two hundred is not failing if it discards nine thousand with auditable reasons and reserves scarce review for the transitions that move decisions.

The worst version of abundant agents produces too much low-legibility work for any institution to sort. The engine’s job is to make generation submit to structure: task boundaries, isolated workspaces, replayable evidence, calibration, review queues, signer reputation, safety classes, and merge authority. Agent spam, stale shards, poisoned evidence, model overconfidence, queue saturation, and reviewer capture are core design inputs for the control plane.

Governance and Capture

The bottleneck extends beyond intelligence into trustworthy integration.

Votes, comments, stars, citation counts, social attention, and agent output are signals. Trust enters when someone recognized for a domain, by a registry, under a revocable credential, signs a transition under rules that other institutions can inspect, contest, and inherit. The signature has scope, conflict metadata, expiration, appeal paths, and a registry that can itself be audited.

This makes governance a product requirement. Proposal access can be broad. Merge authority has to be governed. Identity, signer recognition, schema evolution, dispute handling, safety classification, maintainer turnover, and registry federation cannot be afterthoughts.

The institution can be small at first, but the rules cannot be vague, because the signature becomes downstream infrastructure.

The first operator should not be a company pretending to be a commons. It should be a chartered nonprofit registry or consortium for a bounded frontier. The governance test is simple: the registry can lose a dispute and survive, lose a founder and survive, lose a vendor and survive, and be forked if it violates its charter. The governance precedents are partial rather than exact: Crossref for nonprofit scholarly infrastructure across competing publishers, IETF for open protocol process, and W3C for web standards maintained across institutional actors.

The first host should be a patient-led foundation or FRO-hosted nonprofit registry with enough convening power to recruit three to five labs before the network effect exists. The pitch to each participant is practical: labs get milestone funding, shared negative-result protection, and a regulator-readable provenance export they could not produce alone. Funders get a weekly frontier export: what changed, what should stop, what should be tested, and which assumptions are now too fragile to buy.

The first pilot is an order-of-magnitude institutional proposal, not a claim about an existing program: one disease frontier, one chartered registry, one reviewer queue, negative-result writeback, one regulator-readable export, and a twelve-to-twenty-four-month success metric. These numbers are design assumptions for a serious pilot, not evidence that an existing program has committed. A plausible first budget is $5-to-$10 million for schema maintenance, reviewer time, conformance tests, export tooling, and lab integration. Labs receive milestone money only after signed failed-protocol deposits resolve.

Review capacity needs numbers from the beginning. A first queue can assume hundreds, not millions, of deposits per month: automatic duplicate clustering, rule-based rejection for malformed deposits, ten-to-twenty-minute provenance triage for low-risk transitions, and one-to-three-hour expert review for transitions that touch active funding, trials, animals, manufacturing, or safety. Reviewers are compensated, conflicts are logged, signer authority is scoped, appeals are rare but tracked, and the quality metric is downstream effect. The pilot passes only if at least one accepted correction changes a funding, review, lab, or regulator-facing decision that otherwise would have repeated the old assumption.

The regulator path is advisory first. Request an early scientific or regulatory advice meeting, show that the export preserves provenance and dependency movement, then let one IND, CMC, DSMB, IRB, or IACUC packet include it as supporting evidence without asking the agency to bless the protocol as canonical. FDA formal-meeting programs, including Type C and other advice meetings, provide the regulatory analogy: targeted questions, meeting packages, and early feedback without turning the supporting infrastructure itself into an approval decision. See FDA, Formal Meetings Between the FDA and Sponsors or Applicants of PDUFA Products.

Regulatory coupling should be explicit without pretending the engine approves anything. Some transitions are non-regulatory scientific state. Some are regulator-readable support for an IND, CMC package, real-world-evidence submission, DSMB review, IRB packet, or IACUC protocol. The registry makes provenance and dependency movement inspectable; the agency still decides under its own authority.

The capture point is often above the nominally open layer. Git itself remained open, but much of software collaboration moved into platform-owned issues, pull requests, Actions, review history, social graphs, and contribution reputation. The scientific analogue is sharper. A protocol can be open while the canonical registry of signers, reviewer reputation, lab capabilities, safety gates, and regulatory recognition is closed. Open code with a closed registry is captured infrastructure with a permissive license file.

root keys

rotated by foundation charter

forkable under audit failure

signer recognition

domain consortia issue credentials

regulators query registry state

safety classes

public, delayed, permissioned, excluded

classification is itself reviewable

disputes

plural canonical views allowed

discord is represented as state

Fig. 10. Registry as governance. The registry is a social object as much as a technical one. It decides which signatures, safety gates, credentials, and disputes other institutions can trust.

The Discovery Engine has to be open protocol first, reference implementation second, ecosystem third, SaaS fourth.

Commons governance needs several enforceable properties:

Multiple independent implementations can read and write the protocol.
The canonical registry can be forked under chartered failure conditions.
Signer recognition is federated, not owned by one company.
Safety gates are inspectable and mapped to IRB, IACUC, DSMB, biosafety, CMC, and real-world-evidence interfaces where relevant.
Public and private state boundaries are explicit: public transition, delayed evidence packet, trusted-reviewer room, regulator-only packet, or excluded deposit.
Disagreement can be represented as state, not forced into premature consensus.
Review authority can be revoked, rotated, appealed, and audited.
Registry snapshots, signer graphs, conformance tests, dispute records, and reputation exports are portable enough that capture can be contested.

This is where the middle page connects back to both other essays. Constellations argues for maintainers, plural canonical views, and structural separation. Terafactory argues that the orchestration layer and identity registry above physical facilities are the capture points that decide whether the body stays open.

The Discovery Engine is the architecture that makes those claims operational. Without governance, the engine becomes a faster way to manufacture apparent consensus. With governance, disagreement itself can become state: not noise around the record, but part of the record.

The states have to be explicit: contested, minority view, scope split, replication failed, under appeal, safety held, and retracted. Each has a different effect on what can be shown, routed, funded, repeated, or merged.

The Three Artifacts

The trilogy has one movement. First, the record: science needs a shared frontier where corrections, failures, dependencies, and evidence can compound. Then, the engine: activity has to become governed state before it can guide the next task. Finally, the body: state has to reach instruments, labs, clinical systems, factories, funders, and regulators before the future changes shape.

Record, Engine, Body names a sequence of pressure. What is known must be writable. What is writable must be reviewable. What is reviewed must be able to act.

At small scale, that means a reviewer opens a diff and sees exactly what would change.

At institutional scale, a lab deposits a failed run and the next lab does not repeat it. A model predicts against the state it will later be judged by. A foundation changes Friday’s funding decision because Wednesday’s correction arrived in time.

Across the next thousand years, the details change. The invariant remains: intelligence and experimentation can become abundant, but trustworthy state integration remains the bottleneck.

That is the engine’s test: the correction arrives before the experiment is bought, before the model is trusted, before the grant renews, before the patient-facing decision inherits the old assumption.