Architecture · Vela · Horizons

The Three Registers of Science

Science is structured like an observatory. The same three-register architecture that structures how an institution reads science also structures how science as a whole is built, remembered, and coordinated. Vela is the State-layer protocol that restores the missing substrate at the base of the stack.

Science has many partial maps and no shared state layer. This document is the architecture for what would change if it did.

I · the frame

Sky, Astronomy, Observatory
applied to science itself

At the civilizational scale, the registers are the infrastructure of science itself. They map cleanly onto the three-layer taxonomy the field is converging on: Runtime, State, Network.

Fig. I · the honest stack, at civilizational scale
03 · OBSERVATORY / NETWORK: how science coordinates and compounds
the missing unified interface (Git for science)
open protocols & registries · federated sharing · schemas & standards · trust across orgs & labs
closed-loop orchestration · autonomous labs coordination · cross-institution workflows · public + private interop
governance · permissioning · attribution & citation graphs · economic layer for attribution
native language: coordination protocol

02 · ASTRONOMY / RUNTIME: the system that does science
where constellations are drawn, frontiers expanded, loops closed
experimental workflows · protocol execution · instruments & robots · autonomous labs (SDLs)
compute & simulation · agents in the loop · hypothesis generation · extraction & compilation
world models for science · simulation of paths forward
FutureHouse · Phylo · Biomni Revel · Nominal · Emerald Arc · FH · Argo · robotic labs
most AI-for-science today: compiling into the void
native language: methodology + agents

01 · THE SKY / STATE: compiled scientific knowledge itself
the substrate: versioned, provenance-bearing, content-addressed
findings (atomic claims) · evidence & provenance · typed links between claims · corrections & contradictions
confidence & drift · dark matter of science · gaps on the frontier · the observable universe of findings
VELA lives here. An open protocol for this layer. Content-addressed finding bundles. The thing that is missing today.
native language: compiled knowledge
scale labels: civilizational · atomic finding

At the scale of science itself, the three registers also have a second name (Runtime, State, Network), the infrastructure taxonomy the field is converging on. Vela is an open protocol for the State layer. Runtime and Network exist in fragments today; most frontier labs are building Runtime, and almost no one is building the protocol layer for State. That gap is the wedge, and it is the gap Vela fills.
01 · sky

State

the system that remembers science

Compiled scientific knowledge as a first-class substrate. Findings with evidence, provenance, confidence, and typed relations. Versioned. Content-addressed. Queryable. Correctable at the finding level.

the thing missing today

Papers are human-readable renderings, not a substrate. No Git for findings. LLMs compile into the void. This is the 100% compilation debt that every downstream layer inherits.

Vela — open protocol
02 · astronomy

Runtime

the system that does science

Experimental workflows, protocol execution, instruments and robots, compute, simulation, agents in the loop. The active work of extending the frontier. Where FutureHouse, Phylo, Biomni, Emerald Cloud Lab, and self-driving labs operate.

the structural problem

Without a State layer underneath, every runtime is a silo. Every agent re-extracts findings from the same papers. Every SDL stores results in its own schema. Compilation is redone every time because no substrate holds its output.

Runtime — many players, no shared substrate
03 · observatory

Network

the system that lets science compound

Registries, standards, federation, trust, attribution, governance. The coordination protocol that lets findings, runs, and workflows cross institutional boundaries without losing meaning or provenance.

where it exists in fragments

DOI, ORCID, ROR, schema.org-for-science attempts. Piecemeal, disconnected, human-readable-only. There is no version-controlled, machine-queryable fabric that spans the scientific community. GitHub for science has not been built.

Network — speaks to Vela, not owned by it

Vela is the open protocol that restores the State layer. The other two layers will be built by many teams. The thing missing underneath all of them is the compiled substrate. That is the wedge.


II · stack inversion

Why every AI-for-science tool
compiles into the void

Software infrastructure is layered, and the layers depend on each other. Git came before GitHub. Package registries and CI/CD pipelines work because code is versioned and addressable. Copilot was trained on the public repositories those layers had accumulated. Each layer required the substrate beneath it to exist before it could compound. Remove any lower layer and everything above it collapses into individual heroics.

Science has inverted the stack. We have AI co-scientists, agentic pipelines, autonomous labs, and foundation models for biology, running on top of a substrate that does not yet exist. The compiled, queryable, versioned layer that Git provided for code has no equivalent for findings. Papers are not it. Databases are not it. Citation graphs are not it. They are human-readable renderings or single-purpose silos.

We have built the frontier layer of the stack before the substrate layer underneath it. This is why AI compilation of science does not compound.

Vela is the substrate. An open protocol for content-addressed, provenance-bearing, version-controlled finding bundles. The JSON Schema is published. The compiler runs from a Rust CLI. The first corridor — the Alzheimer blood-brain-barrier corridor — has 700 papers, 2,299 findings, 53,632 typed links, and 8 novel cross-domain hypotheses that produce zero results on PubMed. The flywheel is real and it is already spinning on a small region of the sky.
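To make "content-addressed" concrete, here is a minimal sketch in Python. It is not the published Vela schema or the Rust compiler: the field names (`claim`, `evidence`, `confidence`, `links`, `supersedes`) and the `sha256:` prefix are assumptions chosen for illustration. The idea it shows is only that a bundle's identifier is derived from its content, so a correction yields a new identifier that can point back to the version it replaces.

```python
import hashlib
import json


def finding_id(bundle: dict) -> str:
    """Return a content address: the SHA-256 digest of the bundle's canonical JSON."""
    # Sorting keys and fixing separators makes the serialization deterministic,
    # so the same content always hashes to the same identifier.
    canonical = json.dumps(
        bundle, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical bundle; these field names are illustrative, not the real schema.
finding = {
    "claim": "Example claim text",
    "evidence": [{"source": "doi:10.0000/example", "kind": "experiment"}],
    "confidence": 0.6,
    "links": [],
}
original_id = finding_id(finding)

# Key order does not affect the address, because the JSON is canonicalized.
reordered = dict(reversed(list(finding.items())))
reordered_id = finding_id(reordered)

# A correction is a new bundle with a new address that names its predecessor.
corrected = {**finding, "confidence": 0.4, "supersedes": original_id}
corrected_id = finding_id(corrected)
```

Because the identifier is a function of the content, two compilers that extract the same bundle arrive at the same address without coordinating, and history is kept by chaining `supersedes` references rather than by overwriting.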

Every other serious AI-for-science team eventually reaches the same realization: there is nowhere for compiled findings to land. PhysMaster's LANDAU, Bohrium's traceable execution, and the Allen Institute for AI's S2AG are all converging on the same architectural requirement. Vela is the open, interoperable, first-class version of the thing they are all trying to build privately.


III · three clocks at civilizational scale

Time in a scientific universe

Time runs through all three registers as three clocks, each keeping time for a different kind of event. The discipline of keeping the clocks separate is the difference between a substrate that compounds and one that decays into story.

Fig. II · world, system, simulation
01 · world time: when reality did the thing (experiment run · mechanism observed · claim first made)
02 · system time: when Vela learned of it (paper published · finding compiled · correction applied)
03 · simulation time: counterfactual branches (if this claim is retracted · if this mechanism holds · if this experiment is run)
flow labels: INGEST & COMPILE · BRANCH
World time is when reality produced the finding. System time is when the finding was compiled into the substrate — always later, sometimes decades later, which is the bench-to-bedside gap made formal. Simulation time is where the future lives: what happens if this claim is retracted, if this mechanism holds, if this experiment succeeds. Drawn dashed because ephemeral.
the discipline that makes compounding possible

Confidence drift is what happens when the clocks are not separated. A tentative correlation from 1990 becomes an established fact by 2015 through successive citation, because no one tracks when the confidence was earned and when it was inherited. Vela separates the clocks by construction. Every finding has a world time and a system time. Simulation outputs are always tagged, always ephemeral, and never promoted to substrate without explicit verification. The clocks must stay separate or the substrate decays into the same confidence-laundering machine that produced the current crisis.
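The separation of clocks can be sketched as a small data model. This is an illustration under stated assumptions, not Vela's implementation: the class names `Finding` and `SimulatedBranch` and the function `promote` are hypothetical. The sketch shows two things from the paragraph above: every finding carries both a world time and a system time, and simulation output lives in a separate type that cannot become a finding without explicit verification.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Finding:
    claim: str
    world_time: date   # when reality produced the result
    system_time: date  # when the substrate compiled it

    def compilation_lag_days(self) -> int:
        """Days between the event in the world and its compilation."""
        return (self.system_time - self.world_time).days


@dataclass(frozen=True)
class SimulatedBranch:
    claim: str
    assumption: str    # the "what if" this branch depends on


def promote(branch: SimulatedBranch, observed_on: date,
            compiled_on: date, verified: bool) -> Finding:
    """Turn a simulated claim into a finding only after explicit verification."""
    if not verified:
        raise ValueError("simulation output needs explicit verification")
    if compiled_on < observed_on:
        raise ValueError("system time cannot precede world time")
    return Finding(branch.claim, observed_on, compiled_on)


# A 1990 observation compiled in 2015 keeps both dates instead of one.
old = Finding("tentative correlation", date(1990, 1, 1), date(2015, 1, 1))
branch = SimulatedBranch("mechanism holds in vivo", "if this mechanism holds")
```

Keeping `SimulatedBranch` as a distinct type means a branch cannot be stored as a finding by accident; the only path into the substrate is through `promote`.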


IV · simulation at science-scale

Four kinds of "what if,"
at civilizational stakes

Simulation is not a new surface. It is a capability that lives inside the Runtime register, reading the State substrate and writing ephemeral branches. At science-scale the consequence is larger by orders of magnitude: not "should we hire this advisor" but "what collapses if this claim is retracted."

01 · cheapest

Horizon re-projection

near-term vs long-horizon lens on the same corridor

The same compiled corridor read through different evaluation criteria. Does this region of the substrate look different under clinical-translation weights than under mechanism-discovery weights? The findings don't move. The interpretation does. Often the most illuminating.
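A minimal sketch of re-projection, assuming each finding carries a few scored attributes and each lens is a set of weights over them. The attribute names, weights, and finding IDs below are invented for illustration and do not come from a real corridor.

```python
# Hypothetical per-finding attributes, each scored in [0, 1].
FINDINGS = {
    "F1": {"clinical_readiness": 0.9, "mechanistic_depth": 0.2},
    "F2": {"clinical_readiness": 0.3, "mechanistic_depth": 0.8},
    "F3": {"clinical_readiness": 0.5, "mechanistic_depth": 0.5},
}

# Two lenses over the same findings; only the weights differ.
LENSES = {
    "clinical_translation": {"clinical_readiness": 0.8, "mechanistic_depth": 0.2},
    "mechanism_discovery": {"clinical_readiness": 0.2, "mechanistic_depth": 0.8},
}


def project(findings: dict, weights: dict) -> list:
    """Rank finding IDs by weighted score, highest first."""
    scores = {
        fid: sum(weights.get(name, 0.0) * value for name, value in attrs.items())
        for fid, attrs in findings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


clinical_view = project(FINDINGS, LENSES["clinical_translation"])
mechanism_view = project(FINDINGS, LENSES["mechanism_discovery"])
```

With these numbers the clinical lens ranks F1 first and the mechanism lens ranks F2 first, while `FINDINGS` itself is never modified.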

02 · linear

Trajectory extrapolation

where is this corridor going if current momentum holds

Velocity and acceleration of a scientific region. Which corridors are compounding, which are stalling, which are about to enter a regime change. Useful for prioritization, but it fails exactly at the moments that matter most: supernova events like AlphaFold or CRISPR, which reorganize the landscape in a year.
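As a rough sketch, velocity and acceleration can be read as first and second differences of a yearly count, for example findings compiled per year in a corridor. The counts below are made up, and the constant-velocity extrapolation is the simplest possible choice, not a method the source specifies.

```python
def velocity(counts: list) -> list:
    """First difference: change in count from one period to the next."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


def acceleration(counts: list) -> list:
    """Second difference: change in velocity from one period to the next."""
    return velocity(velocity(counts))


def extrapolate(counts: list, steps: int = 1) -> list:
    """Project forward assuming the most recent velocity simply continues."""
    last_velocity = counts[-1] - counts[-2]
    return [counts[-1] + last_velocity * (i + 1) for i in range(steps)]


# Hypothetical findings-per-year series for one corridor.
yearly_findings = [10, 14, 20, 28]

v = velocity(yearly_findings)              # [4, 6, 8]
a = acceleration(yearly_findings)          # [2, 2]
projected = extrapolate(yearly_findings, steps=2)  # [36, 44]
```

The projection carries the last velocity forward unchanged, which is exactly why it cannot anticipate a regime change: nothing in the series before the break encodes it.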

03 · genuinely new

Counterfactual intervention

needs a causal model over the finding graph

"If this finding is retracted, what else collapses?" "If this mechanism is confirmed, which experiments become obvious next?" The dependency graph Vela makes explicit becomes a simulation surface. Where the compiled substrate starts to act as a causal inference engine for science itself.

04 · portfolio-level

Corridor composition

which regions of science should we compile first

Given finite attention and capital, which corridors should the Weave compile next? Which interlocking pairs unlock each other? Which dark regions should be deliberately long-exposed? This is Markowitz-style portfolio selection at civilizational scale, and it is the Horizons operating model.

The four are layered. Horizon re-projection is cheapest and ships first. Trajectory extrapolation builds on it. Counterfactual intervention requires Vela's typed links to be dense enough to support causal reasoning. Corridor composition is the portfolio-level capability that the Gigafactories need to operate against.
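The counterfactual question above ("if this finding is retracted, what else collapses?") has a simple first approximation: walk the typed links and collect everything that transitively depends on the retracted finding. The sketch below assumes links are `(source, relation, target)` triples and that a relation named `depends_on` exists; both are illustrative assumptions, not Vela's link vocabulary. Reachability is also weaker than a causal model, since it ignores whether a dependent finding has independent support.

```python
from collections import defaultdict, deque

# Hypothetical typed links: (source, relation, target).
LINKS = [
    ("F2", "depends_on", "F1"),
    ("F3", "depends_on", "F2"),
    ("F4", "depends_on", "F1"),
    ("F5", "contradicts", "F1"),
    ("F6", "depends_on", "F5"),
]


def at_risk_if_retracted(retracted: str, links: list) -> set:
    """Return every finding that transitively depends on the retracted one."""
    # Index the graph in reverse: for each finding, who depends on it?
    dependents = defaultdict(set)
    for source, relation, target in links:
        if relation == "depends_on":
            dependents[target].add(source)

    # Breadth-first walk outward from the retracted finding.
    at_risk = set()
    queue = deque([retracted])
    while queue:
        current = queue.popleft()
        for dependent in dependents[current]:
            if dependent not in at_risk and dependent != retracted:
                at_risk.add(dependent)
                queue.append(dependent)
    return at_risk


cascade_f1 = at_risk_if_retracted("F1", LINKS)
cascade_f5 = at_risk_if_retracted("F5", LINKS)
```

Only `depends_on` edges propagate here, so F5 (which contradicts F1) and F6 are untouched by a retraction of F1. A denser, better-typed link graph is what would let this move from reachability toward real causal reasoning.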

The honest epistemic limit. Prospective simulation of science is harder than retrospective reading, and the honesty is in admitting it. Short-horizon modest interventions are tractable. Long-horizon field-scale simulation is not forecastable with calibrated confidence. The model's job is not to predict the future but to surface which compilation decisions are robust across many plausible scientific futures. Not "what will happen" but "what compilation work performs well enough across the widest set of conditions."
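One way to read "performs well enough across the widest set of conditions" is as a satisficing rule: score each candidate under several plausible futures and prefer the one that clears a threshold in the most of them, rather than the one with the best average. The sketch below is an assumption about how that could be expressed; the corridor names, scenario names, payoffs, and threshold are all invented.

```python
# Hypothetical payoff of compiling each corridor under three scientific futures.
CANDIDATES = {
    "corridor_a": {"future_1": 1.0, "future_2": 0.4, "future_3": 0.4},
    "corridor_b": {"future_1": 0.6, "future_2": 0.5, "future_3": 0.55},
    "corridor_c": {"future_1": 0.4, "future_2": 0.7, "future_3": 0.3},
}


def robustness(payoffs: dict, threshold: float) -> int:
    """Count the futures in which this candidate is at least good enough."""
    return sum(1 for value in payoffs.values() if value >= threshold)


def most_robust(candidates: dict, threshold: float) -> str:
    """Pick the candidate that clears the threshold in the most futures.

    Ties are broken by the better worst-case payoff.
    """
    return max(
        candidates,
        key=lambda name: (
            robustness(candidates[name], threshold),
            min(candidates[name].values()),
        ),
    )


def best_on_average(candidates: dict) -> str:
    """Pick the candidate with the highest mean payoff, for comparison."""
    return max(
        candidates,
        key=lambda name: sum(candidates[name].values()) / len(candidates[name]),
    )


robust_choice = most_robust(CANDIDATES, threshold=0.5)
average_choice = best_on_average(CANDIDATES)
```

In this toy data the two rules disagree: `corridor_a` has the highest average because of one very good future, while `corridor_b` is adequate in all three. The robust rule picks the latter, which is the behavior the paragraph argues for.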


V · the vision stack

Vela, Horizons, Gigafactories, the Weave

Vela is the protocol. Horizons is the organization. The Gigafactories are the campaigns. The Weave is what the three together produce when the substrate is open and the campaigns compound.

Vela
the open protocol
An open protocol for the State layer of science. Content-addressed finding bundles, typed links, confidence tracking, provenance chains, corridor assembly. Published schema. Rust CLI. Apache 2.0. Intended to be interoperable with every Runtime and every Network that speaks it. Never a revenue product, because the substrate cannot be enclosed without re-breaking the thing it exists to fix.
Horizons
the organization
The company built around Vela. Operates the reference compiler, runs the initial corridors, produces the open benchmarks, and builds the State-layer products that make the protocol usable. Successor to the Bell Labs form: long time horizon, open substrate, frontier-shaped, compounding through published output rather than enclosed IP.
The Gigafactories
the eleven campaigns
Named compilation campaigns over strategic regions of the scientific sky. Solace · Tidal · Watershed · Meridian · Granary · Rift · Lodestar · Sentinel · Parallax · Rootstock · Beacon. Each one is the Alzheimer blood-brain-barrier corridor scaled: a serious, long-exposure compilation of a region that matters, partnered with the Runtime teams extending the frontier in that region.
The Weave
the civilizational result
What science becomes when its compiled substrate is open, versioned, and queryable; when the Runtime has somewhere to land; when the Network is interoperable by default. The compounding regime every serious observer knows is missing.

VI · the bet

What this work is, and is not

Most AI-for-science funding is flowing to the Runtime layer. The foundation models, the autonomous labs, the agent pipelines. These are real and necessary. But every Runtime team that runs for long enough arrives at the same architectural realization: there is no substrate to land on. The compiled, queryable, versioned layer that Git provided for code has no equivalent for findings. Vela is the open protocol that fills that gap.

The bet is not that Vela alone reshapes science. The bet is that Vela plus a community of Runtime teams building on top of it, plus a Network layer coordinating across them, produces the compounding regime that every serious observer knows is missing and that no one has yet constructed. The State layer is the wedge. Everything else follows.