01 · Primitive

DeltaWrite

A persistent, inference-time write operation on a frozen language model — built from forward passes alone, loadable as a portable file, reversible at will.

Patent pending · USPTO provisional filed 2026-04-11 · PCT window open through 2027-04-11
Overview

A write operation, not a training pipeline.

Today, getting new knowledge into a pretrained language model means one of three things. Fine-tuning: slow, expensive, destructive, and model-wide. Retrieval-augmented generation: retrieval plumbing, context overhead, and recall that degrades under paraphrase. Prompt engineering: ephemeral, token-budgeted, brittle. All three compound at scale, and none of them treat “teach the model a thing” as a first-class operation.

DeltaWrite replaces all three with a single primitive. New knowledge is written directly into the model’s own weights in milliseconds, using only forward passes over a frozen base. No optimizer runs. No loss is minimized. No gradient is computed. No prompt context is consumed at inference time. The base model’s general capabilities are preserved intact.

The output of a write is a compact, portable file we call a Knowledge Module. Modules load additively, compose in stacks, and unload cleanly. A module taught in one process reloads verbatim in another. A module attached today can be revoked tomorrow and the base model is bit-identical to before the write.

Key innovations

What makes it patentable.

Property 01

Forward-only construction

Modules are built from forward passes over the base model. No gradients are computed, no optimizer runs, no loss is minimized. A fact is written in the time it takes to read its answer.

Property 02

Frozen base, always

The pretrained weights are never retrained. DeltaWrite writes alongside them, not over them. The base model’s general-capability profile is preserved exactly and measurably — held-out probes across reasoning, math, and code remain unchanged.

Property 03

Content-addressable dispatch

A learned router decides per query which module, if any, fires. Unrelated queries pass through untouched. Dispatch is a first-class part of the system, not an optimization — and a claimed component of the IP.

Property 04

Persistent and reversible

Modules persist across prompts, sessions, and full process restarts. Selective forgetting is a single subtract operation: remove a module and the model returns to its pre-write state exactly.

Property 05

Portable Knowledge Modules

Each module serializes to a compact file on the order of a few hundred kilobytes per fact. Transfer between machines, checkpoint like code, version like a binary artifact, distribute like a library.
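As an illustration of what "checkpoint like code, distribute like a library" implies mechanically, here is a minimal serialization sketch. The `save_module`/`load_module` functions and the low-rank-factors-plus-routing-key layout are assumptions for illustration, not the actual DeltaWrite file format:

```python
import io
import numpy as np

def save_module(u, v, routing_key):
    """Hypothetical module serialization: low-rank factors plus the
    dispatch embedding, packed into a single portable blob."""
    buf = io.BytesIO()
    np.savez(buf, u=u, v=v, key=routing_key)
    return buf.getvalue()

def load_module(blob):
    """Reload a module verbatim in another process."""
    data = np.load(io.BytesIO(blob))
    return data["u"], data["v"], data["key"]

u = np.ones((8, 1))
v = np.ones((8, 1))
key = np.ones(4)

blob = save_module(u, v, key)
u2, v2, key2 = load_module(blob)
assert np.array_equal(u2, u) and np.array_equal(key2, key)  # round-trips exactly
```

Because the blob is self-describing and independent of process state, it can be versioned, diffed by hash, and shipped like any other binary artifact.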

Property 06

Model-family agnostic

Validated across Qwen 2.5 (1.5B, 7B, 14B), Llama 3.1 8B Instruct, and Mistral 7B Instruct. Per-family calibration differs, but the mechanism transfers without change.

How it works

The method, in steps.

Step 01

Write

Provide a prompt and a target response. DeltaWrite runs a small number of forward passes over the frozen base model to construct a low-rank module for that fact. There is no training loop, no optimizer state, and no gradient tape. A single fact is written in milliseconds.
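The flavor of a gradient-free write can be sketched with a closed-form rank-1 update. This is a simplified illustration under stated assumptions (one hidden activation per fact, a single linear layer), not the patented construction: given the prompt's activation k and the gap between the desired and current outputs, one outer product makes the fact read back exactly.

```python
import numpy as np

def write_fact(key_activation, target_output, current_output):
    """Forward-only rank-1 write: no gradients, no optimizer, just a
    closed-form outer product.
    key_activation  -- hidden state k from one forward pass of the prompt
    target_output   -- desired layer output for the fact
    current_output  -- what the frozen base currently produces (W @ k)
    Returns Delta such that (W + Delta) @ k == target_output."""
    k = key_activation
    residual = target_output - current_output      # what the base gets wrong
    return np.outer(residual, k) / (k @ k)         # rank-1, solved exactly

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 4))                        # frozen base weights
k = rng.normal(size=4)
target = rng.normal(size=4)

delta = write_fact(k, target, W @ k)
assert np.allclose((W + delta) @ k, target)        # the fact reads back exactly
```

The cost is a handful of matrix-vector products, which is why a write completes in the time of a forward pass rather than a training run.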

Step 02

Route

At query time, a content-addressable dispatcher inspects the incoming request and selects at most one module to apply. The router is trained once on a few paraphrases per module plus a pool of unrelated control queries. Queries that match nothing pass through the base model unchanged.
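A minimal dispatcher can be sketched as nearest-key matching with an abstention threshold. The `route` function, the embeddings, and the threshold value are illustrative assumptions, not the trained router described above:

```python
import numpy as np

def route(query_vec, module_keys, threshold=0.8):
    """Content-addressable dispatch sketch: compare the query embedding
    against each module's key by cosine similarity; fire at most one
    module, or none at all."""
    q = query_vec / np.linalg.norm(query_vec)
    best_id, best_sim = None, threshold
    for mod_id, key in module_keys.items():
        sim = q @ (key / np.linalg.norm(key))
        if sim > best_sim:
            best_id, best_sim = mod_id, sim
    return best_id  # None => pass through the base model untouched

keys = {
    "fact_a": np.array([1.0, 0.0]),
    "fact_b": np.array([0.0, 1.0]),
}
assert route(np.array([0.9, 0.1]), keys) == "fact_a"
assert route(np.array([0.7, 0.7]), keys) is None  # ambiguous: nothing fires
```

The abstention path is the important part: an unmatched query never touches a module, which is what keeps unrelated generations identical to the frozen base.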

Step 03

Apply

When a module is selected, it contributes a small, carefully calibrated additive correction during inference. The perturbation is far below the magnitude at which general-capability probes drift. Everything the model already knew, it still knows.
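The calibration claim can be made concrete with a toy additive correction. The `apply_module` function and the scale `alpha` are hypothetical; the point is that a bounded low-rank term perturbs the activation by at most a small, known fraction of its norm:

```python
import numpy as np

def apply_module(hidden, u, v, alpha=0.05):
    """Hypothetical calibrated correction: h' = h + alpha * u * (v . h),
    with u and v unit-norm so alpha directly bounds the perturbation."""
    return hidden + alpha * u * (v @ hidden)

rng = np.random.default_rng(2)
h = rng.normal(size=16)                  # a base-model activation
u = rng.normal(size=16); u /= np.linalg.norm(u)
v = rng.normal(size=16); v /= np.linalg.norm(v)

h2 = apply_module(h, u, v)
# By Cauchy-Schwarz, ||h2 - h|| = alpha * |v.h| <= alpha * ||h||:
assert np.linalg.norm(h2 - h) <= 0.05 * np.linalg.norm(h)
```

Keeping the correction a bounded fraction of the activation norm is one way to stay below the magnitude at which general-capability probes begin to drift.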

Step 04

Revoke

Removing a module is the inverse of the write. The module’s contribution is subtracted out; the dispatcher is retrained without it. What remains is the base model in its original state — no residue, no measurable drift.
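The arithmetic of revocation is a literal subtraction. A toy sketch, with an illustrative rank-1 module: subtracting the written delta recovers the base weights (to floating-point precision when deltas have been merged; keeping deltas stored separately from the frozen base, as the system does, makes restoration bit-identical).

```python
import numpy as np

rng = np.random.default_rng(3)
W_base = rng.normal(size=(6, 6))                           # frozen base
delta = rng.normal(size=(6, 1)) @ rng.normal(size=(1, 6))  # rank-1 module

W_written = W_base + delta    # write: purely additive
W_revoked = W_written - delta # revoke: subtract the same module back out

assert np.allclose(W_revoked, W_base)  # no residue remains
```

This is why revocation needs no retraining of the base: the inverse of an additive write is exact by construction, and only the dispatcher is refit without the removed key.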

The numbers

Measured, not claimed.

Routing accuracy, N = 500
99.8%

Five hundred distinct modules coexisting under a single dispatcher. Zero control leaks on held-out queries.

Paraphrased recall, N = 50
ρ = 0.94

Qwen 2.5-7B-Instruct on a 50-fact knowledge base. Learned dispatch, averaged across three seeds, pre-registered paraphrase split.

Per-query overhead
~250 ms

End-to-end routing latency at N = 500 modules. Base-model generation throughput is otherwise unchanged.

Model families validated
3

Qwen, Llama, Mistral. Each cleared the pre-registered MVP bar (oracle ≥ 0.90, full ≥ 0.70) with margin.

Applications

Where it deploys.

Enterprise knowledge injection

Replace retrieval pipelines for proprietary corpora. Write company facts, policies, product schemas, and internal terminology once; serve them from the weights, without retrieval plumbing or context budget, across every downstream application built on the base model.

Regulated-domain deployment

Auditable write, auditable revoke. The answer to “what does this model know?” becomes an enumerable list of loaded modules, not an opaque corpus hash. Well-suited to healthcare, legal, financial, and government deployments where provenance and revocation are hard requirements.

Multi-tenant model serving

A single shared base model across many tenants. Each tenant gets their own stack of modules; routing isolates them per query. One customer’s knowledge never influences another’s generations. Serving cost amortizes across the tenant fleet.

On-device personalization

Personal modules are small, private, and fast. Personalize a local model to a user’s context without sending a single token to the cloud. New personalizations can be written on-device with no training-time infrastructure.

Model iteration at engineering speed

Ship a new fact in the time it takes to write a unit test. Roll it back as quickly as you shipped it. “Release” and “rollback” become operations on a model’s knowledge, not a fine-tuning schedule.

Engage

License, replicate, or co-develop.

DeltaWrite is available for licensing to frontier labs, infrastructure providers, and regulated enterprises. Replication kits, benchmark protocols, and technical deep-dives are shared under NDA. A US provisional patent was filed on 2026-04-11; the PCT window remains open through 2027-04-11.