Notes · 02
Parallel Sessions, Shared Substrate Last night was the first time the system felt less like a queue of assistants and more like a small engineering floor.
The interesting part was not that multiple agents were running. That is easy to demo. The interesting part was that they were running at different levels of the stack, at the same time, without a central script telling each one where the others were. One lane was closing the JustAi era. One lane was hardening the public history. One lane was standing up the substrate repo. Lem kept a research session watching the edges. Later, Codex noticed a small grammar issue on a live deploy and pushed a fix after a different lane had merged the release minutes earlier.
No ceremony announced that behavior. There was no meeting, no lock service, no carefully staged baton pass. The coordination primitive was much smaller: purpose-slugged writes to an append-only shared bus, hydration patterns before risky action, and Cowork mode (Claude desktop) treating memory as part of the operating surface rather than a note-taking afterthought.
That is a different shape from most agentic demos. The usual demo is single-threaded: one instruction, one agent loop, one transcript, one artifact. Useful, but misleading. Real engineering rarely behaves that way. Work lands while other work is in flight. Someone fixes the deploy while someone else is still writing the release note. A safety check changes the release plan. The question is not whether Claude, GPT-5.5, Sonnet, or Opus can finish a task. The question is whether independent sessions can compose without turning the repo, the deploy, or my attention into shared mutable state.
The receipts
The first lane closed the JustAi pre-public arc: Phase 4 through Phase 6 ratified, the release tagged at v0.4.0, and the architecture split made explicit. JustAi became the orchestration layer. safe-mini became the execution substrate. local-resident remained the experiment driver. That three-repo shape matters because it separates product UX, safety boundary, and research harness instead of letting them blur into one repo that only its author can reason about.
The second lane made the ownership claim clean. The public history had inherited attribution trailers from assisted development runs. That is fine inside a lab, but it is noise in a principal-owned public artifact. The cleanup removed attribution trailers from the commit messages without changing the working tree. The important verification was not just "the strings are gone." The code did not change; the public history became cleaner without rewriting the substance of the work.
The third lane stood up safe-mini as a public substrate repo. It moved from scaffold to tested reference implementation with CI across supported Python versions. That can sound like polish, but here it is more basic: a safety substrate that cannot explain itself under test is not a substrate yet. The public claim has to be inspectable.
The fourth lane caught the most important fact of the night by refusing to trust a tidy story. A release-safety assumption looked harmless until a live probe disproved it. That held the public flip until remediation and verification were complete. This is the line I want preserved from the night: probe before claiming. "Probably safe" is not a release criterion. The system was useful precisely because a research session was willing to falsify the plan while the release work was otherwise moving forward.
The coordination trick
The coordination trick was not complex. Each session wrote durable notes with a purpose slug: enough identity to make concurrent writes distinct, enough structure to be searchable later, not so much process that every worker needed to understand the entire organization. The visible spine was an append-only handoff log, backed by durable memory for facts that need to survive the room. There were no key collisions across simultaneous sessions writing the same memory store. That sounds mundane until you compare it with the usual failure mode: a shared scratchpad that becomes either a bottleneck, a conflict zone, or a pile of stale narration.
Hydration patterns did the other half of the work. Before a risky command, the session asks what it should remember about that class of action. Sometimes the answer is empty and the work proceeds. Sometimes it surfaces a local rule: do not edit harness internals without approval, use upsert because memory delete is soft, run the browser because type checks do not catch UI behavior. The pattern is lightweight, but it moves knowledge from "I might remember" into the execution path.
The handoff tools make that practical. Cowork can dispatch directly to Codex when the work belongs in WSL, route review and planning through a cheap path, and use a clean local shell bridge instead of a quoting gauntlet through Windows. None of these is the product. They are the boring substrate that keeps me from becoming the message bus.
The result was "do X in support of Y" emerging without explicit coordination. The typo fix is a small example, which is why it is useful. A separate Codex session saw a deployed text issue and corrected it after the landing work had already merged. It did not need to ask which lane owned the page. It had enough durable context to know the active surface, enough restraint to keep the patch tiny, and enough verification discipline to push only after checking the result.
What changed
I do not think last night proves some grand thesis about the future of software work. It proves a narrower, more useful thing: parallel autonomy gets real when the coordination layer is boring. Durable append-only memory. Purpose-slug naming. Hydration before risky operations. A strong boundary between the orchestrator and the substrate. A release posture that values being correct over looking fast.
The architecture lesson is that composition is mostly a state-management problem. If state lives only in chat history, the human becomes the bus. If state lives in a mutable shared plan, every worker needs a lock protocol. If state is append-only, searchable, and surfaced at the moment of action, independent sessions can make local decisions that still line up with the larger direction.
There are still rough edges. The public repos need more miles under real workloads. The memory layer needs continued pressure from messy sessions, not just curated milestones. But the shape is now visible. The harness is no longer just a way to run a capable loop. It is becoming an operating substrate for multiple simultaneous engineering conversations that can leave each other useful state, avoid stepping on the same keys, and ship in adjacent lanes.
That is the part worth writing down. Not "autonomy" as a slogan. Autonomy as a set of boring constraints that let separate sessions become additive instead of theatrical.