Incremental teaching-table prompts and reducing prompt churn
This document describes how Plasm serves the Plasm teaching table (many-shot, symbol-tuned TSV examples) for HTTP execute and MCP execute sessions, and why that design reduces prompt churn for agents and humans.
Teaching medium: agent-visible context is always the TSV table (plasm_expr, one tab, Meaning), optionally prefixed by # comment contract lines and wrapped in a markdown fence by HTTP/MCP hosts. The legacy compact markdown transcript (;;-style blocks) is not emitted on the wire.
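As a rough illustration of the wire shape described above, the sketch below renders teaching rows as TSV with optional `# ` contract lines and a host-side fence. The names `TeachingRow`, `render_teaching_tsv`, and `wrap_in_fence` are hypothetical, and the header row and the flattening of embedded tabs are assumptions of this sketch, not confirmed Plasm behaviour.

```rust
/// One teaching row: a Plasm expression and its plain-language meaning.
/// (Hypothetical type for illustration.)
pub struct TeachingRow {
    pub plasm_expr: String,
    pub meaning: String,
}

/// Render rows as TSV: optional `# ` contract lines, a header, then one
/// `expr<TAB>meaning` line per row. Tabs and newlines inside a cell are
/// flattened to spaces so each row keeps exactly one tab (assumption).
pub fn render_teaching_tsv(contract: &[&str], rows: &[TeachingRow]) -> String {
    let clean = |s: &str| s.replace(&['\t', '\n'][..], " ");
    let mut out = String::new();
    for line in contract {
        out.push_str("# ");
        out.push_str(line);
        out.push('\n');
    }
    // Header row is an assumption; the source only names the two columns.
    out.push_str("plasm_expr\tMeaning\n");
    for row in rows {
        out.push_str(&clean(&row.plasm_expr));
        out.push('\t');
        out.push_str(&clean(&row.meaning));
        out.push('\n');
    }
    out
}

/// Wrap the table in a markdown fence, as an HTTP/MCP host might.
pub fn wrap_in_fence(tsv: &str) -> String {
    let fence = "`".repeat(3);
    format!("{}\n{}{}\n", fence, tsv, fence)
}
```

Keeping one tab per row means a consumer can split each line on the first tab without any quoting rules.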
Goals
- Less redundant context: Avoid sending the full teaching table on every tool turn when the session’s catalog entry and seeds have not changed.
- Incremental graph exposure: Treat the CGS as a graph: ship teaching rows in waves as more entity types are needed, instead of always expanding to a large 2-hop neighbourhood in the first message.
- Stable symbolic indices: Keep e#/m#/p#/r# assignments monotonic: once assigned in a session, a symbol does not change meaning when new entities or capabilities enter the slice. Relations use r# (not p#).
- Aligned expand + teaching table: Expression pre-parse expansion (expand_*) must use the same symbol map as the teaching text the model saw, so e1.m3(...) expands consistently after each wave.
“Prompt churn” here means: repeated or oversized teaching text in agent context (duplicate full prompts on session reopen, multi-megabyte tables when only a small neighbourhood is needed, or shifting m# indices between waves). Those waste tokens, confuse models, and break trust in symbolic examples.
Problem (before this design)
- Full dump: Rendering the teaching table for the union of 2-hop neighbourhoods around seeds produced large prompts even when the task only needed a few entity types.
- Repeat sends: MCP plasm_context (open path) could return the entire teaching table again when the server reused an existing session (reused: true), unless the client omitted the body (we now omit the teaching block on reuse).
- Index drift: A naïve rebuild of SymbolMap from a growing entity set can re-sort method keys globally, which would reshuffle m# values between waves. Incremental sessions instead append new (domain, kebab) and identifier bindings.
Design overview
FocusSpec::SeedsExact
Teaching-table slicing can use an exact entity list (no automatic 2-hop union). That list is the first wave of exposure: only those entity blocks appear in the initial teaching string.
Implementation: FocusSpec::SeedsExact and entity_slices_for_render.
TeachingExposureSession
A session-scoped structure in plasm-core (TeachingExposureSession) allocates:
- e#: Order of first exposure of each qualified (registry entry_id, entity) pair. Colliding entity names across catalogs (e.g. github:Issue and linear:Issue) receive distinct e# symbols; teaching rows and surface filters always use the session registry entry_id, not bare CGS::entry_id from YAML fixtures.
- m#: New (domain, kebab) capability pairs, sorted only among newly added pairs, then assigned the next free m indices.
- p#: New fields and capability params visible in the cumulative slice (sorted among new names, then next free p indices).
- r#: Declared relation navigation slots (separate counter from p#).
Existing assignments are never rewritten. Rendering uses render_teaching_prompt_bundle_for_exposure; later waves pass emit_entity_blocks so only new entity blocks are appended (and the main “Valid expressions” preamble is omitted on those waves).
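The append-only allocation can be sketched as follows. This is a simplified stand-in, not the real TeachingExposureSession: `ExposureSketch` and its methods are hypothetical, only e# and m# are modelled, and numbering from 1 is an assumption.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical, simplified session-scoped symbol allocator.
#[derive(Default)]
pub struct ExposureSketch {
    entity_syms: HashMap<(String, String), usize>, // (entry_id, entity) -> e index
    method_syms: HashMap<(String, String), usize>, // (domain, kebab) -> m index
}

impl ExposureSketch {
    /// Expose one wave. Entities take `e#` in first-exposure order; new
    /// method pairs are sorted only among themselves, then take the next
    /// free `m#`. Existing assignments are never touched.
    /// Returns the entities that were new in this wave.
    pub fn expose_wave(
        &mut self,
        entities: &[(&str, &str)],
        methods: &[(&str, &str)],
    ) -> Vec<(String, String)> {
        let mut added = Vec::new();
        for (entry_id, entity) in entities {
            let key = (entry_id.to_string(), entity.to_string());
            if !self.entity_syms.contains_key(&key) {
                let next = self.entity_syms.len() + 1;
                self.entity_syms.insert(key.clone(), next);
                added.push(key);
            }
        }
        // BTreeSet sorts and de-duplicates the genuinely new pairs.
        let new_methods: BTreeSet<(String, String)> = methods
            .iter()
            .map(|(d, k)| (d.to_string(), k.to_string()))
            .filter(|key| !self.method_syms.contains_key(key))
            .collect();
        for key in new_methods {
            let next = self.method_syms.len() + 1;
            self.method_syms.insert(key, next);
        }
        added
    }

    pub fn entity_sym(&self, entry_id: &str, entity: &str) -> Option<String> {
        self.entity_syms
            .get(&(entry_id.to_string(), entity.to_string()))
            .map(|n| format!("e{}", n))
    }

    pub fn method_sym(&self, domain: &str, kebab: &str) -> Option<String> {
        self.method_syms
            .get(&(domain.to_string(), kebab.to_string()))
            .map(|n| format!("m{}", n))
    }
}
```

With this scheme, a later wave that introduces a pair sorting before existing ones (say `create` after `get` and `search`) still receives the next free index, so rows taught in earlier waves keep their meaning.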
Teaching exemplar anchors (CGS binding surface)
Whether teaching rows should include an entity anchor exemplar (for example Entity($) / symbolic e# usage) must not be decided in the prompt layer by naming a transport (for example “GraphQL”). plasm-core exposes transport-neutral predicates on the capability’s mapping template:
- template_domain_exemplar_requires_entity_anchor: true when the template needs an anchor for teaching examples: HTTP path template variables or a GraphQL operation variables block that binds an id (or equivalent single-entity key).
- template_invoke_requires_explicit_anchor_id: used for expression pre-parse / shadow-invoke rules when an explicit anchor id is required (path vars or any GraphQL operation variable list), matching the compile path’s expectations.
CapabilitySchema::domain_exemplar_requires_entity_anchor and invoke_requires_explicit_anchor_id delegate to those helpers. Teaching synthesis consults the schema-level predicate (for example via path_vars_empty in prompt_render) so prompt synthesis stays free of GraphQL-specific conditionals.
When the cumulative slice includes structured string semantics, the preamble adds <<TAG heredoc rules in prompt_render: copy-pastable fenced text blocks show tagged form only. The only multiline/raw string form in path expressions is bash-inspired <<TAG + newline + body + closing line (trimmed TAG), with the same close optionally glued before ) / , / }.
Grammar note: The opener is << (two characters) plus a tag, not <<<. Legacy d<<< is removed—use <<TAG only (never << + newline alone).
$ and ~"text" (teaching only)
Teaching TSV rows use $ and ~"text" as fill-in cues, not values to copy into executable programs. The parser accepts bare $ as the string "$" (expr_parser); submitting e#~$ runs a real search for that character (e.g. Linear issue_search → title contains "$"), which often returns zero rows.
- Never emit bare $ in plasm / plasm_run programs: substitute concrete ids, filter keys, or search strings from context (team list, prior bindings, user intent).
- Full-text search rows teach e#~"text" (quoted meta-literal). Replace text with real terms, e.g. e2~"billing", not e2~$.
- Search-only entities (no query capability, e.g. Linear Issue): there is no e#{} “list all”. Use scoped filters shown in the teaching table (e#{p#=…}) and/or real ~"…" search text. Resolve filter values from the workspace (e.g. list Team first; do not assume doc-example keys like ENG).
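Because the parser accepts a bare $ as a literal string, a host may want a pre-flight check before submitting a program. The function below is a hypothetical lint, not part of Plasm: it reports every $ that sits outside a double-quoted string. It assumes backslash escapes inside quotes and does not understand heredoc bodies.

```rust
/// Return the byte offsets of every `$` outside a double-quoted string.
/// Hypothetical host-side lint. Simplifications: `\"` is treated as an
/// escaped quote (assumption), and heredoc bodies are not recognised.
pub fn bare_dollar_offsets(program: &str) -> Vec<usize> {
    let mut offsets = Vec::new();
    let mut in_quotes = false;
    let mut escaped = false;
    for (i, ch) in program.char_indices() {
        if in_quotes {
            if escaped {
                // Character after a backslash is taken literally.
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_quotes = false;
            }
        } else if ch == '"' {
            in_quotes = true;
        } else if ch == '$' {
            offsets.push(i);
        }
    }
    offsets
}
```

A non-empty result suggests the program still contains a teaching placeholder and should be filled in with a concrete id or search term before it is run.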
First-wave teaching contract and MCP program_contract repeat these rules; incremental waves may omit the full preamble — see program_contract.txt.
Execute session state (plasm)
ExecuteSession holds:
- prompt_text: Cumulative teaching text (wave 1 + optional ## Expanded capabilities sections).
- teaching_exposure: The TeachingExposureSession used for both teaching rendering and expand_expr_for_teaching_session (via expand_expr_for_session_with_optional_exposure).
- domain_revision: Increments each time more entities are exposed (wire field name; teaching-table revision counter).
Session identity (prompt_hash, session id) stays stable across waves; the hash is still derived from the initial prompt text for routing (see agent code paths).
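A minimal sketch of that state, assuming a simplified `ExecuteSessionSketch` type: the hash function (std's DefaultHasher), the starting revision of 0, and the exact heading layout are all assumptions made for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified execute-session state.
pub struct ExecuteSessionSketch {
    pub prompt_hash: u64,      // derived once from the first wave
    pub prompt_text: String,   // cumulative teaching text
    pub domain_revision: u64,  // bumped on every later wave
}

impl ExecuteSessionSketch {
    /// Open a session from the first-wave teaching text.
    pub fn open(first_wave: &str) -> Self {
        let mut hasher = DefaultHasher::new();
        first_wave.hash(&mut hasher);
        Self {
            prompt_hash: hasher.finish(),
            prompt_text: first_wave.to_string(),
            domain_revision: 0,
        }
    }

    /// Append a later wave. The cumulative text grows and the revision
    /// increments, but the hash is left alone. Returns only the delta,
    /// which is what would go back on the wire.
    pub fn append_wave(&mut self, delta_rows: &str) -> String {
        let delta = format!("\n## Expanded capabilities\n{}", delta_rows);
        self.prompt_text.push_str(&delta);
        self.domain_revision += 1;
        delta
    }
}
```

Returning the delta separately from the stored cumulative text keeps routing stable while avoiding a full prompt replay on each wave.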
MCP tools
- plasm_context: Call first on each MCP connection. Pass intent (host-chosen, stable for the same agent context; see docs/mcp-session-reuse.md) and a required seeds array of { api, entity }. The server returns logical_session_ref (s0, s1, …; a per-connection slot, like artifact index r/{n}) for subsequent plasm calls; canonical UUID + trace identity are server-side (see docs/mcp-logical-sessions.md).
  - On a fresh open (no live execute binding for that logical id), the primary api is the lexicographically first distinct catalog id among seeds; this keeps SessionReuseKey stable if the host reorders an equivalent seed set. Secondary catalogs in the same call are federated/expanded in lexicographic api order (after the primary), so multi-API open order does not depend on seed list order. Tool output returns delta-only teaching waves (no full prompt replay on federate/expand), while session symbol maps stay append-only. MCP _meta.plasm.continuity always includes stale_binding_recovered and new_symbol_space (and discard_cached_plasm_symbols when new_symbol_space is true); when that flag is set, discard any prior e#/m#/p#/r# cached in the agent. Tenant MCP config scopes allowed APIs; a disallowed API fails the whole call. The teaching TSV contract teaches named p#=… / name=… slots for creates/updates; do not infer field meaning by permuting p# numerically after a new wave.
- plasm: Pass logical_session_ref and program. Runs Plasm lines using the session’s exposure map when present. Paginated lists: follow page(s0_pgN) (and _meta.plasm.paging); the slot must match your logical_session_ref.
- plasm_run: Live execute (blocking by default). With wait: false, returns wait(s0_oN) immediately; poll with wait(s0_oN), cancel with cancel(s0_oN). When the dry verdict is review, pass plan_commit_ref (pcN) from the matching plasm dry-run, or force: true. See plasm-long-operations.md and the grammar in plasm-language-definition.md.
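The fresh-open ordering rule can be sketched as below. `Seed` and `open_order` are hypothetical names, and the sketch assumes that "lexicographic" means Rust's byte-wise string ordering.

```rust
use std::collections::BTreeSet;

/// Hypothetical seed shape mirroring the `{ api, entity }` wire objects.
pub struct Seed {
    pub api: String,
    pub entity: String,
}

/// Pick the primary catalog and the order of secondaries for a fresh open.
/// Returns `None` when there are no seeds. The result depends only on the
/// set of distinct `api` values, never on seed list order.
pub fn open_order(seeds: &[Seed]) -> Option<(String, Vec<String>)> {
    // BTreeSet de-duplicates and iterates in ascending (byte-wise) order.
    let distinct: BTreeSet<&str> = seeds.iter().map(|s| s.api.as_str()).collect();
    let mut apis = distinct.into_iter();
    let primary = apis.next()?.to_string();
    let secondaries = apis.map(str::to_string).collect();
    Some((primary, secondaries))
}
```

Because the order is derived from a sorted set, two hosts that send the same seeds in different orders end up with the same primary, which is what keeps a reuse key stable.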
The first-wave teaching TSV preamble (via render_prompt_contract) teaches page(sN_pgM), wait(sN_oM), and cancel(sN_oM) alongside entity/query grammar — same contract as MCP program_contract.txt and Phoenix tool-model execute notes (tool-model-http.md).
Intent-scoped exposure (when plasm_context sets context_intent): capabilities on non-seeded entities still require lexicon overlap with intent. Each seeded { api, entity } always teaches that entity’s query / search / get surface (and primary_read when declared). Create / update / delete / action on seeded entities require intent lexicon overlap (or appear in ranked_capabilities when the ranked gate is enabled). MCP read-first open (read_first_seeded_exposure on session create) defers seeded mutators unless intent scores strongly (≥ READ_FIRST_SEEDED_MUTATOR_MIN_SCORE) or the wire name is listed in ranked_capabilities. Federate/expand waves use the same read-first policy. Updates / deletes / actions on non-seeded entities remain intent-filtered. See derive_intent_exposure_surface_batch.
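One way to read the read-first policy is as a small decision function. Everything here is a hypothetical simplification: `CapKind`, `CapabilityFacts`, and `taught_on_read_first_open` are invented names, the score is modelled as an `f32`, the threshold value is a placeholder (the real READ_FIRST_SEEDED_MUTATOR_MIN_SCORE is not stated in this document), and primary_read is left out.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum CapKind {
    Query,
    Search,
    Get,
    Create,
    Update,
    Delete,
    Action,
}

/// Placeholder threshold for illustration only; not the real constant's value.
pub const MIN_MUTATOR_SCORE_SKETCH: f32 = 0.5;

/// Hypothetical inputs to the exposure decision for one capability.
#[derive(Clone, Copy, Debug)]
pub struct CapabilityFacts {
    pub kind: CapKind,
    pub seeded: bool,         // capability belongs to a seeded entity
    pub intent_overlap: bool, // any lexicon overlap with the intent
    pub intent_score: f32,    // strength of the intent match (assumed scale)
    pub ranked: bool,         // wire name listed in ranked_capabilities
}

/// Decide whether a capability is taught on a read-first open.
pub fn taught_on_read_first_open(c: &CapabilityFacts) -> bool {
    let is_read = matches!(c.kind, CapKind::Query | CapKind::Search | CapKind::Get);
    match (c.seeded, is_read) {
        // Seeded read surface is always taught.
        (true, true) => true,
        // Seeded mutators are deferred unless strongly intended or ranked.
        (true, false) => c.intent_score >= MIN_MUTATOR_SCORE_SKETCH || c.ranked,
        // Non-seeded entities stay intent-filtered.
        (false, _) => c.intent_overlap,
    }
}
```

The effect is that a session opens with read capabilities first, and mutators appear only when the intent clearly calls for them.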
Cardinality: many logical sessions per MCP transport (MCP-Session-Id); one active Plasm execute binding per logical session (see mcp_server.rs module docs).
Federated sessions (multi-catalog)
A single execute session (prompt_hash + session) can expose entities from more than one registry row (entry_id) without merging their CGS graphs into one artifact.
- Prompt / symbols: TeachingExposureSession tracks which catalog each exposed entity name belongs to via entity_catalog_entry_ids parallel to entities; e# assignment, added detection, intent-surface filters, and federated teaching deltas key on (entry_id, entity), not bare entity names. Teaching rendering and the symbol map stay append-only (e#/m#/p#/r# monotonic within that session). Headings and tables can reflect (registry entry, entity) so the model knows which API each block refers to. Teaching TSV emission uses SymbolMap::entity_sym_for / ident_sym_*_for with the owning entry_id; unqualified SymbolMap::entity_sym returns a wire name when the same entity label appears in more than one catalog, so agents must copy the e# from the row for that catalog block, not infer from Issue alone.
- Execution: The agent keeps one CgsContext per entry_id (backend URL, auth, and its own CGS). FederationDispatch maps exposed entity names to the owning context; the runtime selects HTTP origin (and typecheck graph) per operation, not a single merged schema.
- MCP: If an execute binding already exists and seeds include an entry_id not yet in the session, the server federates that catalog into the same session (additional teaching wave, same binding). Seeds for already-loaded entries produce expand waves.
- HTTP: Primary flow is still POST /execute with one entry_id; extending with a second catalog may use the same federate path as MCP where implemented (see http_execute.rs).
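Per-operation dispatch without a merged schema might look roughly like this. `CatalogContext` and `DispatchSketch` are hypothetical stand-ins for CgsContext and FederationDispatch, and resolving through the e# symbol is an assumption of the sketch rather than a description of the real lookup.

```rust
use std::collections::HashMap;

/// Hypothetical per-catalog context: one per entry_id, never merged.
#[derive(Debug, PartialEq)]
pub struct CatalogContext {
    pub entry_id: String,
    pub base_url: String,
}

/// Hypothetical dispatch table for a federated session.
#[derive(Default)]
pub struct DispatchSketch {
    contexts: HashMap<String, CatalogContext>,  // entry_id -> context
    owners: HashMap<String, (String, String)>,  // e# -> (entry_id, entity)
}

impl DispatchSketch {
    /// Register the context for one catalog.
    pub fn add_context(&mut self, ctx: CatalogContext) {
        self.contexts.insert(ctx.entry_id.clone(), ctx);
    }

    /// Record which catalog and entity a session symbol refers to.
    pub fn bind(&mut self, symbol: &str, entry_id: &str, entity: &str) {
        self.owners
            .insert(symbol.to_string(), (entry_id.to_string(), entity.to_string()));
    }

    /// Resolve the owning context and wire entity name for one operation.
    pub fn resolve(&self, symbol: &str) -> Option<(&CatalogContext, &str)> {
        let (entry_id, entity) = self.owners.get(symbol)?;
        let ctx = self.contexts.get(entry_id)?;
        Some((ctx, entity.as_str()))
    }
}
```

Two symbols can share the entity label Issue and still resolve to different origins, which is why the symbol from the catalog's own teaching block has to be used.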
Not in scope: global merge semantics for colliding entity names across catalogs — prompts are symbolic and (catalog, entity) disambiguates; sessions do not rely on a structural union of CGS.
HTTP parity
POST /execute creates sessions the same way (incremental first wave + stored teaching_exposure). There is no separate HTTP route for expansion in the minimal design; MCP plasm_context invokes the same expand/federate paths server-side.
HTTP execute also supports ?mode=plan, ?wait=false, ?force=true, and ?plan_commit_ref=pcN on live runs — and program bodies wait(s0_oN) / cancel(s0_oN) on the synthetic s0 logical session when no MCP plasm_context is present. See plasm-long-operations.md.
MCP: who orders discover vs execute?
The host agent (e.g. Cursor) decides which tool to call and when. The server surfaces plasm_context first in tool order and initialize instructions requiring it before other Plasm tools; it cannot fully enforce ordering. If the model skips search, you may see only plasm_agent::http_execute “execute expression” lines in logs — that means the client went straight to execute after (or without) a plasm_context open that might have happened in an earlier turn or session.
Observability: at INFO, plasm_agent::mcp logs discover_capabilities, plasm_context, plasm, and list_registry when those tools run, so a healthy flow shows one discover (or retry if incomplete) → plasm_context → plasm explicitly. Filter with RUST_LOG=plasm_agent::mcp=info (or info for the whole crate) to confirm.
Related code
- CGS template binding helpers (teaching anchor / invoke id): plasm-oss/crates/plasm-core/src/schema.rs (template_domain_exemplar_requires_entity_anchor, template_invoke_requires_explicit_anchor_id)
- Federation dispatch (multi-context, no CGS merge): plasm-oss/crates/plasm-core/src/cgs_federation.rs
- Symbol tuning and exposure: plasm-oss/crates/plasm-core/src/symbol_tuning.rs
- Teaching synthesis: plasm-oss/crates/plasm-core/src/prompt_render.rs
- Prompt pipeline: plasm-oss/crates/plasm-core/src/prompt_pipeline.rs
- HTTP + expand: plasm-oss/crates/plasm-agent-core/src/http_execute.rs
- MCP: plasm-oss/crates/plasm-agent-core/src/mcp_server.rs
- Long-running plan execute: plasm-long-operations.md
- Phoenix tool-model + execute notes: tool-model-http.md
Summary
Prompt churn is reduced by (1) exact first-wave teaching size, (2) append-only waves via plasm_context seed deltas, (3) no duplicate teaching table on reused opens, and (4) monotonic e#/m#/p#/r# so earlier examples remain valid as the session grows. Federation adds (5) multi-catalog sessions without merging CGS — same monotonic symbol stream, dispatch per CgsContext.