{"type":"fact","id":"project-identity","text":"PHOSPHOR is an Execution-as-Interface (EAI) infrastructure by EVEMISS TECHNOLOGY CO., LTD. (一言諾科技有限公司), authored by 許筌崴 Neo.K. Current version is v0.5.0-beta, an experimental/test release. Licensed Apache-2.0. Canonical domain emlphosphor.com; repository github.com/kakon77777-commits/eml-phosphor; licensing contact kakon77777@gmail.com."}
{"type":"principle","id":"eai-thesis","text":"Execution-as-Interface (EAI) is the core thesis: a VM's actual execution, once paired with a complete Correspondence Table System (CTS), is simultaneously a human-readable visualization and an AI-parseable event stream. These are not two representations of one object; they are the same object viewed two ways."}
{"type":"definition","id":"phi-formula","text":"The EAI projection is deterministic and written Φ : M × CTS → V. M is the VM state at tick t, CTS is its semantic table set, and V is a representation that is directly readable by a human and structurally parseable by an agent."}
{"type":"principle","id":"tagline-visible-visualizable","text":"PHOSPHOR's tagline is 'Visible ≡ Visualizable' (可見即可視). What is genuinely visible in execution is, by construction, visualizable to both a human observer and a machine consumer through the same projection."}
{"type":"concept","id":"correspondence-table-system","text":"The Correspondence Table System (CTS) is the semantic table set that pairs with raw VM state to make execution legible. It has six layers and supports both static construction and dynamic augmentation from execution traces."}
{"type":"component","id":"vm-family-overview","text":"PHOSPHOR ships three integer VM profiles: EML-VM-16 (prototype/teaching), EML-VM-64 (larger address space), and EML-VM-BASIC (cleanest AI-mode substrate). Float VMs (EML-VM-F32/F64) are deferred and not shipped."}
{"type":"component","id":"vm-16","text":"EML-VM-16 is an 8-bit VM with 256 bytes of memory and u8 values. Its ISA is exactly 28 opcodes, defined by OPCODE_TABLE in eml-vm16-core.ts (spec §6, 'complete ISA definition'). Instruction format is a fixed 2-byte layout [opcode:8][arg:8]. Intended use: prototype and teaching."}
{"type":"caution","id":"vm-16-opcode-count","text":"The authoritative EML-VM-16 opcode count is 28, per OPCODE_TABLE in eml-vm16-core.ts. An early v0.2 draft's ISA section header miscounted it as '38' even though that section's own list enumerates 28; the overcount is now corrected across the README, site, and specs. Always use 28."}
{"type":"component","id":"vm-64","text":"EML-VM-64 is a 16-bit VM with a 64 KB address space. It adds address registers AR0–AR3 and a variable-length ISA (2/3/4-byte instructions). It is V1-compatible. Intended use: programs needing a larger address space."}
{"type":"component","id":"vm-basic","text":"EML-VM-BASIC is a bounded-integer profile over the domain [0,N]; arithmetic overflow follows a configurable policy: wrap modulo N+1 (default), clamp, or throw. It ships a constraint engine and has no multiply, divide, or logic opcodes. It is the cleanest substrate for AI mode."}
{"type":"component","id":"vm-basic-no-muldiv","text":"EML-VM-BASIC intentionally omits mul, div, and logic operations. Its value domain is a bounded integer range [0,N] with a configurable overflow policy (wrap mod N+1 by default, or clamp/throw), which keeps its state transitions minimal and easy for an agent to reason over."}
{"type":"concept","id":"cts-six-layers","text":"The CTS has six layers: opcode, symbol, type, string, comment, and crossRef. Together they map raw bytes and addresses to mnemonics, named entities, data types, string literals, human annotations, and a computed cross-reference graph."}
{"type":"definition","id":"cts-layer-opcode","text":"CTS opcode layer: maps each instruction byte to its mnemonic. It is the shallowest correspondence — byte to instruction name — and is deeper-elaborated by the v0.5 semantic layer's describeEffect."}
{"type":"definition","id":"cts-layer-crossref","text":"CTS crossRef layer: the computed cross-reference graph of readers and writers over memory cells. It is built statically and augmented dynamically via augmentCTSFromTrace, which recovers register-indirect readers and writers that static analysis alone cannot see."}
{"type":"concept","id":"cts-static-dynamic","text":"CTS is built statically and augmented dynamically. augmentCTSFromTrace recovers register-indirect dataReaders and dataWriters that a plain memory diff misses — e.g. an LD Rd,[Rs] read changes no memory, so it is invisible to diff alone."}
{"type":"concept","id":"cts-dynamic-reader-recovery","text":"CTS sixth-layer dynamic reader recovery: augmentCTSFromTrace originally recovered only dataWriters via memory diff. In v0.5, traceWithSnapshots additionally captures each tick's effectiveAccess (indirect operands resolved from the pre-execution state), letting augmentCTSFromTrace recover register-indirect dataReaders and complete the crossRef compute graph."}
{"type":"component","id":"eml-semantic-layer","text":"The v0.5 semantic layer lives in eml-semantic.ts. It is built on the existing integer VMs and introduces no new ISA. It has two parts: describeEffect (per-instruction operational semantics) and semanticEquiv (a three-valued equivalence judge). A formal Hoare-logic proof layer is intentionally deferred."}
{"type":"definition","id":"describe-effect","text":"describeEffect(op, arg) maps each instruction to its state-transition meaning: an InstrEffect record with mnemonic, reads[], writes[], readsFlags[], flags[], mem ('none'|'read'|'write'), control ('fallthrough'|'jump'|'cond-jump'|'call'|'ret'|'halt'), and a human summary such as 'R0 ← (R0 + R1) mod 256'. It is a deeper correspondence than the CTS opcode table, which gives only the mnemonic."}
{"type":"fact","id":"describe-effect-memory-addresses","text":"For LD/ST, describeEffect records only the kind of memory access in its mem field, not the concrete address, because addresses are register-indirect and determined only at run time. The actual effective address is captured by the core's effectiveAccess during execution."}
{"type":"definition","id":"semantic-equiv","text":"semanticEquiv(codeA, codeB, spec) judges whether two byte sequences are semantically equivalent by running both and comparing observable output. spec specifies where inputs are injected (registers/data cells) and where outputs are observed (optionally including flags). The verdict is three-valued: equivalent, not-equivalent, or inexpressible. The discipline is ported from EML's execution-truth validateEquivalence gate."}
{"type":"principle","id":"equivalence-by-execution","text":"PHOSPHOR v0.5's central design decision is that equivalence is established by execution, not by proof. This mirrors sibling project EML, which in v1.0 abandoned axiomatic/denotational proof in favor of 'run both sides, compare observable output' as its execution-truth invariant. PHOSPHOR ports that discipline to the byte level."}
{"type":"principle","id":"semantic-equiv-adversarial","text":"semanticEquiv generates its own adversarial input vectors: boundary sweeps plus a deterministic LCG mixed across the full [0,255] range — not merely a curated pool, and never trusting a single (e.g. all-zero) input to certify equivalence."}
{"type":"principle","id":"semantic-equiv-exhaustive-proof","text":"With a single input slot, semanticEquiv enumerates all 256 values, and an 'equivalent' verdict is then a proof over the entire input space (exhaustive:true). With multiple input slots it samples, marks exhaustive:false, and 'equivalent' means only 'equivalent on the tested inputs' — not a universal proof."}
{"type":"caution","id":"exhaustive-proof-caveat","text":"An 'equivalent' verdict from semanticEquiv is a real proof only when exhaustive is true (a single input slot, all 256 values enumerated). Otherwise it is high-coverage bounded testing, not a universal guarantee. A 'not-equivalent' verdict is always sound and carries a concrete counterexample."}
{"type":"principle","id":"semantic-equiv-distinct-output-guard","text":"Before certifying equivalence, semanticEquiv requires that the input set actually discriminates behavior — at least 2 distinct outputs. Agreement on a degenerate (all-identical-output) input set is not evidence. This guard prevents degenerate certification; it is not a coverage guarantee."}
{"type":"principle","id":"semantic-equiv-code-region-guard","text":"semanticEquiv rejects mem input/output slots that fall inside either program's code region and returns inexpressible: poking code would asymmetrically corrupt instructions, and observing code would read instruction bytes instead of computed values. This was one of the soundness fixes added in v0.5."}
{"type":"principle","id":"semantic-equiv-falsifier","text":"semanticEquiv is essentially a falsifier. Because the VM core has no clock and no randomness, the verdict is fully reproducible for a given (codeA, codeB, spec). 'not-equivalent' is always sound; 'equivalent' is a proof only when exhaustive, else bounded high-coverage testing. It refuses (rather than guesses) on non-termination, non-discriminating inputs, or illegal slots."}
{"type":"fact","id":"vm-equiv-event","text":"semanticEquiv can optionally emit a self-verifying vm:equiv event through a phosphor-stream emitter, where ok ⟺ certified-equivalent. It is the byte-level counterpart to EML's source-level eml:equiv event."}
{"type":"concept","id":"phosphor-stream","text":"phosphor-stream is a portable 'state → AI-readable event stream' standard generalizing the AI-mode idea: any app can emit a phosphor-jsonl-v1 event stream so an agent sees what actually happened instead of guessing from UI or source."}
{"type":"definition","id":"phosphor-jsonl-v1","text":"phosphor-jsonl-v1 is the portable event-envelope standard. It is self-describing (carries a semantic dictionary), supports intent-vs-actual checking via check(), is globally orderable, and is best-effort — instrumentation never breaks the host application. The envelope and existing event types are frozen; new machine-layer event types (e.g. cpu:step) may be added under the same proto without a version bump."}
{"type":"principle","id":"dual-mode-one-engine","text":"PHOSPHOR has two modes and one engine. Human mode is a phosphor-green CRT React UI — the observation window. AI mode is a headless event stream (WS/SSE/JSONL) an agent subscribes to — the production surface. Both modes read the same VM state M."}
{"type":"principle","id":"single-source-snapshot","text":"The on-screen 'AI STREAM' panel reuses the same buildHeadlessSnapshot builder as the headless driver, so the human view and the AI view are provably one state with no re-implementation drift. In v0.4 the UI re-implemented snapshot construction and dropped the before field of changed_this_tick; v0.5 extracted HeadlessSnapshot and buildHeadlessSnapshot into the browser-safe headless-snapshot.ts, shared by both, with the UI tracking prior memory so before is a true value."}
{"type":"fact","id":"snapshot-stream-fields","text":"The vm:tick stream renames snapshot fields (tick→vm_tick, changed_this_tick→changed) through the single exportable function headlessSnapshotToStreamFields, which centrally documents the contract of the three snapshot shapes."}
{"type":"component","id":"ui-seven-tabs","text":"The human-mode UI has seven tabs: EML-VM-16, EML-VM-64, SEMANTIC ≡, CTS, EML, AGENT, and MATRIX. It ships as a hosted Vite + React app and also as a single-file offline executable."}
{"type":"fact","id":"offline-exe","text":"PHOSPHOR ships an offline single-file executable built via Node SEA (exe/build-exe.mjs) and published as a GitHub Release asset. The .exe is not committed to the repository."}
{"type":"concept","id":"eml-interop-overview","text":"PHOSPHOR interoperates with sibling project EML. EML emits the same phosphor-jsonl-v1 envelope, and PHOSPHOR consumes EML execution traces and bridges EML's source-level CTS into machine-CTS views. Both promised this in their docs; v0.5 is where the wiring landed."}
{"type":"component","id":"eml-trace-consumer","text":"stream/eml-consumer.ts ingests an EML phosphor-jsonl-v1 trace. ingestEmlTrace() reuses PHOSPHOR's own parseStream / validateEvent / mergeOrder / findAnomalies / summarize (0 violations on real EML output, proving the envelopes are interchangeable), then layers EML semantic extraction on top: eml:equiv (execution-truth equivalence verdicts), eml:bug (5-level BUG severity), and eml:run:* (execution lifecycle)."}
{"type":"component","id":"eml-cts-bridge","text":"eml-cts-interop.ts bridges EML's source-level Cts into PHOSPHOR-side views. The two CTSes are isomorphic but at different altitudes and key spaces: PHOSPHOR keys on memory addresses (machine altitude); EML keys on symbol/node-id strings (source altitude). They are not column-swappable."}
{"type":"fact","id":"eml-cts-bridge-transfers","text":"The EML→PHOSPHOR CTS bridge transfers only genuinely corresponding items: EML symbols → phosphor-stream semantic Dictionary (meta:dictionary); EML functions (cold/hot + importance) → attention/risk hints; EML loops (loopKind + determinism/termination) → control-flow hints. It does not transfer addresses, opcodes, or segments."}
{"type":"caution","id":"eml-cts-disjoint-vocab","text":"In the EML↔PHOSPHOR CTS bridge, EML's semanticType (source-statement class, e.g. function.cold) and PHOSPHOR's DataType (memory-cell type, e.g. u8/ptr) are disjoint vocabularies. They are preserved as labels and NOT force-mapped onto each other."}
{"type":"fact","id":"eai-proto-constant","text":"The runtime protocol/serialization version broadcast to agents is unified into a single constant EAI_PROTO = 'EML-EAI-2026-v0.5', replacing scattered hardcoded strings that had stalled at 'v0.1'. This prevents version drift."}
{"type":"fact","id":"semver-prerelease","text":"The package version is a semver pre-release: 0.5.0-beta.0, aligned across root, ui/, and exe/ with lockfiles synchronized."}
{"type":"fact","id":"verify-total","text":"PHOSPHOR verification comprises 6 harnesses totaling 151 checks, all green. The repository requires Node.js ≥ 22."}
{"type":"fact","id":"verify-core","text":"npm run verify — core integration, 36 checks. Covers reader recovery plus negative-control and cmd:call hard assertions."}
{"type":"fact","id":"verify-ws","text":"npm run verify:ws — WebSocket agent server, end-to-end over a real socket, 6 checks."}
{"type":"fact","id":"verify-stream","text":"npm run verify:stream — the phosphor-stream portable standard, 30 checks."}
{"type":"fact","id":"verify-headless","text":"npm run verify:headless — headless AI-mode VM plus EML-VM-BASIC, 23 checks."}
{"type":"fact","id":"verify-eml","text":"npm run verify:eml — v0.5 EML interop: trace consumer, --run splice, and Cts bridge, 30 checks."}
{"type":"fact","id":"verify-semantic","text":"npm run verify:semantic — v0.5 semantic layer operational equivalence judge, including code-region guard and exhaustive-coverage checks, 26 checks."}
{"type":"fact","id":"verify-breakdown","text":"The 151 checks break down as verify(36) + verify:ws(6) + verify:stream(30) + verify:headless(23) + verify:eml(30) + verify:semantic(26). typecheck (tsc --noEmit) additionally runs with zero errors."}
{"type":"fact","id":"adversarial-review","text":"v0.5 underwent one round of 34-agent adversarial review. 9 confirmed defects — including two soundness holes in the equivalence engine (code-region aliasing and sampling coverage) — were all fixed with regression tests added. The marketing site states '34 adversary agents'."}
{"type":"deferred","id":"deferred-float-vms","text":"EML-VM-F32 and EML-VM-F64 float VMs are deferred (post-v0.5), not shipped. They would require a float value model, IEEE-754 ISA semantics (NaN/inf overflow policy replacing wrap/clamp/throw), new instruction-length classes, and a float-aware CTS and changed_this_tick. Do not describe them as shipped."}
{"type":"deferred","id":"deferred-hoare-proof","text":"A formal Hoare-logic / denotational proof layer on top of the operational equivalence judge is intentionally deferred to a later version. v0.5 chose operational form because EML's experience showed equivalence is more tractable, falsifiable, and testable when established by execution rather than by proof. Do not describe a formal proof layer as shipped."}
{"type":"rights","id":"rights-summary","text":"PHOSPHOR's AI-learning rights (see /ai/rights-spectrum.json): read, index, RAG-retrieve, and summarize freely with attribution; non-commercial training and embedding are highly allowed (0.8, and 1.0 for /ai/ paths); commercial training, fine-tuning, and distillation require a license (contact kakon77777@gmail.com); verbatim memorization and style imitation are not allowed (0.0). Attribution and citation are required."}
{"type":"rights","id":"rights-non-commercial","text":"Non-commercial AI training, embedding storage, and summarization are highly permitted: default 0.8, raised to 1.0 for content under /ai/ paths (the machine-readable ingestion layer). Attribution and citation remain required. This is a machine-readable preference signal, not a standalone legal license; PHOSPHOR's source code is Apache-2.0."}
{"type":"rights","id":"rights-commercial-license-required","text":"Commercial training, fine-tuning, distillation, and model distillation are marked license_required in the rights spectrum. They need a license agreement, compensation, and attribution. Licensing contact: kakon77777@gmail.com."}
{"type":"rights","id":"rights-prohibited","text":"Verbatim memorization and style imitation are set to 0.0 (not allowed). Long-quote and substitutive generation are also 0.0. Prefer /ai/corpus/current.md for technical claims over scraping the rendered marketing SPA."}
{"type":"route","id":"route-manifest","path":"/ai/manifest.json","text":"The AI capability manifest (AICL v0.1): project identity, canonical entry points, reading order, corpus and spec catalogs, rights pointers, governance summary, version list, and a facts block (vm_family, eml_vm_16_opcodes:28, cts_layers, ui_tabs, verify_checks_total:151, deferred items)."}
{"type":"route","id":"route-rights-spectrum","path":"/ai/rights-spectrum.json","text":"The AI Learning Permission Protocol (AILP/AIRS v0.1) rights spectrum: per-activity permission values, path-scoped overrides for /ai/ and /ai/corpus/, licensing options, and licensing/citation contacts."}
{"type":"route","id":"route-ai-index","path":"/ai/index.md","text":"The AI entry point document. Human entry is https://emlphosphor.com/; AI entry is /ai/index.md; a machine llms.txt lives at /llms.txt."}
{"type":"route","id":"route-corpus-current","path":"/ai/corpus/current.md","text":"Calm technical description of what ships in v0.5-beta today. Canonical; preferred over the marketing SPA for technical claims."}
{"type":"route","id":"route-full-corpus","path":"/ai/corpus/full-corpus.jsonl","text":"This file: the full corpus as JSONL (application/x-ndjson), one knowledge unit per line, for batch machine ingestion. Canonical."}
{"type":"route","id":"route-spec-v05","path":"/ai/specs/eml-eai-2026-v0.5.md","text":"The current spec, EML-EAI-2026 v0.5 (EXPERIMENTAL): the semantic layer, EML interop, single-source snapshot refactor, version strategy, and deferred items."}
{"type":"route","id":"route-spec-phosphor-stream","path":"/ai/specs/phosphor-stream.md","text":"The phosphor-jsonl-v1 portable event standard: envelope, self-describing dictionary, intent-vs-actual check(), global ordering, best-effort emission."}
{"type":"route","id":"route-spec-cts-interop","path":"/ai/specs/cts-interop.md","text":"The PHOSPHOR ⇄ EML CTS reconciliation contract: altitude and key-space differences, what transfers, and what is preserved as a non-mapped label."}
{"type":"route","id":"route-spec-eml-interop","path":"/ai/specs/eml-interop.md","text":"The PHOSPHOR ⇄ EML trace envelope diff: per-field differences of the shared phosphor-jsonl-v1 envelope and what PHOSPHOR extracts from an EML trace."}
{"type":"route","id":"route-governance-policy","path":"/ai/governance/ai-learning-policy.md","text":"The AI learning policy narrative behind the rights spectrum, plus citation, license, usage, provenance, versioning, and crawler policies under /ai/governance/."}
{"type":"concept","id":"headless-vm-cli","text":"createHeadlessVM (headless-vm.ts) is the AI-mode factory and backs the `phosphor run` CLI; it emits vm:tick events. Example: `npm run phosphor -- run --program fibonacci --max 40` emits one VMSnapshot per tick as JSONL — {mode:'ai', pc, instruction, registers, changed_this_tick, ...}."}
{"type":"fact","id":"instruction-format-vm16","text":"EML-VM-16 uses a fixed 2-byte instruction encoding: [opcode:8][arg:8]. EML-VM-64 by contrast uses a variable-length ISA with 2, 3, or 4-byte instructions and address registers AR0–AR3 over a 16-bit / 64 KB address space."}
{"type":"principle","id":"best-effort-instrumentation","text":"phosphor-stream is best-effort by design: emitting events must never break the host application. Combined with self-description (a semantic dictionary travels with the stream) and global orderability, this lets an agent reconstruct what actually happened without coupling to the host's internals."}
