pp: Ontology — Risk Register & Mitigations

Version: 0.4.0
Created: 2026-02-18
Priority Ordering (Risk 4): training data quality > LLM reasoning accuracy > interchange compatibility > community adoption

Implementation Scope

Owner | Risks | Description
FxTool (this project) | 1 (partial), 3 (partial), 4, 5, 6, 7 | Vocabulary, escape types, MCP reference impl, governance
LLM Project (separate) | 1 (partial), 2, 3 (partial) | Training data annotation, external corpora, agent chaining

Risk 1: Mathematical Structure ≠ Semantic Meaning

Problem: The pp: vocabulary captures geometric structure (what things are) but not design intent (what they’re for). A circle is structurally identical whether it is used as a bullet point, a planet, or a button, yet each use is semantically different. Without intent labels, an LLM trained on pp: graphs cannot distinguish usage context.

Mitigation:

  • FxTool: Add coarse designIntents taxonomy to SemanticVocabulary.js: structural, decorative, temporal, interactive. These four categories partition all element usage into mutually exclusive function classes.
  • LLM Project: Annotate training examples with intent labels. Each item in training data gets a designIntent field from the coarse taxonomy.

Coarse → Fine Intent Mapping:

Coarse Intent | Fine-Grained Intents | Description
structural | inform, teach, demonstrate | Elements that carry information or define layout
decorative | inspire, celebrate, brand | Elements that add visual appeal without data payload
temporal | announce, entertain, sell | Elements whose purpose is time-sensitive or attention-driven
interactive | (none yet; future) | Elements that respond to user input (quizzes, buttons)
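
The coarse-to-fine mapping above could be held as a plain lookup object. The following is a minimal sketch only: the names DESIGN_INTENTS and coarseIntentFor are hypothetical and are not taken from SemanticVocabulary.js, whose actual shape may differ.

```javascript
// Hypothetical sketch; not the actual contents of SemanticVocabulary.js.
const DESIGN_INTENTS = Object.freeze({
  structural: ["inform", "teach", "demonstrate"],
  decorative: ["inspire", "celebrate", "brand"],
  temporal: ["announce", "entertain", "sell"],
  interactive: [], // no fine-grained intents defined yet
});

// Map a fine-grained intent back to its coarse category, or null if unknown.
function coarseIntentFor(fineIntent) {
  for (const [coarse, fineList] of Object.entries(DESIGN_INTENTS)) {
    if (fineList.includes(fineIntent)) return coarse;
  }
  return null;
}

console.log(coarseIntentFor("teach")); // structural
console.log(coarseIntentFor("unknown")); // null
```

Returning null for an unrecognized intent, rather than guessing a category, keeps the lookup consistent with the goal of clean, unambiguous labels.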

Status: Implemented in js/ontology/SemanticVocabulary.js (FxTool portion)


Risk 2: No Adoption Incentive (Cold-Start)

Problem: External users have no reason to adopt pp: vocabulary unless it provides immediate value. The vocabulary alone is abstract; the value comes from what it enables (generation, analysis, recommendations).

Mitigation:

  • LLM Project: Build agent chaining value first — demonstrate that pp: vocabulary enables multi-step generation workflows that plain SVG cannot.
  • FxTool: Ensure MCP tools output pp:-typed results so agents get structured data without manual classification.

Status: Deferred to LLM project for primary implementation. FxTool MCP tools already output structured data.


Risk 3: Circular Data Collection

Problem: If the vocabulary can only describe patterns already in the template library, training data becomes tautological. Novel patterns that don’t fit existing types get silently dropped or force-classified into wrong categories.

Mitigation:

  • FxTool: Add pp:Unclassified node type and pp:unknownRelation edge type as escape hatches. Items/relations that don’t match vocabulary get tagged rather than dropped. This creates an active vocabulary gap discovery mechanism.
  • LLM Project: Import external corpora (LLM4SVG, SVGBench) to break the self-referential cycle.
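
The escape-hatch behavior described above might look roughly like the sketch below. Only pp:Unclassified and pp:unknownRelation come from this document; KNOWN_TYPES, KNOWN_RELATIONS, the placeholder type and relation names, and the tagNode/tagEdge helpers are assumptions made for illustration.

```javascript
// Hypothetical sketch: these sets stand in for the real vocabulary,
// and the names inside them are placeholders.
const KNOWN_TYPES = new Set(["pp:Circle", "pp:Rectangle"]);
const KNOWN_RELATIONS = new Set(["pp:contains"]);

// Running log of vocabulary gaps discovered while tagging.
const gaps = [];

// Keep known node types; tag unknown ones instead of dropping them.
function tagNode(node) {
  if (KNOWN_TYPES.has(node.type)) return node;
  gaps.push({ kind: "node", original: node.type });
  return { ...node, type: "pp:Unclassified", originalType: node.type };
}

// Keep known relations; tag unknown ones instead of dropping them.
function tagEdge(edge) {
  if (KNOWN_RELATIONS.has(edge.relation)) return edge;
  gaps.push({ kind: "edge", original: edge.relation });
  return { ...edge, relation: "pp:unknownRelation", originalRelation: edge.relation };
}

const tagged = tagNode({ id: "n1", type: "pp:Squiggle" });
console.log(tagged.type); // pp:Unclassified
console.log(gaps.length); // 1
```

Preserving the original label alongside the escape type is what turns the gap log into a list of candidate vocabulary additions.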

Status: Implemented in js/ontology/Vocabulary.js (FxTool portion)


Risk 4: Four Conflicting Masters

Problem: The pp: vocabulary serves four goals with potentially conflicting requirements: (1) training data quality, (2) LLM reasoning accuracy, (3) interchange compatibility, (4) community adoption. Optimizing for one can degrade another.

Mitigation: Explicit priority ordering — when conflicts arise, resolve in this order:

  1. Training data quality — The vocabulary must produce clean, consistent, unambiguous training data. This is the primary use case.
  2. LLM reasoning accuracy — The vocabulary structure must be learnable by LLMs. Prefer simple, regular patterns over clever optimizations.
  3. Interchange compatibility — Cross-tool portability is valuable but secondary to data quality.
  4. Community adoption — Documentation and onboarding matter, but not at the cost of schema integrity.
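
As an illustration of how the ordering above resolves a conflict, here is a small sketch. PRIORITY_ORDER and resolveConflict are hypothetical names introduced for this example; the document only fixes the order itself.

```javascript
// Priority order as stated in this document, highest first.
const PRIORITY_ORDER = [
  "training data quality",
  "LLM reasoning accuracy",
  "interchange compatibility",
  "community adoption",
];

// Given the concerns in conflict, return the one that wins under the ordering.
function resolveConflict(concerns) {
  const ranked = concerns
    .map((concern) => ({ concern, rank: PRIORITY_ORDER.indexOf(concern) }))
    .filter((entry) => entry.rank !== -1)
    .sort((a, b) => a.rank - b.rank);
  if (ranked.length === 0) {
    throw new Error("No recognized concern in: " + concerns.join(", "));
  }
  return ranked[0].concern;
}

console.log(resolveConflict(["community adoption", "interchange compatibility"]));
// interchange compatibility
```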

Status: Documented here and in Vocabulary.js module header.


Risk 5: 72% Novel Vocabulary = High Barrier

Problem: Most pp: types have no equivalent in existing design vocabularies (SVG spec, CSS, etc.). This means consumers (LLMs and humans) cannot map pp: concepts to prior knowledge. The novelty is a feature (covering motion graphics gaps) but creates an adoption barrier.

Mitigation: Publish Layer 1 (geometry primitives) and Layer 5 (relation types) first, using a four-part documentation format for each type:

  1. Plain description — What it is, in one sentence
  2. Canonical example — A concrete MCP tool call or code snippet that produces this type
  3. Formal definition — Parent type, properties, constraints
  4. JSON-LD representation — Machine-readable linked data format
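
A documentation entry in this four-part format could be checked for completeness with something like the sketch below. Everything in it is assumed for illustration: the field names, the pp:Circle and pp:Shape identifiers, the create_circle tool name, and the example.org namespace are placeholders, not values from Vocabulary.js or the published vocabulary.

```javascript
// Hypothetical field names for the four documentation parts.
const REQUIRED_PARTS = ["description", "example", "definition", "jsonld"];

// Return the names of any parts an entry is missing.
function missingParts(entry) {
  return REQUIRED_PARTS.filter(
    (part) => entry[part] === undefined || entry[part] === null
  );
}

// Placeholder entry; every identifier here is illustrative only.
const circleDoc = {
  description: "A closed curve with every point at a fixed distance from its center.",
  example: { tool: "create_circle", args: { cx: 50, cy: 50, r: 20 } },
  definition: { parent: "pp:Shape", properties: ["cx", "cy", "r"], constraints: ["r > 0"] },
  jsonld: {
    "@context": { pp: "https://example.org/pp#" },
    "@id": "pp:Circle",
    "@type": "pp:GeometryPrimitive",
  },
};

console.log(missingParts(circleDoc)); // []
console.log(missingParts({ description: "Only a sentence." }));
// [ 'example', 'definition', 'jsonld' ]
```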

Layer 1 (geometry) is familiar to anyone who knows SVG. Layer 5 (relations) is the novel contribution where pp: adds value over existing vocabularies. Publishing these two layers first builds understanding from familiar ground toward novel ground.

Status: Layer 1 and Layer 5 definitions exist in Vocabulary.js. Four-part doc format is a v0.5 deliverable.


Risk 6: Solo Governance

Problem: A single author controlling vocabulary evolution creates bus-factor risk and perception of closed governance, even if intentions are open.

Mitigation: BDFL-with-sunset model:

  • Current: Benevolent Dictator For Life (BDFL) — all vocabulary decisions are fast and consistent.
  • Trigger conditions for governance evolution:
    • First external contributor → add CONTRIBUTING.md with RFC template
    • First production user → add stability guarantees and deprecation policy
    • First funding → establish a steering committee with an odd number of voting members
  • RFC process: GitHub Discussions with structured template (problem, proposal, alternatives, migration path).

Status: Process documented here. Implementation is process/documentation, not code.


Risk 7: LLM Pre-Training Erodes Vocabulary Value

Problem: As LLMs improve at understanding SVG/CSS/design directly, the vocabulary’s value as a “translation layer” diminishes. If an LLM can reason about raw SVG as well as pp: graphs, the vocabulary becomes overhead.

Mitigation: The MCP server — not the vocabulary document — is the canonical reference implementation. The MCP server produces deterministic, verifiable behavior that no amount of LLM pre-training can replicate. The ground truth loop is:

definition → implementation → validation → feedback

Every pp: type must ship with a tested MCP tool call, not just a schema definition. This makes the vocabulary a living, executable specification rather than a static document.

FxTool Implementation:

  • Add mcpTool references to vocabulary type entries, linking each type to the MCP tool that creates/manipulates it.
  • Add mcpToolRef references to vocabulary edge entries, linking each relation to the MCP tool that manages it.
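
The requirement that every type and edge links to an MCP tool lends itself to an automated check. The sketch below assumes a vocabulary object with types and edges maps; that shape, the placeholder entries, and the tool names are hypothetical, and only the mcpTool and mcpToolRef field names come from this document.

```javascript
// Hypothetical vocabulary shape with placeholder entries and tool names.
const vocabulary = {
  types: {
    "pp:Circle": { mcpTool: "create_circle" },
    "pp:Star": {}, // deliberately missing its mcpTool link
  },
  edges: {
    "pp:contains": { mcpToolRef: "group_items" },
  },
};

// List every type without mcpTool and every edge without mcpToolRef.
function findUnlinkedEntries(vocab) {
  const missing = [];
  for (const [name, entry] of Object.entries(vocab.types)) {
    if (!entry.mcpTool) missing.push(name);
  }
  for (const [name, entry] of Object.entries(vocab.edges)) {
    if (!entry.mcpToolRef) missing.push(name);
  }
  return missing;
}

console.log(findUnlinkedEntries(vocabulary)); // [ 'pp:Star' ]
```

Run as part of a test suite, a check of this kind would flag any schema-only definition before it ships.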

Status: Implemented in js/ontology/Vocabulary.js: mcpTool and mcpToolRef fields added to types and edges.


Implementation Timeline

Phase | Risks | Deliverable
Now (v0.4.0) | 1, 3, 4, 7 | SemanticVocabulary intents, escape types, MCP mappings, priority doc
v0.5 | 2, 5 | Four-part doc format, typed MCP outputs, pruned vocabulary publication
Pre-training | 1, 3 | Intent-labeled training data, external corpus import
Ongoing | 6, 7 | RFC process, MCP-as-reference-implementation standing requirement