On-Device AI (Client-Side LLM)


PinePaper can drive its own tools from a language model that runs entirely in the browser — no account, nothing sent to a server. This powers the Assistant tab in the editor’s AI / Code console, and it’s available to your own code via window.PinePaperClientLLM.

Experimental. Small on-device models are constrained to valid tool calls (so output is always runnable), but quality varies. The Cloud tab and MCP server remain the higher-fidelity paths.

How it works

The model never emits free-form JavaScript (a tiny model hallucinates APIs). Instead it emits a JSON array of MCP-style tool calls, constrained so invalid arguments are structurally impossible:

[
  { "name": "pinepaper_set_background_color", "arguments": { "color": "#0F0F1A" } },
  { "name": "pinepaper_create_item", "arguments": { "itemType": "star", "x": 400, "y": 300, "radius": 90, "color": "#E74C3C", "animationType": "pulse" } }
]

The constraint is applied per provider:

Provider	Constraint mechanism
Chrome Prompt API (Gemini Nano)	JSON Schema via `responseConstraint`
WebLLM (in-browser open model)	EBNF grammar via `response_format: { type: 'grammar' }` (XGrammar)
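
As a rough illustration of the Prompt API row, the sketch below builds a JSON Schema for an array of tool calls and passes it as `responseConstraint`. The schema is an assumption: it restricts only the tool name (to the four names listed under Tool set) and leaves `arguments` as an open object, whereas PinePaper's real schema presumably also constrains each tool's arguments. The helper names `buildToolCallSchema` and `promptForToolCalls` are hypothetical; `LanguageModel.create()`, `session.prompt()` and `session.destroy()` belong to Chrome's Prompt API.

```javascript
// Hypothetical, simplified schema: an array of { name, arguments } objects.
// The real constraint would also pin down each tool's argument fields.
function buildToolCallSchema() {
  return {
    type: 'array',
    items: {
      type: 'object',
      properties: {
        name: {
          type: 'string',
          enum: [
            'pinepaper_create_item',
            'pinepaper_execute_generator',
            'pinepaper_set_background_color',
            'pinepaper_animate',
          ],
        },
        arguments: { type: 'object' },
      },
      required: ['name', 'arguments'],
      additionalProperties: false,
    },
  };
}

// Ask the built-in model for tool calls, constrained by the schema above.
async function promptForToolCalls(text) {
  if (typeof LanguageModel === 'undefined') {
    throw new Error('Prompt API is not available in this browser');
  }
  const session = await LanguageModel.create();
  try {
    const raw = await session.prompt(text, {
      responseConstraint: buildToolCallSchema(),
    });
    return JSON.parse(raw); // constrained output should parse as a JSON array
  } finally {
    session.destroy(); // release the session even if prompting fails
  }
}
```

Because the constraint is enforced during decoding, the parsed result can be handed to an executor without a separate shape check, although a defensive check costs little.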

A small executor (ToolCallExecutor) then runs the calls against the live window.PinePaper instance.

Providers

Browser AI: Chrome’s built-in Gemini Nano (window.LanguageModel). Zero hosting, because the model ships with Chrome. Requires desktop Chrome with built-in AI enabled.
PinePaper AI: WebLLM in any WebGPU-capable browser. Downloads a small open model on first use (cached thereafter), then runs entirely on your device.
Translation: for non-English prompts, the browser’s built-in Translator and Language Detector translate the prompt to English before generation, since the small code models are English-centric.
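
The translation step can be pictured roughly as follows. This is a sketch, not PinePaper's implementation: `toEnglish` is a hypothetical helper, and the pass-through rules (skip when the APIs are missing, when the language is undetermined, or when the prompt is already English) are assumptions. `LanguageDetector` and `Translator` are the browser's built-in APIs, taken from an injectable `apis` object so the function can also be exercised with stubs.

```javascript
// Translate a prompt to English with the built-in browser APIs when possible.
// `apis` defaults to the browser globals; pass stubs to try it elsewhere.
async function toEnglish(text, apis = globalThis) {
  const { LanguageDetector, Translator } = apis;
  if (!LanguageDetector || !Translator) return text; // no built-in support: pass through

  const detector = await LanguageDetector.create();
  const [best] = await detector.detect(text); // candidates, most confident first
  const source = best?.detectedLanguage;

  // Assumption: leave undetermined ('und') and English prompts untouched.
  if (!source || source === 'und' || source.startsWith('en')) return text;

  const translator = await Translator.create({
    sourceLanguage: source,
    targetLanguage: 'en',
  });
  return translator.translate(text);
}
```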

`window.PinePaperClientLLM`

Exposed by the editor when the on-device Assistant is available. When it isn’t, every method either does nothing or rejects (see Availability).

const llm = window.PinePaperClientLLM;

// Is the feature on, and is a provider usable?
llm.enabled();                 // boolean (build flag + runtime kill switch)
await llm.capabilities();      // { enabled, promptApi, webgpu, translator, detector, provider }

// Choose an engine (or null = auto: Prompt API, then WebLLM)
llm.setProvider('prompt-api'); // 'prompt-api' | 'web-llm' | null
llm.preferredProvider();       // current preference
llm.provider();                // active provider after preload()/runScene()

// Warm the model ahead of the first prompt (optional, but recommended once the user signals intent)
await llm.preload();

// One shot: translate → constrained generate → execute on the canvas
const r = await llm.runScene('a pulsing red star on a dark background', {
  sourceLocale: document.documentElement.lang,   // translation hint
  onProgress: (rep) => { /* { progress, text } during model download */ },
});
// r = { raw, calls, results, ok, failed }

llm.unload();                  // free GPU / session resources

runScene also dispatches a pinepaper:clientllm-progress window event ({ detail: { progress, text } }) during model download/warm-up — the editor uses it to drive the progress bar.
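
If you want your own progress indicator, you can listen for the same event. The sketch below is illustrative: `formatProgress` and `watchProgress` are hypothetical helpers, and it assumes `progress` is a fraction between 0 and 1, which this page does not specify (adjust the scaling if the editor reports a percentage).

```javascript
// Assumption: `progress` is a fraction in [0, 1]; it is clamped before display.
function formatProgress({ progress = 0, text = '' } = {}) {
  const pct = Math.round(Math.min(Math.max(progress, 0), 1) * 100);
  return text ? `${pct}% ${text}` : `${pct}%`;
}

// Subscribe on any EventTarget (window in the browser).
// Returns a function that removes the listener again.
function watchProgress(target, render) {
  const handler = (event) => render(formatProgress(event.detail));
  target.addEventListener('pinepaper:clientllm-progress', handler);
  return () => target.removeEventListener('pinepaper:clientllm-progress', handler);
}

// Browser usage (statusEl is a hypothetical element in your page):
//   const stop = watchProgress(window, (label) => { statusEl.textContent = label; });
//   ...later, once the model is ready: stop();
```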

Tool set

ToolCallExecutor (phase 1) maps these tools to PinePaper methods:

Tool	Maps to
`pinepaper_create_item`	`PinePaper.create(itemType, args)` (+ `PinePaper.animate` if `animationType` given)
`pinepaper_execute_generator`	`PinePaper.executeGenerator(name, params)`
`pinepaper_set_background_color`	`PinePaper.setBackgroundColor(color)`
`pinepaper_animate`	`PinePaper.animate(<item by id>, { animationType })`

The valid item types, animation types, and generator names mirror CodeValidator and the ontology, so the grammar stays in lockstep with the engine.
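
To make the mapping concrete, here is a minimal dispatcher in the spirit of the table above. It is not PinePaper's ToolCallExecutor. Several details are assumptions made for illustration: the argument names `name`, `params` and `itemId`, the choice to strip `animationType` before calling `create`, the `findItem` lookup (the page does not say how items are resolved by id), and the shape of the returned object.

```javascript
// Illustrative dispatcher: tool name -> call on a PinePaper-like object `pp`.
// `findItem` is a hypothetical id lookup supplied by the caller.
function executeToolCalls(calls, pp, findItem = () => undefined) {
  const handlers = {
    pinepaper_create_item: ({ itemType, animationType, ...args }) => {
      const item = pp.create(itemType, args);
      if (animationType) pp.animate(item, { animationType });
      return item;
    },
    pinepaper_execute_generator: ({ name, params }) =>
      pp.executeGenerator(name, params),
    pinepaper_set_background_color: ({ color }) =>
      pp.setBackgroundColor(color),
    pinepaper_animate: ({ itemId, animationType }) =>
      pp.animate(findItem(itemId), { animationType }),
  };

  const results = [];
  const failed = [];
  for (const call of calls) {
    try {
      // Own-property check so names like "constructor" are not treated as tools.
      if (!Object.hasOwn(handlers, call.name)) {
        throw new Error(`Unknown tool: ${call.name}`);
      }
      const value = handlers[call.name](call.arguments ?? {});
      results.push({ name: call.name, value });
    } catch (error) {
      // One bad call does not stop the rest of the scene from being built.
      failed.push({ name: call.name, error: error.message });
    }
  }
  return { results, failed, ok: failed.length === 0 };
}
```

Collecting failures instead of throwing on the first one means a partly valid scene still renders, which suits small models whose output quality varies.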

Availability

The on-device Assistant is experimental and rolling out. Detect it at runtime through the public surface rather than assuming it’s present:

if (window.PinePaperClientLLM?.enabled()) {
  const caps = await window.PinePaperClientLLM.capabilities();
  // caps.promptApi / caps.webgpu / caps.translator tell you what this
  // browser can run on-device.
}

When it’s unavailable, enabled() returns false and the Assistant tab isn’t shown — fall back to the Cloud tab or the MCP server.
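
One way to fold detection and fallback into a single decision is sketched below. `choosePath` is a hypothetical helper, and the `'cloud'` return value is only a placeholder for whichever fallback your app uses. It assumes that a truthy `caps.promptApi` or `caps.webgpu` means the matching provider is usable, and it mirrors the auto order described earlier (Prompt API first, then WebLLM).

```javascript
// Decide which path to use: 'prompt-api', 'web-llm', or the 'cloud' placeholder.
async function choosePath(llm) {
  if (!llm?.enabled()) return 'cloud'; // object missing or feature switched off
  const caps = await llm.capabilities();
  if (caps.promptApi) return 'prompt-api';
  if (caps.webgpu) return 'web-llm';
  return 'cloud'; // enabled, but this browser has no usable engine
}

// Browser usage:
//   const path = await choosePath(window.PinePaperClientLLM);
//   if (path !== 'cloud') window.PinePaperClientLLM.setProvider(path);
```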

Relationship to MCP

Same tool vocabulary (pinepaper_*), two execution paths: an external agent calls these tools over the MCP server; the on-device model calls the same shapes locally through the executor. One contract, agents on either side of the browser boundary.
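
The shared contract can be shown by wrapping a local tool call in the JSON-RPC envelope that MCP uses for `tools/call`. `toMcpRequest` is a hypothetical helper; how the request is transported and how PinePaper's MCP server is reached are outside the scope of this sketch.

```javascript
// Wrap one { name, arguments } tool call in an MCP "tools/call" request.
function toMcpRequest(call, id) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: call.name, arguments: call.arguments ?? {} },
  };
}

// The same object the on-device model emits, addressed to an MCP server instead.
const request = toMcpRequest(
  { name: 'pinepaper_set_background_color', arguments: { color: '#0F0F1A' } },
  1,
);
```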

User guide: On-Device AI Assistant