A headless desktop overlay for Windows combining AI chat, local memory, agenda planning, safe action execution, and ambient workflow utilities into a single persistent desktop layer. Built in Electron. Currently in beta.
Yamada began from a simple frustration: AI tools in 2024 existed as applications — things you opened, used, and closed. They required a context switch. They lived in a browser tab or a separate window. They had no memory of your workflow, no awareness of what you were doing, and no capacity to act. They were chat interfaces dressed as productivity tools.
The design goal was different: a tool that disappears when you don't need it and appears instantly when you do. A layer that sits under your real work rather than competing with it. A companion that carries context across sessions, understands your active environment, and can take bounded actions on your behalf — without requiring you to leave your current focus.
The result is Yamada: a headless overlay built in Electron, summoned via a global keyboard shortcut, capable of AI conversation, file review, agenda management, focus timing, and guarded action execution — all within a design system precise enough to feel like a real product, not a prototype.
Three principles governed every decision in Yamada's development. Invisibility: the overlay must not feel like it occupies your desktop. When dismissed, it should leave no trace. Continuity: context must persist. Restarting Yamada should not mean starting over. Safety: a tool with execution capabilities is a tool that can cause damage. Every action pathway must be bounded and validated before it reaches the operating system.
Yamada's architecture separates concerns into four distinct layers: the Presentation Layer (Electron renderer), the Orchestration Layer (main process + execution planner), the Persistence Layer (local JSON memory + conversation history), and the Intelligence Layer (EdenAI model routing).
Electron was selected for three practical reasons: native Windows integration (global keyboard shortcuts, system tray, window manager access), a single codebase for UI and system logic, and the ability to ship a self-contained executable without runtime dependencies. The tradeoff, bundle size, was accepted given that Yamada is a persistent desktop resident, not a frequently installed utility.
The main and renderer processes are strictly separated via Electron's IPC bridge. The renderer has no direct access to Node.js APIs, which enforces a clean boundary between UI logic and system-level operations. All file system access, shortcut registration, and external API calls occur exclusively in the main process.
The overlay is implemented as a frameless, transparent Electron BrowserWindow with alwaysOnTop: true and skipTaskbar: true. On summon (Ctrl+Space), the window is shown with a cinematic CSS transition: a darkened semi-transparent backdrop with the chat surface sliding in from a neutral position. On dismiss, it hides rather than closes, preserving all in-memory state without re-initialisation cost.
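The window behaviour described above can be sketched as follows. This is a minimal illustration, not Yamada's actual code: the option names are real BrowserWindow options, but `show: false`, the `OverlayWindow` interface, and `toggleOverlay` are assumptions introduced here so the logic can be read without an Electron runtime.

```typescript
// Options named in the text. `show: false` is an added assumption so the
// window stays hidden until the first summon.
const overlayWindowOptions = {
  frame: false,
  transparent: true,
  alwaysOnTop: true,
  skipTaskbar: true,
  show: false,
};

// Minimal stand-in for the parts of Electron's BrowserWindow used here.
interface OverlayWindow {
  isVisible(): boolean;
  show(): void;
  hide(): void;
}

// Toggle visibility and report the new state. hide() keeps the renderer and
// its in-memory state alive, whereas close() would destroy the window.
function toggleOverlay(win: OverlayWindow): boolean {
  if (win.isVisible()) {
    win.hide();
  } else {
    win.show();
  }
  return win.isVisible();
}
```

In the main process the options object would be passed to `new BrowserWindow(overlayWindowOptions)`, and the resulting window satisfies the `OverlayWindow` interface as written.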
Global shortcut registration is handled via Electron's globalShortcut API, which registers the key combination with the operating system so that it fires regardless of which application has focus. This is what makes Yamada overlay-native: it can be summoned while another application, including most fullscreen ones, is in the foreground. Registration fails if another application has already claimed the same combination.
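A sketch of how registration might be handled, including the case where the combination is already taken. Electron's `globalShortcut.register` returns a boolean indicating success; everything else here is an assumption for illustration: the `ShortcutRegistrar` interface stands in for the Electron module, and the fallback accelerator is hypothetical rather than something Yamada is known to use.

```typescript
// Minimal stand-in for Electron's globalShortcut module.
interface ShortcutRegistrar {
  register(accelerator: string, callback: () => void): boolean;
}

// Try each accelerator in order and return the first one that registers,
// or null when every candidate is already taken.
// The second accelerator is a hypothetical fallback, not from the source.
function registerSummon(
  shortcuts: ShortcutRegistrar,
  onSummon: () => void,
  accelerators: string[] = ["Control+Space", "Control+Shift+Space"],
): string | null {
  for (const accelerator of accelerators) {
    if (shortcuts.register(accelerator, onSummon)) {
      return accelerator;
    }
  }
  return null;
}
```

In a real main process this would be called with the `globalShortcut` module once the app's ready event has fired, and a `null` result would be reported to the user instead of being silently ignored.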
Yamada does not bind to a single AI model. Instead, it routes requests through EdenAI's unified API, which provides access to multiple providers under a single authentication layer. This decouples the product from any single model's availability, pricing changes, or capability regressions.
Request routing is tiered by complexity and latency requirements. Standard conversation routes to Claude Sonnet as primary, with GPT-4o as automatic fallback on failure. Fast operations (intent classification, action code extraction, agenda parsing) route to Mistral for lower latency. Document summarisation (the /review command) routes to the provider that is cheapest for long-context tasks; that choice is made once at startup and held for the session.
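The tiering above could be expressed roughly as follows. This is a sketch under stated assumptions: the task names, the model labels, and both function names are hypothetical, and the labels are plain placeholders rather than EdenAI provider identifiers. The `call` parameter stands in for whatever function performs the actual API request.

```typescript
type Task = "chat" | "classify" | "extract_action" | "parse_agenda" | "review";

// Ordered list of models to try for a task. Labels are illustrative only.
function routeFor(task: Task, cheapestLongContext: string): string[] {
  switch (task) {
    case "chat":
      return ["claude-sonnet", "gpt-4o"]; // primary, then fallback
    case "classify":
    case "extract_action":
    case "parse_agenda":
      return ["mistral"]; // low-latency tier
    case "review":
      return [cheapestLongContext]; // resolved once at startup
  }
}

// Try each model in order and return the first successful response.
// If every model fails, rethrow the last error seen.
async function callWithFallback(
  models: string[],
  call: (model: string) => Promise<string>,
): Promise<string> {
  let lastError: unknown = new Error("no models configured");
  for (const model of models) {
    try {
      return await call(model);
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}
```

Keeping route selection separate from the retry loop means the routing table can be checked in isolation, without any network access.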
Each request to the intelligence layer is assembled from multiple sources before the API call is made. The assembled context includes: a system prompt defining Yamada's persona and allowed capabilities, the compact memory block (extracted from memory.json, max 2KB), recent conversation history (last N turns, configurable), current runtime state (active timer, open widgets, current date/time), and the raw user message.
This assembly is deterministic — the same inputs always produce the same context structure — which makes debugging and testing straightforward. The memory block is intentionally compressed: the memory layer does not store raw transcripts but distilled facts, preferences, and context extracted by a secondary model call after significant interactions.
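A minimal sketch of such an assembly step, assuming a chat-style message array as the output format. The type names, field names, and the `assembleContext` function are hypothetical. The current time is passed in as data rather than read from the clock inside the function, which is what keeps the output a pure function of its inputs.

```typescript
interface Turn { role: "user" | "assistant"; content: string; }
interface Message { role: "system" | "user" | "assistant"; content: string; }

interface ContextInputs {
  systemPrompt: string;   // persona and allowed capabilities
  memoryBlock: string;    // compact text derived from memory.json
  history: Turn[];
  maxTurns: number;       // the configurable N
  runtime: { activeTimer: string | null; openWidgets: string[]; now: string };
  userMessage: string;
}

// Build the request context in a fixed order. No clock reads, no randomness:
// identical inputs always yield an identical message array.
function assembleContext(input: ContextInputs): Message[] {
  // slice(-0) would return the whole array, so guard the zero case.
  const recent = input.maxTurns > 0 ? input.history.slice(-input.maxTurns) : [];
  const runtimeLines = [
    `now: ${input.runtime.now}`,
    `active_timer: ${input.runtime.activeTimer ?? "none"}`,
    `open_widgets: ${input.runtime.openWidgets.join(", ") || "none"}`,
  ];
  return [
    { role: "system", content: input.systemPrompt },
    { role: "system", content: `Memory:\n${input.memoryBlock}` },
    { role: "system", content: `Runtime state:\n${runtimeLines.join("\n")}` },
    ...recent,
    { role: "user", content: input.userMessage },
  ];
}
```

Because the function has no side effects, a test can snapshot its output for a fixed input and compare it across runs.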
The memory system is one of Yamada's most technically interesting components. It solves a specific problem: LLMs are stateless, but users are not. A companion that forgets everything between sessions is a chat interface, not a companion.
Yamada's approach avoids the complexity and cost of a vector database for a single-user desktop application. Instead, it uses a two-tier local JSON store.
history.json stores a rolling window of conversation turns. The window is bounded (default: last 40 turns) to prevent unbounded growth. Older turns are summarised and promoted to the memory tier rather than deleted — each summary is generated by a fast model call that extracts factual content from the discarded turns and merges it into the memory document.
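The rolling window with promotion can be sketched like this. The function names are hypothetical, and `summarise` is a stand-in for the fast model call that the text describes; how its result is merged into the memory document is left to the caller.

```typescript
interface Turn { role: "user" | "assistant"; content: string; }

// Split history into the turns to keep and the overflow to summarise.
function trimHistory(
  turns: Turn[],
  maxTurns = 40,
): { kept: Turn[]; overflow: Turn[] } {
  const excess = Math.max(0, turns.length - maxTurns);
  return { kept: turns.slice(excess), overflow: turns.slice(0, excess) };
}

// Trim the window and, if anything fell out of it, ask the summariser for a
// distilled version to promote to the memory tier.
async function rollHistory(
  turns: Turn[],
  summarise: (overflow: Turn[]) => Promise<string>,
  maxTurns = 40,
): Promise<{ history: Turn[]; promoted: string | null }> {
  const { kept, overflow } = trimHistory(turns, maxTurns);
  const promoted = overflow.length > 0 ? await summarise(overflow) : null;
  return { history: kept, promoted };
}
```

The summariser is only invoked when turns actually leave the window, so a short conversation never triggers the extra model call.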
memory.json is a structured document capped at 2KB. It stores: user-stated preferences, recurring topics, confirmed facts about the user's context, and named entities (projects, people, locations) that have appeared in conversation. The schema is intentionally loose — a flat key-value store with timestamps — allowing the model to write and read it naturally without rigid parsing logic.
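One way to picture a flat, timestamped store with a hard size cap is shown below. The entry shape and function names are hypothetical, and the eviction policy (drop the least recently updated entries first) is an assumption made for this sketch; the source only states that the document is capped at 2KB.

```typescript
interface MemoryEntry { value: string; updatedAt: string; } // ISO 8601, UTC
type MemoryDoc = Record<string, MemoryEntry>;

const MEMORY_CAP_BYTES = 2048;

// Size of the document as it would be written to disk (UTF-8 JSON).
function byteSize(doc: MemoryDoc): number {
  return new TextEncoder().encode(JSON.stringify(doc)).length;
}

// Write or refresh one fact, then evict the oldest entries until the
// serialised document fits under the cap. Returns a new document.
function remember(doc: MemoryDoc, key: string, value: string, now: string): MemoryDoc {
  const next: MemoryDoc = { ...doc, [key]: { value, updatedAt: now } };
  // Same-format ISO timestamps sort chronologically as plain strings.
  const oldestFirst = Object.keys(next).sort((a, b) =>
    next[a].updatedAt.localeCompare(next[b].updatedAt),
  );
  for (const candidate of oldestFirst) {
    if (byteSize(next) <= MEMORY_CAP_BYTES) break;
    delete next[candidate];
  }
  return next;
}
```

Measuring the encoded byte length rather than the string length matters here, since non-ASCII text takes more than one byte per character in UTF-8.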
Yamada's execution layer is the most security-sensitive component in the system. Giving an AI assistant the ability to take actions on a user's system is a meaningful attack surface: a model that can be prompted into executing arbitrary commands is not a productivity tool, it is a vulnerability.
The execution layer addresses this through a strict allowlist architecture. Yamada does not pass model output to a shell interpreter. Instead, the model is prompted to return a structured action code when it detects applicable intent, and that code is validated against a static registry of permitted operations before any action is taken.
The action registry defines every operation Yamada is permitted to execute. Each entry specifies the action identifier, its parameter schema, a human-readable description, and a risk level. Actions above risk level 1 require an explicit user confirmation step before execution.
When the model returns an action code, the execution planner runs three sequential checks: schema validation (does it match a registered action's parameter types?), parameter sanitisation (are file paths within permitted directories?), and risk evaluation (does this action require explicit user confirmation?). An action that fails schema validation or parameter sanitisation is never executed; it is surfaced to the user as an explanation of what was requested and why it was blocked. An action that passes both runs immediately at or below risk level 1 and waits for explicit user confirmation above that level.
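The registry and the three checks described above might look something like the following. Everything named here is hypothetical: the two registry entries, the `Verdict` shape, and the simplified `isInside` helper, which normalises Windows-style paths by hand so the sketch stays self-contained. A real implementation would more likely rely on Node's path resolution and would also need to account for symlinks.

```typescript
type ParamType = "string" | "number" | "path";
interface ActionSpec { description: string; params: Record<string, ParamType>; risk: number; }

// Hypothetical registry entries, for illustration only.
const REGISTRY: Record<string, ActionSpec> = {
  start_timer: { description: "Start a countdown", params: { minutes: "number" }, risk: 1 },
  open_file: { description: "Open a file", params: { target: "path" }, risk: 2 },
};

type Verdict =
  | { status: "execute" | "confirm"; action: string }
  | { status: "blocked"; reason: string };

// Resolve "." and ".." by hand, ignore case, and compare path segments.
function isInside(root: string, candidate: string): boolean {
  const norm = (p: string): string[] | null => {
    const out: string[] = [];
    for (const seg of p.replace(/\\/g, "/").toLowerCase().split("/")) {
      if (seg === "" || seg === ".") continue;
      if (seg === "..") {
        if (out.length === 0) return null;
        out.pop();
      } else {
        out.push(seg);
      }
    }
    return out;
  };
  const r = norm(root);
  const c = norm(candidate);
  return r !== null && c !== null && r.every((seg, i) => c[i] === seg);
}

function validate(action: string, params: Record<string, unknown>, allowedRoots: string[]): Verdict {
  if (!Object.prototype.hasOwnProperty.call(REGISTRY, action)) {
    return { status: "blocked", reason: `unknown action "${action}"` };
  }
  const spec = REGISTRY[action];
  const expected = Object.keys(spec.params);
  const given = Object.keys(params);
  // 1. Schema validation: exact parameter names and expected primitive types.
  if (given.length !== expected.length || !expected.every((k) => given.indexOf(k) !== -1)) {
    return { status: "blocked", reason: "parameters do not match the schema" };
  }
  for (const name of expected) {
    const value = params[name];
    const ok = spec.params[name] === "number"
      ? typeof value === "number" && isFinite(value)
      : typeof value === "string";
    if (!ok) return { status: "blocked", reason: `parameter "${name}" has the wrong type` };
  }
  // 2. Parameter sanitisation: every path must sit under a permitted root.
  for (const name of expected) {
    if (spec.params[name] !== "path") continue;
    const target = params[name] as string;
    if (!allowedRoots.some((root) => isInside(root, target))) {
      return { status: "blocked", reason: `path "${target}" is outside permitted directories` };
    }
  }
  // 3. Risk evaluation: anything above level 1 waits for the user.
  return { status: spec.risk > 1 ? "confirm" : "execute", action };
}
```

The lookup uses `hasOwnProperty` rather than a plain index so that action names such as `constructor` cannot resolve to inherited object properties.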
The /review command opens a directory picker and loads the selected path into a structured file browser within the overlay. PDF files are extracted to plain text for AI summarisation. The review session persists across overlay close/reopen cycles within the same Yamada session — closing the overlay does not lose a review-in-progress.
Timer commands create a persistent floating countdown bound to the overlay's widget rail. The timer continues running when the overlay is dismissed. A configurable notification fires on completion. Pomodoro mode implements the standard 25/5 interval pattern with session tracking stored in history.
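A timer that keeps running while the overlay is hidden is easiest to reason about if its state is derived from timestamps rather than from UI ticks. The sketch below takes that approach for the 25/5 pattern; it is an assumption about one workable design, not a description of Yamada's implementation, and the names are hypothetical.

```typescript
const WORK_MS = 25 * 60 * 1000;
const BREAK_MS = 5 * 60 * 1000;

interface PomodoroState {
  phase: "work" | "break";
  remainingMs: number;       // time left in the current phase
  completedSessions: number; // work intervals fully finished
}

// Derive the current phase purely from when the run started and the current
// time, so nothing depends on the overlay being visible or rendering.
function pomodoroAt(startedAtMs: number, nowMs: number): PomodoroState {
  const elapsed = Math.max(0, nowMs - startedAtMs);
  const cycle = WORK_MS + BREAK_MS;
  const cycles = Math.floor(elapsed / cycle);
  const intoCycle = elapsed % cycle;
  if (intoCycle < WORK_MS) {
    return { phase: "work", remainingMs: WORK_MS - intoCycle, completedSessions: cycles };
  }
  return { phase: "break", remainingMs: cycle - intoCycle, completedSessions: cycles + 1 };
}
```

With this shape the widget only has to call `pomodoroAt(startedAt, Date.now())` whenever it redraws, and the completion notification can be scheduled from `remainingMs`.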
The widget rail is a pinned sidebar layer within the overlay that renders configurable widgets: a local agenda calendar, a live clock, a note surface, and a focus-mode planner. Agenda events are stored in agenda.json and are injected into the AI context at the start of each session, giving the model awareness of scheduled commitments when planning or responding.
Ten themes ship with Yamada, each defined as a complete CSS variable override set. Themes affect every visible surface — the backdrop, chat surface, widget rail, syntax highlighting in code blocks, and status indicators. Theme selection is stored in config and applied on startup without delay.
Yamada's threat model is that of a personal productivity tool, not a multi-user system. The primary threat vectors are: prompt injection attacks (a malicious document processed by /review attempts to exfiltrate data or execute commands), local data exposure (memory and history files contain sensitive context), and model output misinterpretation (the execution layer misreads a non-action response as an action code).
Mitigations in place: the execution layer validates all action codes against the static registry before any execution (no free-form command execution path exists); document content ingested by /review is sanitised and passed as read-only context, not as executable instructions; local data files are stored in the OS-standard application data directory with standard user-level permissions; the renderer process has contextIsolation enabled and nodeIntegration disabled, preventing renderer-level code from accessing Node APIs directly.
Known gaps in the current beta: API keys are stored in config.json in plaintext (encryption planned for 1.0); the /review path picker does not enforce directory sandboxing, meaning a user can technically open any accessible path on the filesystem.
Yamada is in public beta with a working feature set across all described systems. The product is distributed via direct download from chunchunmaru.yijie.space and has a growing user base of early adopters. The core architecture is stable; the focus of remaining beta work is edge case handling, performance optimisation on lower-end hardware, and the implementation of API key encryption.
Roadmap targets for 1.0 release include: encrypted credential storage, plugin architecture for third-party widget types, deeper OS integration (clipboard access, active window context), and a Linux port via the same Electron base.