Harmonica 0.4: a runtime for agentic facilitation

In the last post, we argued that facilitation is the missing primitive: that the most valuable organizational knowledge has to be produced through structured interaction, not just captured, and that facilitation is the infrastructure that produces it. That post made the argument. It left one thing unsaid: a primitive needs something to run it. An AI interviewer can elicit deeper answers in conversation, but it can’t run a workflow.

We mean “runtime” in the software sense: the system that executes a program. You write the logic once, and the runtime runs it — managing the moving parts and handing back a result. A facilitation methodology is that kind of logic: a defined sequence of stages, each with its own rules, and a clear sense of what carries into the next. The question this post is about is what it takes to actually run one — not stage it by hand every time, but execute it the way a runtime executes code.

The first version of Harmonica was an AI interviewer. You describe what you want to understand, share a link, and the AI has a one-on-one structured conversation with each participant. Each person answers questions, gets follow-ups based on what they said, and you get a synthesis at the end. It’s better than a survey because it adapts; it isn’t a method because each conversation runs in its own bubble, and when the session ends, the facilitator resets.

That ceiling was the constraint. A Delphi panel needs at least two rounds: gather opinions, reveal the spread, then run a second pass with the first round as context. A Wardley mapping exercise sequences stages and carries output between them. Retrospectives are only useful if something accumulates across them. None of that was possible when each session stood alone. You could run a conversation. You couldn’t run a method.

0.4 removes that ceiling.

The engine

Sessions chain into multi-step workflows. You pick a framework; Harmonica structures each stage as a facilitated session with its own role assignments, its own facilitator prompt, and context inherited from the prior stage. When everyone in a stage finishes, the chain advances automatically — the next stage opens already holding what the previous one produced. Assigned participants get a signed-link email when it’s their turn; a notification center tracks chain progression, action-required items, and completions for the host. After the run, visitors to the results page can walk every step and see how the collective output evolved from stage to stage. Sixteen methods ship as ready-to-run templates: eight multi-step chains (Wardley Mapping, Six Thinking Hats, Cynefin Sensemaking, Delphi panel, Theory of Change, Three Horizons, Team Topologies Assessment, Example Mapping) and eight single-step sessions (Appreciative Inquiry, SWOT analysis, Retrospective, Change Readiness Assessment, Force Field Analysis, Impact Assessment, Risk Assessment, Stakeholder Analysis).
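To make the advance rule concrete, here is a minimal sketch of a chain that moves forward only once every assigned participant in the current stage has responded, and that hands the stage's output to the next one. It is an illustration, not Harmonica's implementation: the Stage and Chain classes, the submit method, and the plain-text join standing in for a real synthesis are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    name: str
    participants: set[str]
    inherited_context: str = ""  # output carried in from the prior stage
    responses: dict[str, str] = field(default_factory=dict)

    def is_complete(self) -> bool:
        # A stage is done when every assigned participant has responded.
        return set(self.responses) == self.participants

    def output(self) -> str:
        # Stand-in for a real synthesis: join responses in a stable order.
        return "\n".join(f"{p}: {self.responses[p]}" for p in sorted(self.responses))


@dataclass
class Chain:
    stages: list[Stage]
    current: int = 0

    def submit(self, participant: str, response: str) -> None:
        stage = self.stages[self.current]
        if participant not in stage.participants:
            raise ValueError(f"{participant} is not assigned to {stage.name}")
        stage.responses[participant] = response
        has_next = self.current + 1 < len(self.stages)
        if stage.is_complete() and has_next:
            # Advance, and open the next stage holding the previous output.
            self.current += 1
            self.stages[self.current].inherited_context = stage.output()


chain = Chain([
    Stage("Round 1", {"ana", "ben"}),
    Stage("Round 2", {"ana", "ben"}),
])
chain.submit("ana", "Ship in Q3")
chain.submit("ben", "Ship in Q4")
print(chain.stages[chain.current].name)
print(chain.stages[chain.current].inherited_context)
```

The point of the sketch is the invariant: a stage never opens empty, and nobody has to trigger the handoff by hand.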

Cross-pollination happens during the conversation, not only at the summary. As participants respond, the facilitator surfaces what others have said — recurring themes, emerging tensions, angles the current participant hasn’t yet considered — without revealing who said what. The synthesis starts building live.

Every summary now has the participant’s exact words sitting behind each claim. When the synthesis says the team is worried about onboarding time, you open the claim and read the sentences participants actually wrote — attributed pseudonymously as Participant 3, Participant 7. The quotes are verbatim substrings, never paraphrased, never generated. The feature is free on every session. Hosts control who sees the results page — public, link-only, or restricted — set at creation or changed any time after.
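The "verbatim substring" rule is easy to state as a check. The sketch below assumes quotes arrive as (participant label, quote) pairs and transcripts as a mapping from label to text. Both shapes, and the function name verbatim_quotes, are hypothetical and not taken from Harmonica's code.

```python
def verbatim_quotes(quotes, transcripts):
    """Keep only quotes that appear word for word in the named participant's transcript."""
    kept = []
    for participant, quote in quotes:
        transcript = transcripts.get(participant, "")
        # A quote passes only if it is a non-empty exact substring.
        if quote and quote in transcript:
            kept.append((participant, quote))
    return kept


transcripts = {
    "Participant 3": "Onboarding took me six weeks. That felt too long.",
}
candidate_quotes = [
    ("Participant 3", "Onboarding took me six weeks."),  # verbatim
    ("Participant 3", "Onboarding is slow."),            # paraphrase, rejected
    ("Participant 7", "Onboarding took me six weeks."),  # wrong speaker, rejected
]
evidence = verbatim_quotes(candidate_quotes, transcripts)
print(evidence)
```

A filter like this is what separates evidence from paraphrase: anything a model rewrote or invented fails the substring test and never reaches the summary.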

The facilitator itself is layered now. Previously the facilitation logic lived in a single prompt — one blob of text covering everything. Now it composes: a base layer with your organizational context (seeded from onboarding as HARMONICA.md), a method layer with the logic specific to the framework you’re running, and a per-session layer you attach at setup — a research PDF, a prior session’s summary, an MCP tool the facilitator should consult during the conversation. The prompts stack. The facilitator isn’t starting from zero each time.
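As a rough sketch of what "the prompts stack" could mean in practice, the function below joins a base layer, a method layer, and any per-session attachments in a fixed order. The function name, the bracketed section labels, and the sample text are assumptions made for illustration. The real composition logic is not described in this post.

```python
def compose_prompt(base: str, method: str, session_layers: list[str]) -> str:
    """Stack prompt layers from most general to most specific."""
    layers = [("Base", base), ("Method", method)]
    layers += [
        (f"Session attachment {i}", text)
        for i, text in enumerate(session_layers, start=1)
    ]
    # Skip empty layers so a missing attachment leaves no blank section.
    return "\n\n".join(
        f"[{label}]\n{text.strip()}" for label, text in layers if text.strip()
    )


prompt = compose_prompt(
    base="We are a 40-person co-op.",  # e.g. the contents of HARMONICA.md
    method="Run a two-round Delphi panel.",
    session_layers=["Summary of the prior session: budget is the open question."],
)
print(prompt)
```

Ordering from general to specific means the per-session layer is read last, so the most specific context sits closest to the conversation.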

What it runs

The methodology library is wide enough to be pointed at different problems. Three that come up most in practice:

Strategy, with Wardley mapping as the flagship output. Run the Wardley chain and the discussion produces an actual map — components placed on the evolution axis, drawn from what the group said. The output renders as Mermaid’s open-source Wardley syntax (v11.14.0+): standard, portable text, not a diagram locked into our tool. You correct it when it gets something wrong, then take it anywhere Mermaid renders. A strategy conversation produces the artifact a strategy conversation is supposed to produce, without a separate mapping session after the fact. More on Wardley mapping with Harmonica.

Memory, so the work compounds. Most retrospectives are run as projects — budget, deliverable, a report nobody reads — which is why they’re the first thing cancelled when the calendar tightens. The cost of skipping them is invisible week to week and severe over years: the same debates return, context evaporates every time a cohort rotates out. What organizations actually want is closer to what brains do at night. Chained retros are cheap enough to run every cycle, and their output accumulates — each retro producing typed decisions with the evidence behind them, action items, and emerging tensions, rather than a summary that flattens what happened. Those become atomic, linked notes the rest of the organization can browse later, with a periodic pass that surfaces recurring themes. The full argument is in Retrospectives shouldn’t be projects. For a change manager or consultant, it’s the difference between client work that ends with a deck and client work that compounds.
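One way to picture typed retro output and the periodic pass is the sketch below. The Decision and RetroNote shapes and the recurring_tensions function are hypothetical. They illustrate the idea of structured notes that can be compared across cycles, and are not Harmonica's actual schema.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Decision:
    text: str
    evidence: list[str]  # verbatim participant quotes behind the decision


@dataclass
class RetroNote:
    cycle: str
    decisions: list[Decision] = field(default_factory=list)
    action_items: list[str] = field(default_factory=list)
    tensions: list[str] = field(default_factory=list)


def recurring_tensions(notes: list[RetroNote], min_count: int = 2) -> list[str]:
    """Return tensions that show up in at least min_count separate retros."""
    counts = Counter()
    for note in notes:
        # Count each tension once per retro, ignoring case.
        counts.update({t.lower() for t in note.tensions})
    return sorted(t for t, n in counts.items() if n >= min_count)


notes = [
    RetroNote("Sprint 1", tensions=["Scope creep", "Unclear ownership"]),
    RetroNote("Sprint 2", tensions=["scope creep"]),
    RetroNote("Sprint 3", tensions=["Flaky tests"]),
]
print(recurring_tensions(notes))
```

Because each note is typed rather than flattened into prose, the pass across cycles is a simple aggregation instead of a rereading exercise.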

Public sensemaking. The Public Sensemaking Package is a template for running large-scale structured interviews, synthesizing them to a public dashboard, and linking every claim back to the interview it came from — no login required to read the results. Metagov’s gov/acc working group used it to map their research domain: who is working on what, which problems converge, where the gaps are. Harmonica ran 30 structured interviews across the researcher community, each following a consistent flow with follow-ups adapted per person. The result: 11 convergent problems, 31 solutions in progress or proposed, 43 actors across the space — every entry auditable back to its source. Full case study. Artem made the longer case for this kind of work as civic infrastructure in a talk for OpenCivics: AI-facilitated sensemaking as civic infrastructure.

A runtime for agentic facilitation

If facilitation is a primitive, something has to run it — reliably, repeatably, for a group, not just for one person. That’s the shift 0.4 completes, and the pieces above aren’t really a bigger feature set; they’re the parts of a runtime. The engine executes a method across stages. The layered prompt is the method’s composable program — a base layer, a method layer, a per-session layer that stack. Structured session knowledge is the typed output it returns. Cross-pollination and the evals below are its runtime behavior and its guarantees.

That reliability is what lets the facilitator behave like an agent rather than a script. By agentic facilitation we mean a facilitator that does more than ask and record: it draws on the tools a session gives it, surfaces what other participants have said when the discussion calls for it, sequences a method across stages and carries context between them, and adapts its follow-ups to what’s actually been said. An agent needs an environment to act in — the runtime is that environment.

A runtime is only as useful as what can call it — so a method doesn’t have to start inside Harmonica. Through the open MCP server, any AI agent can hand Harmonica a method to run, tailored to whatever context that agent already holds. Drop the harmonica-chat skill into a coding agent and “run a retro on the API redesign” becomes a session pre-loaded with your actual project — the agent reads your context, picks the method, and launches it; Harmonica runs the group part. A researcher already did this: Maria Milosh implemented a novel cross-pollination method entirely as agent orchestration on top of the MCP, with no changes to the platform. The method was a spec the runtime executed, not a feature we shipped.

That’s also why the outputs are portable — a Wardley session emits open Mermaid text, not a diagram trapped in our tool. The method is a portable thing; Harmonica is where it runs. Which raises a question we’re now leaning into: if a facilitation method is just a well-specified program, it doesn’t have to be ours. The format can be open — methods anyone can write, fork, and adapt, with Harmonica as the runtime that knows how to execute them. More on that soon.

That’s one frontier — who gets to write the methods. The other is how much the facilitator can do on its own, and we’ll be honest that 0.4 is the foundation there, not the finish. Today’s facilitator follows a strong design and reaches for a tool when a session points it at one; the agentic part is real but early. The work ahead is a facilitator that decides more for itself: when to save an insight, when to pull a thread of cross-pollination, when a conversation has gone somewhere it should hand back to a human host, and how to manage what it carries from one turn to the next. None of that is possible until the layer underneath can execute, sequence, and measure reliably — which is what a runtime is, and what 0.4 puts in place. The runtime comes first; the more autonomous facilitator is what it’s for.

How it gets better

A method works as well as the facilitator running it. The Review tab on the results page gives you an AI critique of how the facilitator actually behaved during the session — where it stayed on one question too long, where it skipped a handoff — each finding backed by the participant’s own words. Where the review proposes a rule, you apply it in one click and it’s in for next time. Every prompt edit is tracked — who made it, when, and which surface it came from — so when a session behaves differently, you know what changed.

Underneath that is a measurement layer that runs each of Harmonica’s facilitator prompts against a scored rubric, with criteria like “does the facilitator ask one question per turn?” or “does the closing turn check whether the participant is satisfied?”, each scored by a second LLM. Not every behavior fits a rubric: some rules only trigger deep in a conversation, where a per-message scoring call would add too much latency. So the layer uses three instrumentation types, each matched to a different class of behavior: detector, judge, and smoke. The runtime doesn’t ask you to trust it. It gives you something to check. And that’s not just quality hygiene. It’s what makes agentic facilitation safe to push further: you can’t responsibly let a facilitator act more on its own without the instruments to see and score what it does. Observability and evals are the precondition for that autonomy, not an afterthought to it. The longer story is in We shipped prompt improvements against a broken scoreboard.
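For a sense of what a cheap, non-LLM check might look like, here is a sketch that scores the "one question per turn" criterion by counting question marks. Treating this as a "detector" is our assumption for illustration. The post does not define the three instrumentation types, and the function names here are hypothetical.

```python
def question_count(turn: str) -> int:
    """Crude proxy: count question marks in a single facilitator turn."""
    return turn.count("?")


def one_question_rate(facilitator_turns: list[str]) -> float:
    """Share of facilitator turns that ask at most one question."""
    if not facilitator_turns:
        return 0.0
    passing = sum(1 for turn in facilitator_turns if question_count(turn) <= 1)
    return passing / len(facilitator_turns)


turns = [
    "What worries you most about onboarding?",
    "Why is that? And since when?",  # two questions in one turn
    "Thanks, that is helpful.",
]
rate = one_question_rate(turns)
print(f"{rate:.2f}")
```

A string count like this costs nothing per message, which is why a check of this kind can run on every turn while a slower LLM judgment is kept for behavior that needs interpretation.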

The mission

Everything in 0.4 is in service of the same thing: unlocking better collective sensemaking. The last post argued facilitation is the primitive that produces organizational knowledge — and a primitive that nothing can run reliably stays an argument. The ceiling we lifted is what kept facilitation from being something you could run rather than stage: isolated conversations that couldn’t carry context forward, couldn’t sequence stages, couldn’t compound. Methods that should run quarterly got skipped because setup cost too much. Knowledge that should accumulate got lost when the facilitator changed. Insights that should be auditable got flattened into a summary nobody trusts. Harmonica exists to make that tractable. The last post made the argument; this is the runtime.

If that mission resonates, you can help keep Harmonica independent with a donation of any size — on Open Collective or Giveth. And we’re hosting a Harmonica Town Hall on June 17 — an open call to see where we are, ask questions, and shape what’s next. Join us on Luma →

If you want to go all in, the lifetime deal is still open — $499 once for every premium feature, forever. The honest version: no one knows what AI will cost a year from now, so rather than promise unlimited inference we can’t underwrite, it’s paired with bring-your-own-model — you connect your own LLM and cover usage yourself. That’s what lets “forever” mean forever. It’s for our truest supporters — people who’d rather own their access than rent it.

Pick a method and run it at app.harmonica.chat.

Book a 30-minute call
