Local Agent

tools and plugins

How to inject tools and explicit plugin instances into a local Agent

tools

Local Agents can receive an explicit tool set:

const agent = new Agent({
  id: "repo-helper",
  path: "/path/to/project",
  tools: {
    my_tool: myTool,
  },
});

These tools are available during session execution.

shell

Use @downcity/shell when the Agent should own shell execution tools:

import { Agent } from "@downcity/agent";
import { Shell } from "@downcity/shell";

const agent = new Agent({
  id: "repo-helper",
  path: "/path/to/project",
  shell: new Shell(),
});

This mounts shell_exec, shell_start, shell_status, shell_read, shell_write, shell_wait, and shell_close.

Unrestricted sandbox requests wait for a decision, and the Agent is where each one is approved or denied:

const approvals = await agent.approvals();

// Approve the first pending request...
await agent.approve({ approval_id: approvals[0].approval_id });

// ...or deny it instead.
await agent.deny({ approval_id: approvals[0].approval_id });
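
When several requests are pending, the same three calls can be wrapped in a small loop. The following is only a sketch: ApprovalGateway, PendingApproval, and resolvePendingApprovals are hypothetical names introduced here, and approval_id is the only approval field assumed, since it is the only one the example above uses.

```typescript
// Hypothetical structural view of the three Agent methods used above.
// Only approval_id is taken from the original example; other fields are unknown.
interface PendingApproval {
  approval_id: string;
}

interface ApprovalGateway<T extends PendingApproval> {
  approvals(): Promise<T[]>;
  approve(input: { approval_id: string }): Promise<unknown>;
  deny(input: { approval_id: string }): Promise<unknown>;
}

// Hypothetical helper: give every pending request exactly one decision.
async function resolvePendingApprovals<T extends PendingApproval>(
  agent: ApprovalGateway<T>,
  shouldApprove: (approval: T) => boolean,
): Promise<{ approved: string[]; denied: string[] }> {
  const approved: string[] = [];
  const denied: string[] = [];
  for (const approval of await agent.approvals()) {
    if (shouldApprove(approval)) {
      await agent.approve({ approval_id: approval.approval_id });
      approved.push(approval.approval_id);
    } else {
      await agent.deny({ approval_id: approval.approval_id });
      denied.push(approval.approval_id);
    }
  }
  return { approved, denied };
}
```

The decision itself stays in the shouldApprove callback, so the policy (a user prompt, an allowlist, and so on) remains with the caller.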

plugins

Local Agents can also receive explicit plugin instances:

import { Agent } from "@downcity/agent";
import { SkillPlugin, WebPlugin } from "@downcity/plugins";

const agent = new Agent({
  id: "repo-helper",
  path: "/path/to/project",
  plugins: [new SkillPlugin(), new WebPlugin()],
});

Registered plugin actions can be called through agent.plugins.runAction(...), and each plugin's system(context) text is injected into session prompts.

plugin action metadata

Plugins with actions automatically give the Agent two built-in tools:

  • plugin_read: reads registered plugin actions, input schemas, and examples
  • plugin_call: executes a plugin action

When the model is unsure how to call an action, it should call plugin_read({ plugin, action }) first, then call plugin_call(...). plugin_call validates payloads against the action input_schema before execution.
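
The validate-then-execute order can be pictured with a self-contained sketch. It does not use the real plugin_call implementation: SketchAction, callAction, and the hand-written parse function are hypothetical stand-ins, and the shape of the failure result is an assumption. Only the { success, data, message } result shape is taken from the examples on this page.

```typescript
// Result shape as used by the action examples on this page.
interface ActionResult {
  success: boolean;
  data?: unknown;
  message: string;
}

// Hypothetical stand-in for an action with an input schema.
interface SketchAction<Input> {
  description: string;
  // Returns the typed input, or throws with a readable reason.
  parse(payload: unknown): Input;
  execute(args: { input: Input }): Promise<ActionResult>;
}

// Sketch of the ordering only: validation runs first, execute runs only on valid input.
// How the real plugin_call reports a validation failure is an assumption here.
async function callAction<Input>(
  action: SketchAction<Input>,
  payload: unknown,
): Promise<ActionResult> {
  let input: Input;
  try {
    input = action.parse(payload);
  } catch (error) {
    return { success: false, message: `invalid payload: ${(error as Error).message}` };
  }
  return action.execute({ input });
}

// A hand-validated echo action, so the sketch needs no schema library.
const echo: SketchAction<{ text: string }> = {
  description: "Echo text",
  parse(payload) {
    if (typeof payload !== "object" || payload === null) {
      throw new Error("expected an object");
    }
    const text = (payload as { text?: unknown }).text;
    if (typeof text !== "string") {
      throw new Error("text must be a string");
    }
    return { text };
  },
  execute: async ({ input }) => ({
    success: true,
    data: { text: input.text },
    message: "echoed",
  }),
};
```

With this sketch, callAction(echo, { text: 42 }) never reaches execute, which is the property the validation step is meant to guarantee.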

Custom plugins can declare action metadata with createPlugin / createAction:

import { Agent, createAction, createPlugin } from "@downcity/agent";
import { z } from "zod";

const demo_plugin = createPlugin({
  name: "demo",
  title: "Demo",
  description: "Demo actions",
  actions: {
    echo: createAction({
      description: "Echo text",
      input_schema: {
        zod: z.object({
          text: z.string(),
        }),
        json_schema: {
          type: "object",
          required: ["text"],
          properties: {
            text: { type: "string" },
          },
        },
      },
      examples: [
        {
          title: "Echo text",
          payload: { text: "hello" },
        },
      ],
      execute: async ({ input }) => ({
        success: true,
        data: { text: input.text },
        message: "echoed",
      }),
    }),
  },
});

const agent = new Agent({
  id: "repo-helper",
  path: "/path/to/project",
  plugins: [demo_plugin],
});

The zod schema handles runtime validation. json_schema and examples are returned by plugin_read for the model or a UI. Existing plugins written as classes that extend BasePlugin still work, and they can move individual actions to createAction(...) gradually.

ChatPlugin

Use ChatPlugin when you want the SDK runtime to own long-lived chat channels:

import { Agent } from "@downcity/agent";
import { ChatPlugin, TelegramChannel } from "@downcity/plugins";

const agent = new Agent({
  id: "repo-helper",
  path: "/path/to/project",
  plugins: [
    new ChatPlugin({
      channels: [
        new TelegramChannel({
          env: {
            TELEGRAM_BOT_TOKEN: process.env.TELEGRAM_BOT_TOKEN,
          },
        }),
      ],
    }),
  ],
});

Each channel object owns its own env and credential parsing, so ChatPlugin only manages lifecycle, queue, and actions.

ImagePlugin

Use ImagePlugin when you want an Agent to generate images during a conversation:

import { Agent } from "@downcity/agent";
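import { ImagePlugin } from "@downcity/plugins";

// `model` and `city` are assumed to be created elsewhere in your application;
// they are not defined in this snippet.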
const agent = new Agent({
  id: "creative-agent",
  path: "/path/to/project",
  model,
  plugins: [
    new ImagePlugin({
      list_models: async () => {
        const catalog = await city.ai.listModels();
        return catalog.forModality("image");
      },
      image_create: (input) => city.ai.image_create(input),
      image_result: (input) => city.ai.image_result(input),
    }),
  ],
});

After registration, the Agent automatically gets the built-in plugin_read / plugin_call tools. The model can inspect action metadata first:

await plugin_read({
  plugin: "image",
  action: "image_create",
});

Image creation consumes provider quota. The Agent should ask the user to explicitly confirm the exact image creation or edit request before calling image_create.

After confirmation, it can use the job-style actions directly:

const job = await plugin_call({
  plugin: "image",
  action: "image_create",
  payload: {
    model: "image-model-id",
    prompt: "A cinematic illustration of a rainy city corner at night",
    aspect_ratio: "16:9",
  },
});

await plugin_call({
  plugin: "image",
  action: "image_result",
  payload: {
    job_id: job.data.job_id,
  },
});

For reference images or image edits, use content:

const job = await plugin_call({
  plugin: "image",
  action: "image_create",
  payload: {
    content: [
      { type: "text", text: "Change this image to a white studio background" },
      { type: "image", url: "./input.png" },
    ],
  },
});

prompt and content are the two public input formats exposed to the Agent. Use prompt for text-only image generation. Use content whenever the request includes reference images, image edits, or multiple context parts. If both are present, content wins and prompt is not forwarded downstream.
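
The precedence rule can be written down as a small function. This is a sketch, not ImagePlugin's code: ContentPart, ImageCreatePayload, and normalizeImageInput are hypothetical names, and two details are assumptions made for illustration (an empty content array is treated as absent, and a prompt is represented as a single text part).

```typescript
// Hypothetical types mirroring the two payload styles shown above.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; url: string };

interface ImageCreatePayload {
  model?: string;
  prompt?: string;
  content?: ContentPart[];
}

// Sketch of the precedence rule: when content is present it wins and prompt is dropped.
// Assumption: an empty content array counts as "not present".
function normalizeImageInput(payload: ImageCreatePayload): ContentPart[] {
  if (payload.content && payload.content.length > 0) {
    return payload.content;
  }
  if (payload.prompt !== undefined && payload.prompt.trim() !== "") {
    // Assumption: a text-only prompt becomes one text part.
    return [{ type: "text", text: payload.prompt }];
  }
  throw new Error("image_create needs either prompt or content");
}
```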

content[].url can be an online URL, a local absolute path, or a path relative to the Agent project root. Local images are read by ImagePlugin and converted into the data URL format required by the City image job, so the model does not need to pass base64. The Agent should not pass messages or data_url.
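
The three accepted forms of content[].url can be told apart with standard Node.js path handling. The sketch below is illustrative only and is not ImagePlugin's resolution code: ImageRef, resolveImageRef, and toDataUrl are hypothetical names, and treating only http(s) URLs as "online" is an assumption.

```typescript
import { isAbsolute, resolve } from "node:path";

// Hypothetical result type for this sketch; not an ImagePlugin export.
type ImageRef =
  | { kind: "remote"; url: string }
  | { kind: "local"; path: string };

// Classify a content[].url value and resolve relative paths against the project root.
// Assumption: only http:// and https:// values count as online URLs.
function resolveImageRef(url: string, projectRoot: string): ImageRef {
  if (/^https?:\/\//i.test(url)) {
    return { kind: "remote", url };
  }
  return { kind: "local", path: isAbsolute(url) ? url : resolve(projectRoot, url) };
}

// Encode already-read file bytes as a data URL; the caller supplies the MIME type.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  return `data:${mimeType};base64,${Buffer.from(bytes).toString("base64")}`;
}
```

Under these assumptions, "./input.png" with a project root of /path/to/project resolves to a local file inside that root, while an https URL is passed through untouched.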

Call the models action first when the Agent needs to look up the available image model IDs.

image_result reads the current job state once by default. If it returns queued or running, keep the job_id and call image_result again later. For short jobs, pass until_done: true to wait for a terminal state in one tool call, with optional max_wait_ms / poll_interval_ms. If it returns succeeded, data is the AI SDK UIMessage containing generated image file parts.
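
A caller that prefers to poll on its own schedule can wrap the single-read behavior in a loop. The sketch below mirrors the idea behind until_done but is not the plugin's implementation: waitForJob and JobSnapshot are hypothetical, the status field name is an assumption, and any state other than queued or running is treated as terminal.

```typescript
// Hypothetical snapshot shape; the `status` field name is an assumption.
interface JobSnapshot {
  status: string;
  data?: unknown;
}

// Poll a single-read function until the job leaves queued/running or time runs out.
// On timeout the last (still pending) snapshot is returned, so the caller can retry later.
async function waitForJob(
  readOnce: () => Promise<JobSnapshot>,
  options: { max_wait_ms: number; poll_interval_ms: number },
): Promise<JobSnapshot> {
  const deadline = Date.now() + options.max_wait_ms;
  for (;;) {
    const snapshot = await readOnce();
    const pending = snapshot.status === "queued" || snapshot.status === "running";
    if (!pending || Date.now() >= deadline) {
      return snapshot;
    }
    // Sleep before the next read.
    await new Promise<void>((done) => setTimeout(done, options.poll_interval_ms));
  }
}
```

In practice readOnce would wrap one image_result call for a fixed job_id; here it is left abstract so the loop can be exercised without a provider.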

ImagePlugin is provider-agnostic: it has no direct knowledge of City, OpenAI, or DeepSeek. You pass in the image_create and image_result job functions, and the plugin exposes the two-step job actions to the Agent. By default, the product, Agent, or caller decides when to poll again based on poll_after_ms; until_done is only a simple plugin-level wait helper.

“Resolved provider input” means this: the Agent only passes simple prompt or content; ImagePlugin reads local files into data URLs and converts content into ImagePluginResolvedInput.messages before calling image_create(input). This is the internal boundary from ImagePlugin to City / provider adapters, not a third Agent-facing payload format.

Generated image file parts are always materialized under .downcity/resources/ and merged into the final assistant message automatically. They are referenced by Agent-root relative paths such as .downcity/resources/.... Relative file URLs in generated file parts are resolved from the Agent project root before materialization. The plugin_call result also includes files[].relative_path (the Agent-root relative path) and files[].path (the absolute local file path).

plugin HTTP

If Downcity exposes the Agent HTTP gateway, plugins that register routes through plugin.http.server.register(...) are automatically mounted onto the same HTTP app.

So:

  • plugin existence comes from the plugins array you pass into Agent
  • plugin HTTP exposure happens when Downcity publishes the Agent HTTP gateway
