Use case · AI agents

Give your agent a real image rendering tool.

DALL-E hallucinates layouts. Midjourney can't hit a brand spec. For agents that need a logo on a card, a chart from state, or a certificate per user action — you need deterministic rendering. MCP-native, REST-callable, CLI-shellable. Pick your wiring.

The problem

LLM image generation (DALL-E, Midjourney, Imagen) is creative but non-deterministic. Ask for “a card with our logo in the top left, headline in 72px Inter, gradient background” and you get an approximation — different every call, never quite on-brand, text often garbled.

It's also expensive (~$0.04+ per image) and slow (5-30s). For an agent that needs to issue 1,000 certificates, render a chart from query results, or stamp an OG image per blog post — that's a non-starter. You need rendering, not generation.

Three ways to give your agent the tool

1. MCP server (native)

Claude Desktop / Code, Cursor, Windsurf, Cline

Install @codetoimage/mcp-server via npx. Two intent-aware tools appear in your client: render_html_to_image (inline base64) and render_html_to_url (24h hosted). The model picks the right one. Listed in Anthropic's official MCP Registry.

2. REST API (any agent loop)

LangChain, custom GPT, n8n HTTP node

POST to /v1/render with HTML and dimensions, get bytes or a hosted URL back. Wrap it as a LangChain Tool, an OpenAI function-calling schema, or an n8n HTTP step. Same credit pool as MCP.

3. CLI shell (anything that can exec)

n8n Execute Command, Make, shell agents

Run npx @codetoimage/cli render from any process. n8n Execute Command, Make custom modules, shell-tool agents, GitHub Actions: if it can spawn a process, it can render images.

MCP setup — Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "codetoimage": {
      "command": "npx",
      "args": ["-y", "@codetoimage/mcp-server"],
      "env": {
        "CODETOIMAGE_API_KEY": "cti_live_..."
      }
    }
  }
}

Restart Claude Desktop. Now try a prompt like:

“Render a 1200x630 card with a dark gradient background, our logo bottom-left, and the headline ‘Q4 revenue up 38%’ in Inter 72px. Return a hosted URL so I can post it to Slack.”

Claude writes the HTML, picks render_html_to_url, returns the link. No prompt engineering, no tool selection hints needed.

Non-MCP agent — LangChain Tool wrapper

For LangChain, OpenAI function-calling, or any agent framework that takes Tool definitions, wrap the REST API:

// tools/renderImage.ts
import { tool } from "@langchain/core/tools";
import { z } from "zod";

export const renderImage = tool(
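  async ({ html, width, height, format }) => {
    // POST the HTML to the render endpoint; the API key is read from the environment.
    const res = await fetch("https://api.codetoimage.app/v1/render", {
      method: "POST",
      headers: {
        "X-API-Key": process.env.CODETOIMAGE_API_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        html,
        // Fall back to a 1200x630 PNG when the model omits these fields.
        width: width ?? 1200,
        height: height ?? 630,
        format: format ?? "png",
        output: "url", // ask for a hosted URL rather than raw bytes
      }),
    });
    // Surface HTTP failures to the agent instead of returning an undefined URL.
    if (!res.ok) throw new Error(`render failed: ${res.status}`);
    const { url } = await res.json();
    return url;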
  },
  {
    name: "render_image",
    description:
      "Render HTML/CSS to a pixel-perfect image. Returns a 24h hosted URL. Use for cards, certificates, OG images, charts — anything where layout precision matters.",
    schema: z.object({
      html: z.string().describe("Full HTML with inline CSS or <style> block"),
      width: z.number().optional(),
      height: z.number().optional(),
      format: z.enum(["png", "jpeg", "webp"]).optional(),
    }),
  }
);

Same shape works for OpenAI function calling — replace the wrapper with a JSON schema and call from your agent loop.
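As a rough sketch of that swap, the zod schema from the wrapper above can be restated as JSON Schema in the Chat Completions `tools` format. The helper name `toRenderBody` is an illustrative assumption, not part of any SDK; it only turns the model's tool-call arguments into the same request body the LangChain wrapper sends, with the same defaults.

```typescript
// JSON Schema equivalent of the zod schema above, in the Chat Completions `tools` format.
const renderImageTool = {
  type: "function",
  function: {
    name: "render_image",
    description:
      "Render HTML/CSS to a pixel-perfect image. Returns a 24h hosted URL.",
    parameters: {
      type: "object",
      properties: {
        html: { type: "string", description: "Full HTML with inline CSS or <style> block" },
        width: { type: "number" },
        height: { type: "number" },
        format: { type: "string", enum: ["png", "jpeg", "webp"] },
      },
      required: ["html"],
    },
  },
} as const;

// Hypothetical helper: convert the tool call's `arguments` JSON string
// into the body for POST /v1/render, applying the wrapper's defaults.
function toRenderBody(argumentsJson: string) {
  const args = JSON.parse(argumentsJson) as {
    html?: unknown;
    width?: number;
    height?: number;
    format?: string;
  };
  if (typeof args.html !== "string") {
    throw new Error("render_image: `html` must be a string");
  }
  return {
    html: args.html,
    width: args.width ?? 1200,
    height: args.height ?? 630,
    format: args.format ?? "png",
    output: "url",
  };
}
```

In an agent loop, you would pass `renderImageTool` in the `tools` array, and when the model calls `render_image`, POST `toRenderBody(call.function.arguments)` with the same headers as the wrapper above.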

vs LLM image generation

| Aspect | DALL-E / Midjourney / Imagen | codetoimage |
| --- | --- | --- |
| Determinism | Different every call | ✓ Same input → same pixels |
| Layout precision | Approximate, text often garbled | ✓ Exact (it's a real browser) |
| Per-call cost | ~$0.04 (DALL-E 3 standard) | ✓ <$0.003 at Hobby |
| Latency | 5-30s | ✓ ~600ms median |
| Brand consistency | Best-effort prompt steering | ✓ Your design system, your fonts |
| Re-render same input | New image, new cost | ✓ Cache the output and reuse it |
| Creative novelty | ✓ Concept art, illustration | Whatever HTML/CSS can express |
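Determinism is what makes caching safe: if the same input gives the same pixels, a render can be memoized on its inputs. The sketch below assumes a hypothetical `RenderRequest` shape and wraps any render function you supply; none of these names come from the codetoimage SDK. Since hosted URLs are described above as living for 24h, a TTL under that seems prudent when caching URLs, while cached bytes could be kept longer.

```typescript
// Hypothetical request shape, mirroring the fields used elsewhere on this page.
type RenderRequest = {
  html: string;
  width: number;
  height: number;
  format: "png" | "jpeg" | "webp";
};

// Wrap any render function so identical requests are served from memory
// until ttlMs has elapsed. `now` is injectable to make expiry testable.
function withCache<T>(
  render: (req: RenderRequest) => T,
  ttlMs: number,
  now: () => number = Date.now
): (req: RenderRequest) => T {
  const cache = new Map<string, { value: T; expires: number }>();
  return (req) => {
    // A fixed field order keeps the key stable however the caller built the object.
    const key = JSON.stringify([req.html, req.width, req.height, req.format]);
    const hit = cache.get(key);
    if (hit && hit.expires > now()) return hit.value;

    const value = render(req);
    cache.set(key, { value, expires: now() + ttlMs });
    // If the renderer is async, evict failed requests so they can be retried.
    if (value instanceof Promise) {
      value.catch(() => cache.delete(key));
    }
    return value;
  };
}
```

With an async renderer the cached value is the promise itself, so concurrent identical requests share a single in-flight call.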

What you get

Agentic OG generation

LLM writes the HTML from the post content, codetoimage renders it. Per-post social previews without a designer in the loop.

Tool-call certificates

Issue a branded PNG per user action — course completion, badge, receipt. Same template, dynamic copy, deterministic output.
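One way to get "same template, dynamic copy" is a pure function from certificate data to an HTML string, which you then pass to the renderer. The field names and styling below are illustrative assumptions, not a codetoimage template format. Escaping matters here because recipient names come from user input.

```typescript
// Escape user-supplied text so names like "Ada <admin>" cannot break the markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Hypothetical certificate fields; adapt to your own template.
type Certificate = { recipient: string; course: string; issuedOn: string };

// Pure function: the same certificate data always yields the same HTML string.
// The issue date is passed in rather than read from the clock, to keep it pure.
function certificateHtml(cert: Certificate): string {
  return `<!doctype html>
<html>
  <body style="margin:0;width:1200px;height:630px;display:flex;flex-direction:column;align-items:center;justify-content:center;font-family:Inter,sans-serif;background:#0f172a;color:#fff">
    <p style="font-size:24px;margin:0">Certificate of completion</p>
    <h1 style="font-size:72px;margin:16px 0">${escapeHtml(cert.recipient)}</h1>
    <p style="font-size:28px;margin:0">${escapeHtml(cert.course)} · ${escapeHtml(cert.issuedOn)}</p>
  </body>
</html>`;
}
```

The resulting string is what goes into the `html` field of a render call, with `width: 1200` and `height: 630` to match the body size.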

Chart / data viz from agent state

Pipe query results into a Chart.js or vanilla SVG template, render to PNG, hand back to the user as an attachment.
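For the vanilla SVG route, a small function can turn rows into a bar chart with no charting library. This is a minimal sketch with assumed names (`Point`, `barChartSvg`) and arbitrary styling; wrap the returned SVG in an HTML body before sending it to the renderer.

```typescript
// Hypothetical row shape for query results.
type Point = { label: string; value: number };

// Escape labels so they are safe inside SVG text nodes.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Build a simple vertical bar chart; bar heights are scaled to the largest value.
function barChartSvg(points: Point[], width = 800, height = 400): string {
  const pad = 40;
  const plotHeight = height - 2 * pad;
  const slot = (width - 2 * pad) / Math.max(points.length, 1);
  // Avoid dividing by zero when the data is empty or all zeros.
  const max = Math.max(...points.map((p) => p.value), 0) || 1;

  const bars = points.map((p, i) => {
    const barHeight = (Math.max(p.value, 0) / max) * plotHeight;
    const x = pad + i * slot + slot * 0.1; // 10% gutter on each side of the bar
    const y = pad + plotHeight - barHeight;
    return (
      `<rect x="${x.toFixed(1)}" y="${y.toFixed(1)}" width="${(slot * 0.8).toFixed(1)}" ` +
      `height="${barHeight.toFixed(1)}" fill="#4f46e5"/>` +
      `<text x="${(x + slot * 0.4).toFixed(1)}" y="${height - pad + 16}" ` +
      `text-anchor="middle" font-size="12">${escapeXml(p.label)}</text>`
    );
  });

  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    bars.join("") +
    `</svg>`
  );
}
```

For example, `<html><body style="margin:0">${barChartSvg(rows)}</body></html>` rendered at 800x400 gives a PNG the agent can attach to its reply.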

Slack / Discord image bots

Agent receives a message, renders it as a branded card, posts back. Beautiful unfurls without any image-editing tooling.

n8n / Zapier image steps

Drop an HTTP node or shell exec into a workflow. Every automation gets image rendering — no per-template setup, no Bannerbear-style template UI.

MCP Registry wedge

The only HTML-to-image stack in Anthropic's official MCP Registry. Discoverable by name in any client that indexes the registry.

FAQ

When should I use this vs DALL-E or other LLM image generation?

Use generative models (DALL-E, Imagen, Midjourney) when you want creative novelty: concept art, illustrations, mood boards. Use codetoimage when you want determinism: branded cards, certificates, charts, OG images, dashboards. If the same input must always produce the same pixel-perfect output, or the layout has to match your design system exactly, you need rendering, not generation. Bonus: rendering is roughly 17x cheaper per call at Hobby pricing (~$0.0023 vs ~$0.04) and roughly 8-50x faster (~600ms median vs 5-30s).

Can the LLM generate the HTML itself?

Yes — this is the canonical pattern. Prompt the model with the data plus a style guide ("render a 1200x630 card with our brand gradient, headline X, subheadline Y") and let it produce the HTML/CSS string. Pass that to render_html_to_image. The model handles creative layout, codetoimage handles pixel-perfect rendering. Works great for OG images, social cards, certificates with dynamic copy.

How much does it cost per render?

Sandbox is free for 50 renders/month. Hobby is $7/month for 3,000 renders — that's roughly $0.0023 per render. Pro is $19/month for 10,000 renders ($0.0019 each). Compare to DALL-E 3 standard at $0.040 per image — codetoimage is ~17x cheaper and the result is deterministic, so you can cache aggressively to drive effective cost even lower.
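The arithmetic behind those figures, as a quick check. It uses only the prices quoted above and assumes the full monthly quota is used; with fewer renders, the effective per-render cost is higher.

```typescript
// Prices as quoted in this FAQ.
const DALLE3_STANDARD_USD = 0.04;
const plans = [
  { name: "Hobby", monthlyUsd: 7, renders: 3000 },
  { name: "Pro", monthlyUsd: 19, renders: 10000 },
];

// Per-render cost at full quota use, and how it compares to one DALL-E 3 image.
function compare(plan: { monthlyUsd: number; renders: number }) {
  const perRender = plan.monthlyUsd / plan.renders;
  return { perRender, ratio: DALLE3_STANDARD_USD / perRender };
}

for (const plan of plans) {
  const { perRender, ratio } = compare(plan);
  console.log(
    `${plan.name}: $${perRender.toFixed(4)} per render, ~${ratio.toFixed(1)}x cheaper`
  );
}
// Hobby: $0.0023 per render, ~17.1x cheaper
// Pro: $0.0019 per render, ~21.1x cheaper
```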

Which MCP clients are supported?

Any client that supports stdio MCP servers. Verified: Claude Desktop, Claude Code, Cursor. Should work without changes: Windsurf, Cline, Zed, OpenCode, and anything else that runs MCP servers via stdio. Non-MCP agents (LangChain, custom GPT actions, n8n, Make, Zapier) wire up via the REST API or by shelling out to @codetoimage/cli.

Plug image rendering into your agent.

Free Sandbox tier — 50 renders/month, no credit card. MCP, REST, or CLI — same credits, same renderer.