Use case · AI agents

Give your agent a real image rendering tool.

DALL-E hallucinates layouts. Midjourney can't hit a brand spec. Agents that need a logo on a card, a chart from state, or a certificate per user action need deterministic rendering. MCP-native, REST-callable, CLI-shellable. Pick your wiring.

Start free – 50 renders/mo · Jump to MCP setup ↓

The problem

LLM image generation (DALL-E, Midjourney, Imagen) is creative but non-deterministic. Ask for “a card with our logo in the top left, headline in 72px Inter, gradient background” and you get an approximation – different every call, never quite on-brand, text often garbled.

It's also expensive (~$0.04+ per image) and slow (5-30s). For an agent that needs to issue 1,000 certificates, render a chart from query results, or stamp an OG image per blog post – that's a non-starter. You need rendering, not generation.

Three ways to give your agent the tool

1. MCP server (native)

Claude Desktop / Code, Cursor, Windsurf, Cline

Install @codetoimage/mcp-server via npx. Two intent-aware tools appear in your client: render_html_to_image (inline base64) and render_html_to_url (24h hosted). The model picks the right one. Listed in Anthropic's official MCP Registry.

2. REST API (any agent loop)

LangChain, custom GPT, n8n HTTP node

POST to /v1/render with HTML and dimensions, get bytes or a hosted URL back. Wrap it as a LangChain Tool, an OpenAI function-calling schema, or an n8n HTTP step. Same credit pool as MCP.

3. CLI shell (anything that can exec)

n8n Execute Command, Make, shell agents

npx @codetoimage/cli render from any process. n8n Execute Command, Make custom modules, shell-tool agents, GitHub Actions – if it can spawn a process, it can render images.

MCP setup – Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "codetoimage": {
      "command": "npx",
      "args": ["-y", "@codetoimage/mcp-server"],
      "env": {
        "CODETOIMAGE_API_KEY": "cti_live_..."
      }
    }
  }
}

Restart Claude Desktop. Now try a prompt like:

“Render a 1200x630 card with a dark gradient background, our logo bottom-left, and the headline ‘Q4 revenue up 38%’ in Inter 72px. Return a hosted URL so I can post it to Slack.”

Claude writes the HTML, picks render_html_to_url, returns the link. No prompt engineering, no tool selection hints needed.

Non-MCP agent – LangChain Tool wrapper

For LangChain, OpenAI function-calling, or any agent framework that takes Tool definitions, wrap the REST API:

// tools/renderImage.ts
import { tool } from "@langchain/core/tools";
import { z } from "zod";

export const renderImage = tool(
  async ({ html, width, height, format }) => {
    const res = await fetch("https://api.codetoimage.app/v1/render", {
      method: "POST",
      headers: {
        "X-API-Key": process.env.CODETOIMAGE_API_KEY!,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        html,
        width: width ?? 1200,
        height: height ?? 630,
        format: format ?? "png",
        output: "url",
      }),
    });
    if (!res.ok) throw new Error(`render failed: ${res.status}`);
    const { url } = await res.json();
    return url;
  },
  {
    name: "render_image",
    description:
      "Render HTML/CSS to a pixel-perfect image. Returns a 24h hosted URL. Use for cards, certificates, OG images, charts – anything where layout precision matters.",
    schema: z.object({
      html: z.string().describe("Full HTML with inline CSS or <style> block"),
      width: z.number().optional(),
      height: z.number().optional(),
      format: z.enum(["png", "jpeg", "webp"]).optional(),
    }),
  }
);

The same shape works for OpenAI function calling – replace the wrapper with a JSON schema and call the endpoint from your agent loop.
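A minimal sketch of that swap, assuming the `tools` format of OpenAI's Chat Completions API and the same `/v1/render` request body used in the LangChain wrapper. The names `renderImageTool` and `toRenderRequest` are illustrative and not part of any SDK; the defaults mirror the wrapper.

```typescript
// Tool definition in the shape the Chat Completions API accepts under `tools`.
// Parameter names mirror the zod schema in the LangChain wrapper.
const renderImageTool = {
  type: "function",
  function: {
    name: "render_image",
    description:
      "Render HTML/CSS to a pixel-perfect image. Returns a 24h hosted URL.",
    parameters: {
      type: "object",
      properties: {
        html: { type: "string", description: "Full HTML with inline CSS or <style> block" },
        width: { type: "number" },
        height: { type: "number" },
        format: { type: "string", enum: ["png", "jpeg", "webp"] },
      },
      required: ["html"],
    },
  },
} as const;

type RenderRequest = {
  html: string;
  width: number;
  height: number;
  format: "png" | "jpeg" | "webp";
  output: "url";
};

// Turn the JSON string the model places in `tool_call.function.arguments`
// into the request body for POST /v1/render, applying the same defaults.
function toRenderRequest(rawArguments: string): RenderRequest {
  const args = JSON.parse(rawArguments) as Partial<RenderRequest>;
  if (typeof args.html !== "string" || args.html.length === 0) {
    throw new Error("render_image: `html` is required");
  }
  return {
    html: args.html,
    width: args.width ?? 1200,
    height: args.height ?? 630,
    format: args.format ?? "png",
    output: "url",
  };
}
```

In the agent loop, pass `[renderImageTool]` as `tools`; when the model returns a tool call named `render_image`, POST `toRenderRequest(call.function.arguments)` with the same headers as the wrapper and feed the returned URL back as the tool result.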

vs LLM image generation

Aspect	DALL-E / Midjourney / Imagen	codetoimage
Determinism	Different every call	✓ Same input → same pixels
Layout precision	Approximate, text often garbled	✓ Exact – it's a real browser
Per-call cost	~$0.04 (DALL-E 3 standard)	✓ <$0.007 at Hobby
Latency	5-30s	✓ ~600ms median
Brand consistency	Best-effort prompt steering	✓ Your design system, your fonts
Re-render same input	New image, new cost	✓ Cache the output and reuse it (hosted URLs last 24h)
Creative novelty	✓ Concept art, illustration	Whatever HTML/CSS can express
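Because the same input yields the same pixels, a render can be memoized on its request. A minimal sketch, assuming a caller-supplied `render` function (hypothetical; for example the REST call from the LangChain wrapper) and an in-memory `Map`. Hosted URLs are described elsewhere on this page as lasting 24h, so this sketch expires entries after a TTL; caching the image bytes instead would remove that limit.

```typescript
type RenderInput = { html: string; width: number; height: number; format: string };

// Canonical key: fields are serialized in a fixed order, so equal inputs
// always map to the same string regardless of object property order.
function cacheKey(input: RenderInput): string {
  return JSON.stringify([input.html, input.width, input.height, input.format]);
}

// Wrap any render function with a TTL cache. `render` is a placeholder for
// whatever actually calls the API; `now` is injectable so expiry is testable.
function withCache(
  render: (input: RenderInput) => Promise<string>,
  ttlMs: number = 23 * 60 * 60 * 1000, // stay under the 24h URL lifetime
  now: () => number = Date.now,
): (input: RenderInput) => Promise<string> {
  const cache = new Map<string, { url: string; expiresAt: number }>();
  return async (input) => {
    const key = cacheKey(input);
    const hit = cache.get(key);
    if (hit && hit.expiresAt > now()) return hit.url; // served without an API call
    const url = await render(input);
    cache.set(key, { url, expiresAt: now() + ttlMs });
    return url;
  };
}
```

For a persistent store such as Redis or a database, hashing the key (for example with SHA-256) keeps it short; the in-memory version does not need that.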

What you get

Agentic OG generation

LLM writes the HTML from the post content, codetoimage renders it. Per-post social previews without a designer in the loop.

Tool-call certificates

Issue a branded PNG per user action – course completion, badge, receipt. Same template, dynamic copy, deterministic output.
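A minimal sketch of the "same template, dynamic copy" idea, assuming a hypothetical `certificateHtml` helper whose output is sent as the `html` field of a render request. The markup and colors are placeholders; the escaping step matters because recipient names are user input.

```typescript
// Escape text before interpolating it into HTML so user-supplied copy
// cannot break the layout or inject markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// One fixed template; only the recipient, course, and date change per call.
function certificateHtml(recipient: string, course: string, issuedOn: string): string {
  return `<!doctype html>
<html>
  <body style="margin:0;width:1200px;height:630px;display:flex;flex-direction:column;
               align-items:center;justify-content:center;font-family:Inter,sans-serif;
               background:linear-gradient(135deg,#0f172a,#1e3a8a);color:#fff">
    <p style="font-size:28px;margin:0">Certificate of completion</p>
    <h1 style="font-size:72px;margin:24px 0">${escapeHtml(recipient)}</h1>
    <p style="font-size:32px;margin:0">${escapeHtml(course)} · ${escapeHtml(issuedOn)}</p>
  </body>
</html>`;
}
```

Keeping the template in code rather than asking the model to regenerate it on every call means two certificates differ only in the interpolated fields.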

Chart / data viz from agent state

Pipe query results into a Chart.js or vanilla SVG template, render to PNG, hand back to the user as an attachment.
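A minimal sketch of the vanilla-SVG route, assuming query results arrive as label/value pairs. `barChartSvg` is a hypothetical helper; its output can be embedded in the `html` field of a render request.

```typescript
type Point = { label: string; value: number };

function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Build a simple vertical bar chart as an SVG string. Bars are scaled to the
// largest value (negative values are drawn as zero); labels sit under each bar.
function barChartSvg(points: Point[], width = 1200, height = 630): string {
  const pad = 60;
  const plotH = height - 2 * pad;
  const slot = (width - 2 * pad) / Math.max(points.length, 1);
  const largest = Math.max(0, ...points.map((p) => p.value));
  const max = largest > 0 ? largest : 1; // avoid dividing by zero
  const bars = points.map((p, i) => {
    const h = (Math.max(p.value, 0) / max) * plotH;
    const x = pad + i * slot + slot * 0.1;
    const y = height - pad - h;
    return (
      `<rect x="${x}" y="${y}" width="${slot * 0.8}" height="${h}" fill="#2563eb"/>` +
      `<text x="${x + slot * 0.4}" y="${height - pad + 28}" font-size="22" ` +
      `text-anchor="middle">${escapeXml(p.label)}</text>`
    );
  });
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}" ` +
    `viewBox="0 0 ${width} ${height}">${bars.join("")}</svg>`
  );
}
```

A Chart.js template works the same way from the agent's point of view: the data goes into the page, and the page goes to the renderer.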

Slack / Discord image bots

Agent receives a message, renders it as a branded card, posts back. Beautiful unfurls without any image-editing tooling.

n8n / Zapier image steps

Drop an HTTP node or shell exec into a workflow. Every automation gets image rendering – no per-template setup, no Bannerbear-style template UI.

MCP Registry listing

The only HTML-to-image stack in Anthropic's official MCP Registry. Discoverable by name in any client that indexes the registry.

FAQ

When should I use this vs DALL-E or other LLM image generation?

Use generative models (DALL-E, Imagen, Midjourney) when you want creative novelty – concept art, illustrations, mood boards. Use codetoimage when you want determinism – branded cards, certificates, charts, OG images, dashboards. If the same input must always produce the same pixel-perfect output, or the layout has to match your design system exactly, you need rendering, not generation. Bonus: on subscription plans rendering is roughly 4-10x cheaper per call, and at ~600ms median versus 5-30s it is at least 8x faster.

Can the LLM generate the HTML itself?

Yes – this is the canonical pattern. Prompt the model with the data plus a style guide ("render a 1200x630 card with our brand gradient, headline X, subheadline Y") and let it produce the HTML/CSS string. Pass that to render_html_to_image. The model handles creative layout, codetoimage handles pixel-perfect rendering. Works great for OG images, social cards, certificates with dynamic copy.

How much does it cost per render?

The free tier gives you 50 renders/month, no watermark. Paid plans start at $9/month – Starter includes 1,000 renders ($0.009 per render). Hobby is $19/month for 3,000 renders – roughly $0.0063 per render. Pro is $59/month for 15,000 ($0.0039 each), and prepaid credit packs start at $5 for 250 renders (one-time offer per account), no subscription needed. Compare to DALL-E 3 standard at $0.040 per image – on the subscription plans codetoimage is roughly 4-10x cheaper, and the result is deterministic, so you can cache aggressively to drive effective cost even lower.
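The per-render figures above follow from dividing each plan's price by its render quota. A short check of that arithmetic, using only the prices quoted in this answer (the `plans` structure and `costPerRender` helper are illustrative):

```typescript
// Prices and quotas as quoted in the FAQ answer above.
const plans = [
  { name: "Starter", usdPerMonth: 9, renders: 1000 },
  { name: "Hobby", usdPerMonth: 19, renders: 3000 },
  { name: "Pro", usdPerMonth: 59, renders: 15000 },
];
const dalle3StandardUsd = 0.04; // per image, as quoted above

// Cost of a single render when the full monthly quota is used.
function costPerRender(usdPerMonth: number, renders: number): number {
  return usdPerMonth / renders;
}

for (const plan of plans) {
  const perRender = costPerRender(plan.usdPerMonth, plan.renders);
  const ratio = dalle3StandardUsd / perRender;
  console.log(`${plan.name}: $${perRender.toFixed(4)} per render, ${ratio.toFixed(1)}x cheaper`);
}
// Starter: $0.0090 per render, 4.4x cheaper
// Hobby: $0.0063 per render, 6.3x cheaper
// Pro: $0.0039 per render, 10.2x cheaper
```

These ratios assume the whole quota is used each month; unused renders raise the effective per-render cost.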

Which MCP clients are supported?

Any client that supports stdio MCP servers. Verified: Claude Desktop, Claude Code, Cursor. Should work without changes: Windsurf, Cline, Zed, OpenCode, and anything else that runs MCP servers via stdio. Non-MCP agents (LangChain, custom GPT actions, n8n, Make, Zapier) wire up via the REST API or by shelling out to @codetoimage/cli.

Plug image rendering into your agent.

Free tier – 50 renders/month, no watermark, no credit card. MCP, REST, or CLI – same credits, same renderer.

Start free → · See the MCP server