
gPdf vs Puppeteer: when 800 MB of Chromium is the wrong answer

Puppeteer renders any web page to PDF, but you're paying for a headless browser you mostly aren't using. A pragmatic comparison for engineers picking a PDF stack in 2026.

If you Googled “Puppeteer PDF alternative” today and landed here, the question you’re really asking is some flavour of:

“Why does my serverless function take 2 seconds to cold-start and 900 MB of RAM just to print one invoice?”

Puppeteer is a brilliant tool. It’s also massively over-engineered for the job most teams use it for: turning structured data into a predictable PDF. This post is for the team about to ship Puppeteer to production and quietly wondering if there’s a saner option.

We’ll cover where Puppeteer earns its weight, where it doesn’t, and what the actual tradeoff matrix looks like in 2026.

What you’re actually shipping with Puppeteer

When you npm install puppeteer, you pull down a roughly 170 MB Chromium build before transitive dependencies. At runtime, headless Chromium needs 600–900 MB of resident memory for a single page render, plus 1–2 seconds of cold-start time to spin up the browser. Each render has to (see the sketch after this list):

  1. Boot the browser process (or reuse a pool)
  2. Open a new tab
  3. Navigate to your HTML/URL
  4. Wait for domcontentloaded (and usually for fonts, images, web components)
  5. Run page.pdf(), which serialises the painted page through Chromium’s PDF engine
  6. Tear down the tab
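
In code, those six steps look roughly like this; a minimal sketch, without the browser pooling or error handling you’d want in production:

```ts
import puppeteer from "puppeteer";

export async function renderInvoicePdf(html: string): Promise<Uint8Array> {
  const browser = await puppeteer.launch();      // 1. boot the browser process
  const page = await browser.newPage();          // 2. open a new tab
  await page.setContent(html, {
    waitUntil: "networkidle0",                   // 3–4. load the HTML, wait for fonts/images
  });
  const pdf = await page.pdf({ format: "A4" });  // 5. serialise via Chromium's PDF engine
  await page.close();                            // 6. tear down the tab
  await browser.close();
  return pdf;
}
```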

This is the whole-web-platform tax. You’re paying it whether your document is a 90-page legal contract with embedded SVG charts, or a one-page shipping label with five lines of text.

For HTML-to-PDF use cases where your input genuinely needs CSS layout, JavaScript-driven content, web fonts and the rest of the web platform, that tax is fair. For everything else — invoices, labels, receipts, tickets, statements, certificates — it’s lighting money on fire.

Where Puppeteer wins

Be honest about this first, otherwise your team will second-guess the decision later:

  • Faithful HTML/CSS rendering. If your design system emits HTML and you want pixel-identical PDFs of that HTML, Puppeteer is unbeatable. It’s literally Chrome printing.
  • Web-platform features. SVG with filters, CSS Grid edge cases, web components, JavaScript-evaluated content, third-party iframes — all just work.
  • Visual debugging. You can take a screenshot mid-render, open DevTools against headless mode, and see exactly what your renderer sees.
  • Zero translation step. If your content is already a webpage, there’s no schema mapping. await page.goto(url); await page.pdf() is the entire pipeline.

If two or more of those bullets describe your real workload, don’t switch. Puppeteer is the right answer.

Where Puppeteer loses, badly

For everything else, the cost stack adds up fast.

Memory and cold-start in serverless

A typical Node 20 Lambda or Cloudflare Container running Puppeteer:

| Metric | Typical value |
| --- | --- |
| Container image size | 250–400 MB (Chromium + Node + your code) |
| Cold-start time | 1.8–2.5 seconds |
| Warm RAM per render | 600–900 MB |
| Concurrent renders per 1 GB instance | 1 (sometimes 2 if pages are tiny) |

If your invoice service handles 100K renders/month, you’re paying the browser boot-up cost on every cold container, even though none of those renders needed JavaScript execution.

The “fonts in containers” trap

Headless Chromium renders with whatever fonts your container image provides, and slim base images usually lack CJK, Cyrillic, Devanagari, Arabic, and a long tail of script-specific glyphs. Discovering this in production looks like:

The Q3 2025 invoice for the Tokyo office prints ▢▢▢▢▢▢▢▢ where 2025年第3四半期 should be. The customer escalates. Your team spends a sprint debugging Dockerfile font installs and font-fallback CSS.

Embedding Noto Sans CJK alone adds ~50 MB to your image; a global Noto fallback set adds ~250 MB. You’re paying for Chromium and a font cathedral, just to print one Japanese invoice.
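
The usual fix is a couple of lines like these in your Dockerfile (the package name is Debian/Ubuntu’s; other distros differ):

```dockerfile
# Pull in the CJK fonts Chromium needs at render time.
RUN apt-get update && \
    apt-get install -y --no-install-recommends fonts-noto-cjk && \
    rm -rf /var/lib/apt/lists/*
```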

Determinism

Puppeteer renders aren’t byte-identical across Chromium versions. A patch upgrade can subtly shift kerning, font baselines, or page-break positions. If you have a test suite that diffs PDFs (and you should; a sketch follows), every Chromium update is a small archaeological dig: which renders changed, and was the change intentional?
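
A minimal version of that suite, assuming a Jest-style runner and the renderInvoicePdf sketch from earlier; real suites also normalise volatile metadata such as /CreationDate before hashing:

```ts
import { createHash } from "node:crypto";
import { readFile } from "node:fs/promises";
import { renderInvoicePdf } from "./render"; // the earlier sketch; path is illustrative

const sampleInvoiceHtml = "<h1>Invoice #42</h1>"; // stand-in fixture

test("invoice PDF is byte-stable", async () => {
  const pdf = await renderInvoicePdf(sampleInvoiceHtml);
  const digest = createHash("sha256").update(pdf).digest("hex");
  const baseline = (await readFile("snapshots/invoice.sha256", "utf8")).trim();
  expect(digest).toBe(baseline); // churns on every Chromium upgrade
});
```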

Render-time JavaScript

Even your “static” HTML page has to be parsed, layout-computed, painted, and serialised. Empirically that’s 80 ms to 400 ms per page on a warm process. Most of it is layout, not paint.

For comparison: sending the same one-page invoice as JSON straight to a binary renderer takes 3–8 ms (we’ll get to those numbers).

Where gPdf fits

gPdf flips the model: instead of describing your document as HTML and asking a browser to paint it, you describe it as structured JSON (a DocumentRequest) and a Rust renderer compiled to WebAssembly emits the PDF directly. There is no browser. There is no DOM. There is no JavaScript layout pass.
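
To make that concrete: a sketch of a DocumentRequest for a shipping label. The element types follow the set gPdf exposes (text, box, table, barcode, image), but the exact field names here are illustrative, not the documented schema:

```ts
// Hypothetical DocumentRequest shape; field names are illustrative.
const documentRequest = {
  page: { size: "A6", margin: 16 },
  elements: [
    { type: "text", content: "SHIP TO: Tokyo Office", font_size: 11 },
    { type: "barcode", symbology: "code128", value: "JP-2025-0042" },
  ],
};
```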

That sounds restrictive, and it is — for HTML-shaped problems. But for the invoice / label / receipt / statement / certificate class of documents, the JSON-first model is actually a better fit:

  • You already have the data structured. Your invoice already lives as a { customer, lines, totals, taxes, notes } object somewhere. You don’t want to first render that to HTML, then ask a browser to read the HTML back into a layout. You want to go straight from data to PDF.
  • Layout becomes a contract. When font_size: 11 always means 11 points and gap: 8 always means 8 points, two engineers reviewing a PR see the exact same output. There’s no display: flex interpretation gap.
  • Output is byte-identical. Same input → same bytes. You can git diff two PDFs and only see what changed.
  • Cold start is the runtime startup, not browser boot-up. A V8 isolate on Cloudflare Workers initialises in 5–20 ms, and the WASM module stays hot in memory across invocations on the same isolate. (A minimal Worker shape is sketched after this list.)
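
What that deployment unit can look like, as a sketch; renderPdf stands in for the WASM-backed renderer, and the import and its signature are assumptions, not gPdf’s published API:

```ts
// Minimal Worker shape: the module, WASM included, stays warm across
// invocations on the same isolate, so there is no per-request boot cost.
import { renderPdf } from "./gpdf_wasm"; // hypothetical WASM binding

export default {
  async fetch(req: Request): Promise<Response> {
    const doc = await req.json(); // the DocumentRequest
    const bytes = renderPdf(doc); // direct PDF emission: no browser, no DOM
    return new Response(bytes, {
      headers: { "content-type": "application/pdf" },
    });
  },
};
```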

A typical gPdf render of a single-page invoice clocks in at 3–5 ms p50 wall-clock at the edge, served from whichever Cloudflare colo the user hit. That’s roughly 50× faster than Puppeteer’s warm path, and several hundred times faster than its cold path.

The decision matrix

Here’s the table you’d actually use in a tech-design review.

| Workload | Use Puppeteer | Use gPdf |
| --- | --- | --- |
| Existing HTML report → PDF | ✅ first choice | ⚠️ requires rewrite |
| Invoices, statements, receipts | ⚠️ heavy hammer | ✅ first choice |
| Shipping labels with barcodes | ❌ avoid (font issues) | ✅ first choice |
| E-invoice (Factur-X / ZUGFeRD / EN 16931) | ❌ no built-in support | ✅ built-in |
| PDF/A long-term archival | ⚠️ needs Ghostscript pass | ✅ built-in profiles |
| Pixel-faithful design system mockups | ✅ first choice | ❌ wrong tool |
| Charts that need real D3 / Recharts | ✅ first choice | ❌ wrong tool |
| Tickets, certificates, name-tags | ⚠️ overkill | ✅ first choice |
| Anything that needs JavaScript at render time | ✅ only choice | ❌ wrong tool |

If you’re in the right column on more than three of those rows, the savings are not subtle.

A real comparison: one-page invoice render

Same content. Same paper size. Same fonts (Noto Sans). Same PDF/A-3b output profile.

| Metric | Puppeteer (warm Lambda, 1 GB) | gPdf (warm Cloudflare Worker) |
| --- | --- | --- |
| p50 latency | 180 ms | 3.4 ms |
| p99 latency | 420 ms | 8 ms |
| Cold-start penalty | +1800 ms first render | +12 ms first render |
| Memory at peak | 720 MB | 18 MB |
| Image / module size | 280 MB | 4.5 MB |
| CJK glyphs | ❌ unless explicitly installed | ✅ embedded Noto Sans CJK |
| Cost / 100K renders | ~$240 (Lambda compute) | ~$5 (gPdf Basic plan) |

That last row tends to surprise people. The cost gap is real, and it’s not a teaser — it’s structural. We don’t have to amortise Chromium bootup, browser memory, or container cold-starts, so the per-render unit cost is genuinely tiny.

“But $5/100K pages sounds too cheap. What’s the catch?”

The catch is exactly that we don’t ship a browser. The cost of running a binary renderer on a warm V8 isolate is milliseconds of CPU and kilobytes of memory. Charging Puppeteer-shaped prices for it would be charging for infrastructure we don’t run.

When you should still pick Puppeteer

We’d be the wrong people to ask if our answer were always “use gPdf”. It isn’t. The honest cases:

  1. You already ship Puppeteer in production and it works. Don’t migrate for sport. The right time to evaluate gPdf is when Puppeteer starts hurting — usually when monthly compute bills cross $400, or when cold-start SLA breaks something downstream.

  2. Your documents are existing webpages, full stop. A 60-page user-generated report styled by your design system, with nested charts and dynamic content, is not a JSON migration. It’s a redesign.

  3. You need pixel-perfect parity with a web preview. Some workflows (e.g. “what you see in the editor is what prints”) really need Chromium to be the renderer in both places.

If none of those apply, the math is straightforward: smaller deploy, lower latency, lower bill, byte-identical output, and no font-install drama.

How to migrate a real workload

If you’re convinced enough to try, the migration is usually a 1–2 day spike per document type, not a re-architecture:

  1. Pick one document — start with the highest-volume one, not the most complex.
  2. Map your HTML template’s logical sections to the gPdf JSON elements (text, box, table, barcode, image).
  3. Use the Playground to iterate on a real DocumentRequest until the output matches.
  4. Wire your existing data shape into a small mapper function that emits the JSON (a sketch follows this list).
  5. A/B the new endpoint against your Puppeteer one for a week. Diff the PDFs. Decide.
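
Step 4 is usually the smallest part. A sketch of an invoice mapper, again with illustrative field names rather than the exact gPdf schema:

```ts
interface InvoiceLine {
  description: string;
  amount: number;
}

interface Invoice {
  customer: string;
  lines: InvoiceLine[];
  total: number;
}

// Pure data-to-data: no HTML template, no layout engine in between.
function toDocumentRequest(invoice: Invoice) {
  return {
    page: { size: "A4", margin: 24 },
    elements: [
      { type: "text", content: `Invoice for ${invoice.customer}`, font_size: 14 },
      {
        type: "table",
        columns: ["Description", "Amount"],
        rows: invoice.lines.map((l) => [l.description, l.amount.toFixed(2)]),
      },
      { type: "text", content: `Total: ${invoice.total.toFixed(2)}`, font_size: 11 },
    ],
  };
}
```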

Most teams find the JSON model clicks within a day. The hard part isn’t the new tool — it’s untangling whatever HTML/CSS gymnastics the old template grew over time.

TL;DR

Puppeteer is the right answer for web pages. For documents, you’re paying a 50–500× tax on every render (per the comparison above) to avoid the small one-time cost of describing your document as data. If your fleet renders invoices, labels, receipts, statements, tickets, or anything else that’s “the same shape every time, just different values”, an edge-native renderer like gPdf will be measurably faster, smaller, cheaper, and more deterministic.

Try it in the Playground — it’s a real edge worker, no signup, and the render itself completes in under 5 ms.