PDF properties should show your brand, not someone else's tool

Most white-label PDF stacks render the page in your brand but quietly stamp a third-party tool name into the file's Producer field. For B2B SaaS that ships PDFs on behalf of customers, that gap matters. Here's why, and what to do about it.

Open any business-critical PDF (an invoice, a shipping label, a monthly statement) and open its document properties: Cmd+I in macOS Preview, Ctrl+D in Adobe Reader, “File → Properties” in most desktop viewers. Then look at the Producer field.

If the PDF was generated by a SaaS platform using a headless browser, you’ll often see something like:

$ pdfinfo invoice.pdf
Title:           invoice-20260318.pdf
Subject:
Author:
Creator:         Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (...) Chrome/120.0.0.0
Producer:        Skia/PDF m120
Language:

The rendered page carries the SaaS vendor’s branding. The file properties name a browser engine that has nothing to do with the vendor, or with the customer on whose behalf the SaaS is shipping the document.

That gap is what this post is about.

The page is branded, the file isn’t

White-label PDF generation is a well-understood requirement for B2B SaaS. The vendor lets the customer upload a logo, pick brand colours, configure a template; exported PDFs visually look like the customer’s brand, not the vendor’s.

Most platforms stop there. They solve the visible layer and leave the file-properties layer alone. The result: a document that says “Acme Logistics” on every page but identifies itself as “Skia/PDF m120” the moment anyone right-clicks → Properties.

For a one-off B2C download — a personal receipt, a movie ticket — file properties are mostly cosmetic. For a B2B document, or any regulated B2C output (medical reports, financial statements, legal disclosures, regulated insurance forms), the file properties are part of the document. They show up in:

  • Adobe Reader, Preview, Foxit, every desktop PDF viewer
  • Document management systems (SharePoint, M-Files, NetSuite Files)
  • Email-server PDF previewers
  • Search indexes (Spotlight, Outlook, internal DMS search)
  • Archive systems (PDF/A long-term preservation)
  • Anything that calls pdfinfo or pdftk dump_data in a pipeline

A document whose page says “Acme” and whose Producer field says “Chromium” reads to those systems as “rendered by Chromium for someone called Acme” — not “rendered by Acme.” For enterprise procurement and compliance, that distinction registers.

Why this is worse for the SaaS vendor than for direct users

If you generate a PDF for yourself, “Chromium” in the Producer field is your problem only.

If you’re a SaaS vendor and your customers generate PDFs through your platform, the chain is longer:

  • You picked the rendering stack.
  • Your customer ships the resulting PDF to their customer.
  • The final recipient — a procurement team, a carrier, a tax office, a finance department — sees a Producer field that names neither you nor your customer. It names the upstream renderer you happen to use.

Your customer’s brand on the page; an unfamiliar tool name in the file. From the recipient’s perspective the document looks slightly off in a way they can’t quite name. From your customer’s perspective, the white-label promise wasn’t fully delivered.

This is the part most platforms underinvest in, because the fix isn’t visible from the homepage. But the customer who runs a single pdfinfo against the output of their “white-label PDF” feature will notice.

When this actually bites

These are the situations where file properties have shown up as a real operational issue, not a hypothetical:

  • Vendor security questionnaires. Enterprise procurement runs a vendor risk review and asks: “list every third-party tool that appears in document outputs you ship to us.” The customer’s IT team runs pdfinfo on a sample document and finds an unfamiliar renderer name. Nobody’s angry — but it gets added to the sub-processor list, which then triggers vendor-management review and a separate set of compliance checks.
  • DMS / archive search. A customer’s document management system indexes PDFs by author. When PDFs from your platform have a blank Author field, the customer’s compliance team can’t easily filter “documents from this vendor” months later — they end up adding manual tags, which they shouldn’t have to.
  • Long-term archive validation. A PDF/A archive system flags documents where Producer doesn’t match the expected vendor list. The compliance team has to manually allow-list “Skia/PDF m120” and “wkhtmltopdf” as known-OK renderers — a small but ongoing operational burden.
  • Brand-consistency audits. Some enterprise marketing teams audit outbound document attribution as part of brand-governance. A document attributed to a tool the brand team has never heard of becomes a finding.

None of these are critical incidents. They’re papercuts that add friction to enterprise sales, vendor onboarding, and operations. They compound across thousands of documents per month.

What the file properties actually expose

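Six standard metadata fields matter here, and most viewers surface them. Five are entries in the PDF document information dictionary; the sixth, the language tag, is set on the document catalog: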

| Field | What it’s for | What a leaky stack usually shows |
|---|---|---|
| Title | Document title | Auto-generated filename, or empty |
| Author | The person or organisation that created the document | Empty, or the developer’s name |
| Subject | Short description of the document | Empty |
| Creator | The application that produced the source content | “Chromium”, “Mozilla/5.0…”, or the SaaS vendor’s internal tool name |
| Producer | The application that produced the PDF bytes | “Skia/PDF m120”, “wkhtmltopdf 0.12.x”, “iText 7.x.x” |
| Language | BCP-47 language tag | Empty, or wrong locale |

Each of these is one short string. None of them are technically hard to fill in. The reason they leak by default is that the rendering library writes its own name into Producer (correctly — that’s what the field is for), and most application code never sets the other five.

The fix is to set them — deliberately, on every render, from the application that knows what the document is for.
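Before fixing anything, it helps to measure the leak. The sketch below is one way to audit a file, assuming pdfinfo (from poppler-utils) is installed. The function names and the list of renderer hints are illustrative rather than part of any tool, and the language tag is left out to keep the example short.

```python
import subprocess

# Fields checked by this sketch, using the labels pdfinfo prints.
FIELDS = ("Title", "Subject", "Author", "Creator", "Producer")

# Substrings suggesting a renderer default leaked through (illustrative, not exhaustive).
RENDERER_HINTS = ("skia/pdf", "chromium", "mozilla/5.0", "wkhtmltopdf", "itext")


def parse_pdfinfo(text: str) -> dict[str, str]:
    """Turn 'Key:   value' lines from pdfinfo into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            info[key.strip()] = value.strip()
    return info


def audit(info: dict[str, str]) -> dict[str, str]:
    """Return {field: problem} for fields that are empty or name a renderer."""
    problems = {}
    for field in FIELDS:
        value = info.get(field, "")
        if not value:
            problems[field] = "empty"
        elif any(hint in value.lower() for hint in RENDERER_HINTS):
            problems[field] = "names a rendering tool"
    return problems


def audit_file(path: str) -> dict[str, str]:
    """Run pdfinfo on a file and audit its output."""
    result = subprocess.run(
        ["pdfinfo", path], capture_output=True, text=True, check=True
    )
    return audit(parse_pdfinfo(result.stdout))
```

Run against the sample output at the top of this post, audit would flag Subject and Author as empty, and Creator and Producer as naming a rendering tool.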

What “branded metadata” looks like in practice

Here is the same set of fields as a gPdf render request sets them. Six fields, all overridable by the caller:

{
  "settings": {
    "metadata": {
      "title":    "Invoice INV-2026-3401",
      "language": "en",
      "author":   "Acme Logistics, Inc.",
      "subject":  "Monthly invoice — 2026-03",
      "creator":  "Acme Billing Platform v7.2",
      "producer": "Acme Billing Platform"
    }
  }
}

The same pdfinfo against the resulting PDF:

$ pdfinfo invoice.pdf
Title:           Invoice INV-2026-3401
Subject:         Monthly invoice — 2026-03
Author:          Acme Logistics, Inc.
Creator:         Acme Billing Platform v7.2
Producer:        Acme Billing Platform
Language:        en

The page renders as “Acme Logistics” — and the file properties say “Acme Logistics” too. Right-click → Properties shows a document that fully belongs to Acme. The fact that the bytes were produced by gPdf at the edge in ~4 ms doesn’t surface anywhere the recipient looks.

Won’t customers want to know you’re using gPdf?

This question comes up often enough that it’s worth answering directly.

Yes — your customers can absolutely know you build on gPdf. That’s between you and them, and it usually belongs in your engineering blog, your changelog, your security architecture documents, or your sub-processor list (which gPdf shows up on if relevant to your DPA).

The Producer field is not about that relationship. It’s about the end recipient of your customer’s document — a procurement clerk, a carrier dispatcher, a tax-office form processor — who has no relationship with your renderer choice and no reason to care what it is. To them, “Skia/PDF m120” in the Properties dialog is noise; “Acme Billing Platform” is signal.

There’s also nothing dishonest about it. The PDF spec describes Producer as the name of the product that converted the document to PDF. If you build a PDF service on top of gPdf, your application is the product that turns your customer’s data into a PDF, with gPdf as the rendering infrastructure underneath. Saying so in Producer is accurate. The honest version is:

  • gPdf is the rendering infrastructure.
  • Your platform is the producer.
  • Your customer is the author.

Each layer gets credit where the PDF spec intends it.

A footnote on downstream pipelines

If your output PDF passes through any post-processing stage before it reaches the recipient — Ghostscript without explicit metadata-preservation flags, an enterprise DRM/watermarking tool, a “PDF optimiser” — some of those tools will quietly rewrite Producer to their own name and undo the branded metadata you just set. Test against your actual pipeline, not just the raw gPdf response.
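One way to make that test repeatable is to compare the metadata you sent with the properties read from the final file. The sketch below assumes you already have those properties as a dict keyed by pdfinfo-style labels, however you obtain them. The function name and the key-to-label mapping are illustrative, not part of gPdf.

```python
# Maps settings.metadata keys (as used in this post) to pdfinfo-style labels.
LABELS = {
    "title": "Title",
    "subject": "Subject",
    "author": "Author",
    "creator": "Creator",
    "producer": "Producer",
}


def overwritten(sent: dict[str, str], observed: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Return {key: (sent_value, observed_value)} for fields that did not survive.

    `sent` is the metadata block from the render request; `observed` holds the
    properties read from the PDF after every post-processing stage has run.
    """
    diffs = {}
    for key, label in LABELS.items():
        if key in sent and observed.get(label, "") != sent[key]:
            diffs[key] = (sent[key], observed.get(label, ""))
    return diffs
```

An empty result means the branded fields reached the recipient intact. Anything else names the field a downstream tool rewrote and what it wrote instead.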

A note on what isn’t here

To stay accurate: the six standard fields above are what gPdf exposes today. That’s enough for white-labelling the document properties — which is what the brand-identity story is about.

It is not enough for stashing arbitrary business context (order UUID, warehouse code, template version) inside the PDF for downstream systems to read. That’s a separate, complementary capability — XMP custom metadata + arbitrary key-value pairs — which the PDF spec supports and which we’re tracking as a roadmap item. If you need it today, ID-style data usually lives more reliably in your platform’s own database, keyed by the PDF’s filename or a hash, than inside the PDF itself. Metadata is for the document’s identity, not for moving structured business data through PDFs as a transport layer.

Branded metadata (today) ≠ hidden business-data flow (separate). Worth keeping them separate in your own planning.
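Keeping business context in your own database, keyed by a hash of the PDF, can be sketched as follows. This is a minimal illustration using SQLite and SHA-256. The table name, function names and context fields are assumptions made for the example.

```python
import hashlib
import json
import sqlite3


def pdf_key(pdf_bytes: bytes) -> str:
    """Content hash used as the lookup key for a rendered PDF."""
    return hashlib.sha256(pdf_bytes).hexdigest()


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the store; the default keeps everything in memory."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pdf_context ("
        "sha256 TEXT PRIMARY KEY, context TEXT NOT NULL)"
    )
    return conn


def remember(conn: sqlite3.Connection, pdf_bytes: bytes, context: dict) -> str:
    """Store business context for a PDF and return its key."""
    key = pdf_key(pdf_bytes)
    conn.execute(
        "INSERT OR REPLACE INTO pdf_context (sha256, context) VALUES (?, ?)",
        (key, json.dumps(context)),
    )
    conn.commit()
    return key


def lookup(conn: sqlite3.Connection, pdf_bytes: bytes):
    """Return the stored context for these exact bytes, or None if unknown."""
    row = conn.execute(
        "SELECT context FROM pdf_context WHERE sha256 = ?", (pdf_key(pdf_bytes),)
    ).fetchone()
    return json.loads(row[0]) if row else None
```

A content hash only matches while the bytes are unchanged. If a downstream tool rewrites the file, key by filename or another identifier you control instead.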

The smallest possible upgrade

If you already POST to /api/v1/pdf/render and your current call has no settings.metadata, the smallest improvement is four lines added to the JSON you already send:

 {
   "pages": [...],
   "settings": {
+    "metadata": {
+      "author":   "Your customer's organisation",
+      "producer": "Your platform"
+    }
   }
 }

Two fields, one new key. Verifiable with pdfinfo in seconds. Once these land, fill in title, language, subject and creator when you have time.
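If the request body is built in code rather than written by hand, the same change is a small helper. The sketch below assumes the payload shape shown in the diff above. The function name with_branding is hypothetical, and sending the result is left to whatever HTTP client you already use.

```python
import copy


def with_branding(payload: dict, author: str, producer: str) -> dict:
    """Return a copy of a render payload with author and producer set.

    Existing settings, and any metadata the caller already supplied, are kept;
    the two values are only filled in where they are missing.
    """
    branded = copy.deepcopy(payload)  # leave the caller's payload untouched
    metadata = branded.setdefault("settings", {}).setdefault("metadata", {})
    metadata.setdefault("author", author)
    metadata.setdefault("producer", producer)
    return branded
```

Because it only fills gaps, the helper is safe to apply to every outgoing request, including ones that already carry a full metadata block.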

Where this lands in the gPdf API

Six fields inside settings.metadata. Per-token policies can also strip or default these fields, so a multi-tenant SaaS can ensure every PDF its customers generate is correctly attributed without trusting each API caller to set them.
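To make the strip-or-default idea concrete, here is a sketch of the same logic applied in your own platform layer before a request goes out. It is not gPdf’s policy schema: MetadataPolicy and apply_policy are hypothetical names used only to illustrate the behaviour.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetadataPolicy:
    """Hypothetical per-tenant rules; not gPdf's actual policy format."""

    # Values used when the caller omits a field or has it stripped.
    defaults: dict[str, str] = field(default_factory=dict)
    # Fields whose caller-supplied values are always discarded.
    strip: frozenset[str] = frozenset()


def apply_policy(requested: dict[str, str], policy: MetadataPolicy) -> dict[str, str]:
    """Merge caller metadata with tenant defaults under the policy."""
    allowed = {
        key: value
        for key, value in requested.items()
        if key not in policy.strip and value  # drop stripped and empty values
    }
    return {**policy.defaults, **allowed}
```

With producer in both strip and defaults, a caller cannot override it, while a field such as title passes through unchanged.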

The visible page is half the brand. The file properties are the other half. If your platform ships PDFs on behalf of customers, both halves should say their name.