How to Generate PDFs from HTML (Without Puppeteer Headaches)

A practical guide to programmatic PDF generation. Learn why developers are moving away from self-hosted Puppeteer and what alternatives actually work.

If you've ever tried to generate a PDF from HTML programmatically, you know the frustration: missing Chrome dependencies, memory leaks, and timeouts.

There has to be a better way, right?

In this guide, we'll walk through three approaches to HTML-to-PDF conversion, from the DIY nightmare to the "it just works" solution.

Method 1: Self-Hosted Puppeteer (The Hard Way)

Puppeteer is Google's headless Chrome automation library. It can do anything a browser can do, including rendering HTML to PDF.

The Setup

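const puppeteer = require('puppeteer');

async function generatePDF(html) {
  const browser = await puppeteer.launch({
    // On Puppeteer v22+, `true` selects the new headless mode (older releases used 'new')
    headless: true,
    // Disabling the sandbox is common in containers; only do this with trusted HTML
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });

  try {
    const page = await browser.newPage();
    // Wait until images, fonts, and stylesheets have finished loading
    await page.setContent(html, { waitUntil: 'networkidle0' });

    return await page.pdf({
      format: 'A4',
      printBackground: true
    });
  } finally {
    // Always close the browser, even if rendering throws
    await browser.close();
  }
}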

Looks simple, right? Here's what you'll actually deal with:

Reality Check: This code usually works locally and then breaks in production. Missing Chrome dependencies, memory leaks, and timeouts are the usual culprits, and they can eat a surprising amount of your time.

The Problems

1. Deployment Hell

Puppeteer requires Chrome (a download of 100MB or more, depending on platform) plus dozens of system libraries. On a fresh Ubuntu server, you'll need to install:

apt-get install -y \
  libnss3 libatk-bridge2.0-0 libcups2 \
  libdrm2 libxkbcommon0 libxcomposite1 \
  libxdamage1 libxfixes3 libxrandr2 \
  libgbm1 libasound2

Miss one, and Chrome won't start: Puppeteer throws "Error: Failed to launch the browser process!" without telling you plainly which library is missing.

2. Memory Leaks

Each Puppeteer instance uses 150-250MB of RAM. If you don't properly close browser instances, your server will run out of memory within hours.
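
One way to make sure a browser instance is always released is to wrap the launch/close pair in a small helper. The sketch below assumes a hypothetical `withBrowser` function (it is not part of Puppeteer) that accepts any launcher returning an object with a `close()` method, so it can be exercised without Chrome installed.

```javascript
// Hypothetical helper: run `work` with a browser and always close it afterwards.
// `launch` is any async function that returns an object with a close() method,
// for example () => puppeteer.launch().
async function withBrowser(launch, work) {
  const browser = await launch();
  try {
    return await work(browser);
  } finally {
    // Runs on success and on failure, so the process is never left behind
    await browser.close();
  }
}
```

With Puppeteer, this could be called as `withBrowser(() => puppeteer.launch(), async (browser) => { ... })`. The point of the design is that cleanup lives in one place instead of being repeated at every call site.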

3. Unpredictable Performance

PDF generation times vary wildly based on:

4. Maintenance Burden

Chrome updates break things. Chrome ships a new major version roughly every four weeks, and Puppeteer releases track it, so expect periodic deprecation warnings or outright failures that require code changes.

When Puppeteer Makes Sense: You need full browser automation (clicking buttons, filling forms, scraping dynamic content). For simple HTML→PDF conversion, it's overkill.

Method 2: wkhtmltopdf (The Old Way)

Before Puppeteer, there was wkhtmltopdf - a command-line tool that converts HTML to PDF using the WebKit rendering engine.

The Setup

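const { execFile } = require('child_process');

function generatePDF(htmlFile, outputFile) {
  return new Promise((resolve, reject) => {
    // execFile passes arguments directly, without a shell, so file names
    // containing spaces or shell metacharacters cannot inject commands
    execFile('wkhtmltopdf', [htmlFile, outputFile], (error) => {
      if (error) reject(error);
      else resolve();
    });
  });
}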

Why Developers Moved Away

wkhtmltopdf is no longer maintained (its GitHub repository was archived in 2023) and is built on an old Qt WebKit engine. It doesn't support modern web features such as CSS Grid or ES2015+ JavaScript.

Unless you're maintaining legacy code, skip this entirely.

Method 3: PDF Generation APIs (The Smart Way)

PDF APIs handle all the complexity for you. You send HTML, you get a PDF back. No Chrome dependencies, no memory management, no deployment headaches.

The Modern Approach

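// Node 18+ ships a global fetch, so no extra dependency is needed

async function generatePDF(html) {
  const response = await fetch('https://api.snapapi.io/pdf', {
    method: 'POST',
    headers: {
      'X-API-Key': process.env.SNAPAPI_KEY,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ html, format: 'A4' })
  });

  // fetch only rejects on network errors, so check the HTTP status explicitly
  if (!response.ok) {
    throw new Error(`PDF request failed with status ${response.status}`);
  }

  return Buffer.from(await response.arrayBuffer());
}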

That's it. No Chrome installation, no dependency management, no memory leaks.

The Benefits

Self-Hosted Puppeteer

  • Setup time: 4-8 hours
  • Deployment: Complex
  • Maintenance: High
  • Cost: Server overhead

PDF API

  • Setup time: 5 minutes
  • Deployment: Copy/paste
  • Maintenance: Zero
  • Cost: Predictable

Real-World Use Cases

1. Invoice Generation

Convert HTML invoices to PDF for email delivery:

const invoiceHTML = `
<html>
  <body>
    <h1>Invoice #12345</h1>
    <p>Amount: $299.00</p>
  </body>
</html>
`;

const pdf = await generatePDF(invoiceHTML);
// Email the PDF to customer

2. Report Generation

Create monthly analytics reports from dashboards:

const reportHTML = await renderDashboard(userId);
const pdf = await generatePDF(reportHTML);
await saveToS3(pdf, `reports/${userId}/monthly.pdf`);

3. Certificate Generation

Generate completion certificates for online courses:

const certificateHTML = renderCertificate({
  name: user.name,
  course: 'Web Development Fundamentals',
  completionDate: new Date()
});

const pdf = await generatePDF(certificateHTML);
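
The `renderCertificate` function is not shown in this guide. As a rough sketch of what it might look like, the version below is a hypothetical template function that escapes user-supplied values before placing them in the markup. The `escapeHtml` helper and the HTML structure are assumptions for illustration only.

```javascript
// Hypothetical helper: escape characters that have special meaning in HTML
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical template function; the markup is illustrative only
function renderCertificate({ name, course, completionDate }) {
  // Format the date as YYYY-MM-DD (in UTC)
  const date = completionDate.toISOString().slice(0, 10);
  return `<html>
  <body>
    <h1>Certificate of Completion</h1>
    <p>${escapeHtml(name)} completed ${escapeHtml(course)} on ${date}.</p>
  </body>
</html>`;
}
```

Escaping matters here because a name containing `<` or `&` would otherwise be parsed as markup and could break the layout of the generated PDF.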

Choosing Your Approach

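Use Self-Hosted Puppeteer if you need full browser automation (clicking buttons, filling forms, scraping dynamic content) alongside PDF output.

Use a PDF API if you only need HTML-to-PDF conversion and would rather not manage Chrome, its system dependencies, and its memory use yourself.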

Try SnapAPI's PDF Generation

Convert HTML to PDF with a single API call. Test it live on our homepage - no signup required.

Best Practices

Regardless of which method you choose, follow these guidelines for better PDF output:

1. Use Absolute URLs for Assets

Images, fonts, and stylesheets should use full URLs, not relative paths:
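
If your templates already contain relative paths, one option is to rewrite them before rendering. The sketch below uses a hypothetical `absolutizeUrls` helper built on Node's standard `URL` class. It is regex-based, so it assumes simple, well-formed markup with quoted attributes; a real HTML parser would be more robust. The `example.com` base URL is a placeholder.

```javascript
// Hypothetical helper: resolve relative src/href values against a base URL.
// Regex-based sketch; it also touches attributes such as data-src.
function absolutizeUrls(html, baseUrl) {
  return html.replace(/\b(src|href)=(["'])(.*?)\2/gi, (match, attr, quote, value) => {
    // Leave data URIs and in-page fragments untouched
    if (/^(data:|#)/i.test(value)) return match;
    try {
      // new URL() resolves relative paths and passes absolute URLs through
      return `${attr}=${quote}${new URL(value, baseUrl).href}${quote}`;
    } catch {
      return match; // not a parseable URL; keep the original attribute
    }
  });
}

// Example with a placeholder base URL
const fixed = absolutizeUrls(
  '<img src="/img/logo.png"><link rel="stylesheet" href="style.css">',
  'https://example.com/invoices/'
);
```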





2. Embed CSS Inline for Critical Styles

External stylesheets can fail to load. Inline critical CSS:
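
As a minimal sketch of this idea, the hypothetical `inlineCss` helper below injects a CSS string into the document as a `<style>` block. It assumes you already have the stylesheet contents as a string (for example, read from disk at startup) and that the HTML is simple enough for a string replacement.

```javascript
// Hypothetical helper: embed a CSS string directly in the HTML document
function inlineCss(html, css) {
  const styleTag = `<style>${css}</style>`;
  // Insert just before </head> when present; otherwise prepend to the document.
  // A function replacer is used so "$" characters in the CSS are not
  // interpreted as replacement patterns.
  if (/<\/head>/i.test(html)) {
    return html.replace(/<\/head>/i, () => `${styleTag}</head>`);
  }
  return styleTag + html;
}
```

The returned string can be passed to `generatePDF` in place of the original HTML, so rendering no longer depends on a stylesheet request succeeding.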

3. Set Explicit Page Sizes

Don't rely on browser defaults. Specify dimensions:

await page.pdf({
  format: 'A4',
  margin: { top: '20px', bottom: '20px' },
  printBackground: true
});

4. Test with Real Data

PDFs that look perfect with placeholder text often break with real user data. Test with edge cases: long names, special characters, images of varying sizes.
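
One lightweight way to do this is to keep a small set of edge-case fixtures and render a document for each. The sketch below is an assumption about how that could look: the fixture values and the `buildTestDocuments` helper are hypothetical, and the template function is supplied by the caller.

```javascript
// Hypothetical fixtures covering the edge cases mentioned above
const edgeCases = [
  { label: 'long-name', name: 'A'.repeat(200) },           // no spaces to wrap on
  { label: 'non-ascii', name: 'Zoë Müller-Østergård' },    // accented characters
  { label: 'markup', name: '<b>Bob</b> & "Co"' }           // must be escaped by the template
];

// Build one HTML document per fixture using a caller-supplied template function
function buildTestDocuments(renderTemplate, cases = edgeCases) {
  return cases.map(({ label, name }) => ({ label, html: renderTemplate(name) }));
}
```

Each resulting `html` string can then be passed to `generatePDF` and the output saved under its `label` for visual inspection.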

Conclusion

HTML-to-PDF conversion doesn't have to be painful. While Puppeteer gives you maximum control, it comes with significant operational overhead.

For most developers, a PDF API is the pragmatic choice: faster to implement, easier to maintain, and more reliable in production.

The question isn't "which technology is best" - it's "what's the fastest path to shipping a working solution?"