Convert Man Pages to ASCII, HTML, Markdown, and PDF (Groff + Pandoc)

2026-01-02 4 min read Linux Developer Tools Documentation

Introduction

Man pages are still the source of truth for a lot of Unix tooling, but they’re not always convenient to share. Maybe you want to:

  • paste docs into a README
  • generate a PDF for offline use
  • publish HTML on an internal wiki
  • extract clean text for search or notes

The good news: the man ecosystem is built on predictable pipelines (roff → formatter → output). Once you know the right switches, conversions are basically one-liners.

Step 1: Find the man page file path

Before converting, it’s useful to resolve where the man page actually lives (often compressed):

man -w nmap
# or
man --path nmap

This typically returns something like:

  • /usr/share/man/man1/nmap.1.gz
  • /usr/local/share/man/man1/foo.1

Validation: If man -w <cmd> prints a path, you’re good. Next: pick your output format.
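For scripts, the same check can be wrapped in a small function. This is only a sketch: the name resolve_manpage is made up for illustration, and it assumes that man -w prints a single path and exits non-zero when no page exists (man-db behaves this way; other implementations may differ).

```shell
# resolve_manpage NAME: print the source path of NAME's man page.
# Hypothetical helper; assumes "man -w" fails when there is no such page.
resolve_manpage() {
    page=$(man -w -- "$1" 2>/dev/null) || {
        echo "no man page found for: $1" >&2
        return 1
    }
    # Guard against an empty answer as well as a failed lookup.
    [ -n "$page" ] || return 1
    printf '%s\n' "$page"
}
```

Usage: src=$(resolve_manpage nmap) && echo "$src"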

ASCII / Plain Text output

Option A: Use man to emit plain text

This is the simplest “shareable text” approach:

man nmap | col -bx > nmap.txt

Why col -bx?

  • -b drops the backspaces used for bold/underline overstrikes, keeping only the last character written at each position
  • -x writes spaces instead of tabs

If you want a fixed line width rather than the default, set MANWIDTH (120 columns here):

MANWIDTH=120 man nmap | col -bx > nmap.txt
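Both steps fit in a small wrapper. A minimal sketch: man_to_text is a hypothetical helper name, the 80-column default is an arbitrary choice, and it assumes a man implementation that honors MANWIDTH (man-db does).

```shell
# man_to_text NAME [WIDTH]: print NAME's man page as plain text on stdout.
# Hypothetical helper; WIDTH defaults to 80 columns (an arbitrary choice).
man_to_text() {
    # MANWIDTH sets the line length; col -b drops backspace overstrikes
    # and col -x writes spaces instead of tabs.
    MANWIDTH="${2:-80}" man -- "$1" | col -bx
}
```

Usage: man_to_text nmap 120 > nmap.txt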

HTML output

Option A: groff directly (classic)

If you already have the man source file:

zcat /usr/share/man/man1/nmap.1.gz | groff -mandoc -Thtml > nmap.html

If the file isn’t gzipped, use cat instead of zcat. On macOS and some BSDs, zcat may expect .Z files; gzip -cd is the more portable spelling there.

Option B: man -Thtml (clean and easy)

Some man implementations (man-db, for example) accept -T to pick a groff output device, so they can render directly:

man -Thtml nmap > nmap.html
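Since not every man accepts -T, a script can try Option B first and fall back to Option A. The sketch below is illustrative only: man_html is a made-up name, and it assumes that an unsupported -Thtml either exits non-zero or leaves the output file empty.

```shell
# man_html NAME OUT: render NAME's man page to the HTML file OUT.
# Hypothetical helper: tries "man -Thtml" first, then falls back to groff.
man_html() {
    name=$1
    out=$2
    # Option B: let man drive groff, if this man supports -T.
    if man -Thtml -- "$name" > "$out" 2>/dev/null && [ -s "$out" ]; then
        return 0
    fi
    # Option A: locate the source file and run groff on it directly.
    src=$(man -w -- "$name") || return 1
    case "$src" in
        *.gz) gzip -cd -- "$src" ;;
        *)    cat -- "$src" ;;
    esac | groff -mandoc -Thtml > "$out"
}
```

Usage: man_html nmap nmap.html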

Markdown output (Pandoc)

Pandoc can parse man/roff and emit Markdown. You can either feed it the file or pipe content.

Convert from a file

If you already have the uncompressed .1 file:

pandoc nmap.1 -f man -t markdown -o nmap.md

Note: specifying -f man makes the intent explicit and avoids mis-detection.
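If you have a whole directory of uncompressed pages, the same command loops nicely. A rough sketch, assuming section-1 files named NAME.1; convert_dir is a hypothetical helper name, and it writes NAME.md next to each source file.

```shell
# convert_dir DIR: convert every uncompressed NAME.1 in DIR to NAME.md.
# Hypothetical helper; only handles section-1 pages.
convert_dir() {
    for f in "$1"/*.1; do
        [ -f "$f" ] || continue    # the glob matched nothing
        pandoc "$f" -f man -t markdown -o "${f%.1}.md"
    done
}
```

Usage: convert_dir ./man-sources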

One-liner: resolve path, decompress if needed, convert to Markdown

This version takes the program name from a shell variable:

# $doc contains the program name (e.g. doc=nmap)
man -w -- "$doc" | xargs -r -I{} sh -c 'case "$1" in (*.gz) gzip -cd -- "$1" ;; (*) cat -- "$1" ;; esac' sh {} \
  | pandoc -f man -t markdown -o "${doc}.md"

What’s happening:

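  • man -w -- "$doc" prints the real path to the manpage
  • xargs -r -I{} hands that path to a small sh script as $1 and skips the script if no path was printed (-r is a GNU extension)
  • case chooses gzip -cd for .gz or cat for plain files
  • pandoc -f man -t markdown converts roff → Markdown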
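If the one-liner is hard to read, the same steps can be split into two functions. This is a sketch under a few assumptions: manpage_cat and man_to_md are hypothetical names, the .bz2 and .xz branches are an addition for systems that compress pages that way, and the output lands in the current directory.

```shell
# manpage_cat FILE: write the roff source to stdout, decompressing by extension.
# The .bz2 and .xz branches are an assumption about other systems' layouts.
manpage_cat() {
    case "$1" in
        *.gz)  gzip -cd -- "$1" ;;
        *.bz2) bzip2 -cd -- "$1" ;;
        *.xz)  xz -cd -- "$1" ;;
        *)     cat -- "$1" ;;
    esac
}

# man_to_md NAME [FORMAT]: write NAME.md in the current directory.
# FORMAT is a pandoc output format such as markdown (the default) or gfm.
man_to_md() {
    src=$(man -w -- "$1") || return 1
    manpage_cat "$src" | pandoc -f man -t "${2:-markdown}" -o "$1.md"
}
```

Usage: man_to_md nmap, or man_to_md nmap gfm for GitHub-flavored output.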

PDF output

Option A: man -Tpdf (fastest)

Install the groff tools if you don’t have them (Debian/Ubuntu shown; use your distribution’s package manager elsewhere):

sudo apt install groff

Then:

man -Tpdf nmap > nmap.pdf

Option B: Groff pipeline (more explicit)

If you’re working from the underlying man source:

zcat /usr/share/man/man1/nmap.1.gz | groff -mandoc -Tpdf > nmap.pdf
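In a script it can help to check for PDF support before running the pipeline, so a missing driver produces a clear message instead of an empty file. A hedged sketch: have_groff_pdf and man_source_to_pdf are hypothetical names, and the check assumes that groff's PDF driver is installed as a gropdf program on PATH, which is the usual layout but not guaranteed.

```shell
# have_groff_pdf: succeed if groff and its PDF driver (gropdf) are on PATH.
# Assumption: the driver is installed as a "gropdf" executable.
have_groff_pdf() {
    command -v groff >/dev/null 2>&1 && command -v gropdf >/dev/null 2>&1
}

# man_source_to_pdf FILE.gz OUT.pdf: render a gzipped man source to PDF.
man_source_to_pdf() {
    have_groff_pdf || {
        echo "groff PDF driver not found; install the full groff package" >&2
        return 1
    }
    gzip -cd -- "$1" | groff -mandoc -Tpdf > "$2"
}
```

Usage: man_source_to_pdf /usr/share/man/man1/nmap.1.gz nmap.pdf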

Common gotchas (and fixes)

  • Output has weird backspace characters (^H)
    • Use: man <cmd> | col -bx
  • man -Tpdf fails
    • Install groff: sudo apt install groff
  • Your man page isn’t gzipped
    • Don’t assume .gz; use the case one-liner above
  • Markdown looks “too literal”
    • Try a different Markdown target, such as -t gfm (GitHub-flavored)
    • Example: pandoc nmap.1 -f man -t gfm -o nmap.md

Conclusion

If you want the shortest path per format:

  • Text/ASCII: man nmap | col -bx > nmap.txt
  • HTML: man -Thtml nmap > nmap.html
  • Markdown: man -w nmap | ... | pandoc -f man -t markdown -o nmap.md (use the one-liner)
  • PDF: man -Tpdf nmap > nmap.pdf