Convert Man Pages to ASCII, HTML, Markdown, and PDF (Groff + Pandoc)
Converting Man Pages to ASCII, HTML, Markdown, and PDF (with Groff + Pandoc): Introduction
Man pages are still the source of truth for a lot of Unix tooling, but they’re not always convenient to share. Maybe you want to:
- paste docs into a README
- generate a PDF for offline use
- publish HTML on an internal wiki
- extract clean text for search or notes
The good news: the man ecosystem is built on predictable pipelines (roff → formatter → output). Once you know the right switches, conversions are basically one-liners.
Step 1: Find the man page file path
Before converting, it’s useful to resolve where the man page actually lives (often compressed):
man -w nmap
This typically returns something like:
/usr/share/man/man1/nmap.1.gz
/usr/local/share/man/man1/foo.1
Validation: If man -w <cmd> prints a path, you’re good. Next: pick your output format.
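For scripting, the lookup and the validation can be folded into one step. A minimal sketch, assuming a POSIX shell; the function name manpage_path is hypothetical, and keeping only the first line is an assumption for man implementations that print more than one path:

```shell
# manpage_path NAME: print the source file of NAME's man page, or fail.
# (Hypothetical helper, not part of man itself.)
manpage_path() {
    # -w prints the file location instead of rendering the page;
    # "--" keeps a name that starts with "-" from being read as an option.
    _mp_out=$(man -w -- "$1" 2>/dev/null) || {
        printf 'no man page found for: %s\n' "$1" >&2
        return 1
    }
    # Assumption: if several paths are printed, the first one is wanted.
    printf '%s\n' "$_mp_out" | head -n 1
}
```

The function returns non-zero when nothing is found, so it can guard later steps, for example: src=$(manpage_path nmap) || exit 1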
ASCII / Plain Text output
Option A: Use man to emit plain text
This is the simplest “shareable text” approach:
man nmap | col -bx > nmap.txt
Why col -bx?
- -b removes the backspaces used for bold/underline overstrikes
- -x emits spaces instead of tabs
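To see what col -b is undoing, it helps to feed it a hand-made overstrike sequence. A small sketch, assuming a POSIX shell; strip_overstrikes is a hypothetical helper, and the sed branch is only a rough stand-in for systems without col (it does not touch tabs):

```shell
# strip_overstrikes: read text on stdin, write it without overstrikes.
strip_overstrikes() {
    if command -v col >/dev/null 2>&1; then
        col -bx
    else
        # Fallback approximation: delete every "character + backspace" pair,
        # which keeps the last character printed in each column.
        _so_bs=$(printf '\b')
        sed "s/.${_so_bs}//g"
    fi
}

# Bold "NAME" is each letter struck twice; underlined "foo" is "_" overstruck
# by the letter. \b in the printf format is a literal backspace.
printf 'N\bNA\bAM\bME\bE _\bf_\bo_\bo\n' | strip_overstrikes
# prints: NAME foo
```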
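If you want a “terminal width aware” render, control the line length yourself; man-db, for example, reads it from the MANWIDTH environment variable.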
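A minimal sketch of a width-controlled render, assuming a man that honors MANWIDTH (man-db does; other implementations may not). The helper name man_text and the 80-column default are hypothetical choices:

```shell
# man_text NAME [WIDTH]: render NAME's man page as plain text, WIDTH columns wide.
man_text() {
    # The MANWIDTH assignment is scoped to this one man invocation.
    MANWIDTH="${2:-80}" man -- "$1" | col -bx
}
```

To match the current terminal, pass its width explicitly: man_text nmap "$(tput cols)" > nmap.txt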
HTML output
Option A: groff directly (classic)
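If you already have the man source file, decompress it and pipe it through groff with the HTML output device:
zcat /usr/share/man/man1/nmap.1.gz | groff -mandoc -Thtml > nmap.html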
If the file isn’t gzipped, use cat instead of zcat.
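The cat-versus-zcat choice can be made automatically from the file extension. A minimal sketch, assuming groff is installed with the requested output device; groff_render is a hypothetical name, and gzip -cd is used as the equivalent of zcat:

```shell
# groff_render DEVICE FILE: format a man source file with groff and write the
# result to stdout. DEVICE is a groff output device such as html or pdf.
groff_render() {
    case "$2" in
        *.gz) gzip -cd -- "$2" ;;   # compressed page: decompress to stdout
        *)    cat -- "$2" ;;        # plain page: pass through unchanged
    esac | groff -mandoc "-T$1"     # -mandoc picks the man or mdoc macros
}
```

For example: groff_render html /usr/share/man/man1/nmap.1.gz > nmap.html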
Option B: man -Thtml (clean and easy)
Some man implementations can render directly:
man -Thtml nmap > nmap.html
Markdown output (Pandoc)
Pandoc can parse man/roff and emit Markdown. You can either feed it the file or pipe content.
Convert from a file
If you already have the uncompressed .1 file:
pandoc -f man -t markdown nmap.1 -o nmap.md
Note: specifying -f man makes the intent explicit and avoids mis-detection.
One-liner: resolve path, decompress if needed, convert to Markdown
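This is the “variable contains the program name” version (set doc first, e.g. doc=nmap):
src=$(man -w -- "$doc") && case "$src" in *.gz) gzip -cd -- "$src" ;; *) cat -- "$src" ;; esac | pandoc -f man -t markdown -o "$doc.md"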
What’s happening:
- man -w -- "$doc" prints the real path to the man page
- case chooses gzip -cd for .gz or cat for plain files
- pandoc -f man -t markdown converts roff → Markdown
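The three steps above can also be packaged as a function so the name arrives as an argument. A sketch under assumptions: man_to_markdown and the NAME.md output name are hypothetical choices, man -w is taken to print a single path, and pandoc must be a version that includes the man reader:

```shell
# man_to_markdown NAME: write NAME's man page to NAME.md in the current directory.
# Uses plain (global) variables because "local" is not part of POSIX sh.
man_to_markdown() {
    doc=$1
    # Step 1: resolve the real path; stop if man cannot find the page.
    src=$(man -w -- "$doc") || return 1
    # Steps 2 and 3: decompress .gz files (or pass plain files through),
    # then let pandoc parse the roff source and emit Markdown.
    case "$src" in
        *.gz) gzip -cd -- "$src" ;;
        *)    cat -- "$src" ;;
    esac | pandoc -f man -t markdown -o "$doc.md"
}
```

Calling man_to_markdown nmap would leave the result in nmap.md; swapping markdown for another pandoc writer changes only the last step.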
PDF output
Option A: man -Tpdf (fastest)
Install groff tools if you don’t have them:
sudo apt install groff
Then:
man -Tpdf nmap > nmap.pdf
Option B: Groff pipeline (more explicit)
If you’re working from the underlying man source:
zcat /usr/share/man/man1/nmap.1.gz | groff -mandoc -Tpdf > nmap.pdf
Common gotchas (and fixes)
- Output has weird backspace characters (^H)
  - Use: man <cmd> | col -bx
- man -Tpdf fails
  - Install groff: sudo apt install groff
- Your man page isn’t gzipped
  - Don’t assume .gz; use the case one-liner above
- Markdown looks “too literal”
  - Try different Markdown targets: -t gfm (GitHub-flavored)
  - Example: pandoc -f man -t gfm -o nmap.md
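Several of these failures come down to a missing tool, which can be checked before converting anything. A minimal sketch, assuming a POSIX shell; missing_tools is a hypothetical helper, and the tool list in the usage line is simply what the commands in this article call:

```shell
# missing_tools TOOL...: print each TOOL that is not on PATH.
# Returns 0 when everything is present, 1 otherwise.
missing_tools() {
    _mt_status=0
    for _mt_tool in "$@"; do
        if ! command -v "$_mt_tool" >/dev/null 2>&1; then
            printf '%s\n' "$_mt_tool"
            _mt_status=1
        fi
    done
    return "$_mt_status"
}
```

Typical use: missing_tools man col groff pandoc || echo "install the tools listed above first" >&2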
Conclusion
If you want the shortest path per format:
- Text/ASCII: man nmap | col -bx > nmap.txt
- HTML: man -Thtml nmap > nmap.html
- Markdown: man -w nmap | ... | pandoc -f man -t markdown -o nmap.md (use the one-liner)
- PDF: man -Tpdf nmap > nmap.pdf