Last month I received a PDF from a vendor that triggered three different AV signatures. The file was “clean” — just a contract — but it had embedded JavaScript, an auto-open action, and metadata pointing to an internal network share. The vendor had no idea. This is the reality of document security in 2026: every PDF, DOCX, and image file is a potential attack vector, and most people have zero tooling to deal with it.
That is when I started using Dangerzone seriously. It is an open-source tool from the Freedom of the Press Foundation that converts potentially dangerous documents into safe PDFs by rendering them inside disposable containers. No network access, no persistent state, no trust required in the source file.
How Dangerzone Actually Works Under the Hood
The architecture is surprisingly elegant. Dangerzone uses a two-container pipeline:
- Container 1 (pixels): Takes your input document (PDF, DOCX, XLSX, ODP, images — about 20 formats), converts it to raw RGB pixel data using LibreOffice or Poppler, then outputs flat pixel streams. No parsing of the output format happens here. The container has no network access.
- Container 2 (safe PDF): Takes those raw pixels and reassembles them into a clean PDF with an OCR text layer (via Tesseract). The output PDF contains only images and that text layer: no JavaScript, no macros, no embedded objects, and no metadata from the original.
The key insight: by reducing everything to pixels between containers, you eliminate entire classes of attacks. Malicious macros? Gone after pixel conversion. Embedded executables? Cannot survive rasterization. Tracking URLs in metadata? Stripped completely.
# Install on Ubuntu/Debian (after adding the Freedom of the Press Foundation apt repository)
sudo apt install dangerzone
# Or on macOS
brew install --cask dangerzone
# Convert a single file
dangerzone-cli suspicious-contract.pdf
# Batch convert a directory
find ./inbox -name "*.pdf" -exec dangerzone-cli {} \;
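The `find` one-liner above only picks up PDFs. A minimal sketch of a batch loop that also covers other formats and keeps going past failures might look like this. The extension list and the function names `is_convertible` and `sanitize_dir` are illustrative assumptions, not part of Dangerzone; check the project's documentation for the formats your version supports.

```bash
#!/bin/bash
# Hypothetical helper: decide by extension whether to hand a file to Dangerzone.
# The extension list is an assumption for illustration, not the full format list.
is_convertible() {
    case "$1" in
        *-safe.pdf) return 1 ;;   # skip files that look like earlier Dangerzone output
        *.pdf|*.docx|*.xlsx|*.odp|*.png|*.jpg) return 0 ;;
        *) return 1 ;;
    esac
}

# Convert every matching file in a directory, reporting failures instead of stopping.
sanitize_dir() {
    local dir="$1" failed=0 f
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        is_convertible "$f" || continue
        if ! dangerzone-cli "$f"; then
            echo "FAILED: $f" >&2
            failed=$((failed + 1))
        fi
    done
    echo "Done, $failed failure(s)"
}
```

Calling `sanitize_dir ./inbox` leaves each result next to its source. By default Dangerzone names the output with a `-safe.pdf` suffix, which is why the helper skips files ending that way. The match is case-sensitive, so `SCAN.PDF` would need its own pattern.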
Performance: Real Numbers on Real Hardware
I tested Dangerzone 0.8.1 on my homelab (Xeon E-2278G, 64GB RAM, NVMe storage) processing a batch of 50 documents:
- Simple 2-page PDF: 8-12 seconds per file
- 20-page DOCX with images: 25-35 seconds
- 100-page scanned PDF: 90-120 seconds (OCR is the bottleneck)
- Memory usage: peaks at ~800MB per conversion (container overhead)
It is not fast. If you are processing hundreds of files daily, you will want to run it on dedicated hardware. But for the security-conscious workflow of “I just received something from an unknown sender,” 10 seconds is nothing.
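To reproduce numbers like these on your own hardware, a small timing wrapper is enough. This is only a sketch: `elapsed_seconds` is a hypothetical helper name, and it measures wall-clock time in whole seconds, nothing finer.

```bash
#!/bin/bash
# Hypothetical helper: run a command, print how many whole seconds it took,
# and pass the command's exit status through to the caller.
elapsed_seconds() {
    local start end status
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    status=$?
    end=$(date +%s)
    echo "$((end - start))"
    return "$status"
}
```

For example, `elapsed_seconds dangerzone-cli sample.pdf` prints the conversion time for one file. Run it a few times per document type, since the first conversion after a reboot can include container startup cost.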
Dangerzone vs. The Alternatives
I tested three approaches head-to-head:
Dangerzone (free, open source, local):
- Pros: fully offline, open source, handles 20+ formats, OCR output is searchable
- Cons: slow on large files, requires Docker/Podman, no batch GUI
Qubes OS TrustedPDF (free, requires Qubes):
- Pros: VM-level isolation (stronger than containers), integrated into the OS
- Cons: requires running Qubes as your daily OS, PDF-only, no OCR layer
Online sanitization services (various, cloud-based):
- Pros: nothing to install, usually faster
- Cons: you are uploading potentially sensitive documents to a third party — defeats the purpose
For most people who are not running Qubes, Dangerzone is the best option. It works on Windows, macOS, and Linux, and it never phones home.
My Actual Workflow
I have integrated Dangerzone into my document pipeline with a simple bash script:
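#!/bin/bash
# ~/bin/safe-open.sh - sanitize a document, then open the safe copy
INPUT="$1"
# Dangerzone always writes a PDF, so swap the original extension for .pdf
NAME="$(basename "$INPUT")"
OUTPUT="/tmp/safe-${NAME%.*}.pdf"
echo "Sanitizing: $INPUT"
# --output-filename sets the destination; stderr stays visible for debugging
if dangerzone-cli "$INPUT" --output-filename "$OUTPUT"; then
    xdg-open "$OUTPUT"
    echo "Opened sanitized version"
else
    echo "FAILED: Document could not be sanitized" >&2
    echo "This might indicate something malicious." >&2
fi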
I set this as my default PDF handler for files downloaded from email. Every attachment gets sanitized before I see it. The 10-second delay is barely noticeable.
When You Need This (And When You Do Not)
Use Dangerzone when:
- You are opening documents from unknown or untrusted sources
- You are a journalist receiving leaked documents
- You are processing vendor contracts or RFPs from new companies
- You work in finance and receive documents from clients (the same trust-nothing approach as verifying JWTs locally)
Skip it when:
- The documents come from trusted internal sources you have worked with for years
- You need to preserve exact formatting (pixel conversion loses vector quality)
- Speed matters more than security for your use case
The Privacy Angle Most People Miss
Beyond malware protection, Dangerzone strips all metadata. When you sanitize a document before sharing it, you remove:
- Author names and organization info
- Edit history and tracked changes
- GPS coordinates in embedded images (same problem I wrote about in my EXIF metadata article)
- Internal file paths revealed in error messages
- Hidden comments and revision marks
I have seen NDAs with tracked changes showing the entire negotiation history. I have seen “final” PDFs with the original author's home directory in the metadata. Dangerzone fixes all of this in one pass.
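One way to confirm the metadata is actually gone is to compare `pdfinfo` output before and after conversion. The sketch below assumes `pdfinfo` from poppler-utils is installed; `identifying_fields` is a hypothetical helper that filters its `Key: value` output down to a few fields that tend to identify people or software.

```bash
#!/bin/bash
# Hypothetical helper: read pdfinfo-style "Key: value" lines on stdin and
# print only the fields that commonly identify an author or their tooling.
# "|| true" keeps the exit status at 0 when nothing matches.
identifying_fields() {
    grep -E '^(Title|Subject|Keywords|Author|Creator|Producer):' || true
}
```

Run `pdfinfo contract.pdf | identifying_fields` and then the same on the sanitized copy. The safe file may still report a producer, since something had to write the new PDF, but the original author and creator values should not carry over.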
Setting It Up Right
One gotcha: Dangerzone needs either Docker or Podman. On Linux, I recommend Podman — it runs rootless by default, which means even if someone exploits the container runtime, they do not get root on your host.
# Install Podman (preferred for security)
sudo apt install podman
# Verify Dangerzone is installed and runs
dangerzone-cli --help
# Whether the runtime (e.g. podman) is reported here depends on the Dangerzone version
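A quick preflight check before the first conversion can save some confusion. This sketch assumes Podman is the runtime; `have_cmd` and `preflight` are hypothetical helper names, and the check relies on the rootless field that `podman info` exposes through its Go template format.

```bash
#!/bin/bash
# Hypothetical helper: succeed only if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Check that both tools exist, then ask Podman whether it is running rootless.
preflight() {
    local tool
    for tool in dangerzone-cli podman; do
        if ! have_cmd "$tool"; then
            echo "missing: $tool" >&2
            return 1
        fi
    done
    # Prints "true" when Podman is running without root privileges.
    podman info --format '{{.Host.Security.Rootless}}'
}
```

If `preflight` prints `false`, Podman is running as root and the rootless benefit described above does not apply to that setup.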
On macOS, you will need Docker Desktop or Podman Desktop installed. The GUI version works fine — just drag and drop files onto it.
If you are already running a homelab with Docker (network segmentation helps here), adding Dangerzone is trivial. I run it on my TrueNAS box and access it over SSH for batch jobs.
What I Would Improve
Dangerzone is not perfect. My complaints after 6 months of daily use:
- No watch-folder mode for automated processing
- OCR quality degrades on handwritten documents
- The container pull on first run is ~2GB — not great for limited storage
- No API or daemon mode for integration with other tools
I have been considering wrapping it with inotifywait for a watch-folder setup. If you are security-conscious enough to sanitize documents, you probably want to automate it.
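A watch-folder wrapper along those lines could be sketched as follows. It assumes `inotifywait` from inotify-tools is installed; the function names and the rule for skipping `-safe.pdf` files are illustrative choices, not Dangerzone features.

```bash
#!/bin/bash
# Hypothetical filter: ignore Dangerzone's own "-safe.pdf" output plus hidden
# and partial files, so the watcher never reprocesses its own results.
should_sanitize() {
    local name
    name=$(basename "$1")
    case "$name" in
        .*|*.part|*.tmp|*-safe.pdf) return 1 ;;
        *) return 0 ;;
    esac
}

# Watch a directory and sanitize each file once it has been fully written
# (close_write) or moved into place (moved_to).
watch_inbox() {
    inotifywait -m -e close_write -e moved_to --format '%w%f' "$1" |
    while IFS= read -r path; do
        should_sanitize "$path" || continue
        dangerzone-cli "$path" || echo "FAILED: $path" >&2
    done
}
```

`watch_inbox ~/inbox` blocks and processes files as they arrive. Conversions run one at a time, which matches the per-file memory cost noted earlier, and filenames containing newlines are not handled.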
For the paranoid: pair Dangerzone with a solid encrypted storage setup. I keep sanitized documents on an encrypted ZFS dataset. A Samsung T9 portable SSD with hardware encryption works great for this if you need portability — I use one for sensitive client docs that need to travel. Full disclosure: affiliate link.
Dangerzone is at github.com/freedomofpress/dangerzone. It is free, it is open source, and it solves a real problem that most people ignore until they get hit. Install it, set it as your default document handler for untrusted files, and forget about it.