---
name: pdf-tools
description: PDF swiss-army knife. Merge multiple PDFs, split into pages or page ranges, rotate, extract specific pages, encrypt with password, decrypt (if you have the password), compress (via ghostscript when available), and extract text. One CLI with subcommands — `merge`, `split`, `rotate`, `extract`, `lock`, `unlock`, `compress`, `text`. Pure pypdf for most ops. Use when the user asks to "merge PDFs", "split a PDF", "extract pages", "lock a PDF with a password", "compress this PDF", or has any PDF manipulation task.
---


# pdf-tools

PDF Swiss Army knife. One CLI, lots of subcommands. Built on `pypdf` (auto-installed on first run). `ghostscript` (`gs`) is optional and gives higher-quality compression; on macOS, install it with `brew install ghostscript`.

## Install / first run

The script auto-installs `pypdf` into the user site-packages directory on first run. No manual setup is needed.

## Subcommands

All commands run via `python3 scripts/pdf.py <subcommand> [args]`.

### merge — combine multiple PDFs into one
```
python3 scripts/pdf.py merge --inputs a.pdf b.pdf c.pdf --output combined.pdf
```

### split — explode a PDF into one file per page
Writes zero-padded names (`page_001.pdf`, `page_002.pdf`, ...) into the output dir.
```
python3 scripts/pdf.py split --input book.pdf --output-dir pages/
```
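
The naming scheme can be sketched in a few lines of Python. This is only an illustration of the zero-padded pattern described above, not the script's actual code; the helper name `page_filename` and the fixed width of 3 are assumptions.

```python
def page_filename(page_number: int, width: int = 3) -> str:
    """Return the output name for a 1-indexed page, e.g. page 1 -> page_001.pdf.

    Hypothetical helper: the width of 3 is inferred from the example names,
    and the real script may pad differently for very long documents.
    """
    if page_number < 1:
        raise ValueError("page numbers are 1-indexed")
    return f"page_{page_number:0{width}d}.pdf"


# Names for the first three pages of a split.
names = [page_filename(n) for n in range(1, 4)]
```

Zero-padding keeps the files in page order when a directory listing sorts them alphabetically.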

### rotate — rotate specific pages by 90/180/270 degrees
```
python3 scripts/pdf.py rotate --input doc.pdf --pages 1,3-5 --degrees 90 --output out.pdf
```

### extract — pull specific pages into a new PDF
```
python3 scripts/pdf.py extract --input doc.pdf --pages 2-5 --output excerpt.pdf
```

### lock — encrypt a PDF with a password
```
python3 scripts/pdf.py lock --input doc.pdf --password hunter2 --output secured.pdf
```

### unlock — remove the password from a PDF (you must know it)
```
python3 scripts/pdf.py unlock --input secured.pdf --password hunter2 --output open.pdf
```

### compress — shrink a PDF
Uses `ghostscript` if installed, which usually gives a much better size/quality trade-off. Otherwise it falls back to pypdf stream compression.
Quality presets, from smallest file to largest: `screen`, `ebook` (default), `printer`, `prepress` (best quality).
```
python3 scripts/pdf.py compress --input big.pdf --output small.pdf --quality ebook
```
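
As a rough sketch of how the backend choice and presets could map onto Ghostscript, the Python below detects `gs` on PATH and builds a `pdfwrite` command line using the standard `-dPDFSETTINGS` option. The function names `choose_backend` and `build_gs_command` are hypothetical, and the exact flags the script passes are an assumption.

```python
import shutil

# Preset names taken from the --quality option described above.
GS_PRESETS = ("screen", "ebook", "printer", "prepress")


def choose_backend() -> str:
    """Pick ghostscript when `gs` is on PATH, otherwise fall back to pypdf."""
    return "ghostscript" if shutil.which("gs") else "pypdf"


def build_gs_command(src: str, dst: str, quality: str = "ebook") -> list[str]:
    """Build an argument list for Ghostscript's pdfwrite device (assumed flags)."""
    if quality not in GS_PRESETS:
        raise ValueError(f"unknown quality preset: {quality!r}")
    return [
        "gs",
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS=/{quality}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst}",
        src,
    ]


command = build_gs_command("big.pdf", "small.pdf", "ebook")
```

The resulting list can be handed to `subprocess.run(command, check=True)` when `choose_backend()` returns `"ghostscript"`.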

### text — extract all text from a PDF
```
python3 scripts/pdf.py text --input doc.pdf --output doc.txt
```

## Page-range syntax

Anywhere `--pages` is accepted, the format is comma-separated 1-indexed pages and ranges:

- `1` — page 1
- `1,3,5` — pages 1, 3, 5
- `2-5` — pages 2 through 5
- `1,3-5,8` — pages 1, 3, 4, 5, 8
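
A minimal Python sketch of a parser for this syntax is shown here. It is not the script's own implementation; the function name `parse_pages` and its error handling are assumptions. It returns 0-indexed page numbers, matching how pypdf addresses pages.

```python
def parse_pages(spec: str, page_count: int) -> list[int]:
    """Turn a 1-indexed spec such as "1,3-5,8" into 0-indexed page numbers.

    Hypothetical helper: ranges are inclusive, and anything outside
    1..page_count raises ValueError.
    """
    indices: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page spec: {spec!r}")
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"page range {part!r} is outside 1..{page_count}")
        # Shift to 0-indexed; range() excludes its end, so `end` stays as-is.
        indices.extend(range(start - 1, end))
    return indices


pages = parse_pages("1,3-5,8", page_count=10)  # [0, 2, 3, 4, 7]
```

Validating against the page count up front means a bad range fails before any output file is written.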

## Notes

- Page numbers on the CLI are **1-indexed** (human-friendly). The script converts them to pypdf's 0-indexed pages internally.
- `compress` will auto-detect `gs` on PATH; if missing, it falls back silently to pypdf compression.
- `unlock` requires the correct password — there is no cracking. If you don't know it, you don't get in.
