---
name: screenshot-text
description: OCR any image (or PDF page, or video frame) to plain text. Uses Tesseract under the hood. Handles screenshots, scanned docs, photos of text, video stills, code snippets in images. Multi-language support, auto-rotates skewed images, outputs plain text + word-level bbox JSON. Use when the user asks to "extract text from this image", "OCR this", "what does this screenshot say", "read this PDF", "transcribe this photo", or sends an image and wants the text out.
---


# screenshot-text

Image (or PDF, or video frame) → plain text. Powered by Tesseract.

## Single image

```bash
python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input screenshot.png
# prints text to stdout + saves screenshot.txt
```

## PDF (each page → text file)

```bash
python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input scan.pdf
# → scan_ocr/page_001.txt, page_002.txt, ... + scan_ocr/all.txt (concatenated)
```
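
To find which page mentions a term after a PDF run, the per-page files can be scanned directly. This is a minimal sketch, assuming the `page_NNN.txt` naming shown above; `pages_containing` is a hypothetical helper, not part of `ocr.py`.

```python
import re
from pathlib import Path


def pages_containing(ocr_dir, needle):
    """Return sorted page numbers whose OCR text contains `needle` (case-insensitive)."""
    hits = []
    for path in Path(ocr_dir).glob("page_*.txt"):
        match = re.fullmatch(r"page_(\d+)\.txt", path.name)
        if match is None:
            continue  # skip anything that does not follow the page_NNN.txt pattern
        text = path.read_text(encoding="utf-8", errors="replace")
        if needle.lower() in text.lower():
            hits.append(int(match.group(1)))
    return sorted(hits)
```

For example, `pages_containing("scan_ocr", "invoice")` would list the pages of `scan.pdf` whose text includes "invoice"; `all.txt` is ignored because it does not match the page pattern.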

## Video (every Nth second, OCR each frame)

```bash
python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input lecture.mp4 --every 5
# → lecture_ocr/frame_00-00-00.txt, ...
```
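
The frame file names encode where in the video each sample was taken. The sketch below turns a name back into an offset in seconds, assuming the pattern is `frame_HH-MM-SS.txt` (inferred from the example above, not confirmed); `frame_offset_seconds` is a hypothetical helper.

```python
import re

# Assumed naming pattern: frame_HH-MM-SS.txt
FRAME_NAME = re.compile(r"frame_(\d+)-(\d{2})-(\d{2})\.txt")


def frame_offset_seconds(filename):
    """Convert 'frame_HH-MM-SS.txt' into an offset in seconds from the start of the video."""
    match = FRAME_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected frame file name: {filename!r}")
    hours, minutes, seconds = (int(part) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds
```

With `--every 5`, the second sampled frame would presumably be `frame_00-00-05.txt`, which this maps to an offset of 5 seconds.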

## Folder of images

```bash
python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input ./scans/
```

## Word-level JSON (with bounding boxes)

```bash
python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input page.png --json
# → page.ocr.json with [{text, conf, x, y, w, h}, ...]
```
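
The word-level JSON is handy for dropping low-confidence noise before using the text. The sketch below filters words by `conf` and regroups them into lines by vertical position. It assumes `conf` uses Tesseract's 0 to 100 scale and that `x`/`y` are the top-left corner in pixels; `confident_lines`, `min_conf`, and `line_tolerance` are hypothetical names chosen for illustration.

```python
import json


def confident_lines(words, min_conf=60, line_tolerance=10):
    """Drop low-confidence words, then group the rest into text lines by vertical position."""
    kept = [w for w in words if w["text"].strip() and float(w["conf"]) >= min_conf]
    kept.sort(key=lambda w: (w["y"], w["x"]))

    lines = []  # each entry: [reference_y, [words on that line]]
    for word in kept:
        if lines and abs(word["y"] - lines[-1][0]) <= line_tolerance:
            lines[-1][1].append(word)
        else:
            lines.append([word["y"], [word]])

    # Within a line, read left to right.
    return [" ".join(w["text"] for w in sorted(ws, key=lambda w: w["x"])) for _, ws in lines]


# Hand-written sample in the [{text, conf, x, y, w, h}, ...] shape described above.
sample = json.loads("""[
  {"text": "Hello", "conf": 96, "x": 10, "y": 12, "w": 50, "h": 14},
  {"text": "world", "conf": 91, "x": 70, "y": 14, "w": 52, "h": 14},
  {"text": "~~", "conf": 12, "x": 130, "y": 13, "w": 20, "h": 14},
  {"text": "Second", "conf": 88, "x": 10, "y": 40, "w": 60, "h": 14}
]""")
print(confident_lines(sample))  # ['Hello world', 'Second']
```

On real output, replace the sample with `json.load(open("page.ocr.json"))`. The fixed pixel tolerance is a simplification; scaling it by the median word height would cope better with mixed font sizes.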

## Other languages

```bash
python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input doc.png --lang spa+eng
# install extra language packs (macOS/Homebrew): brew install tesseract-lang
```
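
A `+`-joined spec such as `spa+eng` only works if every listed language pack is installed. The sketch below checks a spec against a list of installed codes before running OCR; `missing_languages` is a hypothetical helper, and whether `ocr.py` performs a similar check is not known.

```python
def missing_languages(lang_spec, installed):
    """Return the codes in a 'spa+eng'-style spec that are absent from `installed`."""
    requested = [code for code in lang_spec.split("+") if code]
    available = set(installed)
    return [code for code in requested if code not in available]


print(missing_languages("spa+eng", ["eng", "osd"]))  # ['spa']
```

The installed codes can be obtained from `tesseract --list-langs` on the command line, or from `pytesseract.get_languages()` in recent pytesseract releases.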

## Flags

- `--input` — image, PDF, video, or folder
- `--lang` — tesseract language code (default `eng`)
- `--every` — for video: seconds between sampled frames (default 2)
- `--json` — also output word-level JSON with confidence + bounding boxes
- `--no-deskew` — skip auto-rotation of skewed images
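
For reference, the flags above could be declared with `argparse` roughly as follows. This is an illustrative sketch built only from the list above, not the actual contents of `ocr.py`; in particular, treating `--every` as a float is an assumption.

```python
import argparse


def build_parser():
    """Declare the documented flags with their documented defaults."""
    parser = argparse.ArgumentParser(prog="ocr.py")
    parser.add_argument("--input", required=True, help="image, PDF, video, or folder")
    parser.add_argument("--lang", default="eng", help="Tesseract language code(s), e.g. spa+eng")
    parser.add_argument("--every", type=float, default=2.0,
                        help="for video: seconds between sampled frames")
    parser.add_argument("--json", action="store_true",
                        help="also output word-level JSON with confidence and bounding boxes")
    parser.add_argument("--no-deskew", action="store_true",
                        help="skip auto-rotation of skewed images")
    return parser


args = build_parser().parse_args(["--input", "lecture.mp4", "--every", "5"])
print(args.lang, args.every, args.json, args.no_deskew)  # eng 5.0 False False
```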

## First-run install

On first run the script auto-installs the Python packages `pytesseract`, `pdf2image`, and `pillow`. The Tesseract binary is installed separately with `brew install tesseract`; the script prompts if it is missing.
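
To check the environment by hand before a first run, something like the following works. It is a sketch of a pre-flight check, not the script's actual install logic; `missing_requirements` is a hypothetical helper. Note that the `pillow` package is imported under the module name `PIL`.

```python
import importlib.util
import shutil


def missing_requirements(modules=("pytesseract", "pdf2image", "PIL"), binary="tesseract"):
    """Return (missing Python modules, whether the binary is missing from PATH)."""
    missing_modules = [name for name in modules if importlib.util.find_spec(name) is None]
    binary_missing = shutil.which(binary) is None
    return missing_modules, binary_missing


modules, no_binary = missing_requirements()
if modules:
    print("Missing Python modules:", ", ".join(modules))
if no_binary:
    print("Tesseract binary not found on PATH; try: brew install tesseract")
```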

## Pairs well with

- `pinterest-gif-scraper` — OCR captions in pinned references
- `clip-search` — find frames where specific text appears in a video
