screenshot-text

BY @BAT — 12 DOWNLOADS — CONTENT

OCR any image (or PDF page, or video frame) to plain text. Uses Tesseract under the hood. Handles screenshots, scanned documents, photos of text, video stills, and code snippets in images. Supports multiple languages, auto-rotates skewed images, and outputs plain text plus, with --json, word-level bounding-box JSON. Use when the user asks to "extract text from this image", "OCR this", "what does this screenshot say", "read this PDF", or "transcribe this photo", or sends an image and wants the text out.

CLI INSTALL

curl -sS https://dem0n.vip/s/bat/screenshot-text/SKILL.md -o ~/.claude/skills/screenshot-text/SKILL.md --create-dirs

DOWNLOAD ALL gives you a single .zip containing SKILL.md + the tar.gz — drag it into Claude Code in one go.

screenshot-text

Image (or PDF, or video frame) → plain text. Powered by Tesseract.

Single image

python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input screenshot.png
# prints text to stdout + saves screenshot.txt
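
For orientation, a single-image run could look roughly like the sketch below. This is not the skill's actual ocr.py, whose source is not shown here. The function names `text_output_path` and `ocr_image` are hypothetical, and writing the .txt file next to the input image is an assumption based on the comment above. The `pytesseract.image_to_string` call is the standard pytesseract entry point.

```python
from pathlib import Path


def text_output_path(image_path):
    """Assumed convention: screenshot.png -> screenshot.txt beside the input."""
    return Path(image_path).with_suffix(".txt")


def ocr_image(image_path, lang="eng"):
    """OCR one image, save the text, and return it (hypothetical helper)."""
    # Imported lazily so the path helper works without the OCR stack installed.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        text = pytesseract.image_to_string(img, lang=lang)
    text_output_path(image_path).write_text(text, encoding="utf-8")
    return text


out_name = text_output_path("screenshot.png").name
print(out_name)
```

Calling `ocr_image("screenshot.png")` requires pytesseract, Pillow, and the Tesseract binary to be installed.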

PDF (each page → text file)

python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input scan.pdf
# → scan_ocr/page_001.txt, page_002.txt, ... + scan_ocr/all.txt (concatenated)
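
The output layout in the comment above can be sketched as a small path helper. The name `pdf_output_paths` is hypothetical, and the `<stem>_ocr` directory and three-digit page padding are inferred from the example file names rather than confirmed from the script.

```python
from __future__ import annotations

from pathlib import Path


def pdf_output_paths(pdf_path: str, n_pages: int) -> tuple[list[Path], Path]:
    """Return per-page text paths plus the concatenated file (assumed layout)."""
    pdf = Path(pdf_path)
    out_dir = pdf.with_name(f"{pdf.stem}_ocr")  # scan.pdf -> scan_ocr/
    pages = [out_dir / f"page_{i:03d}.txt" for i in range(1, n_pages + 1)]
    return pages, out_dir / "all.txt"


pages, combined = pdf_output_paths("scan.pdf", 2)
print([p.as_posix() for p in pages], combined.as_posix())
```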

Video (every Nth second, OCR each frame)

python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input lecture.mp4 --every 5
# → lecture_ocr/frame_00-00-00.txt, ...
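
To make the sampling concrete, here is a minimal sketch of which frame files a given `--every` value would produce. The helper `frame_filenames` is hypothetical, and reading the name pattern as hours-minutes-seconds is an assumption based on the single example `frame_00-00-00.txt`.

```python
from __future__ import annotations


def frame_filenames(duration_s: int, every: int = 2) -> list[str]:
    """List assumed output names for frames sampled every `every` seconds."""
    if every <= 0:
        raise ValueError("every must be a positive number of seconds")
    names = []
    for t in range(0, duration_s, every):
        hours, rem = divmod(t, 3600)
        minutes, seconds = divmod(rem, 60)
        names.append(f"frame_{hours:02d}-{minutes:02d}-{seconds:02d}.txt")
    return names


names = frame_filenames(12, every=5)
print(names)
```

A larger `--every` value means fewer frames and a faster run, at the cost of missing text that is on screen only briefly.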

Folder of images

python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input ./scans/
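
Folder mode presumably collects the image files in the directory and processes them one by one. The sketch below shows one way to do that collection. The extension set and the name `find_images` are assumptions, and the real script may accept other formats or recurse into subfolders.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; the real script's list is not documented.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp", ".webp"}


def find_images(folder):
    """Return image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )


# Demo on a throwaway directory: two images and one non-image file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.PNG", "a.jpg", "notes.txt"):
        (Path(tmp) / name).touch()
    found = [p.name for p in find_images(tmp)]
print(found)
```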

Word-level JSON (with bounding boxes)

python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input page.png --json
# → page.ocr.json with [{text, conf, x, y, w, h}, ...]
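
The word-level JSON is useful for filtering out low-confidence noise before using the text. The sketch below assumes the schema shown above (`text`, `conf`, `x`, `y`, `w`, `h`) and assumes `conf` is on Tesseract's usual 0 to 100 scale. The function name and the threshold of 60 are illustrative choices, not part of the skill.

```python
import json


def confident_text(words, min_conf=60.0):
    """Join words at or above `min_conf`, keeping the order found in the file."""
    kept = [
        w["text"] for w in words
        if float(w["conf"]) >= min_conf and w["text"].strip()
    ]
    return " ".join(kept)


# Inline sample in the documented shape; a real run would use
# json.load(open("page.ocr.json")) instead.
sample = json.loads("""[
  {"text": "Hello", "conf": 96, "x": 10, "y": 12, "w": 50, "h": 14},
  {"text": "~",     "conf": 12, "x": 64, "y": 15, "w": 6,  "h": 4},
  {"text": "world", "conf": 91, "x": 74, "y": 12, "w": 52, "h": 14}
]""")
result = confident_text(sample)
print(result)
```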

Other languages

python3 ~/.claude/skills/screenshot-text/scripts/ocr.py --input doc.png --lang spa+eng
# install language packs: brew install tesseract-lang
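
A `--lang` value such as `spa+eng` only works if every listed language pack is installed. As a sketch, the check below splits the spec on `+` and reports missing codes. The helper name is hypothetical, and the `installed` set is passed in by hand here; in practice it could be taken from the output of `tesseract --list-langs`.

```python
def missing_languages(lang_spec, installed):
    """Return language codes from a 'spa+eng'-style spec that are not installed."""
    return [code for code in lang_spec.split("+") if code and code not in installed]


missing = missing_languages("spa+eng", {"eng", "osd"})
print(missing)
```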

Flags

--input — image, PDF, video, or folder
--lang — tesseract language code (default eng)
--every — for video: seconds between sampled frames (default 2)
--json — also output word-level JSON with confidence + bounding boxes
--no-deskew — skip auto-rotation of skewed images
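
The flag list above maps naturally onto an argparse definition. The following is a sketch of how such a parser might be declared, not the actual ocr.py source. Treating `--every` as an integer and making `--input` required are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="ocr.py")
    parser.add_argument("--input", required=True,
                        help="image, PDF, video, or folder")
    parser.add_argument("--lang", default="eng",
                        help="tesseract language code, e.g. spa+eng")
    parser.add_argument("--every", type=int, default=2,
                        help="for video: seconds between sampled frames")
    parser.add_argument("--json", action="store_true",
                        help="also output word-level JSON")
    parser.add_argument("--no-deskew", action="store_true",
                        help="skip auto-rotation of skewed images")
    return parser


args = build_parser().parse_args(["--input", "page.png", "--json"])
print(args)
```

Note that argparse exposes `--no-deskew` as `args.no_deskew`.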

First-run install

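On first run the script auto-installs the Python packages pytesseract, pdf2image, and pillow. The Tesseract binary itself must be installed separately with brew install tesseract (the script will prompt if it is missing). Note that pdf2image relies on Poppler to render PDF pages, which is also a separate install (brew install poppler).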
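
A check for the missing binary can be done with the standard library alone. This is a minimal sketch of such a check, assuming the binary is expected on PATH under the name `tesseract`. How the real script detects and prompts is not documented here.

```python
import shutil


def tesseract_available(binary="tesseract"):
    """Return True if the named executable can be found on PATH."""
    return shutil.which(binary) is not None


if not tesseract_available():
    print("Tesseract not found; install it with: brew install tesseract")
```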

Pairs well with

pinterest-gif-scraper — OCR captions in pinned references
clip-search — find frames where specific text appears in a video

BADGE

![downloads](https://dem0n.vip/s/bat/screenshot-text/badge.svg)

VERSIONS

0.1.0 — 3.6 KB — 0d79279949a6
