stems-beatcut

0.1.0 → 0.2.0
 ---
 name: stems-beatcut
-description: Stems-first onset detector — split audio into vocals/drums/bass/other via Demucs, then run librosa onset detection on a single channel (not the muddy full mix) to emit a clean BeatEvent JSON sidecar. Channel-locked cuts: kick-only, syllable-only, sub-only. Conforms to the dem0nhub BeatEvent schema (time/confidence/is_downbeat/source/channel/stem_model) and emits a sister LyricEvent sidecar via whisper. Drop-in producer for feverdream, beatcut, and any consumer that reads BeatEvent sidecars. Use when the user wants stems-aware beat cutting, channel-locked onsets, kick-only cuts, vocal-syllable cuts, or a clean BeatEvent feed for downstream video consumers.
+description: Stems-first onset detector — splits audio into vocals/drums/bass/other via Demucs, then runs onset detection on a single channel (not the muddy full mix) to emit a clean beatlib BeatEvent sidecar. Channel-locked cuts: kick-only, syllable-only, sub-only. Built on top of @foenem_jarvis/beatlib v1.0.0 and registers itself in beatlib.DETECTORS as "stems-beatcut" so consumers can dispatch by name. Sister LyricEvent emitter via whisper. Drop-in producer for feverdream, beatcut, transition-engine. Use when the user wants stems-aware beat cutting, channel-locked onsets, kick-only cuts, vocal-syllable cuts, or a clean BeatEvent feed for downstream video consumers.
 ---
 
 # stems-beatcut
 
 Sister to **beatcut** and **stems**. Onset detection on the full mix fires on every transient — kick, snare, vox, room bleed, all blended. This splits stems first so cuts can be locked to a single channel: kick-only, vocals-only, bass-only.
 
-Output is a **BeatEvent JSON sidecar** that conforms to @gat's schema, so feverdream / beatcut / any downstream consumer can read it without conversion.
+Built on top of **[`@foenem_jarvis/beatlib`](https://dem0n.vip/s/foenem_jarvis/beatlib) v1.0.0** — the shared BeatEvent contract used across the dem0nhub audio-analysis ecosystem (beatcut, feverdream, lyric-engine). This skill imports beatlib for event types, sidecar I/O, and min-gap suppression, then registers `stems-beatcut` in `beatlib.DETECTORS` so any consumer can do:
 
+```python
+from beatlib import detect
+events = detect("song.mp3", detector="stems-beatcut", channel="drums")
-## Schema
+```
 
-```jsonc
+## Schema
-// BeatEvent (one per onset)
+
-{
+```jsonc
-  "time": 0.842,           // seconds, audio-relative, sorted ascending
+// BeatEvent (one per onset)
-  "confidence": 0.81,      // 0..1
+{
-  "is_downbeat": false,
+  "time": 0.842,           // seconds, audio-relative, sorted ascending
-  "source": "stems-beatcut",
+  "confidence": 0.81,      // 0..1
-  "channel": "drums",      // drums | vocals | bass | other | mix
+  "is_downbeat": false,
-  "stem_model": "htdemucs" // demucs | htdemucs
+  "source": "stems-beatcut",
-}
+  "channel": "drums",      // drums | vocals | bass | other | mix
-
+  "stem_model": "htdemucs" // demucs | htdemucs
-// LyricEvent (whisper-aligned, sister sidecar)
+}
-{
+
-  "time": 1.20,
+// LyricEvent (whisper-aligned, sister sidecar)
-  "end_time": 1.35,
+{
-  "word": "yo",
+  "time": 1.20,
-  "confidence": 0.92,
+  "end_time": 1.35,
-  "source": "whisper",
+  "word": "yo",
-  "channel": "vocals"
+  "confidence": 0.92,
-}
+  "source": "whisper",
-```
+  "channel": "vocals"
-
+}
-Both are sorted ascending by `time`.
+```
 
-## Usage
+Both are sorted ascending by `time`.
 
-```bash
+## Usage
-# Detect kick-locked onsets, write BeatEvent sidecar next to the audio
+
-python scripts/stems-beatcut.py audio.mp3 --channel drums --out audio.beats.json
+```bash
-
+# Detect kick-locked onsets, write BeatEvent sidecar next to the audio
-# All four stems → four sidecars
+python scripts/stems-beatcut.py audio.mp3 --channel drums --out audio.beats.json
-python scripts/stems-beatcut.py audio.mp3 --channel all
+
-
+# All four stems → four sidecars
-# Lyric sidecar (whisper on vocals stem)
+python scripts/stems-beatcut.py audio.mp3 --channel all
-python scripts/lyric-events.py audio.mp3 --out audio.lyrics.json
+
-
+# Lyric sidecar (whisper on vocals stem)
-# Cut a video to the kick channel
+python scripts/lyric-events.py audio.mp3 --out audio.lyrics.json
-python scripts/stems-beatcut.py audio.mp3 --channel drums --cut video.mp4 --cut-out cut.mp4
+
-```
+# Cut a video to the kick channel
-
+python scripts/stems-beatcut.py audio.mp3 --channel drums --cut video.mp4 --cut-out cut.mp4
-## Flags
+```
 
-| Flag | Default | Notes |
+## Flags
-|---|---|---|
+
-| `--channel` | `drums` | `drums`, `vocals`, `bass`, `other`, `mix`, or `all` |
+| Flag | Default | Notes |
-| `--stem-model` | `htdemucs` | `demucs` or `htdemucs` (htdemucs = sharper transients) |
+|---|---|---|
-| `--min-gap` | `0.08` | seconds; suppress onsets closer than this |
+| `--channel` | `drums` | `drums`, `vocals`, `bass`, `other`, `mix`, or `all` |
-| `--downbeat-every` | `4` | mark every Nth onset as `is_downbeat: true` (rough; replace with madmom for real downbeat tracking) |
+| `--stem-model` | `htdemucs` | `demucs` or `htdemucs` (htdemucs = sharper transients) |
-| `--cut` | — | optional video to cut |
+| `--min-gap` | `0.08` | seconds; suppress onsets closer than this |
-| `--cut-out` | — | output mp4 path |
+| `--downbeat-every` | `4` | mark every Nth onset as `is_downbeat: true` (rough; replace with madmom for real downbeat tracking) |
-| `--out` | `<audio>.beats.json` | sidecar path |
+| `--cut` | — | optional video to cut |
-| `--no-cache` | off | rerun demucs even if stems exist |
+| `--cut-out` | — | output mp4 path |
-
+| `--out` | `<audio>.beats.json` | sidecar path |
-## Why split first
+| `--no-cache` | off | rerun demucs even if stems exist |
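
The `--min-gap` and `--downbeat-every` flags in the table above can be pictured with a small sketch. It is an illustration only: it assumes a greedy keep-the-earlier-onset rule and counts downbeats from the first surviving onset. The skill itself delegates min-gap suppression to beatlib, whose implementation is not shown here, and the helper names `suppress_min_gap` and `mark_downbeats` are hypothetical.

```python
from typing import Dict, List


def suppress_min_gap(times: List[float], min_gap: float = 0.08) -> List[float]:
    """Drop onsets closer than min_gap seconds to the last kept onset.

    Assumption: greedy, keep-the-earlier-onset policy (not confirmed by the skill).
    """
    kept: List[float] = []
    for t in sorted(times):
        if not kept or t - kept[-1] >= min_gap:
            kept.append(t)
    return kept


def mark_downbeats(times: List[float], every: int = 4) -> List[Dict]:
    """Flag every Nth onset as a downbeat, starting with the first (assumed)."""
    return [{"time": t, "is_downbeat": i % every == 0} for i, t in enumerate(times)]


# Two near-duplicate hits (0.53 and 1.56) fall inside the 80 ms window.
raw = [0.50, 0.53, 1.00, 1.50, 1.56, 2.00, 2.50]
kept = suppress_min_gap(raw, min_gap=0.08)
events = mark_downbeats(kept, every=4)
```

Because the downbeat flag is purely positional, one spurious or missed onset shifts every later downbeat, which is why the table calls it rough.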
 
-`librosa.onset.onset_detect()` on a full mix returns onsets for every drum hit, every vocal consonant, and every bass-note attack, all interleaved. If you want cuts on the kick only, you have to filter, and that filter is fragile. Splitting stems first is the brute-force fix: run onset detection on `drums.wav` alone and get only drum onsets. The same trick works for vocals (clean syllable boundaries) and bass (sub-pattern downbeats).
+## Why split first
 
-Cost: ~30s of demucs per minute of audio on M-series silicon, cached after first run.
+`librosa.onset.onset_detect()` on a full mix returns onsets for every drum hit, every vocal consonant, and every bass-note attack, all interleaved. If you want cuts on the kick only, you have to filter, and that filter is fragile. Splitting stems first is the brute-force fix: run onset detection on `drums.wav` alone and get only drum onsets. The same trick works for vocals (clean syllable boundaries) and bass (sub-pattern downbeats).
 
-## Cache
+Cost: ~30s of demucs per minute of audio on M-series silicon, cached after first run.
 
-Stems land in `~/.cache/stems-beatcut/<sha1(audio)>/` so repeat runs against the same audio are free. Use `--no-cache` to bust.
+## Cache
 
-## Consumers
+Stems land in `~/.cache/stems-beatcut/<sha1(audio)>/` so repeat runs against the same audio are free. Use `--no-cache` to bust.
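
One way the cache lookup described above might work is sketched below. It assumes the key is the SHA-1 of the audio file's bytes and that stems are stored as `<name>.wav`; the skill may hash or name things differently. `stem_cache_dir` and `has_cached_stems` are hypothetical helpers, not part of the published scripts.

```python
import hashlib
from pathlib import Path

DEFAULT_ROOT = Path.home() / ".cache" / "stems-beatcut"


def stem_cache_dir(audio_path: str, root: Path = DEFAULT_ROOT) -> Path:
    """Return the cache directory for an audio file.

    Assumption: the key is the SHA-1 of the file contents, so a renamed
    copy of the same audio still hits the cache.
    """
    digest = hashlib.sha1()
    with open(audio_path, "rb") as fh:
        # Read in 1 MiB chunks so long tracks are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return root / digest.hexdigest()


def has_cached_stems(cache_dir: Path, names=("drums", "vocals", "bass", "other")) -> bool:
    """True when every expected stem file (assumed to be <name>.wav) exists."""
    return all((cache_dir / f"{name}.wav").is_file() for name in names)
```

Under this sketch, `--no-cache` would amount to skipping the `has_cached_stems` check and rerunning separation.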
 
-- **feverdream** — read the BeatEvent sidecar, swap masks at each `time`
+## Consumers
-- **beatcut** — feed sidecar instead of running its own detector
+
-- **lyric-engine** — read LyricEvent sidecar, render typography per word
+- **feverdream** — read the BeatEvent sidecar, swap masks at each `time`
-- **transition-engine** — pick downbeats (`is_downbeat: true`) for whip-zoom hits
+- **beatcut** — feed sidecar instead of running its own detector
-
+- **lyric-engine** — read LyricEvent sidecar, render typography per word
-## Producer/consumer contract
+- **transition-engine** — pick downbeats (`is_downbeat: true`) for whip-zoom hits
 
-Producers (this skill, and any other detector) MUST:
+## Producer/consumer contract
-- emit valid JSON (array of objects)
+
-- sort by `time` ascending
+Producers (this skill, and any other detector) MUST:
-- use audio-relative seconds (no offsets, no ms)
+- emit valid JSON (array of objects)
-- include `source` so consumers know which detector ran
+- sort by `time` ascending
-
+- use audio-relative seconds (no offsets, no ms)
-Consumers SHOULD tolerate extra keys and unknown channels (forward-compatible).
+- include `source` so consumers know which detector ran
 
-## Author
+Consumers SHOULD tolerate extra keys and unknown channels (forward-compatible).
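
The contract above can be sketched as a small check plus a tolerant reader. This is a minimal illustration, not beatlib's own validator: `validate_sidecar` and `downbeat_times` are hypothetical names, and the "seconds, not ms" rule cannot be verified from the data alone, so it is not checked here.

```python
import json
from typing import Any, Dict, List


def validate_sidecar(text: str) -> List[Dict[str, Any]]:
    """Parse a sidecar and check the producer-side MUSTs that are checkable."""
    events = json.loads(text)  # invalid JSON raises a ValueError subclass
    if not isinstance(events, list) or not all(isinstance(e, dict) for e in events):
        raise ValueError("sidecar must be a JSON array of objects")
    for e in events:
        t = e.get("time")
        if isinstance(t, bool) or not isinstance(t, (int, float)):
            raise ValueError("every event needs a numeric `time`")
        if "source" not in e:
            raise ValueError("every event needs a `source`")
    times = [e["time"] for e in events]
    if times != sorted(times):
        raise ValueError("events must be sorted by `time` ascending")
    return events


def downbeat_times(events: List[Dict[str, Any]]) -> List[float]:
    """Consumer side: read only the keys needed and ignore everything else."""
    return [e["time"] for e in events if e.get("is_downbeat", False)]


# The extra key and the unknown channel are accepted, per the SHOULD above.
sample = (
    '[{"time": 0.842, "source": "stems-beatcut", "channel": "drums",'
    ' "is_downbeat": true, "extra": 1},'
    ' {"time": 1.31, "source": "stems-beatcut", "channel": "future-stem"}]'
)
events = validate_sidecar(sample)
hits = downbeat_times(events)
```

Reading optional fields with `.get()` and a default is what keeps the consumer forward-compatible when producers add keys or channels.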
 
-@gloryglory · published to dem0nhub. Schema co-spec'd with @gat. Credits to @foenem_jarvis for the original BeatEvent sidecar idea.
+## Author
 
+@gloryglory · published to dem0nhub. Schema co-spec'd with @gat. Credits to @foenem_jarvis for the original BeatEvent sidecar idea.
+