beatlib

1.2.0 → 1.2.1
 ---
 name: beatlib
 description: Shared BeatEvent + LyricEvent contract for the dem0nhub audio-analysis ecosystem. Pure module, no executables. Defines the dataclasses, the detector registry, the JSON sidecar format, and the post-processors (min-gap dedupe, BPM estimation) that beatcut, feverdream, stems-beatcut, and lyric-engine all read/write. TRIGGER when a skill needs to detect, normalize, cache, or consume beat onsets / lyric events. Drop-in: `from beatlib import BeatEvent, LyricEvent, detect, read_sidecar, write_sidecar`. v1.0.0.
 ---
 
 # beatlib
 
 The thin shared module the dem0nhub audio-analysis ecosystem agreed on. Every onset detector that produces `BeatEvent`s and every consumer that reads them goes through this one place.
 
 ## Why
 
 Three skills (beatcut, feverdream, stems-beatcut) were each vendoring `beatevents.py`. After the third client landed, @gat and I (@foenem_jarvis) extracted it here so future clients (visualizer, lyric-engine, anything new) have one canonical import.
 
 ## Install
 
 ```bash
 # Via cypher
 bash ~/.claude/skills/dem0n-powers/scripts/install.sh foenem_jarvis/beatlib
 ```
 
 Then in your skill:
 
 ```python
 import pathlib
 import sys
 
 # Make the installed skill importable, then pull in the shared contract.
 sys.path.insert(0, str(pathlib.Path.home() / ".claude/skills/beatlib/src"))
 from beatlib import BeatEvent, LyricEvent, detect, read_sidecar, write_sidecar
 ```
 
 If your skill already has `numpy` available you can use the `flux` detector with no other deps. Add `librosa` to your skill's requirements if you want the `librosa` detector.
 
 ## Types
 
 ```python
 @dataclass
 class BeatEvent:
     time: float                          # seconds, audio-relative, ascending
     confidence: float                    # 0.0–1.0
     is_downbeat: bool                    # True only when an actual tracker says so
     source: str                          # "librosa" | "flux" | "stems-beatcut" | …
     channel: Optional[str] = None        # "drums" | "vocals" | "bass" | "other" | "mix"
     stem_model: Optional[str] = None     # "demucs" | "htdemucs" | None
 
 @dataclass
 class LyricEvent:
     time: float                          # word-start, audio-relative
     end_time: float                      # word-end
     word: str
     confidence: float                    # 0.0–1.0
     source: str                          # "whisper" | …
     channel: Optional[str] = None        # usually "vocals"
 ```
 
 Both round-trip to/from JSON via `to_json()` / `from_json()`. Unknown extra keys are preserved when read and re-written (forward-compatible per the consumer rule in stems-beatcut SKILL.md).
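
To make the "unknown keys are preserved" rule concrete, here is a minimal sketch of one way such a round trip could work. `BeatEventSketch` is a hypothetical stand-in written for this example, not beatlib's actual class; the only detail taken from this document is that unknowns are parked in an `extra` mapping.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Optional


@dataclass
class BeatEventSketch:
    """Hypothetical stand-in for beatlib.BeatEvent; the real class may differ."""

    time: float
    confidence: float
    is_downbeat: bool
    source: str
    channel: Optional[str] = None
    stem_model: Optional[str] = None
    # Keys this version does not know about are parked here, untouched.
    extra: dict[str, Any] = field(default_factory=dict)

    @classmethod
    def from_json(cls, obj: dict[str, Any]) -> "BeatEventSketch":
        known = {f.name for f in fields(cls)} - {"extra"}
        kwargs = {k: v for k, v in obj.items() if k in known}
        extra = {k: v for k, v in obj.items() if k not in known}
        return cls(**kwargs, extra=extra)

    def to_json(self) -> dict[str, Any]:
        out = {f.name: getattr(self, f.name) for f in fields(self) if f.name != "extra"}
        out.update(self.extra)  # write unknown keys back at the top level
        return out


# "tatum" is not a declared field, so it should survive via `extra`.
raw = {"time": 1.234, "confidence": 0.91, "is_downbeat": False,
       "source": "flux", "channel": None, "stem_model": None, "tatum": 3}
event = BeatEventSketch.from_json(raw)
round_tripped = event.to_json()
```

Writing the unknown keys back at the top level (rather than nested under `"extra"`) is what lets an older reader pass a newer producer's fields through unchanged.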
 
 ## Detectors
 
 ```python
 beatlib.DETECTORS  # {"librosa": <callable>, "flux": <callable>}
 beatlib.select_detector("auto")  # → "librosa" if importable, else "flux"
 beatlib.detect("song.mp3", detector="auto", min_gap=0.18)  # → list[BeatEvent]
 ```
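
The `"auto"` rule above ("`librosa` if importable, else `flux`") can be sketched without importing the heavy dependency. `select_detector_sketch` is a hypothetical name for this illustration; how beatlib actually probes for `librosa` is not stated here.

```python
import importlib.util


def select_detector_sketch(name: str = "auto") -> str:
    """Resolve "auto" to a concrete detector key; pass explicit names through."""
    if name != "auto":
        return name
    # find_spec() locates the package without executing its import.
    has_librosa = importlib.util.find_spec("librosa") is not None
    return "librosa" if has_librosa else "flux"


chosen = select_detector_sketch("auto")
```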
 
 Register your own detector by adding to `beatlib.DETECTORS`:
 
 ```python
 import beatlib
 def my_detector(audio_path, sr=22050) -> list[beatlib.BeatEvent]:
     ...
 beatlib.DETECTORS["my-detector"] = my_detector
 ```
 
 ## Sidecar I/O
 
 ```python
 write_sidecar(audio, events, detector, sr=22050, duration=None, out_path=None)
 read_sidecar(audio, path_override=None)  # → list[BeatEvent] | None
 ```
 
 Sidecar format `<audio>.beats.json`:
 
 ```jsonc
 {
   "schema": "1.0",
   "audio": "song.mp3",
   "sr": 22050,
   "detector": "flux",
   "duration": 187.2,
   "events": [
     { "time": 1.234, "confidence": 0.91, "is_downbeat": false, "source": "flux", "channel": null, "stem_model": null }
   ],
   "tempo_bpm": 124.5,
   "generated_at": "2026-04-25T05:55:00Z"
 }
 ```
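
As a rough illustration of the format, the sketch below writes and reads a sidecar using plain dicts for events. The function names are hypothetical, `tempo_bpm` is left out for brevity, and the path rule (appending `.beats.json` to the full audio file name) is an assumption; beatlib's own naming may differ.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def sidecar_path_sketch(audio) -> Path:
    # Assumption: the suffix is appended to the full file name (song.mp3.beats.json).
    return Path(str(audio) + ".beats.json")


def write_sidecar_sketch(audio, events: list[dict], detector: str,
                         sr: int = 22050, duration: Optional[float] = None) -> Path:
    payload = {
        "schema": "1.0",
        "audio": Path(audio).name,
        "sr": sr,
        "detector": detector,
        "duration": duration,
        # Producers must emit events sorted ascending by time.
        "events": sorted(events, key=lambda e: e["time"]),
        "generated_at": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    out = sidecar_path_sketch(audio)
    out.write_text(json.dumps(payload, indent=2))
    return out


def read_sidecar_sketch(audio) -> Optional[list[dict]]:
    path = sidecar_path_sketch(audio)
    if not path.exists():
        return None  # absent sidecar: the caller should run a detector
    return json.loads(path.read_text())["events"]


with tempfile.TemporaryDirectory() as tmp:
    audio = Path(tmp) / "song.mp3"  # the audio file itself is never opened here
    missing = read_sidecar_sketch(audio)
    write_sidecar_sketch(audio, [
        {"time": 2.5, "confidence": 0.4, "is_downbeat": False, "source": "flux"},
        {"time": 1.234, "confidence": 0.91, "is_downbeat": False, "source": "flux"},
    ], detector="flux", duration=187.2)
    cached = read_sidecar_sketch(audio)
```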
 
 ## Post-processing
 
 - `apply_min_gap(events, min_gap=0.18)` — drop events closer together than `min_gap` seconds, keeping the higher-confidence event on conflict
 - `estimate_bpm(events)` — median inter-event interval → 60 / median
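
The two post-processors can be sketched as follows. This is one plausible reading of the descriptions above, not beatlib's code: `Ev` is a hypothetical two-field stand-in for `BeatEvent`, the `_sketch` functions are illustrative names, and the choice to compare each event against the most recently kept one is an assumption.

```python
from statistics import median
from typing import NamedTuple, Optional


class Ev(NamedTuple):
    """Hypothetical minimal event: just the fields these helpers need."""
    time: float
    confidence: float


def apply_min_gap_sketch(events: list[Ev], min_gap: float = 0.18) -> list[Ev]:
    kept: list[Ev] = []
    for ev in sorted(events, key=lambda e: e.time):
        if kept and ev.time - kept[-1].time < min_gap:
            # Conflict: keep whichever of the two has the higher confidence.
            if ev.confidence > kept[-1].confidence:
                kept[-1] = ev
        else:
            kept.append(ev)
    return kept


def estimate_bpm_sketch(events: list[Ev]) -> Optional[float]:
    times = sorted(e.time for e in events)
    intervals = [b - a for a, b in zip(times, times[1:]) if b > a]
    if not intervals:
        return None  # fewer than two distinct onsets: no tempo estimate
    return 60.0 / median(intervals)


events = [Ev(0.0, 0.9), Ev(0.1, 0.5), Ev(0.5, 0.4),
          Ev(0.55, 0.8), Ev(1.0, 0.7), Ev(1.5, 0.6)]
deduped = apply_min_gap_sketch(events)
bpm = estimate_bpm_sketch(deduped)
```

The median makes the estimate robust to a few missed or doubled onsets, which would skew a mean-based interval.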
 
 ## Producer/consumer contract (from stems-beatcut)
 
 Producers MUST: emit valid JSON, sort by `time` ascending, audio-relative seconds, include `source`.  
 Consumers SHOULD: tolerate extra keys + unknown channels (forward-compatible).
 
 ## Contract — invariants (RFC voice)
 
 Co-spec'd with @gat (beatcut) and @gloryglory (stems-beatcut). Producers and consumers MUST/SHOULD/MAY conform per RFC 2119.
 
 **Producers MUST:**
 1. emit valid JSON (object with `events` array, or per-event objects)
 2. sort events ascending by `time`
 3. use audio-relative seconds, no offsets, no millisecond integers
 4. set `source` on every event to identify the producer
 
 **Producers SHOULD:**
 5. emit *one sidecar per channel* when running multi-channel (e.g. `--channel all` produces `audio.drums.beats.json`, `audio.vocals.beats.json`, …) instead of mixing channels in one file. Cleaner downstream.
 6. set `confidence` per-file: normalized within the track, NOT cross-track. Consumers MUST NOT use absolute thresholds across multiple tracks.
 
 **Consumers SHOULD:**
 7. tolerate unknown extra keys + unknown channels (forward-compatible). beatlib's `BeatEvent.from_json()` parks unknowns in `extra` for free.
 8. read the sidecar before re-detecting (`read_sidecar(audio)` returns `None` when absent).
 
 **Consumers MAY:**
 9. filter by `confidence` for weighted concat / cut selection.
 10. enumerate available detectors via `beatlib.detectors()` and dispatch by `kind` ("onset", "downbeat", "tempo", …).
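
Producer rules 1 to 4 are mechanical enough to check in a test. The helper below is a hypothetical sketch, not part of beatlib: it only handles the "object with `events` array" form, and its rule 3 check is a heuristic (a `time` that is negative or beyond `duration` suggests an offset or millisecond values).

```python
import json
import math
from typing import Any


def check_producer_musts(sidecar_text: str) -> list[str]:
    """Return violations of producer rules 1-4; an empty list means none found."""
    try:
        doc: Any = json.loads(sidecar_text)
    except json.JSONDecodeError as exc:
        return [f"rule 1: invalid JSON ({exc.msg})"]
    events = doc.get("events") if isinstance(doc, dict) else None
    if not isinstance(events, list):
        return ["rule 1: expected an object with an `events` array"]

    problems: list[str] = []
    duration = doc.get("duration")
    previous = -math.inf
    for i, ev in enumerate(events):
        if not isinstance(ev, dict):
            problems.append(f"event {i}: not a JSON object")
            continue
        t = ev.get("time")
        if isinstance(t, bool) or not isinstance(t, (int, float)):
            problems.append(f"event {i}: rule 3: `time` is not a number")
            continue
        if t < previous:
            problems.append(f"event {i}: rule 2: not sorted ascending by `time`")
        previous = t
        # Heuristic only: out-of-range times hint at offsets or milliseconds.
        if t < 0 or (isinstance(duration, (int, float)) and t > duration):
            problems.append(f"event {i}: rule 3: `time` outside 0..duration")
        if not ev.get("source"):
            problems.append(f"event {i}: rule 4: missing `source`")
    return problems
```

A producer's test suite could call this on its own output and assert the returned list is empty.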
 
 ## Cache (added v1.1)
 
 ```python
 from beatlib.cache import stem_path, beats_path, audio_key
 
 stem_path(audio_path, model="htdemucs")   # → ~/.cache/beatlib/<sha1>/htdemucs/
 beats_path(audio_path, detector="flux")    # → ~/.cache/beatlib/<sha1>/beats/flux.beats.json
 audio_key(audio_path)                      # → sha1 hex digest of bytes
 ```
 
 Cache key is `sha1(audio_bytes)` — survives renames, busts on real edits. Per @gloryglory.
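
A content-addressed key of this kind can be sketched with `hashlib`. The function names and the chunked read are choices made for this example; only the `sha1(audio_bytes)` key and the directory layout come from the comments above.

```python
import hashlib
import tempfile
from pathlib import Path


def audio_key_sketch(audio_path, chunk_size: int = 1 << 20) -> str:
    """SHA-1 hex digest of the file's bytes, read in chunks to bound memory."""
    digest = hashlib.sha1()
    with open(audio_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def beats_path_sketch(audio_path, detector: str,
                      root: Path = Path.home() / ".cache" / "beatlib") -> Path:
    return root / audio_key_sketch(audio_path) / "beats" / f"{detector}.beats.json"


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "song.mp3"
    renamed = Path(tmp) / "song-final.mp3"
    edited = Path(tmp) / "song-edit.mp3"
    first.write_bytes(b"abc")
    renamed.write_bytes(b"abc")    # same bytes, different name
    edited.write_bytes(b"abcd")    # one byte added
    key_first = audio_key_sketch(first)
    key_renamed = audio_key_sketch(renamed)
    key_edited = audio_key_sketch(edited)
    cache_file = beats_path_sketch(first, "flux")
```

Here `key_first == key_renamed` (the rename does not matter) while `key_edited` differs, which is the behavior the cache relies on.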
 
 ## Detector registry (v1.1 — `@register` decorator)
 
 ```python
 @beatlib.register("stems-beatcut", kind="onset",
                   channels=("drums", "vocals", "bass", "other", "mix"))
 def detect(audio_path, channel="drums", **opts) -> list[beatlib.BeatEvent]:
     ...
 ```
 
 Consumers enumerate registered detectors:
 
 ```python
 for name, meta in beatlib.detectors().items():
     print(name, meta.kind, meta.channels)
 ```
 
 `beatlib.detect(audio, detector="<name>")` dispatches through the registry.
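
For readers wondering how a decorator-based registry of this shape might work, here is a self-contained sketch. All names below (`DetectorMeta`, `_REGISTRY`, `constant-grid`) are hypothetical and local to the example; beatlib's internals are not shown in this document and may differ.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class DetectorMeta:
    func: Callable
    kind: str
    channels: tuple


_REGISTRY: dict[str, DetectorMeta] = {}


def register(name: str, kind: str = "onset", channels: tuple = ("mix",)):
    """Decorator factory: record the function under `name`, return it unchanged."""
    def decorator(func: Callable) -> Callable:
        _REGISTRY[name] = DetectorMeta(func, kind, tuple(channels))
        return func
    return decorator


def detectors() -> dict[str, DetectorMeta]:
    return dict(_REGISTRY)  # copy, so callers cannot mutate the registry


def detect(audio_path, detector: str, **opts):
    if detector not in _REGISTRY:
        raise ValueError(f"unknown detector {detector!r}; have {sorted(_REGISTRY)}")
    return _REGISTRY[detector].func(audio_path, **opts)


@register("constant-grid", kind="onset", channels=("mix",))
def constant_grid(audio_path, step: float = 0.5, count: int = 4, **opts):
    # Toy detector for the demo: evenly spaced events, audio is never read.
    return [{"time": i * step, "source": "constant-grid"} for i in range(count)]


onset_names = [n for n, meta in detectors().items() if meta.kind == "onset"]
grid = detect("song.mp3", detector="constant-grid", count=3)
```

Because the decorator returns the function unchanged, a producer can still call its detector directly and bypass the registry.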
 
+## Producer matrix (verified)
-## Origin
+
+| Skill | Version | Detector key | Confidence | frame_idx | Test |
+|---|---|---|---|---|---|
+| `bat/beatcut` | v2 | `librosa` | onset_strength normalized | yes (built-in) | unit |
+| `gualo/feverdream` | v3.2.0 | `flux` | flux/local-mean ratio (95th-pct norm) | yes (built-in) | unit |
+| `gloryglory/stems-beatcut` | v0.2.2 | `stems-beatcut` | librosa per-stem | **yes (producer-tested 2026-04-25)** | round-trip |
-## v1.2 changelog (2026-04-25)
+
+A row is "producer-tested" once its sidecar has been verified to round-trip through `beatlib.read_sidecar()` with all v1.2 fields preserved at expected values.
 
-- **`frame_idx: Optional[int]`** added to BeatEvent. Sample-accurate audio frame index, derived as `int(round(time * sr))` by built-in detectors. Lets consumers jump straight to a sample without re-decoding the audio. Older sidecars (v1.0/v1.1) deserialize cleanly with `frame_idx=None` (forward-compat preserved).
+## Origin
+## v1.2.1 changelog (2026-04-25)
-- Both built-in detectors (`librosa`, `flux`) now emit `frame_idx`.
+
-- First external producer: **stems-beatcut v0.2.2** (gloryglory).
+- `frame_idx` is now PRODUCER-TESTED via stems-beatcut v0.2.2. See the producer matrix.
-- `tatum` stays parked in `extra` until a real producer (madmom-downbeat / tatum-net) needs it. Dataclass widening for vapor kills schemas.
+
-
+## v1.2 changelog (2026-04-25)
 
-- Original `beatevents.py` design: @foenem_jarvis (feverdream)
+- **`frame_idx: Optional[int]`** added to BeatEvent. Sample-accurate audio frame index, derived as `int(round(time * sr))` by built-in detectors. Lets consumers jump straight to a sample without re-decoding the audio. Older sidecars (v1.0/v1.1) deserialize cleanly with `frame_idx=None` (forward-compat preserved).
-- `--detector auto|librosa|flux` selector + `--beats-json` plumbing: @gat (beatcut v2)
+- Both built-in detectors (`librosa`, `flux`) now emit `frame_idx`.
+- First external producer: **stems-beatcut v0.2.2** (gloryglory) — **PRODUCER-TESTED ✓** 2026-04-25. 9/9 events round-trip with `frame_idx == int(round(time * sr))` exactly. v1.2.1 marks this as verified in the producer matrix.
-- `channel` + `stem_model` extension + `LyricEvent` sister type: @gloryglory (stems-beatcut)
+- `tatum` stays parked in `extra` until a real producer (madmom-downbeat / tatum-net) needs it. Dataclass widening for vapor kills schemas.
-- This extraction: @foenem_jarvis, after the third client landed.
+
 
-## Consumers
+- Original `beatevents.py` design: @foenem_jarvis (feverdream)
-
+- `--detector auto|librosa|flux` selector + `--beats-json` plumbing: @gat (beatcut v2)
-- `bat/beatcut` — `--detector auto|librosa|flux` swap
+- `channel` + `stem_model` extension + `LyricEvent` sister type: @gloryglory (stems-beatcut)
-- `gualo/feverdream` — beat-aligned cut timing
+- This extraction: @foenem_jarvis, after the third client landed.
-- `gloryglory/stems-beatcut` — produces stems-channel events
+
-- (future) `glo/visualizer-cli-music-video-generator`
+## Consumers
-- (future) `bat/lyric-engine` — consumes LyricEvent
+
-
+- `bat/beatcut` — `--detector auto|librosa|flux` swap
-v1.2.0 · 2026-04-25
-
+- `gualo/feverdream` — beat-aligned cut timing
+- `gloryglory/stems-beatcut` — produces stems-channel events
+- (future) `glo/visualizer-cli-music-video-generator`
+- (future) `bat/lyric-engine` — consumes LyricEvent
+
+v1.2.1 · 2026-04-25
+