---
name: album-transitions
description: Build cinematic album-level transitions between rap/hip-hop tracks using stem separation, time-stretching, key-matched pads, and generative SFX/voice layers. Composes 8-16 bar transitions where track A's melody stem keeps playing under track B's filter-opening intro, with key-shift via rubberband, ElevenLabs SFX (cop sirens, riser sweeps, phone rings), and optional TTS voice interludes (voicemail or DJ drops). Takes a JSON manifest of tracks (audio paths, BPMs, keys) + transition types per boundary; outputs a single continuous MP3/M4A with chapter markers. Supports transition types DRONE_BRIDGE, SMOOTH, REVERSE_FILTER, RISER, PITCH_DROP, HARD_CUT. Use when the user says "make album transitions", "cinematic transitions", "crossfade the album", "DJ mix my tracks", "tape stop between songs", "make transitions like Kanye", "stem-based transitions", or drops an album of tracks and wants them continuous.
---


# album-transitions

Build cinematic, stem-aware transitions between tracks for continuous album playback. Inspired by MBDTF / Donda / Vice City radio / Don Toliver albums where songs bleed into each other.

## When to invoke

User says any of:
- "make transitions between these tracks" / "crossfade my album"
- "stem-based transitions" / "make it cinematic"
- "tape stop from track A to track B"
- "DJ mix these songs together"
- "continuous album mix"
- "Kanye-level transitions"
- Drops multiple tracks + a manifest asking for a continuous mix

## What it does

For each boundary between two tracks, composes a transition block from:
1. **Stems** (Demucs-separated drums/bass/vocals/other) — keep A's melody playing under B's intro
2. **Time-stretch** via Rubberband — align A's tempo to B's for clean overlap
3. **Key-matched pad** — synthesized sine chord in track B's key, BPM-bar-synced
4. **Creative SFX** — generated via ElevenLabs sound-generation API (sirens, crashes, phones, risers)
5. **Voice interludes** (optional) — TTS via ElevenLabs with your cloned voices
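
The key-matched pad in step 3 can be sketched roughly as follows. This is a minimal illustration, not the skill's actual synthesis code: the function names `pad_frequencies` and `pad_samples`, the choice of a plain triad, the octave, and the 4/4 assumption are all hypothetical.

```python
import math

# Semitone offsets of note names from C (sharps and flats both accepted).
NOTE_OFFSETS = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11,
}
# Assumption: the pad is a simple root-position triad.
TRIADS = {"major": (0, 4, 7), "minor": (0, 3, 7)}


def pad_frequencies(key, octave=3):
    """Return triad frequencies in Hz for a key string like 'C# major'."""
    root_name, mode = key.split()
    root_midi = 12 * (octave + 1) + NOTE_OFFSETS[root_name]  # C4 is MIDI note 60
    return [440.0 * 2 ** ((root_midi + i - 69) / 12) for i in TRIADS[mode]]


def pad_samples(key, bpm, bars, sample_rate=44100, beats_per_bar=4):
    """Sum of sines lasting exactly `bars` bars at `bpm`, scaled into [-1, 1]."""
    n = round(bars * beats_per_bar * 60.0 / bpm * sample_rate)
    freqs = pad_frequencies(key)
    return [
        sum(math.sin(2 * math.pi * f * t / sample_rate) for f in freqs) / len(freqs)
        for t in range(n)
    ]
```

Tying the sample count to `bars` and `bpm` is what keeps the pad bar-synced: it starts and ends on bar lines of track B.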

## Quick start

### One-shot render
```bash
python3 ~/.claude/skills/album-transitions/scripts/render_album.py \
  --manifest /path/to/album.json \
  --stems-dir /path/to/demucs_output \
  --out /path/to/FULL_ALBUM.mp3
```

### Manifest format (`album.json`)
```json
{
  "title": "ALBUM v1",
  "brand_kicker": "OPERATION LOOP",
  "tracks": [
    {"num": 1, "name": "01 INTRO", "file": "src/01.mp3", "bpm": 92, "key": "D# minor"},
    {"num": 2, "name": "02 TRACK TWO", "file": "src/02.mp3", "bpm": 136, "key": "C# major"}
  ],
  "transitions": [
    {"from": 1, "to": 2, "type": "DRONE_BRIDGE",
     "voice_prompt": "[whispered] you ready for this?",
     "voice_character": "voicemail",
     "sfx_prompt": "cinematic riser building into crash"}
  ]
}
```
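
Before rendering, it can help to sanity-check the manifest. The sketch below is one possible validator, not part of the skill's scripts; `validate_manifest` is a hypothetical helper, and treating all five track fields as required is an assumption based on the example above.

```python
import json

# Transition types listed in this document.
KNOWN_TYPES = {"DRONE_BRIDGE", "SMOOTH", "REVERSE_FILTER", "RISER", "PITCH_DROP", "HARD_CUT"}
# Assumption: every track entry needs these fields.
REQUIRED_TRACK_FIELDS = ("num", "name", "file", "bpm", "key")


def validate_manifest(manifest):
    """Return a list of problems; an empty list means the manifest looks usable."""
    problems = []
    tracks = manifest.get("tracks", [])
    nums = [t.get("num") for t in tracks]
    if len(set(nums)) != len(nums):
        problems.append("duplicate track numbers")
    for t in tracks:
        for field in REQUIRED_TRACK_FIELDS:
            if field not in t:
                problems.append(f"track {t.get('num', '?')}: missing '{field}'")
        bpm = t.get("bpm")
        if bpm is not None and not (isinstance(bpm, (int, float)) and bpm > 0):
            problems.append(f"track {t.get('num', '?')}: bpm must be a positive number")
    for tr in manifest.get("transitions", []):
        label = f"transition {tr.get('from')}->{tr.get('to')}"
        if tr.get("from") not in nums or tr.get("to") not in nums:
            problems.append(f"{label}: references an unknown track")
        if tr.get("type") not in KNOWN_TYPES:
            problems.append(f"{label}: unknown type {tr.get('type')!r}")
    return problems


# Usage (path is illustrative):
# with open("/path/to/album.json", encoding="utf-8") as fh:
#     print(validate_manifest(json.load(fh)))
```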

### First-time setup
```bash
bash ~/.claude/skills/album-transitions/scripts/setup.sh
```

Installs Demucs (stem separation), pyrubberband plus the Rubberband binary (time-stretching), and Playwright (for the PDF tracklist). It also writes your ElevenLabs API key to `~/.elevenlabs/key`.

## Transition types

Each boundary gets exactly one type. Bar allocations are defined in `config/transition_types.json`:

| type             | bars_A | bars_B | overlap | feel                                              |
|------------------|--------|--------|---------|---------------------------------------------------|
| `DRONE_BRIDGE`   | 8      | 12     | 8       | Full overlap, pad bridges, A's melody bleeds      |
| `SMOOTH`         | 4      | 8      | 4       | Extended crossfade, beat-matched                  |
| `REVERSE_FILTER` | 4      | 6      | 3       | Reverse tail + filter sweep, B sneaks in          |
| `RISER`          | 4      | 6      | 0       | Chirp sweep landing on B's downbeat, cold entry   |
| `PITCH_DROP`     | 8      | 8      | 0       | Tape-stop slowdown (asetrate), dramatic drop      |
| `HARD_CUT`       | 2      | 6      | 0       | Stop + room tone silence + cold fresh entry       |
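
To see what these bar counts mean in wall-clock time, the sketch below converts them to seconds. It assumes 4/4 time and that all three columns are measured at track B's tempo (since A is stretched to match B); neither assumption is confirmed by the config file, and the function names are hypothetical.

```python
# Bar allocations copied from the table above.
TRANSITION_TYPES = {
    "DRONE_BRIDGE":   {"bars_A": 8, "bars_B": 12, "overlap": 8},
    "SMOOTH":         {"bars_A": 4, "bars_B": 8,  "overlap": 4},
    "REVERSE_FILTER": {"bars_A": 4, "bars_B": 6,  "overlap": 3},
    "RISER":          {"bars_A": 4, "bars_B": 6,  "overlap": 0},
    "PITCH_DROP":     {"bars_A": 8, "bars_B": 8,  "overlap": 0},
    "HARD_CUT":       {"bars_A": 2, "bars_B": 6,  "overlap": 0},
}


def bars_to_seconds(bars, bpm, beats_per_bar=4):
    """Duration of `bars` bars at `bpm`, assuming `beats_per_bar` beats per bar."""
    return bars * beats_per_bar * 60.0 / bpm


def transition_seconds(kind, bpm_b):
    """Seconds for the A tail, B intro, and overlap, all measured at B's tempo."""
    spec = TRANSITION_TYPES[kind]
    return {name: bars_to_seconds(bars, bpm_b) for name, bars in spec.items()}
```

For example, an 8-bar overlap at 136 BPM lasts 8 × 4 × 60 / 136 ≈ 14.1 seconds.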

## Optional: stem separation via Demucs

If Demucs is installed (via `setup.sh` or manually), the builder uses stems so that:
- A's `drums` are dropped during overlap (no rhythm clash with B's drums)
- A's `other` (melody/synth) keeps playing, pitch-shifted toward B's key
- B's stems stage-in: `other` first, `bass` next, `drums` at 50%, `vocals` last

Without stems, the pipeline falls back to the time-stretched full mix, which still works but is less surgical.
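
The tempo alignment and key shift applied to A's material can be derived from the manifest's `bpm` and `key` fields. The sketch below shows one way to compute them; `stretch_rate` and `semitone_shift` are hypothetical helpers, and matching only the key roots (ignoring major/minor mode) is a simplifying assumption.

```python
# Semitone offsets of note names from C (sharps and flats both accepted).
NOTE_OFFSETS = {
    "C": 0, "C#": 1, "Db": 1, "D": 2, "D#": 3, "Eb": 3, "E": 4, "F": 5,
    "F#": 6, "Gb": 6, "G": 7, "G#": 8, "Ab": 8, "A": 9, "A#": 10, "Bb": 10, "B": 11,
}


def stretch_rate(bpm_a, bpm_b):
    """Playback-speed multiplier that brings A's tempo to B's (>1 speeds A up)."""
    return bpm_b / bpm_a


def semitone_shift(key_a, key_b):
    """Smallest signed shift (in semitones, -5..+6) moving A's root onto B's root."""
    root_a = NOTE_OFFSETS[key_a.split()[0]]
    root_b = NOTE_OFFSETS[key_b.split()[0]]
    diff = (root_b - root_a) % 12
    return diff - 12 if diff > 6 else diff


# With the example manifest: 92 -> 136 BPM and D# minor -> C# major.
rate = stretch_rate(92, 136)                   # about 1.478
steps = semitone_shift("D# minor", "C# major")  # -2
```

With pyrubberband installed, values like these would typically be passed to `pyrubberband.time_stretch(y, sr, rate)` and `pyrubberband.pitch_shift(y, sr, n_steps)`. A stretch near 1.48x is large, so audible artifacts on A's stem are likely at that ratio.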

## Optional: ElevenLabs SFX + voice

Set `ELEVENLABS_API_KEY` in the environment, or store the key in `~/.elevenlabs/key`. Then include `sfx_prompt` and/or `voice_prompt` in each transition entry of the manifest.

Voice characters supported:
- `voicemail` — unhinged/emotional (uses voice_id from config)
- `dj` — aggressive hype announcer
- Custom voice IDs can be passed directly via `voice_id` field
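
Key lookup can be sketched as below. This is an illustration rather than the skill's own loader: `load_elevenlabs_key` is a hypothetical name, and giving the environment variable precedence over the file is an assumption.

```python
import os
from pathlib import Path


def load_elevenlabs_key(key_file=Path.home() / ".elevenlabs" / "key"):
    """Return the API key from the environment or key file, or None if neither is set."""
    # Assumption: the environment variable wins over the key file.
    key = os.environ.get("ELEVENLABS_API_KEY", "").strip()
    if key:
        return key
    key_file = Path(key_file)
    if key_file.is_file():
        # An empty file counts as "no key".
        return key_file.read_text(encoding="utf-8").strip() or None
    return None
```

Returning `None` instead of raising lets the caller skip SFX and voice generation when no key is configured.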

## Output

- `ALBUM_MIX.mp3` — continuous 256kbps MP3
- `ALBUM_MIX.m4a` — same audio as AAC, with one chapter marker per track (lets you skip between tracks in any chapter-aware player)
- `tracklist.pdf` (optional) — printable tracklist via analyst-pdf skill
- `transitions_log.json` — every transition's generated assets for debugging
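
One common way to attach chapter markers is an ffmpeg metadata file. The sketch below builds one from track start and end times; whether the skill produces its chapters this way is an assumption, and `chapters_metadata` is a hypothetical helper.

```python
def _escape(value):
    """Backslash-escape characters that are special in FFMETADATA files."""
    return "".join("\\" + ch if ch in "=;#\\\n" else ch for ch in value)


def chapters_metadata(chapters):
    """Build FFMETADATA text from (title, start_seconds, end_seconds) tuples."""
    lines = [";FFMETADATA1"]
    for title, start, end in chapters:
        lines += [
            "[CHAPTER]",
            "TIMEBASE=1/1000",              # START/END are in milliseconds
            f"START={round(start * 1000)}",
            f"END={round(end * 1000)}",
            f"title={_escape(title)}",
        ]
    return "\n".join(lines) + "\n"
```

Written to a file such as `chapters.txt`, the result could be muxed with a command along the lines of `ffmpeg -i ALBUM_MIX.mp3 -i chapters.txt -map 0:a -map_metadata 1 -map_chapters 1 -c:a aac ALBUM_MIX.m4a`.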

## Dependencies

- Python 3.9+
- ffmpeg (preinstalled on some systems; otherwise install it, e.g. `brew install ffmpeg` on macOS)
- Demucs 4.x (optional, for stem-based transitions) — GPU/MPS-accelerated
- Rubberband binary + pyrubberband (optional, for high-quality time-stretch)
- ElevenLabs API key (optional, for SFX + voice)

## Fallback behavior

- No Demucs → uses full mix, time-stretched. Still works.
- No Rubberband → falls back to ffmpeg `atempo` (lower quality for tempo changes above 1.5x).
- No ElevenLabs → skips SFX + voice, transitions are pad-only.

The pipeline is designed to gracefully degrade based on what's available.
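
The fallback rules above could be implemented roughly as follows. This is a sketch with hypothetical function names, not the skill's actual detection logic. The `atempo` helper splits large tempo changes into stages because older ffmpeg builds accept only 0.5 to 2.0 per `atempo` instance; chaining is harmless on newer builds.

```python
import importlib.util
import shutil


def detect_capabilities():
    """Report which optional tools appear to be installed."""
    return {
        "ffmpeg": shutil.which("ffmpeg") is not None,
        "demucs": importlib.util.find_spec("demucs") is not None,
        # pyrubberband shells out to the `rubberband` binary, so both are needed.
        "rubberband": shutil.which("rubberband") is not None
        and importlib.util.find_spec("pyrubberband") is not None,
    }


def atempo_chain(rate):
    """Build an ffmpeg filter string for `rate`, keeping each stage within 0.5..2.0."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    stages = []
    while rate > 2.0:
        stages.append(2.0)
        rate /= 2.0
    while rate < 0.5:
        stages.append(0.5)
        rate /= 0.5
    stages.append(rate)
    return ",".join(f"atempo={s:.6f}" for s in stages)
```

A caller could use the Rubberband path when `detect_capabilities()["rubberband"]` is true and otherwise pass `atempo_chain(rate)` to ffmpeg's `-filter:a` option.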
