GIRL BARS — Music Tech | Endodeca — Part 3
This article has actual working FFmpeg commands per stem type, a functional Web Audio API implementation you can drop into an HTML inscription today, the demucs workflow for anyone working from finished masters without session files, and an honest section on where the tooling is still rough and why some developers aren't sharing it.
I keep hearing the word "recursion" in music and blockchain conversations. I hear it at showcases, in Discord servers, on panels. And here is what I've noticed: the people saying it most confidently are never the ones who built anything with it. They're amplifying something they were told by someone who was told by someone else, and somewhere back in that game of telephone there was a developer who actually knew, and that developer has long since stopped talking to panels.
That ends here.
This is what recursion actually is, what it does for audio specifically, and a real workflow for doing it yourself — starting today.
Recursion in Bitcoin Ordinals is not magic. It's not a philosophical concept. It's a technical feature introduced to the Ordinals protocol that does exactly one thing: it lets one inscription reference the content of another inscription.
That's it.
When you inscribe something on Bitcoin, it gets a unique ID — something like a3f7bc...i0. That content is permanently retrievable at a path that looks like /content/a3f7bc...i0. A recursive inscription is simply one that calls that path, the same way a webpage calls an image from a server. Except the "server" is the Bitcoin blockchain, it never goes down, and nobody can delete it.
An HTML file on Bitcoin can load audio from another inscription. A JavaScript file on Bitcoin can combine multiple audio inscriptions and play them in sync. The entire composition lives on-chain — not hosted, not dependent on a company staying solvent — and the individual stems it references also live on-chain independently, permanently, reusably.
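To make that concrete, here is about the smallest recursive inscription imaginable: a single HTML tag pointing at an audio inscription. The ID is a placeholder, not a real one.

```html
<!-- A minimal recursive inscription: one HTML file that plays audio
     stored in a separate inscription. Placeholder ID; swap in your own. -->
<audio controls src="/content/YOUR_AUDIO_INSCRIPTION_ID_HERE"></audio>
```

Everything in the rest of this article is that same idea, scaled up.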
That is what recursion means. Now let's talk about why it matters for sound quality.
Parts 1 and 2 of this series established that compressing a full song to extreme bitrates sounds like music being described by someone who's never heard music. The reason is that a full mix contains everything at once — bass frequencies down at 30Hz, snare transients spiking at 5kHz, vocals sitting in the midrange, cymbals reaching toward 16kHz — and a codec trying to represent all of that at 16 kbps has to make brutal, destructive decisions about what to throw away.
Stems are different. A bass stem only contains content in roughly the 20–200Hz range. A vocal stem lives mostly in the 80Hz–8kHz window. A drum stem is percussive and transient but spectrally manageable in isolation. When you compress a single stem, you're asking the codec to represent a much simpler, narrower frequency picture — and codecs, particularly Opus, are dramatically better at this than they are at representing full mixes at the same bitrate.
Here's what the numbers look like when you allocate compression intelligently by stem:
| Stem | Frequency Range | Optimal Bitrate | Sample Rate |
|---|---|---|---|
| Bass (mono) | 20–200 Hz | 32 kbps | 24 kHz |
| Drums | 40 Hz–16 kHz | 48 kbps | 48 kHz |
| Vocals (mono) | 80 Hz–8 kHz | 48 kbps | 24 kHz |
| Melody / Synth | 100 Hz–16 kHz | 64 kbps | 48 kHz |

(Opus only accepts 8, 12, 16, 24, and 48 kHz input, so those are the rates to target.)
Combined, that's 192 kbps of total information across four files — eleven and a half times the bit budget of a single 16.5 kbps file. Each stem, at its individually appropriate bitrate, will sound respectable. The bass stem at 32 kbps mono sounds far better than bass in a full mix at 32 kbps, because the bass isn't competing with seventeen other frequency ranges for the same budget.
This is the actual argument for recursive stems. Not mysticism. Frequency-specific compression at appropriate bitrates, assembled by a tiny HTML file that weighs almost nothing, pulling the elements together in real time on playback.
I'm going to be straight with you because this matters: the stem approach uses more total storage than stuffing everything into one file. Four stems for a 3-minute song at those bitrates add up to roughly 4.3MB (192 kbps × 180 seconds ÷ 8 bits per byte ≈ 4.3MB) — more than the entire 4MB budget from Parts 1 and 2. Eleven songs with four stems each come out around 47MB total.
So the recursion approach is not a compression hack. It is a quality architecture. You're spending more on-chain space to get dramatically better sound. That's a different trade-off than trying to minimize file size, and it's worth being clear about the distinction.
What recursion does give you that a single file cannot is this: shared stems cost nothing to reuse.
If you produce music with a consistent drum palette — and most producers do — that drum pattern, once inscribed, can be referenced by every composition that uses it without paying to inscribe it again. The same bass loop that runs through tracks 3, 6, and 9? Inscribed once. Referenced three times. You paid for it once and it lives in every song that needs it. The more prolific your catalog, the more efficiently it compounds. You're not storing music anymore. You're building a library that composes itself.
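In code, reuse is nothing fancier than two compositions listing the same address. A sketch with placeholder IDs:

```js
// Two composition inscriptions, inscribed months apart.
// Both point at the same drum stem, which was inscribed (and paid for) once.
const track3Stems = [
  '/content/SHARED_DRUM_LOOP_ID',   // the shared stem
  '/content/TRACK3_BASS_ID',
  '/content/TRACK3_VOCAL_ID'
];

const track9Stems = [
  '/content/SHARED_DRUM_LOOP_ID',   // same address, zero new bytes on chain
  '/content/TRACK9_BASS_ID',
  '/content/TRACK9_SYNTH_ID'
];
```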
Here is how to do this. Not conceptually. Actually.
Step 1: Export Your Stems
From your DAW, export individual stems for each element. Standard practice is at minimum: bass, drums, vocals, everything else. Depending on your arrangement, you might go further — pads separate from leads, for instance. Export at full fidelity first (WAV, 44.1kHz or 48kHz, 24-bit). You'll compress them in the next step.
Step 2: Compress Each Stem Appropriately
Use FFmpeg, which is free, open-source, and runs from the command line. These commands give you the right compression profile per stem type:
```bash
# Bass stem — mono, 24kHz, 32kbps Opus
ffmpeg -i bass.wav -c:a libopus -b:a 32k -ac 1 -ar 24000 bass.opus

# Drums — 48kbps, 48kHz; -ac 1 forces mono, drop it to keep stereo room feel
ffmpeg -i drums.wav -c:a libopus -b:a 48k -ac 1 -ar 48000 drums.opus

# Vocals — mono, 24kHz, 48kbps
ffmpeg -i vocals.wav -c:a libopus -b:a 48k -ac 1 -ar 24000 vocals.opus

# Melody/Synth — stays stereo by default if it matters, 48kHz, 64kbps
ffmpeg -i melody.wav -c:a libopus -b:a 64k -ar 48000 melody.opus
```

Listen to each output file before you proceed. If the bass sounds wrong at 32kbps, bump it to 40kbps. These are starting points, not gospel.
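Before inscribing, verify what you actually produced. ffprobe ships with FFmpeg, and this prints duration and file size per stem:

```bash
# Sanity check: duration (seconds) and size (bytes) of each stem
ffprobe -v error -show_entries format=duration,size \
  -of default=noprint_wrappers=1 bass.opus

# Or just eyeball the file sizes
ls -lh *.opus
```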
Step 3: Inscribe Each Stem Separately
Using the ord client or a service like Gamma.io or Hiro.so, inscribe each stem file as a separate ordinal. You'll receive an inscription ID for each one — something like a3f7bc8d2e1f...i0. Write these down. They are the addresses your composition will call.
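If you're using the ord client directly, the command looks roughly like this. Flag names have shifted between ord releases, so confirm against `ord wallet inscribe --help` on your install; the fee rate here is an example, not a recommendation.

```bash
# Inscribe one stem; fee rate is in sat/vB, adjust to the current mempool
ord wallet inscribe --fee-rate 10 --file bass.opus
```

Repeat for each stem, and save every inscription ID the command returns.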
Step 4: Write the Composition Inscription
This is the piece that pulls everything together. It's an HTML file with embedded JavaScript, and it uses the Web Audio API to fetch each stem, decode the audio, and start playback in precise synchronization. Here is a working structural example:
```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title>Track Title</title>
  <style>
    body { background: #000; display: flex; align-items: center;
           justify-content: center; height: 100vh; }
    button { color: #fff; background: none; border: 1px solid #fff;
             padding: 12px 24px; cursor: pointer; font-size: 16px; }
  </style>
</head>
<body>
  <button onclick="play()">PLAY</button>
  <script>
    // Replace these with your actual inscription IDs
    const stems = [
      '/content/BASS_INSCRIPTION_ID_HERE',
      '/content/DRUMS_INSCRIPTION_ID_HERE',
      '/content/VOCALS_INSCRIPTION_ID_HERE',
      '/content/MELODY_INSCRIPTION_ID_HERE'
    ];

    async function play() {
      const ctx = new AudioContext();
      const buffers = await Promise.all(
        stems.map(url =>
          fetch(url)
            .then(r => r.arrayBuffer())
            .then(ab => ctx.decodeAudioData(ab))
        )
      );

      // Schedule all stems to start at the same moment
      const startTime = ctx.currentTime + 0.1;
      buffers.forEach(buffer => {
        const source = ctx.createBufferSource();
        source.buffer = buffer;
        source.connect(ctx.destination);
        source.start(startTime);
      });
    }
  </script>
</body>
</html>
```

This HTML file is about 2–3KB. Inscribe it last. When someone opens this inscription in an Ordinals-compatible viewer, it fetches each stem from the chain, decodes them in the browser's audio engine, and starts them simultaneously. The listener hears the full composition — bass, drums, vocals, melody — assembled in real time from four separate on-chain files.
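Because each stem arrives as its own decoded buffer, per-stem control comes almost for free. Here is a sketch of the same scheduling loop with a GainNode per stem, the kind of thing a remix inscription could use to duck the drums. The levels array is illustrative:

```js
// Per-stem volume levels, same order as the stems array (illustrative values)
const levels = [1.0, 0.6, 1.0, 0.9];

buffers.forEach((buffer, i) => {
  const source = ctx.createBufferSource();
  const gain = ctx.createGain();
  gain.gain.value = levels[i];   // this stem's volume, 0.0 to 1.0
  source.buffer = buffer;
  source.connect(gain);          // source -> gain -> speakers
  gain.connect(ctx.destination);
  source.start(startTime);
});
```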
Step 5: AI Stem Separation (If You Don't Have Session Files)
If you're working with finished masters and don't have the original sessions, Meta's open-source tool demucs can separate a stereo mix into stems automatically. It runs locally, it's free, and it's genuinely good — not perfect, but good enough for this workflow as a starting point.
```bash
# Install demucs (needs a working Python 3.8+ environment)
pip install demucs

# Separate a finished master into four stems:
# drums.wav, bass.wav, other.wav, vocals.wav
demucs finished_master.wav
```

With recent versions the stems land in separated/htdemucs/finished_master/ as WAV files (the folder is named after the model, so it may differ on your install). From there, go back to Step 2 and compress each stem like any other.
The storage math expands. The creative math gets interesting.
Because every stem is a discrete on-chain asset, you can treat your stems as building blocks across your entire catalog. A drum stem from one song can be pulled into a remix without the original artist having to send you anything — the address is public. A bass line you inscribed in 2024 can appear in a 2027 composition. Collaborative arrangements can be assembled from stems inscribed by multiple artists, each credited at the inscription level.
It also means the conversation about "fitting music onto Bitcoin" has been slightly wrong this whole time. The frame of "how small can I make a full song" is the old frame. The new frame is "what is the smallest meaningful unit of my music, and what can I build from units that already exist on chain?"
That's recursion. Not a buzzword. A different way of thinking about what a music file is.
Because I'm not here to hype, I'll say this too: recursive ordinal audio is not yet standardized. Viewer support varies — some ordinal explorers render the HTML cleanly, others strip JavaScript, others don't load cross-inscription fetches. This is an early, actively developing space. The code above will work in compliant environments and may not render in others.
The tooling is also still rough. There's no drag-and-drop interface for this workflow. You will need to be comfortable with the command line, with FFmpeg, with the ord client or a third-party inscription service, and with enough JavaScript to debug a Web Audio API implementation when something doesn't start in sync. That learning curve is real.
What is also real: some developers know exactly how to do all of this, have done it, and are not publishing tutorials because first-mover advantage is a thing. I understand the strategic logic. I also think music moves faster when more people know how to build with it.
So there it is.

GIRL BARS | Endodeca — Part 3 of 3 in the compression series. Part 1: An entire album. 4MB. I Did the Math So You Don't Have To. Part 2: Devil's Advocate: What If the Songs Are Only 1:20?

