Lessons from Building a Browser Audio Engine
"The Web Audio API is powerful. It's also a minefield of browser quirks, garbage collection pauses, and 'why is there a 200ms delay' mysteries."
The Challenge: Real-Time Audio in a Browser
Building a DAW in the browser is… ambitious. Browsers aren't designed for:
- Sub-10ms latency
- Deterministic audio scheduling
- Low-level hardware access
- Complex audio graphs with 100+ nodes
But here we are.
The Revelation: Tone.js + Web Audio API
Tone.js is a high-level wrapper around the Web Audio API. It handles:
- Musical timing (transport, scheduling)
- Synthesis and sampling
- Effects and routing
- MIDI integration
But it's not magic. You still need to understand what's happening under the hood.
Architecture: The Audio Graph
Bolt's audio graph looks like this:
┌──────────────────────────────────────────────────────────────┐
│                       Tone.Destination                       │
│  (Master output → Web Audio API → AudioContext.destination)  │
└──────────────────────────────────────────────────────────────┘
                               ▲
                               │
┌──────────────────────────────┴───────────────────────────────┐
│                          Master Bus                          │
│   ┌──────────┬────────────┬───────────┬────────┐             │
│   │    EQ    │ Compressor │  Limiter  │  Meter │             │
│   └──────────┴────────────┴───────────┴────────┘             │
└──────────────────────────────────────────────────────────────┘
                               ▲
                               │
            ┌──────────────────┼──────────────────┐
            │                  │                  │
      ┌─────┴─────┐      ┌─────┴─────┐      ┌─────┴─────┐
      │  Track 1  │      │  Track 2  │      │  Track N  │
      │ ┌───────┐ │      │ ┌───────┐ │      │ ┌───────┐ │
      │ │ Synth │ │      │ │Sampler│ │      │ │  Mic  │ │
      │ └───┬───┘ │      │ └───┬───┘ │      │ └───┬───┘ │
      │ ┌───┴───┐ │      │ ┌───┴───┐ │      │ ┌───┴───┐ │
      │ │  FX   │ │      │ │  FX   │ │      │ │  FX   │ │
      │ └───────┘ │      │ └───────┘ │      │ └───────┘ │
      └─────┬─────┘      └─────┬─────┘      └─────┬─────┘
            │                  │                  │
      ┌─────┴──────────────────┴──────────────────┴─────┐
      │               Hardware Routing                  │
      │        (ASIO/WDM → Browser → Native Audio)      │
      └─────────────────────────────────────────────────┘
Initialization: The AudioContext Ritual
The Web Audio API requires user interaction to start. Always.
// Bolt's audio initialization
export class AudioEngine {
  context: Tone.Context
  masterBus: Tone.Channel
  masterEQ: Tone.EQ3
  masterCompressor: Tone.Compressor
  isInitialized: boolean = false

  async initialize(): Promise<void> {
    if (this.isInitialized) return

    // Create Tone.js context (wraps Web Audio API)
    this.context = new Tone.Context({
      latencyHint: 'interactive', // 'interactive' | 'playback' | 'balanced'
      sampleRate: 48000 // 44100 | 48000 | 96000
    })
    Tone.setContext(this.context)

    // Resume the context (requires a prior user gesture)
    await Tone.start()

    // Initialize master bus
    this.masterBus = new Tone.Channel({
      volume: 0,
      pan: 0,
      solo: false,
      mute: false
    }).toDestination()

    // Add master effects (compressor -> EQ -> master bus)
    this.masterEQ = new Tone.EQ3({
      low: 0,
      mid: 0,
      high: 0
    }).connect(this.masterBus)

    this.masterCompressor = new Tone.Compressor({
      threshold: -24,
      ratio: 12,
      attack: 0.003,
      release: 0.25
    }).connect(this.masterEQ)

    this.isInitialized = true
  }

  // Must be called after user gesture
  async resume(): Promise<void> {
    if (this.context.state === 'suspended') {
      await this.context.resume()
    }
  }
}
Critical: Always wrap Tone.start() in a user interaction handler.
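A small once-only guard makes this easy to enforce. This is an illustrative sketch, not Bolt's actual code; `once` is a hypothetical helper you'd attach to the first pointerdown or keydown:

```typescript
// Hypothetical helper: wrap an initializer so it runs at most once,
// no matter how many gestures fire.
export function once<T>(fn: () => T): () => T | undefined {
  let fired = false
  return () => {
    if (fired) return undefined
    fired = true
    return fn()
  }
}

// Usage sketch (browser only):
//   const startAudio = once(() => engine.initialize())
//   window.addEventListener('pointerdown', startAudio, { once: true })
```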
Scheduling: The Transport System
DAWs need sample-accurate scheduling. Tone.js provides Transport and Draw:
export class Scheduler {
  // Tone.Transport is a global singleton; keep a typed reference
  private transport: typeof Tone.Transport
  private scheduledEvents: Map<string, number[]> = new Map()

  constructor() {
    this.transport = Tone.Transport
    this.transport.bpm.value = 120
  }

  // Schedule a MIDI note
  scheduleNote(
    trackId: string,
    note: number,
    time: number,
    duration: number,
    velocity: number
  ): void {
    const synth = this.getSynth(trackId)
    // Schedule at a transport time (seconds or 'bars:quarters:sixteenths')
    this.transport.schedule((t) => {
      synth.triggerAttackRelease(
        Tone.Frequency(note, 'midi').toNote(),
        duration,
        t,
        velocity
      )
    }, time)
  }

  // Schedule a clip (multiple notes)
  scheduleClip(clip: Clip, startTime: number): string {
    const eventId = generateId()
    const ids = clip.notes.map(note =>
      this.transport.schedule((t) => {
        const synth = this.getSynth(clip.trackId)
        // `t` already reflects the note's offset; don't add note.time again
        synth.triggerAttackRelease(note.name, note.duration, t, note.velocity)
      }, startTime + note.time)
    )
    this.scheduledEvents.set(eventId, ids)
    return eventId
  }

  // Loop a section
  setLoop(start: number, end: number): void {
    this.transport.loop = true
    this.transport.loopStart = start
    this.transport.loopEnd = end
  }

  // Playback control
  play(): void { this.transport.start() }
  pause(): void { this.transport.pause() }
  stop(): void { this.transport.stop() }

  // Seek to position
  seek(position: number): void {
    this.transport.seconds = position
  }

  // Current position
  getPosition(): number {
    return this.transport.seconds
  }

  // Cleanup
  clear(): void {
    this.transport.cancel()
    this.scheduledEvents.clear()
  }
}
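The transport's musical units come down to simple arithmetic. Two standalone helpers (illustrative only, not Tone.js APIs) show the math behind MIDI-note frequencies and bars:quarters:sixteenths positions, assuming 4/4 time:

```typescript
// MIDI note number -> frequency in Hz (equal temperament, A4 = 440 Hz)
export function midiToFrequency(note: number): number {
  return 440 * Math.pow(2, (note - 69) / 12)
}

// 'bars:quarters:sixteenths' -> seconds, assuming 4/4 time
export function positionToSeconds(position: string, bpm: number): number {
  const [bars, quarters, sixteenths] = position.split(':').map(Number)
  const quarterSeconds = 60 / bpm // duration of one quarter note
  return (bars * 4 + quarters + sixteenths / 4) * quarterSeconds
}
```

At 120 BPM a quarter note lasts 0.5 s, so '1:0:0' (one full bar) lands 2 s in.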
Instruments: Synths and Samplers
Bolt supports multiple instrument types:
// PolySynth for polyphonic instruments
const polySynth = new Tone.PolySynth(Tone.Synth, {
  oscillator: { type: 'triangle' },
  envelope: {
    attack: 0.005,
    decay: 0.1,
    sustain: 0.3,
    release: 1
  }
}).toDestination()

// Sampler for audio files
const sampler = new Tone.Sampler({
  urls: {
    C4: 'samples/piano-c4.wav',
    'D#4': 'samples/piano-ds4.wav',
    'F#4': 'samples/piano-fs4.wav',
    A4: 'samples/piano-a4.wav'
  },
  baseUrl: '/audio/',
  onload: () => console.log('Samples loaded')
}).toDestination()

// Drum machine with separate outputs
const drumKit = {
  kick: new Tone.MembraneSynth().toDestination(),
  snare: new Tone.NoiseSynth({
    noise: { type: 'white' },
    envelope: { attack: 0.001, decay: 0.2, sustain: 0 }
  }).toDestination(),
  hihat: new Tone.MetalSynth({
    envelope: { attack: 0.001, decay: 0.1, release: 0.01 },
    harmonicity: 5.1,
    modulationIndex: 32,
    resonance: 4000,
    octaves: 1.5
  }).toDestination()
}
Effects Chain: Modular Routing
export class EffectChain {
  private input: Tone.Gain
  private output: Tone.Gain
  private effects: Tone.ToneAudioNode[]

  constructor() {
    this.input = new Tone.Gain(1)
    this.output = new Tone.Gain(1)
    this.effects = []
    // Initial connection: input -> output
    this.input.connect(this.output)
  }

  addEffect(effect: Tone.ToneAudioNode, index?: number): void {
    // Tear down the current connections
    this.disconnectChain()
    // Insert effect at position
    if (index !== undefined) {
      this.effects.splice(index, 0, effect)
    } else {
      this.effects.push(effect)
    }
    // Rebuild connections
    this.connectChain()
  }

  removeEffect(effect: Tone.ToneAudioNode): void {
    const index = this.effects.indexOf(effect)
    if (index === -1) return
    this.disconnectChain()
    this.effects.splice(index, 1)
    effect.dispose()
    this.connectChain()
  }

  private connectChain(): void {
    if (this.effects.length === 0) {
      this.input.connect(this.output)
      return
    }
    // Connect: input -> effect1 -> effect2 -> ... -> output
    this.input.connect(this.effects[0])
    for (let i = 0; i < this.effects.length - 1; i++) {
      this.effects[i].connect(this.effects[i + 1])
    }
    this.effects[this.effects.length - 1].connect(this.output)
  }

  private disconnectChain(): void {
    // Disconnect everything; connectChain() re-wires from scratch
    this.input.disconnect()
    this.effects.forEach(effect => effect.disconnect())
  }

  getEntry(): Tone.Gain { return this.input }
  getExit(): Tone.Gain { return this.output }
}
The Hardware Problem: ASIO/WDM
Browsers don't natively support professional audio interfaces. ASIO (Windows) and Core Audio (macOS) offer low latency. The Web Audio API… doesn't.
Solutions:
- Native messaging host - Bridge to ASIO via local native app
- WASAPI - Windows' lower-latency audio API, better than DirectSound
- Accept latency - 20-50ms is usable for production
// Hardware input handling
export async function getHardwareInputs(): Promise<MediaDeviceInfo[]> {
  // Request permission first so device labels are populated
  await navigator.mediaDevices.getUserMedia({ audio: true })
  const devices = await navigator.mediaDevices.enumerateDevices()
  return devices.filter(device => device.kind === 'audioinput')
}

// Create input stream from hardware
export async function createHardwareInput(deviceId: string): Promise<Tone.UserMedia> {
  const userMedia = new Tone.UserMedia()
  await userMedia.open(deviceId)
  // Input latency is compensated at schedule time, not on the node itself
  return userMedia
}

// Monitor input with effects
export function createMonitor(input: Tone.UserMedia): Tone.Channel {
  const monitor = new Tone.Channel({
    volume: -Infinity, // Start muted
    mute: true
  }).toDestination()
  input.connect(monitor)
  return monitor
}
The Garbage Collection Problem
Audio generates a lot of objects. JavaScript's GC can cause audible glitches.
Mitigations:
- Object pooling - Reuse audio buffers
- Pre-allocation - Create nodes upfront, donβt destroy
- Typed arrays - Use Float32Array, not regular arrays
- AudioWorklet - Process audio in separate thread
// Object pool for audio buffers
class AudioBufferPool {
  private pool: AudioBuffer[] = []
  private maxSize: number

  constructor(maxSize: number = 50) {
    this.maxSize = maxSize
  }

  acquire(length: number, channels: number): AudioBuffer {
    // Find suitable buffer or create new
    const existing = this.pool.find(
      b => b.length === length && b.numberOfChannels === channels
    )
    if (existing) {
      this.pool = this.pool.filter(b => b !== existing)
      return existing
    }
    return Tone.context.createBuffer(channels, length, Tone.context.sampleRate)
  }

  release(buffer: AudioBuffer): void {
    if (this.pool.length < this.maxSize) {
      this.pool.push(buffer)
    }
  }
}
// Use the pool when post-processing recordings.
// Note: Tone.Recorder has no onStop callback; stop() resolves with a Blob.
const bufferPool = new AudioBufferPool()

export async function record(duration: number): Promise<Blob> {
  const recorder = new Tone.Recorder()
  recorder.start()
  await new Promise(resolve => setTimeout(resolve, duration * 1000))
  const recording = await recorder.stop()
  // Borrow a buffer for any offline processing, then return it to the pool
  const bufferSize = Math.round(duration * Tone.context.sampleRate)
  const buffer = bufferPool.acquire(bufferSize, 2)
  // Process recording...
  bufferPool.release(buffer)
  return recording
}
The Latency Problem
Web Audio has inherent latency from:
- Buffer size (128-4096 samples)
- Processing time
- Hardware output
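Buffer-size latency is pure arithmetic: a buffer of N samples at sample rate R takes N/R seconds to play out. A small helper (illustrative) puts numbers on the list above:

```typescript
// One audio buffer's worth of latency, in milliseconds
export function bufferLatencyMs(bufferSize: number, sampleRate: number): number {
  return (bufferSize / sampleRate) * 1000
}

// 128 samples @ 48 kHz    -> ~2.7 ms (the Web Audio render quantum)
// 4096 samples @ 44.1 kHz -> ~92.9 ms
```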
Measuring latency:
export async function measureLatency(): Promise<number> {
  // A true measurement needs a round trip: play a click and time its
  // return through a loopback cable or microphone.
  // As an estimate, sum the browser's own latency reports.
  const raw = Tone.getContext().rawContext as AudioContext
  return raw.baseLatency + (raw.outputLatency ?? 0)
}
// Compensate for latency
export function scheduleWithCompensation(
  callback: () => void,
  time: number,
  latency: number
): void {
  Tone.Transport.schedule(() => {
    callback()
  }, time - latency)
}
Performance: AudioWorklet
For heavy processing, use AudioWorklet (runs on audio thread, not main thread):
// worklet/processor.ts
class BoltProcessor extends AudioWorkletProcessor {
  process(
    inputs: Float32Array[][],
    outputs: Float32Array[][],
    parameters: Record<string, Float32Array>
  ): boolean {
    const input = inputs[0]
    const output = outputs[0]
    if (!input || !input[0]) return true
    // Process each channel
    for (let channel = 0; channel < input.length; channel++) {
      const inputChannel = input[channel]
      const outputChannel = output[channel]
      for (let i = 0; i < inputChannel.length; i++) {
        // Example: simple gain
        outputChannel[i] = inputChannel[i] * 0.5
      }
    }
    return true // Keep processor alive
  }
}

registerProcessor('bolt-processor', BoltProcessor)
// main thread
export async function loadWorklet(context: AudioContext): Promise<void> {
  await context.audioWorklet.addModule('/worklet/processor.js')
  const worklet = new AudioWorkletNode(context, 'bolt-processor')
  // Bridge the native node into the Tone.js graph
  // (someSource is whatever node feeds the worklet)
  Tone.connect(someSource, worklet)
  Tone.connect(worklet, Tone.getDestination())
}
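The worklet's inner loop is easiest to verify off the audio thread. Here is a pure version of the per-channel gain pass (same logic as BoltProcessor, no Web Audio dependencies):

```typescript
// Apply a flat gain to one channel of samples, writing into `out`.
// Mirrors the per-sample loop inside BoltProcessor's process().
export function applyGain(input: Float32Array, out: Float32Array, gain: number): Float32Array {
  for (let i = 0; i < input.length; i++) {
    out[i] = input[i] * gain
  }
  return out
}
```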
Pro Tips for Browser Audio
- Start with low buffer size - 256 samples for low latency
- Watch the node count - Too many = CPU spikes
- Use Offline rendering - For exports, not real-time
- Test on target devices - Mobile audio is a different beast
- Profile the audio thread - Chrome DevTools has a dedicated profiler
War Stories
The 200ms Mystery Delay
Spent a week tracking down latency. Turns out Chrome was adding a hidden MediaElementAudioSourceNode with default buffer size. Fixed by explicitly setting latencyHint: 'interactive'.
The Memory Leak
Synths weren't being disposed. Each track create/destroy leaked ~2MB. Fixed by calling .dispose() on all Tone.js objects.
The Mobile Safari Bug
Safari iOS silently fails on AudioContext resume if not triggered by user gesture. Even programmatic clicks donβt count. Must be actual user tap.
Whatβs Next
- WASM DSP - Run C++ audio code in browser
- WebCodecs - Better codec support for imports
- WebTransport - Low-latency network audio streaming
- VST bridging - WebAssembly VST plugins
Cleetus Speaks
"brother b0gie, you made a whole MUSIC STUDIO in the BROWSER??
and it doesn't even LAG??
wait… so i can make beats on my PHONE now??
but does it have a 'make it slap' button??
or a 'add spice' slider??
no?? okay i'll settle for just being able to make music anywhere
#ToneJS #WebAudio #BrowserDAW #Subject734MobileProducer"
The Web Audio API is finicky, quirky, and occasionally maddening. But when it works? You have a professional DAW running in a tab. Worth it.
β b0gie