Technical Documentation

1. Landing Page: Virtual User Mapping

The landing page generates music from three web activity sources, each mapped to a distinct tessitura (frequency range) and oscillator type. Cursor positions use a golden ratio distribution system—X and Y coordinates are calculated independently using φ (phi) and φ² sequences, modulated by metrics, ensuring optimal spacing across the canvas while maintaining data-driven coherence.
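The φ-based placement described above can be sketched as follows. This is a minimal illustration of golden-ratio low-discrepancy sequencing; the function name and the exact way a metric value modulates the coordinates are assumptions, not the actual implementation.

```javascript
// Sketch: golden-ratio cursor placement. X steps by phi, Y by phi squared,
// so the two axes never fall into lockstep; metricValue (0-1) nudges both.
const PHI = (1 + Math.sqrt(5)) / 2;

function cursorPosition(eventIndex, metricValue) {
  const x = ((eventIndex * PHI) + metricValue * 0.1) % 1;        // hypothetical modulation
  const y = ((eventIndex * PHI * PHI) + metricValue * 0.1) % 1;
  return { x, y };
}
```

Because the sequence is deterministic, the same event index and metric value always produce the same canvas position, which is what keeps the mapping "data-driven" rather than random.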

Source Tessitura Frequency Range Oscillator Character
Wikipedia Bass 110-220Hz (A2-A3) Sawtooth (rich harmonics) Deep, warm foundation
HackerNews Tenor 196-392Hz (G3-G4) Sine (pure tone) Pure, mellow mid-range
GitHub Soprano 523-1047Hz (C5-C6) Triangle (hollow, flute-like) Ethereal, airy highs

Metrics Monitored

Source Primary Metric Secondary Metrics
Wikipedia editsPerMinute avgEditSize, newArticles
HackerNews postsPerMinute avgUpvotes, commentCount
GitHub pushesPerMinute createsPerMinute, deletesPerMinute

2. Collaborative Rooms: Gesture Processing

Gesture Types

Gesture Trigger Audio Result Visual Feedback
Tap Click/touch <100ms Single percussive note (10-110ms attack) Pulse emission
Long Tap (Hold) Click/touch held without motion Sustained note (gate opens on hold, closes on release) Growing pulse
Drag Continuous motion Multi-note melodic phrase (50-250ms attack based on Y position) Particle trail

3. Multi-User Composition Engine

Parameter Value Description
Max jammers 4 Each jammer receives a unique timbre/patch
Listen mode Unlimited listeners Passive observers who hear music and see visuals; can promote to jammer when a slot opens
Timbre assignment Sequential (slots 0-7) 8 synth presets + 3 drum kits: Retro Square, Nasal Reed, Warm Chorus, Bell Chime, Soft Square, Wide Pulse, Bright Chorus, Deep Bell; 808 Kit, Acoustic Kit, Electronic Kit
Synth customization Synth Panel Oscillator, filter, envelope, and effects customizable in real-time per user
Color assignment From 7-color pool Visual differentiation between users
Virtual User Activation: When only 1 real user is in a room, 2 virtual users (from Wikipedia and HackerNews metrics) automatically join to provide accompaniment. They deactivate when a second real user joins.

4. Environmental Memory System

Phase Trigger Learning Rate Max Patterns
Initial < 10 gestures 1.5× (fast learning) 2
Learning 10-50 gestures 1.0× (normal) 4
Mature > 200 gestures 0.8× (slow evolution) 8

Pattern Compatibility

Gestures influence existing patterns according to the deterministic compatibility criteria defined below.

Deterministic Pattern Thresholds

Pattern creation and evolution use deterministic thresholds derived from gesture characteristics:

Pattern Creation (mature phase):
creationScore = (intensity × 0.7) + (positionUniqueness × 0.3)
positionUniqueness = |x - 0.5| + |y - 0.5|
Create new pattern when creationScore > 0.85

Pattern Evolution:
Dormant patterns → evolve when gesture.intensity > 0.4
Emerging patterns → evolve when gesture.intensity > 0.3
Pattern Retention: 24 hours. Rooms develop a unique "personality" from accumulated gestures. Pattern creation favors high-intensity gestures at unique positions (far from center), while evolution thresholds ensure moderate-intensity gestures contribute to existing patterns.
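The creation rule above translates directly into code. This is a sketch of the documented formulas only; the function names are illustrative.

```javascript
// Distance from canvas center, per the documented formula.
function positionUniqueness(x, y) {
  return Math.abs(x - 0.5) + Math.abs(y - 0.5);
}

// Create a new pattern (mature phase) when the weighted score exceeds 0.85.
function shouldCreatePattern(intensity, x, y) {
  const creationScore = intensity * 0.7 + positionUniqueness(x, y) * 0.3;
  return creationScore > 0.85;
}
```

A maximum-intensity gesture in a canvas corner scores 1.0 and creates a pattern; a moderate gesture at the center scores 0.35 and instead feeds the evolution thresholds.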

5. Composition System

Webarmonium uses a fully deterministic composition architecture. Every musical decision—from form selection to individual note dynamics—derives directly from input data. Given the same gesture sequence, the system produces identical compositions.

Core Components

Component Role How Input Data Maps to Music
CompositionEngine Musical form selection Energy level → form (low=contemplative, high=energetic)
PhraseMorphology Melodic generation Acceleration → rhythm; Curvature → syncopation; Intensity → phrase length
HarmonicEngine Tonal management Harmonic complexity → progression (simple to complex)
CounterpointEngine Voice leading Note position → velocity contour, duration, gap patterns
AccompanimentEngine Accompaniment layers Genre-specific bass, pad, and keys patterns with voice leading
StyleAnalyzer Pattern recognition Extracts tempo, energy, density from gesture history (P10-P90 percentile normalization)
MaterialLibrary Musical material storage Organizes motifs, themes by function and character

Form Selection by Genre and Energy

Forms within each genre are ordered by energy level. The style's energy value (0.0-1.0) drives form selection: low energy favors contemplative forms (first in the array), high energy favors energetic forms (last in the array).

Genre Forms (Low Energy → High Energy)
Classical theme_and_variations → ABA → rondo → sonata
Electronic through_composed → strophic → verse_chorus → build_drop
Jazz modal → AABA → blues → rhythm_changes
Rock strophic → AABA → verse_chorus → intro_verse_chorus_bridge_outro
Ambient through_composed → strophic → ABA
Melodic ABA → verse_chorus → rondo
Default (pop, rhythmic, experimental) strophic → ABA → verse_chorus → rondo → through_composed
temporalOffset = (compositionCount × φ) mod 1
historyVariation = (historyLength × φ) mod 1
combinedIndex = (energy × 0.4) + (temporalOffset × 0.4) + (historyVariation × 0.2)
selectedForm = genreForms[floor(combinedIndex × formCount)]
Golden Ratio (φ) Stepping: All composition parameters use φ (1.618...) to create low-discrepancy sequences that avoid predictable repetition. Form selection combines φ-based stepping through the forms array with the energy value, which can shift the selection by up to one position. Note-level details (velocity, duration, articulation) derive from the note's position within the phrase using sine-wave contours.
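The selection formula above can be written as a runnable sketch. The clamp on the final index is an assumption (combinedIndex can reach 1.0, which would otherwise index past the array).

```javascript
const PHI = (1 + Math.sqrt(5)) / 2;

// Deterministic form selection from energy, composition count, and history.
function selectForm(genreForms, energy, compositionCount, historyLength) {
  const temporalOffset = (compositionCount * PHI) % 1;
  const historyVariation = (historyLength * PHI) % 1;
  const combined = energy * 0.4 + temporalOffset * 0.4 + historyVariation * 0.2;
  // Clamp so combined === 1.0 still maps to the last form (assumption).
  const idx = Math.min(Math.floor(combined * genreForms.length), genreForms.length - 1);
  return genreForms[idx];
}
```

With the Classical forms array, zero energy on the first composition selects theme_and_variations, while maximum energy shifts the index upward, illustrating how energy biases the φ sequence.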

Form-Driven Section Management

Each form comprises named sections with distinct parameters: thematic role (exposition, development, recapitulation), dynamic contour (crescendo, diminuendo, plateau), rhythmic density (sparse to dense), harmonic tension (0.0-1.0), and harmonic function (tonic, dominant, subdominant, predominant, chromatic). The SectionStateManager tracks section progression, ensuring the CompositionEngine and HarmonicEngine stay synchronized—key changes, BPM shifts, and velocity curves all align with the current section context.

BPM Management

BPM changes are phrase-aligned: a minimum of 4 phrases and 30 seconds must elapse between changes. The nudge range is capped at ±10%, smoothed over 30 transition steps to prevent abrupt tempo shifts. Genre-specific BPM ranges (e.g. ambient 60-90, electronic 115-150, jazz 90-150) constrain the allowed tempo space.

5b. Data-to-Music Derivation Table

Every musical parameter derives from a specific input data source. This table documents the complete mapping from gesture/metric data to musical output.

Macro-Level Derivations

Musical Parameter Input Data Derivation Formula
Form selection style.energy forms[floor(energy × formCount)]
Chord progression harmonicComplexity progressions[floor(complexity × progCount)]
Phrase length velocity + intensity rangeMin + intensity × (rangeMax - rangeMin)
Composition timing energy + compositionCount baseBeats[count % 5] × (1 - energy × 0.3)
Scale selection mood (from gesture) scaleMap[mood].primary or secondary based on curvature
Contour type trajectory.angle + curvature angle → ascending/descending; curvature → arch/wave

Note-Level Derivations

Musical Parameter Input Data Derivation Formula
Rhythm variation acceleration baseDuration × (1 + (acceleration/100) × position × 0.4)
Syncopation curvature Apply rhythmic device when phrasePosition > (1 - curvature)
Velocity contour notePosition baseVel + sin(position × π) × range (arch shape)
Note duration role + noteIndex durationOptions[role][index % optionCount]
Inter-note gaps role + notePosition baseGap + sin(position × 2π) × gapRange
Interval type phrasePosition Edges (<0.2, >0.8) prefer steps; center allows skips/leaps
Articulation role + noteIndex Pattern-based: melody=staccato on 3/4 notes, bass=legato
Ornaments pitchClass + phrasePosition Style-specific placement based on pitch%12 and position

Environmental Memory Derivations

Decision Input Data Derivation Formula
Pattern creation intensity + position Create when (intensity × 0.7) + (positionUniqueness × 0.3) > 0.85
Pattern evolution gesture.intensity Evolve dormant patterns when intensity > 0.4
Position uniqueness coordinates |x - 0.5| + |y - 0.5| (distance from center)
Position-Based Variation: Note-level parameters use sine-wave functions based on position within the phrase (sin(position × π)) to create natural musical contours. This produces arch-like dynamics (louder in middle), voice-leading preference at phrase edges, and organic duration/gap variations—all without random number generation.

5c. Genre-Aware Composition System

The composition system selects from nine genres, each with distinct musical characteristics. Genre selection is probabilistic with starvation prevention: genres that haven't played for 7+ minutes receive a boosted selection probability (up to 3×). A minimum play time of 3 minutes prevents rapid genre switching.

Genre Characteristics

Genre BPM Range Articulation Swing Syncopation Character
Ambient 60-90 legato 0 0.05 Atmospheric, textural
Classical 70-110 legato 0 0.15 Balanced, counterpoint
Melodic 80-120 legato 0 0.25 Singable, clear
Jazz 90-150 portato 0.67 0.70 Swing, complex harmony
Pop 90-130 normal 0 0.30 Accessible, catchy
Rock 100-150 marcato 0 0.40 Backbeat, power
Rhythmic 110-160 staccato 0.15 0.85 Funk, groove
Electronic 115-150 staccato 0 0.60 Driving, 4-on-floor
Experimental 60-180 varied 0.30 0.60 Unpredictable, textural

Genre Orchestration

Every genre maintains three always-active counterpoint voices (melody, harmony, bass_voice) plus genre-specific accompaniment layers. The orchestration determines which accompaniment voices are active and their relative velocity levels.

Genre Accompaniment Layers Sonic Identity
Ambient pad Very low-pass filter (500Hz), slow attack (1.5s), long reverb (10s decay)
Classical pad Warm filter (1400Hz), smooth resonance, concert hall reverb (3s)
Melodic pad, keys Balanced filter (1800Hz), moderate delay and reverb
Jazz pad Warm (2200Hz), light resonance, club ambience reverb (1.8s)
Pop bass_accomp, pad, keys Clear filter (2200Hz), tempo-synced delay, medium reverb
Rock bass_accomp, keys Bright filter (2800Hz), slapback delay, tight room reverb (1s)
Rhythmic bass_accomp, keys Mid-bright filter (2500Hz), tight grooves, dry sound
Electronic bass_accomp, keys Bright filter (4000Hz), high resonance (Q=6), acid-style squelch
Experimental pad, keys, bass_accomp Mid filter (1500Hz), extreme resonance (Q=4), long delay with feedback
PHI-Based Pattern Cycling: Each genre defines multiple rhythm patterns per voice role. Patterns cycle using φ-based indexing across compositions, ensuring maximum variety without repetition: patternIndex = floor((compositionCount × φ) mod patternCount)

6. Dynamic Normalization

Webarmonium avoids hardcoded thresholds for input normalization: all gesture and metric parameters are normalized dynamically based on observed data ranges.

Normalization Method

normalizedValue = (rawValue - observedMin) / (observedMax - observedMin)

Key Principles

Deep Percentile Normalization

Gesture parameters (velocity, acceleration, density, turnAngle, pitchVariance) use adaptive P10-P90 percentile normalization to prevent extreme outliers from skewing the musical output. During a warmup phase (first 10 samples), fallback divisors provide reasonable defaults for each metric.

After warmup (samples ≥ 10):
P10 = sortedSamples[floor(0.1 × sampleCount)]
P90 = sortedSamples[floor(0.9 × sampleCount)]
normalizedValue = (rawValue - P10) / (P90 - P10)

During warmup (samples < 10):
normalizedValue = rawValue / fallbackDivisor

Percentile calculations are cached with a 5-second TTL and invalidated only when new samples are added, ensuring minimal CPU overhead during high-frequency gesture processing.
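The two-phase normalization above can be sketched as one function. The clamp to 0-1 and the guard for a degenerate (P10 = P90) range are assumptions; the 5-second cache is omitted for clarity.

```javascript
// P10-P90 percentile normalization with warmup fallback.
function percentileNormalize(rawValue, samples, fallbackDivisor) {
  if (samples.length < 10) {
    // Warmup: fallback divisor supplies a reasonable default scale.
    return rawValue / fallbackDivisor;
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const p10 = sorted[Math.floor(0.1 * sorted.length)];
  const p90 = sorted[Math.floor(0.9 * sorted.length)];
  if (p90 === p10) return 0.5; // degenerate range guard (assumption)
  const v = (rawValue - p10) / (p90 - p10);
  return Math.min(1, Math.max(0, v)); // clamp to 0-1 (assumption)
}
```

With 20 samples 0-19, P10 is 2 and P90 is 18, so a raw value of 10 normalizes to exactly 0.5, regardless of any outliers that might sit above P90.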

7. Audio Architecture

Unified Audio Chain

All audio sources share a common routing architecture with independent volume control and stereo panning. Each source connects to a master bus with optional effects sends.

Source Synth Type Routing
Local gestures MonoSynth (gestureSynth) synth → pan → volume → master + FX
Remote users Per-user PolySynth (UserSynthManager) synth → pan → volume → master + FX
Background Ambient layers (bass, pad, chords) layer → filter → volume → master + FX

Real User Synth Patches (8 presets)

Slot Name Oscillator Character
0 Retro Square Square Buzzy, 8-bit
1 Nasal Reed Pulse (20% width) Nasal, oboe-like
2 Warm Chorus Fat Sawtooth (3 voices) Lush, detuned
3 Bell Chime FM Sine Bell-like, metallic
4 Soft Square Square Softer, muffled
5 Wide Pulse Pulse (40% width) Wider, hollower
6 Bright Chorus Fat Sawtooth (2 voices) Brighter, wider spread
7 Deep Bell FM Triangle Deeper, more metallic

Drum Kit Presets (3 kits)

Slot Name Instruments Character
8 808 Kit BD, SN, HH, OH Deep sub-bass kick, tight snare, closed hi-hat, open hi-hat with choke
9 Acoustic Kit BD, SN, HH, OH Natural resonant kick, bright snare, shimmering hi-hat
10 Electronic Kit BD, SN, HH, OH Punchy synthetic kick, noisy snare, glitchy hi-hat
Hi-Hat Choke: Closed hi-hat (HH) chokes any active open hi-hat (OH), and OH self-chokes the previous OH hit—per-user tracking ensures correct choke behavior in multi-user rooms.

Synth Panel

In collaborative rooms, each user can customize their synth sound in real-time through a channel-strip panel. Preset selection uses server-side conflict resolution: occupied presets are disabled. All parameter changes sync in real-time via WebSocket to all room participants, and late-joining users receive the current state on connection.

Control Group Parameters Range
Oscillator Context-sensitive: pulse width, fat spread, or FM modulation index Depends on preset oscillator type
Filter Type (LP/HP/BP), cutoff, resonance (Q) Cutoff range depends on filter type: LP 80-20000Hz, HP 20-12000Hz, BP 80-12000Hz (logarithmic). Q range: LP/HP 0.1-20.0, BP 0.1-8.0
Envelope Attack, Decay, Sustain, Release A: 0.001-4.0s, D: 0.001-8.0s, S: 0-1, R: 0.001-10.0s
Output Volume, Pan Volume: -12 to +12dB, Pan: -1 to +1
Effects Delay send, Reverb send 0-1 (dry to full wet)

Drum Mode Controls

When a drum kit preset is selected, the Synth Panel switches to drum-specific controls. Each instrument (bass drum, snare, hi-hat) has independent parameter sliders. Canvas gestures also change in drum mode: tap triggers the snare, and drag produces a snare roll (repeated hits at drag speed).

Instrument Parameters Notes
Bass Drum (BD) Pitch, Decay, Tone Sub-bass fundamental with tone-controlled harmonics
Snare (SN) Pitch, Decay, Tone, Delay Noise + membrane blend; delay adds pre-attack rattle
Hi-Hat (HH) Pitch, Decay, Tone, Delay MetalSynth-based; short decay for closed, longer for open character

A global Volume and Reverb send apply to the entire drum kit. All parameter changes sync in real-time via WebSocket, identical to synth mode.

Color Theming: All Synth Panel controls are themed with the user's assigned color via the CSS custom property --synth-accent, providing instant visual identification of which user's panel is active.

8. Background Composition System

Counterpoint + Accompaniment Architecture

Background music is generated through two complementary systems. A full counterpoint engine maintains three always-active voices (melody, harmony, bass_voice) that follow voice leading rules and produce independent melodic lines. An AccompanimentEngine adds three supporting layers with genre-specific patterns.

AccompanimentEngine Layers

Layer MIDI Range Role Genre Examples
bass_accomp E1-C3 (28-48) Voice-led bass foundation Walking bass (jazz), driving eighths (electronic), power bass (rock), Alberti bass (classical), sustained roots (ambient)
pad C3-C5 (48-72) Sustained harmonic pads Chord voicings with stagger, extensions at high tension (7ths when >0.6, 9ths when >0.8)
keys C3-C6 (48-84) Genre-specific rhythmic patterns Arpeggios (electronic), jazz comping with swing, rock power stabs, classical block chords, sparse ambient pads

Voice Leading & Velocity

All accompaniment layers use voice leading to minimize intervallic jumps between chord changes. Jumps exceeding 5 semitones (STEPWISE_THRESHOLD) incur a penalty, favoring smooth transitions. Dynamic velocity curves shape each section: crescendo, diminuendo, swell, terraced, or stable—selected based on the section's dynamic contour.

velocityCurve[crescendo] = 0.5 + (position × 0.7) [rising through section]
velocityCurve[diminuendo] = 1.2 - (position × 0.7) [fading through section]
velocityCurve[swell] = 0.5 + sin(position × π) × 0.7 [arch shape]
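The three documented curves map directly to a small selector. The function shape and the stable fallback of 1.0 are illustrative assumptions.

```javascript
// Velocity curve selection by dynamic contour; position is 0-1 progress
// through the section.
function velocityCurve(contour, position) {
  switch (contour) {
    case "crescendo":  return 0.5 + position * 0.7;                 // rising
    case "diminuendo": return 1.2 - position * 0.7;                 // fading
    case "swell":      return 0.5 + Math.sin(position * Math.PI) * 0.7; // arch
    default:           return 1.0; // stable / terraced fallback (assumption)
  }
}
```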

Harmonic Progressions

Harmonic progressions are managed by the HarmonicEngine with genre-specific generators. Each genre has 10-12 progressions ordered by complexity, selected based on harmonic tension from the current section context. Progression selection uses φ-based stepping combined with section tension for deterministic variety. Chord extensions (7ths, 9ths) are added dynamically when harmonic tension exceeds threshold values.

φ-Based Variation: The AccompanimentEngine receives the current compositionCount as a parameter, using φ-based calculations to ensure different pattern selections, velocity curves, and articulation choices across successive compositions. This prevents the accompaniment from becoming predictable over extended listening sessions.

9. Drone & Void Detection

The DroneVoidController makes the ambient drone "emerge during activity voids"—filling silence with atmospheric sound while fading out during active musical events.

DroneVoidController Parameters

Parameter Value Purpose
voidTimeoutMin 5000ms Minimum silence before drone emerges
voidTimeoutMax 10000ms Full drone emergence time
fadeInTime 2.0s Quick fade in to fill void
fadeOutTime 20.0s Gradual fade out on activity
droneNominalDb -3dB Full drone level
droneSilentDb -60dB Effectively silent
updateInterval 100ms Check frequency for smooth transitions

Void Score Calculation

Three factors are combined multiplicatively—any single factor can suppress the drone:

voidScore = timeScore × influenceVoidScore × noteVoidScore

timeScore = (timeSinceActivity - 5000) / (10000 - 5000) [clamped 0-1]
influenceVoidScore = 1 - userInfluence
noteVoidScore = activeNotes > 0 ? 0 : 1
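Putting the three factors together, the multiplicative combination might look like this (the clamp helper and function names are assumptions based on the formulas above):

```javascript
function clamp01(v) { return Math.min(1, Math.max(0, v)); }

// Multiplicative void score: any single factor at 0 fully suppresses the drone.
function voidScore(timeSinceActivityMs, userInfluence, activeNotes) {
  const timeScore = clamp01((timeSinceActivityMs - 5000) / (10000 - 5000));
  const influenceVoidScore = 1 - userInfluence;
  const noteVoidScore = activeNotes > 0 ? 0 : 1;
  return timeScore * influenceVoidScore * noteVoidScore;
}
```

After 10 seconds of total silence the score is 1 (full drone); a single active note drops it to 0 immediately, which is what lets the drone "fill voids" without competing with live events.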

LFO Modulation (Organic Breathing)

Two LFOs add subtle organic movement to the drone sound, with randomized phases to prevent synchronized modulation patterns.

LFO Frequency Cycle Range Effect
Amplitude 0.03 Hz ~33s -6dB to 0dB Very slow volume breathing
Pitch 0.05 Hz ~20s ±8 cents Subtle organic detuning
Dual Control: The DroneVoidController uses a dedicated gain node (droneAmplitudeGain) for overall presence control, allowing the amplitude LFO to continue providing subtle breathing modulation even while the drone fades in/out.

9b. Visual Rendering

All visual rendering uses PixiJS 8 (WebGL GPU) on both the landing page and collaborative rooms. The rendering pipeline includes spring-mesh network topology, particle pools, pulse pools, wave packet systems, precomputed attractor fields, and neon nebula layers—all composited at 60fps through PixiJS’s WebGL batching. A PixiAdapter service bridges the application’s visual subsystems to the PixiJS renderer.

10. Musical Scheduler

The MusicalScheduler prioritizes clock consistency over low latency, ensuring perfect synchronization between local and remote gestures. Remote events may wait but will always be scheduled in sync with the local clock.

Clock Synchronization

Parameter Value Purpose
Tick interval 25ms (40Hz) Web Worker precision timer
Local lookahead 100ms Immediate scheduling for local events
Remote lookahead 250ms Schedule ahead for jitter tolerance
Default tempo 120 BPM Base tempo (range: 60-140 BPM)
Time signature 4/4 Standard time signature
Grid resolution 16th note Remote event quantization

Beat Grid Assignment (Landing Page)

Virtual user notes are distributed across the beat grid to create rhythmic variety:

Wikipedia notes: ticks 0, 4, 8, 12 (downbeats)
HackerNews notes: ticks 2, 6, 10, 14 (off-beats)
GitHub notes: ticks 1, 5, 9, 13 (upbeats)
Local vs Remote: Local events have 100ms lookahead for immediacy. Remote events use 250ms lookahead and snap to the nearest sixteenth-note boundary for global synchronization despite network jitter.
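Snapping a remote event to the sixteenth-note grid could be sketched as follows. This works in seconds for illustration; the function name is an assumption, and the actual scheduler presumably works in tick indices on the Web Worker clock.

```javascript
// Quantize an event time to the nearest sixteenth-note boundary at a tempo.
function quantizeToSixteenth(eventTimeSec, bpm) {
  const sixteenth = 60 / bpm / 4; // sixteenth-note duration in seconds
  return Math.round(eventTimeSec / sixteenth) * sixteenth;
}
```

At 120 BPM a sixteenth note lasts 125ms, so a remote event arriving at 0.3s snaps to 0.25s; with 250ms of lookahead, the snap happens before the audio deadline despite network jitter.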

11. Harmonic Engine & Voice Leading

Available Scales

Fourteen scales are available for melodic generation, covering all seven modes, harmonic and melodic minor variants, pentatonic scales, and special scales:

Scale Intervals (semitones) Character
Ionian (Major) 0, 2, 4, 5, 7, 9, 11 Bright, happy
Dorian 0, 2, 3, 5, 7, 9, 10 Jazz, modal
Phrygian 0, 1, 3, 5, 7, 8, 10 Spanish, exotic
Lydian 0, 2, 4, 6, 7, 9, 11 Dreamy, floating
Mixolydian 0, 2, 4, 5, 7, 9, 10 Bluesy, rock
Aeolian (Natural minor) 0, 2, 3, 5, 7, 8, 10 Sad, introspective
Locrian 0, 1, 3, 5, 6, 8, 10 Dark, unstable
Harmonic Minor 0, 2, 3, 5, 7, 8, 11 Dramatic, exotic
Melodic Minor 0, 2, 3, 5, 7, 9, 11 Jazz sophistication
Major Pentatonic 0, 2, 4, 7, 9 Safe, no dissonance
Minor Pentatonic 0, 3, 5, 7, 10 Bluesy, safe
Blues 0, 3, 5, 6, 7, 10 Expressive, bent
Whole Tone 0, 2, 4, 6, 8, 10 Dreamlike, ambiguous
Diminished 0, 2, 3, 5, 6, 8, 9, 11 Symmetrical, tense

SATB Voice Ranges (Background Composition)

Background composition uses traditional SATB voice ranges for proper voice leading:

Voice MIDI Range Notes Character
Soprano 60-79 C4-G5 High density, staccato
Alto 55-72 G3-C5 Medium density
Tenor 48-67 C3-G4 Sparse, long sustain
Bass 36-55 C2-G3 Low density, legato

Voice Leading Rules

The CounterpointEngine validates voice leading to ensure musical coherence:

Consonant Intervals

Allowed intervals: Unison (0), Minor 3rd (3), Major 3rd (4), Perfect 5th (7), Major 6th (9), Octave (12)

Form-Driven Harmonic Modulation

Key changes are gated by the composition's form structure: modulation can only occur at section transitions, not within sections. The SECTION_MODULATION_RULES determine which modulation targets are allowed based on the current section's harmonic function.

Harmonic Function Modulation Allowed Targets
Tonic No (returns to tonic)
Dominant Yes dominant, relative
Subdominant Yes subdominant, relative
Predominant Yes dominant, subdominant
Chromatic Yes mediant, chromatic, tritone

Genre-specific modulation profiles control how frequently and where each genre modulates:

Genre Frequency Preferred Targets Temp. Multiplier
Ambient very low relative 0.3
Rock low relative 0.5
Rhythmic low relative 0.5
Electronic low relative, mediant 0.6
Pop low relative, dominant 0.7
Classical medium dominant, relative 0.8
Melodic medium relative, dominant 0.8
Jazz high tritone, chromatic, dominant 1.3
Experimental high tritone, chromatic, mediant 1.5
Temperature Smoothing: Modulation distance is smoothed using exponential smoothing (alpha=0.15) to prevent sudden jumps in tonal center. Recapitulation sections return to the original tonic with 80% probability, maintaining formal coherence.
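The exponential smoothing mentioned above is a one-line update rule. This is a generic sketch of exponential smoothing with the stated alpha; the function and parameter names are assumptions.

```javascript
// Exponential smoothing of modulation distance: with alpha = 0.15, each
// update moves only 15% of the way toward the target, so the tonal center
// drifts gradually rather than jumping.
function smoothModulationDistance(previous, target, alpha = 0.15) {
  return previous + alpha * (target - previous);
}
```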

12. Audition System

The Audition feature generates virtual gestures for testing synth patches without performing manual gestures. Generation occurs server-side (via the AuditionGestureService), ensuring other room participants hear the audition output. Timing uses φ-based intervals for natural distribution.

Audition Parameters

Parameter Range Description
Source random / metrics Random generation or web metrics-driven gesture parameters
Frequency 0.25-2.0 events/sec Event rate (500ms-4000ms interval, φ-based jitter)
Regularity 0-1 Timing jitter: 0 = varied intervals, 1 = metronomic
Uniformity 0-1 Event similarity: 0 = diverse parameters, 1 = identical events
Gesture Type 0-1 0 = all taps, 1 = all drags, 0.5 = equal mix
Range 3-60 semitones Note range (minor 3rd to 5 octaves)

Frequency Range

Audition frequency range: A2-A5 (110-880Hz)
Canvas gesture range: 24-60 semitones (extended from original 24 to match audition range)
Automatic Pause: Audition pauses automatically when the user performs a real gesture and resumes on release, preventing collision between audition and live performance. The audition button pulses with each generated note for visual feedback. Security: all audition parameters are validated and clamped server-side to prevent abuse.

13. Sequencer System

The Sequencer generates repeating step-based patterns synchronized to the room’s composition tempo. It operates in two modes: melodic (pitch-based steps) and drum (per-instrument grid). Like the Audition feature, generation occurs server-side (via the SequencerGestureService), so all room participants hear the output. In melodic mode, each step emits hold:start / hold:end gestures and feeds raw pitch material into the BackgroundCompositionService.

Melodic Sequencer Parameters

Parameter Range Description
Steps 3–16 Number of steps in the loop (default: 8)
Speed 0.25×–8× BPM multiplier (minimum effective interval: 100ms)
Per-step Pitch 0–27 (slider) 7 scale degrees (I–VII) × 4 octaves (2–5) = 28 positions
Per-step State N / M / R Normal (plays configured pitch), Mute (skips audio, LED still advances), Random (random degree 1–7 and octave 2–5 each cycle)

Step Timing

stepInterval = (60000 / BPM) / speedMultiplier
noteDuration = stepInterval × 0.8
// BPM refreshed from BackgroundCompositionService between steps
// Minimum interval enforced: 100ms (max 10 steps/sec)

Pitch Resolution

rawMidi = tonic + scaleInterval[degree] + (octave × 12)
quantizedMidi = HarmonicEngine.constrainToScale(rawMidi, key, mode)
// Raw pitch → addMaterial() (composition input)
// Quantized pitch → hold:start (playback only)
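The two pseudocode fragments above can be combined into a runnable sketch. The 100ms floor matches the stated limit; the major-scale interval table stands in for HarmonicEngine's scale data and is an assumption, as are the function names.

```javascript
// Step interval with the documented 100ms floor (max 10 steps/sec).
function stepIntervalMs(bpm, speedMultiplier) {
  const interval = (60000 / bpm) / speedMultiplier;
  return Math.max(100, interval);
}

// Illustrative pitch resolution; MAJOR stands in for the engine's scale
// tables (assumption). degree is 0-6, octave counts 12-semitone offsets.
const MAJOR = [0, 2, 4, 5, 7, 9, 11];

function rawMidi(tonic, degree, octave) {
  return tonic + MAJOR[degree] + octave * 12;
}
```

At 120 BPM and 1× speed a step lasts 500ms (note duration 400ms at the 0.8 factor); at 8× speed the raw 62.5ms interval is clamped to the 100ms floor.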

Drum Sequencer Mode

When a drum kit preset is active, the sequencer switches to a four-instrument grid. Instead of pitch sliders, each row represents an instrument (BD, SN, HH, OH) and each step can be toggled on or off. The open hi-hat (OH) follows the same choke rules as live playback: HH chokes active OH, OH self-chokes previous OH.

Row Instrument Step States
1 Bass Drum (BD) On / Off per step
2 Snare (SN) On / Off per step
3 Hi-Hat (HH) On / Off per step
4 Open Hi-Hat (OH) On / Off per step
Sequencer Behaviour: The sequencer auto-pauses when the user performs a real gesture and resumes on release, identical to audition. Audition and Sequencer are mutually exclusive—starting one stops the other (enforced on both frontend and backend). Timer safety follows the closure-capture pattern: isActive = false before clearing timers, and roomId/userId/userColor are captured before async operations for disconnect safety. Rate limiting: frontend throttles parameter changes at 150ms, backend rate-limits config updates at 100ms/update, cursor emission capped at 10Hz.

14. Metric-to-Gesture Classification

Virtual user gestures are classified based on which metric characteristic is dominant. This uses purely relative comparison—no hardcoded thresholds.

Classification Metrics

Metric Formula / Source Gesture Audio Result
Stability 1 - normalizedVelocity TAP Single note (50-300ms)
Density Wiki: avgEditSize; HN: avgUpvotes; GH: createsPerMinute DRAG Phrase (3-8 notes, 300-3000ms)

Gesture Selection

gestureType = stability > density ? TAP : DRAG
// Higher stability produces single notes, higher density produces phrases

Intent Thresholding

Gestures are only generated when normalized velocity exceeds an activity-based threshold:

gestureIntent = 0.1 × (1 - activityLevel × 0.5)
// High activity = lower threshold = more frequent gestures
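The classification and intent rules above reduce to two small functions (names are illustrative; the formulas are taken directly from this section):

```javascript
// Purely relative comparison: whichever characteristic dominates wins.
function classifyGesture(stability, density) {
  return stability > density ? "TAP" : "DRAG";
}

// Activity-scaled intent threshold: high activity lowers the bar,
// producing more frequent gestures.
function gestureIntentThreshold(activityLevel) {
  return 0.1 * (1 - activityLevel * 0.5);
}
```

A stable, low-velocity metric stream yields TAPs (single notes); a dense stream yields DRAGs (phrases). At full activity the threshold halves from 0.1 to 0.05.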

Composition Frequency

Composition timing derives from energy level and composition count, cycling through predetermined phrase lengths for variety:

baseBeats = [8, 12, 16, 10, 14] // Cycles through these lengths
beatIndex = compositionCount % 5
energyModifier = 1 - (energy × 0.3) // 0.7-1.0
beatsPerComposition = baseBeats[beatIndex] × energyModifier

High energy → 5.6-11.2 beats (shorter phrases)
Low energy → 8-16 beats (longer phrases)
Dynamic Normalization: All metric values use P10-P90 percentile-based normalization from a rolling 2-minute sample window, preventing outliers from skewing the classification.