Technical Documentation

1. Landing Page: Virtual User Mapping

The landing page generates music from three web activity sources, each mapped to a distinct tessitura (frequency range) and oscillator type. Cursor positions use a golden ratio distribution system—X and Y coordinates are calculated independently using φ (phi) and φ² sequences, modulated by metrics, ensuring optimal spacing across the canvas while maintaining data-driven coherence.
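The φ-based placement described above can be sketched as follows. This is a minimal illustration of golden-ratio low-discrepancy sequencing; the function name and the exact way a metric value modulates the coordinates are assumptions, not the actual implementation.

```javascript
// Sketch: golden-ratio cursor placement. X steps by phi, Y by phi squared,
// so the two axes never fall into lockstep; metricValue (0-1) nudges both.
const PHI = (1 + Math.sqrt(5)) / 2;

function cursorPosition(eventIndex, metricValue) {
  const x = ((eventIndex * PHI) + metricValue * 0.1) % 1;        // hypothetical modulation
  const y = ((eventIndex * PHI * PHI) + metricValue * 0.1) % 1;
  return { x, y };
}
```

Because the sequence is deterministic, the same event index and metric value always produce the same canvas position, which is what keeps the mapping "data-driven" rather than random.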

Source Tessitura Frequency Range Oscillator Character
Wikipedia Bass 110-220Hz (A2-A3) Sawtooth (rich harmonics) Deep, warm foundation
HackerNews Tenor 196-392Hz (G3-G4) Sine (pure tone) Pure, mellow mid-range
GitHub Soprano 523-1047Hz (C5-C6) Triangle (hollow, flute-like) Ethereal, airy highs

Metrics Monitored

Source Primary Metric Secondary Metrics
Wikipedia editsPerMinute avgEditSize, newArticles
HackerNews postsPerMinute avgUpvotes, commentCount
GitHub pushesPerMinute createsPerMinute, deletesPerMinute

2. Collaborative Rooms: Gesture Processing

Gesture Types

Gesture Trigger Audio Result Visual Feedback
Tap Click/touch <100ms Single percussive note (10-110ms attack) Pulse emission
Long Tap (Hold) Click/touch held without motion Sustained note (gate opens on hold, closes on release) Growing pulse
Drag Continuous motion Multi-note melodic phrase (50-250ms attack based on Y position) Particle trail

3. Multi-User Composition Engine

Parameter Value Description
Max jammers 4 Each jammer receives a unique timbre/patch
Listen mode Unlimited listeners Passive observers who hear music and see visuals; can promote to jammer when a slot opens
Timbre assignment Sequential (slots 0-7) 8 synth presets + 3 drum kits: Retro Square, Nasal Reed, Warm Chorus, Bell Chime, Soft Square, Wide Pulse, Bright Chorus, Deep Bell; 808 Kit, Acoustic Kit, Electronic Kit
Synth customization Synth Panel Oscillator, filter, envelope, and effects customizable in real-time per user
Color assignment From 7-color pool Visual differentiation between users
Virtual User Activation: When only 1 real user is in a room, 2 virtual users (from Wikipedia and HackerNews metrics) automatically join to provide accompaniment. They deactivate when a second real user joins.

4. Environmental Memory System

Phase Trigger Learning Rate Max Patterns
Initial < 10 gestures 1.5× (fast learning) 2
Learning 10-50 gestures 1.0× (normal) 4
Mature > 200 gestures 0.8× (slow evolution) 8

Pattern Compatibility

Gestures influence existing patterns according to the deterministic compatibility criteria defined below.

Deterministic Pattern Thresholds

Pattern creation and evolution use deterministic thresholds derived from gesture characteristics:

Pattern Creation (mature phase):
creationScore = (intensity × 0.7) + (positionUniqueness × 0.3)
positionUniqueness = |x - 0.5| + |y - 0.5|
Create new pattern when creationScore > 0.85

Pattern Evolution:
Dormant patterns → evolve when gesture.intensity > 0.4
Emerging patterns → evolve when gesture.intensity > 0.3
Pattern Retention: 24 hours. Rooms develop a unique "personality" from accumulated gestures. Pattern creation favors high-intensity gestures at unique positions (far from center), while evolution thresholds ensure moderate-intensity gestures contribute to existing patterns.
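The creation rule above translates directly into code. This is a sketch of the documented formulas only; the function names are illustrative.

```javascript
// Distance from canvas center, per the documented formula.
function positionUniqueness(x, y) {
  return Math.abs(x - 0.5) + Math.abs(y - 0.5);
}

// Create a new pattern (mature phase) when the weighted score exceeds 0.85.
function shouldCreatePattern(intensity, x, y) {
  const creationScore = intensity * 0.7 + positionUniqueness(x, y) * 0.3;
  return creationScore > 0.85;
}
```

A maximum-intensity gesture in a canvas corner scores 1.0 and creates a pattern; a moderate gesture at the center scores 0.35 and instead feeds the evolution thresholds.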

5. Composition System

Webarmonium uses a fully deterministic composition architecture. Every musical decision—from form selection to individual note dynamics—derives directly from input data. Given the same gesture sequence, the system produces identical compositions.

Core Components

Component Role How Input Data Maps to Music
CompositionEngine Musical form selection Energy level → form (low=contemplative, high=energetic)
PhraseMorphology Melodic generation Acceleration → rhythm; Curvature → syncopation; Intensity → phrase length
HarmonicEngine Tonal management Harmonic complexity → progression (simple to complex)
CounterpointEngine Voice leading Note position → velocity contour, duration, gap patterns
AccompanimentEngine Accompaniment layers Genre-specific bass, pad, and keys patterns with voice leading
StyleAnalyzer Pattern recognition Extracts tempo, energy, density from gesture history (P10-P90 percentile normalization)
MaterialLibrary Musical material storage Organizes motifs, themes by function and character

Form Selection by Genre and Energy

Forms within each genre are ordered by energy level. The style's energy value (0.0-1.0) drives form selection: low energy favors contemplative forms (first in the array), high energy favors energetic forms (last in the array).

Genre Forms (Low Energy → High Energy)
Classical theme_and_variations → ABA → rondo → sonata
Electronic through_composed → strophic → verse_chorus → build_drop
Jazz modal → AABA → blues → rhythm_changes
Rock strophic → AABA → verse_chorus → intro_verse_chorus_bridge_outro
Ambient through_composed → strophic → ABA
Melodic ABA → verse_chorus → rondo
Default (pop, rhythmic, experimental) strophic → ABA → verse_chorus → rondo → through_composed
temporalOffset = (compositionCount × φ) mod 1
historyVariation = (historyLength × φ) mod 1
combinedIndex = (energy × 0.4) + (temporalOffset × 0.4) + (historyVariation × 0.2)
selectedForm = genreForms[floor(combinedIndex × formCount)]
Golden Ratio (φ) Stepping: All composition parameters use φ (1.618...) to create low-discrepancy sequences that avoid predictable repetition. Form selection combines φ-based stepping through the forms array with the energy value, which can shift the selection by up to one position. Note-level details (velocity, duration, articulation) derive from the note's position within the phrase using sine-wave contours.
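The selection formula above can be written as a runnable sketch. The clamp on the final index is an assumption (combinedIndex can reach 1.0, which would otherwise index past the array).

```javascript
const PHI = (1 + Math.sqrt(5)) / 2;

// Deterministic form selection from energy, composition count, and history.
function selectForm(genreForms, energy, compositionCount, historyLength) {
  const temporalOffset = (compositionCount * PHI) % 1;
  const historyVariation = (historyLength * PHI) % 1;
  const combined = energy * 0.4 + temporalOffset * 0.4 + historyVariation * 0.2;
  // Clamp so combined === 1.0 still maps to the last form (assumption).
  const idx = Math.min(Math.floor(combined * genreForms.length), genreForms.length - 1);
  return genreForms[idx];
}
```

With the Classical forms array, zero energy on the first composition selects theme_and_variations, while maximum energy shifts the index upward, illustrating how energy biases the φ sequence.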

Form-Driven Section Management

Each form comprises named sections with distinct parameters: thematic role (exposition, development, recapitulation), dynamic contour (crescendo, diminuendo, plateau), rhythmic density (sparse to dense), harmonic tension (0.0-1.0), and harmonic function (tonic, dominant, subdominant, predominant, chromatic). The SectionStateManager tracks section progression, ensuring the CompositionEngine and HarmonicEngine stay synchronized—key changes, BPM shifts, and velocity curves all align with the current section context.

BPM Management

BPM changes are phrase-aligned: a minimum of 4 phrases and 30 seconds must elapse between changes. The nudge range is capped at ±10%, smoothed over 30 transition steps to prevent abrupt tempo shifts. Genre-specific BPM ranges (e.g. ambient 60-90, electronic 115-150, jazz 90-150) constrain the allowed tempo space.

5b. Data-to-Music Derivation Table

Every musical parameter derives from a specific input data source. This table documents the complete mapping from gesture/metric data to musical output.

Macro-Level Derivations

Musical Parameter Input Data Derivation Formula
Form selection style.energy forms[floor(energy × formCount)]
Chord progression harmonicComplexity progressions[floor(complexity × progCount)]
Phrase length velocity + intensity rangeMin + intensity × (rangeMax - rangeMin)
Composition timing energy + compositionCount baseBeats[count % 5] × (1 - energy × 0.3)
Scale selection mood (from gesture) scaleMap[mood].primary or secondary based on curvature
Contour type trajectory.angle + curvature angle → ascending/descending; curvature → arch/wave

Note-Level Derivations

Musical Parameter Input Data Derivation Formula
Rhythm variation acceleration baseDuration × (1 + (acceleration/100) × position × 0.4)
Syncopation curvature Apply rhythmic device when phrasePosition > (1 - curvature)
Velocity contour notePosition baseVel + sin(position × π) × range (arch shape)
Note duration role + noteIndex durationOptions[role][index % optionCount]
Inter-note gaps role + notePosition baseGap + sin(position × 2π) × gapRange
Interval type phrasePosition Edges (<0.2, >0.8) prefer steps; center allows skips/leaps
Articulation role + noteIndex Pattern-based: melody=staccato on 3/4 notes, bass=legato
Ornaments pitchClass + phrasePosition Style-specific placement based on pitch%12 and position

Environmental Memory Derivations

Decision Input Data Derivation Formula
Pattern creation intensity + position Create when (intensity × 0.7) + (positionUniqueness × 0.3) > 0.85
Pattern evolution gesture.intensity Evolve dormant patterns when intensity > 0.4
Position uniqueness coordinates |x - 0.5| + |y - 0.5| (distance from center)
Position-Based Variation: Note-level parameters use sine-wave functions based on position within the phrase (sin(position × π)) to create natural musical contours. This produces arch-like dynamics (louder in middle), voice-leading preference at phrase edges, and organic duration/gap variations—all without random number generation.

5c. Genre-Aware Composition System

The composition system selects from nine genres, each with distinct musical characteristics. Genre selection is probabilistic with starvation prevention: genres that haven't played for 7+ minutes receive a boosted selection probability (up to 3×). A minimum play time of 3 minutes prevents rapid genre switching.

Genre Characteristics

Genre BPM Range Articulation Swing Syncopation Character
Ambient 60-90 legato 0 0.05 Atmospheric, textural
Classical 70-110 legato 0 0.15 Balanced, counterpoint
Melodic 80-120 legato 0 0.25 Singable, clear
Jazz 90-150 portato 0.67 0.70 Swing, complex harmony
Pop 90-130 normal 0 0.30 Accessible, catchy
Rock 100-150 marcato 0 0.40 Backbeat, power
Rhythmic 110-160 staccato 0.15 0.85 Funk, groove
Electronic 115-150 staccato 0 0.60 Driving, 4-on-floor
Experimental 60-180 varied 0.30 0.60 Unpredictable, textural

Genre Orchestration

Every genre maintains three always-active counterpoint voices (melody, harmony, bass_voice) plus genre-specific accompaniment layers. The orchestration determines which accompaniment voices are active and their relative velocity levels.

Genre Accompaniment Layers Sonic Identity
Ambient pad Very low-pass filter (500Hz), slow attack (1.5s), long reverb (10s decay)
Classical pad Warm filter (1400Hz), smooth resonance, concert hall reverb (3s)
Melodic pad, keys Balanced filter (1800Hz), moderate delay and reverb
Jazz pad Warm (2200Hz), light resonance, club ambience reverb (1.8s)
Pop bass_accomp, pad, keys Clear filter (2200Hz), tempo-synced delay, medium reverb
Rock bass_accomp, keys Bright filter (2800Hz), slapback delay, tight room reverb (1s)
Rhythmic bass_accomp, keys Mid-bright filter (2500Hz), tight grooves, dry sound
Electronic bass_accomp, keys Bright filter (4000Hz), high resonance (Q=6), acid-style squelch
Experimental pad, keys, bass_accomp Mid filter (1500Hz), extreme resonance (Q=4), long delay with feedback
PHI-Based Pattern Cycling: Each genre defines multiple rhythm patterns per voice role. Patterns cycle using φ-based indexing across compositions, ensuring maximum variety without repetition: patternIndex = floor((compositionCount × φ) mod patternCount)

6. Dynamic Normalization

Webarmonium avoids hardcoded thresholds for input normalization: all gesture and metric parameters are normalized dynamically based on observed data ranges.

Normalization Method

normalizedValue = (rawValue - observedMin) / (observedMax - observedMin)

Key Principles

Deep Percentile Normalization

Gesture parameters (velocity, acceleration, density, turnAngle, pitchVariance) use adaptive P10-P90 percentile normalization to prevent extreme outliers from skewing the musical output. During a warmup phase (first 10 samples), fallback divisors provide reasonable defaults for each metric.

After warmup (samples ≥ 10):
P10 = sortedSamples[floor(0.1 × sampleCount)]
P90 = sortedSamples[floor(0.9 × sampleCount)]
normalizedValue = (rawValue - P10) / (P90 - P10)

During warmup (samples < 10):
normalizedValue = rawValue / fallbackDivisor

Percentile calculations are cached with a 5-second TTL and invalidated only when new samples are added, ensuring minimal CPU overhead during high-frequency gesture processing.
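The two-phase normalization above can be sketched as one function. The clamp to 0-1 and the guard for a degenerate (P10 = P90) range are assumptions; the 5-second cache is omitted for clarity.

```javascript
// P10-P90 percentile normalization with warmup fallback.
function percentileNormalize(rawValue, samples, fallbackDivisor) {
  if (samples.length < 10) {
    // Warmup: fallback divisor supplies a reasonable default scale.
    return rawValue / fallbackDivisor;
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const p10 = sorted[Math.floor(0.1 * sorted.length)];
  const p90 = sorted[Math.floor(0.9 * sorted.length)];
  if (p90 === p10) return 0.5; // degenerate range guard (assumption)
  const v = (rawValue - p10) / (p90 - p10);
  return Math.min(1, Math.max(0, v)); // clamp to 0-1 (assumption)
}
```

With 20 samples 0-19, P10 is 2 and P90 is 18, so a raw value of 10 normalizes to exactly 0.5, regardless of any outliers that might sit above P90.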

7. Audio Architecture

Unified Audio Chain

All audio sources share a common routing architecture with independent volume control and stereo panning. Each source connects to a master bus with optional effects sends.

Source Synth Type Routing
Local gestures MonoSynth (gestureSynth) synth → pan → volume → master + FX
Remote users Per-user PolySynth (UserSynthManager) synth → pan → volume → master + FX
Background Ambient layers (bass, pad, chords) layer → filter → volume → master + FX

Real User Synth Patches (8 presets)

Slot Name Oscillator Character
0 Retro Square Square Buzzy, 8-bit
1 Nasal Reed Pulse (20% width) Nasal, oboe-like
2 Warm Chorus Fat Sawtooth (3 voices) Lush, detuned
3 Bell Chime FM Sine Bell-like, metallic
4 Soft Square Square Softer, muffled
5 Wide Pulse Pulse (40% width) Wider, hollower
6 Bright Chorus Fat Sawtooth (2 voices) Brighter, wider spread
7 Deep Bell FM Triangle Deeper, more metallic

Drum Kit Presets (3 kits)

Slot Name Instruments Character
8 808 Kit BD, SN, HH, OH Deep sub-bass kick, tight snare, closed hi-hat, open hi-hat with choke
9 Acoustic Kit BD, SN, HH, OH Natural resonant kick, bright snare, shimmering hi-hat
10 Electronic Kit BD, SN, HH, OH Punchy synthetic kick, noisy snare, glitchy hi-hat
Hi-Hat Choke: Closed hi-hat (HH) chokes any active open hi-hat (OH), and OH self-chokes the previous OH hit—per-user tracking ensures correct choke behavior in multi-user rooms.

Synth Panel

In collaborative rooms, each user can customize their synth sound in real-time through a channel-strip panel. Preset selection uses server-side conflict resolution: occupied presets are disabled. All parameter changes sync in real-time via WebSocket to all room participants, and late-joining users receive the current state on connection.

Control Group Parameters Range
Oscillator Context-sensitive: pulse width, fat spread, or FM modulation index Depends on preset oscillator type
Filter Type (LP/HP/BP), cutoff, resonance (Q) Cutoff range depends on filter type: LP 80-20000Hz, HP 20-12000Hz, BP 80-12000Hz (logarithmic). Q range: LP/HP 0.1-20.0, BP 0.1-8.0
Envelope Attack, Decay, Sustain, Release A: 0.001-4.0s, D: 0.001-8.0s, S: 0-1, R: 0.001-10.0s
Output Volume, Pan Volume: -12 to +12dB, Pan: -1 to +1
Effects Delay send, Reverb send 0-1 (dry to full wet)

Drum Mode Controls

When a drum kit preset is selected, the Synth Panel switches to drum-specific controls. Each instrument (bass drum, snare, hi-hat) has independent parameter sliders. Canvas gestures also change in drum mode: tap triggers the snare, and drag produces a snare roll (repeated hits at drag speed).

Instrument Parameters Notes
Bass Drum (BD) Pitch, Decay, Tone Sub-bass fundamental with tone-controlled harmonics
Snare (SN) Pitch, Decay, Tone, Delay Noise + membrane blend; delay adds pre-attack rattle
Hi-Hat (HH) Pitch, Decay, Tone, Delay MetalSynth-based; short decay for closed, longer for open character

A global Volume and Reverb send apply to the entire drum kit. All parameter changes sync in real-time via WebSocket, identical to synth mode.

Color Theming: All Synth Panel controls are themed with the user's assigned color via the CSS custom property --synth-accent, providing instant visual identification of which user's panel is active.

8. Background Composition System

Counterpoint + Accompaniment Architecture

Background music is generated through two complementary systems. A full counterpoint engine maintains three always-active voices (melody, harmony, bass_voice) that follow voice leading rules and produce independent melodic lines. An AccompanimentEngine adds three supporting layers with genre-specific patterns.

AccompanimentEngine Layers

Layer MIDI Range Role Genre Examples
bass_accomp E1-C3 (28-48) Voice-led bass foundation Walking bass (jazz), driving eighths (electronic), power bass (rock), Alberti bass (classical), sustained roots (ambient)
pad C3-C5 (48-72) Sustained harmonic pads Chord voicings with stagger, extensions at high tension (7ths when >0.6, 9ths when >0.8)
keys C3-C6 (48-84) Genre-specific rhythmic patterns Arpeggios (electronic), jazz comping with swing, rock power stabs, classical block chords, sparse ambient pads

Voice Leading & Velocity

All accompaniment layers use voice leading to minimize intervallic jumps between chord changes. Jumps exceeding 5 semitones (STEPWISE_THRESHOLD) incur a penalty, favoring smooth transitions. Dynamic velocity curves shape each section: crescendo, diminuendo, swell, terraced, or stable—selected based on the section's dynamic contour.

velocityCurve[crescendo] = 0.5 + (position × 0.7) [rising through section]
velocityCurve[diminuendo] = 1.2 - (position × 0.7) [fading through section]
velocityCurve[swell] = 0.5 + sin(position × π) × 0.7 [arch shape]
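The three documented curves map directly to a small selector. The function shape and the stable fallback of 1.0 are illustrative assumptions.

```javascript
// Velocity curve selection by dynamic contour; position is 0-1 progress
// through the section.
function velocityCurve(contour, position) {
  switch (contour) {
    case "crescendo":  return 0.5 + position * 0.7;                 // rising
    case "diminuendo": return 1.2 - position * 0.7;                 // fading
    case "swell":      return 0.5 + Math.sin(position * Math.PI) * 0.7; // arch
    default:           return 1.0; // stable / terraced fallback (assumption)
  }
}
```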

Harmonic Progressions

Harmonic progressions are managed by the HarmonicEngine with genre-specific generators. Each genre has 10-12 progressions ordered by complexity, selected based on harmonic tension from the current section context. Progression selection uses φ-based stepping combined with section tension for deterministic variety. Chord extensions (7ths, 9ths) are added dynamically when harmonic tension exceeds threshold values.

φ-Based Variation: The AccompanimentEngine receives the current compositionCount as a parameter, using φ-based calculations to ensure different pattern selections, velocity curves, and articulation choices across successive compositions. This prevents the accompaniment from becoming predictable over extended listening sessions.

9. Drone & Void Detection

The DroneVoidController makes the ambient drone "emerge during activity voids"—filling silence with atmospheric sound while fading out during active musical events.

DroneVoidController Parameters

Parameter Value Purpose
voidTimeoutMin 5000ms Minimum silence before drone emerges
voidTimeoutMax 10000ms Full drone emergence time
fadeInTime 2.0s Quick fade in to fill void
fadeOutTime 20.0s Gradual fade out on activity
droneNominalDb -3dB Full drone level
droneSilentDb -60dB Effectively silent
updateInterval 100ms Check frequency for smooth transitions

Void Score Calculation

Three factors are combined multiplicatively—any single factor can suppress the drone:

voidScore = timeScore × influenceVoidScore × noteVoidScore

timeScore = (timeSinceActivity - 5000) / (10000 - 5000) [clamped 0-1]
influenceVoidScore = 1 - userInfluence
noteVoidScore = activeNotes > 0 ? 0 : 1
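Putting the three factors together, the multiplicative combination might look like this (the clamp helper and function names are assumptions based on the formulas above):

```javascript
function clamp01(v) { return Math.min(1, Math.max(0, v)); }

// Multiplicative void score: any single factor at 0 fully suppresses the drone.
function voidScore(timeSinceActivityMs, userInfluence, activeNotes) {
  const timeScore = clamp01((timeSinceActivityMs - 5000) / (10000 - 5000));
  const influenceVoidScore = 1 - userInfluence;
  const noteVoidScore = activeNotes > 0 ? 0 : 1;
  return timeScore * influenceVoidScore * noteVoidScore;
}
```

After 10 seconds of total silence the score is 1 (full drone); a single active note drops it to 0 immediately, which is what lets the drone "fill voids" without competing with live events.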

LFO Modulation (Organic Breathing)

Two LFOs add subtle organic movement to the drone sound, with randomized phases to prevent synchronized modulation patterns.

LFO Frequency Cycle Range Effect
Amplitude 0.03 Hz ~33s -6dB to 0dB Very slow volume breathing
Pitch 0.05 Hz ~20s ±8 cents Subtle organic detuning
Dual Control: The DroneVoidController uses a dedicated gain node (droneAmplitudeGain) for overall presence control, allowing the amplitude LFO to continue providing subtle breathing modulation even while the drone fades in/out.

9b. Visual Rendering

All visual rendering uses PixiJS 8 (WebGL GPU) on both the landing page and collaborative rooms. The rendering pipeline includes spring-mesh network topology, particle pools, pulse pools, wave packet systems, precomputed attractor fields, and neon nebula layers—all composited at 60fps through PixiJS’s WebGL batching. A PixiAdapter service bridges the application’s visual subsystems to the PixiJS renderer.

10. Musical Scheduler

The MusicalScheduler prioritizes clock consistency over low latency, ensuring perfect synchronization between local and remote gestures. Remote events may wait but will always be scheduled in sync with the local clock.

Clock Synchronization

Parameter Value Purpose
Tick interval 25ms (40Hz) Web Worker precision timer
Local lookahead 100ms Immediate scheduling for local events
Remote lookahead 250ms Schedule ahead for jitter tolerance
Default tempo 120 BPM Base tempo (range: 60-140 BPM)
Time signature 4/4 Standard time signature
Grid resolution 16th note Remote event quantization

Beat Grid Assignment (Landing Page)

Virtual user notes are distributed across the beat grid to create rhythmic variety:

Wikipedia notes: ticks 0, 4, 8, 12 (downbeats)
HackerNews notes: ticks 2, 6, 10, 14 (off-beats)
GitHub notes: ticks 1, 5, 9, 13 (upbeats)
Local vs Remote: Local events have 100ms lookahead for immediacy. Remote events use 250ms lookahead and snap to the nearest sixteenth-note boundary for global synchronization despite network jitter.
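Snapping a remote event to the sixteenth-note grid could be sketched as follows. This works in seconds for illustration; the function name is an assumption, and the actual scheduler presumably works in tick indices on the Web Worker clock.

```javascript
// Quantize an event time to the nearest sixteenth-note boundary at a tempo.
function quantizeToSixteenth(eventTimeSec, bpm) {
  const sixteenth = 60 / bpm / 4; // sixteenth-note duration in seconds
  return Math.round(eventTimeSec / sixteenth) * sixteenth;
}
```

At 120 BPM a sixteenth note lasts 125ms, so a remote event arriving at 0.3s snaps to 0.25s; with 250ms of lookahead, the snap happens before the audio deadline despite network jitter.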

11. Harmonic Engine & Voice Leading

Available Scales

Fourteen scales are available for melodic generation, covering all seven modes, harmonic and melodic minor variants, pentatonic scales, and special scales:

Scale Intervals (semitones) Character
Ionian (Major) 0, 2, 4, 5, 7, 9, 11 Bright, happy
Dorian 0, 2, 3, 5, 7, 9, 10 Jazz, modal
Phrygian 0, 1, 3, 5, 7, 8, 10 Spanish, exotic
Lydian 0, 2, 4, 6, 7, 9, 11 Dreamy, floating
Mixolydian 0, 2, 4, 5, 7, 9, 10 Bluesy, rock
Aeolian (Natural minor) 0, 2, 3, 5, 7, 8, 10 Sad, introspective
Locrian 0, 1, 3, 5, 6, 8, 10 Dark, unstable
Harmonic Minor 0, 2, 3, 5, 7, 8, 11 Dramatic, exotic
Melodic Minor 0, 2, 3, 5, 7, 9, 11 Jazz sophistication
Major Pentatonic 0, 2, 4, 7, 9 Safe, no dissonance
Minor Pentatonic 0, 3, 5, 7, 10 Bluesy, safe
Blues 0, 3, 5, 6, 7, 10 Expressive, bent
Whole Tone 0, 2, 4, 6, 8, 10 Dreamlike, ambiguous
Diminished 0, 2, 3, 5, 6, 8, 9, 11 Symmetrical, tense

SATB Voice Ranges (Background Composition)

Background composition uses traditional SATB voice ranges for proper voice leading:

Voice MIDI Range Notes Character
Soprano 60-79 C4-G5 High density, staccato
Alto 55-72 G3-C5 Medium density
Tenor 48-67 C3-G4 Sparse, long sustain
Bass 36-55 C2-G3 Low density, legato

Voice Leading Rules

The CounterpointEngine validates voice leading to ensure musical coherence:

Consonant Intervals

Allowed intervals: Unison (0), Minor 3rd (3), Major 3rd (4), Perfect 5th (7), Major 6th (9), Octave (12)

Form-Driven Harmonic Modulation

Key changes are gated by the composition's form structure: modulation can only occur at section transitions, not within sections. The SECTION_MODULATION_RULES determine which modulation targets are allowed based on the current section's harmonic function.

Harmonic Function Modulation Allowed Targets
Tonic No (returns to tonic)
Dominant Yes dominant, relative
Subdominant Yes subdominant, relative
Predominant Yes dominant, subdominant
Chromatic Yes mediant, chromatic, tritone

Genre-specific modulation profiles control how frequently and where each genre modulates:

Genre Frequency Preferred Targets Temp. Multiplier
Ambient very low relative 0.3
Rock low relative 0.5
Rhythmic low relative 0.5
Electronic low relative, mediant 0.6
Pop low relative, dominant 0.7
Classical medium dominant, relative 0.8
Melodic medium relative, dominant 0.8
Jazz high tritone, chromatic, dominant 1.3
Experimental high tritone, chromatic, mediant 1.5
Temperature Smoothing: Modulation distance is smoothed using exponential smoothing (alpha=0.15) to prevent sudden jumps in tonal center. Recapitulation sections return to the original tonic with 80% probability, maintaining formal coherence.
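The exponential smoothing mentioned above is a one-line update rule. This is a generic sketch of exponential smoothing with the stated alpha; the function and parameter names are assumptions.

```javascript
// Exponential smoothing of modulation distance: with alpha = 0.15, each
// update moves only 15% of the way toward the target, so the tonal center
// drifts gradually rather than jumping.
function smoothModulationDistance(previous, target, alpha = 0.15) {
  return previous + alpha * (target - previous);
}
```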

12. Audition System

The Audition feature generates virtual gestures for testing synth patches without performing manual gestures. Generation occurs server-side (via the AuditionGestureService), ensuring other room participants hear the audition output. Timing uses φ-based intervals for natural distribution.

Audition Parameters

Parameter Range Description
Source random / metrics Random generation or web metrics-driven gesture parameters
Frequency 0.25-2.0 events/sec Event rate (500ms-4000ms interval, φ-based jitter)
Regularity 0-1 Timing jitter: 0 = varied intervals, 1 = metronomic
Uniformity 0-1 Event similarity: 0 = diverse parameters, 1 = identical events
Gesture Type 0-1 0 = all taps, 1 = all drags, 0.5 = equal mix
Range 3-60 semitones Note range (minor 3rd to 5 octaves)

Frequency Range

Audition frequency range: A2-A5 (110-880Hz)
Canvas gesture range: 24-60 semitones (extended from original 24 to match audition range)
Automatic Pause: Audition pauses automatically when the user performs a real gesture and resumes on release, preventing collision between audition and live performance. The audition button pulses with each generated note for visual feedback. Security: all audition parameters are validated and clamped server-side to prevent abuse.

13. Sequencer System

The Sequencer generates repeating step-based patterns synchronized to the room’s composition tempo. It operates in two modes: melodic (pitch-based steps) and drum (per-instrument grid). Like the Audition feature, generation occurs server-side (via the SequencerGestureService), so all room participants hear the output. In melodic mode, each step emits hold:start / hold:end gestures and feeds raw pitch material into the BackgroundCompositionService.

Melodic Sequencer Parameters

Parameter Range Description
Steps 3–16 Number of steps in the loop (default: 8)
Speed 0.25×–8× BPM multiplier (minimum effective interval: 100ms)
Per-step Pitch 0–27 (slider) 7 scale degrees (I–VII) × 4 octaves (2–5) = 28 positions
Per-step State N / M / R Normal (plays configured pitch), Mute (skips audio, LED still advances), Random (random degree 1–7 and octave 2–5 each cycle)

Step Timing

stepInterval = (60000 / BPM) / speedMultiplier
noteDuration = stepInterval × 0.8
// BPM refreshed from BackgroundCompositionService between steps
// Minimum interval enforced: 100ms (max 10 steps/sec)

Pitch Resolution

rawMidi = tonic + scaleInterval[degree] + (octave × 12)
quantizedMidi = HarmonicEngine.constrainToScale(rawMidi, key, mode)
// Raw pitch → addMaterial() (composition input)
// Quantized pitch → hold:start (playback only)
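The two pseudocode fragments above can be combined into a runnable sketch. The 100ms floor matches the stated limit; the major-scale interval table stands in for HarmonicEngine's scale data and is an assumption, as are the function names.

```javascript
// Step interval with the documented 100ms floor (max 10 steps/sec).
function stepIntervalMs(bpm, speedMultiplier) {
  const interval = (60000 / bpm) / speedMultiplier;
  return Math.max(100, interval);
}

// Illustrative pitch resolution; MAJOR stands in for the engine's scale
// tables (assumption). degree is 0-6, octave counts 12-semitone offsets.
const MAJOR = [0, 2, 4, 5, 7, 9, 11];

function rawMidi(tonic, degree, octave) {
  return tonic + MAJOR[degree] + octave * 12;
}
```

At 120 BPM and 1× speed a step lasts 500ms (note duration 400ms at the 0.8 factor); at 8× speed the raw 62.5ms interval is clamped to the 100ms floor.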

Drum Sequencer Mode

When a drum kit preset is active, the sequencer switches to a four-instrument grid. Instead of pitch sliders, each row represents an instrument (BD, SN, HH, OH) and each step can be toggled on or off. The open hi-hat (OH) follows the same choke rules as live playback: HH chokes active OH, OH self-chokes previous OH.

Row Instrument Step States
1 Bass Drum (BD) On / Off per step
2 Snare (SN) On / Off per step
3 Hi-Hat (HH) On / Off per step
4 Open Hi-Hat (OH) On / Off per step
Sequencer Behaviour: The sequencer auto-pauses when the user performs a real gesture and resumes on release, identical to audition. Audition and Sequencer are mutually exclusive—starting one stops the other (enforced on both frontend and backend). Timer safety follows the closure-capture pattern: isActive = false before clearing timers, and roomId/userId/userColor are captured before async operations for disconnect safety. Rate limiting: frontend throttles parameter changes at 150ms, backend rate-limits config updates at 100ms/update, cursor emission capped at 10Hz.

14. Metric-to-Gesture Classification

Virtual user gestures are classified based on which metric characteristic is dominant. This uses purely relative comparison—no hardcoded thresholds.

Classification Metrics

Metric Formula / Source Gesture Audio Result
Stability 1 - normalizedVelocity TAP Single note (50-300ms)
Density Wiki: avgEditSize; HN: avgUpvotes; GH: createsPerMinute DRAG Phrase (3-8 notes, 300-3000ms)

Gesture Selection

gestureType = stability > density ? TAP : DRAG
// Higher stability produces single notes, higher density produces phrases

Intent Thresholding

Gestures are only generated when normalized velocity exceeds an activity-based threshold:

gestureIntent = 0.1 × (1 - activityLevel × 0.5)
// High activity = lower threshold = more frequent gestures
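The classification and intent rules above reduce to two small functions (names are illustrative; the formulas are taken directly from this section):

```javascript
// Purely relative comparison: whichever characteristic dominates wins.
function classifyGesture(stability, density) {
  return stability > density ? "TAP" : "DRAG";
}

// Activity-scaled intent threshold: high activity lowers the bar,
// producing more frequent gestures.
function gestureIntentThreshold(activityLevel) {
  return 0.1 * (1 - activityLevel * 0.5);
}
```

A stable, low-velocity metric stream yields TAPs (single notes); a dense stream yields DRAGs (phrases). At full activity the threshold halves from 0.1 to 0.05.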

Composition Frequency

Composition timing derives from energy level and composition count, cycling through predetermined phrase lengths for variety:

baseBeats = [8, 12, 16, 10, 14] // Cycles through these lengths
beatIndex = compositionCount % 5
energyModifier = 1 - (energy × 0.3) // 0.7-1.0
beatsPerComposition = baseBeats[beatIndex] × energyModifier

High energy → 5.6-11.2 beats (shorter phrases)
Low energy → 8-16 beats (longer phrases)
Dynamic Normalization: All metric values use P10-P90 percentile-based normalization from a rolling 2-minute sample window, preventing outliers from skewing the classification.