ESSENTIA CHOP SUITE

Real-time & Batch Audio Analysis for TouchDesigner

Five C++ CHOP plugins powered by Essentia, exposing spectrum analysis, mel bands, MFCCs, pitch detection, key estimation, onset/BPM tracking, and EBU R128 loudness metering, all running natively inside TouchDesigner. The four analysis operators support both real-time per-frame analysis and offline batch processing of entire audio files; the Spectrum operator supplies the shared FFT for real-time use.

WHAT IT DOES

Five Operators, Two Modes

Each analysis operator (Spectral, Tonal, Rhythm, Loudness) has a Mode parameter that switches between real-time per-frame analysis and offline batch processing. The fifth operator, Spectrum, provides the shared FFT for real-time mode.

Audio flows through two independent paths in Realtime mode:

Spectral path (Realtime) — Audio CHOP feeds Essentia Spectrum, which computes the FFT once. Essentia Spectral, Essentia Tonal, and Essentia Rhythm all read from the shared spectrum output. Spectral descriptors, mel bands, pitch, chroma, key, dissonance, onset detection, and BPM estimation all share a single FFT computation per frame.

Direct path (Realtime) — Audio CHOP feeds Essentia Loudness (EBU R128 loudness, RMS energy, zero-crossing rate) independently. This operator works directly on the time-domain audio signal and does not depend on the shared spectrum.

Batch mode — When set to Batch, each operator processes an entire audio buffer (e.g. from a File In CHOP) on a background thread. Each operator handles its own windowing and FFT internally — no Spectrum CHOP needed. Output contains one sample per analysis frame across the full file. Trigger analysis with the Compute pulse or enable Autocompute for automatic re-analysis when the input changes.

Analysis, not visualization — Essentia Spectrum outputs a linear-bin FFT magnitude spectrum designed for downstream analysis algorithms. Its bins are uniformly spaced in Hz, which is what Essentia's algorithms expect but looks bottom-heavy when plotted — most musical detail is crammed into the lower bins. For spectral visualization, use TouchDesigner's built-in Audio Spectrum CHOP, which provides a perceptually scaled output suited for display. Note that TD's Audio Spectrum cannot be used as input to the Essentia analysis CHOPs — they require the linear-bin format that Essentia Spectrum provides.

Mono by design — The suite processes a single audio channel. Stereo analysis would double every output channel (e.g., mfcc0_L / mfcc0_R, spectral_centroid_L / spectral_centroid_R), making outputs unwieldy and harder to map in a visual context. For stereo-aware analysis, use a Select CHOP to pick each channel independently and run two separate analysis chains. For most audio-reactive scenarios, the recommended approach is to average left and right with a Math CHOP (Combine Channels → Average) before feeding into Essentia Spectrum. This keeps the content of both channels in a single signal. Be aware that averaging does not avoid phase cancellation: material that is out of phase between left and right is attenuated in the downmix.

GETTING STARTED

Installation

Pre-built DLLs — no compilation required. Drop them into TouchDesigner's plugin folder and restart.

Step 1 — Copy the DLLs

Place all 5 plugin DLLs into TouchDesigner's Plugins folder (or any subfolder — TD scans subdirectories):

EssentiaSpectrumCHOP.dll

EssentiaSpectralCHOP.dll

EssentiaTonalCHOP.dll

EssentiaRhythmCHOP.dll

EssentiaLoudnessCHOP.dll

Default plugin path:

C:/Users/<you>/Documents/Derivative/Plugins/

Step 2 — Restart TouchDesigner

TouchDesigner loads plugin DLLs at startup. After copying the files, restart TD for the new operators to appear.

Step 3 — Add the operators

In the OP Create Dialog (Tab key), search for the operator names:

Essentia Spectrum, Essentia Spectral, Essentia Tonal, Essentia Rhythm, Essentia Loudness

Tip — For Realtime mode, connect an Audio CHOP to Essentia Spectrum first, then wire the Spectrum output to Spectral, Tonal, and Rhythm. Loudness takes raw audio directly. For Batch mode, connect a File In CHOP directly to any analysis operator — no Spectrum CHOP needed. See the Signal Flow section for details.

OPERATORS

The Plugin Suite

Each operator is a standalone DLL loaded by TouchDesigner as a custom CHOP.

Essentia Spectrum CORE

Computes the FFT magnitude spectrum from raw audio. Outputs a single channel named spectrum that holds fftSize/2+1 bins as a static sample buffer. This is the shared upstream node: the Spectral, Tonal, and Rhythm CHOPs all read from it.

Input

Audio CHOP (mono, first channel used)

Output

1 channel × 513 samples (at FFT 1024) — static indexed buffer, not time-domain

Processing

Reads latest fftSize samples → Windowing (Hann / Hamming / Triangular / Blackman-Harris) → Zero Padding (optional) → Essentia Spectrum algorithm

Parameters

FFT Size, Hop Size, Window Type, Zero Padding

Essentia Spectral ANALYSIS

Computes spectral shape descriptors and mel-frequency band energies. Supports Realtime (per-frame from shared spectrum) and Batch (full-file offline analysis). Each feature group can be independently toggled.

Input

Realtime: Essentia Spectrum CHOP | Batch: Audio/File In CHOP (raw audio)

Output

Realtime: 1 sample per channel at frame rate | Batch: N samples (one per analysis frame). Channels: mfcc0–mfcc12, spectral_centroid, spectral_flux, spectral_rolloff, spectral_contrast0–spectral_contrast5, hfc, spectral_complexity, mel0–melN

Features

MFCC (timbral fingerprint, 13 coefficients) — Centroid (brightness) — Flux (rate of spectral change) — Rolloff (85% energy frequency) — Contrast (peak-valley difference in 6 bands) — HFC (high-frequency content) — Complexity (number of spectral peaks) — Mel Bands (perceptual frequency band energies)

Parameters

Enable MFCC, MFCC Count, MFCC Low Freq, MFCC High Freq, Enable Centroid, Enable Flux, Flux Half Rectify, Flux Norm, Enable Rolloff, Rolloff Cutoff, Enable Contrast, Contrast Bands, Enable HFC, HFC Type, Enable Complexity, Complexity Threshold, Enable Mel Bands, Mel Bands Count, Mel Low Freq, Mel High Freq, Mel Freq Names, Log Mel (dB Scale), Mode

Essentia Tonal ANALYSIS

Detects pitch, harmonic content, musical key, dissonance, and inharmonicity. Supports Realtime (per-frame with EMA smoothing) and Batch (full-file with global or windowed key detection).

Input

Realtime: Essentia Spectrum CHOP | Batch: Audio/File In CHOP (raw audio)

Output

Realtime: 1 sample per channel | Batch: N samples per channel. Channels: pitch, pitch_confidence, note_a–note_gs (12 bins with Musical Labels on), key, key_scale, key_strength, dissonance, inharmonicity

Features

Pitch (YinFFT by default, selectable via Pitch Algorithm; Hz + confidence) — HPCP (harmonic pitch class profile / chroma, 12/24/36 bins) — Key (key + major/minor + strength) — Dissonance (sensory roughness) — Inharmonicity (deviation from harmonic series)

Parameters

Pitch Algorithm, HPCP Size, Enable Pitch, Pitch Min Freq, Pitch Max Freq, Pitch Tolerance, Enable HPCP, HPCP Harmonics, Reference Freq, HPCP Non-Linear, HPCP Normalized, Enable Key, Key Profile, Key Frames, Peak Threshold, Peak Max Freq, Enable Dissonance, Enable Inharmonicity, Musical Labels, Enable Pitch Note, Smoothing, Mode

Essentia Rhythm ANALYSIS

Detects onsets and estimates tempo. Realtime uses TempoTapDegara BPM with Onsets-style adaptive thresholding. Batch uses RhythmExtractor2013 for full-file BPM and beat tracking.

Input

Realtime: Essentia Spectrum CHOP | Batch: Audio/File In CHOP (raw audio)

Output

Realtime: 1 sample per channel | Batch: N samples per channel. 6 channels: onset (0/1 trigger), onset_strength, bpm, beat (0/1 trigger), beat_phase (0–1 sawtooth), beat_confidence

Processing

RT: OnsetDetection/SuperFlux → Onsets-style adaptive threshold → onset trigger. ODF history → TempoTapDegara → BPM + tick-anchored beat phase. Batch: RhythmExtractor2013 → BPM/beats + per-frame onset detection via Onsets algorithm.

Parameters

Onset Method, Onset Sensitivity, BPM Min, BPM Max, Rhythm Method, Mode

Essentia Loudness METERING

EBU R128 loudness metering with momentary, short-term, and integrated measurements, plus RMS energy and zero-crossing rate. Supports Realtime (per-frame via ring buffer) and Batch (full-file offline analysis).

Input

Audio CHOP (mono) — both modes take raw audio directly

Output

Realtime: 1 sample per channel | Batch: N samples per channel. 7 channels: loudness (instantaneous dB), loudness_momentary, loudness_shortterm, loudness_integrated, dynamic_range, rms, zcr

Processing

Audio → ring buffer → frame dispatch → Essentia Loudness → dB conversion → sliding windows (momentary 400ms, short-term 3s) → EBU R128 two-pass gating for integrated. RMS and ZCR computed directly from the audio frame.
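
The two-pass gating at the end of this chain can be sketched as follows. This is a simplified illustration of the gating scheme described in ITU-R BS.1770 / EBU R128, not the plugin's actual code: the function names are hypothetical, the input is assumed to be a list of per-block loudness values in dB, and the relative gate of 10 LU below the absolute-gated level is taken from the standard rather than from this document.

```python
import math

def _energy_mean_db(levels_db):
    """Average dB values in the power domain and convert back to dB."""
    mean_power = sum(10 ** (level / 10) for level in levels_db) / len(levels_db)
    return 10 * math.log10(mean_power)

def integrated_loudness(block_levels_db, absolute_gate_db=-70.0, relative_gate_lu=-10.0):
    """Two-pass gated loudness over a list of block loudness values (dB)."""
    # Pass 1: drop blocks at or below the absolute gate (silence).
    above_abs = [l for l in block_levels_db if l > absolute_gate_db]
    if not above_abs:
        return float("-inf")
    # Pass 2: drop blocks more than 10 LU below the level of the survivors.
    relative_gate_db = _energy_mean_db(above_abs) + relative_gate_lu
    gated = [l for l in above_abs if l > relative_gate_db]
    return _energy_mean_db(gated)

# Two loud blocks, one quiet passage, one near-silent block.
print(integrated_loudness([-20.0, -20.0, -50.0, -80.0]))
```

In this example the near-silent block is removed by the absolute gate and the quiet passage by the relative gate, so the result reflects only the two loud blocks.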

Parameters

Frame Size, Gate Threshold, Normalize, dB Floor, dB Ceiling, ZCR Threshold, Mode

ARCHITECTURE

Signal Flow

In Realtime mode, audio flows through two paths: the spectral path computes the FFT once and fans out to three analysis domains, while the direct path handles loudness metering. In Batch mode, each operator processes raw audio independently with its own FFT.

[Signal flow diagram, Realtime mode. Stages: Audio Input, FFT Processing, Spectral Analysis, Loudness & Energy]

Batch mode — When any analysis CHOP is set to Mode = Batch, it takes raw audio directly (no Spectrum CHOP needed) and handles its own windowing + FFT internally. The diagram above shows Realtime mode routing only.

CREATIVE APPLICATIONS

Output Use Cases

Every output channel has a purpose. This reference describes what each channel measures and suggests practical ways to use it inside TouchDesigner for audio-reactive visuals, installations, and live performance.

Essentia Spectrum CORE

Channel	Range	What it does & how to use it
`spectrum`	0+ (fftSize/2+1 samples)	Raw FFT magnitude bins. Visualize as a bar graph, 3D terrain, or heatmap. Use as input for custom analysis. The lower bins represent bass, upper bins represent treble — slice with a Select CHOP to isolate frequency bands.
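
To slice the spectrum into frequency bands, it helps to know which sample indices cover a given range in Hz. The helper below is a minimal sketch with a hypothetical name; it assumes bin k sits at k × sampleRate / fftSize, a 44100 Hz sample rate, and no zero padding.

```python
import math

def band_to_bins(low_hz, high_hz, sample_rate=44100, fft_size=1024):
    """Return (first_bin, last_bin), inclusive, covering low_hz..high_hz."""
    bin_hz = sample_rate / fft_size      # spacing between adjacent bins
    nyquist_bin = fft_size // 2          # index of the last bin
    first = max(0, math.ceil(low_hz / bin_hz))
    last = min(nyquist_bin, math.floor(high_hz / bin_hz))
    return first, last

print(band_to_bins(60, 250))      # bass region
print(band_to_bins(4000, 12000))  # treble region
```

The returned indices can then be used as the start and end of a sample range when trimming or selecting part of the spectrum buffer.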

Essentia Spectral ANALYSIS

Channel	Range	What it does & how to use it
`mfcc0`–`mfcc12`	~−50 to 50	Timbral fingerprint. MFCCs capture the "texture" of a sound independent of pitch. Use them to cluster similar sounds, drive visual style transitions based on tonal character, or distinguish instruments. Feed into a Math CHOP for normalization, then map to shader uniforms.
`spectral_centroid`	0 – sr/2 Hz	Brightness. The "center of mass" of the spectrum. High values mean bright, shimmery sounds; low values mean warm, bassy tones. Map to color temperature (orange ↔ blue), particle speed, or lighting intensity.
`spectral_flux`	0+	Rate of spectral change. Spikes when the timbre shifts suddenly — a new instrument enters, a filter sweeps, or a transition occurs. Use to trigger visual scene changes, glitch effects, or novelty highlights.
`spectral_rolloff`	0 – sr/2 Hz	Energy distribution edge. The frequency below which 85% of spectral energy lies. Distinguishes bright mixes from dark ones. Control cutoff-style visual filters, blur intensity, or fog density.
`spectral_contrast0`–`5`	unbounded	Per-band tonal vs. noisy content. Six sub-bands measuring peak-to-valley ratio. Drive multi-layer visual intensities — assign each band to a separate geometry layer or glow ring for a multi-band reactive sculpture.
`hfc`	0+	High-frequency content. Sensitive to cymbals, hi-hats, and sibilance. Use for sparkle/shimmer particle effects, percussive high-end triggers, or treble-reactive lighting.
`spectral_complexity`	0+	Number of spectral peaks. Simple tones (sine, flute) score low; complex sounds (orchestra, noise) score high. Drive visual density — particle count, fractal detail, or geometric subdivision level.
`mel0`–`melN`	0+	Perceptual frequency band energies. Mel bands approximate human hearing — evenly spaced in perceived pitch, not linear frequency. Ideal for multi-band audio visualizers, frequency-mapped color gradients, per-band particle emitters, or as input features for ML classifiers.
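
For intuition about spectral_centroid and spectral_rolloff, here is a minimal sketch of the textbook definitions on a linear-bin magnitude spectrum. The function names are hypothetical, and treating "energy" as squared magnitude is an assumption; the plugin's exact computation may differ in detail.

```python
def spectral_centroid(mags, sample_rate=44100):
    """Magnitude-weighted mean frequency in Hz (needs at least 2 bins)."""
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_hz = (sample_rate / 2) / (len(mags) - 1)
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total

def spectral_rolloff(mags, cutoff=0.85, sample_rate=44100):
    """Lowest bin frequency at which cumulative energy reaches cutoff * total."""
    energy = [m * m for m in mags]       # assumption: energy = magnitude squared
    target = cutoff * sum(energy)
    bin_hz = (sample_rate / 2) / (len(mags) - 1)
    running = 0.0
    for k, e in enumerate(energy):
        running += e
        if running >= target:
            return k * bin_hz
    return (len(mags) - 1) * bin_hz

# A single peak in the middle bin of a 5-bin spectrum (bins at 0..4 Hz).
print(spectral_centroid([0, 0, 1, 0, 0], sample_rate=8))
print(spectral_rolloff([0, 0, 1, 0, 0], sample_rate=8))
```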

Essentia Tonal ANALYSIS

Channel	Range	What it does & how to use it
`pitch`	0+ Hz	Fundamental frequency. Detected via YinFFT. Map to note names for generative music notation, control animation speed or vertical position with pitch, or track vocals for lip-sync effects.
`pitch_confidence`	0 – 1	Pitch reliability. Low during noise or silence, high for clear tonal content. Use as a gate — only apply pitch-driven effects when confidence exceeds a threshold. Crossfade between pitched and unpitched visual modes.
`note_a`–`note_gs`	0 – 1	Chroma / pitch class energy. 12 bins (one per semitone, A through G#) with Musical Labels on by default. Bin 0 = A (reference frequency). Visualize as a harmony wheel, map each note to a unique color, detect chord changes for scene transitions, or build a real-time piano-roll display. Set HPCP Size to 24 or 36 for finer resolution.
`key`	0 – 11	Musical key. Encoded integer (0=C, 1=C#, ... 11=B). Assign color palettes or scene themes per key. Drive generative art parameters that shift as the music modulates to a new key.
`key_scale`	0 or 1	Major or minor. 0 = major, 1 = minor. Use for mood — major keys → warm, bright, expansive palettes; minor keys → cool, dark, constrained palettes. Switch between two visual presets.
`key_strength`	0 – 1	Key confidence. Higher when the harmonic content strongly matches a key template. Blend strength into color saturation or confidence-gated transitions.
`dissonance`	0 – 1	Sensory roughness. Measures perceptual dissonance between spectral peaks. Map to visual chaos — glitch intensity, distortion amount, turbulence in fluid simulations, or camera shake.
`inharmonicity`	0 – 1	Deviation from harmonic series. Low for flutes and voices, high for bells and percussion. Distinguish melodic from percussive sources, or drive material textures (metallic vs. organic).
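
Mapping pitch to a note name is a short calculation. The sketch below uses equal temperament relative to a reference A and numbers pitch classes from A, matching the HPCP bin order described above. The function name is hypothetical, and whether the plugin's own pitch-note output uses the same numbering is an assumption.

```python
import math

NOTE_NAMES = ["A", "A#", "B", "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#"]

def pitch_to_note(freq_hz, reference_hz=440.0):
    """Return (class_index, name) for a frequency, or None if unpitched."""
    if freq_hz <= 0:
        return None
    # Nearest number of equal-tempered semitones away from the reference A.
    semitones = round(12 * math.log2(freq_hz / reference_hz))
    index = semitones % 12
    return index, NOTE_NAMES[index]

print(pitch_to_note(440.0))
print(pitch_to_note(261.63))  # middle C
```

In practice, gate the result with pitch_confidence so that noise and silence do not produce spurious note changes.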

Essentia Rhythm ANALYSIS

Channel	Range	What it does & how to use it
`onset`	0 or 1	Transient trigger. Fires on detected attacks — drum hits, plucks, consonants. Use for particle bursts, flash/strobe triggers, camera cuts, step-sequenced events, or instantiating geometry.
`onset_strength`	0+	Continuous onset function. The raw detection signal before thresholding. Scale burst intensity or particle count by strength for velocity-sensitive triggers. Smooth with a Lag CHOP for an onset envelope.
`bpm`	BPM min – max	Estimated tempo. Sync LFO rates, animation cycle durations, or generative pattern timing to the music. Divide by 60 to get beats-per-second for direct use in Speed parameters.
`beat`	0 or 1	Beat trigger. Fires on the estimated beat grid. Use for rhythmic pulsing, beat-locked scene transitions, synchronized step-sequencing, or quantized color cycling.
`beat_phase`	0 – 1 (sawtooth)	Position within beat. Ramps linearly from 0 to 1 between beats. Feed into easing curves (Lookup CHOP) for smooth beat-synced animation — bouncing, breathing, pendulum swings, or rhythmic camera motion.
`beat_confidence`	0 – 1	Beat tracking reliability. Low during ambient or arrhythmic passages, high during steady beats. Use to fade in beat-synced effects only when tracking is stable, or crossfade to an onset-only mode when confidence drops.
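
Two of the mappings above are simple enough to sketch in plain Python: converting bpm to a rate in Hz, and shaping beat_phase into a pulse that peaks on the beat and decays until the next one. Both function names and the quadratic decay curve are illustrative choices, not part of the plugin.

```python
def bpm_to_hz(bpm):
    """Beats per minute to beats per second."""
    return bpm / 60.0

def beat_pulse(phase):
    """Map beat_phase (0..1) to a pulse: 1.0 on the beat, easing to 0.0."""
    phase = min(1.0, max(0.0, phase))
    return (1.0 - phase) ** 2

print(bpm_to_hz(120))
print([beat_pulse(p) for p in (0.0, 0.5, 1.0)])
```

Multiplying the pulse by beat_confidence is one way to fade the effect out when tracking is unstable.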

Essentia Loudness METERING

Channel	Range	What it does & how to use it
`loudness`	dB (~−100 to 0)	Instantaneous perceived loudness. The fastest-responding level signal. Use for frame-level reactive scaling — VU-meter needle, geometry size, or opacity. Responds immediately to transients.
`loudness_momentary`	dB	EBU R128 momentary (400 ms). A smoothed level that tracks phrase-level dynamics without jitter. Use for dynamic scaling of visual elements, responsive but stable brightness, or gain-riding effects.
`loudness_shortterm`	dB	EBU R128 short-term (3 s). Captures the current "energy zone" of the track. Use for scene-level intensity — ambient lighting adjustments, macro-level background color, or fog density that follows the song's structure.
`loudness_integrated`	dB	Running gated average. EBU R128 integrated loudness with absolute + relative gating. Use for overall show level monitoring, normalization reference, or auto-gain to keep visuals consistent across tracks of different loudness.
`dynamic_range`	dB (0+)	Loudness swing over 3 s. Peak-to-valley of the short-term window. Detects builds and drops — high values mean the audio is moving between quiet and loud. Drive contrast-based transitions, tension/release mapping, or drop-triggered explosions.
`rms`	0 – 1	Root mean square energy. The classic "audio reactive" signal — simple, linear amplitude. Map directly to geometry scale, opacity, displacement amount, or any parameter that should pulse with the music's energy.
`zcr`	0 – 1	Zero-crossing rate. Measures how "noisy" vs. "tonal" the audio is. High ZCR = noise, percussion, sibilance. Low ZCR = clean pitched tones. Use to drive grain/static effects, distinguish drums from melody, or control visual roughness.
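
The rms and zcr channels correspond to two simple time-domain measures. The following sketch shows the textbook definitions on a single frame of samples; the function names are hypothetical, and the way the dead-band is applied here (samples inside ±threshold are ignored) is an assumption about how a ZCR threshold could work, not a description of the plugin's code.

```python
import math

def rms(frame):
    """Root mean square of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame, threshold=0.0):
    """Sign changes per sample, ignoring samples inside +/- threshold."""
    if not frame:
        return 0.0
    crossings = 0
    prev_sign = 0
    for x in frame:
        if x > threshold:
            sign = 1
        elif x < -threshold:
            sign = -1
        else:
            continue  # inside the dead-band: does not count as a crossing
        if prev_sign and sign != prev_sign:
            crossings += 1
        prev_sign = sign
    return crossings / len(frame)

frame = [0.01, -0.01, 0.5, -0.5]
print(zero_crossing_rate(frame))                  # low-level chatter counted
print(zero_crossing_rate(frame, threshold=0.05))  # chatter ignored
```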

CONFIGURATION

Parameters Reference

Essentia Spectrum
Parameter	Type	Default	Options / Range	Description
FFT Size	Menu	1024	512 / 1024 / 2048 / 4096 / 8192 / 16384	Window size for FFT computation
Hop Size	Int	512	64 – 16384	Samples between analysis frames
Window Type	Menu	Hann	Hann / Hamming / Triangular / Blackman-Harris 62/70/74/92	Window function applied before FFT
Zero Padding	Menu	None	None / Half FFT / Full FFT	Interpolates the spectrum onto a finer bin grid (true frequency resolution is unchanged)

FFT Size & Quality Tradeoff

Larger FFT sizes improve frequency resolution (more bins, better at distinguishing close pitches) at the cost of time resolution (each frame covers more audio, smearing transients). For tonal analysis (pitch, key, HPCP), 2048–4096 is the sweet spot. For rhythm/onset detection, 1024 responds faster to transients. Going beyond 8192 has diminishing returns and adds latency. The default 1024 is a good general-purpose balance.
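
The tradeoff can be made concrete with a little arithmetic. This sketch assumes a 44100 Hz sample rate and no zero padding; the function name is illustrative.

```python
def fft_tradeoff(fft_size, sample_rate=44100):
    """Return (bin count, bin spacing in Hz, frame length in ms)."""
    bins = fft_size // 2 + 1
    bin_hz = sample_rate / fft_size
    frame_ms = 1000.0 * fft_size / sample_rate
    return bins, bin_hz, frame_ms

for size in (512, 1024, 2048, 4096, 8192, 16384):
    bins, bin_hz, frame_ms = fft_tradeoff(size)
    print(f"{size:>6}  {bins:>5} bins  {bin_hz:6.2f} Hz/bin  {frame_ms:7.1f} ms/frame")
```

Each doubling of the FFT size halves the bin spacing and doubles the length of audio each frame covers, which is why tonal analysis favors larger sizes and onset detection favors smaller ones.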

Window Type

Each window trades main-lobe width for side-lobe suppression. Hann is a good general default. The Blackman-Harris variants offer progressively stronger side-lobe suppression (62/70/74/92 dB) at the cost of wider main lobes — the 74 dB variant is often preferred for tonal analysis.

Zero Padding

Appends zeros to the windowed frame before FFT, which interpolates spectral bins without changing frequency resolution. This improves the accuracy of peak-based descriptors (centroid, rolloff, pitch) and produces smoother spectrum plots. “Half FFT” adds fftSize/2 zeros; “Full FFT” doubles the frame.
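
As a rough sketch of what the padding options imply, the helper below computes the padded frame length and the resulting bin spacing. The name and the 44100 Hz sample rate are assumptions, and whether the operator's output length actually grows with padding is not stated here, so treat the bin count as the size of the underlying FFT rather than a guaranteed output size.

```python
PAD_FACTOR = {"None": 1.0, "Half FFT": 1.5, "Full FFT": 2.0}

def padded_spectrum_shape(fft_size, padding="None", sample_rate=44100):
    """Return (padded frame length, FFT bin count, bin spacing in Hz)."""
    padded = int(fft_size * PAD_FACTOR[padding])
    bins = padded // 2 + 1
    bin_spacing_hz = sample_rate / padded
    return padded, bins, bin_spacing_hz

for option in ("None", "Half FFT", "Full FFT"):
    print(option, padded_spectrum_shape(1024, option))
```

The analysis window still spans only fftSize samples of real audio, which is why two close partials that blur together without padding remain blurred with it.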

Essentia Spectral
Parameter	Type	Default	Description
Mode	Menu	Realtime	Realtime (per-frame from Spectrum CHOP) / Batch (full-file offline)
FFT Size (Batch)	Menu	1024	Window size for batch FFT — only visible in Batch mode
Hop Size (Batch)	Int	512	Samples between analysis frames — only visible in Batch mode
Window Type (Batch)	Menu	Hann	Window function for batch FFT — only visible in Batch mode
Compute (Batch)	Pulse	—	Trigger batch analysis — only visible in Batch mode
Autocompute (Batch)	Toggle	Off	Auto-recompute when input changes — only visible in Batch mode
Enable MFCC	Toggle	On	Enable/disable MFCC output channels
MFCC Count	Int	13	Number of MFCC coefficients (1–20)
MFCC Low Freq	Float	0 Hz	Lower frequency bound for MFCC mel filters
MFCC High Freq	Float	11000 Hz	Upper frequency bound for MFCC mel filters
Enable Centroid	Toggle	On	Enable spectral centroid
Enable Flux	Toggle	Off	Enable spectral flux
Flux Half Rectify	Toggle	Off	Only count energy increases (onset emphasis)
Flux Norm	Menu	L2	L1 or L2 norm for difference computation
Enable Rolloff	Toggle	Off	Enable spectral rolloff
Rolloff Cutoff	Float	0.85	Energy fraction threshold (0.5 = median, 0.85 = standard, 0.95 = brightness)
Enable Contrast	Toggle	Off	Enable spectral contrast
Contrast Bands	Menu	6	Number of octave sub-bands (4 / 6 / 8)
Enable HFC	Toggle	On	Enable high-frequency content
HFC Type	Menu	Masri	Masri / Jensen / Brossier — different HFC formulations
Enable Complexity	Toggle	On	Enable spectral complexity
Complexity Threshold	Float	0.005	Minimum peak magnitude to count (0–0.1)
Enable Mel Bands	Toggle	On	Enable mel band output channels
Mel Bands Count	Menu	40	24 / 40 / 60 / 80 / 128
Mel Low Freq	Float	0 Hz	Lower frequency bound for mel filters
Mel High Freq	Float	22050 Hz	Upper frequency bound for mel filters
Mel Freq Names	Toggle	Off	Include frequency ranges in channel names
Log Mel (dB Scale)	Toggle	Off	Convert mel band output to dB scale
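
The Flux Half Rectify and Flux Norm options are easiest to understand from the computation itself. This is a minimal sketch of spectral flux between two consecutive magnitude spectra; the function name is hypothetical and the plugin may scale or normalize its result differently.

```python
import math

def spectral_flux(prev_mags, mags, norm="L2", half_rectify=False):
    """Distance between two consecutive magnitude spectra."""
    diffs = [b - a for a, b in zip(prev_mags, mags)]
    if half_rectify:
        # Keep only bins whose magnitude increased (emphasizes onsets).
        diffs = [max(d, 0.0) for d in diffs]
    if norm == "L1":
        return sum(abs(d) for d in diffs)
    return math.sqrt(sum(d * d for d in diffs))

prev, cur = [1.0, 0.0, 2.0], [0.0, 3.0, 2.0]
print(spectral_flux(prev, cur, norm="L1"))
print(spectral_flux(prev, cur, norm="L1", half_rectify=True))
```

With half rectification on, a note that stops contributes nothing, while a note that starts contributes its full magnitude change.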

MFCC Frequency Bounds

The default 0–11000 Hz covers most speech and music content. For voice-only analysis, narrow to 80–3400 Hz to exclude sub-bass and high-frequency noise. For full-band analysis, set High Freq to the Nyquist frequency (sampleRate/2).

HFC Type

Masri weights by energy×frequency (default), Jensen by amplitude×frequency² (stronger high-frequency emphasis), Brossier by amplitude×frequency (linear). Jensen and Brossier respond more aggressively to transients in the upper spectrum.
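
The three weightings can be written out directly. This sketch follows the formulas as described above, weighting each bin by its frequency in Hz; the function name and the choice of Hz rather than bin index as the frequency unit are assumptions.

```python
def hfc(mags, kind="Masri", sample_rate=44100):
    """High-frequency content of a magnitude spectrum (needs at least 2 bins)."""
    bin_hz = (sample_rate / 2) / (len(mags) - 1)
    freqs = [k * bin_hz for k in range(len(mags))]
    if kind == "Masri":       # energy x frequency
        return sum(f * m * m for f, m in zip(freqs, mags))
    if kind == "Jensen":      # amplitude x frequency squared
        return sum(f * f * m for f, m in zip(freqs, mags))
    if kind == "Brossier":    # amplitude x frequency
        return sum(f * m for f, m in zip(freqs, mags))
    raise ValueError(f"unknown HFC type: {kind}")

# Three bins at 0, 1, and 2 Hz for easy mental arithmetic.
for kind in ("Masri", "Jensen", "Brossier"):
    print(kind, hfc([0.0, 1.0, 3.0], kind, sample_rate=4))
```

Because the three variants produce values on very different scales, normalize each one separately before comparing or mapping them.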

Essentia Tonal
Parameter	Type	Default	Description
Mode	Menu	Realtime	Realtime (per-frame from Spectrum CHOP) / Batch (full-file offline)
FFT Size (Batch)	Menu	4096	Window size for batch FFT — tonal analysis needs high frequency resolution
Hop Size (Batch)	Int	2048	Samples between analysis frames
Compute (Batch)	Pulse	—	Trigger batch analysis
Autocompute (Batch)	Toggle	Off	Auto-recompute when input changes
Pitch Algorithm	Menu	YinFFT	YinFFT / YinProbabilistic
HPCP Size	Menu	12	12 (default, 1 per semitone) / 24 / 36 bins
Enable Pitch	Toggle	On	Enable pitch detection
Pitch Min Freq	Float	20 Hz	Minimum detectable frequency (constrain to instrument range)
Pitch Max Freq	Float	22050 Hz	Maximum detectable frequency
Pitch Tolerance	Float	1.0	Peak detection strictness (lower = fewer octave errors, more unvoiced frames)
Enable HPCP	Toggle	On	Enable chroma output
HPCP Harmonics	Int	0	Harmonic contributions (0 = fundamental only, 3–5 for harmonic instruments)
Reference Freq	Float	440 Hz	Tuning reference (415 = Baroque, 432 = alternative, 440 = standard)
HPCP Non-Linear	Toggle	Off	Apply peak-sharpening post-processing
HPCP Normalized	Menu	Unit Max	Unit Max / Unit Sum / None
Enable Key	Toggle	On	Enable key detection
Key Frames (RT)	Int	8	HPCP frames to average for key detection (1–300) — Realtime only
Key Profile	Menu	Temperley	Temperley (default) / Bgate / Krumhansl / EDMA / Diatonic / Gomez
Peak Threshold	Float	0.00001	Minimum spectral peak magnitude — filters noise-floor peaks from HPCP/Key/Dissonance/Inharmonicity
Peak Max Freq	Float	3500 Hz	Upper frequency limit for spectral peak detection — tonal content lives below 3500 Hz
Enable Dissonance	Toggle	On	Enable dissonance output
Enable Inharmonicity	Toggle	On	Enable inharmonicity output
Musical Labels	Toggle	On	Use note names (A through G#) instead of indices for HPCP channels
Enable Pitch Note	Toggle	Off	Output pitch-to-note-class channel
Smoothing (RT)	Float	0.5	EMA smoothing coefficient (0 = none, 1 = maximum) — Realtime only
Key Mode (Batch)	Menu	Global	Global (single key for whole file) / Windowed (per-frame key) — Batch only
Key Window Size (Batch)	Int	8	HPCP frames for windowed key detection — Batch only, visible when Key Mode = Windowed
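
The Smoothing (RT) coefficient describes an exponential moving average. A minimal sketch, assuming the common form y = s × previous + (1 − s) × input, so that 0 passes the input through and values near 1 respond very slowly. The class name is hypothetical and the plugin's exact formula is not documented here.

```python
class EmaSmoother:
    """Exponential moving average with a 0..1 smoothing coefficient."""

    def __init__(self, smoothing=0.5):
        if not 0.0 <= smoothing <= 1.0:
            raise ValueError("smoothing must be between 0 and 1")
        self.smoothing = smoothing
        self.value = None

    def update(self, x):
        if self.value is None:
            self.value = float(x)   # first sample initializes the state
        else:
            self.value = self.smoothing * self.value + (1.0 - self.smoothing) * x
        return self.value

ema = EmaSmoother(0.5)
print([ema.update(x) for x in (0.0, 1.0, 1.0, 1.0)])
```

With a coefficient of 0.5 the output covers half of the remaining distance to the input on every frame, so a step change is mostly settled after four or five frames.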

Recommended Settings

The defaults follow Essentia’s MusicExtractor recommendations. Here are per-use-case tweaks:

Key Detection (best accuracy): HPCP Size = 36, Key Profile = Temperley (default), HPCP Harmonics = 4, Key Frames = 30–60 (RT), Smoothing = 0.7. For batch, use Global key mode unless tracking modulations.

Pitch Tracking: Pitch Algorithm = YinFFT (default). Constrain frequency range to your source — voice 80–800 Hz, guitar 80–1200 Hz, bass 30–300 Hz. Smoothing = 0.3 for responsiveness.

Chord / Harmony Analysis: HPCP Size = 36, HPCP Harmonics = 8, HPCP Non-Linear = On, Key Profile = EDMA (electronic/dance) or Temperley (pop/rock).

Live Visuals (fast RT): HPCP Size = 12 (default), Musical Labels = On (default), Smoothing = 0.5, Key Frames = 8 (default). Enable Pitch Note for a 0–11 note class output.

Key Profile

Different profiles are tuned for different genres. Temperley (default) is a well-tested general-purpose profile used by Essentia’s MusicExtractor. Bgate works well for polyphonic pop/rock. Krumhansl is the classical music theory standard. EDMA is designed for electronic/dance music. Diatonic is the simplest model. Gomez is optimized for guitar-heavy material.

Pitch Frequency Range

Constraining to instrument-appropriate bands reduces octave errors. Common ranges: guitar 80–1200 Hz, voice 80–800 Hz, bass 30–300 Hz.

HPCP Harmonics

When set to 0 (default), only the fundamental contributes to chroma. Setting to 3–5 makes HPCP more robust for harmonic instruments (piano, guitar, voice) where overtones reinforce the pitch class. Essentia’s MusicExtractor uses 4 harmonics for key detection and 8 for chord detection.

Peak Threshold & Max Freq

The default threshold (0.00001) and max frequency (3500 Hz) match Essentia’s MusicExtractor. These filter noise-floor peaks and limit analysis to the tonal range where pitch classes are meaningful. Increase threshold for noisier signals; raise max frequency above 3500 only for very high-pitched instruments.

Essentia Rhythm
Parameter	Type	Default	Description
Mode	Menu	Realtime	Realtime (per-frame from Spectrum CHOP) / Batch (full-file offline)
FFT Size (Batch)	Menu	2048	Window size for batch FFT
Hop Size (Batch)	Int	256	Samples between analysis frames (~5.8 ms at 44100 Hz)
Compute (Batch)	Pulse	—	Trigger batch analysis
Autocompute (Batch)	Toggle	Off	Auto-recompute when input changes
Onset Method	Menu	Complex	HFC / Complex / Flux / Mel Flux / RMS / SuperFlux
Onset Sensitivity	Float	0.5	0.0 (rare triggers) – 1.0 (frequent). In batch mode, maps to Onsets algorithm alpha and silence threshold
BPM Min / Max	Int	60 / 180	BPM search range (internally clamped to [40,180] / [60,250] for Essentia algorithms)
Rhythm Method (Batch)	Menu	Degara	Degara / Multi-Feature — RhythmExtractor2013 algorithm variant, Batch only

Onset Method

Complex (default) uses both magnitude and phase for the most accurate general-purpose onset detection. HFC emphasizes high-frequency transients, good for percussive material. Flux measures overall spectral change. Mel Flux applies mel-weighted spectral difference — more robust for harmonic/melodic content. RMS uses simple energy change — fast and reliable for broadband signals. SuperFlux uses TriangularBands + SuperFluxNovelty for music with soft/gradual onsets.

Beat Detection

Realtime: BPM estimation uses Essentia’s TempoTapDegara algorithm, run periodically (~1.5 s) on accumulated onset detection history. BPM is derived from median tick intervals; beat phase is anchored to tick positions for audio-synchronized animation. Batch: RhythmExtractor2013 (auto-resampled to 44100 Hz) with autocorrelation fallback.
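
The median-interval step and the tick-anchored phase can be illustrated in a few lines. This is a sketch under stated assumptions: tick times are in seconds, the function names are hypothetical, and the real operator obtains its ticks from TempoTapDegara rather than from a hand-written list.

```python
from statistics import median

def bpm_from_ticks(tick_times):
    """Estimate BPM from the median interval between beat ticks (seconds)."""
    if len(tick_times) < 2:
        return 0.0
    intervals = [b - a for a, b in zip(tick_times, tick_times[1:])]
    period = median(intervals)
    return 60.0 / period if period > 0 else 0.0

def beat_phase(now, last_tick, bpm):
    """Position within the current beat (0..1), anchored to the last tick."""
    if bpm <= 0:
        return 0.0
    period = 60.0 / bpm
    return ((now - last_tick) / period) % 1.0

ticks = [0.0, 0.5, 1.0, 1.5, 2.5]   # one missed beat between 1.5 and 2.5
bpm = bpm_from_ticks(ticks)
print(bpm, beat_phase(1.75, 1.5, bpm))
```

Using the median rather than the mean keeps a single missed or doubled tick from shifting the tempo estimate.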

Best Quality Settings

Realtime: Onset Method = Complex, with the upstream Essentia Spectrum CHOP at FFT 2048 / Hop 256. Batch: Onset Method = Complex, Rhythm Method = Degara, FFT 2048, Hop 256 (all defaults). For both modes: set the narrowest BPM range that covers your material (e.g., 100–140 for house, 80–160 for pop). Use SuperFlux for music with soft/gradual onsets.

Essentia Loudness
Parameter	Type	Default	Description
Mode	Menu	Realtime	Realtime (per-frame via ring buffer) / Batch (full-file offline)
Compute (Batch)	Pulse	—	Trigger batch analysis
Autocompute (Batch)	Toggle	Off	Auto-recompute when input changes
Frame Size	Menu	1024	512 / 1024 / 2048
Gate Threshold	Float	-70 dB	Absolute gate for EBU R128 integration
Normalize	Toggle	Off	Map dB outputs to 0–1 range
dB Floor	Float	-60 dB	Lower bound for normalization (enabled when Normalize is on)
dB Ceiling	Float	0 dB	Upper bound for normalization (enabled when Normalize is on)
ZCR Threshold	Float	0	Dead-band around zero for ZCR (0–0.1). Increase to filter noise-floor chatter on quiet signals
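
The Normalize option amounts to a clamped linear mapping between the floor and ceiling. A minimal sketch with a hypothetical function name, assuming a plain linear mapping in dB:

```python
def normalize_db(level_db, floor_db=-60.0, ceiling_db=0.0):
    """Map a dB value to 0..1 between floor_db and ceiling_db, clamped."""
    if ceiling_db <= floor_db:
        raise ValueError("ceiling_db must be greater than floor_db")
    t = (level_db - floor_db) / (ceiling_db - floor_db)
    return min(1.0, max(0.0, t))

for db in (-100.0, -60.0, -30.0, 0.0, 3.0):
    print(db, normalize_db(db))
```

Raising the floor (for example to -40 dB) spends the whole 0–1 range on the louder part of the signal, which tends to make visuals react more strongly to musical dynamics and less to background noise.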

BUILD FROM SOURCE

Build & Install

Prerequisites
Essentia static library (essentia.lib, MSVC x64) — see building-essentia.md
TouchDesigner C++ SDK headers (CHOP_CPlusPlusBase.h, CPlusPlus_Common.h)

Build

cd src

cmake -B build -G "Visual Studio 17 2022" -A x64

cmake --build build --config Release

Deploy

Copy all 5 DLLs from src/build/Release/ to your TouchDesigner Plugins folder:

C:/Users/<you>/Documents/Derivative/Plugins/

Note — All operators appear in TouchDesigner's OP Create Dialog under their registered names. Restart TouchDesigner after copying new DLLs.