ESSENTIA CHOP SUITE

Real-time & Batch Audio Analysis for TouchDesigner

Five C++ CHOP plugins, powered by Essentia, that expose spectrum analysis, mel bands, MFCCs, pitch detection, key estimation, onset/BPM tracking, and EBU R128 loudness metering, all running natively inside TouchDesigner. Each analysis operator supports both real-time per-frame analysis and offline batch processing of entire audio files.

WHAT IT DOES

Five Operators, Two Modes

Each analysis operator (Spectral, Tonal, Rhythm, Loudness) has a Mode parameter that switches between real-time per-frame analysis and offline batch processing. The fifth operator, Spectrum, provides the shared FFT for real-time mode.

Audio flows through two independent paths in Realtime mode:

Spectral path (Realtime): Audio CHOP feeds Essentia Spectrum, which computes the FFT once. Essentia Spectral, Essentia Tonal, and Essentia Rhythm all read from the shared spectrum output. Spectral descriptors, mel bands, pitch, chroma, key, dissonance, onset detection, and BPM estimation all share a single FFT computation per frame.

Direct path (Realtime): Audio CHOP feeds Essentia Loudness (EBU R128 loudness, RMS energy, zero-crossing rate) independently. This operator works directly on the time-domain audio signal and does not depend on the shared spectrum.

Batch mode: When set to Batch, each operator processes an entire audio buffer (e.g. from a File In CHOP) on a background thread. Each operator handles its own windowing and FFT internally, so no Spectrum CHOP is needed. Output contains one sample per analysis frame across the full file. Trigger analysis with the Compute pulse, or enable Autocompute to re-analyze automatically when the input changes.

Analysis, not visualization: Essentia Spectrum outputs a linear-bin FFT magnitude spectrum designed for downstream analysis algorithms. Its bins are uniformly spaced in Hz, which is what Essentia's algorithms expect but looks bottom-heavy when plotted, because most musical detail is crammed into the lower bins. For spectral visualization, use TouchDesigner's built-in Audio Spectrum CHOP, which provides a perceptually scaled output suited for display. Note that TD's Audio Spectrum cannot be used as input to the Essentia analysis CHOPs; they require the linear-bin format that Essentia Spectrum provides.

Mono by design: The suite processes a single audio channel. Stereo analysis would double every output channel (e.g., mfcc0_L / mfcc0_R, spectral_centroid_L / spectral_centroid_R), making outputs unwieldy and harder to map in a visual context. For stereo-aware analysis, use a Select CHOP to pick each channel independently and run two separate analysis chains. For most audio-reactive scenarios, the recommended approach is to average left and right with a Math CHOP (Combine Channels → Average) before feeding into Essentia Spectrum. This keeps content from both channels in one signal, although material that is out of phase between left and right will partially cancel in the mix.

GETTING STARTED

Installation

Pre-built DLLs — no compilation required. Drop them into TouchDesigner's plugin folder and restart.

Step 1 — Copy the DLLs

Place all 5 plugin DLLs into TouchDesigner's Plugins folder (or any subfolder — TD scans subdirectories):

EssentiaSpectrumCHOP.dll
EssentiaSpectralCHOP.dll
EssentiaTonalCHOP.dll
EssentiaRhythmCHOP.dll
EssentiaLoudnessCHOP.dll

Default plugin path:

C:/Users/<you>/Documents/Derivative/Plugins/

Step 2 — Restart TouchDesigner

TouchDesigner loads plugin DLLs at startup. After copying the files, restart TD for the new operators to appear.

Step 3 — Add the operators

In the OP Create Dialog (Tab key), search for the operator names:

Essentia Spectrum
Essentia Spectral
Essentia Tonal
Essentia Rhythm
Essentia Loudness

Tip: For Realtime mode, connect an Audio CHOP to Essentia Spectrum first, then wire the Spectrum output to Spectral, Tonal, and Rhythm. Loudness takes raw audio directly. For Batch mode, connect a File In CHOP directly to any analysis operator; no Spectrum CHOP is needed. See the Signal Flow section for details.

OPERATORS

The Plugin Suite

Each operator is a standalone DLL loaded by TouchDesigner as a custom CHOP.

Essentia Spectrum CORE

Computes the FFT magnitude spectrum from raw audio. Outputs a single channel, spectrum, containing fftSize/2+1 bins as a static sample buffer. This is the shared upstream node: in Realtime mode, the Spectral, Tonal, and Rhythm CHOPs all read from it.

Input

Audio CHOP (mono, first channel used)

Output

1 channel × 513 samples (at FFT 1024) — static indexed buffer, not time-domain

Processing

Reads latest fftSize samples → Windowing (Hann / Hamming / Triangular / Blackman-Harris) → Zero Padding (optional) → Essentia Spectrum algorithm

Parameters

FFT Size, Hop Size, Window Type, Zero Padding

Essentia Spectral ANALYSIS

Computes spectral shape descriptors and mel-frequency band energies. Supports Realtime (per-frame from shared spectrum) and Batch (full-file offline analysis). Each feature group can be independently toggled.

Input

Realtime: Essentia Spectrum CHOP  |  Batch: Audio/File In CHOP (raw audio)

Output

Realtime: 1 sample per channel at frame rate  |  Batch: N samples (one per analysis frame). Channels: mfcc0–mfcc12, spectral_centroid, spectral_flux, spectral_rolloff, spectral_contrast0–spectral_contrast5, hfc, spectral_complexity, mel0–melN

Features

MFCC (timbral fingerprint, 13 coefficients) — Centroid (brightness) — Flux (rate of spectral change) — Rolloff (85% energy frequency) — Contrast (peak-valley difference in 6 bands) — HFC (high-frequency content) — Complexity (number of spectral peaks) — Mel Bands (perceptual frequency band energies)

Parameters

Enable MFCC, MFCC Count, MFCC Low Freq, MFCC High Freq, Enable Centroid, Enable Flux, Flux Half Rectify, Flux Norm, Enable Rolloff, Rolloff Cutoff, Enable Contrast, Contrast Bands, Enable HFC, HFC Type, Enable Complexity, Complexity Threshold, Enable Mel Bands, Mel Bands Count, Mel Low Freq, Mel High Freq, Mel Freq Names, Log Mel (dB Scale), Mode

Essentia Tonal ANALYSIS

Detects pitch, harmonic content, musical key, dissonance, and inharmonicity. Supports Realtime (per-frame with EMA smoothing) and Batch (full-file with global or windowed key detection).

Input

Realtime: Essentia Spectrum CHOP  |  Batch: Audio/File In CHOP (raw audio)

Output

Realtime: 1 sample per channel  |  Batch: N samples per channel. Channels: pitch, pitch_confidence, note_a–note_gs (12 bins with Musical Labels on), key, key_scale, key_strength, dissonance, inharmonicity

Features

Pitch (YinFFT algorithm, Hz + confidence) — HPCP (harmonic pitch class profile / chroma, 12/24/36 bins) — Key (key + major/minor + strength) — Dissonance (sensory roughness) — Inharmonicity (deviation from harmonic series)

Parameters

Pitch Algorithm, HPCP Size, Enable Pitch, Pitch Min Freq, Pitch Max Freq, Pitch Tolerance, Enable HPCP, HPCP Harmonics, Reference Freq, HPCP Non-Linear, HPCP Normalized, Enable Key, Key Profile, Key Frames, Peak Threshold, Peak Max Freq, Enable Dissonance, Enable Inharmonicity, Musical Labels, Enable Pitch Note, Smoothing, Mode

Essentia Rhythm ANALYSIS

Detects onsets and estimates tempo. Realtime uses TempoTapDegara BPM with Onsets-style adaptive thresholding. Batch uses RhythmExtractor2013 for full-file BPM and beat tracking.

Input

Realtime: Essentia Spectrum CHOP  |  Batch: Audio/File In CHOP (raw audio)

Output

Realtime: 1 sample per channel  |  Batch: N samples per channel. 6 channels: onset (0/1 trigger), onset_strength, bpm, beat (0/1 trigger), beat_phase (0–1 sawtooth), beat_confidence

Processing

Realtime: OnsetDetection/SuperFlux → Onsets-style adaptive threshold → onset trigger. Onset detection function (ODF) history → TempoTapDegara → BPM + tick-anchored beat phase. Batch: RhythmExtractor2013 → BPM/beats + per-frame onset detection via the Onsets algorithm.

Parameters

Onset Method, Onset Sensitivity, BPM Min, BPM Max, Rhythm Method, Mode

Essentia Loudness METERING

EBU R128 loudness metering with momentary, short-term, and integrated measurements, plus RMS energy and zero-crossing rate. Supports Realtime (per-frame via ring buffer) and Batch (full-file offline analysis).

Input

Audio CHOP (mono) — both modes take raw audio directly

Output

Realtime: 1 sample per channel  |  Batch: N samples per channel. 7 channels: loudness (instantaneous dB), loudness_momentary, loudness_shortterm, loudness_integrated, dynamic_range, rms, zcr

Processing

Audio → ring buffer → frame dispatch → Essentia Loudness → dB conversion → sliding windows (momentary 400 ms, short-term 3 s) → EBU R128 two-pass gating for integrated. RMS and ZCR computed directly from the audio frame.
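
The two-pass gating step can be sketched as follows. This is a minimal illustration of absolute-then-relative gating, not the plugin's code: the function name and the list of per-block loudness values are hypothetical, and the −10 LU relative offset is taken from the EBU R128 / ITU-R BS.1770 specification rather than from this page.

```python
import math

def integrated_loudness(block_loudness_db, abs_gate_db=-70.0, rel_offset_db=-10.0):
    """Two-pass gated average of per-block loudness values (hypothetical helper)."""

    def power_mean_db(values):
        # Average in the power domain, then convert back to dB.
        mean_power = sum(10 ** (v / 10.0) for v in values) / len(values)
        return 10.0 * math.log10(mean_power)

    # Pass 1: drop blocks at or below the absolute gate.
    above_abs = [v for v in block_loudness_db if v > abs_gate_db]
    if not above_abs:
        return None  # nothing but silence so far

    # Pass 2: drop blocks more than 10 LU below the pass-1 average.
    rel_gate_db = power_mean_db(above_abs) + rel_offset_db
    gated = [v for v in above_abs if v > rel_gate_db]
    return power_mean_db(gated)

print(integrated_loudness([-20.0, -20.0, -40.0, -80.0]))
```

In this example the −80 dB block is removed by the absolute gate and the −40 dB block by the relative gate, so only the two loud blocks contribute to the result.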

Parameters

Frame Size, Gate Threshold, Normalize, dB Floor, dB Ceiling, ZCR Threshold, Mode

ARCHITECTURE

Signal Flow

In Realtime mode, audio flows through two paths: the spectral path computes the FFT once and fans out to three analysis domains, while the direct path handles loudness metering. In Batch mode, each operator processes raw audio independently with its own FFT.

[Signal flow diagram, Realtime mode. Spectral path: Audio CHOP (audio input) → Essentia Spectrum (spectrum, 513 samples) → Essentia Spectral (mfcc, centroid, flux, mel bands...), Essentia Tonal (pitch, hpcp, key...), Essentia Rhythm (onset, bpm, beat...). Direct path: Audio CHOP → Essentia Loudness (loudness, rms, zcr...). Legend: Audio Input, FFT Processing, Spectral Analysis, Loudness & Energy.]

Batch mode — When any analysis CHOP is set to Mode = Batch, it takes raw audio directly (no Spectrum CHOP needed) and handles its own windowing + FFT internally. The diagram above shows Realtime mode routing only.

CREATIVE APPLICATIONS

Output Use Cases

Every output channel has a purpose. This reference describes what each channel measures and suggests practical ways to use it inside TouchDesigner for audio-reactive visuals, installations, and live performance.

Essentia Spectrum CORE
Channel | Range | What it does & how to use it
spectrum | 0+ (fftSize/2+1 samples) | Raw FFT magnitude bins. Visualize as a bar graph, 3D terrain, or heatmap. Use as input for custom analysis. The lower bins represent bass and the upper bins treble; because the bins are samples of a single channel, slice the sample range with a Trim CHOP to isolate frequency bands.

Essentia Spectral ANALYSIS
Channel | Range | What it does & how to use it
mfcc0–mfcc12 | ~−50 to 50 | Timbral fingerprint. MFCCs capture the "texture" of a sound independent of pitch. Use them to cluster similar sounds, drive visual style transitions based on tonal character, or distinguish instruments. Feed into a Math CHOP for normalization, then map to shader uniforms.
spectral_centroid | 0 – sr/2 Hz | Brightness. The "center of mass" of the spectrum. High values mean bright, shimmery sounds; low values mean warm, bassy tones. Map to color temperature (orange ↔ blue), particle speed, or lighting intensity.
spectral_flux | 0+ | Rate of spectral change. Spikes when the timbre shifts suddenly: a new instrument enters, a filter sweeps, or a transition occurs. Use to trigger visual scene changes, glitch effects, or novelty highlights.
spectral_rolloff | 0 – sr/2 Hz | Energy distribution edge. The frequency below which 85% of spectral energy lies. Distinguishes bright mixes from dark ones. Control cutoff-style visual filters, blur intensity, or fog density.
spectral_contrast0–spectral_contrast5 | unbounded | Per-band tonal vs. noisy content. Six sub-bands measuring peak-to-valley ratio. Drive multi-layer visual intensities: assign each band to a separate geometry layer or glow ring for a multi-band reactive sculpture.
hfc | 0+ | High-frequency content. Sensitive to cymbals, hi-hats, and sibilance. Use for sparkle/shimmer particle effects, percussive high-end triggers, or treble-reactive lighting.
spectral_complexity | 0+ | Number of spectral peaks. Simple tones (sine, flute) score low; complex sounds (orchestra, noise) score high. Drive visual density: particle count, fractal detail, or geometric subdivision level.
mel0–melN | 0+ | Perceptual frequency band energies. Mel bands approximate human hearing: evenly spaced in perceived pitch, not linear frequency. Ideal for multi-band audio visualizers, frequency-mapped color gradients, per-band particle emitters, or as input features for ML classifiers.
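
To make the centroid and rolloff definitions above concrete, here is a small sketch using the textbook formulas on a plain list of magnitude bins. It is illustrative only: the function names are hypothetical, the 44100 Hz sample rate is an assumption, and Essentia's own implementations may differ in scaling and edge cases.

```python
def spectral_centroid(magnitudes, sample_rate=44100):
    """Magnitude-weighted mean frequency in Hz (textbook definition)."""
    bin_hz = (sample_rate / 2) / (len(magnitudes) - 1)
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame
    return sum(k * bin_hz * m for k, m in enumerate(magnitudes)) / total

def spectral_rolloff(magnitudes, sample_rate=44100, cutoff=0.85):
    """Lowest bin frequency below which `cutoff` of the spectral energy lies."""
    bin_hz = (sample_rate / 2) / (len(magnitudes) - 1)
    energy = [m * m for m in magnitudes]
    target = cutoff * sum(energy)
    running = 0.0
    for k, e in enumerate(energy):
        running += e
        if running >= target:
            return k * bin_hz
    return (len(magnitudes) - 1) * bin_hz

flat = [1.0] * 5  # a flat 5-bin spectrum as toy input
print(spectral_centroid(flat, sample_rate=8), spectral_rolloff(flat, sample_rate=8))
```
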
Essentia Tonal ANALYSIS
Channel | Range | What it does & how to use it
pitch | 0+ Hz | Fundamental frequency. Detected via YinFFT. Map to note names for generative music notation, control animation speed or vertical position with pitch, or track vocals for lip-sync effects.
pitch_confidence | 0 – 1 | Pitch reliability. Low during noise or silence, high for clear tonal content. Use as a gate: only apply pitch-driven effects when confidence exceeds a threshold. Crossfade between pitched and unpitched visual modes.
note_a–note_gs | 0 – 1 | Chroma / pitch class energy. 12 bins (one per semitone, A through G#) with Musical Labels on by default. Bin 0 = A (reference frequency). Visualize as a harmony wheel, map each note to a unique color, detect chord changes for scene transitions, or build a real-time piano-roll display. Set HPCP Size to 24 or 36 for finer resolution.
key | 0 – 11 | Musical key. Encoded integer (0=C, 1=C#, ... 11=B). Assign color palettes or scene themes per key. Drive generative art parameters that shift as the music modulates to a new key.
key_scale | 0 or 1 | Major or minor. 0 = major, 1 = minor. Use for mood: major keys → warm, bright, expansive palettes; minor keys → cool, dark, constrained palettes. Switch between two visual presets.
key_strength | 0 – 1 | Key confidence. Higher when the harmonic content strongly matches a key template. Blend strength into color saturation or confidence-gated transitions.
dissonance | 0 – 1 | Sensory roughness. Measures perceptual dissonance between spectral peaks. Map to visual chaos: glitch intensity, distortion amount, turbulence in fluid simulations, or camera shake.
inharmonicity | 0 – 1 | Deviation from harmonic series. Low for flutes and voices, high for bells and percussion. Distinguish melodic from percussive sources, or drive material textures (metallic vs. organic).
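
As an illustration of the key and key_scale encoding described above, the following sketch turns the two channel values into a readable label. The helper name is hypothetical, and the sharp spellings for the intermediate indices are an assumption that extends the "0=C, 1=C#, ... 11=B" pattern.

```python
# Pitch-class names for key indices 0-11 (sharp spellings assumed).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def key_label(key, key_scale):
    """Convert key (0-11) and key_scale (0 = major, 1 = minor) to text."""
    name = NOTE_NAMES[int(round(key)) % 12]
    scale = "minor" if round(key_scale) >= 1 else "major"
    return f"{name} {scale}"

print(key_label(9, 1))  # CHOP samples arrive as floats, hence the rounding
```
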
Essentia Rhythm ANALYSIS
Channel | Range | What it does & how to use it
onset | 0 or 1 | Transient trigger. Fires on detected attacks: drum hits, plucks, consonants. Use for particle bursts, flash/strobe triggers, camera cuts, step-sequenced events, or instantiating geometry.
onset_strength | 0+ | Continuous onset function. The raw detection signal before thresholding. Scale burst intensity or particle count by strength for velocity-sensitive triggers. Smooth with a Lag CHOP for an onset envelope.
bpm | BPM min – max | Estimated tempo. Sync LFO rates, animation cycle durations, or generative pattern timing to the music. Divide by 60 to get beats-per-second for direct use in Speed parameters.
beat | 0 or 1 | Beat trigger. Fires on the estimated beat grid. Use for rhythmic pulsing, beat-locked scene transitions, synchronized step-sequencing, or quantized color cycling.
beat_phase | 0 – 1 (sawtooth) | Position within beat. Ramps linearly from 0 to 1 between beats. Feed into easing curves (Lookup CHOP) for smooth beat-synced animation: bouncing, breathing, pendulum swings, or rhythmic camera motion.
beat_confidence | 0 – 1 | Beat tracking reliability. Low during ambient or arrhythmic passages, high during steady beats. Use to fade in beat-synced effects only when tracking is stable, or crossfade to an onset-only mode when confidence drops.
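
The bpm and beat_phase mappings above reduce to a few lines of arithmetic. The sketch below is one possible approach; both function names and the quadratic decay curve are illustrative choices, not part of the suite.

```python
def beats_per_second(bpm):
    """Convert tempo in BPM to beats per second (e.g. for a Speed parameter)."""
    return bpm / 60.0

def beat_pulse(beat_phase):
    """Shape the 0-1 sawtooth into a pulse: 1.0 on the beat, easing to 0.0."""
    phase = min(1.0, max(0.0, beat_phase))  # guard against out-of-range input
    return (1.0 - phase) ** 2

print(beats_per_second(120), beat_pulse(0.5))
```
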
Essentia Loudness METERING
Channel | Range | What it does & how to use it
loudness | dB (~−100 to 0) | Instantaneous perceived loudness. The fastest-responding level signal. Use for frame-level reactive scaling: VU-meter needle, geometry size, or opacity. Responds immediately to transients.
loudness_momentary | dB | EBU R128 momentary (400 ms). A smoothed level that tracks phrase-level dynamics without jitter. Use for dynamic scaling of visual elements, responsive but stable brightness, or gain-riding effects.
loudness_shortterm | dB | EBU R128 short-term (3 s). Captures the current "energy zone" of the track. Use for scene-level intensity: ambient lighting adjustments, macro-level background color, or fog density that follows the song's structure.
loudness_integrated | dB | Running gated average. EBU R128 integrated loudness with absolute + relative gating. Use for overall show level monitoring, normalization reference, or auto-gain to keep visuals consistent across tracks of different loudness.
dynamic_range | dB (0+) | Loudness swing over 3 s. Peak-to-valley of the short-term window. Detects builds and drops: high values mean the audio is moving between quiet and loud. Drive contrast-based transitions, tension/release mapping, or drop-triggered explosions.
rms | 0 – 1 | Root mean square energy. The classic "audio reactive" signal: simple, linear amplitude. Map directly to geometry scale, opacity, displacement amount, or any parameter that should pulse with the music's energy.
zcr | 0 – 1 | Zero-crossing rate. Measures how "noisy" vs. "tonal" the audio is. High ZCR = noise, percussion, sibilance. Low ZCR = clean pitched tones. Use to drive grain/static effects, distinguish drums from melody, or control visual roughness.

CONFIGURATION

Parameters Reference

Essentia Spectrum
Parameter | Type | Default | Options / Range | Description
FFT Size | Menu | 1024 | 512 / 1024 / 2048 / 4096 / 8192 / 16384 | Window size for FFT computation
Hop Size | Int | 512 | 64 – 16384 | Samples between analysis frames
Window Type | Menu | Hann | Hann / Hamming / Triangular / Blackman-Harris 62/70/74/92 | Window function applied before FFT
Zero Padding | Menu | None | None / Half FFT / Full FFT | Interpolates the spectrum (finer bin spacing; true frequency resolution is unchanged)

FFT Size & Quality Tradeoff

Larger FFT sizes improve frequency resolution (more bins, better at distinguishing close pitches) at the cost of time resolution (each frame covers more audio, smearing transients). For tonal analysis (pitch, key, HPCP), 2048–4096 is the sweet spot. For rhythm/onset detection, 1024 responds faster to transients. Going beyond 8192 has diminishing returns and adds latency. The default 1024 is a good general-purpose balance.
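
As a rough illustration of this tradeoff, the sketch below computes the bin count, bin spacing, and frame duration for each FFT size in the menu. The 44100 Hz sample rate is an assumption; substitute your project's rate.

```python
SAMPLE_RATE = 44100  # assumed sample rate in Hz

def fft_tradeoff(fft_size, sample_rate=SAMPLE_RATE):
    """Return (bins, bin spacing in Hz, frame duration in ms) for one FFT size."""
    bins = fft_size // 2 + 1
    bin_hz = sample_rate / fft_size
    frame_ms = 1000.0 * fft_size / sample_rate
    return bins, bin_hz, frame_ms

for size in (512, 1024, 2048, 4096, 8192, 16384):
    bins, bin_hz, frame_ms = fft_tradeoff(size)
    print(f"FFT {size:>5}: {bins:>4} bins, {bin_hz:6.2f} Hz/bin, {frame_ms:6.1f} ms/frame")
```

Each doubling of the FFT size halves the bin spacing and doubles the amount of audio each frame spans.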

Window Type

Each window trades main-lobe width for side-lobe suppression. Hann is a good general default. The Blackman-Harris variants offer progressively stronger side-lobe suppression (62/70/74/92 dB) at the cost of wider main lobes — the 74 dB variant is often preferred for tonal analysis.

Zero Padding

Appends zeros to the windowed frame before FFT, which interpolates spectral bins without changing frequency resolution. This improves the accuracy of peak-based descriptors (centroid, rolloff, pitch) and produces smoother spectrum plots. “Half FFT” adds fftSize/2 zeros; “Full FFT” doubles the frame.
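
The effect on bin spacing can be sketched numerically. This assumes the FFT is taken over the padded frame, so the number of output bins grows with the padding; that detail, the helper name, and the 44100 Hz sample rate are assumptions, not confirmed behaviour of the plugin.

```python
def padded_spectrum(fft_size, padding="None", sample_rate=44100):
    """Return (padded frame length, bin count, bin spacing in Hz)."""
    pad = {"None": 0, "Half FFT": fft_size // 2, "Full FFT": fft_size}[padding]
    frame = fft_size + pad          # windowed samples followed by zeros
    bins = frame // 2 + 1
    return frame, bins, sample_rate / frame

for mode in ("None", "Half FFT", "Full FFT"):
    print(mode, padded_spectrum(1024, mode))
```

The window still contains only fftSize real samples, so closely spaced partials are no easier to separate; only the sampling of the spectrum becomes denser.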

Essentia Spectral
Parameter | Type | Default | Description
Mode | Menu | Realtime | Realtime (per-frame from Spectrum CHOP) / Batch (full-file offline)
FFT Size (Batch) | Menu | 1024 | Window size for batch FFT (only visible in Batch mode)
Hop Size (Batch) | Int | 512 | Samples between analysis frames (only visible in Batch mode)
Window Type (Batch) | Menu | Hann | Window function for batch FFT (only visible in Batch mode)
Compute (Batch) | Pulse | n/a | Trigger batch analysis (only visible in Batch mode)
Autocompute (Batch) | Toggle | Off | Auto-recompute when input changes (only visible in Batch mode)
Enable MFCC | Toggle | On | Enable/disable MFCC output channels
MFCC Count | Int | 13 | Number of MFCC coefficients (1–20)
MFCC Low Freq | Float | 0 Hz | Lower frequency bound for MFCC mel filters
MFCC High Freq | Float | 11000 Hz | Upper frequency bound for MFCC mel filters
Enable Centroid | Toggle | On | Enable spectral centroid
Enable Flux | Toggle | Off | Enable spectral flux
Flux Half Rectify | Toggle | Off | Only count energy increases (onset emphasis)
Flux Norm | Menu | L2 | L1 or L2 norm for difference computation
Enable Rolloff | Toggle | Off | Enable spectral rolloff
Rolloff Cutoff | Float | 0.85 | Energy fraction threshold (0.5 = median, 0.85 = standard, 0.95 = brightness)
Enable Contrast | Toggle | Off | Enable spectral contrast
Contrast Bands | Menu | 6 | Number of octave sub-bands (4 / 6 / 8)
Enable HFC | Toggle | On | Enable high-frequency content
HFC Type | Menu | Masri | Masri / Jensen / Brossier (different HFC formulations)
Enable Complexity | Toggle | On | Enable spectral complexity
Complexity Threshold | Float | 0.005 | Minimum peak magnitude to count (0–0.1)
Enable Mel Bands | Toggle | On | Enable mel band output channels
Mel Bands Count | Menu | 40 | 24 / 40 / 60 / 80 / 128
Mel Low Freq | Float | 0 Hz | Lower frequency bound for mel filters
Mel High Freq | Float | 22050 Hz | Upper frequency bound for mel filters
Mel Freq Names | Toggle | Off | Include frequency ranges in channel names
Log Mel (dB Scale) | Toggle | Off | Convert mel band output to dB scale

MFCC Frequency Bounds

The default 0–11000 Hz covers the range where most speech and music energy lies. For voice-only analysis, narrow to 80–3400 Hz to exclude sub-bass and high-frequency noise. For full-band analysis, set High Freq to the Nyquist frequency (sampleRate/2).

HFC Type

Masri weights by energy×frequency (default), Jensen by amplitude×frequency² (stronger high-frequency emphasis), Brossier by amplitude×frequency (linear). Jensen and Brossier respond more aggressively to transients in the upper spectrum.
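
The three weightings can be written out directly. The sketch below follows the formulas as described above on a plain list of magnitude bins; the function name is hypothetical, weighting by bin frequency in Hz is an assumption, and Essentia's actual scaling may differ.

```python
def hfc(magnitudes, sample_rate=44100, kind="Masri"):
    """High-frequency content of one magnitude spectrum (illustrative only)."""
    if len(magnitudes) < 2:
        raise ValueError("need at least two bins")
    bin_hz = (sample_rate / 2) / (len(magnitudes) - 1)
    total = 0.0
    for k, mag in enumerate(magnitudes):
        freq = k * bin_hz
        if kind == "Masri":        # energy x frequency
            total += freq * mag * mag
        elif kind == "Jensen":     # amplitude x frequency squared
            total += freq * freq * mag
        elif kind == "Brossier":   # amplitude x frequency
            total += freq * mag
        else:
            raise ValueError(f"unknown HFC type: {kind}")
    return total

spectrum = [0.0, 1.0, 3.0]  # toy 3-bin spectrum
for kind in ("Masri", "Jensen", "Brossier"):
    print(kind, hfc(spectrum, sample_rate=4, kind=kind))
```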

Essentia Tonal
Parameter | Type | Default | Description
Mode | Menu | Realtime | Realtime (per-frame from Spectrum CHOP) / Batch (full-file offline)
FFT Size (Batch) | Menu | 4096 | Window size for batch FFT (tonal analysis needs high frequency resolution)
Hop Size (Batch) | Int | 2048 | Samples between analysis frames
Compute (Batch) | Pulse | n/a | Trigger batch analysis
Autocompute (Batch) | Toggle | Off | Auto-recompute when input changes
Pitch Algorithm | Menu | YinFFT | YinFFT / YinProbabilistic
HPCP Size | Menu | 12 | 12 (default, 1 per semitone) / 24 / 36 bins
Enable Pitch | Toggle | On | Enable pitch detection
Pitch Min Freq | Float | 20 Hz | Minimum detectable frequency (constrain to instrument range)
Pitch Max Freq | Float | 22050 Hz | Maximum detectable frequency
Pitch Tolerance | Float | 1.0 | Peak detection strictness (lower = fewer octave errors, more unvoiced frames)
Enable HPCP | Toggle | On | Enable chroma output
HPCP Harmonics | Int | 0 | Harmonic contributions (0 = fundamental only, 3–5 for harmonic instruments)
Reference Freq | Float | 440 Hz | Tuning reference (415 = Baroque, 432 = alternative, 440 = standard)
HPCP Non-Linear | Toggle | Off | Apply peak-sharpening post-processing
HPCP Normalized | Menu | Unit Max | Unit Max / Unit Sum / None
Enable Key | Toggle | On | Enable key detection
Key Frames (RT) | Int | 8 | HPCP frames to average for key detection (1–300), Realtime only
Key Profile | Menu | Temperley | Temperley (default) / Bgate / Krumhansl / EDMA / Diatonic / Gomez
Peak Threshold | Float | 0.00001 | Minimum spectral peak magnitude; filters noise-floor peaks from HPCP/Key/Dissonance/Inharmonicity
Peak Max Freq | Float | 3500 Hz | Upper frequency limit for spectral peak detection (tonal content lives below 3500 Hz)
Enable Dissonance | Toggle | On | Enable dissonance output
Enable Inharmonicity | Toggle | On | Enable inharmonicity output
Musical Labels | Toggle | On | Use note names (A through G#) instead of indices for HPCP channels
Enable Pitch Note | Toggle | Off | Output pitch-to-note-class channel
Smoothing (RT) | Float | 0.5 | EMA smoothing coefficient (0 = none, 1 = maximum), Realtime only
Key Mode (Batch) | Menu | Global | Global (single key for whole file) / Windowed (per-frame key), Batch only
Key Window Size (Batch) | Int | 8 | HPCP frames for windowed key detection; Batch only, visible when Key Mode = Windowed
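
The Smoothing parameter is described as an EMA coefficient where 0 means no smoothing and 1 means maximum. One common formulation consistent with that description is sketched below; whether the plugin uses exactly this recurrence is an assumption, and the function name is hypothetical.

```python
def ema(values, smoothing=0.5):
    """Exponential moving average: higher `smoothing` keeps more of the past."""
    if not 0.0 <= smoothing <= 1.0:
        raise ValueError("smoothing must be between 0 and 1")
    out = []
    state = None
    for x in values:
        # The first sample initialises the filter; later samples are blended in.
        state = x if state is None else smoothing * state + (1.0 - smoothing) * x
        out.append(state)
    return out

print(ema([0.0, 1.0, 1.0], smoothing=0.5))
```

With this formulation a value of exactly 1 would hold the first sample forever, so practical settings stay below 1.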

Recommended Settings

The defaults largely follow Essentia’s MusicExtractor recommendations. Here are per-use-case tweaks:

Key Detection (best accuracy): HPCP Size = 36, Key Profile = Temperley (default), HPCP Harmonics = 4, Key Frames = 30–60 (RT), Smoothing = 0.7. For batch, use Global key mode unless tracking modulations.

Pitch Tracking: Pitch Algorithm = YinFFT (default). Constrain frequency range to your source — voice 80–800 Hz, guitar 80–1200 Hz, bass 30–300 Hz. Smoothing = 0.3 for responsiveness.

Chord / Harmony Analysis: HPCP Size = 36, HPCP Harmonics = 8, HPCP Non-Linear = On, Key Profile = EDMA (electronic/dance) or Temperley (pop/rock).

Live Visuals (fast RT): HPCP Size = 12 (default), Musical Labels = On (default), Smoothing = 0.5, Key Frames = 8 (default). Enable Pitch Note for a 0–11 note class output.
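
For reference, a pitch value in Hz can also be converted to a note name and a 0–11 note class downstream. The sketch below uses the standard equal-temperament formula with MIDI numbering; the helper name is hypothetical, and numbering the classes from C = 0 is an assumption that may not match the operator's own pitch-note channel.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_to_note(freq_hz, reference_hz=440.0):
    """Return (name, octave, note class 0-11) for a frequency, or None if unvoiced."""
    if freq_hz <= 0:
        return None
    # MIDI note 69 is A4 at the reference frequency; 12 semitones per octave.
    midi = int(round(69 + 12 * math.log2(freq_hz / reference_hz)))
    return NOTE_NAMES[midi % 12], midi // 12 - 1, midi % 12

print(pitch_to_note(440.0), pitch_to_note(261.63))
```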

Key Profile

Different profiles are tuned for different genres. Temperley (default) is a well-tested general-purpose profile used by Essentia’s MusicExtractor. Bgate works well for polyphonic pop/rock. Krumhansl is the classical music theory standard. EDMA is designed for electronic/dance music. Diatonic is the simplest model. Gomez is optimized for guitar-heavy material.

Pitch Frequency Range

Constraining the range to instrument-appropriate bands reduces octave errors. Common ranges: guitar 80–1200 Hz, voice 80–800 Hz, bass 30–300 Hz.

HPCP Harmonics

When set to 0 (default), only the fundamental contributes to chroma. Setting to 3–5 makes HPCP more robust for harmonic instruments (piano, guitar, voice) where overtones reinforce the pitch class. Essentia’s MusicExtractor uses 4 harmonics for key detection and 8 for chord detection.

Peak Threshold & Max Freq

The default threshold (0.00001) and max frequency (3500 Hz) match Essentia’s MusicExtractor. These filter noise-floor peaks and limit analysis to the tonal range where pitch classes are meaningful. Increase threshold for noisier signals; raise max frequency above 3500 only for very high-pitched instruments.

Essentia Rhythm
Parameter | Type | Default | Description
Mode | Menu | Realtime | Realtime (per-frame from Spectrum CHOP) / Batch (full-file offline)
FFT Size (Batch) | Menu | 2048 | Window size for batch FFT
Hop Size (Batch) | Int | 256 | Samples between analysis frames (~5.8 ms at 44100 Hz)
Compute (Batch) | Pulse | n/a | Trigger batch analysis
Autocompute (Batch) | Toggle | Off | Auto-recompute when input changes
Onset Method | Menu | Complex | HFC / Complex / Flux / Mel Flux / RMS / SuperFlux
Onset Sensitivity | Float | 0.5 | 0.0 (rare triggers) – 1.0 (frequent). In batch mode, maps to the Onsets algorithm's alpha and silence threshold
BPM Min / Max | Int | 60 / 180 | BPM search range (internally clamped to [40,180] / [60,250] for Essentia algorithms)
Rhythm Method (Batch) | Menu | Degara | Degara / Multi-Feature (RhythmExtractor2013 algorithm variant), Batch only

Onset Method

Complex (default) uses both magnitude and phase for the most accurate general-purpose onset detection. HFC emphasizes high-frequency transients, good for percussive material. Flux measures overall spectral change. Mel Flux applies mel-weighted spectral difference — more robust for harmonic/melodic content. RMS uses simple energy change — fast and reliable for broadband signals. SuperFlux uses TriangularBands + SuperFluxNovelty for music with soft/gradual onsets.

Beat Detection

Realtime: BPM estimation uses Essentia’s TempoTapDegara algorithm, run periodically (~1.5 s) on accumulated onset detection history. BPM is derived from median tick intervals; beat phase is anchored to tick positions for audio-synchronized animation. Batch: RhythmExtractor2013 (auto-resampled to 44100 Hz) with autocorrelation fallback.

Best Quality Settings

Realtime: Onset Method = Complex, with the upstream Essentia Spectrum CHOP at FFT 2048 / Hop 256. Batch: Onset Method = Complex, Rhythm Method = Degara, FFT 2048, Hop 256 (all defaults). For both modes: set the narrowest BPM range that covers your material (e.g., 100–140 for house, 80–160 for pop). Use SuperFlux for music with soft/gradual onsets.

Essentia Loudness
Parameter | Type | Default | Description
Mode | Menu | Realtime | Realtime (per-frame via ring buffer) / Batch (full-file offline)
Compute (Batch) | Pulse | n/a | Trigger batch analysis
Autocompute (Batch) | Toggle | Off | Auto-recompute when input changes
Frame Size | Menu | 1024 | 512 / 1024 / 2048
Gate Threshold | Float | -70 dB | Absolute gate for EBU R128 integration
Normalize | Toggle | Off | Map dB outputs to 0–1 range
dB Floor | Float | -60 dB | Lower bound for normalization (enabled when Normalize is on)
dB Ceiling | Float | 0 dB | Upper bound for normalization (enabled when Normalize is on)
ZCR Threshold | Float | 0 | Dead-band around zero for ZCR (0–0.1). Increase to filter noise-floor chatter on quiet signals
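
The Normalize option maps dB outputs into 0–1 using the floor and ceiling. A plausible reading is a clamped linear map, sketched below; the exact curve the plugin applies is an assumption, and the function name is hypothetical.

```python
def normalize_db(db, floor=-60.0, ceiling=0.0):
    """Linearly map a dB value to 0-1 between floor and ceiling, clamped."""
    if ceiling <= floor:
        raise ValueError("ceiling must be above floor")
    t = (db - floor) / (ceiling - floor)
    return min(1.0, max(0.0, t))

print(normalize_db(-30.0))
```

Raising the floor (for example to -40 dB) makes quiet passages map to zero sooner, which can help keep visuals still during near-silence.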

BUILD FROM SOURCE

Build & Install

Prerequisites
Essentia static library (essentia.lib, MSVC x64) — see building-essentia.md
TouchDesigner C++ SDK headers (CHOP_CPlusPlusBase.h, CPlusPlus_Common.h)

Build

cd src
cmake -B build -G "Visual Studio 17 2022" -A x64
cmake --build build --config Release

Deploy

Copy all 5 DLLs from src/build/Release/ to your TouchDesigner Plugins folder:

C:/Users/<user>/Documents/Derivative/Plugins/

Note: All operators appear in TouchDesigner's OP Create Dialog under their registered names. Restart TouchDesigner after copying new DLLs.