Tutorial & Reference Guide
The Data Folding Synthesizer treats images as audio and audio as images. It's based on the principle that both are just sequences of numbers—pixels and samples are interchangeable when you understand their relationship.
A grayscale value of 127 (middle gray) represents silence. Values below 127 produce positive amplitude (peaks); values above 127 produce negative amplitude (troughs). This creates a direct visual-sonic correspondence: what you see is what you hear.
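As a rough illustration of this mapping, the sketch below converts an 8-bit gray value into a signed amplitude. The function name `pixel_to_amplitude` and the division by 128 are assumptions made for this example, not the tool's documented internals; only the "127 is silence, darker is positive" convention comes from the text above.

```python
def pixel_to_amplitude(value: int) -> float:
    """Map an 8-bit gray value to a signed amplitude.

    Convention taken from the guide: 127 is silence, darker pixels give
    positive amplitude, brighter pixels give negative amplitude.
    The scale factor of 128 is an assumption of this sketch.
    """
    if not 0 <= value <= 255:
        raise ValueError("expected an 8-bit value in 0..255")
    return (127 - value) / 128.0


for gray in (0, 127, 255):
    print(gray, pixel_to_amplitude(gray))
```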
The first screen asks you to select the width and height of your canvas.
Width determines which frequencies are available. The tool only offers widths that are divisors of 44,100 (the sample rate). This ensures that waveforms align perfectly with row boundaries, creating clean tones.
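If you want to see which widths qualify, the divisor rule is easy to reproduce. This is a small sketch; `divisors` is a helper written for this example, and the tool may offer only a subset of these values.

```python
SAMPLE_RATE = 44_100


def divisors(n: int) -> list[int]:
    """Return every positive integer that divides n exactly."""
    return [d for d in range(1, n + 1) if n % d == 0]


widths = divisors(SAMPLE_RATE)
print(len(widths))   # 81 candidate widths
print(widths[:10])   # [1, 2, 3, 4, 5, 6, 7, 9, 10, 12]
```

Because every such width divides 44,100, one second of audio always spans a whole number of rows.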
Height determines duration:
For example, a 441×441 image contains 194,481 samples, which is 4.41 seconds of audio at 44,100 Hz.
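The arithmetic behind that figure can be sketched as follows, assuming one sample per pixel in a single channel (which matches the sample count quoted above). `duration_seconds` is a name chosen for this example.

```python
SAMPLE_RATE = 44_100


def duration_seconds(width: int, height: int) -> float:
    """Duration of a canvas read row by row at one sample per pixel."""
    return width * height / SAMPLE_RATE


print(441 * 441)                    # 194481 samples
print(duration_seconds(441, 441))   # about 4.41 seconds
```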
After confirming dimensions, you'll see:
The synthesizer works with 5 channels:
| Channel | Default Routing | Description |
|---|---|---|
| R (Red) | Right speaker | Standard color channel |
| G (Green) | Left speaker | Standard color channel |
| B (Blue) | Center (both) | Standard color channel |
| A (Alpha) | Envelope | Controls volume of RGB |
| M (Mono) | Center | Grayscale working channel |
Each channel can be routed to: LEFT, RIGHT, CENTER, or NONE. This creates stereo separation—put different frequencies in R and G for spatial effects.
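One way to picture routing is as a per-sample mix into a left and a right output. The sketch below uses the default routing from the table; the function `mix_stereo`, the unscaled summing, and the treatment of CENTER as "full level in both speakers" are assumptions for illustration only.

```python
# Default routing from the table above (Alpha and Mono left out for brevity).
DEFAULT_ROUTING = {"R": "RIGHT", "G": "LEFT", "B": "CENTER"}


def mix_stereo(samples: dict[str, float], routing: dict[str, str]) -> tuple[float, float]:
    """Sum per-channel amplitudes into (left, right) according to routing.

    Assumption: CENTER feeds both outputs at full level, NONE is dropped,
    and no normalization is applied.
    """
    left = right = 0.0
    for channel, amplitude in samples.items():
        target = routing.get(channel, "NONE")
        if target in ("LEFT", "CENTER"):
            left += amplitude
        if target in ("RIGHT", "CENTER"):
            right += amplitude
    return left, right


left, right = mix_stereo({"R": 0.5, "G": 0.25, "B": 0.1}, DEFAULT_ROUTING)
print(left, right)
```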
When "Alpha = envelope" is checked (default), the A channel controls the volume of the RGB channels:
This lets you create amplitude envelopes (fade in/out, tremolo, gating) by drawing in the Alpha channel.
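As a hedged sketch of the idea, the code below treats the Alpha value as a simple gain. The exact mapping the tool uses is not stated here, so the linear `alpha / 255` gain (0 silent, 255 full volume) and the names `apply_alpha_envelope` and `linear_fade_in` are assumptions.

```python
def apply_alpha_envelope(amplitude: float, alpha: int) -> float:
    """Scale an amplitude by an Alpha value.

    Assumption: gain is linear, with alpha 0 = silent and 255 = full volume.
    """
    if not 0 <= alpha <= 255:
        raise ValueError("expected an 8-bit alpha in 0..255")
    return amplitude * (alpha / 255.0)


def linear_fade_in(length: int) -> list[int]:
    """Alpha values rising from 0 to 255 over `length` steps (length >= 2)."""
    return [round(255 * i / (length - 1)) for i in range(length)]


fade = linear_fade_in(5)
print(fade)
print([apply_alpha_envelope(0.8, a) for a in fade])
```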
The available frequencies are determined by your canvas width. A frequency is available when its wavelength (λ, measured in samples) divides evenly into the width; the resulting frequency is 44,100 / λ Hz.
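The divisibility rule can be sketched in a few lines. This is an illustration rather than the tool's actual frequency list: `available_frequencies` is a name invented here, and the lower bound of 2 samples per period (the shortest wave a sampled signal can represent) is an assumption about what the tool would offer.

```python
SAMPLE_RATE = 44_100


def available_frequencies(width: int) -> list[tuple[int, float]]:
    """Return (wavelength_in_samples, frequency_hz) pairs for a canvas width.

    A wavelength qualifies when it divides the width exactly, so every row
    holds a whole number of periods.
    """
    pairs = []
    for wavelength in range(2, width + 1):
        if width % wavelength == 0:
            pairs.append((wavelength, SAMPLE_RATE / wavelength))
    return pairs


for wavelength, hz in available_frequencies(441):
    print(f"wavelength {wavelength:>3} samples -> {hz:g} Hz")
```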
For example, with width 441:
Frequencies are grouped into:
Click and drag on the canvas to select a rectangular region. The selection info shows dimensions and duration. Click ALL to select the entire canvas.
Fill selections with periodic waveforms:
Two fill options:
Fill selections with amplitude gradients (great for the Alpha channel):
Direction: HORIZONTAL (across time within rows) or VERTICAL (across rows/time).
The amplitude slider (0–100%) controls the intensity of fills. Lower amplitude = quieter tones.
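To make the effect of the amplitude slider concrete, here is a sketch that fills one row with a sine wave at a given percentage. The sine shape, the function name `sine_row`, and the scaling around 127 are assumptions for this example; they follow the "127 is silence, darker is positive" convention described earlier.

```python
import math


def sine_row(width: int, wavelength: int, amplitude_pct: float) -> list[int]:
    """Gray values for one row of a sine fill.

    amplitude_pct is the slider value (0-100). Positive amplitude is drawn
    darker than 127, matching the guide's convention.
    """
    scale = 127 * amplitude_pct / 100.0
    row = []
    for x in range(width):
        sample = math.sin(2 * math.pi * x / wavelength)
        row.append(max(0, min(255, round(127 - scale * sample))))
    return row


print(sine_row(12, 12, 100))  # full swing around 127
print(sine_row(12, 12, 25))   # same shape, much closer to 127 (quieter)
```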
Effects modify the actual pixel data—they're "baked in" to the image. What you see IS what you hear.
Applies soft clipping (tanh saturation) to push values toward extremes.
Visually: Increases contrast. Sonically: Adds harmonics, creates grit.
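A minimal sketch of tanh soft clipping on pixel values is shown below. The `drive` parameter and the normalization by `tanh(drive)` are assumptions chosen so that the extremes stay at the extremes; the tool's actual curve and settings may differ.

```python
import math


def soft_clip(value: int, drive: float = 3.0) -> int:
    """Push an 8-bit value away from 127 using tanh saturation.

    `drive` (hypothetical) controls how hard the curve saturates.
    """
    x = (value - 127) / 128.0                      # center on silence
    y = math.tanh(drive * x) / math.tanh(drive)    # keep full scale at full scale
    return max(0, min(255, round(127 + 128 * y)))


for gray in (0, 90, 127, 160, 255):
    print(gray, "->", soft_clip(gray))
```

Mid-range values move furthest from 127, which is the contrast increase you see on the canvas.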
Copies and offsets the signal backward in time.
Visually: Creates offset copies/echoes of patterns. Apply multiple times for cascading echoes.
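The cascading behaviour can be sketched with a single-tap delay applied to a flat list of pixel values. This assumes the copy lands later in the buffer and is mixed in at half level; `apply_delay`, `delay`, and `mix` are names and defaults invented for this example.

```python
def apply_delay(pixels: list[int], delay: int, mix: float = 0.5) -> list[int]:
    """Add a scaled copy of the signal, offset by `delay` samples.

    Values are centered on 127 before mixing and clamped to 0..255 after.
    """
    out = []
    for n, value in enumerate(pixels):
        dry = value - 127
        wet = pixels[n - delay] - 127 if n >= delay else 0
        out.append(max(0, min(255, round(127 + dry + mix * wet))))
    return out


signal = [127] * 8
signal[1] = 27                      # one dark pixel in otherwise silent data
once = apply_delay(signal, delay=3)
twice = apply_delay(once, delay=3)  # applying again adds a further echo
print(once)
print(twice)
```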
Multi-tap delay algorithm creating diffuse reflections.
Visually: Creates smeared/ghosted trails.
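A multi-tap delay can be sketched as several echoes at different offsets and gains summed together. The tap positions and gains below are arbitrary illustrative values, and `multi_tap` is a hypothetical name; the tool's actual tap layout is not documented here.

```python
def multi_tap(pixels: list[int],
              taps: tuple[tuple[int, float], ...] = ((3, 0.5), (7, 0.3), (11, 0.2))) -> list[int]:
    """Sum the signal with several delayed, attenuated copies of itself.

    Each tap is (delay_in_samples, gain). Values are centered on 127
    before mixing and clamped to 0..255 after.
    """
    out = []
    for n, value in enumerate(pixels):
        acc = float(value - 127)
        for delay, gain in taps:
            if n >= delay:
                acc += gain * (pixels[n - delay] - 127)
        out.append(max(0, min(255, round(127 + acc))))
    return out


impulse = [27] + [127] * 11   # one dark pixel followed by silence
print(multi_tap(impulse))
```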
During playback, a magenta highlight shows the current row being read, and a vertical line shows the exact sample position.
Toggle GRID to show horizontal yellow lines marking each second boundary. This helps visualize temporal structure.
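Since the width always divides 44,100, the rows where each second begins are easy to compute. The helper below is a sketch; `second_boundary_rows` is a name made up for this example.

```python
SAMPLE_RATE = 44_100


def second_boundary_rows(width: int, height: int) -> list[int]:
    """Row indices at which a new second of audio starts (excluding row 0)."""
    if SAMPLE_RATE % width != 0:
        raise ValueError("width must divide 44,100")
    rows_per_second = SAMPLE_RATE // width
    return list(range(rows_per_second, height, rows_per_second))


print(second_boundary_rows(441, 441))  # [100, 200, 300, 400]
```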
Click ↑ to load any image (PNG, JPG, etc.). The image is scaled to fit your canvas dimensions. RGBA channels are mapped directly—sonify your photographs!
Click ↓ in the toolbar to export the image as raw RGBA data.
To open in GIMP: File → Open → Select file → Import as "Raw image data" → Set width/height, RGBA, 8-bit.
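If you would rather inspect the export in code, the sketch below splits raw RGBA bytes into separate channels. It assumes a headerless file with interleaved R, G, B, A bytes in row order, which is what the GIMP import settings above imply; `split_rgba` is a helper written for this example.

```python
def split_rgba(raw: bytes, width: int, height: int) -> dict[str, bytes]:
    """Split interleaved 8-bit RGBA data into one byte string per channel."""
    expected = width * height * 4
    if len(raw) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(raw)}")
    return {name: raw[offset::4] for offset, name in enumerate("RGBA")}


# Two pixels of sample data; with a real export you would read the file's
# bytes and pass the width and height you chose for the canvas.
channels = split_rgba(bytes([10, 20, 30, 255, 40, 50, 60, 128]), width=2, height=1)
print(list(channels["R"]), list(channels["A"]))
```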
Click ↓ next to playback controls to export stereo audio.
To open in Audacity: File → Import → Raw Data → Unsigned 8-bit, 2 channels (stereo), 44100 Hz.
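As an alternative to importing raw data by hand, the export can be wrapped in a WAV header with Python's standard `wave` module. This sketch assumes the file is headerless, unsigned 8-bit, with left and right samples interleaved (the layout the Audacity settings above imply); `raw_to_wav` is a name chosen for this example. 8-bit WAV audio is also unsigned, so the bytes can be written unchanged.

```python
import wave


def raw_to_wav(raw: bytes, destination) -> None:
    """Write unsigned 8-bit interleaved stereo data as a 44.1 kHz WAV.

    `destination` may be a file path or a writable binary file object.
    """
    if len(raw) % 2:
        raise ValueError("stereo data needs an even number of bytes")
    with wave.open(destination, "wb") as wav:
        wav.setnchannels(2)        # stereo
        wav.setsampwidth(1)        # 1 byte per sample = 8-bit
        wav.setframerate(44_100)
        wav.writeframes(raw)
```

Call it with the bytes of your exported file and an output path of your choice, then open the resulting WAV in any audio player.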
Example 1: Your First Tone
Example 2: Stereo Chord
Example 3: Fade In/Out Envelope
Example 4: Delay Effect
Example 5: Sonify a Photo