Buzzer – Comprehensive Technical Documentation

Extremely verbose version of README.md. This document covers every subsystem, design decision, challenge, limitation, and implementation detail of the Buzzer project, an 8‑voice polyphonic wavetable synthesizer built on an Arduino Uno.

Project Overview
Hardware Architecture
How Mozzi Works
Full Audio Signal Path
Sound Engine Deep Dive
Envelope System (ADSR)
Pitch Transpose System
Voice Allocation & Stealing
Gain Staging & Anti‑Clipping
Wavetables
Recording System
Melody Player
Serial Protocol
Hardware I/O
Python GUI
Timing System
Performance Optimisations
Challenges & Solutions
Limitations
Sound Quality Notes
Resource Usage
File Reference

1. Project Overview

Buzzer is a polyphonic synthesizer that runs entirely on an Arduino Uno (ATmega328P), a microcontroller with a 16 MHz clock, 2 KB of SRAM, 32 KB of Flash, and 1 KB of EEPROM. Despite these severe constraints, it achieves:

8 simultaneous voices with independent ADSR envelopes
15 selectable waveforms from 512‑sample wavetables
Real‑time pitch transposition via hardware potentiometer
EEPROM‑based recording and playback
13 built‑in melodies stored in PROGMEM
A full Python tkinter GUI communicating over 2 Mbaud serial

The audio engine is Mozzi, an open‑source library that generates audio on AVR microcontrollers using high‑frequency PWM. In HIFI mode (MOZZI_OUTPUT_2PIN_PWM), Mozzi splits 14‑bit audio across two pins (9 and 10), each carrying 7 bits, combined via a resistor network. The 125 kHz PWM carrier is well above audible range, and a 2‑pole RC filter further attenuates it by ~36 dB. Mozzi handles timer configuration, interrupt‑driven sample output, and provides building blocks like oscillators, envelopes, and filters.

2. Hardware Architecture

Pin Assignment

Pin	Function	Type	Details
9	Mozzi audio output (high bits)	PWM (Timer1)	Reserved — cannot be used for anything else. Mozzi drives this pin with the high 7 bits of 14‑bit HIFI audio at 16,384 Hz sample rate.
10	Mozzi audio output (low bits)	PWM (Timer1)	Reserved — cannot be used for anything else. Carries the low 7 bits. Combined with pin 9 via HIFI resistor network.
A0	Record button	Digital input (INPUT_PULLUP)	Active LOW. Starts EEPROM recording.
A1	Stop button	Digital input (INPUT_PULLUP)	Active LOW. Stops recording or playback.
A2	Play button	Digital input (INPUT_PULLUP)	Active LOW. Plays back EEPROM recording.
A3	Clear button	Digital input (INPUT_PULLUP)	Active LOW. Clears EEPROM recording data.
A5	Pitch potentiometer	Analog input	10 kΩ linear pot. Read via `mozziAnalogRead()`. Maps 0–1023 → ±24 semitones.
3	RGB LED (Red)	PWM output	Common cathode RGB. Brightness 0–40 (USB power safety).
5	RGB LED (Green)	PWM output	Colour derived from pitch (hue mapping).
6	RGB LED (Blue)	PWM output	Brightness modulated by envelope level.
4	Feedback LED	Digital output	Flashes briefly on button press (~150 ms).
7	Record status LED	Digital output	Steady ON while recording is active.
8	Play status LED	Digital output	Steady ON during playback.
11	TM1637 CLK	Digital output	7‑segment display clock line.
12	TM1637 DIO	Digital output	7‑segment display data line. Shows note name + octave (e.g. "C 4").

Critical Hardware Interaction: Mozzi vs Buttons

When startMozzi() is called, Mozzi writes to the DIDR0 register to disable digital input buffers on all analog pins (A0–A5). This is an ADC noise reduction feature. However, the four buttons are wired to A0–A3 and require digitalRead(). The fix:

// After startMozzi():
DIDR0 &= ~(0x0F);  // Re-enable digital input buffers for A0-A3

Without this, all four buttons would read as LOW permanently.

Audio Output Circuit

The Arduino Uno has no DAC (Digital‑to‑Analog Converter). Mozzi outputs audio as high‑frequency PWM. In HIFI mode (MOZZI_OUTPUT_2PIN_PWM), the 14‑bit audio sample is split across two pins: pin 9 carries the high 7 bits and pin 10 carries the low 7 bits, each at a 125 kHz PWM carrier rate. A resistor network sums them with a 128:1 weighting ratio, and a 2‑pole RC filter attenuates the PWM carrier by ~36 dB.

This provides ~84 dB of dynamic range (14 bits) compared to ~48 dB from standard single‑pin 8‑bit PWM — a massive improvement in audio quality, especially in quiet passages where quantisation noise was previously dominant.
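The dynamic range figures follow from the ideal relation 20·log10(2^N) for an N‑bit quantiser. A minimal sketch of that arithmetic, assuming an ideal converter with no analog noise or filter losses (the function name is illustrative, not part of the project):

```cpp
#include <cassert>
#include <cmath>

// Ideal dynamic range of an N-bit quantiser in decibels: 20 * log10(2^N).
// Ignores analog noise, filter losses and resistor tolerance.
double dynamicRangeDb(int bits) {
    assert(bits > 0 && bits < 32);
    return 20.0 * std::log10(std::pow(2.0, bits));
}
```

Calling dynamicRangeDb(14) and dynamicRangeDb(8) reproduces the ~84 dB and ~48 dB figures quoted above.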

The current circuit routes the combined HIFI signal through a 10 kΩ volume potentiometer, then into a PN2222 common‑emitter amplifier driving a passive piezo buzzer.

Signal Chain

flowchart LR
    P9["Pin 9\n(high 7 bits)"] --> HIFI["Stage 1\nHIFI RC Filter"]
    P10["Pin 10\n(low 7 bits)"] --> HIFI
    HIFI --> POT["Stage 2\nVolume Pot 10kΩ"]
    POT --> AMP["Stage 3\nPN2222 Amp"]
    AMP --> BUZ["Passive Buzzer"]

    style P9 fill:#4a9eff,color:#fff
    style P10 fill:#4a9eff,color:#fff
    style HIFI fill:#ff9f43,color:#fff
    style POT fill:#2ed573,color:#fff
    style AMP fill:#e056fd,color:#fff
    style BUZ fill:#1e90ff,color:#fff

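HIFI mode uses Timer1 AND Timer2. On the Uno this means pins 9 and 10 (Timer1) and pins 3 and 11 (Timer2) lose hardware analogWrite(). Pin 11 is used for TM1637 CLK via bit‑bang, so is unaffected. Pin 3 carries the red LED channel, so its PWM output should be checked with Mozzi running.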

Stage 1: HIFI Resistor Network + 2‑Pole RC Filter

Combines pin 9 (high byte) and pin 10 (low byte) into a single analog signal. Target ratio R_low/R_high ≈ 128. Actual: 500 kΩ / 4 kΩ = 125 (2.3% error, <0.05 bit loss).

flowchart LR
    P9["Pin 9 high 7 bits"]
    P10["Pin 10 low 7 bits"]
    R1["2kΩ"]
    R2["2kΩ"]
    R3["1MΩ"]
    R4["1MΩ"]
    C1["0.01µF ceramic"]
    C2["0.01µF ceramic"]
    A(("Node A to Pot"))
    G1["GND"]
    G2["GND"]

    P9 --- R1 --- C1 --- G1
    R1 --- R2 --- A
    R2 --- C2 --- G2
    P10 --- R3 --- A
    P10 --- R4 --- A

    style P9 fill:#4a9eff,color:#fff
    style P10 fill:#4a9eff,color:#fff
    style A fill:#ff9f43,color:#fff
    style G1 fill:#555,color:#fff
    style G2 fill:#555,color:#fff

Parts: 2× 2 kΩ (series = 4 kΩ), 2× 1 MΩ (parallel = 500 kΩ), 2× 0.01 µF ceramic. Filter: 2‑pole RC, −40 dB/decade rolloff. Attenuates 125 kHz PWM carrier by ~36 dB.

Stage 2: Volume Control (10 kΩ Potentiometer)

Simple voltage divider between the filtered HIFI audio signal and ground.

flowchart LR
    NA["Node A\n(filtered HIFI)"] --> POT_1["Pot Pin 1"]
    POT_1 --- POT_W["Pot Wiper"] --> AMP["To Amp"]
    POT_1 --- POT_3["Pot Pin 3"] --- GND["GND"]

    style NA fill:#4a9eff,color:#fff
    style POT_1 fill:#2ed573,color:#fff
    style POT_W fill:#2ed573,color:#fff
    style POT_3 fill:#2ed573,color:#fff
    style GND fill:#555,color:#fff
    style AMP fill:#e056fd,color:#fff

Stage 3: PN2222 Common‑Emitter Amplifier

Use a PASSIVE buzzer (no internal oscillator). Active buzzers only beep at one fixed frequency. Passive piezos are non‑polarized.

Bias: 10 kΩ / 2 kΩ divider → V_base ≈ 0.83 V, V_emitter ≈ 0.23 V, I_C ≈ 2.3 mA
AC Gain: ~90× (100 µF bypass cap shorts R_E for AC, leaving only transistor r_e)
R_C (1 kΩ) and Buzzer are in parallel between +5 V and Collector
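The bias and gain figures above can be reproduced with a short calculation. This is a rough sketch under stated assumptions that are not taken from the project: V_BE = 0.6 V, a thermal voltage of 25 mV, negligible base current, and no loading from the buzzer. The struct and function names are illustrative.

```cpp
#include <cassert>

struct BiasPoint {
    double vBase;       // volts
    double vEmitter;    // volts
    double iCollector;  // amps
    double acGain;      // unloaded voltage gain magnitude
};

// Assumptions (not from the project): V_BE = 0.6 V, thermal voltage 25 mV,
// base current and buzzer loading neglected.
BiasPoint estimateBias(double vcc, double rTop, double rBottom,
                       double rCollector, double rEmitter) {
    assert(rTop > 0 && rBottom > 0 && rEmitter > 0);
    const double kVbe = 0.6;
    const double kThermalVoltage = 0.025;
    BiasPoint p{};
    p.vBase = vcc * rBottom / (rTop + rBottom);   // resistive divider
    p.vEmitter = p.vBase - kVbe;
    p.iCollector = p.vEmitter / rEmitter;         // I_C is close to I_E
    const double smallRe = kThermalVoltage / p.iCollector;
    p.acGain = rCollector / smallRe;              // emitter resistor fully bypassed
    return p;
}
```

With estimateBias(5.0, 10000, 2000, 1000, 100) the result lands close to the values listed: about 0.83 V at the base, about 2.3 mA of collector current, and a gain in the low 90s.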

flowchart TD
    VCC["+5V"]
    WIPER["Pot wiper"]

    R1["10kΩ bias top"]
    R2["2kΩ bias bottom"]
    CIN["10µF coupling cap"]
    RC["1kΩ collector load"]
    BUZ["Passive Buzzer"]
    RE["100Ω emitter"]
    CBYP["100µF bypass cap"]

    QB(("B"))
    QC(("C"))
    QE(("E"))
    QL["PN2222"]

    G1["GND"]
    G2["GND"]
    G3["GND"]

    VCC --- R1 --- QB
    QB --- R2 --- G1
    WIPER --- CIN --- QB
    QB --- QL
    QL --- QC
    QL --- QE
    VCC --- RC --- QC
    VCC --- BUZ --- QC
    QE --- RE --- G2
    RE --- CBYP --- G3

    style VCC fill:#ff4757,color:#fff
    style QL fill:#ffa502,color:#fff
    style QB fill:#ffa502,color:#fff
    style QC fill:#ffa502,color:#fff
    style QE fill:#ffa502,color:#fff
    style WIPER fill:#e056fd,color:#fff
    style BUZ fill:#1e90ff,color:#fff
    style G1 fill:#555,color:#fff
    style G2 fill:#555,color:#fff
    style G3 fill:#555,color:#fff

Parallel connection: Both R_C (1 kΩ) and Buzzer have one leg on +5 V and the other on Collector. R_C provides the DC bias path. The AC voltage swing at the collector drives the piezo.

Electrolytic capacitor polarity (stripe = negative):

Cap	+ leg connects to	− leg connects to
10 µF coupling	Pot wiper (~2.5 V DC)	Base (~0.83 V DC)
100 µF bypass	Emitter (~0.23 V DC)	GND (0 V)

Rule: + leg always toward higher DC voltage.

Bill of Materials (Audio Circuit)

Part	Qty	Role
2 kΩ resistor	2	HIFI R_high (series = 4 kΩ)
1 MΩ resistor	2	HIFI R_low (parallel = 500 kΩ)
0.01 µF ceramic cap	2	2‑pole PWM filter
10 kΩ potentiometer	1	Volume control
10 kΩ resistor	1	Bias divider top
2 kΩ resistor	1	Bias divider bottom
1 kΩ resistor	1	Collector load
100 Ω resistor	1	Emitter degeneration
10 µF electrolytic	1	Input coupling
100 µF electrolytic	1	Emitter bypass
PN2222 transistor	1	Amplifier
Passive buzzer	1	Speaker output

3. How Mozzi Works

Mozzi is a real‑time audio synthesis library for Arduino. Understanding its architecture is essential to understanding this project.

Timer Takeover

Mozzi takes control of hardware timers:

Timer1: drives the PWM output on pins 9 and 10. In HIFI mode, Timer1 outputs the high 7 bits on pin 9 (OCR1A) and the low 7 bits on pin 10 (OCR1B), each at a 125 kHz PWM carrier rate.
Timer2: claimed by Mozzi in HIFI mode to generate the audio‑rate interrupt (MOZZI_AUDIO_RATE, 16,384 Hz) that loads the next buffered sample into the Timer1 compare registers. Not available for user code.
Timer0: Mozzi reconfigures Timer0, which millis(), delay(), and micros() depend on. The Buzzer project therefore avoids these calls and uses its own tick‑based timing system instead (see section 16).

The Two Callbacks

Mozzi operates on a two‑rate system:

updateAudio(): called at ~16,384 Hz (MOZZI_AUDIO_RATE). Must return one audio sample. Mozzi calls it from audioHook() to fill the output buffer that the timer interrupt drains, so it must be extremely fast (budget: ~61 µs per call on average, but realistically <30 µs to leave headroom). No Serial, no analogRead(), no digitalWrite(), only pure math.
updateControl(): called at MOZZI_CONTROL_RATE (64 Hz in this project). This is where "slow" operations happen: reading the potentiometer, updating ADSR envelopes, polling buttons, parsing serial commands. Budget is more generous (~15.6 ms per call) but must still avoid blocking.

The Main Loop

Mozzi requires loop() to contain only audioHook():

void loop() {
  audioHook();
}

audioHook() is a non‑blocking scheduler that manages the timing between updateControl() calls and fills the audio output buffer.
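To make the two rates concrete, here is a simplified host‑side model of a scheduler that runs one control update for every 256 audio samples (16,384 / 64). It is only a sketch of the idea and is not Mozzi's implementation; the struct and constant names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Simplified host-side model of a two-rate scheduler. This is NOT Mozzi's
// code; it only illustrates interleaving one control update with every
// N audio samples.
constexpr uint32_t kAudioRate = 16384;   // samples per second
constexpr uint32_t kControlRate = 64;    // control updates per second
constexpr uint32_t kSamplesPerControl = kAudioRate / kControlRate;  // 256

struct TwoRateModel {
    uint32_t audioCalls = 0;
    uint32_t controlCalls = 0;

    // Produce one audio sample; run a control update first whenever a
    // control period boundary is reached.
    void step() {
        if (audioCalls % kSamplesPerControl == 0) {
            ++controlCalls;   // stands in for updateControl()
        }
        ++audioCalls;         // stands in for updateAudio()
        assert(controlCalls <= audioCalls);
    }
};
```

Stepping the model 16,384 times corresponds to one second of audio and yields 64 control updates.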

Single Compilation Unit Constraint

<Mozzi.h> (or the older <MozziGuts.h>) can only be #include'd in one .cpp file (main.cpp). Other source files must use <MozziHeadersOnly.h> to access Mozzi types and utilities. This is because Mozzi.h defines ISR handlers and global state that would cause linker errors if included multiple times.

This constraint means:

mozziAnalogRead() is only available in main.cpp
startMozzi() and audioHook() are only callable from main.cpp
sound.cpp uses <MozziHeadersOnly.h> for Oscil, ADSR, Smooth etc.

4. Full Audio Signal Path

Here is the complete journey of a single audio sample from note trigger to speaker output:

Step 1: Note Trigger

A note arrives via serial (GUI/MIDI) or melody player as a frequency in Hz. Sound::noteOn(freq) is called.

Step 2: Instrument Offset

The raw frequency is adjusted by the current instrument's octave shift. The shift is stored as semitones (multiples of 12) and precomputed as a bit‑shift count at instrument‑load time to avoid division in the hot path.

freq = applyInstrument(freq)  // bit-shift by cached octave offset

Step 3: Voice Allocation

A free voice is obtained from the voice free stack in O(1). If all 8 voices are busy, the oldest voice is stolen (linear scan).

Step 4: Pitch Transpose

The hardware pot's current transpose value (±24 semitones) is applied using a PROGMEM lookup table of fixed‑point frequency ratios:

shiftedFreq = ((uint32_t)freq * pgm_read_word(&SEMITONE_RATIO[transpose + 24])) >> 8;  // 32-bit product, ratio read from Flash

This avoids floating‑point math (pow(2, semitones/12.0)) which would take ~2000 cycles on AVR.

Step 5: Oscillator Setup

The voice's Oscil<512, MOZZI_AUDIO_RATE> oscillator is configured with the transposed frequency. The oscillator performs phase‑accumulator wavetable lookup — it steps through a 512‑sample wavetable at a rate determined by the frequency.

Step 6: ADSR Envelope

The voice's ADSR<MOZZI_CONTROL_RATE, MOZZI_AUDIO_RATE> envelope is configured with the current instrument's parameters and noteOn() is called. The ADSR progresses through Attack → Decay → Sustain → Release phases, outputting a value 0–255.

Step 7: Sample Generation (`updateAudio()`, ~16 kHz)

Inside updateAudio(), for each active voice:

int8_t sample = voice.osc.next();      // Wavetable lookup: -128 to +127
uint8_t envVal = voice.env.next();     // ADSR output: 0 to 255
mix += (int16_t)sample * envVal;       // -32640 to +32385

Keeping both operands at 8 bits lets the compiler use the AVR's native 8×8→16 hardware multiply (MULSU for signed × unsigned, 2 cycles), avoiding the much slower wide software multiply it would otherwise generate.

Step 8: Voice Mixing

All active voices are summed into a single int32_t mix accumulator. Only active voices are processed, identified via an 8‑bit bitmask (activeVoiceMask) — inactive voices are skipped without even checking their struct.

Step 9: Gain Scaling

The raw mix is scaled by a smoothed gain factor:

int16_t gainDelta = (int16_t)targetGain - (int16_t)smoothedGain;
smoothedGain += gainDelta >> 2;
mix = (mix * smoothedGain) >> 8;

targetGain is a voice‑count‑dependent value updated at control rate (64 Hz), boosted ~25% for HIFI's cleaner 14‑bit output:

1 voice: 320 (1.25×)
2 voices: 240
3 voices: 200
4 voices: 160
5–6 voices: 120
7–8 voices: 100

The integer IIR filter (>>2 coefficient, roughly a 54 ms time constant or ~3 Hz cutoff at the 64 Hz control rate) ensures gain transitions happen gradually to prevent audible pops when voices start/stop.

Critical lesson learned: An earlier attempt applied Smooth<int>(0.95f) directly to audio samples at 16 kHz. This acted as a one‑pole low‑pass filter with a cutoff near 134 Hz, progressively attenuating everything above it and causing extreme volume loss and distortion. Smoothing must only be applied to control signals at control rate. The current integer IIR approach eliminates the Smooth library dependency entirely.
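The 134 Hz figure can be checked with the usual one‑pole approximation fc ≈ −ln(a)·fs / (2π). The sketch below assumes the smoother has the form y = a·y + (1 − a)·x with feedback coefficient a; the function name is illustrative.

```cpp
#include <cassert>
#include <cmath>

// Approximate -3 dB cutoff of a one-pole smoother y = a*y + (1 - a)*x,
// where `feedback` is a and sampleRateHz is how often the filter runs.
double onePoleCutoffHz(double feedback, double sampleRateHz) {
    assert(feedback > 0.0 && feedback < 1.0);
    const double kPi = 3.14159265358979323846;
    return -std::log(feedback) * sampleRateHz / (2.0 * kPi);
}
```

Evaluating onePoleCutoffHz(0.95, 16384.0) gives a value just under 134 Hz, which is why the filter removed most of the musical range when run at audio rate.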

Step 10: Hard Clipping

if (mix > 32767) mix = 32767;
else if (mix < -32767) mix = -32767;
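Steps 7 to 10 can be combined into one host‑testable function. This is a minimal sketch, assuming per‑voice sample and envelope values have already been read into arrays; mixVoices and its parameters are hypothetical names, not the project's API.

```cpp
#include <cassert>
#include <cstdint>

constexpr int kNumVoices = 8;

// Mix the active voices, apply an 8.8 fixed-point gain (256 = unity) and
// hard-clip to the symmetric 16-bit range used in the text.
int16_t mixVoices(const int8_t (&samples)[kNumVoices],
                  const uint8_t (&envelopes)[kNumVoices],
                  uint8_t activeMask, uint16_t gain) {
    assert(gain <= 4096);   // keeps mix * gain inside int32
    int32_t mix = 0;
    for (int i = 0; i < kNumVoices; ++i) {
        if (activeMask & (1u << i)) {
            // int8 x uint8 product fits in 16 bits: -32640 .. +32385.
            mix += static_cast<int16_t>(samples[i]) * envelopes[i];
        }
    }
    // Arithmetic right shift for negative values (as GCC provides).
    mix = (mix * static_cast<int32_t>(gain)) >> 8;
    if (mix > 32767) mix = 32767;
    else if (mix < -32767) mix = -32767;
    return static_cast<int16_t>(mix);
}
```

With all eight voices at full scale and the 7–8 voice gain of 100, the scaled sum still exceeds the 16‑bit range, so the clip stage is what bounds the output in that worst case.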

Step 11: Mozzi Output

The final sample is returned as MonoOutput::from16Bit(mix), which Mozzi splits into high and low 7‑bit values. The high byte is loaded into Timer1’s OCR1A register (pin 9) and the low byte into OCR1B (pin 10).

Step 12: HIFI PWM to Analog

Both pins output PWM at 125 kHz carrier rate. The external HIFI resistor network sums them with a 128:1 weighting (4 kΩ for high bits, 500 kΩ for low bits), reconstructing the 14‑bit analog value. A 2‑pole RC filter (2× 0.01 µF ceramic caps) attenuates the 125 kHz carrier by ~36 dB. The resulting audio signal passes through the volume pot and amplifier to the buzzer.

5. Sound Engine Deep Dive

The sound engine (sound.cpp) manages 8 Voice structs:

struct Voice {
  Oscil<512, MOZZI_AUDIO_RATE> osc;  // Wavetable oscillator
  ADSR<MOZZI_CONTROL_RATE, MOZZI_AUDIO_RATE> env;  // Envelope
  uint16_t freq;              // Current playing frequency (post-transpose)
  uint16_t baseFreq;          // Pre-transpose frequency (for matching noteOff)
  uint16_t timedDurationTicks; // Auto-release countdown (0 = sustained)
  uint32_t timedStartTick;    // When the timed note started
  uint32_t noteOnTick;        // When the note was triggered (for age-based stealing)
  bool active;
  volatile uint8_t currentLevel; // Envelope level for LED visualiser
};

Two Note Modes

Sustained (noteOn / noteOff): used by the GUI keyboard, MIDI, and the melody player. The note plays indefinitely until noteOff(freq) is called. timedDurationTicks is 0.
Timed (playTimed): used by EEPROM playback, and by earlier versions of the melody player (see section 12). The note auto‑releases after the specified duration. timedDurationTicks is set, and controlUpdate() checks elapsed ticks.

Note Matching

noteOff(freq) must find the correct voice. Since pitch transpose can change between noteOn and noteOff, matching is done by baseFreq (the pre‑transpose frequency), not the actual playing frequency. The instrument offset is applied to the incoming noteOff frequency before matching.

6. Envelope System (ADSR)

Each voice has an independent ADSR (Attack, Decay, Sustain, Release) envelope generator from Mozzi.

Parameters

Parameter	Range	Description
Attack	0–1000 ms	Time to rise from 0 to peak (255)
Decay	0–1000 ms	Time to fall from peak to sustain level
Sustain Level	0–255	Steady‑state amplitude during hold
Sustain Length	0–10000 ms	How long to hold sustain before auto‑release (0 = hold until noteOff)
Release	0–1000 ms	Time to fade from sustain to silence after noteOff

8 Presets

ID	Name	A	D	S	SL	R	Octave Shift
0	Piano	5	200	40	5000	200	0
1	Organ	20	0	255	10000	150	0
2	Staccato	5	50	0	500	20	0
3	Pad	150	800	180	10000	1000	0
4	Flute	80	0	220	10000	300	0
5	Bell	5	400	0	1000	1200	0
6	Bass	10	150	150	5000	100	−12 (1 octave down)
7	Custom	user	user	user	user	user	user

Presets are stored in PROGMEM and loaded via memcpy_P on instrument change. The "Custom" preset is stored in RAM and updated live from the GUI's ADSR sliders.

7. Pitch Transpose System

A potentiometer on pin A5 controls real‑time pitch transposition from −24 to +24 semitones (±2 octaves).

Reading the Pot

uint16_t raw = mozziAnalogRead(PIN_POT_PITCH);  // 0-1023

Why mozziAnalogRead instead of analogRead? Mozzi reconfigures the ADC for its own background conversions. Calling analogRead() would conflict with Mozzi's ADC state machine, potentially corrupting both the analog read and Mozzi's audio timing. mozziAnalogRead() cooperates with Mozzi's ADC scheduler.

Deadband Filter

ADC noise causes ±2–5 LSB jitter even with a stable pot. A deadband of ±20 raw units prevents spurious pitch updates:

int16_t diff = (int16_t)raw - (int16_t)lastRawPot;
if (diff > 20 || diff < -20) {
  lastRawPot = raw;
  // ... process new value
}
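The deadband check and the division‑free semitone mapping can be exercised together on a host. This is a sketch only: the PitchPot struct and its centred start value are assumptions, while the ±20 deadband and the (raw * 49) >> 10 mapping come from this document.

```cpp
#include <cassert>
#include <cstdint>

constexpr int16_t kDeadband = 20;   // raw ADC units, as in the text

struct PitchPot {
    uint16_t lastRaw = 512;   // hypothetical start value: pot centred

    // Returns true and stores the reading only when it moved by more than
    // the deadband; otherwise the reading is ignored as noise.
    bool update(uint16_t raw) {
        assert(raw <= 1023);
        const int16_t diff =
            static_cast<int16_t>(raw) - static_cast<int16_t>(lastRaw);
        if (diff > kDeadband || diff < -kDeadband) {
            lastRaw = raw;
            return true;
        }
        return false;
    }

    // Map 0..1023 to -24..+24 semitones without division:
    // (raw * 49) >> 10 yields 0..48, then re-centre on zero.
    int8_t semitones() const {
        const uint32_t step = (static_cast<uint32_t>(lastRaw) * 49u) >> 10;
        return static_cast<int8_t>(static_cast<int>(step) - 24);
    }
};
```

A reading that moves by exactly 20 units is still rejected, since the comparison is strict.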

Semitone Mapping

The 0–1023 range maps linearly to −24…+24 semitones. The semitone value is applied to frequencies using a PROGMEM lookup table of 49 fixed‑point ratios (8.8 format):

ratio[0]  = 64   // -24 semitones = 0.25× (2 octaves down)
ratio[24] = 256  // 0 semitones = 1.0× (unity)
ratio[48] = 1024 // +24 semitones = 4.0× (2 octaves up)

Application: newFreq = (freq * ratio) >> 8

This avoids pow(2.0, semitones/12.0) which would require floating‑point math (~2000 cycles on AVR vs ~10 cycles for integer multiply + shift).
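For reference, the table entries can be generated offline as round(256 · 2^(n/12)). The sketch below computes ratios on the host with floating point and then applies them with integer math only. On the AVR the values would be precomputed into PROGMEM rather than calculated at run time; the function names here are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

constexpr int kMaxTranspose = 24;
constexpr int kTableSize = 2 * kMaxTranspose + 1;   // 49 entries

// Ratio for one table slot in 8.8 fixed point: round(256 * 2^(semitones/12)).
uint16_t semitoneRatio(int index) {
    assert(index >= 0 && index < kTableSize);
    const double semitones = index - kMaxTranspose;
    return static_cast<uint16_t>(
        std::lround(256.0 * std::pow(2.0, semitones / 12.0)));
}

// Apply a transpose of -24..+24 semitones. The 32-bit intermediate matters:
// freq * ratio can exceed 16 bits.
uint16_t applyTranspose(uint16_t freq, int semitones) {
    assert(semitones >= -kMaxTranspose && semitones <= kMaxTranspose);
    const uint32_t ratio = semitoneRatio(semitones + kMaxTranspose);
    return static_cast<uint16_t>((static_cast<uint32_t>(freq) * ratio) >> 8);
}
```

For example, applyTranspose(440, 12) yields 880 and applyTranspose(440, 7) yields 660, a perfect fifth above A4 to within the 8.8 rounding error.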

Live Update

When the pot changes, setPitchTranspose() updates all currently‑active voices in real‑time by recalculating their frequencies from baseFreq through the ratio table. This produces a smooth pitch‑bend effect.

GUI Sync

The Arduino sends SYNC:PI <value> over serial whenever the pitch changes. The Python GUI updates its "Pitch (Pot A5)" label accordingly.

8. Voice Allocation & Stealing

O(1) Free Voice Stack

Instead of scanning all 8 voices to find a free one (O(n)), a stack‑based allocator provides O(1) allocation:

uint8_t freeVoiceStack[8];  // Stack of free voice indices
uint8_t freeVoiceCount = 8; // Number of free voices

// Allocate: pop from stack
int8_t vi = freeVoiceStack[--freeVoiceCount];

// Free: push to stack
freeVoiceStack[freeVoiceCount++] = vi;
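The fragment above omits initialisation and bounds checks. A self‑contained sketch of the same stack with both added follows; VoiceAllocator and its method names are hypothetical, not the project's.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint8_t kNumVoices = 8;

struct VoiceAllocator {
    uint8_t freeStack[kNumVoices];
    uint8_t freeCount = kNumVoices;

    VoiceAllocator() {
        for (uint8_t i = 0; i < kNumVoices; ++i) freeStack[i] = i;
    }

    // Pop a free voice index, or -1 when every voice is busy (the caller
    // would then fall back to voice stealing).
    int8_t allocate() {
        if (freeCount == 0) return -1;
        return static_cast<int8_t>(freeStack[--freeCount]);
    }

    // Push a voice index back. The caller must not release a voice twice.
    void release(uint8_t voice) {
        assert(voice < kNumVoices && freeCount < kNumVoices);
        freeStack[freeCount++] = voice;
    }
};
```

Both operations touch a single array slot, which is what makes allocation constant time regardless of how many voices are in use.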

Oldest‑Voice Stealing

When all 8 voices are active and a new note arrives, the oldest voice (by noteOnTick) is stolen. The scanner uses the global tick counter to determine age:

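uint8_t victim = 0;
uint32_t oldestAge = 0;
for (uint8_t i = 0; i < 8; i++) {
  uint32_t age = now - voices[i].noteOnTick;  // unsigned math survives tick wrap
  if (age >= oldestAge) { oldestAge = age; victim = i; }
}
// voices[victim] is the oldest voice and is stolen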

Active Voice Bitmask

An 8‑bit bitmask tracks which voices are active:

uint8_t activeVoiceMask = 0;
markVoiceActive(i)   → activeVoiceMask |= (1 << i);
markVoiceInactive(i) → activeVoiceMask &= ~(1 << i);

Benefits:

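__builtin_popcount(activeVoiceMask) gives the active voice count without touching any voice struct. The AVR has no popcount instruction, so avr-gcc expands this to a short software routine rather than a single instruction.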
The audio loop iterates via bit‑shifting the mask, skipping inactive voices without loading their struct
activeVoiceMask == 0 provides instant early‑exit from audioUpdate()
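The mask operations can be sketched portably as follows. The helper names are illustrative, and countActive uses Kernighan's bit‑clearing loop as a stand‑in for __builtin_popcount so the example does not depend on a compiler builtin.

```cpp
#include <cassert>
#include <cstdint>

// Set the bit for one voice and return the updated mask.
uint8_t markActive(uint8_t mask, uint8_t voice) {
    assert(voice < 8);
    return static_cast<uint8_t>(mask | (1u << voice));
}

// Count set bits with Kernighan's loop: one iteration per active voice.
uint8_t countActive(uint8_t mask) {
    uint8_t count = 0;
    while (mask) {
        mask &= static_cast<uint8_t>(mask - 1);   // clear lowest set bit
        ++count;
    }
    return count;
}

// Visit each active voice index by shifting the mask, so inactive voices
// cost one bit test and no struct access. `visit` is any callable taking
// the voice index. The loop ends as soon as no set bits remain.
template <typename Visitor>
void forEachActive(uint8_t mask, Visitor visit) {
    for (uint8_t i = 0; mask != 0; ++i, mask >>= 1) {
        if (mask & 1u) visit(i);
    }
}
```

An empty mask makes forEachActive return immediately, which mirrors the early exit described above.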

9. Gain Staging & Anti‑Clipping

The Clipping Problem

Each voice produces samples in the range −32,640 to +32,385 (int8 × uint8). With 8 voices, the theoretical maximum sum is ±261,120 — far exceeding the 16‑bit output range of ±32,767.

Voice‑Count‑Based Gain Table

Rather than simple equal division (which sounds too quiet with few voices), a tuned gain table provides musically useful levels:

Active Voices	Gain (8.8 fixed point)	Effective Scale
0–1	320	1.25× (boosted for HIFI)
2	240	0.9375×
3	200	0.78×
4	160	0.625×
5–6	120	0.47×
7–8	100	0.39×

Smoothed Transitions

When voices start or stop, the gain target changes. Applying it instantly would cause an audible click (the entire mix suddenly jumping in volume). An integer IIR filter (exponential moving average), running at 64 Hz control rate, ramps the gain smoothly:

int16_t gainDelta = (int16_t)targetGain - (int16_t)smoothedGain;
smoothedGain += gainDelta >> 2;  // one-pole smoother, ~3 Hz cutoff at 64 Hz CR

The >>2 shift gives a time constant of about 54 ms (roughly 90% settled after 8 control ticks, or 125 ms), preventing pops when voices start/stop.
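The integer smoother can be stepped on a host to see how it converges. This sketch reuses the update rule shown above; the function names are illustrative, and it relies on arithmetic right shift of negative values, which GCC (including avr-gcc) provides.

```cpp
#include <cassert>
#include <cstdint>

// One control-rate step of the integer smoother from the text:
// move a quarter of the remaining distance towards the target.
uint16_t smoothGainStep(uint16_t smoothed, uint16_t target) {
    assert(smoothed <= 1024 && target <= 1024);
    const int16_t delta =
        static_cast<int16_t>(target) - static_cast<int16_t>(smoothed);
    return static_cast<uint16_t>(smoothed + (delta >> 2));
}

// Run the smoother for a number of control ticks and return the final gain.
uint16_t settle(uint16_t start, uint16_t target, int ticks) {
    uint16_t g = start;
    for (int i = 0; i < ticks; ++i) g = smoothGainStep(g, target);
    return g;
}
```

One property worth knowing: because the shift truncates, a rising gain stops up to 3 counts short of its target (100 → 320 settles at 317), while a falling gain reaches the target exactly. At these gain values the shortfall is about 1% and is unlikely to be audible.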

HIFI gain boost rationale: With 14‑bit output (vs old 9‑bit standard mode), the DAC resolves 32× more amplitude levels. Signals that were previously lost in quantisation noise are now cleanly rendered. This allows running the gain ~25% hotter without introducing artefacts.

Hard Clipping

After gain scaling, samples exceeding ±32,767 are hard‑clipped. With the gain table's conservative approach, clipping only occurs when many loud voices play simultaneously — and the smooth gain transition prevents sudden clipping artefacts.

10. Wavetables

All waveforms are 512‑sample wavetables stored in PROGMEM as int8_t arrays (−128 to +127).

Available Waveforms

ID	Name	Source
0	Triangle	Mozzi built‑in
1	Sawtooth	Mozzi built‑in
2	Sine	Mozzi built‑in
3	Square (no alias)	Mozzi built‑in
4	Electric Piano	AKWF collection, modified
5	Clarinet	AKWF collection
6	Violin	AKWF collection
7	FM Synth	AKWF collection
8	Guitar	AKWF collection
9	Cello	AKWF collection
10	Flute	AKWF collection
11	NES Pulse (12.5%)	AKWF/custom
12	Oboe	AKWF collection
13	Osc Chip	AKWF collection
14	Piano	AKWF collection

AKWF = Adventure Kid Waveforms — a public‑domain collection of single‑cycle waveforms. The originals are 600+ samples; they were resampled to 512 and quantised to 8‑bit signed for Mozzi compatibility.

How Wavetable Synthesis Works

A wavetable oscillator works by stepping through a stored waveform at a variable rate:

A phase accumulator increments by a step size each sample.
The step size is proportional to the desired frequency: step = tableSize × freq / sampleRate.
The accumulator index wraps around at tableSize, producing a periodic waveform.
The sample at the current index is output.

Mozzi's Oscil<512, MOZZI_AUDIO_RATE> handles all of this with fixed‑point arithmetic for sub‑sample accuracy.
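The four steps above can be sketched as a small phase‑accumulator oscillator. This is an illustration of the technique using 16.16 fixed point, not Mozzi's Oscil implementation; the type and function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kTableSize = 512;     // samples per wavetable
constexpr uint32_t kSampleRate = 16384;  // Hz

// Phase step per sample in 16.16 fixed point:
// tableSize * freq / sampleRate, scaled by 65536.
uint32_t phaseIncrement(uint16_t freqHz) {
    assert(freqHz < kSampleRate / 2);   // stay below Nyquist
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(kTableSize) * 65536u * freqHz) / kSampleRate);
}

struct PhaseOscillator {
    const int8_t* table;      // kTableSize samples, -128..127
    uint32_t phase = 0;       // 16.16 fixed point, integer part = table index
    uint32_t increment = 0;

    explicit PhaseOscillator(const int8_t* wavetable) : table(wavetable) {}

    void setFreq(uint16_t freqHz) { increment = phaseIncrement(freqHz); }

    // Return the current sample, then advance and wrap the phase.
    int8_t next() {
        const int8_t sample = table[(phase >> 16) & (kTableSize - 1)];
        phase += increment;
        phase &= (kTableSize << 16) - 1;   // wrap at tableSize (power of two)
        return sample;
    }
};
```

At 32 Hz the step is exactly one table entry per sample, so the oscillator reads the table in order and wraps after 512 samples. At 440 Hz the step is 13.75 entries, with the fractional part carried in the low 16 bits.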

11. Recording System

Architecture

The recorder captures note events to the Arduino Uno's 1 KB EEPROM for offline playback.

Record Format

Each note is stored as a 6‑byte RecordedNote:

struct RecordedNote {
  uint16_t deltaMs;    // Silence before this note (ms since last event)
  uint16_t frequency;  // Hz (0 = rest)
  uint16_t duration;   // How long the note played (ms)
};

Capacity: (1024 − 2 header bytes) / 6 = 170 notes maximum.

The EEPROM Write Problem

EEPROM writes take 3.3 ms each and are blocking — the CPU halts until the write completes. At 16 kHz audio rate, 3.3 ms = 54 missed audio samples, causing a massive audible glitch.

Solution: Non‑Blocking Trickle Write

Notes are buffered in a 32‑entry RAM ring buffer. In each updateControl() call (64 Hz), if the EEPROM hardware is ready (eeprom_is_ready()), exactly one byte is written using raw register access:

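uint8_t sreg = SREG;    // Save interrupt state
cli();                  // EEPE must be set within 4 clock cycles of EEMPE
EEAR = address;         // Set EEPROM address register
EEDR = data_byte;       // Set EEPROM data register
EECR |= (1 << EEMPE);  // Master Program Enable
EECR |= (1 << EEPE);   // Start Write
SREG = sreg;            // Restore interrupt state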

Since each note is 6 bytes and writes happen at 64 Hz (one byte per cycle), a single note takes 6/64 = 93.75 ms to fully persist. The ring buffer absorbs bursts of fast notes.
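The buffering scheme can be modelled on a host with a byte array standing in for the EEPROM. This is a sketch under assumptions: the TrickleWriter type, its field names, and the drop‑when‑full policy are hypothetical, and the hardware readiness check is reduced to a comment.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

struct RecordedNote {
    uint16_t deltaMs;
    uint16_t frequency;
    uint16_t duration;
};
static_assert(sizeof(RecordedNote) == 6, "record must stay 6 bytes");

constexpr uint8_t kRingSize = 32;            // must be a power of two
constexpr uint16_t kEepromSize = 1024;
constexpr uint16_t kHeaderBytes = 2;
constexpr uint16_t kMaxNotes = (kEepromSize - kHeaderBytes) / 6;   // 170
constexpr uint16_t kEndAddress = kHeaderBytes + kMaxNotes * 6;

struct TrickleWriter {
    RecordedNote ring[kRingSize];
    uint8_t head = 0, tail = 0;              // head: next push, tail: next flush
    uint8_t byteInNote = 0;                  // progress within ring[tail]
    uint16_t address = kHeaderBytes;         // next EEPROM address
    uint8_t eeprom[kEepromSize] = {};        // host stand-in for the EEPROM

    bool push(const RecordedNote& note) {
        const uint8_t nextHead =
            static_cast<uint8_t>((head + 1) & (kRingSize - 1));
        if (nextHead == tail) return false;  // ring full, drop the note
        ring[head] = note;
        head = nextHead;
        return true;
    }

    // Call once per control tick: writes at most one byte.
    // On hardware this would first check eeprom_is_ready().
    void tick() {
        if (tail == head) return;            // nothing pending
        if (address >= kEndAddress) return;  // storage full
        assert(byteInNote < sizeof(RecordedNote));
        uint8_t bytes[sizeof(RecordedNote)];
        std::memcpy(bytes, &ring[tail], sizeof bytes);
        eeprom[address++] = bytes[byteInNote++];
        if (byteInNote == sizeof(RecordedNote)) {
            byteInNote = 0;
            tail = static_cast<uint8_t>((tail + 1) & (kRingSize - 1));
        }
    }
};
```

Pushing two notes and calling tick() twelve times drains the ring completely, matching the six ticks per note described above. The index wrap uses a bitmask instead of a modulo, which only works because the ring size is a power of two.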

Write‑on‑Release Design

Events are only written when a note is released (key up). At that point, both the delta (silence before the note started) and the duration (how long it played) are known. This halves the number of EEPROM writes compared to writing on both note‑on and note‑off.

Trailing Silence

When recording stops, a silence marker is written capturing the gap between the last note‑off and the stop button press. Without this, the last note would appear to have zero duration on playback.

Deferred Stop

Recorder::stop() doesn't immediately halt — it enters a "stopping" state and waits for the ring buffer to fully flush to EEPROM before finalising the recording count.

12. Melody Player

Storage

13 melodies are stored in PROGMEM as arrays of Note structs:

struct Note {
  uint16_t frequency;  // Hz (0 = rest)
  uint16_t duration;   // ms
};

O(1) Lookup

A PROGMEM lookup table maps melody IDs to their data pointers and lengths, replacing a 13‑case switch statement:

const MelodyInfo MELODY_TABLE[13] PROGMEM = {
  {melody_mario, melody_mario_length},
  {melody_tetris, melody_tetris_length},
  // ...
};

Playback Timing

The player uses the same tick‑based timing as all other modules. Each note's start time is accumulated from the previous scheduled start rather than re-read from the clock, which prevents drift:

noteStartMs += noteDurationMs;  // Accumulate, don't use "now" each time

If the system falls behind by >65 ms (e.g. due to heavy processing), it snaps to current time rather than trying to catch up.
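One possible reading of the accumulate‑then‑snap rule is sketched below. The NoteClock type is hypothetical, and the exact point at which the project compares against the 65 ms threshold is an assumption.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kMaxLagMs = 65;   // snap threshold from the text

struct NoteClock {
    uint32_t noteStartMs = 0;   // scheduled start of the current note

    // Schedule the next note. Normally the start time accumulates the
    // nominal duration, so per-note polling error never adds up. If playback
    // has fallen too far behind, resynchronise to the current time instead.
    // Returns the start time of the next note.
    uint32_t advance(uint32_t nowMs, uint16_t noteDurationMs) {
        assert(noteDurationMs > 0);
        noteStartMs += noteDurationMs;
        if (nowMs > noteStartMs && nowMs - noteStartMs > kMaxLagMs) {
            noteStartMs = nowMs;
        }
        return noteStartMs;
    }
};
```

If each 250 ms note is noticed about 15 ms late, the scheduled starts stay at 250, 500, 750 and so on, so the lateness never accumulates. Only a stall longer than the threshold resets the schedule.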

Same Path as GUI

Melody notes use Sound::noteOn(freq) / Sound::noteOff(freq) — the exact same code path as GUI‑triggered MIDI notes. Previous implementations used playTimed() which caused double instrument‑offset application and voice pile‑up. The fix tracks lastMelodyFreq and releases the previous note before starting a new one.

13. Serial Protocol

Design Rationale

At 2 Mbaud, the protocol must be byte‑efficient. A single note‑on is 1 byte. ADSR configuration is 11 bytes (command + 10 data). Text‑based protocols (e.g. "NOTE:ON:440\n") would be 12+ bytes and require parsing.

Byte Map

System Commands (0x00–0x14):

Byte	Command	Data Bytes
0x00	STOP_ALL	0
0x01	REC_START	0
0x02	REC_STOP	0
0x03	REC_PLAY	0
0x04	REC_CLEAR	0
0x05	SET_ADSR	10 (see below)
0x07	SET_INST	1 (instrument ID 0–7)
0x08	PLAY_MELODY	1 (melody ID 0–12)
0x09	SET_WAVEFORM	1 (waveform ID 0–14)

Note Commands:

Range	Meaning	Mapping
0x15–0x7F (21–127)	NOTE_OFF	MIDI note = byte value
0x95–0xFF (149–255)	NOTE_ON	MIDI note = byte − 128

The gap (0x80–0x94) is unused. MIDI note 60 = Middle C.

ADSR Data Format (10 bytes after 0x05):

[A_hi, A_lo, D_hi, D_lo, S, SL_hi, SL_lo, R_hi, R_lo, Oct]

16‑bit values are transmitted big‑endian. Oct is a signed int8 (octave shift in semitones).

State Machine Parser

Protocol::hasCommand() implements a byte‑by‑byte state machine that can parse commands across multiple calls without blocking:

State::IDLE → receives system command → State::ARG1 or State::ARG_ADSR
State::ARG1 → receives 1 argument byte → command ready
State::ARG_ADSR → receives 10 bytes → command ready
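A byte‑by‑byte parser for the byte map above might look like the following sketch. The class, enum, and helper names are hypothetical rather than the project's Protocol API, and opcodes not listed in the table are treated as zero‑argument commands purely as an assumption.

```cpp
#include <cassert>
#include <cstdint>

enum class CommandType : uint8_t { None, System, NoteOn, NoteOff };

struct Command {
    CommandType type = CommandType::None;
    uint8_t opcode = 0;      // system command byte, or MIDI note for notes
    uint8_t args[10] = {};   // argument bytes for SET_* / PLAY_MELODY
    uint8_t argCount = 0;
};

class ProtocolParser {
public:
    // Feed one received byte. Returns true when command() holds a complete
    // command; partially received commands persist across calls.
    bool feed(uint8_t byte) {
        if (expected_ > 0) {                       // collecting arguments
            assert(current_.argCount < sizeof current_.args);
            current_.args[current_.argCount++] = byte;
            return --expected_ == 0;
        }
        current_ = Command{};
        if (byte >= 0x95) {                        // NOTE_ON: note = byte - 128
            current_.type = CommandType::NoteOn;
            current_.opcode = static_cast<uint8_t>(byte - 128);
            return true;
        }
        if (byte >= 0x15 && byte <= 0x7F) {        // NOTE_OFF: note = byte
            current_.type = CommandType::NoteOff;
            current_.opcode = byte;
            return true;
        }
        if (byte >= 0x80) return false;            // 0x80-0x94: unused gap
        current_.type = CommandType::System;
        current_.opcode = byte;
        expected_ = argBytesFor(byte);
        return expected_ == 0;
    }

    const Command& command() const { return current_; }

private:
    // Unlisted opcodes are assumed to take no arguments.
    static uint8_t argBytesFor(uint8_t opcode) {
        switch (opcode) {
            case 0x05: return 10;                  // SET_ADSR
            case 0x07: case 0x08: case 0x09: return 1;
            default:   return 0;
        }
    }
    Command current_;
    uint8_t expected_ = 0;
};

// Big-endian 16-bit field from the ADSR payload, e.g. attack at offset 0.
uint16_t readU16BE(const uint8_t* p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}
```

While arguments are being collected, every byte is taken as data, so payload values such as 0xC8 are not mistaken for NOTE_ON commands.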

Arduino→GUI Responses

Text‑based responses are sent via Serial.println():

NOTE:ON <freq> — note triggered (for GUI piano key visualisation)
NOTE:OFF <freq> — specific note released (per‑voice, not global)
SYNC:PI <semitones> — pitch pot changed
STOP — stop all acknowledged
PLAYING <id> — melody started

Baud Rate

2 Mbaud (2,000,000 baud) is the maximum stable rate achievable on the test hardware. A high baud rate is critical because Serial.print() blocks whenever the 64‑byte TX buffer is full, and the buffer drains only as fast as the UART can send. At lower baud rates, serial output during updateControl() would take long enough to cause audio glitches. At 2 Mbaud, a 15‑character response takes only ~75 µs.
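The transmit time follows from 10 bits on the wire per byte, assuming standard 8N1 framing (1 start, 8 data, 1 stop). A small helper, with an illustrative name, makes the comparison between baud rates easy:

```cpp
#include <cassert>
#include <cstdint>

// Time on the wire for `bytes` characters with 8N1 framing
// (1 start + 8 data + 1 stop = 10 bits per byte), in microseconds.
uint32_t txTimeUs(uint32_t bytes, uint32_t baud) {
    assert(baud > 0);
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(bytes) * 10u * 1000000u) / baud);
}
```

The same 15‑character response that takes 75 µs at 2 Mbaud would occupy the line for about 15.6 ms at 9600 baud, which is an entire control period at 64 Hz.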

14. Hardware I/O

Buttons

Four buttons on A0–A3 use INPUT_PULLUP and active‑LOW logic. They are polled every updateControl() cycle (64 Hz) with software debouncing. Each button triggers its action once on the initial press, ignoring subsequent reads until released and re‑pressed.

RGB LED Visualiser

The RGB LED provides visual feedback:

Hue derives from the current note's pitch (frequency → colour mapping via HSV→RGB conversion)
Brightness is modulated by the active envelope level (loud = bright, silent = off)
Passive mode — when no notes are playing, a slow rainbow cycle runs
Maximum brightness is capped at 40/255 for USB power safety (5V, 500 mA shared with the entire board)

TM1637 Display

A 4‑digit 7‑segment display shows the currently‑playing note:

Digits 0–1: Note name (e.g. "C ", "Ab")
Digit 3: Octave number (0–9)

Custom segment patterns handle sharps/flats. The display updates at control rate (64 Hz).

15. Python GUI

Architecture

The GUI is a tkinter application split across several modules:

File	Purpose
`piano_gui.py`	Entry point with argument parsing
`gui/app.py`	Main PianoGUI class — all UI setup and event handling
`gui/config.py`	Constants: keybindings, instruments, melodies, waveforms
`gui/serial_comm.py`	Thread‑safe serial communication (ArduinoSerial class)
`gui/components.py`	PianoKey widget (tk.Canvas subclass)
`gui/midi_handler.py`	MIDI file loading and playback (mido library)

Piano Keyboard

61 visual keys (C2–C7) rendered with tk.Canvas widgets. Black keys are overlaid on white keys with place() geometry. The keyboard auto‑scales on window resize (debounced).

QWERTY Bindings

Two‑row piano layout using tkinter keysyms:

Z‑row (lower): C3–B3 white keys, home row (S, D, G, H, J, L, ;) for sharps
Q‑row (upper): C4–G5 white keys
Punctuation keys (comma, period, slash, brackets) mapped by keysym name

Keys are rebindable at runtime by right‑clicking a piano key.

Serial Communication

A background thread reads serial lines and dispatches to on_arduino_message() via root.after() for thread safety. The except Exception: pass pattern was replaced with explicit error logging to prevent silent failures.

Pitch Sync

SYNC:PI <value> messages from Arduino update the "Pitch (Pot A5)" label in real‑time. The label shows the current semitone offset with sign (e.g. "+5 st", "−12 st", "0 st").

Reset

A "Reset" button in the connection bar sends STOP_ALL + SET_WAVEFORM(Piano) + SET_INST(Piano) to the Arduino and resets all GUI controls to defaults.

16. Timing System

The `millis()` Problem

Mozzi reconfigures Timer0, which powers millis(). While millis() technically still works in Mozzi 2.x, it may have reduced accuracy and its use inside audio callbacks is unsafe. More critically, millis() requires 4 bytes of global state protected by interrupt disabling, which adds latency jitter to the audio interrupt.

Tick Counter

A global volatile uint32_t controlTicks is incremented once per updateControl() call (64 Hz). All timing in the project uses this counter:

// Convert ticks to milliseconds
uint32_t ticksToMs(uint32_t ticks) {
  return (ticks * 125) >> 3;  // ticks × 15.625 = ticks × 125 / 8
}

// Convert milliseconds to ticks (fast integer math, no division)
uint16_t msToTicks(uint16_t ms) {
  uint32_t ticks = (ms * 4195UL) >> 16;
  return ticks > 0 ? ticks : 1;
}

The tick counter overflows after 2³² / 64 / 86400 ≈ 776 days of continuous operation.
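The conversions can be checked on a host. One detail follows directly from the arithmetic: the 32‑bit product ticks × 125 inside ticksToMs wraps once ticks exceeds 2³² / 125, so if the function is ever given an absolute tick count (rather than a short interval) it becomes unreliable after roughly 6.2 days, well before the counter itself overflows. In the sketch below, kMaxSafeTicks and ticksToDays are illustrative additions.

```cpp
#include <cassert>
#include <cstdint>

// Largest tick count for which the 32-bit product ticks * 125 cannot wrap.
constexpr uint32_t kMaxSafeTicks = UINT32_MAX / 125u;

// ticks * 15.625 ms, as in the text; valid only up to kMaxSafeTicks.
uint32_t ticksToMs(uint32_t ticks) {
    assert(ticks <= kMaxSafeTicks);
    return (ticks * 125u) >> 3;
}

// ms * 64 / 1000 approximated as ms * 4195 / 65536, minimum one tick.
uint16_t msToTicks(uint16_t ms) {
    const uint32_t ticks = (static_cast<uint32_t>(ms) * 4195u) >> 16;
    return ticks > 0 ? static_cast<uint16_t>(ticks) : 1;
}

// Days of continuous running represented by a tick count at 64 Hz.
double ticksToDays(uint32_t ticks) {
    return ticks / 64.0 / 86400.0;
}
```

Converting intervals (differences between two tick values) keeps the inputs small and avoids the issue entirely.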

Timing Resolution

At 64 Hz, the timing resolution is 15.625 ms. This is sufficient for musical timing (a semiquaver at 200 BPM = 75 ms), but means very short notes (<15 ms) may be quantised up to 1 tick minimum.

17. Performance Optimisations

Summary Table

#	Optimisation	Where	Saving
1	Tick‑based timing	sound.cpp, player.cpp, recorder.cpp	Eliminates `millis()` overhead and Timer0 conflicts
2	Active voice bitmask	sound.cpp audioUpdate	Skips inactive voices without struct access
3	Melody lookup table	player.cpp	O(1) melody selection instead of 13‑case switch
4	Bit‑shift octave	sound.cpp applyInstrument	2 cycles vs 200+ for division
5	Cached octave offset	sound.cpp loadProfile	Division by 12 done once on instrument change
6	Voice free stack	sound.cpp allocateVoice	O(1) allocation instead of O(n) scan
7	Fixed‑point transpose	sound.cpp applyPitchTranspose	~10 cycles vs ~2000 for `pow()`
8	int8×uint8 multiply	sound.cpp audioUpdate	Uses the AVR hardware multiplier (MULSU for signed×unsigned, ~2 cycles) instead of an int32 multiply (~40 cycles)
9	PROGMEM tables	throughout	Wavetables and melodies in Flash, not RAM
10	Non‑blocking EEPROM	recorder.cpp	1 byte per cycle, zero audio impact
11	Lightweight NOTE sync	player.cpp	Manual utoa + single Serial.write() vs snprintf / Serial.print() chain
12	Integer IIR gain smoothing	sound.cpp	Replaces Smooth<> library (saves ~1.5 KB flash)
13	Bitmask modulo	recorder.cpp	`& (SIZE-1)` replaces `% SIZE` for power‑of‑2 ring buffer
14	Bitshift map()	main.cpp	`(raw * 49) >> 10` replaces `map()` (avoids division)
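Two of the techniques in the table (rows 2 and 13) can be illustrated with a short sketch. The function names `mixActive()` and `ringNext()` and the constants are assumptions for illustration only; the real `audioUpdate` also applies envelopes and gain, which are omitted here.

```cpp
#include <stdint.h>

const uint8_t NUM_VOICES = 8;            // matches the 8-voice engine
const uint8_t RING_SIZE  = 32;           // power of two, so a mask can replace modulo
const uint8_t RING_MASK  = RING_SIZE - 1;

// Sum per-voice samples, visiting only voices whose bit is set in activeMask.
// The loop ends as soon as no active bits remain.
int16_t mixActive(uint8_t activeMask, const int8_t samples[NUM_VOICES]) {
  int16_t sum = 0;
  for (uint8_t v = 0; activeMask != 0; ++v, activeMask >>= 1) {
    if (activeMask & 1) sum += samples[v];
  }
  return sum;
}

// Advance a ring-buffer index with `& (SIZE-1)` instead of `% SIZE`.
uint8_t ringNext(uint8_t index) {
  return (uint8_t)((index + 1) & RING_MASK);
}
```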

Build Flags

build_flags = -O3 -flto -DMOZZI_ANALOG_READ_RESOLUTION=10 -DMOZZI_AUDIO_MODE=MOZZI_OUTPUT_2PIN_PWM -DMOZZI_AUDIO_RATE=16384

-O3: Aggressive compiler optimisations (inlining, loop unrolling)
-flto: Link‑Time Optimisation — allows cross‑file inlining and dead code elimination
-DMOZZI_ANALOG_READ_RESOLUTION=10: Keeps standard 10‑bit ADC resolution
-DMOZZI_AUDIO_MODE=MOZZI_OUTPUT_2PIN_PWM: 14‑bit HIFI output on pins 9+10
-DMOZZI_AUDIO_RATE=16384: 16,384 Hz sample rate (the optional 32,768 Hz rate would starve the CPU with 8 voices)

18. Challenges & Solutions

Challenge 1: Audio Glitches from EEPROM Writes

Problem: EEPROM writes block the CPU for 3.3 ms, causing 54 missed audio samples (audible click/pop).

Solution: Non‑blocking trickle write — buffer notes in RAM, write 1 byte per control cycle using raw register access. The EEPROM hardware signals readiness via eeprom_is_ready().
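The trickle-write idea can be sketched as a small queue that starts at most one byte write per control cycle and never waits for the hardware. This is an illustration under stated assumptions, not the project's recorder code: `FakeEeprom` is a host-side stand-in so the logic runs anywhere, and `TrickleWriter` and its queue size are hypothetical. On the AVR, `ready()` would correspond to `eeprom_is_ready()` and `startWrite()` to the register sequence that begins a single-byte write.

```cpp
#include <stdint.h>

// Host-side stand-in for the EEPROM hardware (assumption for illustration).
struct FakeEeprom {
  uint8_t cells[1024] = {0};
  bool busy = false;
  bool ready() const { return !busy; }
  void startWrite(uint16_t addr, uint8_t value) { cells[addr] = value; busy = true; }
  void finishWrite() { busy = false; }   // the real hardware clears this itself
};

const uint8_t QUEUE_SIZE = 32;           // power of two (hypothetical size)

struct TrickleWriter {
  uint8_t  data[QUEUE_SIZE];
  uint16_t addr[QUEUE_SIZE];
  uint8_t  head = 0, tail = 0;

  // Queue one byte for later writing; returns false if the queue is full.
  bool push(uint16_t a, uint8_t v) {
    uint8_t next = (head + 1) & (QUEUE_SIZE - 1);
    if (next == tail) return false;
    addr[head] = a;
    data[head] = v;
    head = next;
    return true;
  }

  // Call once per control cycle: starts at most one byte write, never blocks.
  bool service(FakeEeprom &ee) {
    if (tail == head || !ee.ready()) return false;
    ee.startWrite(addr[tail], data[tail]);
    tail = (tail + 1) & (QUEUE_SIZE - 1);
    return true;
  }
};
```

If the EEPROM is still busy when `service()` is called, the byte simply stays queued until a later cycle, so the audio interrupt is never held up by a write.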

Challenge 2: Mozzi Breaks `digitalRead()` on Analog Pins

Problem: startMozzi() sets the DIDR0 register to disable digital input buffers on A0–A5 for ADC noise reduction. This permanently breaks digitalRead() on button pins A0–A3.

Solution: Manually re‑enable digital input buffers after startMozzi():

DIDR0 &= ~(0x0F);  // Re-enable bits 0-3 (A0-A3)

Challenge 3: Smooth Filter Destroying Audio

Problem: Applying Smooth<int>(0.95f) to audio samples at 16 kHz created a single‑pole low‑pass filter with a cutoff of about 134 Hz, progressively attenuating everything above it. The result was extremely quiet, muffled sound with crackling.

Explanation: Mozzi's Smooth is an exponential moving average. At audio rate (16 kHz), coefficient 0.95 means the cutoff frequency is: fc = -(sampleRate × ln(coeff)) / (2π) = -(16384 × ln(0.95)) / (2π) ≈ 134 Hz.

Solution: Never apply Smooth to audio samples. Apply it only to control signals (gain, parameters) at control rate (64 Hz), where 0.8f gives a cutoff of about 2.3 Hz by the same formula, which is appropriate for smooth parameter transitions.
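The cutoff formula from the explanation above can be evaluated directly. The helper name `emaCutoffHz()` is an assumption for illustration; it implements only the approximation fc = -fs × ln(a) / (2π) quoted in this section.

```cpp
#include <cmath>

// Approximate cutoff of an exponential moving average with smoothness `a`,
// updated `updateRateHz` times per second: fc = -fs * ln(a) / (2 * pi).
double emaCutoffHz(double updateRateHz, double smoothness) {
  const double kPi = 3.14159265358979323846;
  return -updateRateHz * std::log(smoothness) / (2.0 * kPi);
}
```

Calling `emaCutoffHz(16384.0, 0.95)` reproduces the roughly 134 Hz figure for the audio-rate case; the same function can be used to check any control-rate setting before committing to a coefficient.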

Challenge 4: Voice Pile‑Up in Melodies

Problem: The melody player called noteOn() for each new note without releasing the previous one. Over time, all 8 voices filled up, forcing voice stealing on every subsequent note. This caused notes to steal from each other in rapid succession, creating timing drift and slowdown.

Solution: Track lastMelodyFreq and call Sound::noteOff(lastMelodyFreq) before each new note, which is the same code path as GUI MIDI notes. A duplicate applyInstrument() call, which had been doubling the octave shift, was also removed.

Challenge 5: Pot Jitter

Problem: The ADC on pin A5 has inherent noise (±2–5 LSB). With 1024 ADC values mapped to 49 semitone positions (about 21 ADC units per semitone), even 1 LSB of noise can flip the pitch between two semitones during sustain whenever the pot rests near a boundary.

Solution: A raw deadband of ±20 ADC units. The pot value must change by at least 20 (out of 1024) before the system registers a new position. This eliminates noise‑induced jitter at the cost of very slight positional hysteresis.
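A deadband of this kind can be sketched as follows. `PotFilter` is a hypothetical name, the mid-scale starting value is an assumption, and the `- 24` offset assumes the 49 positions are centred on zero (a −24..+24 semitone range); the `(raw * 49) >> 10` mapping is the one listed in the optimisation table.

```cpp
#include <stdint.h>
#include <stdlib.h>

const int16_t POT_DEADBAND = 20;   // ADC units, as described above

struct PotFilter {
  int16_t accepted = 512;          // last accepted raw reading (assumed start value)

  // Returns true and stores the reading only if it moved by at least the
  // deadband; smaller movements are treated as ADC noise and ignored.
  bool update(int16_t raw) {
    if (abs(raw - accepted) < POT_DEADBAND) return false;
    accepted = raw;
    return true;
  }

  // Map 0..1023 onto 49 positions (0..48) without division, then centre on
  // zero. The -24 offset is an assumption for a -24..+24 semitone range.
  int8_t semitones() const {
    return (int8_t)(((uint16_t)accepted * 49u) >> 10) - 24;
  }
};
```

Because the comparison is made against the last accepted value rather than the previous sample, slow drift still registers once it accumulates to 20 units, while sample-to-sample noise does not.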

Challenge 6: Division on AVR

Problem: The ATmega328P has no hardware divide instruction. Division by 12 (for octave calculation) takes ~200–400 cycles via software emulation.

Solution: Precompute the result. Octave offset = octaveShift / 12 is calculated once at instrument‑load time and cached. In the note‑on hot path, only bit‑shifts are used.
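The caching pattern can be sketched like this. The names `loadInstrumentShift()` and `applyOctaveShift()` are hypothetical and only illustrate the split between a one-off division and a shift-only hot path; the sketch assumes the shift is given in semitones and that only whole octaves are applied (integer division truncates any remainder).

```cpp
#include <stdint.h>

// Cached whole-octave offset; the division by 12 happens only at load time.
static int8_t cachedOctaves = 0;

// Called once when an instrument is loaded (hypothetical name).
void loadInstrumentShift(int8_t semitoneShift) {
  cachedOctaves = semitoneShift / 12;   // e.g. -12 -> -1, 24 -> 2
}

// Note-on hot path: shift the frequency by whole octaves using bit shifts only.
uint16_t applyOctaveShift(uint16_t freqHz) {
  if (cachedOctaves >= 0) return (uint16_t)(freqHz << cachedOctaves);
  return (uint16_t)(freqHz >> (-cachedOctaves));
}
```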

Challenge 7: Serial Blocking Audio

Problem: At 9600 baud (default), sending "NOTE:ON:440\n" (12 chars) takes ~12.5 ms — nearly an entire control cycle. This delays the audio loop and causes glitches.

Solution: 2 Mbaud serial. The same 12 characters take ~60 µs. Additionally, the binary protocol reduces most commands to 1–2 bytes.
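The transmission times quoted above follow from simple arithmetic, shown here as a small helper. `serialTimeUs()` is a hypothetical name, and the calculation assumes standard 8N1 framing (10 bits on the wire per byte).

```cpp
#include <stdint.h>

// Time on the wire for `bytes` bytes at `baud`, in microseconds,
// assuming 8N1 framing (1 start + 8 data + 1 stop = 10 bits per byte).
uint32_t serialTimeUs(uint32_t bytes, uint32_t baud) {
  return (uint32_t)((uint64_t)bytes * 10u * 1000000u / baud);
}
```

For the 12-character message this gives 12,500 µs at 9600 baud and 60 µs at 2 Mbaud, matching the figures in the text.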

Challenge 8: RAM Pressure

Problem: 8 Voice structs + Mozzi internals + ring buffers + stack consume nearly all 2048 bytes of RAM. Any new feature risks stack overflow.

Solution: Aggressive use of PROGMEM for all constant data (wavetables, melodies, instrument profiles, melody lookup table, string literals). F() macro for all Serial.print() strings. Current usage: 92.4% (1892/2048 bytes) — only 156 bytes free.

Challenge 9: Silent Exception Swallowing

Problem: The serial read loop used except Exception: pass, silently swallowing every error including AttributeError in the message handler. This caused the pitch sync handler to never execute.

Solution: Split the try/except — serial I/O errors are caught separately from callback errors. Callback errors are printed to console for debugging.

19. Limitations

Hardware Limitations

Limitation	Impact	Mitigation
No DAC	Audio is PWM only. Without filtering, raw output sounds like a buzzer.	HIFI 2‑pin PWM + external 2‑pole RC filter + PN2222 amplifier.
14‑bit HIFI PWM	~84 dB dynamic range. Much better than 8‑bit (~48 dB) but still not CD quality.	Adequate for the piezo buzzer’s output; the speaker is the bottleneck, not the PWM resolution.
Timer1 + Timer2 taken	HIFI mode claims both timers. Pins 9, 10 (Timer1) and 3, 11 (Timer2) lose `analogWrite()`.	Pin 11 (TM1637 CLK) uses bit‑bang protocol, unaffected.
~16 kHz sample rate	Nyquist limit = ~8 kHz. No audio content above 8 kHz. Higher harmonics of waveforms are lost.	Acceptable for "chiptune" aesthetic.
2 KB RAM	92%+ usage. Very little room for new features. Any RAM leak causes stack overflow and crashes.	PROGMEM for all constants. Minimise stack depth.
1 KB EEPROM	~170 notes max. No way to record long pieces.	External storage (SD card) would need SPI pins + library RAM.
16 MHz clock	CPU budget is tight. 8 voices × wavetable + envelope at 16 kHz ≈ 50% CPU.	Careful optimisation of audio loop.
Single core	No parallel processing. Audio interrupt preempts control code.	Non‑blocking design throughout.

Software Limitations

Limitation	Impact
Monophonic recording	Recorder captures one note at a time. Chords are not recorded.
No velocity sensitivity	All notes play at full volume (envelope attack peak = 255).
No polyphonic aftertouch	Envelope is set-and-forget per voice.
No reverb/delay/chorus	DSP effects require RAM buffers (e.g. 256+ samples for delay). Infeasible at 2 KB.
Timer0 compromised	`millis()` accuracy is reduced. `delay()` should never be used.
Timing quantisation	64 Hz control rate = 15.6 ms resolution. Notes shorter than this are rounded up.

The "Buzzer" Quality

From worst to best, the output quality hierarchy is:

Raw pin → passive buzzer — Harsh, tinny, distorted. The PWM carrier dominates.
HIFI 2‑pin + RC filter → passive buzzer — Dramatically better. The 125 kHz carrier is inaudible. 14‑bit resolution gives cleaner dynamics.
HIFI + RC filter → small amplifier → speaker — Best achievable quality. Still "lo‑fi" by modern standards but pleasant for chiptune/retro music.

20. Sound Quality Notes

Not all waveform/envelope combinations sound equally good through the piezo buzzer output chain. The 14‑bit HIFI mode greatly improves dynamic range over standard 8‑bit PWM, and the 125 kHz carrier is inaudible (vs the old ~16 kHz carrier which added harsh high‑frequency noise). However, the limited bandwidth (~8 kHz Nyquist) and piezo resonance characteristics still favour certain sounds.

Waveforms That Sound Good

Waveform	Notes
Piano (14)	Best all‑round waveform. Clean, recognisable timbre. Default for a reason.
Violin (6)	Rich harmonic content that translates well even through the buzzer.
Square (3)	Classic chiptune sound. The piezo buzzer is naturally suited to square‑wave‑like signals.
Clarinet (5)	Odd‑harmonic‑heavy spectrum (similar to square) — works well with the limited bandwidth.

Envelopes That Sound Good

Envelope	Notes
Piano (0)	Short attack, moderate decay/release. The go‑to preset.
Staccato (2)	Very short, punchy notes. Great for fast melodies and rhythm.

Everything Else

The remaining waveforms (Triangle, Sawtooth, Sine, EP, FM Synth, Guitar, Cello, Flute, NES Pulse, Oboe, Osc Chip) and envelopes (Organ, Pad, Flute, Bell, Bass) are functional but sound mediocre to poor through the buzzer output. The 8‑bit wavetable samples and the piezo frequency response don't do them justice. They might sound better with proper filtering and a real speaker, but through the current hardware they are underwhelming.

Recommended default: Piano waveform + Piano envelope.

21. Resource Usage

Flash (Program Memory)

Component	Approximate Size
Wavetables (15 × 512 bytes)	~7.5 KB
Melodies (13 PROGMEM arrays)	~3 KB
Mozzi library code (incl. HIFI)	~3.5 KB
Sound engine	~2.5 KB
Protocol + recorder + player	~2 KB
Arduino framework	~2 KB
Other modules	~1 KB
Total	~21.9 KB / 32 KB (68%)

RAM (SRAM)

Component	Approximate Size
8 Voice structs	~400 bytes
Mozzi internals (buffers, state)	~600 bytes
EEPROM ring buffer (32 × 6 bytes)	192 bytes
Stack, globals, strings	~700 bytes
Total	~1892 / 2048 bytes (92.4%)

EEPROM

Usage	Size
Recording count (header)	2 bytes
Note data	Up to 1020 bytes (170 notes × 6 bytes); the last 2 bytes are unused
Total	1024 bytes
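The capacity figure can be derived from the sizes in the table. The sketch below assumes the 2-byte header sits at address 0 and the 6-byte note records follow it contiguously; the constant and function names are hypothetical.

```cpp
#include <stdint.h>

const uint16_t EEPROM_SIZE  = 1024;  // ATmega328P EEPROM, in bytes
const uint16_t HEADER_BYTES = 2;     // recording count
const uint16_t NOTE_BYTES   = 6;     // one recorded note

// Whole records that fit after the header (integer division drops the remainder).
const uint16_t MAX_NOTES = (EEPROM_SIZE - HEADER_BYTES) / NOTE_BYTES;

// Byte address of note `index` (0-based), assuming records follow the header.
uint16_t noteAddress(uint16_t index) {
  return HEADER_BYTES + index * NOTE_BYTES;
}
```

`MAX_NOTES` evaluates to 170, and the last record starts at address 1016.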

22. File Reference

File	Lines	Description
`src/main.cpp`	~200	Mozzi integration, `updateControl()`/`updateAudio()`, pot reading, command dispatch
`src/sound.cpp`	~680	Voice management, wavetable selection, ADSR, pitch transpose, audio mixing
`src/recorder.cpp`	~380	Ring‑buffered EEPROM recording, trickle write, playback state machine
`src/player.cpp`	~180	PROGMEM melody playback with tick‑based timing
`src/protocol.cpp`	~160	Binary serial parser, state machine, response helpers
`src/buttons.cpp`	~60	Debounced hardware button polling
`src/leds.cpp`	~100	RGB visualiser with HSV mapping, status LEDs
`src/display.cpp`	~130	TM1637 note display with custom segment patterns
`include/config.h`	~70	Pin definitions, timing constants, serial config
`include/sound.h`	~150	Sound module API declarations and EnvelopeProfile struct
`include/protocol.h`	~80	Command byte map and CmdType enum
`include/recorder.h`	~70	RecordedNote struct and Recorder API
`include/melodies.h`	~600+	13 melody note arrays in PROGMEM
`tools/gui/app.py`	~800	Main GUI class with all UI and event handling
`tools/gui/config.py`	~100	Key bindings, instrument presets, waveform list
`tools/gui/serial_comm.py`	~70	Thread‑safe Arduino serial communication
`tools/gui/components.py`	~60	PianoKey canvas widget
`tools/gui/midi_handler.py`	~200	MIDI file loading and playback

Last updated: 2026‑04‑02
