Buzzer – Comprehensive Technical Documentation

An extremely verbose companion to README.md. This document covers every subsystem, design decision, challenge, limitation, and implementation detail of the Buzzer project, an 8‑voice polyphonic wavetable synthesizer built on an Arduino Uno.


Table of Contents

  1. Project Overview
  2. Hardware Architecture
  3. How Mozzi Works
  4. Full Audio Signal Path
  5. Sound Engine Deep Dive
  6. Envelope System (ADSR)
  7. Pitch Transpose System
  8. Voice Allocation & Stealing
  9. Gain Staging & Anti‑Clipping
  10. Wavetables
  11. Recording System
  12. Melody Player
  13. Serial Protocol
  14. Hardware I/O
  15. Python GUI
  16. Timing System
  17. Performance Optimisations
  18. Challenges & Solutions
  19. Limitations
  20. Sound Quality Notes
  21. Resource Usage
  22. File Reference

1. Project Overview

Buzzer is a polyphonic synthesizer that runs entirely on an Arduino Uno (ATmega328P) — a microcontroller with 16 MHz clock, 2 KB of SRAM, 32 KB of Flash, and 1 KB of EEPROM. Despite these severe constraints, it achieves:

  • 8 simultaneous voices with independent ADSR envelopes
  • 15 selectable waveforms from 512‑sample wavetables
  • Real‑time pitch transposition via hardware potentiometer
  • EEPROM‑based recording and playback
  • 13 built‑in melodies stored in PROGMEM
  • A full Python tkinter GUI communicating over 2 Mbaud serial

The audio engine is Mozzi, an open‑source library that generates audio on AVR microcontrollers using high‑frequency PWM. In HIFI mode (MOZZI_OUTPUT_2PIN_PWM), Mozzi splits 14‑bit audio across two pins (9 and 10), each carrying 7 bits, combined via a resistor network. The 125 kHz PWM carrier is well above audible range, and a 2‑pole RC filter further attenuates it by ~36 dB. Mozzi handles timer configuration, interrupt‑driven sample output, and provides building blocks like oscillators, envelopes, and filters.


2. Hardware Architecture

Pin Assignment

| Pin | Function | Type | Details |
|-----|----------|------|---------|
| 9 | Mozzi audio output (high bits) | PWM (Timer1) | Reserved; cannot be used for anything else. Mozzi drives this pin with the high 7 bits of 14‑bit HIFI audio at 16,384 Hz sample rate. |
| 10 | Mozzi audio output (low bits) | PWM (Timer1) | Reserved; cannot be used for anything else. Carries the low 7 bits. Combined with pin 9 via HIFI resistor network. |
| A0 | Record button | Digital input (INPUT_PULLUP) | Active LOW. Starts EEPROM recording. |
| A1 | Stop button | Digital input (INPUT_PULLUP) | Active LOW. Stops recording or playback. |
| A2 | Play button | Digital input (INPUT_PULLUP) | Active LOW. Plays back EEPROM recording. |
| A3 | Clear button | Digital input (INPUT_PULLUP) | Active LOW. Clears EEPROM recording data. |
| A5 | Pitch potentiometer | Analog input | 10 kΩ linear pot. Read via mozziAnalogRead(). Maps 0–1023 → ±24 semitones. |
| 3 | RGB LED (Red) | PWM output | Common cathode RGB. Brightness 0–40 (USB power safety). |
| 5 | RGB LED (Green) | PWM output | Colour derived from pitch (hue mapping). |
| 6 | RGB LED (Blue) | PWM output | Brightness modulated by envelope level. |
| 4 | Feedback LED | Digital output | Flashes briefly on button press (~150 ms). |
| 7 | Record status LED | Digital output | Steady ON while recording is active. |
| 8 | Play status LED | Digital output | Steady ON during playback. |
| 11 | TM1637 CLK | Digital output | 7‑segment display clock line. |
| 12 | TM1637 DIO | Digital output | 7‑segment display data line. Shows note name + octave (e.g. "C 4"). |

Critical Hardware Interaction: Mozzi vs Buttons

When startMozzi() is called, Mozzi writes to the DIDR0 register to disable digital input buffers on all analog pins (A0–A5). This is an ADC noise reduction feature. However, the four buttons are wired to A0–A3 and require digitalRead(). The fix:

// After startMozzi():
DIDR0 &= ~(0x0F);  // Re-enable digital input buffers for A0-A3

Without this, all four buttons would read as LOW permanently.

Audio Output Circuit

The Arduino Uno has no DAC (Digital‑to‑Analog Converter). Mozzi outputs audio as high‑frequency PWM. In HIFI mode (MOZZI_OUTPUT_2PIN_PWM), the 14‑bit audio sample is split across two pins: pin 9 carries the high 7 bits and pin 10 carries the low 7 bits, each at a 125 kHz PWM carrier rate. A resistor network sums them with a 128:1 weighting ratio, and a 2‑pole RC filter attenuates the PWM carrier by ~36 dB.

This provides ~84 dB of dynamic range (14 bits) compared to ~48 dB from standard single‑pin 8‑bit PWM — a massive improvement in audio quality, especially in quiet passages where quantisation noise was previously dominant.
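The quoted figures follow from the usual rule of thumb for an ideal N‑bit quantiser. This is only a sketch of where the numbers come from; it ignores noise added by the filter and amplifier:

```latex
\mathrm{DR} \approx 20\log_{10}\!\left(2^{N}\right) \approx 6.02\,N\ \text{dB}
\quad\Rightarrow\quad
N = 14:\ \approx 84.3\ \text{dB}, \qquad N = 8:\ \approx 48.2\ \text{dB}
```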

The current circuit routes the combined HIFI signal through a 10 kΩ volume potentiometer, then into a PN2222 common‑emitter amplifier driving a passive piezo buzzer.

Signal Chain

flowchart LR
    P9["Pin 9\n(high 7 bits)"] --> HIFI["Stage 1\nHIFI RC Filter"]
    P10["Pin 10\n(low 7 bits)"] --> HIFI
    HIFI --> POT["Stage 2\nVolume Pot 10kΩ"]
    POT --> AMP["Stage 3\nPN2222 Amp"]
    AMP --> BUZ["Passive Buzzer"]

    style P9 fill:#4a9eff,color:#fff
    style P10 fill:#4a9eff,color:#fff
    style HIFI fill:#ff9f43,color:#fff
    style POT fill:#2ed573,color:#fff
    style AMP fill:#e056fd,color:#fff
    style BUZ fill:#1e90ff,color:#fff

HIFI mode uses Timer1 AND Timer2. This means pins 9, 10 (Timer1) and 11 (Timer2) lose analogWrite(). Pin 11 is used for TM1637 CLK via bit‑bang, so is unaffected.

Stage 1: HIFI Resistor Network + 2‑Pole RC Filter

Combines pin 9 (high 7 bits) and pin 10 (low 7 bits) into a single analog signal. Target ratio R_low/R_high ≈ 128. Actual: 500 kΩ / 4 kΩ = 125 (2.3% error, <0.05 bit loss).

flowchart LR
    P9["Pin 9 high 7 bits"]
    P10["Pin 10 low 7 bits"]
    R1["2kΩ"]
    R2["2kΩ"]
    R3["1MΩ"]
    R4["1MΩ"]
    C1["0.01µF ceramic"]
    C2["0.01µF ceramic"]
    A(("Node A to Pot"))
    G1["GND"]
    G2["GND"]

    P9 --- R1 --- C1 --- G1
    R1 --- R2 --- A
    R2 --- C2 --- G2
    P10 --- R3 --- A
    P10 --- R4 --- A

    style P9 fill:#4a9eff,color:#fff
    style P10 fill:#4a9eff,color:#fff
    style A fill:#ff9f43,color:#fff
    style G1 fill:#555,color:#fff
    style G2 fill:#555,color:#fff

Parts: 2× 2 kΩ (series = 4 kΩ), 2× 1 MΩ (parallel = 500 kΩ), 2× 0.01 µF ceramic. Filter: 2‑pole RC, −40 dB/decade rolloff. Attenuates 125 kHz PWM carrier by ~36 dB.

Stage 2: Volume Control (10 kΩ Potentiometer)

Simple voltage divider between the filtered HIFI audio signal and ground.

flowchart LR
    P9["Node A (filtered HIFI signal)"] --> POT_1["Pot Pin 1"]
    POT_1 --- POT_W["Pot Wiper"] --> AMP["To Amp"]
    POT_1 --- POT_3["Pot Pin 3"] --- GND["GND"]

    style P9 fill:#4a9eff,color:#fff
    style POT_1 fill:#2ed573,color:#fff
    style POT_W fill:#2ed573,color:#fff
    style POT_3 fill:#2ed573,color:#fff
    style GND fill:#555,color:#fff
    style AMP fill:#e056fd,color:#fff

Stage 3: PN2222 Common‑Emitter Amplifier

Use a PASSIVE buzzer (no internal oscillator). Active buzzers only beep at one fixed frequency. Passive piezos are non‑polarized.

  • Bias: 10 kΩ / 2 kΩ divider → V_base ≈ 0.83 V, V_emitter ≈ 0.23 V, I_C ≈ 2.3 mA
  • AC Gain: ~90× (100 µF bypass cap shorts R_E for AC, leaving only transistor r_e)
  • R_C (1 kΩ) and Buzzer are in parallel between +5 V and Collector
flowchart TD
    VCC["+5V"]
    WIPER["Pot wiper"]

    R1["10kΩ bias top"]
    R2["2kΩ bias bottom"]
    CIN["10µF coupling cap"]
    RC["1kΩ collector load"]
    BUZ["Passive Buzzer"]
    RE["100Ω emitter"]
    CBYP["100µF bypass cap"]

    QB(("B"))
    QC(("C"))
    QE(("E"))
    QL["PN2222"]

    G1["GND"]
    G2["GND"]
    G3["GND"]

    VCC --- R1 --- QB
    QB --- R2 --- G1
    WIPER --- CIN --- QB
    QB --- QL
    QL --- QC
    QL --- QE
    VCC --- RC --- QC
    VCC --- BUZ --- QC
    QE --- RE --- G2
    RE --- CBYP --- G3

    style VCC fill:#ff4757,color:#fff
    style QL fill:#ffa502,color:#fff
    style QB fill:#ffa502,color:#fff
    style QC fill:#ffa502,color:#fff
    style QE fill:#ffa502,color:#fff
    style WIPER fill:#e056fd,color:#fff
    style BUZ fill:#1e90ff,color:#fff
    style G1 fill:#555,color:#fff
    style G2 fill:#555,color:#fff
    style G3 fill:#555,color:#fff

Parallel connection: Both R_C (1 kΩ) and Buzzer have one leg on +5 V and the other on Collector. R_C provides the DC bias path. The AC voltage swing at the collector drives the piezo.

Electrolytic capacitor polarity (stripe = negative):

| Cap | + leg connects to | − leg connects to |
|-----|-------------------|-------------------|
| 10 µF coupling | Pot wiper (~2.5 V DC) | Base (~0.83 V DC) |
| 100 µF bypass | Emitter (~0.23 V DC) | GND (0 V) |

Rule: + leg always toward higher DC voltage.

Bill of Materials (Audio Circuit)

Part Qty Role
2 kΩ resistor 2 HIFI R_high (series = 4 kΩ)
1 MΩ resistor 2 HIFI R_low (parallel = 500 kΩ)
0.01 µF ceramic cap 2 2‑pole PWM filter
10 kΩ potentiometer 1 Volume control
10 kΩ resistor 1 Bias divider top
2 kΩ resistor 1 Bias divider bottom
1 kΩ resistor 1 Collector load
100 Ω resistor 1 Emitter degeneration
10 µF electrolytic 1 Input coupling
100 µF electrolytic 1 Emitter bypass
PN2222 transistor 1 Amplifier
Passive buzzer 1 Speaker output

3. How Mozzi Works

Mozzi is a real‑time audio synthesis library for Arduino. Understanding its architecture is essential to understanding this project.

Timer Takeover

Mozzi takes control of hardware timers:

  • Timer1: drives the PWM output on pins 9 and 10. In HIFI mode, Timer1 outputs the high 7 bits on pin 9 (OC1A) and the low 7 bits on pin 10 (OC1B), each at a 125 kHz PWM carrier rate.
  • Timer2: claimed by Mozzi in HIFI mode to generate the audio‑rate interrupt (MOZZI_AUDIO_RATE, 16,384 Hz) that loads the next buffered sample into Timer1's compare registers. Not available for user code.
  • Timer0 — Mozzi reconfigures Timer0, which breaks millis(), delay(), and micros(). This is why the Buzzer project uses its own tick‑based timing system instead.

The Two Callbacks

Mozzi operates on a two‑rate system:

  1. updateAudio(): called at ~16,384 Hz (MOZZI_AUDIO_RATE). Must return one audio sample. Mozzi calls it from audioHook() to keep the output buffer full while a timer interrupt drains that buffer, so it must be extremely fast (budget: ~61 µs per call, but realistically <30 µs to leave headroom). No Serial, no analogRead(), no digitalWrite(); only pure math.

  2. updateControl() — called at MOZZI_CONTROL_RATE (64 Hz in this project). This is where "slow" operations happen: reading the potentiometer, updating ADSR envelopes, polling buttons, parsing serial commands. Budget is more generous (~15.6 ms per call) but must still avoid blocking.

The Main Loop

Mozzi requires loop() to contain only audioHook():

void loop() {
  audioHook();
}

audioHook() is a non‑blocking scheduler that manages the timing between updateControl() calls and fills the audio output buffer.

Single Compilation Unit Constraint

<Mozzi.h> (or the older <MozziGuts.h>) can only be #include'd in one .cpp file (main.cpp). Other source files must use <MozziHeadersOnly.h> to access Mozzi types and utilities. This is because Mozzi.h defines ISR handlers and global state that would cause linker errors if included multiple times.

This constraint means:

  • mozziAnalogRead() is only available in main.cpp
  • startMozzi() and audioHook() are only callable from main.cpp
  • sound.cpp uses <MozziHeadersOnly.h> for Oscil, ADSR, Smooth etc.

4. Full Audio Signal Path

Here is the complete journey of a single audio sample from note trigger to speaker output:

Step 1: Note Trigger

A note arrives via serial (GUI/MIDI) or melody player as a frequency in Hz. Sound::noteOn(freq) is called.

Step 2: Instrument Offset

The raw frequency is adjusted by the current instrument's octave shift. The shift is stored as semitones (multiples of 12) and precomputed as a bit‑shift count at instrument‑load time to avoid division in the hot path.

freq = applyInstrument(freq)  // bit-shift by cached octave offset

Step 3: Voice Allocation

A free voice is obtained from the voice free stack in O(1). If all 8 voices are busy, the oldest voice is stolen (linear scan).

Step 4: Pitch Transpose

The hardware pot's current transpose value (±24 semitones) is applied using a PROGMEM lookup table of fixed‑point frequency ratios:

shiftedFreq = (freq * SEMITONE_RATIO[transpose + 24]) >> 8

This avoids floating‑point math (pow(2, semitones/12.0)) which would take ~2000 cycles on AVR.

Step 5: Oscillator Setup

The voice's Oscil<512, MOZZI_AUDIO_RATE> oscillator is configured with the transposed frequency. The oscillator performs phase‑accumulator wavetable lookup — it steps through a 512‑sample wavetable at a rate determined by the frequency.

Step 6: ADSR Envelope

The voice's ADSR<MOZZI_CONTROL_RATE, MOZZI_AUDIO_RATE> envelope is configured with the current instrument's parameters and noteOn() is called. The ADSR progresses through Attack → Decay → Sustain → Release phases, outputting a value 0–255.

Step 7: Sample Generation (updateAudio(), ~16 kHz)

Inside updateAudio(), for each active voice:

int8_t sample = voice.osc.next();      // Wavetable lookup: -128 to +127
uint8_t envVal = voice.env.next();     // ADSR output: 0 to 255
mix += (int16_t)sample * envVal;       // -32640 to +32385, fits in int16_t

Keeping both operands 8 bits wide lets the compiler use the AVR's native 8×8→16 signed‑by‑unsigned multiply (MULSU, 2 cycles), avoiding the expensive 32‑bit multiply that it would otherwise generate.

Step 8: Voice Mixing

All active voices are summed into a single int32_t mix accumulator. Only active voices are processed, identified via an 8‑bit bitmask (activeVoiceMask) — inactive voices are skipped without even checking their struct.

Step 9: Gain Scaling

The raw mix is scaled by a smoothed gain factor:

int16_t gainDelta = (int16_t)targetGain - (int16_t)smoothedGain;
smoothedGain += gainDelta >> 2;
mix = (mix * smoothedGain) >> 8;

targetGain is a voice‑count‑dependent value updated at control rate (64 Hz), boosted ~25% for HIFI's cleaner 14‑bit output:

  • 1 voice: 320 (1.25×)
  • 2 voices: 240
  • 3 voices: 200
  • 4 voices: 160
  • 5–6 voices: 120
  • 7–8 voices: 100

The integer IIR filter (>>2 coefficient, i.e. α = 0.25, which gives a cutoff of roughly 3 Hz at the 64 Hz control rate) ensures gain transitions happen gradually to prevent audible pops when voices start/stop.

Critical lesson learned: An earlier attempt applied Smooth<int>(0.95f) directly to audio samples at 16 kHz. This created a 134 Hz low‑pass filter that destroyed everything above 134 Hz — extreme volume loss and distortion. Smoothing must only be applied to control signals at control rate. The current integer IIR approach eliminates the Smooth library dependency entirely.

Step 10: Hard Clipping

if (mix > 32767) mix = 32767;
else if (mix < -32767) mix = -32767;

Step 11: Mozzi Output

The final sample is returned as MonoOutput::from16Bit(mix), which Mozzi splits into high and low 7‑bit values. The high 7 bits are loaded into Timer1's OCR1A register (pin 9) and the low 7 bits into OCR1B (pin 10).

Step 12: HIFI PWM to Analog

Both pins output PWM at 125 kHz carrier rate. The external HIFI resistor network sums them with a 128:1 weighting (4 kΩ for high bits, 500 kΩ for low bits), reconstructing the 14‑bit analog value. A 2‑pole RC filter (2× 0.01 µF ceramic caps) attenuates the 125 kHz carrier by ~36 dB. The resulting audio signal passes through the volume pot and amplifier to the buzzer.


5. Sound Engine Deep Dive

The sound engine (sound.cpp) manages 8 Voice structs:

struct Voice {
  Oscil<512, MOZZI_AUDIO_RATE> osc;  // Wavetable oscillator
  ADSR<MOZZI_CONTROL_RATE, MOZZI_AUDIO_RATE> env;  // Envelope
  uint16_t freq;              // Current playing frequency (post-transpose)
  uint16_t baseFreq;          // Pre-transpose frequency (for matching noteOff)
  uint16_t timedDurationTicks; // Auto-release countdown (0 = sustained)
  uint32_t timedStartTick;    // When the timed note started
  uint32_t noteOnTick;        // When the note was triggered (for age-based stealing)
  bool active;
  volatile uint8_t currentLevel; // Envelope level for LED visualiser
};

Two Note Modes

  1. Sustained (noteOn / noteOff) — used by GUI keyboard and MIDI. Note plays indefinitely until noteOff(freq) is called. timedDurationTicks is 0.
  2. Timed (playTimed) — used by melody player and EEPROM playback. Note auto‑releases after the specified duration. timedDurationTicks is set, and controlUpdate() checks elapsed ticks.

Note Matching

noteOff(freq) must find the correct voice. Since pitch transpose can change between noteOn and noteOff, matching is done by baseFreq (the pre‑transpose frequency), not the actual playing frequency. The instrument offset is applied to the incoming noteOff frequency before matching.


6. Envelope System (ADSR)

Each voice has an independent ADSR (Attack, Decay, Sustain, Release) envelope generator from Mozzi.

Parameters

| Parameter | Range | Description |
|-----------|-------|-------------|
| Attack | 0–1000 ms | Time to rise from 0 to peak (255) |
| Decay | 0–1000 ms | Time to fall from peak to sustain level |
| Sustain Level | 0–255 | Steady‑state amplitude during hold |
| Sustain Length | 0–10000 ms | How long to hold sustain before auto‑release (0 = hold until noteOff) |
| Release | 0–1000 ms | Time to fade from sustain to silence after noteOff |

8 Presets

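| ID | Name | A | D | S | SL | R | Octave Shift |
|----|------|---|---|---|----|---|--------------|
| 0 | Piano | 5 | 200 | 40 | 5000 | 200 | 0 |
| 1 | Organ | 20 | 0 | 255 | 10000 | 150 | 0 |
| 2 | Staccato | 5 | 50 | 0 | 500 | 20 | 0 |
| 3 | Pad | 150 | 800 | 180 | 10000 | 1000 | 0 |
| 4 | Flute | 80 | 0 | 220 | 10000 | 300 | 0 |
| 5 | Bell | 5 | 400 | 0 | 1000 | 1200 | 0 |
| 6 | Bass | 10 | 150 | 150 | 5000 | 100 | −12 (1 octave down) |
| 7 | Custom | user | user | user | user | user | user |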

Presets are stored in PROGMEM and loaded via memcpy_P on instrument change. The "Custom" preset is stored in RAM and updated live from the GUI's ADSR sliders.


7. Pitch Transpose System

A potentiometer on pin A5 controls real‑time pitch transposition from −24 to +24 semitones (±2 octaves).

Reading the Pot

uint16_t raw = mozziAnalogRead(PIN_POT_PITCH);  // 0-1023

Why mozziAnalogRead instead of analogRead? Mozzi reconfigures the ADC for its own background conversions. Calling analogRead() would conflict with Mozzi's ADC state machine, potentially corrupting both the analog read and Mozzi's audio timing. mozziAnalogRead() cooperates with Mozzi's ADC scheduler.

Deadband Filter

ADC noise causes ±2–5 LSB jitter even with a stable pot. A deadband of ±20 raw units prevents spurious pitch updates:

int16_t diff = (int16_t)raw - (int16_t)lastRawPot;
if (diff > 20 || diff < -20) {
  lastRawPot = raw;
  // ... process new value
}

Semitone Mapping

The 0–1023 range maps linearly to −24…+24 semitones. The semitone value is applied to frequencies using a PROGMEM lookup table of 49 fixed‑point ratios (8.8 format):

ratio[0]  = 64   // -24 semitones = 0.25× (2 octaves down)
ratio[24] = 256  // 0 semitones = 1.0× (unity)
ratio[48] = 1024 // +24 semitones = 4.0× (2 octaves up)

Application: newFreq = (freq * ratio) >> 8

This avoids pow(2.0, semitones/12.0) which would require floating‑point math (~2000 cycles on AVR vs ~10 cycles for integer multiply + shift).
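As a rough illustration of the table lookup, the sketch below builds the 49‑entry ratio table on a desktop compiler and applies it. The helper names (semitoneRatio, potToSemitones, applyTranspose) are hypothetical, and the table is generated at start‑up here only so the example is self‑contained; the project keeps a precomputed table in PROGMEM.

```cpp
#include <cmath>
#include <cstdint>

constexpr int kMaxSemitones = 24;  // table covers -24..+24

// ratio[i] = round(256 * 2^((i - 24) / 12)), i.e. 8.8 fixed point.
// Built lazily here; on the Arduino this would be a PROGMEM constant table.
inline uint16_t semitoneRatio(int semitones) {
  static uint16_t table[2 * kMaxSemitones + 1];
  static bool built = false;
  if (!built) {
    for (int i = 0; i <= 2 * kMaxSemitones; ++i) {
      table[i] = static_cast<uint16_t>(
          std::lround(256.0 * std::pow(2.0, (i - kMaxSemitones) / 12.0)));
    }
    built = true;
  }
  return table[semitones + kMaxSemitones];
}

// Map a 10-bit pot reading (0..1023) to -24..+24 semitones without division.
inline int8_t potToSemitones(uint16_t raw) {
  int index = static_cast<int>((static_cast<uint32_t>(raw) * 49u) >> 10);  // 0..48
  return static_cast<int8_t>(index - kMaxSemitones);
}

// The 32-bit intermediate matters: freq * 1024 does not fit in 16 bits.
inline uint16_t applyTranspose(uint16_t freq, int semitones) {
  uint32_t scaled = static_cast<uint32_t>(freq) * semitoneRatio(semitones);
  return static_cast<uint16_t>(scaled >> 8);
}
```

Octave steps come out exact because their ratios are powers of two; other semitones carry a small rounding error from the 8‑bit fractional part.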

Live Update

When the pot changes, setPitchTranspose() updates all currently‑active voices in real‑time by recalculating their frequencies from baseFreq through the ratio table. This produces a smooth pitch‑bend effect.

GUI Sync

The Arduino sends SYNC:PI <value> over serial whenever the pitch changes. The Python GUI updates its "Pitch (Pot A5)" label accordingly.


8. Voice Allocation & Stealing

O(1) Free Voice Stack

Instead of scanning all 8 voices to find a free one (O(n)), a stack‑based allocator provides O(1) allocation:

uint8_t freeVoiceStack[8];  // Stack of free voice indices
uint8_t freeVoiceCount = 8; // Number of free voices

// Allocate: pop from stack
int8_t vi = freeVoiceStack[--freeVoiceCount];

// Free: push to stack
freeVoiceStack[freeVoiceCount++] = vi;
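The two snippets above omit bounds checks. A minimal sketch of the same stack with those checks added is shown below; the VoiceAllocator type and its method names are assumptions for illustration, not the project's actual interface.

```cpp
#include <cstdint>

constexpr uint8_t kNumVoices = 8;

struct VoiceAllocator {
  uint8_t stack[kNumVoices];   // indices of currently free voices
  uint8_t count = kNumVoices;  // number of entries in use

  VoiceAllocator() {
    for (uint8_t i = 0; i < kNumVoices; ++i) stack[i] = i;
  }

  // Pop a free voice index, or return -1 when the caller must steal a voice.
  int8_t allocate() {
    if (count == 0) return -1;
    return static_cast<int8_t>(stack[--count]);
  }

  // Push a voice back; returns false if the stack is already full (double free).
  bool release(uint8_t vi) {
    if (count == kNumVoices) return false;
    stack[count++] = vi;
    return true;
  }
};
```

Returning -1 on exhaustion gives the caller a clear signal to fall through to the voice‑stealing path.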

Oldest‑Voice Stealing

When all 8 voices are active and a new note arrives, the oldest voice (by noteOnTick) is stolen. The scanner uses the global tick counter to determine age:

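uint8_t oldest = 0;
uint32_t oldestAge = 0;
for (uint8_t i = 0; i < 8; i++) {
  uint32_t age = now - voices[i].noteOnTick;  // unsigned subtraction survives tick wrap
  if (age > oldestAge) { oldestAge = age; oldest = i; }
}
// voices[oldest] is the voice to steal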

Active Voice Bitmask

An 8‑bit bitmask tracks which voices are active:

uint8_t activeVoiceMask = 0;
markVoiceActive(i)   → activeVoiceMask |= (1 << i);
markVoiceInactive(i) → activeVoiceMask &= ~(1 << i);

Benefits:

  • __builtin_popcount(activeVoiceMask) gives the active voice count cheaply (AVR has no hardware popcount instruction, so the compiler emits a short software routine)
  • The audio loop iterates via bit‑shifting the mask, skipping inactive voices without loading their struct
  • activeVoiceMask == 0 provides instant early‑exit from audioUpdate()
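The bitmask bookkeeping and the mask‑driven mixing loop can be sketched as follows. This is an illustrative stand‑alone version: the per‑voice samples and envelope levels are passed in as plain arrays (an assumption made so it compiles off‑target), whereas the real loop pulls them from each voice's oscillator and envelope. __builtin_popcount is a GCC/Clang builtin.

```cpp
#include <cstdint>

inline void markVoiceActive(uint8_t& mask, uint8_t i) {
  mask |= static_cast<uint8_t>(1u << i);
}
inline void markVoiceInactive(uint8_t& mask, uint8_t i) {
  mask &= static_cast<uint8_t>(~(1u << i));
}
inline uint8_t activeVoiceCount(uint8_t mask) {
  return static_cast<uint8_t>(__builtin_popcount(mask));
}

// Sum sample * envelope for every voice whose bit is set. The loop ends as
// soon as no set bits remain, so an empty mask costs a single comparison.
inline int32_t mixActiveVoices(uint8_t mask, const int8_t* samples,
                               const uint8_t* envLevels) {
  int32_t mix = 0;
  for (uint8_t i = 0; mask != 0; ++i, mask >>= 1) {
    if (mask & 1u) {
      mix += static_cast<int16_t>(samples[i]) * envLevels[i];
    }
  }
  return mix;
}
```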

9. Gain Staging & Anti‑Clipping

The Clipping Problem

Each voice produces samples in the range −32,640 to +32,385 (int8 × uint8). With 8 voices, the theoretical maximum sum is ±261,120 — far exceeding the 16‑bit output range of ±32,767.

Voice‑Count‑Based Gain Table

Rather than simple equal division (which sounds too quiet with few voices), a tuned gain table provides musically useful levels:

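| Active Voices | Gain (8.8 fixed point) | Effective Scale |
|---------------|------------------------|-----------------|
| 0–1 | 320 | 1.25× (boosted for HIFI) |
| 2 | 240 | 0.9375× |
| 3 | 200 | 0.78× |
| 4 | 160 | 0.625× |
| 5–6 | 120 | 0.47× |
| 7–8 | 100 | 0.39× |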

Smoothed Transitions

When voices start or stop, the gain target changes. Applying it instantly would cause an audible click (the entire mix suddenly jumping in volume). An integer IIR filter (exponential moving average), running at 64 Hz control rate, ramps the gain smoothly:

int16_t gainDelta = (int16_t)targetGain - (int16_t)smoothedGain;
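smoothedGain += gainDelta >> 2;  // alpha = 1/4, roughly 3 Hz cutoff at the 64 Hz control rate

The >>2 shift moves the gain a quarter of the remaining distance on every tick, so a step change is roughly two‑thirds complete after four ticks (~62 ms). That is gradual enough to prevent pops when voices start/stop.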

HIFI gain boost rationale: With 14‑bit output (vs old 9‑bit standard mode), the DAC resolves 32× more amplitude levels. Signals that were previously lost in quantisation noise are now cleanly rendered. This allows running the gain ~25% hotter without introducing artefacts.

Hard Clipping

After gain scaling, samples exceeding ±32,767 are hard‑clipped. With the gain table's conservative approach, clipping only occurs when many loud voices play simultaneously — and the smooth gain transition prevents sudden clipping artefacts.
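Putting the three stages of this section together, a minimal sketch might look like the code below. The function names (targetGainFor, stepGain, scaleAndClip) are hypothetical; the table values and the shift amounts are the ones quoted above.

```cpp
#include <cstdint>

// Gain table from this section, 8.8 fixed point (256 = unity).
inline uint16_t targetGainFor(uint8_t activeVoices) {
  if (activeVoices <= 1) return 320;
  if (activeVoices == 2) return 240;
  if (activeVoices == 3) return 200;
  if (activeVoices == 4) return 160;
  if (activeVoices <= 6) return 120;
  return 100;
}

// One control-rate step of the integer exponential moving average.
// Assumes arithmetic right shift for negative values (true for GCC/avr-gcc).
// Truncation means a rising gain settles up to 3 counts below its target.
inline uint16_t stepGain(uint16_t smoothed, uint16_t target) {
  int16_t delta = static_cast<int16_t>(target) - static_cast<int16_t>(smoothed);
  return static_cast<uint16_t>(smoothed + (delta >> 2));
}

// Apply the 8.8 gain to the summed voices, then hard-clip to 16-bit range.
inline int16_t scaleAndClip(int32_t mix, uint16_t gain) {
  int32_t out = (mix * static_cast<int32_t>(gain)) >> 8;
  if (out > 32767) out = 32767;
  else if (out < -32767) out = -32767;
  return static_cast<int16_t>(out);
}
```

The residual offset when ramping upward is about 1% of the target gain, which is unlikely to be audible.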


10. Wavetables

All waveforms are 512‑sample wavetables stored in PROGMEM as int8_t arrays (−128 to +127).

Available Waveforms

ID Name Source
0 Triangle Mozzi built‑in
1 Sawtooth Mozzi built‑in
2 Sine Mozzi built‑in
3 Square (no alias) Mozzi built‑in
4 Electric Piano AKWF collection, modified
5 Clarinet AKWF collection
6 Violin AKWF collection
7 FM Synth AKWF collection
8 Guitar AKWF collection
9 Cello AKWF collection
10 Flute AKWF collection
11 NES Pulse (12.5%) AKWF/custom
12 Oboe AKWF collection
13 Osc Chip AKWF collection
14 Piano AKWF collection

AKWF = Adventure Kid Waveforms — a public‑domain collection of single‑cycle waveforms. The originals are 600+ samples; they were resampled to 512 and quantised to 8‑bit signed for Mozzi compatibility.

How Wavetable Synthesis Works

A wavetable oscillator works by stepping through a stored waveform at a variable rate:

  1. A phase accumulator increments by a step size each sample.
  2. The step size is proportional to the desired frequency: step = tableSize × freq / sampleRate.
  3. The accumulator index wraps around at tableSize, producing a periodic waveform.
  4. The sample at the current index is output.

Mozzi's Oscil<512, MOZZI_AUDIO_RATE> handles all of this with fixed‑point arithmetic for sub‑sample accuracy.
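The four steps above can be sketched as a generic phase‑accumulator oscillator. This is not Mozzi's Oscil implementation; the class name, the 16.16 fixed‑point phase format, and the ramp table used as placeholder data are all assumptions chosen to keep the example short and runnable on a desktop compiler.

```cpp
#include <array>
#include <cstdint>

constexpr uint32_t kTableSize = 512;     // samples per cycle (power of two)
constexpr uint32_t kSampleRate = 16384;  // Hz

// Placeholder ramp table for illustration only.
inline std::array<int8_t, kTableSize> makeRampTable() {
  std::array<int8_t, kTableSize> t{};
  for (uint32_t i = 0; i < kTableSize; ++i) {
    t[i] = static_cast<int8_t>(static_cast<int>(i / 2) - 128);
  }
  return t;
}

class WavetableOsc {
 public:
  explicit WavetableOsc(const int8_t* table) : table_(table) {}

  // step = tableSize * freq / sampleRate, stored as 16.16 fixed point.
  void setFreq(uint16_t hz) {
    uint64_t scaled = (static_cast<uint64_t>(hz) * kTableSize) << 16;
    inc_ = static_cast<uint32_t>(scaled / kSampleRate);
  }

  // Output the sample at the current index, then advance the phase.
  // The mask wraps the integer part of the phase at the table size.
  int8_t next() {
    int8_t s = table_[(phase_ >> 16) & (kTableSize - 1)];
    phase_ += inc_;
    return s;
  }

  uint32_t increment() const { return inc_; }

 private:
  const int8_t* table_;
  uint32_t phase_ = 0;
  uint32_t inc_ = 0;
};
```

At 32 Hz the step is exactly one table entry per sample (512 × 32 / 16384 = 1), which makes the wrap‑around easy to check by hand.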


11. Recording System

Architecture

The recorder captures note events to the Arduino Uno's 1 KB EEPROM for offline playback.

Record Format

Each note is stored as a 6‑byte RecordedNote:

struct RecordedNote {
  uint16_t deltaMs;    // Silence before this note (ms since last event)
  uint16_t frequency;  // Hz (0 = rest)
  uint16_t duration;   // How long the note played (ms)
};

Capacity: (1024 − 2 header bytes) / 6 = 170 notes maximum.

The EEPROM Write Problem

EEPROM writes take 3.3 ms each and are blocking — the CPU halts until the write completes. At 16 kHz audio rate, 3.3 ms = 54 missed audio samples, causing a massive audible glitch.

Solution: Non‑Blocking Trickle Write

Notes are buffered in a 32‑entry RAM ring buffer. In each updateControl() call (64 Hz), if the EEPROM hardware is ready (eeprom_is_ready()), exactly one byte is written using raw register access:

EEAR = address;         // Set EEPROM address register
EEDR = data_byte;       // Set EEPROM data register
EECR |= (1 << EEMPE);  // Master Program Enable
EECR |= (1 << EEPE);   // Start Write

Since each note is 6 bytes and writes happen at 64 Hz (one byte per cycle), a single note takes 6/64 = 93.75 ms to fully persist. The ring buffer absorbs bursts of fast notes.
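To make the buffering concrete, here is a desktop‑runnable sketch of the trickle writer with the EEPROM replaced by a plain byte array. The TrickleWriter class, its method names, the little‑endian byte order, and the placement of note data directly after the 2‑byte header are assumptions for illustration; on the Arduino the store into fakeEeprom would be the register sequence shown above.

```cpp
#include <cstdint>

struct RecordedNote {
  uint16_t deltaMs;
  uint16_t frequency;
  uint16_t duration;
};
static_assert(sizeof(RecordedNote) == 6, "record must stay 6 bytes");

constexpr uint8_t kRingSize = 32;  // power of two, so & (size - 1) wraps
constexpr uint16_t kEepromBytes = 1024;
constexpr uint16_t kHeaderBytes = 2;
constexpr uint16_t kMaxNotes = (kEepromBytes - kHeaderBytes) / sizeof(RecordedNote);
static_assert(kMaxNotes == 170, "capacity quoted in the text");

class TrickleWriter {
 public:
  uint8_t fakeEeprom[kEepromBytes] = {};  // stand-in for the real EEPROM

  // Queue a finished note; returns false if the ring buffer is full.
  bool push(const RecordedNote& n) {
    if (count_ == kRingSize) return false;
    ring_[head_] = n;
    head_ = (head_ + 1) & (kRingSize - 1);
    ++count_;
    return true;
  }

  // Call once per control tick: writes at most one byte, never blocks.
  void tick(bool eepromReady) {
    if (!eepromReady || count_ == 0 || addr_ >= kEepromBytes) return;
    const RecordedNote& n = ring_[tail_];
    const uint16_t fields[3] = {n.deltaMs, n.frequency, n.duration};
    uint16_t f = fields[byteIdx_ >> 1];
    fakeEeprom[addr_++] = (byteIdx_ & 1) ? static_cast<uint8_t>(f >> 8)
                                         : static_cast<uint8_t>(f & 0xFF);
    if (++byteIdx_ == sizeof(RecordedNote)) {  // whole note persisted
      byteIdx_ = 0;
      tail_ = (tail_ + 1) & (kRingSize - 1);
      --count_;
    }
  }

  uint8_t pending() const { return count_; }

 private:
  RecordedNote ring_[kRingSize] = {};
  uint8_t head_ = 0, tail_ = 0, count_ = 0, byteIdx_ = 0;
  uint16_t addr_ = kHeaderBytes;
};
```

On real hardware, the datasheet requires EEPE to be set within four clock cycles of EEMPE, so it is worth considering a brief interrupt‑disabled window around those two register writes.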

Write‑on‑Release Design

Events are only written when a note is released (key up). At that point, both the delta (silence before the note started) and the duration (how long it played) are known. This halves the number of EEPROM writes compared to writing on both note‑on and note‑off.

Trailing Silence

When recording stops, a silence marker is written capturing the gap between the last note‑off and the stop button press. Without this, the last note would appear to have zero duration on playback.

Deferred Stop

Recorder::stop() doesn't immediately halt — it enters a "stopping" state and waits for the ring buffer to fully flush to EEPROM before finalising the recording count.


12. Melody Player

Storage

13 melodies are stored in PROGMEM as arrays of Note structs:

struct Note {
  uint16_t frequency;  // Hz (0 = rest)
  uint16_t duration;   // ms
};

O(1) Lookup

A PROGMEM lookup table maps melody IDs to their data pointers and lengths, replacing a 13‑case switch statement:

const MelodyInfo MELODY_TABLE[13] PROGMEM = {
  {melody_mario, melody_mario_length},
  {melody_tetris, melody_tetris_length},
  // ...
};

Playback Timing

The player uses the same tick‑based timing as all other modules. Perfect time accumulation prevents drift:

noteStartMs += noteDurationMs;  // Accumulate, don't use "now" each time

If the system falls behind by >65 ms (e.g. due to heavy processing), it snaps to current time rather than trying to catch up.

Same Path as GUI

Melody notes use Sound::noteOn(freq) / Sound::noteOff(freq) — the exact same code path as GUI‑triggered MIDI notes. Previous implementations used playTimed() which caused double instrument‑offset application and voice pile‑up. The fix tracks lastMelodyFreq and releases the previous note before starting a new one.


13. Serial Protocol

Design Rationale

At 2 Mbaud, the protocol must be byte‑efficient. A single note‑on is 1 byte. ADSR configuration is 11 bytes (command + 10 data). Text‑based protocols (e.g. "NOTE:ON:440\n") would be 12+ bytes and require parsing.

Byte Map

System Commands (0x00–0x14):

| Byte | Command | Data Bytes |
|------|---------|------------|
| 0x00 | STOP_ALL | 0 |
| 0x01 | REC_START | 0 |
| 0x02 | REC_STOP | 0 |
| 0x03 | REC_PLAY | 0 |
| 0x04 | REC_CLEAR | 0 |
| 0x05 | SET_ADSR | 10 (see below) |
| 0x07 | SET_INST | 1 (instrument ID 0–7) |
| 0x08 | PLAY_MELODY | 1 (melody ID 0–12) |
| 0x09 | SET_WAVEFORM | 1 (waveform ID 0–14) |

Note Commands:

| Range | Meaning | Mapping |
|-------|---------|---------|
| 0x15–0x7F (21–127) | NOTE_OFF | MIDI note = byte value |
| 0x95–0xFF (149–255) | NOTE_ON | MIDI note = byte − 128 |

The gap (0x80–0x94) is unused. MIDI note 60 = Middle C.

ADSR Data Format (10 bytes after 0x05):

[A_hi, A_lo, D_hi, D_lo, S, SL_hi, SL_lo, R_hi, R_lo, Oct]

16‑bit values are transmitted big‑endian. Oct is a signed int8 (octave shift in semitones).

State Machine Parser

Protocol::hasCommand() implements a byte‑by‑byte state machine that can parse commands across multiple calls without blocking:

State::IDLE → receives system command → State::ARG1 or State::ARG_ADSR
State::ARG1 → receives 1 argument byte → command ready
State::ARG_ADSR → receives 10 bytes → command ready
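A minimal sketch of such a byte‑at‑a‑time parser, using the opcodes and ranges from the byte map above, could look like this. The Parser and Command types and the helper functions are hypothetical and do not mirror the actual Protocol class; unassigned bytes are simply passed through as zero‑argument commands here.

```cpp
#include <cstdint>

enum class State : uint8_t { IDLE, ARG1, ARG_ADSR };

struct Command {
  uint8_t opcode = 0;
  uint8_t args[10] = {};
  uint8_t argCount = 0;
};

class Parser {
 public:
  // Feed one received byte. Returns true when `out` holds a complete command.
  bool feed(uint8_t b, Command& out) {
    if (state_ == State::IDLE) {
      cur_ = Command{};
      cur_.opcode = b;
      if (b == 0x05) { need_ = 10; state_ = State::ARG_ADSR; return false; }
      if (b == 0x07 || b == 0x08 || b == 0x09) {
        need_ = 1; state_ = State::ARG1; return false;
      }
      out = cur_;  // zero-argument command or single-byte note event
      return true;
    }
    cur_.args[cur_.argCount++] = b;  // collecting argument bytes
    if (cur_.argCount < need_) return false;
    state_ = State::IDLE;
    out = cur_;
    return true;
  }

 private:
  State state_ = State::IDLE;
  Command cur_;
  uint8_t need_ = 0;
};

inline bool isNoteOn(uint8_t b)  { return b >= 0x95; }
inline bool isNoteOff(uint8_t b) { return b >= 0x15 && b <= 0x7F; }
inline uint8_t midiNote(uint8_t b) {
  return isNoteOn(b) ? static_cast<uint8_t>(b - 128) : b;
}
// 16-bit ADSR fields arrive big-endian (high byte first).
inline uint16_t readU16BE(const uint8_t* p) {
  return static_cast<uint16_t>((p[0] << 8) | p[1]);
}
```

Because all state lives in the parser object, a command split across several serial reads resumes where it left off on the next call.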

Arduino→GUI Responses

Text‑based responses are sent via Serial.println():

  • NOTE:ON <freq> — note triggered (for GUI piano key visualisation)
  • NOTE:OFF <freq> — specific note released (per‑voice, not global)
  • SYNC:PI <semitones> — pitch pot changed
  • STOP — stop all acknowledged
  • PLAYING <id> — melody started

Baud Rate

2 Mbaud (2,000,000 baud) is the maximum stable rate achievable on the test hardware. A high baud rate is critical because Serial.print() blocks whenever the TX buffer is full, waiting for bytes to drain. At lower baud rates, serial output during updateControl() could stall long enough to cause audio glitches. At 2 Mbaud, a 15‑character response takes only ~75 µs on the wire.


14. Hardware I/O

Buttons

Four buttons on A0–A3 use INPUT_PULLUP and active‑LOW logic. They are polled every updateControl() cycle (64 Hz) with software debouncing. Each button triggers its action once on the initial press, ignoring subsequent reads until released and re‑pressed.
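One common way to get "fire once per press" behaviour from a polled, active‑LOW input is sketched below. The Button class and the two‑tick stability window (about 31 ms at 64 Hz) are assumptions for illustration; the project's actual debounce interval is not stated here.

```cpp
#include <cstdint>

class Button {
 public:
  // `rawLow` is true while the pin reads LOW (pressed, active-LOW wiring).
  // Returns true exactly once per press, after the level has been stable
  // for kStableTicks consecutive polls following the change.
  bool poll(bool rawLow) {
    if (rawLow != candidate_) {  // level changed: restart the stability count
      candidate_ = rawLow;
      stable_ = 0;
      return false;
    }
    if (stable_ < kStableTicks) ++stable_;
    if (stable_ == kStableTicks && candidate_ != pressed_) {
      pressed_ = candidate_;     // accept the new debounced state
      return pressed_;           // report only the press edge, not the release
    }
    return false;
  }

 private:
  static constexpr uint8_t kStableTicks = 2;
  bool candidate_ = false;
  bool pressed_ = false;
  uint8_t stable_ = 0;
};
```

In use, each control tick would call something like `recordBtn.poll(digitalRead(A0) == LOW)` and trigger the action when it returns true.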

RGB LED Visualiser

The RGB LED provides visual feedback:

  • Hue derives from the current note's pitch (frequency → colour mapping via HSV→RGB conversion)
  • Brightness is modulated by the active envelope level (loud = bright, silent = off)
  • Passive mode — when no notes are playing, a slow rainbow cycle runs
  • Maximum brightness is capped at 40/255 for USB power safety (5V, 500 mA shared with the entire board)

TM1637 Display

A 4‑digit 7‑segment display shows the currently‑playing note:

  • Digits 0–1: Note name (e.g. "C ", "Ab")
  • Digit 3: Octave number (0–9)

Custom segment patterns handle sharps/flats. The display updates at control rate (64 Hz).


15. Python GUI

Architecture

The GUI is a tkinter application split across several modules:

File Purpose
piano_gui.py Entry point with argument parsing
gui/app.py Main PianoGUI class — all UI setup and event handling
gui/config.py Constants: keybindings, instruments, melodies, waveforms
gui/serial_comm.py Thread‑safe serial communication (ArduinoSerial class)
gui/components.py PianoKey widget (tk.Canvas subclass)
gui/midi_handler.py MIDI file loading and playback (mido library)

Piano Keyboard

61 visual keys (C2–C7) rendered with tk.Canvas widgets. Black keys are overlaid on white keys with place() geometry. The keyboard auto‑scales on window resize (debounced).

QWERTY Bindings

Two‑row piano layout using tkinter keysyms:

  • Z‑row (lower): C3–B3 white keys, home row (S, D, G, H, J, L, ;) for sharps
  • Q‑row (upper): C4–G5 white keys
  • Punctuation keys (comma, period, slash, brackets) mapped by keysym name

Keys are rebindable at runtime by right‑clicking a piano key.

Serial Communication

A background thread reads serial lines and dispatches to on_arduino_message() via root.after() for thread safety. The except Exception: pass pattern was replaced with explicit error logging to prevent silent failures.

Pitch Sync

SYNC:PI <value> messages from Arduino update the "Pitch (Pot A5)" label in real‑time. The label shows the current semitone offset with sign (e.g. "+5 st", "−12 st", "0 st").

Reset

A "Reset" button in the connection bar sends STOP_ALL + SET_WAVEFORM(Piano) + SET_INST(Piano) to the Arduino and resets all GUI controls to defaults.


16. Timing System

The millis() Problem

Mozzi reconfigures Timer0, which powers millis(). While millis() technically still works in Mozzi 2.x, it may have reduced accuracy and its use inside audio callbacks is unsafe. More critically, millis() requires 4 bytes of global state protected by interrupt disabling, which adds latency jitter to the audio interrupt.

Tick Counter

A global volatile uint32_t controlTicks is incremented once per updateControl() call (64 Hz). All timing in the project uses this counter:

// Convert ticks to milliseconds
uint32_t ticksToMs(uint32_t ticks) {
  return (ticks * 125) >> 3;  // ticks × 15.625 = ticks × 125 / 8
}

// Convert milliseconds to ticks (fast integer math, no division)
uint16_t msToTicks(uint16_t ms) {
  uint32_t ticks = (ms * 4195UL) >> 16;
  return ticks > 0 ? ticks : 1;
}

The tick counter overflows after 2³² / 64 / 86400 ≈ 776 days of continuous operation.
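The sketch below restates the two conversions so that it compiles on its own, and adds a wrap‑safe elapsed‑time check of the kind timed notes need. hasElapsed is a hypothetical helper name, not a function from the project.

```cpp
#include <cstdint>

// Same conversions as above, repeated so this example is self-contained.
inline uint32_t ticksToMs(uint32_t ticks) {
  return (ticks * 125u) >> 3;  // ticks * 15.625
}
inline uint16_t msToTicks(uint16_t ms) {
  uint32_t ticks = (ms * 4195UL) >> 16;  // ms * 0.064, rounded down
  return ticks > 0 ? static_cast<uint16_t>(ticks) : 1;
}

// Wrap-safe deadline check: unsigned subtraction gives the right elapsed
// count even when the tick counter has overflowed between start and now.
inline bool hasElapsed(uint32_t nowTicks, uint32_t startTicks,
                       uint16_t durationTicks) {
  return static_cast<uint32_t>(nowTicks - startTicks) >= durationTicks;
}
```

One caveat with ticksToMs as written: the 32‑bit product ticks × 125 wraps once ticks exceeds about 34.3 million, which is roughly 6.2 days at 64 Hz. Comparing tick counts directly, as hasElapsed does, avoids that limit.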

Timing Resolution

At 64 Hz, the timing resolution is 15.625 ms. This is sufficient for musical timing (a semiquaver at 200 BPM = 75 ms), but means very short notes (<15 ms) may be quantised up to 1 tick minimum.


17. Performance Optimisations

Summary Table

| # | Optimisation | Where | Saving |
|---|---|---|---|
| 1 | Tick‑based timing | sound.cpp, player.cpp, recorder.cpp | Eliminates millis() overhead and Timer0 conflicts |
| 2 | Active voice bitmask | sound.cpp audioUpdate | Skips inactive voices without struct access |
| 3 | Melody lookup table | player.cpp | O(1) melody selection instead of 13‑case switch |
| 4 | Bit‑shift octave | sound.cpp applyInstrument | 2 cycles vs 200+ for division |
| 5 | Cached octave offset | sound.cpp loadProfile | Division by 12 done once on instrument change |
| 6 | Voice free stack | sound.cpp allocateVoice | O(1) allocation instead of O(n) scan |
| 7 | Fixed‑point transpose | sound.cpp applyPitchTranspose | ~10 cycles vs ~2000 for pow() |
| 8 | int8×uint8 multiply | sound.cpp audioUpdate | Uses AVR MULS (~2 cycles) not int32 (~40 cycles) |
| 9 | PROGMEM tables | throughout | Wavetables and melodies in Flash, not RAM |
| 10 | Non‑blocking EEPROM | recorder.cpp | 1 byte per cycle, zero audio impact |
| 11 | Lightweight NOTE sync | player.cpp | Manual utoa + single Serial.write() vs snprintf / Serial.print() chain |
| 12 | Integer IIR gain smoothing | sound.cpp | Replaces Smooth<> library (saves ~1.5 KB flash) |
| 13 | Bitmask modulo | recorder.cpp | & (SIZE-1) replaces % SIZE for power‑of‑2 ring buffer |
| 14 | Bitshift map() | main.cpp | (raw * 49) >> 10 replaces map() (avoids division) |
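Rows 2 and 6 of the table can be illustrated together. The sketch below is one possible shape for a free-index stack combined with an active-voice bitmask; the `VoicePool` type, its members, and `forEachActive` are hypothetical and are not copied from sound.cpp.

```cpp
#include <cstdint>

const uint8_t kNumVoices = 8;  // voice count stated in the project overview

// Hypothetical allocator: a stack of free voice indices plus a bitmask
// of active voices. Names and layout are illustrative only.
struct VoicePool {
  uint8_t freeStack[kNumVoices];
  uint8_t freeCount;
  uint8_t activeMask;  // bit i set => voice i is sounding

  VoicePool() : freeCount(kNumVoices), activeMask(0) {
    for (uint8_t i = 0; i < kNumVoices; ++i) freeStack[i] = i;
  }

  // O(1): pop a free index; returns -1 when every voice is busy
  // (the caller would then fall back to voice stealing).
  int8_t allocate() {
    if (freeCount == 0) return -1;
    uint8_t v = freeStack[--freeCount];
    activeMask = static_cast<uint8_t>(activeMask | (1u << v));
    return static_cast<int8_t>(v);
  }

  // O(1): push the index back and clear its active bit.
  void release(uint8_t v) {
    if (!(activeMask & (1u << v))) return;  // ignore a double release
    activeMask = static_cast<uint8_t>(activeMask & ~(1u << v));
    freeStack[freeCount++] = v;
  }
};

// Mixer loop shape: test one bit per voice instead of reading a struct
// field, and stop as soon as no higher bits remain set.
template <typename Fn>
void forEachActive(uint8_t mask, Fn fn) {
  for (uint8_t i = 0; mask != 0; ++i, mask = static_cast<uint8_t>(mask >> 1)) {
    if (mask & 1u) fn(i);
  }
}
```

The early exit in `forEachActive` means a single held note in a low-numbered voice costs only a few loop iterations rather than a full scan of eight voices.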

Build Flags

build_flags = -O3 -flto -DMOZZI_ANALOG_READ_RESOLUTION=10 -DMOZZI_AUDIO_MODE=MOZZI_OUTPUT_2PIN_PWM -DMOZZI_AUDIO_RATE=16384
  • -O3: Aggressive compiler optimisations (inlining, loop unrolling)
  • -flto: Link‑Time Optimisation — allows cross‑file inlining and dead code elimination
  • -DMOZZI_ANALOG_READ_RESOLUTION=10: Keeps standard 10‑bit ADC resolution
  • -DMOZZI_AUDIO_MODE=MOZZI_OUTPUT_2PIN_PWM: 14‑bit HIFI output on pins 9+10
  • -DMOZZI_AUDIO_RATE=16384: 16384 Hz (≈16 kHz) sample rate; the 32768 Hz option would starve the CPU with 8 voices

18. Challenges & Solutions

Challenge 1: Audio Glitches from EEPROM Writes

Problem: EEPROM writes block the CPU for 3.3 ms, causing 54 missed audio samples (audible click/pop).

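Solution: Non‑blocking trickle write. Notes are buffered in RAM and written to EEPROM 1 byte per control cycle using raw register access. Readiness is checked with eeprom_is_ready() before each byte, so the CPU never waits on the hardware; a 3.3 ms write finishes well within the 15.625 ms control period.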
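The trickle-write idea can be sketched independently of the hardware. In the example below, `TrickleWriter`, `PendingByte`, and `FakeEeprom` are hypothetical names; on the ATmega328P the `ready()` and `startWrite()` calls would presumably wrap eeprom_is_ready() and the EEPROM address, data, and control registers, but that part is an assumption and is replaced here by a host-side stand-in.

```cpp
#include <cstdint>

struct PendingByte {
  uint16_t addr;
  uint8_t value;
};

// Hypothetical queue drained one byte per control tick.
// SIZE must be a power of two so `& (SIZE - 1)` can replace `% SIZE`.
template <typename Eeprom, uint8_t SIZE = 64>
class TrickleWriter {
  static_assert(SIZE != 0 && (SIZE & (SIZE - 1)) == 0,
                "SIZE must be a power of two");

 public:
  explicit TrickleWriter(Eeprom& e) : eeprom_(e), head_(0), tail_(0) {}

  // Queue a byte; returns false if the buffer is full.
  bool push(uint16_t addr, uint8_t value) {
    uint8_t next = static_cast<uint8_t>((head_ + 1) & (SIZE - 1));
    if (next == tail_) return false;
    buf_[head_].addr = addr;
    buf_[head_].value = value;
    head_ = next;
    return true;
  }

  // Call once per control tick: starts at most one write, never waits.
  void service() {
    if (tail_ == head_) return;    // nothing pending
    if (!eeprom_.ready()) return;  // previous write still in progress
    eeprom_.startWrite(buf_[tail_].addr, buf_[tail_].value);
    tail_ = static_cast<uint8_t>((tail_ + 1) & (SIZE - 1));
  }

  bool idle() const { return tail_ == head_; }

 private:
  Eeprom& eeprom_;
  PendingByte buf_[SIZE];
  uint8_t head_;
  uint8_t tail_;
};

// Host-side stand-in so the sketch runs off-target.
struct FakeEeprom {
  uint8_t mem[1024];
  bool busy;
  uint16_t writes;
  FakeEeprom() : mem(), busy(false), writes(0) {}
  bool ready() const { return !busy; }
  void startWrite(uint16_t a, uint8_t v) { mem[a] = v; ++writes; }
};
```

Because `service()` returns immediately when the device is busy, the worst case per control cycle is a readiness check plus one register write sequence, regardless of how many bytes are queued.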

Challenge 2: Mozzi Breaks digitalRead() on Analog Pins

Problem: startMozzi() sets the DIDR0 register to disable digital input buffers on A0–A5 for ADC noise reduction. This permanently breaks digitalRead() on button pins A0–A3.

Solution: Manually re‑enable digital input buffers after startMozzi():

DIDR0 &= ~(0x0F);  // Re-enable bits 0-3 (A0-A3)

Challenge 3: Smooth Filter Destroying Audio

Problem: Applying Smooth<int>(0.95f) to audio samples at 16 kHz created a low‑pass filter with 134 Hz cutoff, destroying all audio above 134 Hz. The result was extremely quiet, muffled sound with crackling.

Explanation: Mozzi's Smooth is an exponential moving average. At audio rate (16 kHz), coefficient 0.95 means the cutoff frequency is: fc = -(sampleRate × ln(coeff)) / (2π) = -(16384 × ln(0.95)) / (2π) ≈ 134 Hz.

Solution: Never apply Smooth to audio samples. Apply it only to control signals (gain, parameters) at control rate (64 Hz), where the same formula gives a cutoff of about 2.3 Hz for a coefficient of 0.8f: -(64 × ln(0.8)) / (2π) ≈ 2.3 Hz. That is appropriate for smooth parameter transitions.
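The cutoff formula from the explanation above is easy to evaluate for any rate and coefficient. The helper name `smoothCutoffHz` is hypothetical; the formula itself is the one quoted in this section.

```cpp
#include <cmath>

// Cutoff of a one-pole exponential moving average, using the formula
// from the explanation above: fc = -fs * ln(coeff) / (2 * pi).
double smoothCutoffHz(double sampleRateHz, double coeff) {
  const double kPi = 3.14159265358979323846;
  return -sampleRateHz * std::log(coeff) / (2.0 * kPi);
}
```

Evaluating it for 0.95 at 16384 Hz gives roughly 134 Hz, while 0.8 at the 64 Hz control rate gives roughly 2.3 Hz, which is why the same filter that wrecks audio is harmless on gain and parameter signals.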

Challenge 4: Voice Pile‑Up in Melodies

Problem: The melody player called noteOn() for each new note without releasing the previous one. Over time, all 8 voices filled up, forcing voice stealing on every subsequent note. This caused notes to steal from each other in rapid succession, creating timing drift and slowdown.

Solution: Track lastMelodyFreq and call Sound::noteOff(lastMelodyFreq) before each new note — the same code path as GUI MIDI notes. Additionally, removed a double applyInstrument() call that was doubling the octave shift.
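A minimal sketch of the release-before-play pattern follows. `MelodyVoice` and `CountingSynth` are hypothetical; the stand-in synth only counts held notes and ignores the envelope release phase, and treating a frequency of 0 as a rest is an assumed convention.

```cpp
#include <cstdint>

// Hypothetical melody stepper: releases the previous note before
// starting the next, so one melody line never holds more than one voice.
template <typename Synth>
class MelodyVoice {
 public:
  explicit MelodyVoice(Synth& s) : synth_(s), lastFreq_(0) {}

  void play(uint16_t freq) {
    if (lastFreq_ != 0) synth_.noteOff(lastFreq_);  // free the old voice first
    if (freq != 0) synth_.noteOn(freq);             // 0 = rest (assumed)
    lastFreq_ = freq;
  }

 private:
  Synth& synth_;
  uint16_t lastFreq_;
};

// Host-side stand-in for the sound engine that only counts held notes.
struct CountingSynth {
  int held;
  int peak;
  CountingSynth() : held(0), peak(0) {}
  void noteOn(uint16_t) {
    ++held;
    if (held > peak) peak = held;
  }
  void noteOff(uint16_t) {
    if (held > 0) --held;
  }
};
```

However long the melody runs, the peak number of held notes stays at one, so the remaining voices are left free for notes played from the GUI or the buttons.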

Challenge 5: Pot Jitter

Problem: The ADC on pin A5 has inherent noise (±2–5 LSB). With 1024 ADC values mapped to 49 semitone positions, even 1 LSB of noise could cause the pitch to jitter between two semitones during sustain.

Solution: A raw deadband of ±20 ADC units. The pot value must change by at least 20 (out of 1024) before the system registers a new position. This eliminates noise‑induced jitter at the cost of very slight positional hysteresis.
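The deadband and the division-free mapping from the optimisation table can be combined as below. `PotFilter` and `rawToStep` are hypothetical names, and the sketch assumes the deadband is applied to the raw reading before it is mapped to one of the 49 positions.

```cpp
#include <cstdint>

const uint16_t kDeadband = 20;  // ADC counts, as described above

// Map a 10-bit ADC reading (0..1023) to a position 0..48 without
// division, using the (raw * 49) >> 10 form from the optimisation table.
uint8_t rawToStep(uint16_t raw) {
  return static_cast<uint8_t>((static_cast<uint32_t>(raw) * 49u) >> 10);
}

// Hypothetical deadband filter: a reading is accepted only when it
// differs from the last accepted reading by at least kDeadband counts.
struct PotFilter {
  uint16_t accepted;

  explicit PotFilter(uint16_t initial) : accepted(initial) {}

  // Returns true when `raw` was accepted as a new pot position.
  bool update(uint16_t raw) {
    uint16_t diff = static_cast<uint16_t>(
        raw > accepted ? raw - accepted : accepted - raw);
    if (diff < kDeadband) return false;
    accepted = raw;
    return true;
  }
};
```

Each of the 49 positions spans about 21 ADC counts, so a 20-count deadband is close to one full step, which accounts for the slight positional hysteresis.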

Challenge 6: Division on AVR

Problem: The ATmega328P has no hardware divide instruction. Division by 12 (for octave calculation) takes ~200–400 cycles via software emulation.

Solution: Precompute the result. Octave offset = octaveShift / 12 is calculated once at instrument‑load time and cached. In the note‑on hot path, only bit‑shifts are used.
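The split between a rare slow path and a shift-only hot path might look like the sketch below. The `Profile` struct, its field names, and `applyOctave` are hypothetical, and the shift is assumed to be stored in semitones as a multiple of 12.

```cpp
#include <cstdint>

// Hypothetical profile: the semitone shift is divided by 12 once,
// when the instrument is loaded, and the quotient is cached.
struct Profile {
  int8_t octaveShift;   // in semitones, assumed to be a multiple of 12
  int8_t octaveOffset;  // cached octaveShift / 12
};

// Slow path: runs only on instrument change, so the division is cheap
// in practice.
Profile loadProfile(int8_t octaveShiftSemitones) {
  Profile p;
  p.octaveShift = octaveShiftSemitones;
  p.octaveOffset = static_cast<int8_t>(octaveShiftSemitones / 12);
  return p;
}

// Hot path: one octave is a factor of two, so only shifts are needed.
uint16_t applyOctave(uint16_t freqHz, const Profile& p) {
  if (p.octaveOffset >= 0) {
    return static_cast<uint16_t>(freqHz << p.octaveOffset);
  }
  return static_cast<uint16_t>(freqHz >> -p.octaveOffset);
}
```

For example, a profile loaded with a shift of 12 turns 440 Hz into 880 Hz, and a shift of -24 turns it into 110 Hz.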

Challenge 7: Serial Blocking Audio

Problem: At 9600 baud (default), sending "NOTE:ON:440\n" (12 chars) takes ~12.5 ms — nearly an entire control cycle. This delays the audio loop and causes glitches.

Solution: 2 Mbaud serial. The same 12 characters take ~60 µs. Additionally, the binary protocol reduces most commands to 1–2 bytes.
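The figures above follow from simple arithmetic, assuming standard 8N1 framing (10 bits on the wire per byte). The helper name `txTimeMicros` is hypothetical.

```cpp
#include <cstdint>

// Time on the wire for `bytes` bytes at `baud`, assuming 8N1 framing
// (1 start + 8 data + 1 stop = 10 bits per byte).
uint32_t txTimeMicros(uint32_t bytes, uint32_t baud) {
  const uint64_t bits = static_cast<uint64_t>(bytes) * 10u;
  return static_cast<uint32_t>(bits * 1000000u / baud);
}
```

Twelve bytes take 12500 µs at 9600 baud and 60 µs at 2 Mbaud; a 2-byte binary command at 2 Mbaud takes 10 µs.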

Challenge 8: RAM Pressure

Problem: 8 Voice structs + Mozzi internals + ring buffers + stack consume nearly all 2048 bytes of RAM. Any new feature risks stack overflow.

Solution: Aggressive use of PROGMEM for all constant data (wavetables, melodies, instrument profiles, melody lookup table, string literals). F() macro for all Serial.print() strings. Current usage: 92.4% (1892/2048 bytes) — only 156 bytes free.

Challenge 9: Silent Exception Swallowing

Problem: The serial read loop used except Exception: pass, silently swallowing every error including AttributeError in the message handler. This caused the pitch sync handler to never execute.

Solution: Split the try/except — serial I/O errors are caught separately from callback errors. Callback errors are printed to console for debugging.


19. Limitations

Hardware Limitations

| Limitation | Impact | Mitigation |
|---|---|---|
| No DAC | Audio is PWM only. Without filtering, raw output sounds like a buzzer. | HIFI 2‑pin PWM + external 2‑pole RC filter + PN2222 amplifier. |
| 14‑bit HIFI PWM | ~84 dB dynamic range. Much better than 8‑bit (~48 dB) but still not CD quality. | Adequate for the piezo buzzer's output: the speaker is the bottleneck, not the PWM resolution. |
| Timer1 + Timer2 taken | HIFI mode claims both timers. Pins 9, 10 (Timer1) and 11 (Timer2) lose analogWrite(). | Pin 11 (TM1637 CLK) uses bit‑bang protocol, unaffected. |
| ~16 kHz sample rate | Nyquist limit = ~8 kHz. No audio content above 8 kHz. Higher harmonics of waveforms are lost. | Acceptable for "chiptune" aesthetic. |
| 2 KB RAM | 92%+ usage. Very little room for new features. Any RAM leak causes stack overflow and crashes. | PROGMEM for all constants. Minimise stack depth. |
| 1 KB EEPROM | ~170 notes max. No way to record long pieces. | External storage (SD card) would need SPI pins + library RAM. |
| 16 MHz clock | CPU budget is tight. 8 voices × wavetable + envelope at 16 kHz ≈ 50% CPU. | Careful optimisation of audio loop. |
| Single core | No parallel processing. Audio interrupt preempts control code. | Non‑blocking design throughout. |

Software Limitations

| Limitation | Impact |
|---|---|
| Monophonic recording | Recorder captures one note at a time. Chords are not recorded. |
| No velocity sensitivity | All notes play at full volume (envelope attack peak = 255). |
| No polyphonic aftertouch | Envelope is set-and-forget per voice. |
| No reverb/delay/chorus | DSP effects require RAM buffers (e.g. 256+ samples for delay). Infeasible at 2 KB. |
| Timer0 compromised | millis() accuracy is reduced. delay() should never be used. |
| Timing quantisation | 64 Hz control rate = 15.6 ms resolution. Notes shorter than this are rounded up. |

The "Buzzer" Quality

From worst to best, the output options are:

  1. Raw pin → passive buzzer — Harsh, tinny, distorted. The PWM carrier dominates.
  2. HIFI 2‑pin + RC filter → passive buzzer — Dramatically better. The 125 kHz carrier is inaudible. 14‑bit resolution gives cleaner dynamics.
  3. HIFI + RC filter → small amplifier → speaker — Best achievable quality. Still "lo‑fi" by modern standards but pleasant for chiptune/retro music.

20. Sound Quality Notes

Not all waveform/envelope combinations sound equally good through the piezo buzzer output chain. The 14‑bit HIFI mode greatly improves dynamic range over standard 8‑bit PWM, and the 125 kHz carrier is inaudible (vs the old ~16 kHz carrier which added harsh high‑frequency noise). However, the limited bandwidth (~8 kHz Nyquist) and piezo resonance characteristics still favour certain sounds.

Waveforms That Sound Good

| Waveform | Notes |
|---|---|
| Piano (14) | Best all‑round waveform. Clean, recognisable timbre. Default for a reason. |
| Violin (6) | Rich harmonic content that translates well even through the buzzer. |
| Square (3) | Classic chiptune sound. The piezo buzzer is naturally suited to square‑wave‑like signals. |
| Clarinet (5) | Odd‑harmonic‑heavy spectrum (similar to square); works well with the limited bandwidth. |

Envelopes That Sound Good

| Envelope | Notes |
|---|---|
| Piano (0) | Short attack, moderate decay/release. The go‑to preset. |
| Staccato (2) | Very short, punchy notes. Great for fast melodies and rhythm. |

Everything Else

The remaining waveforms (Triangle, Sawtooth, Sine, EP, FM Synth, Guitar, Cello, Flute, NES Pulse, Oboe, Osc Chip) and envelopes (Organ, Pad, Flute, Bell, Bass) are functional but sound mediocre to poor through the buzzer output. The 8‑bit wavetable samples, the ~8 kHz bandwidth, and the piezo frequency response don't do them justice. They might sound better with proper filtering and a real speaker, but through the current hardware they are underwhelming.

Recommended default: Piano waveform + Piano envelope.


21. Resource Usage

Flash (Program Memory)

| Component | Approximate Size |
|---|---|
| Wavetables (15 × 512 bytes) | ~7.5 KB |
| Melodies (13 PROGMEM arrays) | ~3 KB |
| Mozzi library code (incl. HIFI) | ~3.5 KB |
| Sound engine | ~2.5 KB |
| Protocol + recorder + player | ~2 KB |
| Arduino framework | ~2 KB |
| Other modules | ~1 KB |
| Total | ~21.9 KB / 32 KB (68%) |

RAM (SRAM)

| Component | Approximate Size |
|---|---|
| 8 Voice structs | ~400 bytes |
| Mozzi internals (buffers, state) | ~600 bytes |
| EEPROM ring buffer (32 × 6 bytes) | 192 bytes |
| Stack, globals, strings | ~700 bytes |
| Total | ~1892 / 2048 bytes (92.4%) |

EEPROM

| Usage | Size |
|---|---|
| Recording count (header) | 2 bytes |
| Note data | Up to 1020 bytes (170 notes × 6 bytes); the last 2 of the 1022 remaining bytes are unused |
| Total | 1024 bytes |
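The capacity figure can be derived directly from the sizes in the table. The sketch below assumes the 2-byte header sits at address 0 with notes packed immediately after it; that layout, the constant names, and `noteAddress` are illustrative assumptions rather than details taken from recorder.cpp.

```cpp
#include <cstdint>

const uint16_t kEepromBytes = 1024;  // EEPROM size from the table
const uint16_t kHeaderBytes = 2;     // recording count
const uint16_t kNoteBytes = 6;       // size of one recorded note

// Whole notes that fit after the header: (1024 - 2) / 6.
const uint16_t kMaxNotes = (kEepromBytes - kHeaderBytes) / kNoteBytes;

// Bytes left over because 1022 is not a multiple of 6.
const uint16_t kUnusedBytes =
    kEepromBytes - kHeaderBytes - kMaxNotes * kNoteBytes;

// Hypothetical helper: EEPROM address of note `index`, assuming the
// header occupies addresses 0..1 and notes are packed after it.
uint16_t noteAddress(uint16_t index) {
  return static_cast<uint16_t>(kHeaderBytes + index * kNoteBytes);
}

static_assert(kMaxNotes == 170, "170 notes fit in 1 KB of EEPROM");
```

Under these assumptions the last note starts at address 1016 and ends at 1021, leaving addresses 1022 and 1023 unused.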

22. File Reference

| File | Lines | Description |
|---|---|---|
| src/main.cpp | ~200 | Mozzi integration, updateControl()/updateAudio(), pot reading, command dispatch |
| src/sound.cpp | ~680 | Voice management, wavetable selection, ADSR, pitch transpose, audio mixing |
| src/recorder.cpp | ~380 | Ring‑buffered EEPROM recording, trickle write, playback state machine |
| src/player.cpp | ~180 | PROGMEM melody playback with tick‑based timing |
| src/protocol.cpp | ~160 | Binary serial parser, state machine, response helpers |
| src/buttons.cpp | ~60 | Debounced hardware button polling |
| src/leds.cpp | ~100 | RGB visualiser with HSV mapping, status LEDs |
| src/display.cpp | ~130 | TM1637 note display with custom segment patterns |
| include/config.h | ~70 | Pin definitions, timing constants, serial config |
| include/sound.h | ~150 | Sound module API declarations and EnvelopeProfile struct |
| include/protocol.h | ~80 | Command byte map and CmdType enum |
| include/recorder.h | ~70 | RecordedNote struct and Recorder API |
| include/melodies.h | ~600+ | 13 melody note arrays in PROGMEM |
| tools/gui/app.py | ~800 | Main GUI class with all UI and event handling |
| tools/gui/config.py | ~100 | Key bindings, instrument presets, waveform list |
| tools/gui/serial_comm.py | ~70 | Thread‑safe Arduino serial communication |
| tools/gui/components.py | ~60 | PianoKey canvas widget |
| tools/gui/midi_handler.py | ~200 | MIDI file loading and playback |

Last updated: 2026‑04‑02