Extremely verbose version of README.md. This document covers every subsystem, design decision, challenge, limitation, and implementation detail of the Buzzer project, an 8-voice polyphonic wavetable synthesizer built on an Arduino Uno.
- Project Overview
- Hardware Architecture
- How Mozzi Works
- Full Audio Signal Path
- Sound Engine Deep Dive
- Envelope System (ADSR)
- Pitch Transpose System
- Voice Allocation & Stealing
- Gain Staging & Anti‑Clipping
- Wavetables
- Recording System
- Melody Player
- Serial Protocol
- Hardware I/O
- Python GUI
- Timing System
- Performance Optimisations
- Challenges & Solutions
- Limitations
- Sound Quality Notes
- Resource Usage
- File Reference
Buzzer is a polyphonic synthesizer that runs entirely on an Arduino Uno (ATmega328P), a microcontroller with a 16 MHz clock, 2 KB of SRAM, 32 KB of Flash, and 1 KB of EEPROM. Despite these severe constraints, it achieves:
- 8 simultaneous voices with independent ADSR envelopes
- 15 selectable waveforms from 512‑sample wavetables
- Real‑time pitch transposition via hardware potentiometer
- EEPROM‑based recording and playback
- 13 built‑in melodies stored in PROGMEM
- A full Python tkinter GUI communicating over 2 Mbaud serial
The audio engine is Mozzi, an open‑source library that generates audio on AVR microcontrollers using high‑frequency PWM. In HIFI mode (MOZZI_OUTPUT_2PIN_PWM), Mozzi splits 14‑bit audio across two pins (9 and 10), each carrying 7 bits, combined via a resistor network. The 125 kHz PWM carrier is well above audible range, and a 2‑pole RC filter further attenuates it by ~36 dB. Mozzi handles timer configuration, interrupt‑driven sample output, and provides building blocks like oscillators, envelopes, and filters.
| Pin | Function | Type | Details |
|---|---|---|---|
| 9 | Mozzi audio output (high bits) | PWM (Timer1) | Reserved — cannot be used for anything else. Mozzi drives this pin with the high 7 bits of 14‑bit HIFI audio at 16,384 Hz sample rate. |
| 10 | Mozzi audio output (low bits) | PWM (Timer1) | Reserved — cannot be used for anything else. Carries the low 7 bits. Combined with pin 9 via HIFI resistor network. |
| A0 | Record button | Digital input (INPUT_PULLUP) | Active LOW. Starts EEPROM recording. |
| A1 | Stop button | Digital input (INPUT_PULLUP) | Active LOW. Stops recording or playback. |
| A2 | Play button | Digital input (INPUT_PULLUP) | Active LOW. Plays back EEPROM recording. |
| A3 | Clear button | Digital input (INPUT_PULLUP) | Active LOW. Clears EEPROM recording data. |
| A5 | Pitch potentiometer | Analog input | 10 kΩ linear pot. Read via mozziAnalogRead(). Maps 0–1023 → ±24 semitones. |
| 3 | RGB LED (Red) | PWM output | Common cathode RGB. Brightness 0–40 (USB power safety). |
| 5 | RGB LED (Green) | PWM output | Colour derived from pitch (hue mapping). |
| 6 | RGB LED (Blue) | PWM output | Brightness modulated by envelope level. |
| 4 | Feedback LED | Digital output | Flashes briefly on button press (~150 ms). |
| 7 | Record status LED | Digital output | Steady ON while recording is active. |
| 8 | Play status LED | Digital output | Steady ON during playback. |
| 11 | TM1637 CLK | Digital output | 7‑segment display clock line. |
| 12 | TM1637 DIO | Digital output | 7‑segment display data line. Shows note name + octave (e.g. "C 4"). |
When startMozzi() is called, Mozzi writes to the DIDR0 register to disable digital input buffers on all analog pins (A0–A5). This is an ADC noise reduction feature. However, the four buttons are wired to A0–A3 and require digitalRead(). The fix:
// After startMozzi():
DIDR0 &= ~(0x0F); // Re-enable digital input buffers for A0-A3
Without this, all four buttons would read as LOW permanently.
The Arduino Uno has no DAC (Digital‑to‑Analog Converter). Mozzi outputs audio as high‑frequency PWM. In HIFI mode (MOZZI_OUTPUT_2PIN_PWM), the 14‑bit audio sample is split across two pins: pin 9 carries the high 7 bits and pin 10 carries the low 7 bits, each at a 125 kHz PWM carrier rate. A resistor network sums them with a 128:1 weighting ratio, and a 2‑pole RC filter attenuates the PWM carrier by ~36 dB.
This provides ~84 dB of dynamic range (14 bits) compared to ~48 dB from standard single‑pin 8‑bit PWM — a massive improvement in audio quality, especially in quiet passages where quantisation noise was previously dominant.
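Both figures are consistent with the usual rule of thumb of about 6.02 dB per bit for an ideal N-bit quantiser. This is a textbook approximation, not a measurement of this circuit:

```latex
\[
\mathrm{DR}(N) \approx 20\log_{10}\!\left(2^{N}\right) \approx 6.02\,N\ \text{dB}
\]
\[
\mathrm{DR}(14) \approx 84.3\ \text{dB}, \qquad \mathrm{DR}(8) \approx 48.2\ \text{dB}
\]
```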
The current circuit routes the combined HIFI signal through a 10 kΩ volume potentiometer, then into a PN2222 common‑emitter amplifier driving a passive piezo buzzer.
flowchart LR
P9["Pin 9\n(high 7 bits)"] --> HIFI["Stage 1\nHIFI RC Filter"]
P10["Pin 10\n(low 7 bits)"] --> HIFI
HIFI --> POT["Stage 2\nVolume Pot 10kΩ"]
POT --> AMP["Stage 3\nPN2222 Amp"]
AMP --> BUZ["Passive Buzzer"]
style P9 fill:#4a9eff,color:#fff
style P10 fill:#4a9eff,color:#fff
style HIFI fill:#ff9f43,color:#fff
style POT fill:#2ed573,color:#fff
style AMP fill:#e056fd,color:#fff
style BUZ fill:#1e90ff,color:#fff
HIFI mode uses both Timer1 and Timer2. This means pins 9 and 10 (Timer1) and pin 11 (Timer2) lose analogWrite(). Pin 11 only drives TM1637 CLK by bit-banging, so it is unaffected.
The HIFI resistor network combines pin 9 (high 7 bits) and pin 10 (low 7 bits) into a single analog signal. Target ratio R_low/R_high ≈ 128. Actual: 500 kΩ / 4 kΩ = 125 (2.3% error, <0.05 bit loss).
flowchart LR
P9["Pin 9 high 7 bits"]
P10["Pin 10 low 7 bits"]
R1["2kΩ"]
R2["2kΩ"]
R3["1MΩ"]
R4["1MΩ"]
C1["0.01µF ceramic"]
C2["0.01µF ceramic"]
A(("Node A to Pot"))
G1["GND"]
G2["GND"]
P9 --- R1 --- C1 --- G1
R1 --- R2 --- A
R2 --- C2 --- G2
P10 --- R3 --- A
P10 --- R4 --- A
style P9 fill:#4a9eff,color:#fff
style P10 fill:#4a9eff,color:#fff
style A fill:#ff9f43,color:#fff
style G1 fill:#555,color:#fff
style G2 fill:#555,color:#fff
Parts: 2× 2 kΩ (series = 4 kΩ), 2× 1 MΩ (parallel = 500 kΩ), 2× 0.01 µF ceramic. Filter: 2‑pole RC, −40 dB/decade rolloff. Attenuates 125 kHz PWM carrier by ~36 dB.
The volume pot acts as a simple voltage divider between the filtered HIFI audio signal and ground.
flowchart LR
P9["Pin 9"] --> POT_1["Pot Pin 1"]
POT_1 --- POT_W["Pot Wiper"] --> AMP["To Amp"]
POT_1 --- POT_3["Pot Pin 3"] --- GND["GND"]
style P9 fill:#4a9eff,color:#fff
style POT_1 fill:#2ed573,color:#fff
style POT_W fill:#2ed573,color:#fff
style POT_3 fill:#2ed573,color:#fff
style GND fill:#555,color:#fff
style AMP fill:#e056fd,color:#fff
Use a PASSIVE buzzer (no internal oscillator). Active buzzers only beep at one fixed frequency. Passive piezos are non‑polarized.
- Bias: 10 kΩ / 2 kΩ divider → V_base ≈ 0.83 V, V_emitter ≈ 0.23 V, I_C ≈ 2.3 mA
- AC Gain: ~90× (100 µF bypass cap shorts R_E for AC, leaving only transistor r_e)
- R_C (1 kΩ) and Buzzer are in parallel between +5 V and Collector
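The bias and gain figures above can be reproduced with a first-order hand calculation. The sketch below assumes V_BE ≈ 0.6 V and a thermal voltage of about 25 mV, and ignores base current and the loading of the buzzer; none of these values are stated in the original design notes:

```latex
\[
V_B \approx 5\,\text{V}\cdot\frac{2\,\text{k}\Omega}{10\,\text{k}\Omega + 2\,\text{k}\Omega} \approx 0.83\,\text{V},
\qquad
V_E \approx V_B - V_{BE} \approx 0.83 - 0.60 = 0.23\,\text{V}
\]
\[
I_C \approx I_E = \frac{V_E}{R_E} = \frac{0.23\,\text{V}}{100\,\Omega} \approx 2.3\,\text{mA},
\qquad
r_e \approx \frac{V_T}{I_C} \approx \frac{25\,\text{mV}}{2.3\,\text{mA}} \approx 10.9\,\Omega
\]
\[
|A_v| \approx \frac{R_C}{r_e} \approx \frac{1\,\text{k}\Omega}{10.9\,\Omega} \approx 92
\]
```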
flowchart TD
VCC["+5V"]
WIPER["Pot wiper"]
R1["10kΩ bias top"]
R2["2kΩ bias bottom"]
CIN["10µF coupling cap"]
RC["1kΩ collector load"]
BUZ["Passive Buzzer"]
RE["100Ω emitter"]
CBYP["100µF bypass cap"]
QB(("B"))
QC(("C"))
QE(("E"))
QL["PN2222"]
G1["GND"]
G2["GND"]
G3["GND"]
VCC --- R1 --- QB
QB --- R2 --- G1
WIPER --- CIN --- QB
QB --- QL
QL --- QC
QL --- QE
VCC --- RC --- QC
VCC --- BUZ --- QC
QE --- RE --- G2
RE --- CBYP --- G3
style VCC fill:#ff4757,color:#fff
style QL fill:#ffa502,color:#fff
style QB fill:#ffa502,color:#fff
style QC fill:#ffa502,color:#fff
style QE fill:#ffa502,color:#fff
style WIPER fill:#e056fd,color:#fff
style BUZ fill:#1e90ff,color:#fff
style G1 fill:#555,color:#fff
style G2 fill:#555,color:#fff
style G3 fill:#555,color:#fff
Parallel connection: Both R_C (1 kΩ) and Buzzer have one leg on +5 V and the other on Collector. R_C provides the DC bias path. The AC voltage swing at the collector drives the piezo.
Electrolytic capacitor polarity (stripe = negative):
| Cap | + leg connects to | − leg connects to |
|---|---|---|
| 10 µF coupling | Pot wiper (~2.5 V DC) | Base (~0.83 V DC) |
| 100 µF bypass | Emitter (~0.23 V DC) | GND (0 V) |
Rule: + leg always toward higher DC voltage.
| Part | Qty | Role |
|---|---|---|
| 2 kΩ resistor | 2 | HIFI R_high (series = 4 kΩ) |
| 1 MΩ resistor | 2 | HIFI R_low (parallel = 500 kΩ) |
| 0.01 µF ceramic cap | 2 | 2‑pole PWM filter |
| 10 kΩ potentiometer | 1 | Volume control |
| 10 kΩ resistor | 1 | Bias divider top |
| 2 kΩ resistor | 1 | Bias divider bottom |
| 1 kΩ resistor | 1 | Collector load |
| 100 Ω resistor | 1 | Emitter degeneration |
| 10 µF electrolytic | 1 | Input coupling |
| 100 µF electrolytic | 1 | Emitter bypass |
| PN2222 transistor | 1 | Amplifier |
| Passive buzzer | 1 | Speaker output |
Mozzi is a real‑time audio synthesis library for Arduino. Understanding its architecture is essential to understanding this project.
Mozzi takes control of hardware timers:
- Timer1: drives the PWM output on pins 9 and 10. In HIFI mode, Timer1 outputs the high 7 bits on pin 9 and the low 7 bits on pin 10, each at a 125 kHz PWM carrier rate. An interrupt at MOZZI_AUDIO_RATE (16,384 Hz) calls updateAudio() to fetch the next sample.
- Timer2: claimed by Mozzi in HIFI mode and not available for user code.
- Timer0: Mozzi reconfigures Timer0, which breaks millis(), delay(), and micros(). This is why the Buzzer project uses its own tick-based timing system instead.
Mozzi operates on a two‑rate system:
- updateAudio(): called at ~16,384 Hz (MOZZI_AUDIO_RATE). Must return one audio sample. This runs inside a timer interrupt, so it must be extremely fast (budget: ~61 µs per call, but realistically <30 µs to leave headroom). No Serial, no analogRead(), no digitalWrite(); only pure math.
- updateControl(): called at MOZZI_CONTROL_RATE (64 Hz in this project). This is where "slow" operations happen: reading the potentiometer, updating ADSR envelopes, polling buttons, parsing serial commands. Budget is more generous (~15.6 ms per call) but must still avoid blocking.
Mozzi requires loop() to contain only audioHook():
void loop() {
audioHook();
}
audioHook() is a non-blocking scheduler that manages the timing between updateControl() calls and fills the audio output buffer.
<Mozzi.h> (or the older <MozziGuts.h>) can only be #include'd in one .cpp file (main.cpp). Other source files must use <MozziHeadersOnly.h> to access Mozzi types and utilities. This is because Mozzi.h defines ISR handlers and global state that would cause linker errors if included multiple times.
This constraint means:
- mozziAnalogRead() is only available in main.cpp
- startMozzi() and audioHook() are only callable from main.cpp
- sound.cpp uses <MozziHeadersOnly.h> for Oscil, ADSR, Smooth etc.
Here is the complete journey of a single audio sample from note trigger to speaker output:
A note arrives via serial (GUI/MIDI) or melody player as a frequency in Hz. Sound::noteOn(freq) is called.
The raw frequency is adjusted by the current instrument's octave shift. The shift is stored as semitones (multiples of 12) and precomputed as a bit‑shift count at instrument‑load time to avoid division in the hot path.
freq = applyInstrument(freq) // bit-shift by cached octave offset
A free voice is obtained from the voice free stack in O(1). If all 8 voices are busy, the oldest voice is stolen (linear scan).
The hardware pot's current transpose value (±24 semitones) is applied using a PROGMEM lookup table of fixed‑point frequency ratios:
shiftedFreq = (freq * SEMITONE_RATIO[transpose + 24]) >> 8
This avoids floating‑point math (pow(2, semitones/12.0)) which would take ~2000 cycles on AVR.
The voice's Oscil<512, MOZZI_AUDIO_RATE> oscillator is configured with the transposed frequency. The oscillator performs phase‑accumulator wavetable lookup — it steps through a 512‑sample wavetable at a rate determined by the frequency.
The voice's ADSR<MOZZI_CONTROL_RATE, MOZZI_AUDIO_RATE> envelope is configured with the current instrument's parameters and noteOn() is called. The ADSR progresses through Attack → Decay → Sustain → Release phases, outputting a value 0–255.
Inside the audio interrupt, for each active voice:
int16_t sample = voice.osc.next(); // Wavetable lookup: -128 to +127
uint16_t envVal = voice.env.next(); // ADSR output: 0 to 255
mix += (int16_t)(sample * envVal); // -32640 to +32385
The sample * envVal multiplication works on 8-bit operands (int8 × uint8), so it maps onto the AVR's native 8×8→16 hardware multiply (MULSU for signed × unsigned, 2 cycles), avoiding the expensive 32-bit multiply that the compiler would otherwise generate.
All active voices are summed into a single int32_t mix accumulator. Only active voices are processed, identified via an 8‑bit bitmask (activeVoiceMask) — inactive voices are skipped without even checking their struct.
The raw mix is scaled by a smoothed gain factor:
int16_t gainDelta = (int16_t)targetGain - (int16_t)smoothedGain;
smoothedGain += gainDelta >> 2;
mix = (mix * smoothedGain) >> 8;
targetGain is a voice-count-dependent value updated at control rate (64 Hz), boosted ~25% for HIFI's cleaner 14-bit output:
- 1 voice: 320 (1.25×)
- 2 voices: 240
- 3 voices: 200
- 4 voices: 160
- 5–6 voices: 120
- 7–8 voices: 100
The integer IIR filter (>>2 shift, i.e. a coefficient of 0.75 on the previous value, which gives a cutoff of roughly 3 Hz at the 64 Hz control rate) ensures gain transitions happen gradually to prevent audible pops when voices start/stop.
Critical lesson learned: An earlier attempt applied Smooth<int>(0.95f) directly to audio samples at 16 kHz. This created a 134 Hz low‑pass filter that destroyed everything above 134 Hz — extreme volume loss and distortion. Smoothing must only be applied to control signals at control rate. The current integer IIR approach eliminates the Smooth library dependency entirely.
if (mix > 32767) mix = 32767;
else if (mix < -32767) mix = -32767;
The final sample is returned as MonoOutput::from16Bit(mix), which Mozzi splits into high and low 7-bit values. The high byte is loaded into Timer1's OCR1A register (pin 9) and the low byte into OCR1B (pin 10).
Both pins output PWM at 125 kHz carrier rate. The external HIFI resistor network sums them with a 128:1 weighting (4 kΩ for high bits, 500 kΩ for low bits), reconstructing the 14‑bit analog value. A 2‑pole RC filter (2× 0.01 µF ceramic caps) attenuates the 125 kHz carrier by ~36 dB. The resulting audio signal passes through the volume pot and amplifier to the buzzer.
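The mixing, gain and clipping steps of this signal path can be condensed into one small function. The following is a host-side sketch rather than the firmware's audioUpdate(): the helper name mixVoices and its array-based interface are hypothetical, and it divides by 256 where the firmware shifts by 8, so that the result does not depend on how the compiler shifts negative values.

```cpp
#include <cstdint>

// Hypothetical helper: mixes one sample period for up to 8 voices.
// samples[i] is the wavetable output (-128..127), envs[i] the envelope (0..255),
// activeMask has bit i set while voice i is sounding, gain is 8.8 fixed point.
int16_t mixVoices(const int8_t samples[8], const uint8_t envs[8],
                  uint8_t activeMask, uint16_t gain) {
  int32_t mix = 0;
  // Walk the mask bit by bit; the loop ends as soon as no active voices remain.
  for (uint8_t i = 0; activeMask != 0; ++i, activeMask >>= 1) {
    if (activeMask & 1) {
      mix += static_cast<int16_t>(samples[i]) * envs[i];  // -32640..+32385
    }
  }
  mix = (mix * gain) / 256;             // 8.8 gain: 256 means unity
  if (mix > 32767) mix = 32767;         // hard clip to the 16-bit output range
  else if (mix < -32767) mix = -32767;
  return static_cast<int16_t>(mix);
}
```

With a single full-scale voice at unity gain the function returns 32385; with eight full-scale voices at gain 100 the sum still exceeds the output range and is clipped to 32767, which is why the hard clip is kept as a last line of defence.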
The sound engine (sound.cpp) manages 8 Voice structs:
struct Voice {
Oscil<512, MOZZI_AUDIO_RATE> osc; // Wavetable oscillator
ADSR<MOZZI_CONTROL_RATE, MOZZI_AUDIO_RATE> env; // Envelope
uint16_t freq; // Current playing frequency (post-transpose)
uint16_t baseFreq; // Pre-transpose frequency (for matching noteOff)
uint16_t timedDurationTicks; // Auto-release countdown (0 = sustained)
uint32_t timedStartTick; // When the timed note started
uint32_t noteOnTick; // When the note was triggered (for age-based stealing)
bool active;
volatile uint8_t currentLevel; // Envelope level for LED visualiser
};
- Sustained (noteOn/noteOff): used by the GUI keyboard and MIDI. The note plays indefinitely until noteOff(freq) is called. timedDurationTicks is 0.
- Timed (playTimed): used by EEPROM playback, and by earlier versions of the melody player. The note auto-releases after the specified duration. timedDurationTicks is set, and controlUpdate() checks elapsed ticks.
noteOff(freq) must find the correct voice. Since pitch transpose can change between noteOn and noteOff, matching is done by baseFreq (the pre‑transpose frequency), not the actual playing frequency. The instrument offset is applied to the incoming noteOff frequency before matching.
Each voice has an independent ADSR (Attack, Decay, Sustain, Release) envelope generator from Mozzi.
| Parameter | Range | Description |
|---|---|---|
| Attack | 0–1000 ms | Time to rise from 0 to peak (255) |
| Decay | 0–1000 ms | Time to fall from peak to sustain level |
| Sustain Level | 0–255 | Steady‑state amplitude during hold |
| Sustain Length | 0–10000 ms | How long to hold sustain before auto‑release (0 = hold until noteOff) |
| Release | 0–1000 ms | Time to fade from sustain to silence after noteOff |
| ID | Name | A | D | S | SL | R | Octave Shift |
|---|---|---|---|---|---|---|---|
| 0 | Piano | 5 | 200 | 40 | 5000 | 200 | 0 |
| 1 | Organ | 20 | 0 | 255 | 10000 | 150 | 0 |
| 2 | Staccato | 5 | 50 | 0 | 500 | 20 | 0 |
| 3 | Pad | 150 | 800 | 180 | 10000 | 1000 | 0 |
| 4 | Flute | 80 | 0 | 220 | 10000 | 300 | 0 |
| 5 | Bell | 5 | 400 | 0 | 1000 | 1200 | 0 |
| 6 | Bass | 10 | 150 | 150 | 5000 | 100 | −12 (1 octave down) |
| 7 | Custom | user | user | user | user | user | user |
Presets are stored in PROGMEM and loaded via memcpy_P on instrument change. The "Custom" preset is stored in RAM and updated live from the GUI's ADSR sliders.
A potentiometer on pin A5 controls real‑time pitch transposition from −24 to +24 semitones (±2 octaves).
uint16_t raw = mozziAnalogRead(PIN_POT_PITCH); // 0-1023
Why mozziAnalogRead instead of analogRead? Mozzi reconfigures the ADC for its own background conversions. Calling analogRead() would conflict with Mozzi's ADC state machine, potentially corrupting both the analog read and Mozzi's audio timing. mozziAnalogRead() cooperates with Mozzi's ADC scheduler.
ADC noise causes ±2–5 LSB jitter even with a stable pot. A deadband of ±20 raw units prevents spurious pitch updates:
int16_t diff = (int16_t)raw - (int16_t)lastRawPot;
if (diff > 20 || diff < -20) {
lastRawPot = raw;
// ... process new value
}
The 0–1023 range maps linearly to −24…+24 semitones. The semitone value is applied to frequencies using a PROGMEM lookup table of 49 fixed-point ratios (8.8 format):
ratio[0] = 64 // -24 semitones = 0.25× (2 octaves down)
ratio[24] = 256 // 0 semitones = 1.0× (unity)
ratio[48] = 1024 // +24 semitones = 4.0× (2 octaves up)
Application: newFreq = (freq * ratio) >> 8
This avoids pow(2.0, semitones/12.0) which would require floating‑point math (~2000 cycles on AVR vs ~10 cycles for integer multiply + shift).
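To make the 8.8 ratio table concrete, here is a host-side sketch that builds the 49 entries with floating point once and then applies them with integer math only. On the AVR the table would be precomputed and stored in PROGMEM; the lazy initialisation and the helper name semitoneRatio are assumptions of this example, and applyPitchTranspose is named after the firmware function without claiming to match its body.

```cpp
#include <cmath>
#include <cstdint>

// Returns the 8.8 fixed-point frequency ratio for -24..+24 semitones.
// The table is filled on first use here; the firmware keeps it in PROGMEM.
uint16_t semitoneRatio(int8_t semitones) {
  static uint16_t table[49];
  static bool built = false;
  if (!built) {
    for (int i = 0; i < 49; ++i) {
      double ratio = std::pow(2.0, (i - 24) / 12.0);
      table[i] = static_cast<uint16_t>(std::lround(256.0 * ratio));
    }
    built = true;
  }
  return table[semitones + 24];
}

// newFreq = (freq * ratio) >> 8, using a 32-bit intermediate to avoid overflow.
uint16_t applyPitchTranspose(uint16_t freq, int8_t semitones) {
  uint32_t scaled = static_cast<uint32_t>(freq) * semitoneRatio(semitones);
  return static_cast<uint16_t>(scaled >> 8);
}
```

Because each ratio is rounded to 1/256, transposed pitches are approximate: a fifth above 440 Hz comes out as 660 Hz rather than 659.26 Hz. Input frequencies above roughly 16 kHz would overflow the 16-bit result at +24 semitones, which is outside the range this project plays.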
When the pot changes, setPitchTranspose() updates all currently‑active voices in real‑time by recalculating their frequencies from baseFreq through the ratio table. This produces a smooth pitch‑bend effect.
The Arduino sends SYNC:PI <value> over serial whenever the pitch changes. The Python GUI updates its "Pitch (Pot A5)" label accordingly.
Instead of scanning all 8 voices to find a free one (O(n)), a stack‑based allocator provides O(1) allocation:
uint8_t freeVoiceStack[8]; // Stack of free voice indices
uint8_t freeVoiceCount = 8; // Number of free voices
// Allocate: pop from stack
int8_t vi = freeVoiceStack[--freeVoiceCount];
// Free: push to stack
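freeVoiceStack[freeVoiceCount++] = vi;
When all 8 voices are active and a new note arrives, the oldest voice (by noteOnTick) is stolen. The scanner uses the global tick counter to determine age:
uint8_t oldest = 0;
uint32_t oldestAge = 0;
for (uint8_t i = 0; i < 8; i++) {
  uint32_t age = now - voices[i].noteOnTick;  // unsigned subtraction, safe across tick wrap
  if (age > oldestAge) { oldestAge = age; oldest = i; }
}
// voices[oldest] is then reused for the new note
An 8-bit bitmask tracks which voices are active: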
uint8_t activeVoiceMask = 0;
void markVoiceActive(uint8_t i)   { activeVoiceMask |= (uint8_t)(1 << i); }
void markVoiceInactive(uint8_t i) { activeVoiceMask &= (uint8_t)~(1 << i); }
Benefits:
- __builtin_popcount(activeVoiceMask) gives the active voice count cheaply (AVR has no popcount instruction, so this compiles to a short software routine)
- The audio loop iterates via bit-shifting the mask, skipping inactive voices without loading their struct
- activeVoiceMask == 0 provides instant early-exit from audioUpdate()
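The free stack, the active bitmask and oldest-voice stealing fit together as in the sketch below. It is a stand-alone illustration: the VoiceAllocator struct and its member names are hypothetical, and the firmware keeps this state in sound.cpp alongside the Voice structs rather than in a separate type.

```cpp
#include <cstdint>

// Hypothetical stand-alone allocator for 8 voices.
struct VoiceAllocator {
  static constexpr uint8_t kVoices = 8;
  uint8_t freeStack[kVoices];
  uint8_t freeCount;
  uint8_t activeMask;
  uint32_t noteOnTick[kVoices];

  VoiceAllocator() : freeCount(kVoices), activeMask(0) {
    for (uint8_t i = 0; i < kVoices; ++i) {
      freeStack[i] = i;
      noteOnTick[i] = 0;
    }
  }

  // Returns the voice index to use for a note triggered at tick `now`.
  uint8_t allocate(uint32_t now) {
    uint8_t vi = 0;
    if (freeCount > 0) {
      vi = freeStack[--freeCount];              // O(1) pop
    } else {
      uint32_t oldestAge = 0;
      for (uint8_t i = 0; i < kVoices; ++i) {   // steal the longest-sounding voice
        uint32_t age = now - noteOnTick[i];
        if (age > oldestAge) { oldestAge = age; vi = i; }
      }
    }
    noteOnTick[vi] = now;
    activeMask |= static_cast<uint8_t>(1u << vi);
    return vi;
  }

  // Marks a voice inactive and returns it to the free stack.
  void release(uint8_t vi) {
    if (activeMask & (1u << vi)) {              // ignore double frees
      activeMask &= static_cast<uint8_t>(~(1u << vi));
      freeStack[freeCount++] = vi;              // O(1) push
    }
  }
};
```

The linear scan only runs when the stack is empty, so the common case stays O(1) and the O(n) cost is paid only while all 8 voices are already sounding.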
Each voice produces samples in the range −32,640 to +32,385 (int8 × uint8). With 8 voices, the theoretical maximum sum is ±261,120 — far exceeding the 16‑bit output range of ±32,767.
Rather than simple equal division (which sounds too quiet with few voices), a tuned gain table provides musically useful levels:
| Active Voices | Gain (8.8 fixed point) | Effective Scale |
|---|---|---|
| 0–1 | 320 | 1.25× (boosted for HIFI) |
| 2 | 240 | 0.9375× |
| 3 | 200 | 0.78× |
| 4 | 160 | 0.625× |
| 5–6 | 120 | 0.47× |
| 7–8 | 100 | 0.39× |
When voices start or stop, the gain target changes. Applying it instantly would cause an audible click (the entire mix suddenly jumping in volume). An integer IIR filter (exponential moving average), running at 64 Hz control rate, ramps the gain smoothly:
int16_t gainDelta = (int16_t)targetGain - (int16_t)smoothedGain;
smoothedGain += gainDelta >> 2; // one-pole low-pass, roughly 3 Hz cutoff at the 64 Hz control rate
The >>2 shift gives ~70 ms settling time, preventing pops when voices start/stop.
HIFI gain boost rationale: With 14-bit output (vs the old 9-bit standard mode), the PWM output resolves 32× more amplitude levels. Signals that were previously lost in quantisation noise are now cleanly rendered. This allows running the gain ~25% hotter without introducing artefacts.
After gain scaling, samples exceeding ±32,767 are hard‑clipped. With the gain table's conservative approach, clipping only occurs when many loud voices play simultaneously — and the smooth gain transition prevents sudden clipping artefacts.
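The gain table and the smoothing step can be exercised on their own. In this sketch the function names targetGainFor and smoothGainStep are hypothetical; the numbers come from the table above. It relies on an arithmetic right shift for negative deltas, which is what GCC and Clang do and what C++20 guarantees.

```cpp
#include <cstdint>

// Gain targets in 8.8 fixed point (256 = unity), keyed by active voice count.
uint16_t targetGainFor(uint8_t activeVoices) {
  if (activeVoices <= 1) return 320;
  if (activeVoices == 2) return 240;
  if (activeVoices == 3) return 200;
  if (activeVoices == 4) return 160;
  if (activeVoices <= 6) return 120;
  return 100;
}

// One control-rate step: move a quarter of the remaining distance to the target.
uint16_t smoothGainStep(uint16_t smoothed, uint16_t target) {
  int16_t delta = static_cast<int16_t>(target) - static_cast<int16_t>(smoothed);
  return static_cast<uint16_t>(smoothed + (delta >> 2));
}
```

One detail worth knowing: because the shift truncates, a rising gain stops up to 3 counts short of its target (for example at 317 when heading for 320). That is about 1% of the gain value, so it is unlikely to matter audibly, but it explains why smoothedGain may never equal targetGain exactly.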
All waveforms are 512‑sample wavetables stored in PROGMEM as int8_t arrays (−128 to +127).
| ID | Name | Source |
|---|---|---|
| 0 | Triangle | Mozzi built‑in |
| 1 | Sawtooth | Mozzi built‑in |
| 2 | Sine | Mozzi built‑in |
| 3 | Square (no alias) | Mozzi built‑in |
| 4 | Electric Piano | AKWF collection, modified |
| 5 | Clarinet | AKWF collection |
| 6 | Violin | AKWF collection |
| 7 | FM Synth | AKWF collection |
| 8 | Guitar | AKWF collection |
| 9 | Cello | AKWF collection |
| 10 | Flute | AKWF collection |
| 11 | NES Pulse (12.5%) | AKWF/custom |
| 12 | Oboe | AKWF collection |
| 13 | Osc Chip | AKWF collection |
| 14 | Piano | AKWF collection |
AKWF = Adventure Kid Waveforms — a public‑domain collection of single‑cycle waveforms. The originals are 600+ samples; they were resampled to 512 and quantised to 8‑bit signed for Mozzi compatibility.
A wavetable oscillator works by stepping through a stored waveform at a variable rate:
- A phase accumulator increments by a step size each sample.
- The step size is proportional to the desired frequency: step = tableSize × freq / sampleRate.
- The accumulator index wraps around at tableSize, producing a periodic waveform.
- The sample at the current index is output.
Mozzi's Oscil<512, MOZZI_AUDIO_RATE> handles all of this with fixed‑point arithmetic for sub‑sample accuracy.
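The phase-accumulator steps listed above can be written out as a minimal oscillator. This is an illustration of the technique, not Mozzi's Oscil source: the class name WavetableOsc and the 16.16 fixed-point layout are assumptions, and the table lives in ordinary memory instead of PROGMEM.

```cpp
#include <cstdint>

// Minimal phase-accumulator oscillator for a 512-sample table at 16,384 Hz.
class WavetableOsc {
 public:
  explicit WavetableOsc(const int8_t* table)
      : table_(table), phase_(0), step_(0) {}

  // step = tableSize * freq / sampleRate, kept in 16.16 fixed point.
  void setFreq(uint16_t hz) {
    uint64_t scaled = (static_cast<uint64_t>(hz) * kTableSize) << 16;
    step_ = static_cast<uint32_t>(scaled / kSampleRate);
  }

  // Returns the sample at the current index, then advances the phase.
  int8_t next() {
    int8_t sample = table_[phase_ >> 16];     // integer part is the table index
    phase_ = (phase_ + step_) & kPhaseMask;   // wrap at tableSize
    return sample;
  }

 private:
  static constexpr uint32_t kTableSize = 512;
  static constexpr uint32_t kSampleRate = 16384;
  static constexpr uint32_t kPhaseMask = (kTableSize << 16) - 1;
  const int8_t* table_;
  uint32_t phase_;
  uint32_t step_;
};
```

At 32 Hz the step is exactly one table entry per sample (512 × 32 / 16384 = 1), so one full cycle takes 512 samples. Higher notes skip entries, and the fractional 16 bits keep the average pitch accurate even when the step is not a whole number.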
The recorder captures note events to the Arduino Uno's 1 KB EEPROM for offline playback.
Each note is stored as a 6‑byte RecordedNote:
struct RecordedNote {
uint16_t deltaMs; // Silence before this note (ms since last event)
uint16_t frequency; // Hz (0 = rest)
uint16_t duration; // How long the note played (ms)
};
Capacity: (1024 − 2 header bytes) / 6 = 170 notes maximum.
EEPROM writes take 3.3 ms each and are blocking — the CPU halts until the write completes. At 16 kHz audio rate, 3.3 ms = 54 missed audio samples, causing a massive audible glitch.
Notes are buffered in a 32‑entry RAM ring buffer. In each updateControl() call (64 Hz), if the EEPROM hardware is ready (eeprom_is_ready()), exactly one byte is written using raw register access:
EEAR = address; // Set EEPROM address register
EEDR = data_byte; // Set EEPROM data register
EECR |= (1 << EEMPE); // Master Program Enable
EECR |= (1 << EEPE); // Start Write
Since each note is 6 bytes and writes happen at 64 Hz (one byte per cycle), a single note takes 6/64 = 93.75 ms to fully persist. The ring buffer absorbs bursts of fast notes.
Events are only written when a note is released (key up). At that point, both the delta (silence before the note started) and the duration (how long it played) are known. This halves the number of EEPROM writes compared to writing on both note‑on and note‑off.
When recording stops, a silence marker is written capturing the gap between the last note‑off and the stop button press. Without this, the last note would appear to have zero duration on playback.
Recorder::stop() doesn't immediately halt — it enters a "stopping" state and waits for the ring buffer to fully flush to EEPROM before finalising the recording count.
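The trickle-write scheme can be simulated on a host to check the bookkeeping. In the sketch below the eeprom array stands in for the real EEPROM and the ready flag for eeprom_is_ready(). The class name TrickleWriter, the little-endian byte order and the placement of the 2 header bytes at addresses 0 and 1 are assumptions, not details confirmed by the firmware.

```cpp
#include <cstdint>

struct RecordedNote { uint16_t deltaMs, frequency, duration; };

// Host-side simulation: a 32-entry ring buffer drained one byte per tick.
class TrickleWriter {
 public:
  static constexpr uint8_t kSize = 32;          // ring capacity, power of two
  uint8_t eeprom[1024] = {};                    // stand-in for the real EEPROM
  bool ready = true;                            // stand-in for eeprom_is_ready()

  bool push(const RecordedNote& n) {
    if (count_ == kSize) return false;          // buffer full, note dropped
    ring_[(head_ + count_) & (kSize - 1)] = n;  // bitmask instead of modulo
    ++count_;
    return true;
  }

  // Call once per control tick: writes at most one byte.
  void tick() {
    if (!ready || count_ == 0 || addr_ >= sizeof(eeprom)) return;
    const RecordedNote& n = ring_[head_];
    const uint16_t fields[3] = {n.deltaMs, n.frequency, n.duration};
    uint16_t v = fields[byteInNote_ >> 1];
    // Low byte first, then high byte (assumed little-endian layout).
    eeprom[addr_++] = static_cast<uint8_t>((byteInNote_ & 1) ? (v >> 8) : (v & 0xFF));
    if (++byteInNote_ == 6) {                   // note fully persisted
      byteInNote_ = 0;
      head_ = (head_ + 1) & (kSize - 1);
      --count_;
    }
  }

  // True once the ring buffer has been flushed completely.
  bool idle() const { return count_ == 0; }

 private:
  RecordedNote ring_[kSize];
  uint8_t head_ = 0, count_ = 0, byteInNote_ = 0;
  uint16_t addr_ = 2;                           // skip the assumed 2-byte header
};
```

The idle() check mirrors the "stopping" state described above: a stop request would keep calling tick() until idle() returns true and only then finalise the recording count.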
13 melodies are stored in PROGMEM as arrays of Note structs:
struct Note {
uint16_t frequency; // Hz (0 = rest)
uint16_t duration; // ms
};
A PROGMEM lookup table maps melody IDs to their data pointers and lengths, replacing a 13-case switch statement:
const MelodyInfo MELODY_TABLE[13] PROGMEM = {
{melody_mario, melody_mario_length},
{melody_tetris, melody_tetris_length},
// ...
};
The player uses the same tick-based timing as all other modules. Perfect time accumulation prevents drift:
noteStartMs += noteDurationMs; // Accumulate, don't use "now" each time
If the system falls behind by >65 ms (e.g. due to heavy processing), it snaps to current time rather than trying to catch up.
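The accumulate-then-snap rule can be captured in a few lines. The helper nextNoteStart is hypothetical; only the accumulation and the 65 ms threshold come from the description above. The wrap check for "early" calls is an extra safeguard added for this sketch.

```cpp
#include <cstdint>

// Returns the start time of the next note, in milliseconds.
uint32_t nextNoteStart(uint32_t noteStartMs, uint16_t noteDurationMs,
                       uint32_t nowMs) {
  uint32_t scheduled = noteStartMs + noteDurationMs;  // accumulate, no drift
  uint32_t lag = nowMs - scheduled;   // wraps to a huge value if we are early
  if (lag > 65 && lag < 0x80000000UL) {
    return nowMs;                     // too far behind: snap to current time
  }
  return scheduled;
}
```

Scheduling from the previous scheduled time, instead of from "now", means a control tick that arrives a few milliseconds late does not push every later note back by the same amount.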
Melody notes use Sound::noteOn(freq) / Sound::noteOff(freq) — the exact same code path as GUI‑triggered MIDI notes. Previous implementations used playTimed() which caused double instrument‑offset application and voice pile‑up. The fix tracks lastMelodyFreq and releases the previous note before starting a new one.
At 2 Mbaud, the protocol must be byte‑efficient. A single note‑on is 1 byte. ADSR configuration is 11 bytes (command + 10 data). Text‑based protocols (e.g. "NOTE:ON:440\n") would be 12+ bytes and require parsing.
System Commands (0x00–0x14):
| Byte | Command | Data Bytes |
|---|---|---|
| 0x00 | STOP_ALL | 0 |
| 0x01 | REC_START | 0 |
| 0x02 | REC_STOP | 0 |
| 0x03 | REC_PLAY | 0 |
| 0x04 | REC_CLEAR | 0 |
| 0x05 | SET_ADSR | 10 (see below) |
| 0x07 | SET_INST | 1 (instrument ID 0–7) |
| 0x08 | PLAY_MELODY | 1 (melody ID 0–12) |
| 0x09 | SET_WAVEFORM | 1 (waveform ID 0–14) |
Note Commands:
| Range | Meaning | Mapping |
|---|---|---|
| 0x15–0x7F (21–127) | NOTE_OFF | MIDI note = byte value |
| 0x95–0xFF (149–255) | NOTE_ON | MIDI note = byte − 128 |
The gap (0x80–0x94) is unused. MIDI note 60 = Middle C.
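The byte ranges above translate into a simple classifier for the first byte of a message. The enum, struct and function names here are hypothetical; the ranges are the ones listed in the tables. Argument bytes that follow a system command are not classified this way, because they may take any value.

```cpp
#include <cstdint>

enum class ByteKind { System, NoteOff, NoteOn, Unused };

struct DecodedByte {
  ByteKind kind;
  uint8_t value;  // command ID for System, MIDI note for NoteOn/NoteOff
};

// Classifies the first byte of a message according to the protocol ranges.
DecodedByte classifyByte(uint8_t b) {
  if (b <= 0x14) return {ByteKind::System, b};
  if (b <= 0x7F) return {ByteKind::NoteOff, b};                 // 21..127
  if (b >= 0x95) return {ByteKind::NoteOn, static_cast<uint8_t>(b - 128)};
  return {ByteKind::Unused, b};                                 // 0x80..0x94
}
```

For example, Middle C (MIDI note 60) is sent as 0xBC for note-on and 0x3C for note-off.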
ADSR Data Format (10 bytes after 0x05):
[A_hi, A_lo, D_hi, D_lo, S, SL_hi, SL_lo, R_hi, R_lo, Oct]
16‑bit values are transmitted big‑endian. Oct is a signed int8 (octave shift in semitones).
Protocol::hasCommand() implements a byte‑by‑byte state machine that can parse commands across multiple calls without blocking:
State::IDLE → receives system command → State::ARG1 or State::ARG_ADSR
State::ARG1 → receives 1 argument byte → command ready
State::ARG_ADSR → receives 10 bytes → command ready
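A byte-at-a-time parser following these three states might look like the sketch below. The class CommandParser and its members are hypothetical, and Protocol::hasCommand() may be structured differently. The sketch treats 0x05 as the 10-argument command, 0x07 to 0x09 as 1-argument commands, and every other first byte (including note bytes) as complete on its own.

```cpp
#include <cstdint>

// Non-blocking parser: call feed() for every received byte.
class CommandParser {
 public:
  uint8_t command = 0;
  uint8_t args[10] = {};

  // Returns true when a full command (with all arguments) is available.
  bool feed(uint8_t b) {
    switch (state_) {
      case State::IDLE:
        command = b;
        got_ = 0;
        if (b == 0x05) { need_ = 10; state_ = State::ARG_ADSR; return false; }
        if (b >= 0x07 && b <= 0x09) { need_ = 1; state_ = State::ARG1; return false; }
        return true;                 // zero-argument command or note byte
      case State::ARG1:
      case State::ARG_ADSR:
        args[got_++] = b;
        if (got_ < need_) return false;
        state_ = State::IDLE;
        return true;
    }
    return false;
  }

  // 16-bit ADSR fields are big-endian: high byte first.
  uint16_t attackMs() const {
    return static_cast<uint16_t>((args[0] << 8) | args[1]);
  }
  // Oct is the last ADSR byte, a signed semitone offset.
  int8_t octaveShift() const { return static_cast<int8_t>(args[9]); }

 private:
  enum class State { IDLE, ARG1, ARG_ADSR };
  State state_ = State::IDLE;
  uint8_t need_ = 0, got_ = 0;
};
```

Because the state is kept between calls, a command split across two control cycles is simply completed on the second call, with no waiting in between.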
Text‑based responses are sent via Serial.println():
- NOTE:ON <freq> is sent when a note is triggered (for GUI piano key visualisation)
- NOTE:OFF <freq> is sent when a specific note is released (per-voice, not global)
- SYNC:PI <semitones> is sent when the pitch pot changes
- STOP acknowledges a stop-all command
- PLAYING <id> is sent when a melody starts
2 Mbaud (2,000,000 baud) is the maximum stable rate achievable on the test hardware. A high baud rate is critical because Serial.print() blocks once the TX buffer is full, until enough bytes have been shifted out. At lower baud rates, serial output during updateControl() would take long enough to cause audio glitches. At 2 Mbaud, a 15-character response takes only ~75 µs (150 bits at 10 bits per character).
Four buttons on A0–A3 use INPUT_PULLUP and active‑LOW logic. They are polled every updateControl() cycle (64 Hz) with software debouncing. Each button triggers its action once on the initial press, ignoring subsequent reads until released and re‑pressed.
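A press-once debouncer polled at control rate could be sketched as follows. The class name Button and the 2-tick threshold (about 31 ms at 64 Hz) are assumptions; the text above only states that debouncing is done in software and that each press fires once.

```cpp
#include <cstdint>

// Polled once per control tick. `rawLow` is true while the pin reads LOW,
// which means "pressed" because the inputs use INPUT_PULLUP.
class Button {
 public:
  // Returns true exactly once per press, after the input has been stable.
  bool poll(bool rawLow) {
    if (!rawLow) {                   // released: reset and re-arm
      stable_ = 0;
      fired_ = false;
      return false;
    }
    if (stable_ < kThreshold) ++stable_;
    if (stable_ == kThreshold && !fired_) {
      fired_ = true;
      return true;
    }
    return false;
  }

 private:
  static constexpr uint8_t kThreshold = 2;  // assumed debounce length in ticks
  uint8_t stable_ = 0;
  bool fired_ = false;
};
```

In a control loop this would be called as something like `if (recordButton.poll(digitalRead(A0) == LOW)) { ... }`, where recordButton is a hypothetical instance name.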
The RGB LED provides visual feedback:
- Hue derives from the current note's pitch (frequency → colour mapping via HSV→RGB conversion)
- Brightness is modulated by the active envelope level (loud = bright, silent = off)
- Passive mode — when no notes are playing, a slow rainbow cycle runs
- Maximum brightness is capped at 40/255 for USB power safety (5V, 500 mA shared with the entire board)
A 4‑digit 7‑segment display shows the currently‑playing note:
- Digits 0–1: Note name (e.g. "C ", "Ab")
- Digit 3: Octave number (0–9)
Custom segment patterns handle sharps/flats. The display updates at control rate (64 Hz).
The GUI is a tkinter application split across several modules:
| File | Purpose |
|---|---|
| piano_gui.py | Entry point with argument parsing |
| gui/app.py | Main PianoGUI class: all UI setup and event handling |
| gui/config.py | Constants: keybindings, instruments, melodies, waveforms |
| gui/serial_comm.py | Thread-safe serial communication (ArduinoSerial class) |
| gui/components.py | PianoKey widget (tk.Canvas subclass) |
| gui/midi_handler.py | MIDI file loading and playback (mido library) |
61 visual keys (C2–C7) rendered with tk.Canvas widgets. Black keys are overlaid on white keys with place() geometry. The keyboard auto‑scales on window resize (debounced).
Two‑row piano layout using tkinter keysyms:
- Z‑row (lower): C3–B3 white keys, home row (S, D, G, H, J, L, ;) for sharps
- Q‑row (upper): C4–G5 white keys
- Punctuation keys (comma, period, slash, brackets) mapped by keysym name
Keys are rebindable at runtime by right‑clicking a piano key.
A background thread reads serial lines and dispatches to on_arduino_message() via root.after() for thread safety. The except Exception: pass pattern was replaced with explicit error logging to prevent silent failures.
SYNC:PI <value> messages from Arduino update the "Pitch (Pot A5)" label in real‑time. The label shows the current semitone offset with sign (e.g. "+5 st", "−12 st", "0 st").
A "Reset" button in the connection bar sends STOP_ALL + SET_WAVEFORM(Piano) + SET_INST(Piano) to the Arduino and resets all GUI controls to defaults.
Mozzi reconfigures Timer0, which powers millis(). While millis() technically still works in Mozzi 2.x, it may have reduced accuracy and its use inside audio callbacks is unsafe. More critically, millis() requires 4 bytes of global state protected by interrupt disabling, which adds latency jitter to the audio interrupt.
A global volatile uint32_t controlTicks is incremented once per updateControl() call (64 Hz). All timing in the project uses this counter:
// Convert ticks to milliseconds
uint32_t ticksToMs(uint32_t ticks) {
return (ticks * 125) >> 3; // ticks × 15.625 = ticks × 125 / 8
}
// Convert milliseconds to ticks (fast integer math, no division)
uint16_t msToTicks(uint16_t ms) {
uint32_t ticks = (ms * 4195UL) >> 16;
return ticks > 0 ? ticks : 1;
}
The tick counter overflows after 2³² / 64 / 86400 ≈ 776 days of continuous operation.
At 64 Hz, the timing resolution is 15.625 ms. This is sufficient for musical timing (a semiquaver at 200 BPM = 75 ms), but means very short notes (<15 ms) may be quantised up to 1 tick minimum.
| # | Optimisation | Where | Saving |
|---|---|---|---|
| 1 | Tick‑based timing | sound.cpp, player.cpp, recorder.cpp | Eliminates millis() overhead and Timer0 conflicts |
| 2 | Active voice bitmask | sound.cpp audioUpdate | Skips inactive voices without struct access |
| 3 | Melody lookup table | player.cpp | O(1) melody selection instead of 13‑case switch |
| 4 | Bit‑shift octave | sound.cpp applyInstrument | 2 cycles vs 200+ for division |
| 5 | Cached octave offset | sound.cpp loadProfile | Division by 12 done once on instrument change |
| 6 | Voice free stack | sound.cpp allocateVoice | O(1) allocation instead of O(n) scan |
| 7 | Fixed‑point transpose | sound.cpp applyPitchTranspose | ~10 cycles vs ~2000 for pow() |
| 8 | int8×uint8 multiply | sound.cpp audioUpdate | Uses the AVR hardware multiply (MULSU, ~2 cycles) rather than an int32 multiply (~40 cycles) |
| 9 | PROGMEM tables | throughout | Wavetables and melodies in Flash, not RAM |
| 10 | Non‑blocking EEPROM | recorder.cpp | 1 byte per cycle, zero audio impact |
| 11 | Lightweight NOTE sync | player.cpp | Manual utoa + single Serial.write() vs snprintf / Serial.print() chain |
| 12 | Integer IIR gain smoothing | sound.cpp | Replaces Smooth<> library (saves ~1.5 KB flash) |
| 13 | Bitmask modulo | recorder.cpp | & (SIZE-1) replaces % SIZE for power‑of‑2 ring buffer |
| 14 | Bitshift map() | main.cpp | (raw * 49) >> 10 replaces map() (avoids division) |
build_flags = -O3 -flto -DMOZZI_ANALOG_READ_RESOLUTION=10 -DMOZZI_AUDIO_MODE=MOZZI_OUTPUT_2PIN_PWM -DMOZZI_AUDIO_RATE=16384
- -O3: Aggressive compiler optimisations (inlining, loop unrolling)
- -flto: Link-Time Optimisation, which allows cross-file inlining and dead code elimination
- -DMOZZI_ANALOG_READ_RESOLUTION=10: Keeps standard 10-bit ADC resolution
- -DMOZZI_AUDIO_MODE=MOZZI_OUTPUT_2PIN_PWM: 14-bit HIFI output on pins 9+10
- -DMOZZI_AUDIO_RATE=16384: 16 kHz sample rate (32 kHz default would starve CPU with 8 voices)
Problem: EEPROM writes block the CPU for 3.3 ms, causing 54 missed audio samples (audible click/pop).
Solution: Non‑blocking trickle write — buffer notes in RAM, write 1 byte per control cycle using raw register access. The EEPROM hardware signals readiness via eeprom_is_ready().
Problem: startMozzi() sets the DIDR0 register to disable digital input buffers on A0–A5 for ADC noise reduction. This permanently breaks digitalRead() on button pins A0–A3.
Solution: Manually re‑enable digital input buffers after startMozzi():
`DIDR0 &= ~(0x0F); // Re-enable bits 0-3 (A0-A3)`

Problem: Applying Smooth<int>(0.95f) to audio samples at 16 kHz created a low-pass filter with 134 Hz cutoff, destroying all audio above 134 Hz. The result was extremely quiet, muffled sound with crackling.
Explanation: Mozzi's Smooth is an exponential moving average. At audio rate (16 kHz), coefficient 0.95 means the cutoff frequency is: fc = -(sampleRate × ln(coeff)) / (2π) = -(16384 × ln(0.95)) / (2π) ≈ 134 Hz.
Solution: Never apply Smooth to audio samples. Apply it only to control signals (gain, parameters) at control rate (64 Hz), where 0.8f gives a cutoff of roughly 2.3 Hz by the same formula, which is appropriate for smooth parameter transitions.
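The cutoff formula quoted above can be evaluated directly. This is a small Python check rather than anything from the firmware; `ema_cutoff_hz` is an illustrative name, and it assumes Smooth behaves as the one-pole exponential moving average described in the explanation.

```python
import math


def ema_cutoff_hz(update_rate_hz: float, coeff: float) -> float:
    """Approximate cutoff of y[n] = coeff * y[n-1] + (1 - coeff) * x[n]."""
    return -update_rate_hz * math.log(coeff) / (2 * math.pi)


audio_rate_cutoff = ema_cutoff_hz(16384, 0.95)
print(round(audio_rate_cutoff))  # 134
```

Calling `ema_cutoff_hz(64, 0.8)` gives the corresponding figure for the control-rate case, where only slowly changing parameters pass through the filter.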
Problem: The melody player called noteOn() for each new note without releasing the previous one. Over time, all 8 voices filled up, forcing voice stealing on every subsequent note. This caused notes to steal from each other in rapid succession, creating timing drift and slowdown.
Solution: Track lastMelodyFreq and call Sound::noteOff(lastMelodyFreq) before each new note — the same code path as GUI MIDI notes. Additionally, removed a double applyInstrument() call that was doubling the octave shift.
Problem: The ADC on pin A5 has inherent noise (±2–5 LSB). With 1024 ADC values mapped to 49 semitone positions, even 1 LSB of noise could cause the pitch to jitter between two semitones during sustain.
Solution: A raw deadband of ±20 ADC units. The pot value must change by at least 20 (out of 1024) before the system registers a new position. This eliminates noise‑induced jitter at the cost of very slight positional hysteresis.
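A deadband of this kind can be sketched in a few lines. The Python model below is a hedged illustration, not the code in main.cpp: `PotDeadband` and its `update` method are hypothetical names, and the sample readings are made up to show noise followed by a real movement.

```python
DEADBAND = 20  # ADC units, from the text above


class PotDeadband:
    """Report a new pot value only after it moves at least DEADBAND counts."""

    def __init__(self, initial: int = 0) -> None:
        self.accepted = initial

    def update(self, raw: int) -> int:
        if abs(raw - self.accepted) >= DEADBAND:
            self.accepted = raw  # real movement: accept the reading
        return self.accepted     # otherwise hold the last accepted value


pot = PotDeadband(initial=512)
noisy = [512, 515, 509, 514, 510, 540, 538, 543]
print([pot.update(raw) for raw in noisy])
# [512, 512, 512, 512, 512, 540, 540, 540]
```

The hysteresis mentioned above is visible in the output: after the jump to 540, readings of 538 and 543 are ignored because they fall inside the new deadband.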
Problem: The ATmega328P has no hardware divide instruction. Division by 12 (for octave calculation) takes ~200–400 cycles via software emulation.
Solution: Precompute the result. Octave offset = octaveShift / 12 is calculated once at instrument‑load time and cached. In the note‑on hot path, only bit‑shifts are used.
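The caching idea can be shown with a minimal Python model. Everything here is an assumption for illustration: the `Instrument` class, its method names, the choice of semitones as the unit of the shift, and the restriction to whole-octave shifts. It is not the code in sound.cpp.

```python
class Instrument:
    """Hypothetical profile holder mirroring the load-time caching idea."""

    def __init__(self) -> None:
        self.octave_offset = 0

    def load_profile(self, octave_shift_semitones: int) -> None:
        # The only division: done once, when the instrument changes.
        # int() truncates toward zero, as C integer division does.
        self.octave_offset = int(octave_shift_semitones / 12)

    def apply(self, freq_hz: int) -> int:
        # Hot path: one shift per note, no division.
        if self.octave_offset >= 0:
            return freq_hz << self.octave_offset
        return freq_hz >> -self.octave_offset


inst = Instrument()
inst.load_profile(12)
print(inst.apply(440))  # 880
inst.load_profile(-12)
print(inst.apply(440))  # 220
```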
Problem: At 9600 baud (default), sending "NOTE:ON:440\n" (12 chars) takes ~12.5 ms — nearly an entire control cycle. This delays the audio loop and causes glitches.
Solution: 2 Mbaud serial. The same 12 characters take ~60 µs. Additionally, the binary protocol reduces most commands to 1–2 bytes.
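Both timing figures follow from the byte count and the baud rate. The Python check below assumes 8N1 framing (one start bit, eight data bits, one stop bit, so 10 bits on the wire per byte); `tx_time_s` is an illustrative helper name.

```python
def tx_time_s(n_bytes: int, baud: int, bits_per_byte: int = 10) -> float:
    """Wire time for n_bytes, assuming 8N1 framing (start + 8 data + stop)."""
    return n_bytes * bits_per_byte / baud


message = b"NOTE:ON:440\n"
slow_ms = tx_time_s(len(message), 9600) * 1e3
fast_us = tx_time_s(len(message), 2_000_000) * 1e6
print(len(message), round(slow_ms, 1), round(fast_us))  # 12 12.5 60
```

The same helper shows why the binary protocol matters less for speed than the baud rate: a 2-byte command at 2 Mbaud occupies the wire for about 10 µs.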
Problem: 8 Voice structs + Mozzi internals + ring buffers + stack consume nearly all 2048 bytes of RAM. Any new feature risks stack overflow.
Solution: Aggressive use of PROGMEM for all constant data (wavetables, melodies, instrument profiles, melody lookup table, string literals). F() macro for all Serial.print() strings. Current usage: 92.4% (1892/2048 bytes) — only 156 bytes free.
Problem: The serial read loop used except Exception: pass, silently swallowing every error including AttributeError in the message handler. This caused the pitch sync handler to never execute.
Solution: Split the try/except — serial I/O errors are caught separately from callback errors. Callback errors are printed to console for debugging.
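The split can be sketched as follows. This is a self-contained Python illustration, not the code in serial_comm.py: `read_loop`, `fake_read_line`, and `buggy_handler` are hypothetical, the `PITCH:3` message is a made-up example, and catching `OSError` is an assumption about which exception class the I/O layer raises.

```python
import traceback


def read_loop(read_line, on_message, max_lines):
    """Read up to max_lines lines; keep I/O errors and callback errors apart."""
    callback_errors = 0
    for _ in range(max_lines):
        try:
            line = read_line()     # serial I/O only
        except OSError:
            break                  # port problem: stop reading
        try:
            on_message(line)       # application callback
        except Exception:
            callback_errors += 1
            traceback.print_exc()  # visible on the console, not swallowed
    return callback_errors


lines = iter([b"PITCH:3\n", b"NOTE:ON:440\n"])


def fake_read_line():
    return next(lines)


def buggy_handler(line):
    if line.startswith(b"PITCH"):
        raise AttributeError("handler bug")  # `pass` would have hidden this


print(read_loop(fake_read_line, buggy_handler, max_lines=2))  # 1
```

With a single `except Exception: pass` around both calls, the `AttributeError` would vanish and the loop would look healthy. Separating the two blocks keeps the reader running while making the handler bug show up as a traceback.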
| Limitation | Impact | Mitigation |
|---|---|---|
| No DAC | Audio is PWM only. Without filtering, raw output sounds like a buzzer. | HIFI 2‑pin PWM + external 2‑pole RC filter + PN2222 amplifier. |
| 14-bit HIFI PWM | ~84 dB dynamic range. Much better than 8-bit (~48 dB) but still not CD quality. | Adequate for the piezo buzzer's output: the speaker is the bottleneck, not the PWM resolution. |
| Timer1 + Timer2 taken | HIFI mode claims both timers. Pins 9, 10 (Timer1) and 11 (Timer2) lose analogWrite(). | Pin 11 (TM1637 CLK) uses bit-bang protocol, unaffected. |
| ~16 kHz sample rate | Nyquist limit = ~8 kHz. No audio content above 8 kHz. Higher harmonics of waveforms are lost. | Acceptable for "chiptune" aesthetic. |
| 2 KB RAM | 92%+ usage. Very little room for new features. Any RAM leak causes stack overflow and crashes. | PROGMEM for all constants. Minimise stack depth. |
| 1 KB EEPROM | ~170 notes max. No way to record long pieces. | External storage (SD card) would need SPI pins + library RAM. |
| 16 MHz clock | CPU budget is tight. 8 voices × wavetable + envelope at 16 kHz ≈ 50% CPU. | Careful optimisation of audio loop. |
| Single core | No parallel processing. Audio interrupt preempts control code. | Non‑blocking design throughout. |
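
The dynamic-range and Nyquist figures in the table follow from standard formulas. The Python check below uses the ideal N-bit quantiser figure of 20·log10(2^N), which ignores filter and amplifier noise; `dynamic_range_db` is an illustrative name.

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Ideal dynamic range of an N-bit quantiser: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)


print(round(dynamic_range_db(14)))  # 84
print(round(dynamic_range_db(8)))   # 48
print(16384 // 2)                   # 8192 (Nyquist limit in Hz)
```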
| Limitation | Impact |
|---|---|
| Monophonic recording | Recorder captures one note at a time. Chords are not recorded. |
| No velocity sensitivity | All notes play at full volume (envelope attack peak = 255). |
| No polyphonic aftertouch | Envelope is set-and-forget per voice. |
| No reverb/delay/chorus | DSP effects require RAM buffers (e.g. 256+ samples for delay). Infeasible at 2 KB. |
| Timer0 compromised | millis() accuracy is reduced. delay() should never be used. |
| Timing quantisation | 64 Hz control rate = 15.6 ms resolution. Notes shorter than this are rounded up. |
From worst to best, the output quality hierarchy is:
- Raw pin → passive buzzer — Harsh, tinny, distorted. The PWM carrier dominates.
- HIFI 2‑pin + RC filter → passive buzzer — Dramatically better. The 125 kHz carrier is inaudible. 14‑bit resolution gives cleaner dynamics.
- HIFI + RC filter → small amplifier → speaker — Best achievable quality. Still "lo‑fi" by modern standards but pleasant for chiptune/retro music.
Not all waveform/envelope combinations sound equally good through the piezo buzzer output chain. The 14‑bit HIFI mode greatly improves dynamic range over standard 8‑bit PWM, and the 125 kHz carrier is inaudible (vs the old ~16 kHz carrier which added harsh high‑frequency noise). However, the limited bandwidth (~8 kHz Nyquist) and piezo resonance characteristics still favour certain sounds.
| Waveform | Notes |
|---|---|
| Piano (14) | Best all‑round waveform. Clean, recognisable timbre. Default for a reason. |
| Violin (6) | Rich harmonic content that translates well even through the buzzer. |
| Square (3) | Classic chiptune sound. The piezo buzzer is naturally suited to square‑wave‑like signals. |
| Clarinet (5) | Odd‑harmonic‑heavy spectrum (similar to square) — works well with the limited bandwidth. |
| Envelope | Notes |
|---|---|
| Piano (0) | Short attack, moderate decay/release. The go‑to preset. |
| Staccato (2) | Very short, punchy notes. Great for fast melodies and rhythm. |
The remaining waveforms (Triangle, Sawtooth, Sine, EP, FM Synth, Guitar, Cello, Flute, NES Pulse, Oboe, Osc Chip) and envelopes (Organ, Pad, Flute, Bell, Bass) are functional but sound mediocre to poor through the buzzer output. The ~8 kHz bandwidth and the piezo frequency response don't do them justice. They might sound better with proper filtering and a real speaker, but through the current hardware they are underwhelming.
Recommended default: Piano waveform + Piano envelope.
Flash (program memory):

| Component | Approximate Size |
|---|---|
| Wavetables (15 × 512 bytes) | ~7.5 KB |
| Melodies (13 PROGMEM arrays) | ~3 KB |
| Mozzi library code (incl. HIFI) | ~3.5 KB |
| Sound engine | ~2.5 KB |
| Protocol + recorder + player | ~2 KB |
| Arduino framework | ~2 KB |
| Other modules | ~1 KB |
| Total | ~21.9 KB / 32 KB (68%) |

SRAM:

| Component | Approximate Size |
|---|---|
| 8 Voice structs | ~400 bytes |
| Mozzi internals (buffers, state) | ~600 bytes |
| EEPROM ring buffer (32 × 6 bytes) | 192 bytes |
| Stack, globals, strings | ~700 bytes |
| Total | ~1892 / 2048 bytes (92.4%) |

EEPROM:

| Usage | Size |
|---|---|
| Recording count (header) | 2 bytes |
| Note data | Up to 1020 bytes (170 notes × 6 bytes) of the 1022 bytes left after the header |
| Total | 1024 bytes |
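
The note capacity follows from the EEPROM size, the header, and the record size given in this document. A quick Python check, with illustrative constant names:

```python
EEPROM_BYTES = 1024
HEADER_BYTES = 2  # recording count
NOTE_BYTES = 6    # size of one recorded note

max_notes = (EEPROM_BYTES - HEADER_BYTES) // NOTE_BYTES
used = HEADER_BYTES + max_notes * NOTE_BYTES
print(max_notes, used, EEPROM_BYTES - used)  # 170 1022 2
```

The last two bytes cannot hold a whole 6-byte record, which is why the limit is 170 notes.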
| File | Lines | Description |
|---|---|---|
| src/main.cpp | ~200 | Mozzi integration, updateControl()/updateAudio(), pot reading, command dispatch |
| src/sound.cpp | ~680 | Voice management, wavetable selection, ADSR, pitch transpose, audio mixing |
| src/recorder.cpp | ~380 | Ring-buffered EEPROM recording, trickle write, playback state machine |
| src/player.cpp | ~180 | PROGMEM melody playback with tick-based timing |
| src/protocol.cpp | ~160 | Binary serial parser, state machine, response helpers |
| src/buttons.cpp | ~60 | Debounced hardware button polling |
| src/leds.cpp | ~100 | RGB visualiser with HSV mapping, status LEDs |
| src/display.cpp | ~130 | TM1637 note display with custom segment patterns |
| include/config.h | ~70 | Pin definitions, timing constants, serial config |
| include/sound.h | ~150 | Sound module API declarations and EnvelopeProfile struct |
| include/protocol.h | ~80 | Command byte map and CmdType enum |
| include/recorder.h | ~70 | RecordedNote struct and Recorder API |
| include/melodies.h | ~600+ | 13 melody note arrays in PROGMEM |
| tools/gui/app.py | ~800 | Main GUI class with all UI and event handling |
| tools/gui/config.py | ~100 | Key bindings, instrument presets, waveform list |
| tools/gui/serial_comm.py | ~70 | Thread-safe Arduino serial communication |
| tools/gui/components.py | ~60 | PianoKey canvas widget |
| tools/gui/midi_handler.py | ~200 | MIDI file loading and playback |
Last updated: 2026‑04‑02