diff --git a/gateware/docs/dsp/index.rst b/gateware/docs/dsp/index.rst index d28a4e24..f1f7f99f 100644 --- a/gateware/docs/dsp/index.rst +++ b/gateware/docs/dsp/index.rst @@ -4,18 +4,36 @@ DSP Library Philosophy ---------- -TODO short overview of the DSP library philosophy. +Tiliqua's DSP library is designed as a suite of DSP components - independent 'cores' which can be connected together in different ways in order to build a custom DSP pipeline. It makes heavy use of Amaranth streams (`lib.stream `_) for connecting components and `lib.fixed `_ for fixed-point types. `lib.stream `_ makes it possible to chain DSP components together in different ways (without components needing to know implementation details of each other), and `lib.fixed `_ makes it easier to write common numeric operations in Amaranth. -TODO link to Amaranth documentation on streams. +.. note:: -.. image:: /_static/mydsp.png - :width: 800 + Streams are an Amaranth construct describing a *stream of data* that is accompanied by a ``valid``/``ready`` handshake. This is a simple protocol used commonly in digital logic. For more details, see `Data streams `_ in the Amaranth documentation. + +Interconnect +------------ + +Building a custom DSP pipeline with the components provided here is often an act of figuring out how to massage the input and output ports of each component such that the design does what you want. In simple cases, like oscillators or filters, DSP components will often have a ``self.i`` stream for incoming and ``self.o`` stream for outgoing samples - these can be chained together in any order using `wiring.connect() `_. In more complex cases, like delay lines, components may expose a memory bus (for writes to external memory), multiple input or output ports, or global registers. 
It is important to read the documentation of each component and take a look at some of the example cores in order to understand how each component can be used, and how exactly input and output samples are synchronized. + +As of now, input and output ports of DSP components generally take on one of the following shapes: + + - ``stream.Signature(fixed.SQ)``: A stream of audio samples, one at a time. + - ``stream.Signature(ArrayLayout(fixed.SQ, N))``: A stream of N audio samples, one 1D array at a time. This is used for multi-channel, time-synchronized inputs and outputs -- like Tiliqua's 4 inputs or 4 outputs, or the :class:`tiliqua.dsp.MatrixMix` component. These can be split into streams of single samples using :class:`tiliqua.dsp.Split`, or assembled from them using :class:`tiliqua.dsp.Merge` (see :doc:`stream_util`). + - ``stream.Signature(StructLayout({...}))``: A stream of N different types of data, one set at a time. This is often used when each audio sample needs a piece of metadata alongside it (e.g. realtime tweakable filters like :class:`tiliqua.dsp.SVF`). + - ``stream.Signature(Block(...))``: Some components can only operate on blocks of samples, like :class:`tiliqua.dsp.fft.FFT` - see :doc:`block` for details. + +The art is in knowing exactly which components can be used to translate between the interface styles. For example, :class:`tiliqua.dsp.fft.ComputeOverlappingBlocks` can help in going from a sample stream to a block stream, and :class:`tiliqua.dsp.Split` in going from an `ArrayLayout `_ to independent sample streams. Depending on the application, `StructLayout `_ streams will often need some manual handshaking logic. There is no one right answer for every adaptation, especially in cases where you have some control signals alongside synchronized audio streams. + +A few components have auxiliary interfaces to the outside world. 
Examples are :class:`tiliqua.dsp.DelayLine`, which may have a ``bus`` port to talk to external memory (for storing audio samples), or ``usb_audio`` components which require a connection to a USB PHY to service their audio in/out streams. + +'Basic' and 'Specialized' components +------------------------------------ + +DSP cores are split into 2 types, 'Basic' and 'Specialized'. Basic cores do not require qualified access - after a statement like ``from tiliqua import dsp``, these can be accessed through :class:`dsp.Split ` or similar. 'Specialized' cores need qualified access and may be accessed through :class:`dsp.fft.STFTProcessor ` or similar. Basic DSP Components -------------------- -After a statement like ``from tiliqua import dsp``, these can be accessed through ``dsp.Split`` or similar: - .. toctree:: :maxdepth: 2 @@ -33,8 +51,6 @@ After a statement like ``from tiliqua import dsp``, these can be accessed throug Specialized Modules ------------------- -These require qualified access - after a statement like ``from tiliqua import dsp``, these can be accessed through ``dsp.fft.STFTProcessor`` or similar: - .. toctree:: :maxdepth: 2 diff --git a/gateware/src/tiliqua/dsp/delay_effect.py b/gateware/src/tiliqua/dsp/delay_effect.py index c30b3ef6..b7606ac5 100644 --- a/gateware/src/tiliqua/dsp/delay_effect.py +++ b/gateware/src/tiliqua/dsp/delay_effect.py @@ -28,12 +28,30 @@ class PingPongDelay(wiring.Component): Delay lines are created external to this component, and may be SRAM-backed or PSRAM-backed depending on the application. + + Members + ------- + i : :py:`In(stream.Signature(data.ArrayLayout(ASQ, 2)))` + Stereo sample pairs into ping-pong delay. + + o : :py:`Out(stream.Signature(data.ArrayLayout(ASQ, 2)))` + Stereo sample pairs out of ping-pong delay. One per input. 
""" i: In(stream.Signature(data.ArrayLayout(ASQ, 2))) o: Out(stream.Signature(data.ArrayLayout(ASQ, 2))) def __init__(self, delayln1, delayln2, delay_samples=15000): + """ + delayln1 : delay_line.DelayLine + First delay line, must have max length > ``delay_samples``, and have + been created with ``DelayLine.write_triggers_read == True``. + delayln2 : delay_line.DelayLine + Second delay line, must have max length > ``delay_samples``, and have + been created with ``DelayLine.write_triggers_read == True``. + delay_samples : int + Length of each ping-pong section in samples. + """ super().__init__() self.delayln1 = delayln1 @@ -104,12 +122,28 @@ class Diffuser(wiring.Component): Delay lines are created external to this component, and may be SRAM-backed or PSRAM-backed depending on the application. + + Members + ------- + i : :py:`In(stream.Signature(data.ArrayLayout(ASQ, 4)))` + Sample array into the delay effect. + + o : :py:`Out(stream.Signature(data.ArrayLayout(ASQ, 4)))` + Sample array out of the delay effect. One is produced per input. """ i: In(stream.Signature(data.ArrayLayout(ASQ, 4))) o: Out(stream.Signature(data.ArrayLayout(ASQ, 4))) def __init__(self, delay_lines, delays=None): + """ + delay_lines : [delay_line.DelayLine] + Array of 4 delay lines used for feedback. Each delay line must be + at least as long as the corresponding entry in ``delays``. + delays : [int] + Fixed tap delay of each feedback path - one for each delay line. + If not provided, some default tap lengths are used. + """ super().__init__() if delays is None: @@ -197,9 +231,27 @@ class Boxcar(wiring.Component): no multiplies but instead requiring space for N samples. Can be used in low- or high-pass mode. + + Members + ------- + i : :py:`In(stream.Signature(sq))` + Samples into the boxcar averager. + + o : :py:`Out(stream.Signature(sq))` + Samples out of the boxcar averager, one produced per input. 
""" def __init__(self, n: int=32, hpf=False, sq=ASQ): + """ + n : int + Delay line size and window length of the averager. + hpf : bool + High-pass mode - if true, the average value is subtracted from the + last sample and we emit the difference, rather than emitting the + average value itself. Almost no extra cost, useful for other applications. + sq : fixed.SQ + Fixed-point type used for underlying inputs, outputs and storage. + """ # pow2 constraint on N allows us to shift instead of divide assert(2**exact_log2(n) == n) self.n = n diff --git a/gateware/src/tiliqua/dsp/effects.py b/gateware/src/tiliqua/dsp/effects.py index db16940c..2b613149 100644 --- a/gateware/src/tiliqua/dsp/effects.py +++ b/gateware/src/tiliqua/dsp/effects.py @@ -16,16 +16,43 @@ class WaveShaper(wiring.Component): """ - Waveshaper that maps x to f(x), where the function must be - stateless so we can precompute a mapping lookup table. + ``Waveshaper`` maps every sample ``x`` to ``f(x)``, where ``f`` + can be any arbitrary python function. - Linear interpolation is used between lut elements. + ``f(x)`` is evaluated at ``N=lut_size`` points at elaboration time, + to create a LUT (lookup table ROM) mapping the input domain (``ASQ``) + to output samples. For any input sample that sits between elements in the + ROM, linear interpolation is used to determine the output sample. + + This can be used for waveshaping, but is also useful for arbitrary + remapping of samples, for example tanh-based soft clipping, linear- + to exponential or linear-to-log space conversion. + + Members + ------- + i : :py:`In(stream.Signature(ASQ))` + Input stream for sending samples to the waveshaper. + + o : :py:`In(stream.Signature(ASQ))` + Output stream for getting samples from the waveshaper. 
""" i: In(stream.Signature(ASQ)) o: Out(stream.Signature(ASQ)) def __init__(self, lut_function=None, lut_size=512, continuous=False, macp=None): + """ + lut_function : function + Function taking and emitting ``float`` values in a valid ``ASQ`` range. + lut_size : int + Size of the LUT ROM in elements. Larger provides a better approximation. + continuous : bool + Behavior of linear interpolation at ``ASQ`` endpoints. For ``ASQ.i_bits==1`` + and for a function where ``f(+1) ~= f(-1)``, this should be used to ensure an + incoming saw results in a continuous output. + macp : mac.MAC + Optional shared MAC provider. + """ self.lut_size = lut_size self.lut_addr_width = exact_log2(lut_size) self.continuous = continuous @@ -127,16 +154,44 @@ def elaborate(self, platform): class PitchShift(wiring.Component): """ - Granular pitch shifter. Works by crossfading 2 separately - tracked taps on a delay line. As a result, maximum grain - size is the delay line 'max_delay' // 2. - - The delay line tap itself must be hooked up to the input - source from outside this component (this allows multiple - shifters to share a single delay line). + Granular pitch shifter. Works by crossfading 2 separately tracked taps on + a delay line - both tap positions are moving towards the write head (for + pitching up) or away from the write head (for pitching down). Whenever + the tap positions are close to overflowing, the discontinuity is smoothed + by a crossfade of length ``xfade``. + + Maximum pitch-shifting grain size is the delay line 'max_delay' // 2. Smaller + grain sizes or crossfades result in a 'fluttering' effect, but have lower latency. + To reduce fluttering at low latency, one can dynamically track the ``grain_sz`` + based on the input frequency. + + The delay line write head must be hooked up to the input source from outside this + component (this allows multiple shifters to share a single delay line). 
+ + Members + ------- + i : :py:`In(stream.Signature(StructLayout({"pitch": ..., "grain_sz": ...})))` + Input stream, one element per desired output sample. ``pitch`` is a + ``fixed.SQ`` value where 0 is no pitch shift, positive shifts up (e.g. 1 is 2x speed), + negative shifts down. ``grain_sz`` is the length of audio grain used for pitch + shifting - up to ``tap.max_delay // 2``. + + o : :py:`Out(stream.Signature(ASQ))` + Output stream of pitch shifted samples. """ def __init__(self, tap, xfade=256, macp=None): + """ + tap : delay_line.DelayLineTap() + ``DelayLineTap`` which the pitch shifter reads at 2 tap positions for every + output sample. The delayline write head must be hooked up to the input source. + xfade : int + Crossfade length between taps, in samples. Crossfades occur at every + transition where we switch from one part of the delayline to the other. + Longer crossfades and grain sizes produce less 'fluttering'. + macp : mac.MAC + Optional shared MAC provider. + """ assert xfade <= (tap.max_delay // 4) self.tap = tap self.xfade = xfade diff --git a/gateware/src/tiliqua/dsp/filters.py b/gateware/src/tiliqua/dsp/filters.py index 59859a14..6ded13ee 100644 --- a/gateware/src/tiliqua/dsp/filters.py +++ b/gateware/src/tiliqua/dsp/filters.py @@ -16,16 +16,32 @@ class SVF(wiring.Component): """ - Oversampled Chamberlin State Variable Filter. + Oversampled Chamberlin State Variable Filter. Provides tunable high-, low-, and + band-pass outputs given a stream of input samples. Filter ``cutoff`` and + ``resonance`` are tunable at the system sample rate, with highpass, lowpass, + bandpass outputs available on stream payloads ``hp``, ``lp``, ``bp``. - Filter `cutoff` and `resonance` are tunable at the system sample rate. - - Highpass, lowpass, bandpass routed out on stream payloads `hp`, `lp`, `bp`. + Includes 2x oversampling internally for improved stability close to Nyquist + / at high resonances. Each output sample uses 6 multiplies. 
Reference: Fig.3 in https://arxiv.org/pdf/2111.05592 + + Members + ------- + i : :py:`In(stream.Signature(StructLayout({"x": sq, "cutoff": sq, "resonance": sq})))` + Input stream for sending input ``x`` and tuning values ``cutoff``, ``resonance`` to the filter. + + o : :py:`Out(stream.Signature(StructLayout({"lp": sq, "hp": sq, "bp": sq})))` + Output stream for getting low-, high- and band-pass samples from the filter. """ def __init__(self, sq=ASQ, macp=None): + """ + sq : fixed.SQ + Data type for all input/output payloads of the SVF. + macp : mac.MAC + Optional shared MAC provider. + """ self.sq = sq self.macp = macp or mac.MAC.default() super().__init__({ @@ -112,11 +128,23 @@ def elaborate(self, platform): class DCBlock(wiring.Component): """ + DC blocker (single-pole IIR with a cutoff near DC). Useful before components + that can stack DC and overflow (e.g. matrix mixers). Only needs a single multiply + per sample. + Loosely based on: https://dspguru.com/dsp/tricks/fixed-point-dc-blocking-filter-with-noise-shaping/ """ def __init__(self, pole=0.999, sq=ASQ, macp=None): + """ + pole : float + Filter pole, which sets the cutoff. Closer to 1 is closer to DC. + sq : fixed.SQ + Data type for all input/output payloads of the filter. + macp : mac.MAC + Optional shared MAC provider. + """ self.macp = macp or mac.MAC.default() self.pole = pole self.sq = sq @@ -168,14 +196,33 @@ def elaborate(self, platform): class OnePole(wiring.Component): """ - Simple lowpass using no multipliers. + Simple lowpass using no multipliers. This is useful for cheap smoothing of + e.g. step changes in control signals. + + Each output sample is computed as ``output += (input - output) >> shift``. + + Members + ------- + i : :py:`In(stream.Signature(ASQ))` + Input stream of samples for the filter. - ``output += (input - output) >> shift`` + o : :py:`Out(stream.Signature(ASQ))` + Output stream of samples from the filter. - :py:`shift` is dynamic: 0 is passthrough, higher values give more smoothing. 
+ shift : :py:`In(unsigned(4))` + Optionally dynamic amount of smoothing. 0 is passthrough, higher + values give exponentially more smoothing. TODO: add default value for + ``self.shift`` so it doesn't need to be hooked up? """ def __init__(self, sq=ASQ, extra_bits=10): + """ + sq : fixed.SQ + Data type for all input/output payloads of the filter. + extra_bits : int + Extra fractional bits on top of ``sq`` for internal data types. This + is needed to reduce quantization of the input samples. + """ self.sq = sq self.sqw = fixed.SQ(sq.i_bits, sq.f_bits + extra_bits) super().__init__({ diff --git a/gateware/src/tiliqua/dsp/mix.py b/gateware/src/tiliqua/dsp/mix.py index a7fd3bc2..ff4f28b0 100644 --- a/gateware/src/tiliqua/dsp/mix.py +++ b/gateware/src/tiliqua/dsp/mix.py @@ -25,19 +25,49 @@ class CoeffUpdate(enum.Enum): class MatrixMix(wiring.Component): """ - Matrix mixer with tunable coefficients and configurable - input & output channel count. Uses a single multiplier. + ``MatrixMix`` takes a stream of samples ``i_channels`` wide and emits + a stream ``o_channels`` wide. The input channels are multiplied by a + matrix of ``i_channels*o_channels`` coefficients, which may be static + or dynamically updated through gateware. - Coefficients must fit inside the self.ctype declared below. - Coefficient update mode is selected by ``coeff_update``: + A single multiplier is shared, where total latency is of the order + ``2*i_channels*o_channels`` from input to output. + + All coefficients must fit inside the self.ctype declared below. + + Coefficients may be updated dynamically, depending on ``coeff_update``: - ``CoeffUpdate.NONE``: No update port. - ``CoeffUpdate.XY``: Stream of ``(o_x, i_y, v)`` updates. - ``CoeffUpdate.BLOCK``: Block stream, mapped row-major to coefficients. + + Members + ------- + i : :py:`In(stream.Signature(data.ArrayLayout(ASQ, i_channels)))` + Input stream for sending sample arrays to the mixer. 
+ + o : :py:`Out(stream.Signature(data.ArrayLayout(ASQ, o_channels)))` + Output stream for fetching sample arrays from the mixer. + + c : :py:`In(stream.Signature(...))` + Optional coefficient update port, type depends on ``self.coeff_update``. """ def __init__(self, i_channels, o_channels, coefficients, - coeff_update=CoeffUpdate.XY): + coeff_update=CoeffUpdate.XY, ctype=mac.SQNative): + """ + i_channels : int + Number of input channels. + o_channels : int + Number of output channels. + coefficients : [[float]] + Nested array of static matrix coefficients, used as initial values + of the coefficient memory. + coeff_update : CoeffUpdate + Whether a dynamic coefficient update port should be added (see above). + ctype : fixed.SQ + Fixed-point type of coefficients in the coefficient memory. + """ assert(len(coefficients) == i_channels) assert(len(coefficients[0]) == o_channels) @@ -46,7 +76,7 @@ def __init__(self, i_channels, o_channels, coefficients, self.o_channels = o_channels self.coeff_update = coeff_update - self.ctype = mac.SQNative + self.ctype = ctype coefficients_flat = [ fixed.Const(x, shape=self.ctype) diff --git a/gateware/src/tiliqua/dsp/oscillators.py b/gateware/src/tiliqua/dsp/oscillators.py index 2294035d..3128c084 100644 --- a/gateware/src/tiliqua/dsp/oscillators.py +++ b/gateware/src/tiliqua/dsp/oscillators.py @@ -15,12 +15,15 @@ class SawNCO(wiring.Component): """ Sawtooth Numerically Controlled Oscillator. - Often this can be simply routed into a LUT waveshaper for any other waveform type. + Frequency is linearly proportional to ``i.payload.freq_inc``, with optional + phase modulation on ``i.payload.phase``. One output sample per input sample. + + Often a saw is routed into a waveshaper to produce arbitrary waveform types. Members ------- i : :py:`In(stream.Signature(data.StructLayout)` - Input stream, with fields :py:`freq_inc` (linear frequency) and + Input stream - :py:`freq_inc` (linear frequency) and :py:`phase` (phase offset). 
One output sample is produced for each input sample. o : :py:`Out(stream.Signature(ASQ))` @@ -60,14 +63,14 @@ def elaborate(self, platform): class WhiteNoise(wiring.Component): """ - Simple white noise generator. + Simple white noise generator based on an LFSR. See: https://www.musicdsp.org/en/latest/Synthesis/216-fast-whitenoise-generator.html Members ------- o : :py:`Out(stream.Signature(ASQ))` - Output stream of white noise. + Output stream of white noise. Throttled by output backpressure on ``o.ready``. """ o: Out(stream.Signature(ASQ)) @@ -111,12 +114,20 @@ class DWO(wiring.Component): Members ------- o : :py:`Out(stream.Signature(ASQ))` - Output stream of sinusoid samples. + Output stream of sinusoid samples. Throttled by output backpressure on ``o.ready``. """ o: Out(stream.Signature(ASQ)) def __init__(self, sq=None, macp=None, c=0.99): + """ + sq : fixed.SQ + Fixed-point type of internal waveguide calculations. + macp : mac.MAC + (optional) shared multiplier provider. + c : float + Tuning coefficient ``C = cos(2*pi*f/fs)`` + """ super().__init__() self.c = c self.sq = sq or self.o.payload.shape() @@ -129,7 +140,6 @@ def elaborate(self, platform): m.submodules.macp = mp = self.macp - # Frequency tuning coefficient: `C = cos(2*pi*f/fs)`. 
C = fixed.Const(self.c, shape=sq) # Initial conditions (determines output amplitude) diff --git a/gateware/src/top/__init__.py b/gateware/src/top/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/gateware/tests/test_dsp.py b/gateware/tests/test_dsp.py index 38366146..5a0ef13c 100644 --- a/gateware/tests/test_dsp.py +++ b/gateware/tests/test_dsp.py @@ -415,6 +415,29 @@ async def testbench(ctx): with sim.write_vcd(vcd_file=open("test_dcblock.vcd", "w")): sim.run() + def test_onepole(self): + + dut = dsp.OnePole() + target = 0.5 + + async def stimulus(ctx): + x = fixed.Const(target, shape=ASQ) + while True: + await stream.put(ctx, dut.i, x) + + async def testbench(ctx): + ctx.set(dut.shift, 4) + for n in range(0, 256): + y = await stream.get(ctx, dut.o) + self.assertAlmostEqual(y.as_float(), target, places=2) + + sim = Simulator(dut) + sim.add_clock(1e-6) + sim.add_testbench(stimulus, background=True) + sim.add_testbench(testbench) + with sim.write_vcd(vcd_file=open("test_onepole.vcd", "w")): + sim.run() + def test_stream_arbiter(self): n_channels = 3
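
Reviewer note on ``test_onepole``: the ``OnePole`` docstring in this diff gives the update rule ``output += (input - output) >> shift``. The sketch below is a plain-Python integer model of that rule only, not the gateware itself. The helper name ``one_pole`` and the choice of 25 fractional bits are assumptions made for illustration, and the model ignores stream handshaking and any pipeline latency in the real component. It may help when deciding how many samples a settling check should wait for with ``shift=4``.

```python
def one_pole(samples, shift, frac_bits=25):
    """Integer model of ``output += (input - output) >> shift``.

    ``frac_bits`` is a hypothetical internal precision chosen for this
    sketch; it is not taken from the Tiliqua sources.
    """
    scale = 1 << frac_bits
    acc = 0  # filter state, starts at zero
    out = []
    for x in samples:
        x_fixed = int(round(x * scale))   # quantize the input sample
        acc += (x_fixed - acc) >> shift   # arithmetic shift, no multiplier
        out.append(acc / scale)
    return out


target = 0.5
y = one_pole([target] * 256, shift=4)

# Index of the first output that is within 0.005 of the target.
settled_at = next(n for n, v in enumerate(y) if abs(v - target) < 0.005)
print("first output:", y[0])
print("settled after", settled_at, "samples; last output:", y[-1])
```

In this model the early outputs sit well below the target and only the later ones are close to it, so an ``assertAlmostEqual`` against ``target`` is most meaningful once the filter has settled (for example on the final sample), assuming the hardware follows the same recurrence.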