Dual channel 16-bit DAC PT8211

Great!
I won't have enough time over the next few days, but I'll add some filtering next week.
I think I'll add a FIR filter. Any idea which parameters I should use? How many taps (it'll be slower with more taps), Q...
 
This is a 1kHz sine without oversampling (OS):
quicksave2.png

This is the same signal with linear OS:
quicksave3.png

Yellow: DAC output
Blue: 1st order LPF with fc=20.4 kHz (7.8kOhm and 1nF)
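
The corner frequency quoted for the RC low-pass follows from fc = 1/(2*pi*R*C). A minimal sketch of that arithmetic, where `rcCornerHz` is a hypothetical helper name and not part of any library:

```cpp
#include <cassert>
#include <cmath>

// Corner frequency (-3 dB point) of a first-order RC low-pass:
// fc = 1 / (2 * pi * R * C)
double rcCornerHz(double ohms, double farads) {
    assert(ohms > 0.0 && farads > 0.0);
    const double pi = std::acos(-1.0);
    return 1.0 / (2.0 * pi * ohms * farads);
}
```

With 7.8 kOhm and 1 nF, `rcCornerHz(7800.0, 1e-9)` comes out at roughly 20.4 kHz, which matches the value given for the blue trace.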


For the FFT I used a 5kHz signal, here's how it looks in the time domain without OS:
quicksave5.png

And with linear interpolated OS:
quicksave14.png

These are the FFTs of the above signals:
No OS, No filter (yellow)
quicksave7.png

No OS, Filter (blue)
quicksave8.png

Linear interpolated OS, No filter (yellow)
quicksave9.png

Linear interpolated OS, Filter (Blue)
quicksave10.png

So, as expected, linear interpolation works quite well for frequencies well below the Nyquist limit (fs/2), and it gets worse the closer you get to fs/2. You can see this very well here: this is 10kHz, and while linear interpolation still helps, it doesn't give you quite a sine shape:
quicksave12.png
quicksave13.png
I also tried higher freq's up to 20k, but IMHO somewhere at 10...12kHz is the practical limit for linear interpolation with 4x OS. For higher freq's we need a better filter.
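
For reference, 4x linear interpolation can be sketched as a standalone function. This is only an illustration of the idea, assuming a single channel and a plain `std::vector` instead of the Audio library's block buffers. The name `upsample4xLinear` is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// 4x oversampling by linear interpolation on one channel.
// For every input sample, three points on the straight line from the
// previous sample to the new one are emitted, followed by the new sample.
std::vector<int16_t> upsample4xLinear(const std::vector<int16_t>& in,
                                      int16_t previous = 0) {
    std::vector<int16_t> out;
    out.reserve(in.size() * 4);
    int32_t old = previous;
    for (int16_t sample : in) {
        const int32_t val = sample;
        const int32_t mid = (old + val) >> 1;                   // halfway point
        out.push_back(static_cast<int16_t>((old + mid) >> 1));  // 1/4 of the way
        out.push_back(static_cast<int16_t>(mid));               // 2/4
        out.push_back(static_cast<int16_t>((mid + val) >> 1));  // 3/4
        out.push_back(static_cast<int16_t>(val));               // new sample
        old = val;
    }
    assert(out.size() == in.size() * 4);
    return out;
}
```

Straight lines between samples are a poor approximation of a sine once there are only a few samples per period, which is consistent with the distortion visible at 10kHz.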

Regarding filter design:
I guess you can use the online filter designer Paul links in the Audio GUI (http://t-filter.engineerjs.com/) if you set fs to 176468 (44117 * 4). I wouldn't bother trying to create hifi-quality, that's not possible without a high order IIR. Maybe try a 16kHz passband and -60dB stop band beginning at 22.05kHz?
 
I just tried to implement a FIR low pass filter, but I'm horrible at software, so I failed miserably. :rolleyes:
I got *something* to work, but when I tried more than 30 taps the Teensy began to drop samples. It was a mess. And with <30 taps the filter was useless. So I'll leave FIR to the professionals :D
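
To get a feeling for why fewer than 30 taps was not enough, here is a rough estimate of the FIR length for the specification suggested above (fs = 176468 Hz, pass band up to 16kHz, stop band from 22.05kHz at -60dB). It uses the common rule of thumb taps ~ (fs / transition width) * (attenuation in dB / 22), so it is only a ballpark figure and not what a filter design tool would return. `estimateFirTaps` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

// Rule-of-thumb FIR length for a low-pass filter:
// taps ~= (fs / transitionWidth) * (stopbandAttenuationDb / 22)
int estimateFirTaps(double fsHz, double passEdgeHz, double stopEdgeHz,
                    double attenDb) {
    assert(fsHz > 0.0 && attenDb > 0.0 && stopEdgeHz > passEdgeHz);
    const double transitionHz = stopEdgeHz - passEdgeHz;
    return static_cast<int>(std::ceil(fsHz / transitionHz * attenDb / 22.0));
}
```

`estimateFirTaps(176468.0, 16000.0, 22050.0, 60.0)` gives 80, well above the 30 taps where samples started to drop. In a 4x zero-stuffing interpolator three out of four input samples are zero, so a polyphase implementation would only need about a quarter of the multiplications per output sample.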


So then I read this paper: http://dspguru.com/sites/dspguru/files/cic.pdf and I thought I couldn't mess up too much, so I implemented a 3-stage CIC filter (three comb and three integrator stages). And to my surprise, after some fiddling, it works! Have a look at this:

quicksave15.png
Channel 4 (red): linear interpolation, implemented by Frank
Channel 1 (yellow): 3 stage CIC filter, pretty much exactly what you find in the paper I linked.
Channel 3 (blue): the yellow signal after the same RC low-pass used in the other measurements
This is a 9kHz signal, I chose it because the difference is best visible in this frequency range.


For comparison, here is a FFT of the same 5kHz sine I used to compare OS/no OS and filter/ no filter in post #52. The screenshots are directly comparable, same settings on the scope for all of them, but this time I used CIC filtering plus the RC-low-pass.
quicksave16.png

Compare this to the last FFT picture in #52, it's even better!

Next I'll try to expand the CIC; up to 8 stages should be possible with 32 bit integers without overflow (CIC filters have gain, so a higher number of stages needs a lot of headroom).

Here's the modified part in Frank's output_pt8211.cpp, this goes inside if (blockL && blockR) {...}
Code:
//memcpy_tointerleaveLR(dest, blockL->data + offsetL, blockR->data + offsetR);
	for (int i=0; i< AUDIO_BLOCK_SAMPLES / 2; i++, offsetL++, offsetR++) {
		int32_t valL = blockL->data[offsetL];
		int32_t valR = blockR->data[offsetR];			
		
		// int32_t nL = (oldL+valL) >> 1;
		int32_t nR = (oldR+valR) >> 1;
		
		
		int32_t comb[3] = {0};
		static int32_t combOld[2] = {0};
		
		comb[0] = valL - oldL;
		comb[1] = comb[0] - combOld[0];
		comb[2] = comb[1] - combOld[1];
		// comb[2] now holds the output of the third comb stage, i.e. the input to the integrators
		combOld[0] = comb[0];
		combOld[1] = comb[1];
		
		for (int j = 0; j < 4; j++) {
			int32_t integrate[3];
			static int32_t integrateOld[3] = {0};
			integrate[0] = ( (j==0) ? (comb[2]) : (0) ) + integrateOld[0];
			integrate[1] = integrate[0] + integrateOld[1];
			integrate[2] = integrate[1] + integrateOld[2];
			// integrate[2] now holds j'th upsampled value
			*(dest+j*2) = integrate[2] >> 4;
			integrateOld[0] = integrate[0];
			integrateOld[1] = integrate[1];
			integrateOld[2] = integrate[2];
		}
			
		// *(dest+0) = (oldL+nL) >> 1;
		*(dest+1) = (oldR+nR) >> 1;
		// *(dest+2) = nL;
		*(dest+3) = nR;
		// *(dest+4) = (nL+valL) >> 1;
		*(dest+5) = (nR+valR) >> 1;
		// *(dest+6) = valL;
		*(dest+7) = valR;
		
		dest+=8;
		
		oldL = valL;
		oldR = valR;
	}
"Incoming" values go through the comb filters, and the comb output is then fed to the integrator filters. The next three inputs to the integrators are zero (zero stuffing). This reduces the output to 1/4, but the filter gain is 64, so the effective gain is 64*(1/4)=16, which is accounted for by bit-shifting (>> 4) at the output.

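The same structure can be written as a small standalone sketch, which makes the gain argument easy to check on a PC. This is an illustration only, assuming one channel and a made-up struct name `Cic3Interp4x`; it is not the code from the fork.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// 3-stage CIC interpolator with a fixed rate change of 4 (one channel).
struct Cic3Interp4x {
    std::array<int32_t, 3> combPrev{};  // previous input of each comb stage
    std::array<int32_t, 3> integ{};     // integrator accumulators

    // Consumes one input sample and writes four output samples.
    void process(int16_t in, int16_t* out) {
        assert(out != nullptr);
        int32_t x = in;
        // Comb stages run at the low rate: y[n] = x[n] - x[n-1]
        for (int s = 0; s < 3; ++s) {
            const int32_t y = x - combPrev[s];
            combPrev[s] = x;
            x = y;
        }
        // Zero stuffing by 4, then the integrators run at the high rate.
        for (int j = 0; j < 4; ++j) {
            int32_t v = (j == 0) ? x : 0;
            for (int s = 0; s < 3; ++s) {
                integ[s] += v;
                v = integ[s];
            }
            // Net gain is 64 * (1/4) = 16, removed by shifting right by 4.
            out[j] = static_cast<int16_t>(v >> 4);
        }
    }
};
```

Feeding a constant value into `process()` a few times should return the same constant on all four outputs once the filter has settled, which confirms that the shift by 4 gives unity gain at DC.
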
Edit: It turns out filter orders >3 significantly reduce amplitude in the passband; they are not useful with a resampling factor as low as 4. I'll leave the filter as it is right now and will add a higher order analog output filter when I get the parts needed.
@Frank, do you have an idea how we could add options for the Frame-Sync-Early bit and for oversampling to the output_i2s object so it can eventually be merged to the Audio lib?
 
I forked the Audio Lib and added support for the PT8211. There are #defines in output_pt8211.cpp (Lines 32 to 36) that can be used to enable oversampling and select the interpolation method (linear or CIC).

Both the linear and the CIC method are completely non-optimized regarding CPU cost, but here are some numbers for the output ISR duty cycle, Teensy running at 96MHz (optimized):

Stereo:
no oversampling, no filtering: 0.3%
4x OS, linear interpolation: 2.0%
4x OS, CIC interpolation: 7.2%

Mono:
no oversampling, no filtering: 0.3%
4x OS, linear interpolation: 1.5%
4x OS, CIC interpolation: 3.3%

The problem with CIC filters is that they hack away at the pass band, my 3-Stage implementation has about -2dB at 10kHz, -5dB at 15kHz and -10dB at 20kHz.
So far I'm quite happy with the results. I hope the CPU cost of the CIC can be reduced once someone who actually knows what he/she is doing (read: not me) looks at the code. Until then I'll try to implement a high shelving filter to compensate for the pass band droop, my goal is a flat frequency response up to about 16kHz. Wish me luck :D
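
The pass band droop mentioned above can be estimated from the ideal CIC magnitude response |sin(pi*f*R/fs_out) / (R*sin(pi*f/fs_out))|^N. A minimal sketch, assuming an input rate of 44117 Hz and the hypothetical function name `cicDroopDb`:

```cpp
#include <cassert>
#include <cmath>

// Magnitude response in dB of an N-stage CIC interpolator with rate change
// `rate`, normalized to 0 dB at DC. fsInHz is the sample rate before
// interpolation.
double cicDroopDb(double fHz, double fsInHz, int rate, int stages) {
    assert(fsInHz > 0.0 && rate >= 1 && stages >= 1);
    if (fHz == 0.0) return 0.0;
    const double pi = std::acos(-1.0);
    const double w = pi * fHz / (fsInHz * rate);  // pi * f / fs_out
    const double ratio = std::sin(rate * w) / (rate * std::sin(w));
    return 20.0 * stages * std::log10(std::fabs(ratio));
}
```

For 3 stages and 4x interpolation this gives roughly -2.1dB at 10kHz, -4.8dB at 15kHz and -8.9dB at 20kHz, which is close to the figures above. The formula describes the CIC alone, so it does not include the analog RC filter.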

-Ben
 
It works, I simply added a stock biquad object in front of the output object. I found a shelf filter with fc = 12kHz and +5dB works well, taking up about 2% CPU usage. I initially planned to use multiple shelving stages, but just one stage performs very well.

Here's a frequency response plot of the CIC filter alone, you can see the droop towards the higher frequencies. Scaling is 2dB per div vertically and 2kHz per div horizontally (linear freq. scale!)
scope_1.png

And this is with the biquad in action:
scope_0.png
+-0.5dB from 0 to 16 kHz! This comes at the expense of reduced bit depth. The biquad has a gain of 5dB, so you need to make sure the input to the filter is at or below -5dB relative to full scale, which means you lose about one bit of accuracy. But I guess that's down in the noise for the PT8211 anyway, so no big deal.
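
The "about one bit" figure follows from roughly 6.02dB per bit. A small sketch of that arithmetic, with hypothetical helper names:

```cpp
#include <cassert>
#include <cmath>

// Linear amplitude factor for a level in dB (negative values attenuate).
double dbToLinear(double db) {
    return std::pow(10.0, db / 20.0);
}

// Resolution given up when the signal is attenuated by attenDb,
// using 20*log10(2) ~ 6.02 dB per bit.
double bitsLost(double attenDb) {
    assert(attenDb >= 0.0);
    return attenDb / (20.0 * std::log10(2.0));
}
```

`dbToLinear(-5.0)` is about 0.56, so the input has to be scaled to 56% of full scale, and `bitsLost(5.0)` is about 0.83 bits.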

-Ben
 
The monologue continues ;)

In filter_biquad.h I found a link to http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt, which describes a method of calculating filter parameters for shelving filters with variable slope. With this additional degree of freedom I found that a shelf filter with +6dB, 12kHz and a slope parameter of 0.5 compensates the CIC filter even better:
highShelf13kHz6dB0.5Slope_2.png
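
For anyone who wants to reproduce the coefficients, the cookbook's high shelf formulas can be sketched in floating point as follows. This is not the code from the fork. The struct and function names are made up, and the conversion to the fixed-point format expected by the Audio library's biquad object is left out.

```cpp
#include <cassert>
#include <cmath>

// Normalized biquad coefficients (a0 == 1) for the difference equation
// y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
struct BiquadCoeffs {
    double b0, b1, b2, a1, a2;
};

// High shelf after the Audio EQ Cookbook, parameterized by shelf slope S.
BiquadCoeffs highShelf(double fsHz, double f0Hz, double gainDb, double slope) {
    assert(fsHz > 0.0 && f0Hz > 0.0 && f0Hz < fsHz / 2.0 && slope > 0.0);
    const double pi = std::acos(-1.0);
    const double A = std::pow(10.0, gainDb / 40.0);
    const double w0 = 2.0 * pi * f0Hz / fsHz;
    const double cosw = std::cos(w0);
    const double alpha = std::sin(w0) / 2.0 *
        std::sqrt((A + 1.0 / A) * (1.0 / slope - 1.0) + 2.0);
    const double k = 2.0 * std::sqrt(A) * alpha;

    const double a0 = (A + 1.0) - (A - 1.0) * cosw + k;
    BiquadCoeffs c;
    c.b0 = A * ((A + 1.0) + (A - 1.0) * cosw + k) / a0;
    c.b1 = -2.0 * A * ((A - 1.0) + (A + 1.0) * cosw) / a0;
    c.b2 = A * ((A + 1.0) + (A - 1.0) * cosw - k) / a0;
    c.a1 = 2.0 * ((A - 1.0) - (A + 1.0) * cosw) / a0;
    c.a2 = ((A + 1.0) - (A - 1.0) * cosw - k) / a0;
    return c;
}
```

A call like `highShelf(44117.0, 12000.0, 6.0, 0.5)` corresponds to the parameters above, assuming the biquad runs at the library's 44117 Hz sample rate. The result has unity gain at DC and +6dB at fs/2, which is a quick sanity check for any implementation.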

As a side effect, I implemented functions to generate shelving filters for the biquad object. Description here, GitHub here. Note this fork also has the experimental PT8211 with oversampling described in this thread.


Edit: Picture didn't work, fixed.
 
I'm back :)

I'll look this evening - I think there are a few things that can be optimized.
For example, for the linear interpolation, there is a great assembler DSP command "signed_halving_add_16_and_16(a,b)" that can calculate two values at once (= one cycle for two additions and two shifts)
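
To illustrate what such a packed halving add does, here is a portable emulation that can be run on a PC. It only mimics the behaviour described above (two signed 16-bit additions, each followed by a shift right by one, on values packed into one 32-bit word). The function names are made up for this sketch and the real instruction does this in hardware.

```cpp
#include <cassert>
#include <cstdint>

// Pack two signed 16-bit values into one 32-bit word (hi in bits 31..16).
uint32_t pack16(int16_t hi, int16_t lo) {
    return (static_cast<uint32_t>(static_cast<uint16_t>(hi)) << 16) |
           static_cast<uint16_t>(lo);
}

int16_t hi16(uint32_t v) { return static_cast<int16_t>(v >> 16); }
int16_t lo16(uint32_t v) { return static_cast<int16_t>(v & 0xFFFFu); }

// Emulates a signed halving add on both 16-bit lanes: (a + b) >> 1 per lane.
// The sum is formed in 32 bits, so the halved result always fits in 16 bits.
// Right-shifting a negative value is arithmetic on mainstream compilers
// (and guaranteed since C++20).
uint32_t shadd16Emulated(uint32_t a, uint32_t b) {
    const int32_t hi = (static_cast<int32_t>(hi16(a)) + hi16(b)) >> 1;
    const int32_t lo = (static_cast<int32_t>(lo16(a)) + lo16(b)) >> 1;
    return pack16(static_cast<int16_t>(hi), static_cast<int16_t>(lo));
}
```

With left and right samples packed into one word, a single call yields the midpoint for both channels, which is the operation the linear interpolation needs.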
 
Great, take your time, I won't have much time until Saturday. I'm not sure if the shelving filter + CIC is the best approach for oversampling; it would at least need another FIR stage after the sample rate conversion, running at 176.468 kHz. So don't put too much effort into optimizing this, it might well be that major parts of the filter must be changed...
 
Yes, I think filtering after interpolation (or combined with it) makes sense.
Would be great to find a solution that can be used for the other outputs too - esp. the teensy-integrated DACs.
 
Would be great to find a solution that can be used for the other outputs too - esp. the teensy-integrated DACs.
Yes, an "upsampler" object that one could simply plug in front of the output objects would be very cool, but my (limited) understanding of the audio library internals tells me that's not possible, because all connections in the signal chain must be the same sample frequency. So I guess the oversampling must be integrated directly into the individual output objects?
 
I dug into my old notes from back when I had to learn the basics of DSP (didn't like it back then, and now I have to learn it for my hobby. Crazy what proper motivation can do to learning speed :D); I also read quite some papers on CICs and CIC compensation filters. So here's the short version of what I learned in the past week, but bear in mind I'm an amateur on this topic, I might have gotten some of this wrong:

CICs are used for sample rate conversion filtering in both interpolation and decimation; they perform well when the rate change factor is large. That is because with large factors the pass band is narrow relative to the up-sampled band (zero to Nyquist), and that narrow band is less affected by the pass band droop inherent to CICs.
Usually a FIR is used (on the low-sample-rate side) to compensate for that droop. I have not found an explanation as to why it's always FIR, never IIR. I suppose it's because the filter has to work at low frequencies where an IIR would be unstable.

But with only 4x interpolation an IIR might well work, so I built a LibreOffice Calc sheet to plot CIC filter responses and to design a biquad shelving filter (see post #56) to compensate for the pass band droop. Give it a try if you like: https://www.dropbox.com/s/gwzn82evyg9ce29/CICcalculator.ods?dl=0 (Although you can view the document inside the browser on dropbox.com, you need to open it with LibreOffice to see the sliders that control the IIR filter parameters. OpenOffice might work too, I haven't tested.)

So the structure will be {whatever signal from the Audio lib}-->{Attenuator?}-->{Low Shelf Biquad}-->{CIC}-->{output}

The attenuator might be necessary to prevent clipping because the following shelf filter may have gain at certain frequencies. Attenuation could be achieved by scaling the biquad's filter parameters, so it wouldn't need to be a separate object in the Audio lib.

The actual CIC filter has gain, too. The thing here is that you are not allowed to truncate bits anywhere inside the filter, because that would introduce rounding errors. The CIC relies on the exact cancellation of poles and zeros in its H(z), and any rounding errors would inhibit this exact cancellation. (BTW that's why CICs only work with fixed point math)

So the good news is that the CIC gain is always a power of two in case of 4x interpolation, so it can be simply bit-shifted away. The bad news is all stages of the CIC have to run with 32bit accuracy. I guess there are Cortex DSP instructions for that too?
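
The power-of-two gain and the required word length can be sketched as follows. This assumes the usual result that an N-stage CIC interpolator with rate change R has a net gain of R^(N-1) after zero stuffing, and that the largest values occur at the last integrator. The function names are hypothetical.

```cpp
#include <cassert>

// log2 of the net gain of an N-stage CIC interpolator with zero stuffing:
// R^N from the filter times 1/R from zero stuffing = R^(N-1).
// log2Rate is log2(R), e.g. 2 for 4x interpolation.
int cicGainShift(int stages, int log2Rate) {
    assert(stages >= 1 && log2Rate >= 0);
    return (stages - 1) * log2Rate;
}

// Word length needed at the last integrator for a given input width.
int cicOutputBits(int inputBits, int stages, int log2Rate) {
    return inputBits + cicGainShift(stages, log2Rate);
}
```

For 3 stages and 4x interpolation the shift is 4 bits, the same as the `>> 4` in the code posted earlier, and 16-bit input needs 20 bits. For 8 stages the shift is 14 bits and the word length is 30 bits, which still fits in a 32 bit integer.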

I think this is everything I can provide on this topic, I'm not capable of generating optimized code for all of this, let alone utilizing the Cortex DSP instructions. If anyone (Frank?) wants to implement a proper CIC interpolator I'd be happy to help with testing and providing scope readings and whatnot. A CIC with a (compile-time?)-configurable amount of stages would be a dream. Up to 8 stages should be possible without clipping in 32 bit.

Another thought, somewhat more related to the actual topic in this thread: If we find a good way to let the user control how AudioOutputPT8211::config_i2s(void) configures the I2S hardware we wouldn't need separate objects for communicating with I2S and right-justified (PT8211) hardware. That would also minimize code repetition for the CIC I guess?

- Ben
 
Yes, I do want to also use 4X over sampling and filtering on the DAC and ADC objects. But I probably will not be able to put any significant work into this until well after the K66 board is fully released.
 
Which opamp chip is that? Will it be powered from the same low-pass filtered power line as the PT8211, or directly from normal power?
 
I plan to use Microchip MCP6002 http://ww1.microchip.com/downloads/en/DeviceDoc/21733j.pdf
But the layout allows for any dual opamp that has the standard pinout. (Should be a Rail-to-Rail I/O type though)

Both the opamp and the DAC will be powered from the filtered line. But you raise a good point here, I didn't spend much time contemplating this, as I couldn't divide the analog and digital power because the DAC has a single supply pin for both. Maybe it's better to have the DAC at Teensy 3.3V and only the OP on the filtered rail...
 
I'd try running both from the same filtered power, at least as a first attempt. Opamps have decent PSRR (at least at low frequencies) and that DAC chip probably doesn't have any if it just resistor divides the power, but the opamp should have very steady current consumption.

Looks like you've got only a 0.1uF cap for filtering the power supply after that inductor. Since that DAC probably has zero PSRR, you'll want a large capacitor like their datasheet suggests, and also that series resistor so you get some low-pass filtering at audio band frequencies. Their example with 10 ohms and 47 uF has a roll-off at 339 Hz, which should at least attenuate the most offensive noises that might be present on the digital power supply. I'd probably go with much more than 47 uF... and remember capacitance is specified with zero volts DC and a tiny AC test signal. You get much less with most capacitors when there's a DC bias, so you need to oversize even more to get effective filtering.

It's really a shame they couldn't have used pin 7 to give the chip a filtered reference voltage for the R-2R ladder inside the chip. Guess you can't have everything for a chip that costs only pennies.
 
I actually planned on relying on the low output impedance of the LP38691 at audio freq's, since it's physically so close. That's why I used a ferrite + cap instead of the RC-filter from the datasheet, to keep impedance in the Audio band low. Don't get me wrong, you're right in everything you said, and your approach is certainly the better one from an engineering point of view! But as you said, it's just a 25ct-DAC.

If I shove all the 0603 passives really close together I could fit an SMD D-size cap in there. A conductive polymer tantalum would be a great solution: Low ESR and ESL and it won't have the voltage dependent degradation like a Class II MLCC, but it would double the total BOM cost :eek: ... and I'm not sure if I could solder the 0603s (by hand) when they are so close together...

big cap.jpg

I could solve this by making the board the full Teensy size instead of trying so hard to keep the program button accessible in case someone wants to mount the shield on top of the Teensy. Hmm....
 
I think this is everything I can provide on this topic, I'm not capable of generating optimized code for all of this, let alone utilizing the Cortex DSP instructions. If anyone (Frank?) wants to implement a proper CIC interpolator I'd be happy to help with testing and providing scope readings and whatnot. A CIC with a (compile-time?)-configurable amount of stages would be a dream. Up to 8 stages should be possible without clipping in 32 bit.

- Ben

I guess there are not too many possibilities to optimize that, the DSP instructions are for 16-bit data. Maybe the two loops can be combined (and the conditional line inside moved outside), but I'm not sure if that helps much (the Cortex does not have enough registers to hold all values, and the compiler may insert some unwanted stack push/pops). Another option could be to use 32-bit writes for the audio data...

We could just try some compiler-switches ...
 
Oh ok, I thought there were 32b DSP instructions available. Paul mentioned he is interested in interpolation/oversampling as well, so I won't spend more time on the code until he looks at it (I know this may take a while, with the K66 beta and the plans for a new Forum, Wiki,...Who knows what's up his sleeve next :D But I'm not in a hurry).
However I'll continue with the analog filter board and order some PCBs.
 
Fantastic, is there a schematic for that, Ben? I couldn't see anything on the OSH Park page apart from the boards.
 