Teensy 4.0 and non DMA I2S transfer

Jean-Marc

I am looking for a way to drive I2S with interrupts instead of DMA.

This library supported it on the Teensy 3:
https://github.com/hughpyle/teensy-i2s

It does not work on the Teensy 4.
I want to use I2S1 (mapped to SAI1).
I started from the Audio library's DMA setup, but I need an interrupt each time a word must be transferred.

If I do the following, the sketch just hangs.
I don't do anything in the IRQ handler yet, but is that normal?

I also find it strange that I had to define some of the flags below myself.
#define I2S_TCSR_FRIE ((uint32_t)0x00000100) // FIFO Request Interrupt Enable
#define I2S_TCSR_FEF ((uint32_t)0x00040000) // FIFO Error Flag (transmit underrun)
#define I2S_TCSR_FRF ((uint32_t)0x00010000) // FIFO Request Flag (transmit watermark reached)
#define I2S_TCSR_SEF ((uint32_t)0x00080000) // Sync Error Flag (frame sync error), bit 19; 0x00020000 is the FIFO warning flag



attachInterruptVector(IRQ_SAI1, i2s1_tx_isr);

NVIC_ENABLE_IRQ(IRQ_SAI1);
I2S1_TCSR |= I2S_TCSR_TE // Transmit Enable
  | I2S_TCSR_BCE // Bit Clock Enable
  | I2S_TCSR_FRIE // FIFO Request Interrupt Enable
  | I2S_TCSR_FR // FIFO Reset
  ;

This just hangs on the T4.
Are non-DMA (interrupt-based) I2S transfers possible on a T4.0?
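
One possible explanation, stated here as an assumption rather than something confirmed in this thread: the FIFO request flag looks level-sensitive, meaning it stays set for as long as the transmit FIFO is at or below its watermark. A handler that never writes to the data register would then be re-entered immediately, and the main sketch would appear to hang. The host-side model below is plain C++, not Teensy code; `TxFifoModel` and the helper names are made up for illustration.

```cpp
// Host-side model of a level-sensitive FIFO request. Sizes are illustrative.
struct TxFifoModel {
  int level = 0;      // words currently queued
  int watermark = 1;  // request is asserted while level <= watermark
  bool request() const { return level <= watermark; }
  void write_word() { ++level; }
};

// Counts back-to-back handler entries until the request drops.
// Gives up after max_entries, which stands in for "the sketch hangs".
template <typename Isr>
int count_isr_entries(TxFifoModel fifo, Isr isr, int max_entries) {
  int entries = 0;
  while (fifo.request() && entries < max_entries) {
    isr(fifo);
    ++entries;
  }
  return entries;
}

// Handler that does nothing, like the first attempt above.
inline void empty_isr(TxFifoModel&) {}

// Handler that queues one word per entry.
inline void filling_isr(TxFifoModel& fifo) { fifo.write_word(); }
```

In this model the empty handler never lets the request drop, while the filling handler returns control after two entries.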
 
Hi Paul,

That is an improvement. If I start it with the code below, I at least get the interrupt, but only once.

I2S1_TCSR = I2S_TCSR_TE | I2S_TCSR_BCE | I2S_TCSR_FRDE | I2S_TCSR_FRIE;

static int ind = 0;

void i2s1_tx_isr(void)
{
  if (!(I2S1_TCSR & I2S_TCSR_FRF)) return;
  I2S1_TDR0 = (uint32_t)0;
  if (I2S1_TCSR & I2S_TCSR_FEF) I2S1_TCSR |= I2S_TCSR_FEF; // clear if underrun
  if (I2S1_TCSR & I2S_TCSR_SEF) I2S1_TCSR |= I2S_TCSR_SEF; // clear if frame sync error
  ind = ind + 1;
  ind = ind & (AUDIO_BLOCK_SAMPLES - 1);
  if (ind == 0) Serial.println("I");
}

When using DMA, the Audio library writes to I2S1_TDR0 + 2.
I was wondering: what is the frame size of the FIFO in transmit mode?
Is it always 16 bits, i.e. a single sample?
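
My reading, offered as an assumption: with the 32-bit word setup used by the Audio library, each FIFO entry is one 32-bit channel slot, and a 16-bit write to I2S1_TDR0 + 2 lands in the upper half of that slot on a little-endian core. From an interrupt handler the equivalent would be a full 32-bit write with the sample shifted up by 16. The helper names below are hypothetical, and the sketch runs on a PC.

```cpp
#include <cstdint>

// Places a signed 16-bit sample in the upper half of a 32-bit FIFO word,
// leaving the lower 16 bits at zero.
inline uint32_t fifo_word_from_sample(int16_t sample) {
  return static_cast<uint32_t>(static_cast<uint16_t>(sample)) << 16;
}

// Inverse helper, useful for checking what was queued.
inline int16_t sample_from_fifo_word(uint32_t word) {
  return static_cast<int16_t>(static_cast<uint16_t>(word >> 16));
}
```

In the ISR above this would read `I2S1_TDR0 = fifo_word_from_sample(s);` instead of writing a bare zero.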


But your sample looks like a good starting point. Thanks, I will have a look tomorrow. At least it keeps generating the IRQ.
 
Hi Paul,

Your sample helped. I was able to add an I2S audio driver (interrupt-based, for the PCM5102) alongside the VGA output, and the disturbance on the video DMA transfers is now gone.
I updated the VGA_t4 library code and also added VGA (with sound) to most emulators of the MCUME project (all on the git).

There is still something I would like to improve: reducing the number of I2S interrupts.
Right now there is 1 interrupt per 16-bit sample sent on the bus.

I copied the I2S setup from the output_i2s module of the Audio library. The DMA there copies 16 bits at a time to I2S1_TDR0 + 2.
I don't really understand this code, since the I2S registers are set up for 32-bit words (32-1, see the code below).

How should I change the register setup to copy 32 bits at a time to I2S1_TDR0? Is there a way to copy even more than 32 bits in the ISR (e.g. 4 x L+R samples)? It is a FIFO, isn't it?
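
A minimal sketch of the idea of queueing several words per interrupt, under assumptions I have not verified on hardware: the FIFO holds more than one 32-bit word, the watermark in TCR1 is set so that there is free space when the request fires, and each word carries one 16-bit sample in its upper half. `MockTxFifo` and `fill_frames` are hypothetical names. The mock stands in for the SAI so the code runs on a PC; on a Teensy the `push()` call would be a write to I2S1_TDR0.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the SAI transmit FIFO.
struct MockTxFifo {
  std::size_t depth;            // assumed FIFO depth in 32-bit words
  std::vector<uint32_t> words;  // words queued so far
  void push(uint32_t w) { words.push_back(w); }
};

// Queues as many whole stereo frames (left word, then right word) as fit.
// Returns the number of frames consumed from the interleaved L/R source.
inline std::size_t fill_frames(MockTxFifo& fifo, const int16_t* interleaved,
                               std::size_t frames) {
  std::size_t done = 0;
  while (done < frames && fifo.words.size() + 2 <= fifo.depth) {
    uint16_t left = static_cast<uint16_t>(interleaved[2 * done]);
    uint16_t right = static_cast<uint16_t>(interleaved[2 * done + 1]);
    fifo.push(static_cast<uint32_t>(left) << 16);
    fifo.push(static_cast<uint32_t>(right) << 16);
    ++done;
  }
  return done;
}
```

Writing whole frames keeps left and right aligned with the frame sync even if the handler is occasionally late.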

I am not very familiar with the SAI chapter of the NXP documentation.

Below is the current code, which I copied from output_i2s:
int rsync = 0;
int tsync = 1;

I2S1_TMR = 0;
I2S1_TCR1 = I2S_TCR1_RFW(1);
I2S1_TCR2 = I2S_TCR2_SYNC(tsync) | I2S_TCR2_BCP // sync=0; tx is async;
  | (I2S_TCR2_BCD | I2S_TCR2_DIV((1)) | I2S_TCR2_MSEL(1));
I2S1_TCR3 = I2S_TCR3_TCE;
I2S1_TCR4 = I2S_TCR4_FRSZ((2-1)) | I2S_TCR4_SYWD((32-1)) | I2S_TCR4_MF
  | I2S_TCR4_FSD | I2S_TCR4_FSE | I2S_TCR4_FSP;
I2S1_TCR5 = I2S_TCR5_WNW((32-1)) | I2S_TCR5_W0W((32-1)) | I2S_TCR5_FBT((32-1));

I2S1_RMR = 0;
I2S1_RCR1 = I2S_RCR1_RFW(1);
I2S1_RCR2 = I2S_RCR2_SYNC(rsync) | I2S_RCR2_BCP // sync=0; rx is async;
  | (I2S_RCR2_BCD | I2S_RCR2_DIV((1)) | I2S_RCR2_MSEL(1));
I2S1_RCR3 = I2S_RCR3_RCE;
I2S1_RCR4 = I2S_RCR4_FRSZ((2-1)) | I2S_RCR4_SYWD((32-1)) | I2S_RCR4_MF
  | I2S_RCR4_FSE | I2S_RCR4_FSP | I2S_RCR4_FSD;
I2S1_RCR5 = I2S_RCR5_WNW((32-1)) | I2S_RCR5_W0W((32-1)) | I2S_RCR5_FBT((32-1));

//CORE_PIN23_CONFIG = 3; // MCLK
CORE_PIN21_CONFIG = 3; // RX_BCLK
CORE_PIN20_CONFIG = 3; // RX_SYNC
CORE_PIN7_CONFIG = 3; // TX_DATA0
I2S1_RCSR |= I2S_RCSR_RE | I2S_RCSR_BCE;
I2S1_TCSR = I2S_TCSR_TE | I2S_TCSR_BCE | I2S_TCSR_FRDE ;//<-- not using DMA */;
 
@Jean-Marc, can you point me to a working version of the non-DMA I2S code?
I looked through some of your GitHub repos but couldn't spot it.
 
//CORE_PIN23_CONFIG = 3; // MCLK
CORE_PIN21_CONFIG = 3; // RX_BCLK
CORE_PIN20_CONFIG = 3; // RX_SYNC
CORE_PIN7_CONFIG = 3; // TX_DATA0
I2S1_RCSR |= I2S_RCSR_RE | I2S_RCSR_BCE;
I2S1_TCSR = I2S_TCSR_TE | I2S_TCSR_BCE | I2S_TCSR_FRDE ;//<-- not using DMA */;

For this section of the code, how would one go about changing the pin numbers?
 
Just use alternate pins that are SAI1-capable.
It’s all in the 1062 reference manual.
 
Maybe this can help?

https://forum.pjrc.com/threads/6281...-on-Teensy-4-1?p=253563&viewfull=1#post253563

Look for the comment "not using DMA" and "start generating TX FIFO interrupts" for the specific bits you need to configure for interrupts.

You'll need to convert from that weird 20-bit format back to normal I2S, but that's mostly a matter of writing the correct values into the 5 TCR & 5 RCR registers to configure the data format as I2S.
 
Hey, I saw you wrote the piece of code shown below a while back for controlling a Galvo using a Teensy board via the XY2-100 protocol. I was wondering if you could just tell me what commands it is sending to the Galvo and how I would go about changing the parameters involved. I am running the code on a Teensy 4.0 but have switched out output pin 32 for 23. I would be grateful if you could point me to any resources you used to write and understand the code.

void setup() {
  pinMode(3, OUTPUT);
  config_sai1();
  attachInterruptVector(IRQ_SAI1, isr);
  NVIC_ENABLE_IRQ(IRQ_SAI1);
  I2S1_TCSR |= 1<<8; // start generating TX FIFO interrupts
}
volatile uint16_t nextval = 0x0FF0;
void isr() {
  digitalWriteFast(3, HIGH);
  delayNanoseconds(25);
  digitalWriteFast(3, LOW);

  uint32_t bits20 = 0x20000 | (nextval << 1) | 0x00;
  I2S1_TDR0 = bits20;
  I2S1_TDR1 = bits20;
}
void loop() {
  static elapsedMillis msec;
  if (msec > 100) {
    nextval++;
    msec = 0;
  }
}


FLASHMEM
void config_sai1()
{
  CCM_CCGR5 |= CCM_CCGR5_SAI1(CCM_CCGR_ON);
  double fs = 39062.5;
  int n1 = 4; //SAI prescaler 4 => (n1*n2) = multiple of 4
  int n2 = 1 + (24000000 * 27) / (fs * 256 * n1);
  double C = ((double)fs * 256 * n1 * n2) / 24000000;
  int c0 = C;
  int c2 = 10000;
  int c1 = C * c2 - (c0 * c2);
  audioClock(c0, c1, c2);
  // clear SAI1_CLK register locations
  CCM_CSCMR1 = (CCM_CSCMR1 & ~(CCM_CSCMR1_SAI1_CLK_SEL_MASK))
    | CCM_CSCMR1_SAI1_CLK_SEL(2); // &0x03 // (0,1,2): PLL3PFD0, PLL5, PLL4
  n1 = n1 / 2; //Double Speed for TDM
  CCM_CS1CDR = (CCM_CS1CDR & ~(CCM_CS1CDR_SAI1_CLK_PRED_MASK | CCM_CS1CDR_SAI1_CLK_PODF_MASK))
    | CCM_CS1CDR_SAI1_CLK_PRED(n1 - 1) // &0x07
    | CCM_CS1CDR_SAI1_CLK_PODF(n2 - 1); // &0x3f
  IOMUXC_GPR_GPR1 = (IOMUXC_GPR_GPR1 & ~(IOMUXC_GPR_GPR1_SAI1_MCLK1_SEL_MASK))
    | (IOMUXC_GPR_GPR1_SAI1_MCLK_DIR | IOMUXC_GPR_GPR1_SAI1_MCLK1_SEL(0)); //Select MCLK

  // configure transmitter
  int rsync = 0;
  int tsync = 1;
  I2S1_TMR = 0;
  I2S1_TCR1 = I2S_TCR1_RFW(4);
  I2S1_TCR2 = I2S_TCR2_SYNC(tsync) | I2S_TCR2_BCP | I2S_TCR2_MSEL(1)
    | I2S_TCR2_BCD | I2S_TCR2_DIV(0);
  I2S1_TCR3 = I2S_TCR3_TCE_2CH;
  I2S1_TCR4 = I2S_TCR4_FRSZ(0) | I2S_TCR4_SYWD(0) | I2S_TCR4_MF
    /*| I2S_TCR4_FSE*/ | I2S_TCR4_FSD | I2S_TCR4_FSP;
  I2S1_TCR5 = I2S_TCR5_WNW(19) | I2S_TCR5_W0W(19) | I2S_TCR5_FBT(19);

  // configure receiver
  I2S1_RMR = 0;
  I2S1_RCR1 = I2S_RCR1_RFW(4);
  I2S1_RCR2 = I2S_RCR2_SYNC(rsync) | I2S_RCR2_BCP | I2S_RCR2_MSEL(1) // was I2S_TCR2_BCP; RCR2 macro used for consistency
    | I2S_RCR2_BCD | I2S_RCR2_DIV(0);
  I2S1_RCR3 = I2S_RCR3_RCE_2CH;
  I2S1_RCR4 = I2S_RCR4_FRSZ(0) | I2S_RCR4_SYWD(0) | I2S_RCR4_MF
    /*| I2S_RCR4_FSE*/ | I2S_RCR4_FSD | I2S_RCR4_FSP;
  I2S1_RCR5 = I2S_RCR5_WNW(19) | I2S_RCR5_W0W(19) | I2S_RCR5_FBT(19);

  // pin muxing and enable
  // CORE_PIN32_CONFIG = 3; // MCLK
  CORE_PIN21_CONFIG = 3; // RX_BCLK
  CORE_PIN20_CONFIG = 3; // RX_SYNC
  CORE_PIN7_CONFIG = 3; // TX_DATA0
  CORE_PIN23_CONFIG = 3; // TX_DATA1
  I2S1_RCSR |= I2S_RCSR_RE | I2S_RCSR_BCE;
  I2S1_TCSR = I2S_TCSR_TE | I2S_TCSR_BCE /* | I2S_TCSR_FRDE <-- not using DMA */;
}

/*
(c) Frank B
*/
FLASHMEM
void audioClock(int nfact, int32_t nmult, uint32_t ndiv) // sets PLL4
{
  //if ((CCM_ANALOG_PLL_AUDIO & CCM_ANALOG_PLL_AUDIO_ENABLE)) return;
  CCM_ANALOG_PLL_AUDIO = CCM_ANALOG_PLL_AUDIO_BYPASS | CCM_ANALOG_PLL_AUDIO_ENABLE
    | CCM_ANALOG_PLL_AUDIO_POST_DIV_SELECT(2) // 2: 1/4; 1: 1/2; 0: 1/1
    | CCM_ANALOG_PLL_AUDIO_DIV_SELECT(nfact);
  CCM_ANALOG_PLL_AUDIO_NUM = nmult & CCM_ANALOG_PLL_AUDIO_NUM_MASK;
  CCM_ANALOG_PLL_AUDIO_DENOM = ndiv & CCM_ANALOG_PLL_AUDIO_DENOM_MASK;
  CCM_ANALOG_PLL_AUDIO &= ~CCM_ANALOG_PLL_AUDIO_POWERDOWN;//Switch on PLL
  while (!(CCM_ANALOG_PLL_AUDIO & CCM_ANALOG_PLL_AUDIO_LOCK)) {}; //Wait for pll-lock
  const int div_post_pll = 1; // other values: 2,4
  CCM_ANALOG_MISC2 &= ~(CCM_ANALOG_MISC2_DIV_MSB | CCM_ANALOG_MISC2_DIV_LSB);
  if (div_post_pll > 1) CCM_ANALOG_MISC2 |= CCM_ANALOG_MISC2_DIV_LSB;
  if (div_post_pll > 3) CCM_ANALOG_MISC2 |= CCM_ANALOG_MISC2_DIV_MSB;
  CCM_ANALOG_PLL_AUDIO &= ~CCM_ANALOG_PLL_AUDIO_BYPASS;//Disable Bypass
}
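
For anyone following the clock setup in config_sai1(), the arithmetic can be checked on a PC. The sketch below repeats the same expressions with fs = 39062.5 and n1 = 4. The struct and function names are hypothetical; only the formulas come from the code above.

```cpp
// Values that config_sai1() derives before calling audioClock(c0, c1, c2).
struct AudioPllParams {
  int n2;  // divider later written as CCM_CS1CDR_SAI1_CLK_PODF(n2 - 1)
  int c0;  // integer multiplier, passed as nfact
  int c1;  // fractional numerator, passed as nmult
  int c2;  // fractional denominator, passed as ndiv
};

inline AudioPllParams compute_pll_params(double fs, int n1) {
  AudioPllParams p;
  p.n2 = static_cast<int>(1 + (24000000 * 27) / (fs * 256 * n1));
  double C = (fs * 256 * n1 * p.n2) / 24000000;
  p.c0 = static_cast<int>(C);
  p.c2 = 10000;
  p.c1 = static_cast<int>(C * p.c2 - (p.c0 * p.c2));
  return p;
}
```

With these inputs the result is n2 = 17, c0 = 28, c1 = 3333 and c2 = 10000, so C is about 28.3333 and 24000000 * C is about 680 MHz, which equals fs * 256 * n1 * n2.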
 
I was wondering if you could just tell me what commands it is sending to the Galvo and how I would go about changing the parameters involved.

Sorry, I know nothing of the Galvo's protocol aside from what was discussed (and by now I've even forgotten most of that).

But I can tell you these lines are the critical part which actually sends the data.

Code:
uint32_t bits20 = 0x20000 | (nextval << 1) | 0x00;
I2S1_TDR0 = bits20;
I2S1_TDR1 = bits20;

If you want to transmit something different, this is the place to edit the code.
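
To make the bit layout of that expression easier to see, here is a small host-side sketch. It only restates what the line does: bit 17 is set, the 16-bit value sits in bits 16 to 1, and bit 0 is left at zero. What those fixed bits mean to the Galvo is protocol-specific and not claimed here; the function names are hypothetical.

```cpp
#include <cstdint>

// Same expression as in isr(), written as a function.
inline uint32_t make_word20(uint16_t value) {
  return 0x20000u | (static_cast<uint32_t>(value) << 1) | 0x00u;
}

// Recovers the 16-bit value from a word built by make_word20().
inline uint16_t value_from_word20(uint32_t word) {
  return static_cast<uint16_t>((word >> 1) & 0xFFFFu);
}
```

For the initial `nextval` of 0x0FF0 this gives 0x21FE0, and even the largest value, 0xFFFF, stays within 20 bits.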


I would be grateful if you could point me to any resources you used to write and understand the code.

The documentation is the reference manual, chapter 38 starting on page 1985. If you want to transmit a different format than the 20-bit word of OpenGalvo, focus on understanding the TCR1 to TCR5 registers, starting on page 2005. Those 5 registers control the data format which will be transmitted.

To get the reference manual, go to the Teensy 4.1 page and scroll down to "Technical Information" and click the first link.
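
As a small aid for reading TCR4 and TCR5: the size fields in the code earlier in this thread are written as "value minus one", for example FRSZ((2-1)) with WNW((32-1)) in the audio setup and FRSZ(0) with WNW(19) in the 20-bit setup. Assuming every word in the frame has the same width (W0W equal to WNW, as in both setups), the number of bit clocks per frame follows directly. The struct and function names are hypothetical.

```cpp
// Field values exactly as they are passed to the register macros.
struct FrameFormat {
  unsigned frsz_field;  // argument of I2S_TCR4_FRSZ(): words per frame minus one
  unsigned wnw_field;   // argument of I2S_TCR5_WNW(): bits per word minus one
};

inline unsigned words_per_frame(FrameFormat f) { return f.frsz_field + 1; }

inline unsigned bits_per_word(FrameFormat f) { return f.wnw_field + 1; }

// Assumes word 0 has the same width as the other words (W0W == WNW).
inline unsigned bit_clocks_per_frame(FrameFormat f) {
  return words_per_frame(f) * bits_per_word(f);
}
```

The audio setup works out to 2 words of 32 bits, or 64 bit clocks per frame; the 20-bit setup is a single 20-bit word per frame.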
 
I've got the code above working successfully to output a stereo audio stream to the Audio Shield.
But I get an interrupt on each half word, as I am also loading one channel of one sample at a time.
This means that at a 44.1 kHz sample rate, I am getting 88200 interrupts per second.

Is there a way that I can load the L+R channels at once and get 44100 interrupts per second?
I took a look at the SAI chapter in the RM, but couldn't find anything that addresses this specifically.
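
The numbers above follow from a simple relation: interrupts per second = sample rate * words per frame / words written per interrupt. The sketch below just evaluates that relation. It assumes the FIFO has room for every word written in one pass, which depends on the watermark set in TCR1 and has not been tested on hardware here. The function name is hypothetical.

```cpp
#include <cassert>

// Interrupt rate for a transmit FIFO that is refilled with a fixed number
// of words each time the handler runs.
inline double interrupts_per_second(double sample_rate_hz,
                                    unsigned words_per_frame,
                                    unsigned words_per_interrupt) {
  assert(words_per_interrupt > 0);
  return sample_rate_hz * words_per_frame / words_per_interrupt;
}
```

One word per interrupt at 44.1 kHz stereo gives 88200 per second; writing L and R in the same pass gives 44100, and writing 4 frames (8 words) per pass would give 11025.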
 