Parallel IO - is it possible?

Status
Not open for further replies.

Paula

Active member
Hi all, I'm playing with the Teensy 4.0, which is fab (thank you Paul).

I'm trying to write a 12-bit value to pins 0 to 11.
I've done some reading, and the only way I can figure out how to do it is like this:

Code:
  if(data & 0x0001)
    digitalWriteFast(0,HIGH);
  else
    digitalWriteFast(0,LOW); 

  if(data & 0x0002)
    digitalWriteFast(1,HIGH);
  else
    digitalWriteFast(1,LOW); 

  if(data & 0x0004)
    digitalWriteFast(2,HIGH);
  else
    digitalWriteFast(2,LOW); 

  if(data & 0x0008)
    digitalWriteFast(3,HIGH);
  else
    digitalWriteFast(3,LOW); 

  if(data & 0x0010)
    digitalWriteFast(4,HIGH);
  else
    digitalWriteFast(4,LOW); 

  if(data & 0x0020)
    digitalWriteFast(5,HIGH);
  else
    digitalWriteFast(5,LOW); 

  if(data & 0x0040)
    digitalWriteFast(6,HIGH);
  else
    digitalWriteFast(6,LOW); 

  if(data & 0x0080)
    digitalWriteFast(7,HIGH);
  else
    digitalWriteFast(7,LOW); 

  if(data & 0x0100)
    digitalWriteFast(8,HIGH);
  else
    digitalWriteFast(8,LOW); 

  if(data & 0x0200)
    digitalWriteFast(9,HIGH);
  else
    digitalWriteFast(9,LOW); 

  if(data & 0x0400)
    digitalWriteFast(10,HIGH);
  else
    digitalWriteFast(10,LOW); 

  if(data & 0x0800)
    digitalWriteFast(11,HIGH);
  else
    digitalWriteFast(11,LOW);

which is ugly as hell and also VERY slow: it takes about 3.9 µs to complete.

Is there any way to write to a set of pins directly?
I've had a scour on here, and I can't find an answer :(

Many thanks in advance
Paula
 
It seems the old T_3.x style of memory-mapped port writes isn't supported. I've not seen examples of replacement code … except maybe …

In this post by Paul, 6 pins are quickly cycled: https://forum.pjrc.com/threads/54711-Teensy-4-0-First-Beta-Test?p=204176&viewfull=1#post204176
Code:
 GPIO8_DR = (GPIO8_DR & ~(0x3F << 12)) | ((n & 0x3F) << 12);

Not sure whether that leads anywhere, but the idea would be to find which ports those pins are on and then make the needed pin changes on each affected port.
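
For anyone puzzling over that one-liner, here is a minimal host-side sketch of the read-modify-write it performs. It runs on a PC against a plain variable standing in for the register; `writeField` and `writeSixBitsAt12` are names made up for this illustration, not Teensy core functions.

```cpp
#include <cstdint>

// Returns the new register value with only the bits in fieldMask changed.
// reg   : current register contents (a plain variable here, not real hardware)
// value : the data, already shifted into position
uint32_t writeField(uint32_t reg, uint32_t fieldMask, uint32_t value) {
    // Clear the field, then OR in the new bits; everything else is preserved.
    return (reg & ~fieldMask) | (value & fieldMask);
}

// Same shape as the quoted line: a 6-bit field starting at bit 12.
uint32_t writeSixBitsAt12(uint32_t reg, uint32_t n) {
    return writeField(reg, 0x3Fu << 12, n << 12);
}
```

On the Teensy the equivalent would presumably be `GPIO8_DR = writeSixBitsAt12(GPIO8_DR, n);`, which changes only bits 12 to 17 and leaves the other pins on that port alone.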

That would take a visit to the schematic for the pin/port info, and then ...\hardware\teensy\avr\cores\teensy4\imxrt.h at about line 4965 for this table. Note that the page ref may not match the current RefMan.
Code:
// 32.4.1: page 1620
#define IMXRT_GPIO1		(*(IMXRT_REGISTER32_t *)0x401B8000)
#define GPIO1_DR			(IMXRT_GPIO1.offset000)
//...
 
Thanks defragster, but I don't understand what you've written :)
I mostly do electronics and tinker with code. Have to admit I'm amazed there isn't a library with a parallel out function :(
 
I had that ref code at hand from another test case, so posting it as the only ref I'd seen seemed right; can't say I looked closely enough to understand the gyrations. Figured if I posted it, it might inspire somebody to chime in with a proper example.

Getting the pin-to-port mapping from the schematic would be the first step to trying to emulate it. And it might show that ordering the wires other than 0-11 would simplify the code.
 
ok, think I've got a handle on it, here's what I've worked out so far;

Code:
/*
 * GPIO1.IO[12..24]
 * 12 = AD_B0_12 = PIN 24
 * 13 = AD_B0_13 = PIN 25
 * 14 = AD_B0_14 = PIN 26
 * 15 = AD_B0_15 = PIN 27
 * 16 = AD_B1_00 = PIN 8
 * 17 = AD_B1_01 = PIN 7
 * 18 = AD_B1_02 = PIN 14
 * 19 = AD_B1_03 = PIN 15
 * 20 = AD_B1_04 = N/A
 * 21 = AD_B1_05 = N/A
 * 22 = AD_B1_06 = PIN 17
 * 23 = AD_B1_07 = PIN 16
 * 24 = AD_B1_08 = PIN 22
 * 25 = AD_B1_09 = PIN 23
 * 26 = AD_B1_10 = PIN 20
 * 27 = AD_B1_11 = PIN 21
 * 28 = AD_B1_12 = N/A
 * 29 = AD_B1_13 = N/A
 * 30 = AD_B1_14 = PIN 26
 * 31 = AD_B1_15 = PIN 27
 * 
 */

So there aren't 12 contiguous IO pins available... but even if I do two writes, that would be better than the mess I have right now.

I'll have a play and see what I can get out of it
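
To put a number on "no 12 contiguous pins", here is a small host-side sketch that takes the table in this post at face value, builds a mask of the GPIO bits that have a Teensy pin, and finds the longest unbroken run. `bitRange`, `longestRun` and `usableBitsAsPosted` are hypothetical helper names for illustration only.

```cpp
#include <cstdint>

// Build a mask with bits lo..hi (inclusive) set.
uint32_t bitRange(unsigned lo, unsigned hi) {
    uint32_t mask = 0;
    for (unsigned b = lo; b <= hi && b < 32; ++b) {
        mask |= (1u << b);
    }
    return mask;
}

// Length of the longest run of consecutive 1 bits in mask.
unsigned longestRun(uint32_t mask) {
    unsigned best = 0;
    unsigned run = 0;
    for (unsigned b = 0; b < 32; ++b) {
        if (mask & (1u << b)) {
            ++run;
            if (run > best) best = run;
        } else {
            run = 0;
        }
    }
    return best;
}

// Bits that have a pin according to the table as posted here:
// 12..19, 22..27 and 30..31 (20, 21, 28 and 29 are N/A).
uint32_t usableBitsAsPosted() {
    return bitRange(12, 19) | bitRange(22, 27) | bitRange(30, 31);
}
```

With this table the longest run is 8 bits (12 to 19), so any 12-bit write on this port has to skip over gaps.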
 
I'm going to try this....

Code:
void setup()
{
  GPIO1_GDIR != 0x03CFF000;     // should be 11110011111111000000000000
                                // should give us bits 25->22 and 19->12 as output
}

uint32_t n=0;

void loop()
{
  GPIO1_DR = (GPIO1_DR & ~(0x03CFF000))
  delayMicroseconds(100);
  GPIO1_DR = (GPIO1_DR & (0x03CFF000))
  delayMicroseconds(100);

  //the above *should* toggle the 12 IO pins
}
 
damn, couple of mistakes in there... Going to try this -

Code:
void setup()
{
  GPIO1_GDIR != 0x0FCF3000;     // 1111 1100 1111 0011 0000 0000 0000
                                // should give us bits 27->22, 19 ->16 and 13->12 as output
}

uint32_t n=0;

void loop()
{
  GPIO1_DR = (GPIO1_DR & ~(0x0FCF3000))
  delayMicroseconds(100);
  GPIO1_DR = (GPIO1_DR & 0x0FCF3000)
  delayMicroseconds(100);

  //the above *should* toggle the 12 IO pins
}

/*
 * GPIO1.IO[12..24]
 * 12 = AD_B0_12 = PIN 24
 * 13 = AD_B0_13 = PIN 25
 * 14 = AD_B0_14 = N/A
 * 15 = AD_B0_15 = N/A
 * 16 = AD_B1_00 = PIN 8
 * 17 = AD_B1_01 = PIN 7
 * 18 = AD_B1_02 = PIN 14
 * 19 = AD_B1_03 = PIN 15
 * 20 = AD_B1_04 = N/A
 * 21 = AD_B1_05 = N/A
 * 22 = AD_B1_06 = PIN 17
 * 23 = AD_B1_07 = PIN 16
 * 24 = AD_B1_08 = PIN 22
 * 25 = AD_B1_09 = PIN 23
 * 26 = AD_B1_10 = PIN 20
 * 27 = AD_B1_11 = PIN 21
 * 28 = AD_B1_12 = N/A
 * 29 = AD_B1_13 = N/A
 * 30 = AD_B1_14 = PIN 26
 * 31 = AD_B1_15 = PIN 27
 * 
 *  
 */
 
Well, sadly still nothing. I made a mistake with the ! and | in the previous code, so I ran this (based on Paul's code linked above):

Code:
void setup()
{
  GPIO1_GDIR |= (0xFCF3 << 12);
//  GPIO1_GDIR |= 0x0FCF3000;     // 1111 1100 1111 0011 0000 0000 0000
                                // should give us bits 27->22, 19 ->16 and 13->12 as output
}

uint32_t n=0;

void loop()
{
  n++;
  GPIO1_DR = (GPIO1_DR & ~(0xFCF3 << 12)) | ((n & 0xFCF3) << 12);
  delayMicroseconds(100);
}

/*
 * GPIO1.IO[12..24]
 * 12 = AD_B0_12 = PIN 24
 * 13 = AD_B0_13 = PIN 25
 * 14 = AD_B0_14 = N/A
 * 15 = AD_B0_15 = N/A
 * 16 = AD_B1_00 = PIN 8
 * 17 = AD_B1_01 = PIN 7
 * 18 = AD_B1_02 = PIN 14
 * 19 = AD_B1_03 = PIN 15
 * 20 = AD_B1_04 = N/A
 * 21 = AD_B1_05 = N/A
 * 22 = AD_B1_06 = PIN 17
 * 23 = AD_B1_07 = PIN 16
 * 24 = AD_B1_08 = PIN 22
 * 25 = AD_B1_09 = PIN 23
 * 26 = AD_B1_10 = PIN 20
 * 27 = AD_B1_11 = PIN 21
 * 28 = AD_B1_12 = N/A
 * 29 = AD_B1_13 = N/A
 * 30 = AD_B1_14 = PIN 26
 * 31 = AD_B1_15 = PIN 27
 *  
 */

I don't see any activity on Pins 24,25,8,7,14,15,16 or 17

Wondering what I've missed :(
 
I know this is tedious, but I usually toggle pins and read the ports to see which one each pin is attached to (the bit in the port will toggle). This assumes the pin's output has the SION bit set; is SION set by pinMode in the core by default?
Then I write the pins for each port so I have a base to know how many pins are available on each port.
Keep in mind the pins may not be in order, but if wired properly you should be okay.
 
I've written some code that makes the T4 look like a 64K EEPROM and it can read 16 inputs, do a lookup, and write 8 outputs in something like 35ns. The 8 bit digital write looks like this...

Code:
    digitalWriteFast(ROMH, bitRead(PLA_OUT, 0));
    digitalWriteFast(ROML, bitRead(PLA_OUT, 1));
    digitalWriteFast(IO, bitRead(PLA_OUT, 2));
    digitalWriteFast(GRW, bitRead(PLA_OUT, 3));
    digitalWriteFast(CHARROM_CS, bitRead(PLA_OUT, 4));
    digitalWriteFast(KERNALROM_CS, bitRead(PLA_OUT, 5));
    digitalWriteFast(BASICROM_CS, bitRead(PLA_OUT, 6));
    digitalWriteFast(CASRAM, bitRead(PLA_OUT, 7));

The pin names are defined elsewhere and PLA_OUT is just a byte from the lookup table.

David
 
That is promising versus OP #1: removing the series of 'if ()' tests makes a difference … 3.9 µs versus 35 ns.

It would be interesting to know whether the 'port write' as expressed in Paul's code is generally usable, though.
 
Allow me to bestow upon you some code I know works for 8 bits; more can easily be added if they are on the same GPIO#. Both functions assume the pins are already set to their correct I/O modes. I'm not sure how fast they run; I just know they were fast enough for my purpose.

The write could probably be sped up by using the DR_TOGGLE register, comparing what's already set in the register with what needs to be turned on or off, but this seemed simpler when I was just trying to get something working.

Code:
 uint8_t busRead() {
    return    (GPIO7_DR & CORE_PIN6_BITMASK)  >> CORE_PIN6_BIT         | 
            (((GPIO7_DR & CORE_PIN7_BITMASK)  >> CORE_PIN7_BIT) << 1)  |
            (((GPIO7_DR & CORE_PIN8_BITMASK)  >> CORE_PIN8_BIT) << 2)  |
            (((GPIO7_DR & CORE_PIN9_BITMASK)  >> CORE_PIN9_BIT) << 3)  |
            (((GPIO7_DR & CORE_PIN10_BITMASK) >> CORE_PIN10_BIT) << 4) |
            (((GPIO7_DR & CORE_PIN11_BITMASK) >> CORE_PIN11_BIT) << 5) |
            (((GPIO7_DR & CORE_PIN12_BITMASK) >> CORE_PIN12_BIT) << 6) |
            (((GPIO7_DR & CORE_PIN13_BITMASK) >> CORE_PIN13_BIT) << 7);
}

void busWrite(uint8_t port) {
    GPIO7_DR_SET =  ((port >> 0) & 1) << CORE_PIN6_BIT  | 
                    ((port >> 1) & 1) << CORE_PIN7_BIT  | 
                    ((port >> 2) & 1) << CORE_PIN8_BIT  | 
                    ((port >> 3) & 1) << CORE_PIN9_BIT  | 
                    ((port >> 4) & 1) << CORE_PIN10_BIT | 
                    ((port >> 5) & 1) << CORE_PIN11_BIT | 
                    ((port >> 6) & 1) << CORE_PIN12_BIT | 
                    ((port >> 7) & 1) << CORE_PIN13_BIT;
    GPIO7_DR_CLEAR = !((port >> 0) & 1) << CORE_PIN6_BIT  | 
                     !((port >> 1) & 1) << CORE_PIN7_BIT  | 
                     !((port >> 2) & 1) << CORE_PIN8_BIT  | 
                     !((port >> 3) & 1) << CORE_PIN9_BIT  | 
                     !((port >> 4) & 1) << CORE_PIN10_BIT | 
                     !((port >> 5) & 1) << CORE_PIN11_BIT | 
                     !((port >> 6) & 1) << CORE_PIN12_BIT | 
                     !((port >> 7) & 1) << CORE_PIN13_BIT;
}
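
The DR_TOGGLE idea mentioned above could look roughly like this. It is a host-side sketch that models the register with plain integers; `toggleMask` and `applyToggle` are hypothetical names, and the assumption is that writing a 1 to a bit of the toggle register flips the matching output bit.

```cpp
#include <cstdint>

// Bits that have to flip to turn `current` into `desired`,
// restricted to the pins that belong to the bus (busMask).
uint32_t toggleMask(uint32_t current, uint32_t desired, uint32_t busMask) {
    return (current ^ desired) & busMask;
}

// Model of the assumed hardware behaviour: every 1 bit in `mask`
// inverts the corresponding bit of the data register.
uint32_t applyToggle(uint32_t current, uint32_t mask) {
    return current ^ mask;
}
```

On the Teensy this would presumably become a single `GPIO7_DR_TOGGLE = toggleMask(GPIO7_DR, wanted, busMask);`, where `wanted` already has the data bits in their port positions. All bus pins would then change in one register write instead of a SET followed by a CLEAR, at the cost of an extra register read. I have not timed it.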
 
Also, by looking through core_pins.h in the Teensy core files, you can see which pins are linked to which register.
Code:
 // Fast GPIO
#define CORE_PIN0_PORTREG	GPIO6_DR
#define CORE_PIN1_PORTREG	GPIO6_DR
#define CORE_PIN2_PORTREG	GPIO9_DR
#define CORE_PIN3_PORTREG	GPIO9_DR
#define CORE_PIN4_PORTREG	GPIO9_DR
#define CORE_PIN5_PORTREG	GPIO9_DR
#define CORE_PIN6_PORTREG	GPIO7_DR
#define CORE_PIN7_PORTREG	GPIO7_DR
#define CORE_PIN8_PORTREG	GPIO7_DR
#define CORE_PIN9_PORTREG	GPIO7_DR
#define CORE_PIN10_PORTREG	GPIO7_DR
#define CORE_PIN11_PORTREG	GPIO7_DR
#define CORE_PIN12_PORTREG	GPIO7_DR
#define CORE_PIN13_PORTREG	GPIO7_DR
#define CORE_PIN14_PORTREG	GPIO6_DR
#define CORE_PIN15_PORTREG	GPIO6_DR
#define CORE_PIN16_PORTREG	GPIO6_DR
#define CORE_PIN17_PORTREG	GPIO6_DR
#define CORE_PIN18_PORTREG	GPIO6_DR
#define CORE_PIN19_PORTREG	GPIO6_DR
#define CORE_PIN20_PORTREG	GPIO6_DR
#define CORE_PIN21_PORTREG	GPIO6_DR
#define CORE_PIN22_PORTREG	GPIO6_DR
#define CORE_PIN23_PORTREG	GPIO6_DR
#define CORE_PIN24_PORTREG	GPIO6_DR
#define CORE_PIN25_PORTREG	GPIO6_DR
#define CORE_PIN26_PORTREG	GPIO6_DR
#define CORE_PIN27_PORTREG	GPIO6_DR
#define CORE_PIN28_PORTREG	GPIO8_DR
#define CORE_PIN29_PORTREG	GPIO9_DR
#define CORE_PIN30_PORTREG	GPIO8_DR
#define CORE_PIN31_PORTREG	GPIO8_DR
#define CORE_PIN32_PORTREG	GPIO7_DR
#define CORE_PIN33_PORTREG	GPIO9_DR
#define CORE_PIN34_PORTREG	GPIO8_DR
#define CORE_PIN35_PORTREG	GPIO8_DR
#define CORE_PIN36_PORTREG	GPIO8_DR
#define CORE_PIN37_PORTREG	GPIO8_DR
#define CORE_PIN38_PORTREG	GPIO8_DR
#define CORE_PIN39_PORTREG	GPIO8_DR
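
A quick way to use that listing when planning a parallel bus is to count pins per port and check whether a chosen set of pins shares one port. The sketch below runs on a PC; the table is transcribed by hand from the defines above, and `pinsOnPort` and `allOnSamePort` are hypothetical helpers, not part of the Teensy core.

```cpp
#include <cstddef>
#include <cstdint>

// GPIO port number for Teensy pins 0..39, transcribed from the listing above.
static const uint8_t kPinPort[40] = {
    6, 6, 9, 9, 9, 9, 7, 7, 7, 7,   // pins 0-9
    7, 7, 7, 7, 6, 6, 6, 6, 6, 6,   // pins 10-19
    6, 6, 6, 6, 6, 6, 6, 6, 8, 9,   // pins 20-29
    8, 8, 7, 9, 8, 8, 8, 8, 8, 8    // pins 30-39
};

// How many of the 40 pins sit on the given GPIO port.
unsigned pinsOnPort(uint8_t port) {
    unsigned count = 0;
    for (size_t pin = 0; pin < 40; ++pin) {
        if (kPinPort[pin] == port) ++count;
    }
    return count;
}

// True if every pin in the list is on the same port (pins must be 0..39).
bool allOnSamePort(const uint8_t* pins, size_t count) {
    for (size_t i = 1; i < count; ++i) {
        if (kPinPort[pins[i]] != kPinPort[pins[0]]) return false;
    }
    return true;
}
```

By this count GPIO6 carries 16 of the pins, GPIO7 and GPIO8 carry 9 each and GPIO9 carries 6, so pins 0 to 11 from the original question are spread over three ports and cannot be written with a single register access.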
 
So, finally cracked it.

Couple of notes;

GPIO pins mapped like this -
Code:
/*
 * GPIO1.IO[12..24]
 * 12 = AD_B0_12 = PIN 24
 * 13 = AD_B0_13 = PIN 25
 * 14 = AD_B0_14 = N/A
 * 15 = AD_B0_15 = N/A
 * 16 = AD_B1_00 = PIN 19
 * 17 = AD_B1_01 = PIN 18
 * 18 = AD_B1_02 = PIN 14
 * 19 = AD_B1_03 = PIN 15
 * 20 = AD_B1_04 = N/A
 * 21 = AD_B1_05 = N/A
 * 22 = AD_B1_06 = PIN 17
 * 23 = AD_B1_07 = PIN 16
 * 24 = AD_B1_08 = PIN 22
 * 25 = AD_B1_09 = PIN 23
 * 26 = AD_B1_10 = PIN 20
 * 27 = AD_B1_11 = PIN 21
 * 28 = AD_B1_12 = N/A
 * 29 = AD_B1_13 = N/A
 * 30 = AD_B1_14 = PIN 26
 * 31 = AD_B1_15 = PIN 27
 *  
 */

I then used this to get the bits in the right places;

Code:
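    data = value << 4;       // move value bits 6..11 up to bits 10..15
    data &= 0xFC00;

    temp = value;
    temp &= 0x003F;          // value bits 0..5
    temp <<= 2;              // now at bits 2..7 (bits 2..3 get masked off by 0xFCF3 at the write)

    data += temp;

    temp = value;
    temp &= 0x0003;          // value bits 0..1 stay at bits 0..1

    data += temp;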


and then wrote it like this;

Code:
    GPIO6_DR = (GPIO6_DR & ~(0xFCF3 << 12)) | ((data & 0xFCF3) << 12);


Also, NOTE the GPIO6!
For some reason the datasheet shows GPIO1, but when that didn't work I noticed Paul's code had GPIO8 for the SDIO pins, so I kept incrementing until I got something that toggled pins.

This gives me a write time of around 30 ns, which is WAY better :)

Paula
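
As far as I can tell, the shift-and-mask steps above amount to spreading a 12-bit value across the set bits of 0xFCF3, lowest bit first. Here is a host-side sketch that checks this reading: `packAsPosted` is my transcription of the posted steps, and `depositBits` is a hypothetical generic helper, not something from the Teensy core.

```cpp
#include <cstdint>

// The posted steps, collected into one function.
uint16_t packAsPosted(uint16_t value) {
    uint16_t data = (value << 4) & 0xFC00;   // value bits 6..11 -> bits 10..15
    data += (value & 0x003F) << 2;           // value bits 0..5  -> bits 2..7
    data += (value & 0x0003);                // value bits 0..1  -> bits 0..1
    return data & 0xFCF3;                    // same mask as the final write
}

// Generic version: place the low bits of `value` into the positions
// where `mask` has a 1, starting from the least significant bit.
uint32_t depositBits(uint32_t value, uint32_t mask) {
    uint32_t out = 0;
    for (unsigned pos = 0; pos < 32; ++pos) {
        if (mask & (1u << pos)) {
            if (value & 1u) out |= (1u << pos);
            value >>= 1;   // consume one source bit per mask bit
        }
    }
    return out;
}
```

If the two agree for every 12-bit input, then a different pin selection would only need a different mask passed to `depositBits`, with no hand-written shifts. The loop is of course slower than the fixed shifts, so for a fast write the hand-unrolled form is probably still the one to use.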
 
Great work :)
 
It was GPIO1 for the IMXRT1050 and GPIO6 for the IMXRT1060, so you must be looking at the wrong data sheet.
 
Thanks defragster for pointing at Paul's work... I'd not found that, and once I got my head around it the solution came quickly (ok, well, a few hours).
 
Indeed, the 1052 versus 1062 change was good for confusion. The 1062 also offers faster I/O when enabled (in native mode it was slower than the T_3.6, IIRC), and that shifts the ports as well. As indicated in the first link to Paul's code, he had to rewrite it after FAST I/O was enabled, because the old code no longer reached the SDIO pins as it did before. As noted, the RefMan page number may be wrong, not only because of the 1052 to 1062 change but also because the RefMan itself was revised and re-ordered :(

@Paula: what 12 pins ended up coming out, in order, from that single 30 ns operation? Any sample or summary posted here would be the first I've seen other than the cool thing Paul posted.
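
Reading the final mapping table and the 0xFCF3 mask from Paula's post together, the pin order can be worked out mechanically. This is a host-side sketch of that reading, not something Paula confirmed; `kPinForBit` is transcribed from her table for GPIO6 bits 12 to 27, and `pinOrder` is a hypothetical helper.

```cpp
#include <cstdint>

// Teensy pin for GPIO6 bits 12..27 (index 0 = bit 12), from the final table.
// -1 marks bits listed as N/A.
static const int8_t kPinForBit[16] = {
    24, 25, -1, -1, 19, 18, 14, 15,
    -1, -1, 17, 16, 22, 23, 20, 21
};

// Fills pins[] with the Teensy pin driven by each bit of the 12-bit value,
// least significant bit first, and returns how many entries were written.
unsigned pinOrder(uint16_t fieldMask, int8_t pins[12]) {
    unsigned n = 0;
    for (unsigned i = 0; i < 16 && n < 12; ++i) {
        if (fieldMask & (1u << i)) {
            pins[n++] = kPinForBit[i];
        }
    }
    return n;
}
```

Called with 0xFCF3, this gives, from bit 0 to bit 11 of the value: pins 24, 25, 19, 18, 14, 15, 17, 16, 22, 23, 20, 21.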
 
Not an easy task to find in that HUGE thread, and as noted that was a rewrite from when the 1062 was set to Turbo I/O mode. It was just a trivial one-off code snippet to test POGO pin connectivity for SDIO usage when the T4 Beta unit was removed and put back on the PJRC breakout.
 