Raft of Teensy 4 pin\port parallel read questions

Status
Not open for further replies.

kdharbert

[I'm re-editing to get the port table below to work... I'll take any hints on how to input tables here; this is the best I could do. Spaces and tabs are your enemy.]
I have some I/O-intensive work that I used a Teensy 3.6 for. I do direct reads on two ports that have bits 0-7 exposed. Putting the two reads together yields the desired 16-bit input. I had massive pin conflict problems arriving at this implementation, but I got it done. I used up every single CPU cycle and was able to sample an external ADC at 100 kHz on 8 simultaneous channels at 16 bits. The ADC goes to 200 kHz, so I'm looking at the Teensy 4 to push that.
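
For readers following along, the two-read approach on the Teensy 3.6 can be sketched roughly as below. This is only an illustration: the function name combinePorts is made up, and the choice of GPIOC_PDIR and GPIOD_PDIR as the two ports is an assumption, since the post does not say which ports were used.

```cpp
#include <cstdint>

// Combine the low bytes of two 32-bit port snapshots into one 16-bit sample.
// lowPort supplies sample bits 0-7, highPort supplies sample bits 8-15.
// Hypothetical usage on a Teensy 3.6 (port choice is an assumption):
//   uint16_t sample = combinePorts(GPIOC_PDIR, GPIOD_PDIR);
constexpr uint16_t combinePorts(uint32_t lowPort, uint32_t highPort) {
    return static_cast<uint16_t>((lowPort & 0xFFu) | ((highPort & 0xFFu) << 8));
}
```

Only bits 0-7 of each snapshot are used, so whatever else lives on the upper port bits is masked off.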

With the reduced number of pins, the parallel input on Teensy 4.0 gets messier, but the added CPU speed can deal with it if the number of port reads can still be kept to two or fewer. Given the Teensy 4's CPU speed, it seems that I'd choose to execute the fewest port reads possible in favor of messier bit shuffling in the CPU. Please confirm.
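
To put a rough number on the budget, here is a small arithmetic sketch. It assumes a 600 MHz CPU clock and one 16-bit word read per channel per conversion; the function name is made up, and interrupt or loop overhead is ignored.

```cpp
#include <cstdint>

// CPU cycles available per 16-bit word, assuming every channel delivers
// one word per ADC conversion and nothing else competes for the CPU.
constexpr uint32_t cyclesPerSample(uint32_t cpuHz, uint32_t sampleRateHz, uint32_t channels) {
    return cpuHz / (sampleRateHz * channels);
}
```

Under those assumptions, 200 kHz on 8 channels is 1.6 million words per second, which leaves 375 cycles per word at 600 MHz. A few extra cycles of bit shuffling fit comfortably in that budget.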

Paul mentioned Teensy 4.0 upgrades for next year. (I can't find the thread now.) Is there any advance info on how the pin-port mapping might work out? Any chance a port with a contiguous 16 bits might be exposed? That'd be sweet.

With present Teensy 4 units I'm expecting conflict mania sorting out which pins to use. I haven't tested doing port reads where unused pins in the port are configured for output, used for SPI, etc. Any advice on things to look out for?

If I understand the pinout correctly, I'll still need to make two port reads to take in a minimum of 16 bits. The obvious choices are AD_B1 and EMC. I have no idea what EMC is or what it might conflict with. Where can I read about it?

If we assume I'm going to be using a Teensy 4 with SPI and an SD card, how could I hopscotch through the pins to obtain 16 bits with only two port reads and no conflicts? I mapped the Teensy 4 ports to bits below. I omitted the SD_B0 port, assuming it would be off limits because of SD card use.
Bonus points for the fewest CPU cycles to shuffle the bits into the proper order... it's a pittance at 600 MHz ;)


AD_B1      B0      B1      AD_B0      EMC
AD_B1_00   B0_00   B1_00   -          -
AD_B1_01   B0_01   B1_01   -          -
AD_B1_02   B0_02   -       AD_B0_02   -
AD_B1_03   B0_03   -       AD_B0_03   -
AD_B1_04   -       -       -          EMC_04
-          -       -       -          EMC_05
AD_B1_06   -       -       -          EMC_06
AD_B1_07   -       -       -          EMC_07
AD_B1_08   -       -       -          EMC_08
AD_B1_09   -       -       -          -
AD_B1_10   -       -       -          -
AD_B1_11   -       -       -          -
-          B0_12   -       AD_B0_12   -
-          -       -       AD_B0_13   -
AD_B1_14   -       -       -          -
AD_B1_15   -       -       -          -

-          -       -       -          EMC_31
-          -       -       -          EMC_32

-          -       -       -          EMC_36
-          -       -       -          EMC_37
 
I prefer to look at the core_pins.h file to find the mappings; it seems a lot clearer to me than the schematic. If you look here, there is actually a 16-bit port available with no conflict with the SPI or SD card pins. If you have the appropriate header to solder to the bottom pads, the whole word can be read in one register read instead of two, saving you a little bit of time.
Code:
#define CORE_PIN0_PORTREG	GPIO6_DR
#define CORE_PIN1_PORTREG	GPIO6_DR
#define CORE_PIN2_PORTREG	GPIO9_DR
#define CORE_PIN3_PORTREG	GPIO9_DR
#define CORE_PIN4_PORTREG	GPIO9_DR
#define CORE_PIN5_PORTREG	GPIO9_DR
#define CORE_PIN6_PORTREG	GPIO7_DR
#define CORE_PIN7_PORTREG	GPIO7_DR
#define CORE_PIN8_PORTREG	GPIO7_DR
#define CORE_PIN9_PORTREG	GPIO7_DR
#define CORE_PIN10_PORTREG	GPIO7_DR
#define CORE_PIN11_PORTREG	GPIO7_DR
#define CORE_PIN12_PORTREG	GPIO7_DR
#define CORE_PIN13_PORTREG	GPIO7_DR
#define CORE_PIN14_PORTREG	GPIO6_DR
#define CORE_PIN15_PORTREG	GPIO6_DR
#define CORE_PIN16_PORTREG	GPIO6_DR
#define CORE_PIN17_PORTREG	GPIO6_DR
#define CORE_PIN18_PORTREG	GPIO6_DR
#define CORE_PIN19_PORTREG	GPIO6_DR
#define CORE_PIN20_PORTREG	GPIO6_DR
#define CORE_PIN21_PORTREG	GPIO6_DR
#define CORE_PIN22_PORTREG	GPIO6_DR
#define CORE_PIN23_PORTREG	GPIO6_DR
#define CORE_PIN24_PORTREG	GPIO6_DR
#define CORE_PIN25_PORTREG	GPIO6_DR
#define CORE_PIN26_PORTREG	GPIO6_DR
#define CORE_PIN27_PORTREG	GPIO6_DR
#define CORE_PIN28_PORTREG	GPIO8_DR
#define CORE_PIN29_PORTREG	GPIO9_DR
#define CORE_PIN30_PORTREG	GPIO8_DR
#define CORE_PIN31_PORTREG	GPIO8_DR
#define CORE_PIN32_PORTREG	GPIO7_DR
#define CORE_PIN33_PORTREG	GPIO9_DR
#define CORE_PIN34_PORTREG	GPIO8_DR
#define CORE_PIN35_PORTREG	GPIO8_DR
#define CORE_PIN36_PORTREG	GPIO8_DR
#define CORE_PIN37_PORTREG	GPIO8_DR
#define CORE_PIN38_PORTREG	GPIO8_DR
#define CORE_PIN39_PORTREG	GPIO8_DR

Here is also a simple way to shuffle the bits that makes sense to me. I'm not sure how well it performs, and there's probably a much better way to do it, but this is my take on it.
Code:
uint16_t busRead() {
    // Read the port once so all 16 bits come from the same sample.
    // GPIO6_DR is volatile, so every mention of it is a separate register read.
    const uint32_t port = GPIO6_DR;
    return  (((port & CORE_PIN0_BITMASK)  >> CORE_PIN0_BIT)  << 0)  |
            (((port & CORE_PIN1_BITMASK)  >> CORE_PIN1_BIT)  << 1)  |
            (((port & CORE_PIN14_BITMASK) >> CORE_PIN14_BIT) << 2)  |
            (((port & CORE_PIN15_BITMASK) >> CORE_PIN15_BIT) << 3)  |
            (((port & CORE_PIN16_BITMASK) >> CORE_PIN16_BIT) << 4)  |
            (((port & CORE_PIN17_BITMASK) >> CORE_PIN17_BIT) << 5)  |
            (((port & CORE_PIN18_BITMASK) >> CORE_PIN18_BIT) << 6)  |
            (((port & CORE_PIN19_BITMASK) >> CORE_PIN19_BIT) << 7)  |
            (((port & CORE_PIN20_BITMASK) >> CORE_PIN20_BIT) << 8)  |
            (((port & CORE_PIN21_BITMASK) >> CORE_PIN21_BIT) << 9)  |
            (((port & CORE_PIN22_BITMASK) >> CORE_PIN22_BIT) << 10) |
            (((port & CORE_PIN23_BITMASK) >> CORE_PIN23_BIT) << 11) |
            (((port & CORE_PIN24_BITMASK) >> CORE_PIN24_BIT) << 12) |
            (((port & CORE_PIN25_BITMASK) >> CORE_PIN25_BIT) << 13) |
            (((port & CORE_PIN26_BITMASK) >> CORE_PIN26_BIT) << 14) |
            (((port & CORE_PIN27_BITMASK) >> CORE_PIN27_BIT) << 15);
}
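
If you get to choose which ADC data line goes to which pin, most of the per-bit shuffling can collapse into a couple of grouped mask-and-shift steps. The sketch below uses a purely hypothetical layout (port bits 16-27 carry sample bits 0-11, port bits 12-13 carry sample bits 12-13, port bits 30-31 carry sample bits 14-15). It is not the real Teensy 4.0 wiring; the actual positions would come from the CORE_PINx_BIT values in core_pins.h.

```cpp
#include <cstdint>

// Grouped shuffle for a HYPOTHETICAL bit layout (see the text above):
//   port bits 16-27 -> sample bits 0-11   (shift right by 16)
//   port bits 30-31 -> sample bits 14-15  (same shift, so same step)
//   port bits 12-13 -> sample bits 12-13  (already in place)
constexpr uint16_t shuffleGrouped(uint32_t port) {
    return static_cast<uint16_t>(((port >> 16) & 0xCFFFu) | (port & 0x3000u));
}
```

Bits that share the same shift distance are moved together, so this layout needs two masks, one shift, and one OR instead of sixteen separate extractions.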
 
Great post. It's clear I need help reading the schematic. CORE_PINX_PORTREG obviously maps to the Teensy pins. Per the schematic, this group of pins maps to both AD_B0 and AD_B1. Both of these groups can supply the 16 bits, but what's up with the terminology/labeling/naming? It's clear AD_B0 and AD_B1 aren't ports as I understand them... otherwise there wouldn't be pins that are mapped to both. Any help with my reasoning?
 
Those are the official names taken from the reference manual, as far as I can tell, and it looks like the very first letter indicates which port they belong to, so the 16-bit port is port A. Like I say, it's a little confusing, which is why I prefer to look at core_pins.h; it's a lot simpler for me to understand.
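
One possible way to reconcile the pad names with the defines from core_pins.h, offered with some caution: as far as I understand the i.MX RT1060 reference manual, AD_B0 and AD_B1 are pad groups rather than ports, and both feed the same 32-bit GPIO1 (GPIO6 being its fast-access alias), with AD_B0_nn on bits 0-15 and AD_B1_nn on bits 16-31. The enum and helper names below are made up for illustration.

```cpp
// Pad groups that (as assumed above) share the 32-bit GPIO1/GPIO6 port.
enum class PadGroup { AD_B0, AD_B1 };

// Bit position inside GPIO1/GPIO6 for pad <group>_<padIndex>,
// assuming AD_B0_nn -> bit nn and AD_B1_nn -> bit 16 + nn.
constexpr int gpio1Bit(PadGroup group, int padIndex) {
    return (group == PadGroup::AD_B1 ? 16 : 0) + padIndex;
}
```

Under that assumption a single 32-bit read of GPIO6_DR returns pins from both pad groups at once, which would explain why the same register shows up for pins labeled AD_B0 and AD_B1.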
 

Can anyone explain how to actually code up writing to the ports?

On the Teensy 3.6 we easily get 10 M reads/writes per second to the (scrambled) ports, and 20 MSPS with overclocking and tweaking. I was so excited to get the Teensy 4.0s, but no luck so far...

Example that works on Teensy 3.6:

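// Toggle a clock pin around each 16-bit write to port B (Teensy 3.6 only).
for (int i = 0; i < 100000000; i++)
{
    digitalWriteFast(dirClkPin, LOW);
    GPIOB_PDOR = 0xFFFF;   // port B output bits 0-15 high
    digitalWriteFast(dirClkPin, HIGH);
    GPIOB_PDOR = 0x0000;   // port B output bits low again
}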
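
A possible starting point for the Teensy 4.0, untested on hardware: the core headers define GPIO6_DR along with GPIO6_DR_SET and GPIO6_DR_CLEAR, so a masked bus write can be split into a "set" word and a "clear" word. The struct, the function name, and the 0xFFFF0000 bus mask in the comment are assumptions for illustration, and the pins are assumed to be configured as outputs already.

```cpp
#include <cstdint>

// Bits to set and bits to clear so that only the pins in busMask change.
struct PortUpdate {
    uint32_t setBits;
    uint32_t clearBits;
};

constexpr PortUpdate planWrite(uint32_t value, uint32_t busMask) {
    return PortUpdate{ value & busMask, ~value & busMask };
}

// Hypothetical use on a Teensy 4.0 (not verified on hardware):
//   const PortUpdate u = planWrite(data << 16, 0xFFFF0000u);
//   GPIO6_DR_SET   = u.setBits;    // drive the 1 bits high
//   GPIO6_DR_CLEAR = u.clearBits;  // drive the 0 bits low
```

Because the set and clear happen in two register writes, the bus briefly holds a mixed value between them, so an external latch clock should only be toggled after both writes.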
 

I would also be very interested to know whether code like this works on the Teensy 4.0. I am using a Teensy 3.6 with code similar to the above to read a CMOS imaging sensor at about 10 MSPS. The larger RAM of the Teensy 4.0 would be nice to have for my application.
 