Tutorial on digital I/O, ATMega PIN/PORT/DDR D/B registers vs. ARM GPIO_PDIR / _PDOR

I tested this simple program with my oscilloscope. The LED pin's fall time appears slower than that of the other pins. Moreover, there is pronounced ringing (Gibbs-like oscillation) on the low-to-high and high-to-low transitions. These oscillations don't appear on the other pins. Here is a simple program for testing:

// Test LED pin versus other pins
const int ledPin = 13;
const int Pin14 = 14;

void setup() {
  pinMode(ledPin, OUTPUT);
  pinMode(Pin14, OUTPUT);
}

void loop() {
  digitalWrite(ledPin, HIGH);
  digitalWrite(Pin14, HIGH);
  delayMicroseconds(1);
  digitalWrite(ledPin, LOW);
  digitalWrite(Pin14, LOW);
  delayMicroseconds(2);
}
 
The slower pin fall time is due to the extra load on the pin from the orange LED and 470 ohm resistor.

I ran a test here and can see the different fall times on my scope. If I connect an LED and resistor to pin 14, it behaves the same as pin 13. Without the LED, I see about 9 ns. With the LED it looks like about 15 ns.
 
I should mention, by default pinMode() turns on the slew rate limiting feature. If you turn this off, the transitions are much faster. I measured just now. It is approx 3 ns, with or without the LED.

However, with the scope probes connected using ordinary ground clips, I also see quite a lot of ringing and cross-talk. These are common issues with very fast rise and fall times. They are the reason the pins have the slew rate limit feature, and why it's turned on by default.

If you want to run without the slew rate limit for maximum speed (and extra noise issues), this is the code:

Code:
void setup() {
  pinMode(13, OUTPUT);
  pinMode(14, OUTPUT);
  // Turn off slew rate limit, for maximum speed
  // CAUTION: noise and cross-talk problems can be
  // much worse when slew rate limiting is disabled
  CORE_PIN13_CONFIG &= ~PORT_PCR_SRE;
  CORE_PIN14_CONFIG &= ~PORT_PCR_SRE;
}

void loop() {
  digitalWrite(13, HIGH);
  digitalWrite(14, HIGH);
  delayMicroseconds(1);
  digitalWrite(13, LOW);
  digitalWrite(14, LOW);
  delayMicroseconds(2);
}
 
Thanks for your reply, Paul. Of course, this small difference is due to the 470 ohm resistor and the LED (a slight capacitive effect). But it is enough to cause problems in very fast real-time professional applications.

So I need contiguous, very fast 8-bit inputs and outputs, with no LED on these pins.

Maybe it would be great to design a new Teensy with 2 x 8-bit contiguous input/output registers (registers C and D, for example), and the rest (say registers A or B) for addresses, like a classic microprocessor.
And without any LED, please! Thanks! (Or with the LED connected to an otherwise unused pin such as PTB19, for example.)
That is because the C and D registers are the only ones that are available on the output pins of the MK20P64 chips.

I'm ready to design such a PCB free of charge if only 2 layers are necessary.

Best regards,

Pascal
 
Doing that would break the Arduino (hardware/software) interchangeability that PJRC works to provide with Teensy.

There is a design on OSH that you could rework by changing the pin placement and routing and redefining 'LED_BUILTIN': Teensy 3.2 DIY Reference Board

There are probably other pointers on the forum and elsewhere, including this one that was recently linked: building-your-own-custom-teensy/

Or, if the LED were pulled off, a PCB breakout board could be designed to route the signals to a linear row of edge pins: Teensy 3.2 Breakout Board. Something like that also has solderable castellated pads on the underside to bring out the bottom-side pins.

FrankB has shared designs that bring out those bottom pins. That might suggest an easier way to repurpose the board: grab the desired port pins and route them in a similar way with just a small PCB, without the need for a whole new Teensy.
 
May not be correct for Port B

I've updated my chart with the missing pins.


[Attachment 387: pin chart]

After much frustration, I manually mapped Port B and it does not correspond to the chart exactly. In my application, with several libraries at play, pin25 simply did not respond (always read as HIGH) but (interestingly) digitalReadFast(25) *did* work, so my work-around was this:

temp1 = GPIOB_PDIR; //Bits are out of order compared to published mapping. Tested by grounding each, serial port output.
vReal[SampleIndex] = (((temp1 & 0x00030000)>>12)+((temp1 & 0x00000800)>>5)+(temp1 & 0x0000000F))+(digitalReadFast(25)<<7);

I am getting just shy of a 1 MB/s transfer rate (with some jitter; it is solid at a 2 us interrupt period rather than 1 us), but I cannot figure out how to work around the pin 25 issue. The main reason for this post is to warn that one of the pins (32) does not appear to sit on the port bit that the chart suggests.
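
For anyone reusing this workaround, here is a minimal sketch of the same bit gathering as a small function that can be checked on a PC. The name packSample is made up for illustration. The masks and shifts are the ones from the code line above, with OR in place of + (equivalent here because the fields do not overlap).

```cpp
#include <cassert>
#include <cstdint>

// Rebuild one 8-bit sample from a raw 32-bit Port B read plus one pin that
// is read separately, using the masks and shifts from the workaround:
//   port bits 0..3            -> sample bits 0..3
//   port bits 16 and 17       -> sample bits 4 and 5
//   port bit 11               -> sample bit 6
//   pin 25 (read on its own)  -> sample bit 7
constexpr uint8_t packSample(uint32_t portB, bool pin25) {
    return static_cast<uint8_t>(
        (portB & 0x0000000Fu)
        | ((portB & 0x00030000u) >> 12)
        | ((portB & 0x00000800u) >> 5)
        | (pin25 ? 0x80u : 0x00u));
}

// Host-side check of each field; no Teensy required.
inline void selfCheck() {
    assert(packSample(0x0000000Fu, false) == 0x0F);
    assert(packSample(0x00010000u, false) == 0x10);
    assert(packSample(0x00020000u, false) == 0x20);
    assert(packSample(0x00000800u, false) == 0x40);
    assert(packSample(0x00000000u, true) == 0x80);
    assert(packSample(0x0003080Fu, true) == 0xFF);
}
```

On the Teensy the call would presumably be packSample(GPIOB_PDIR, digitalReadFast(25)). Port bits outside the masks are ignored, so whatever else lives on Port B does not leak into the sample.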

I could go on, but using ports is difficult due to the morass of Arduino legacy libraries and "commonly used pins." While there seem to be some on the forum who insist that "ports are useless" or "going fast is a meaningless metric," there are those of us in education and research who do want to do that, and the CPU can handle it. The bottleneck is getting data in fast enough.

I will try again to reach PJRC directly, but I'm curious if I/we put up the funding, would someone design a "Teensy 3.6 Pro" that had maybe two rows of pins on each edge and had at least one port "uncontaminated" and available, pins in order, for "us people."

By the way, the other thing some folks tend to say on this forum is "time to step up to professional embedded systems tools." I've used those, and while they are great, I respectfully disagree. The benefit of the Arduino/Teensy approach is massively broad access (great as an educator) and much, much faster prototyping in most cases due to the huge base of code/libraries/hardware. This is particularly true for creatives, who want to try ideas quickly rather than getting bogged down in the details.

We are trying to straddle these two points of view. One of the current projects is a touch-screen GUI FFT analyzer for teaching, built on a Teensy 3.6 and a GPU-equipped Gameduino 3. It's solid at 60 dB SNR and 100 kS/s using just the Teensy's ADCs and has a DDS synthesizer (tracking generator) running concurrently. The port issue came up while trying to interface a lovely 8-bit flash ADC, as the code runs tantalizingly close to 1 MS/s. Imagine a $100, 500 kHz, all floating-point FFT analyzer with commented, educational code. I think it's going to be very cool.

Snippet of video here on Vimeo: https://vimeo.com/310882403

Every day I thank PJRC for bringing this kind of compute power to the community!!!
 
Upper bits of Port C are incorrect in the posted mapping. It is like this:

Port C bit   Teensy pin   Notes
12..31       none
11           38
10           37
9            36
8            35
7            12
6            11
5            13           LED pin
4            10
3            9
2            23
1            22
0            15
 
Pins 24 and above are on different GPIO ports between Teensy 3.2 and Teensy 3.5/3.6. For 3.5/3.6, here is a table of pin numbers to port/bit number, and also a table of ports A-E to pin numbers. It is color-coded by pin location (front/back) and port number. I also attached the spreadsheet for reference.

[Images: Teensy36_by_pins.png, Teensy36_by_port.png]

Attachment: Teensy36_pin_layout.xls
 
Have been using (on a MEGA):

PORTA = PORTA & B11000000;
PORTA = PORTA | _muxpin;

The project is migrating to a Teensy 3.5, so:

GPIOB_PDOR = GPIOB_PDOR & B11000000;
GPIOB_PDOR = GPIOB_PDOR | _muxpin;

Seems logical, however: how do I point _muxpin at GPIOB bits 16-21 instead?
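
One possible answer, as a sketch rather than tested Teensy code: shift the 6-bit value up by 16 and mask with 0x3F << 16. Note that B11000000 is only 0xC0, so ANDing the 32-bit GPIOB_PDOR with it would clear every bit except 6 and 7. The helper name withMuxBits is hypothetical. It assumes _muxpin holds a value from 0 to 63 and that the pins behind bits 16-21 have already been set up as outputs with pinMode.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kMuxShift = 16;                 // field starts at bit 16
constexpr uint32_t kMuxMask = 0x3Fu << kMuxShift;  // bits 16..21

// Return portValue with bits 16..21 replaced by the low 6 bits of muxpin.
// All other bits of portValue pass through unchanged.
constexpr uint32_t withMuxBits(uint32_t portValue, uint32_t muxpin) {
    return (portValue & ~kMuxMask) | ((muxpin << kMuxShift) & kMuxMask);
}

// Host-side check; no Teensy required.
inline void selfCheck() {
    assert(withMuxBits(0x00000000u, 0x3Fu) == 0x003F0000u);
    assert(withMuxBits(0xFFFFFFFFu, 0x00u) == 0xFFC0FFFFu);
    assert(withMuxBits(0x0000ABCDu, 42u) == 0x002AABCDu);
    assert(withMuxBits(0x00000000u, 0xFFu) == 0x003F0000u);  // extra bits dropped
}
```

On the Teensy this would be used as GPIOB_PDOR = withMuxBits(GPIOB_PDOR, _muxpin); which does the clear and the set in a single write.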
 
Hello! In my project, I transfer parallel data in both directions over port D, so I need to switch quickly between input and output. For testing, I used this receive-only code:
Code:
int led = 13;
int readbit = 2; // bit 0 of port D

void setup()
{
  pinMode(led, OUTPUT);
  pinMode(readbit, INPUT); // this works
  //GPIOD_PDDR = 0x00; // this does not work; meant as one command to make all 8 pins inputs
}

void loop()
{
  digitalWrite(led, digitalRead(readbit));
}

What am I doing wrong?
 
Which Teensy are you using?
 
Setting a Teensy pin to be a digital input pin is more involved than just setting the direction of the port.

Almost every pin on the Teensy can do more than one thing. For example, pin 2 can do PWM...

You might want to look at the sources (T3.x) in pins_teensy.c

For example it sets the pin into digital IO mode: *config = PORT_PCR_MUX(1);
where config is: portConfigRegister(pin)
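
To make the "more involved" part concrete, here is a host-side sketch of how such a config value is put together. The constants are written out locally so that it compiles on a PC. Their numeric values are my reading of the Kinetis PCR layout and should be checked against kinetis.h, where PORT_PCR_MUX and PORT_PCR_SRE are the real names. The helper names are made up, and any other bits the core may set for outputs are left out.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for the core's PORT_PCR_* macros (values are assumptions).
constexpr uint32_t kPcrSre = 0x00000004u;  // slew rate limit enable
constexpr uint32_t pcrMux(uint32_t n) {    // pin function select field
    return (n & 7u) << 8;
}

// Plain digital input: only the mux field is set, selecting GPIO.
constexpr uint32_t gpioInputConfig() { return pcrMux(1); }

// Digital output with the slew rate limit on, as pinMode() is said to
// do by default earlier in this thread.
constexpr uint32_t slewLimitedOutputConfig() { return kPcrSre | pcrMux(1); }

// Same effect on a value as "CORE_PIN13_CONFIG &= ~PORT_PCR_SRE;".
constexpr uint32_t withoutSlewLimit(uint32_t pcr) { return pcr & ~kPcrSre; }

// Host-side check; no Teensy required.
inline void selfCheck() {
    assert(gpioInputConfig() == 0x100u);
    assert(slewLimitedOutputConfig() == 0x104u);
    assert(withoutSlewLimit(slewLimitedOutputConfig()) == 0x100u);
    assert(withoutSlewLimit(0x144u) == 0x140u);  // other bits are untouched
}
```

The point is that each pin has its own config register that has to select the GPIO function before the port direction and data registers mean anything for that pin.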
 
Please explain in more detail. I use port D only as a parallel input/output for 8 bits of data, nothing more.
I need to configure the direction and then read or send 8 bits as quickly as possible.
It seems to me that it is time-consuming to configure the pins one by one.
Is there a faster way than this?
Code:
byte pinTable [] = {2,14,7,8,6,20,21,5};
......
for (int i = 0; i <8; i ++)  {pinMode (pinTable [i], OUTPUT); }
 
To me, the question is: if you only need to do this once, does it matter?

That is, you use pinMode initially to set up the 8 pins as digital I/O pins. Then you might be able to get away with updating GPIOD_PDDR to switch all of these pins from input to output or from output to input.

But of course note that T4 is reasonably different from T3.x, i.e. the actual registers may be different.
 
I need to communicate between a T3.6 and a T3.2 in both directions as quickly as possible.
The data will then be sent over UDP and simultaneously recorded on the memory card in the Teensy. For UDP I bought a USB-to-LAN adapter. As soon as my T4 arrives in the mail, I will replace the bottleneck.
Maybe I can run SPI tests, but I did not find an example for the slave side. And will SPI be faster?
 
You might want to start your own thread on this. This thread was mainly about how to use the GPIO ports on an AVR processor.

In a thread for your project, you may want to explain what you are trying to do and what you have tried, including some of the material from the last few posts in this thread.

"Faster" is sometimes a difficult thing to quantify. Parallel input/output MIGHT be the fastest option, but it requires both processors to be in lock step with each other to transfer the data, plus probably some handshake or the like so each side knows when the data is going to transfer...

Whereas a connection over a hardware Serial port between the two processors will likely not be as fast in raw transfer speed, but both processors can maybe work more independently and in parallel. When one board has data it needs to send, it simply puts the data in the serial output queue and continues on. Only when it needs confirmation or the like does it need to wait. But again, that may not work for you. SPI is also a serial interface, where one bit at a time is transferred, but it can maybe go at pretty high speeds, like 30 MHz. But as you mentioned, there is no official SPI slave setup. There are examples on the forum from others who have done this, but...

So again, if your questions are more about the specifics of some project and not about how to use AVR ports (or maybe, by extension, simple port operations on T3.x), it may be better to ask them in a thread that is more specific to your project(s).
 
You might want to start your own thread on this. This thread was mainly about how to use the GPIO ports on an AVR processor.

I do not think so. I won’t argue.
I was hoping to get a recommendation.
Thanks anyway for your attention.
 
I won't argue either :D Just suggesting that if you are trying to get suggestions on different ways for two processors to communicate as fast as possible, that you might get some other suggestions if you were to post additional information about your setup and your requirements...
From your original question:
Code:
int led = 13;
int readbit = 2; // bit 0 of port D

void setup()
{
  pinMode(led, OUTPUT);
  pinMode(readbit, INPUT); // this works
  //GPIOD_PDDR = 0x00; // this does not work; meant as one command to make all 8 pins inputs
}

void loop()
{
  digitalWrite(led, digitalRead(readbit));
}

Earlier I was suggesting that you could either update the additional registers yourself, the way pinMode does, or simply use pinMode.
That is, your setup code like:
Code:
byte pinTable [] = {2,14,7,8,6,20,21,5};
......
for (int i = 0; i <8; i ++)  {pinMode (pinTable [i], OUTPUT); }
only needs to be called once during setup.

You could then through your main loop code probably get away with:
Code:
GPIOD_PDDR =0x00;
To set all of them into input mode.

And: GPIOD_PDDR = 0xff; // to set them into output mode.

And set the data register... But is this the best way for your app to work? Have no idea.
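
One caution with the two assignments above: GPIOD_PDDR is a 32-bit register, so writing 0x00 or 0xff also forces every other port D bit to input. That is harmless if nothing else uses port D, but on boards that bring out more than 8 port D bits (I believe the 3.5/3.6 do; treat that as an assumption) a masked update is safer. A sketch with made-up helper names, written as pure functions so it can be checked on a PC:

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kBusMask = 0x000000FFu;  // port D bits 0..7 carry the data byte

// New direction-register value with the 8 bus bits as inputs (0 = input).
constexpr uint32_t busAsInputs(uint32_t pddr) { return pddr & ~kBusMask; }

// New direction-register value with the 8 bus bits as outputs (1 = output).
constexpr uint32_t busAsOutputs(uint32_t pddr) { return pddr | kBusMask; }

// New output-register value with the bus byte replaced by data.
constexpr uint32_t withBusByte(uint32_t pdor, uint8_t data) {
    return (pdor & ~kBusMask) | data;
}

// Extract the bus byte from an input-register read.
constexpr uint8_t busByte(uint32_t pdir) {
    return static_cast<uint8_t>(pdir & kBusMask);
}

// Host-side check; no Teensy required.
inline void selfCheck() {
    assert(busAsInputs(0x0000F0FFu) == 0x0000F000u);
    assert(busAsOutputs(0x0000F000u) == 0x0000F0FFu);
    assert(withBusByte(0x0000F012u, 0xA5) == 0x0000F0A5u);
    assert(busByte(0x1234ABCDu) == 0xCD);
}
```

On the Teensy, sending would then look like GPIOD_PDDR = busAsOutputs(GPIOD_PDDR); GPIOD_PDOR = withBusByte(GPIOD_PDOR, value); and receiving like GPIOD_PDDR = busAsInputs(GPIOD_PDDR); value = busByte(GPIOD_PDIR);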

Again good luck
 
KurtE,
Now I understand. It turns out I have to set the additional registers once with pinMode at the beginning,
and then in the main code I can change the input/output direction with GPIOD_PDDR = 0xff; or GPIOD_PDDR = 0x00;
It is already 5 o'clock here and I need to go to sleep. I'll check it tomorrow.
Thank you!
 
I will be using a Teensy 3.5 that sends small integer values to the parallel port of a receiving device. I would really prefer to send the values using GPIOx_PDOR rather than multiple calls to digitalWriteFast().

Because I'm also using the Audio Adaptor Board, there won't be any ports that aren't touched by something else on the system, but even with the audio board there are still some potentially useful contiguous blocks of bits in Ports A, B, and C:

A12 - A17
B18 - B23
C8 - C11

all seem potentially available (especially with the 3.5/3.6 Breakout board). Even though these are blocks of only 4 and 6 bits, they would be sufficient for my purposes.

What I'd like to know is whether it is possible to use a bit mask to slip a value into the unused bits of a port without causing any harm to whatever else is using the other bits on that port. For example, to use B18 - B23 I was hoping something like the following code fragment would work:

-----

uint32_t send_value; // 6 bit value must be 0 - 63 (0x0 - 0x3f)
uint32_t b_mask; // mask to access 6 free, contiguous bits in Port B


b_mask = 0x3f << 18; // Create mask 6 bits wide to access B18 - B23
send_value = 42 << 18; // arbitrary value to send

GPIOB_PDOR = (GPIOB_PDOR & ~b_mask) | (send_value & b_mask); // Modify only 6 bits in Port B

-----

Would something along these lines work and be safe?

Thanks.
 
Of course you can use shifts and AND/OR Boolean operations to solve your problem. You must read the current state of the register using GPIOB_PDIR (and not PDOR!) in order to capture its actual value, and write it back after your Boolean operations.
This example illustrates the need for new Teensys with 3 x 8 bits that are contiguous and in the right order, as many microprocessors have. I'm not confident in high-speed serial SPI transfers, due to EMI risks. Moreover, classic parallel chips such as the HCT541 and HCT574 are more robust and less expensive, with 8 times the pulse width for more safety margin on the board!
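
Besides read-modify-write, the Kinetis GPIO block also has set and clear registers (GPIOB_PSOR and GPIOB_PCOR in the Teensy core headers, as far as I know), where writing a 1 changes only that output bit. That avoids reading the port at all, at the cost of two writes, so the pins pass briefly through an intermediate value. A sketch of how the two masks could be computed for B18 - B23; the struct and function names are hypothetical, and applySetClear only models the effect so it can be checked on a PC:

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t kFieldShift = 18;                   // field starts at B18
constexpr uint32_t kFieldMask = 0x3Fu << kFieldShift;  // B18..B23

struct SetClear {
    uint32_t set;    // value to write to the set register
    uint32_t clear;  // value to write to the clear register
};

// Split a 6-bit value into "bits to set" and "bits to clear" inside the field.
constexpr SetClear splitForField(uint32_t value) {
    return SetClear{(value << kFieldShift) & kFieldMask,
                    ~(value << kFieldShift) & kFieldMask};
}

// PC-side model of the two writes acting on the output latch.
constexpr uint32_t applySetClear(uint32_t latch, SetClear m) {
    return (latch | m.set) & ~m.clear;
}

// Host-side check; no Teensy required.
inline void selfCheck() {
    assert(splitForField(42).set == 0x00A80000u);
    assert(splitForField(42).clear == 0x00540000u);
    assert(applySetClear(0xFFFFFFFFu, splitForField(0)) == 0xFF03FFFFu);
    assert(applySetClear(0x12345678u, splitForField(42)) == 0x12A85678u);
}
```

On the Teensy the two writes would be GPIOB_PSOR = m.set; followed by GPIOB_PCOR = m.clear; with m = splitForField(value). Bits outside B18 - B23 are never part of either mask, so other users of Port B are not disturbed.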

By the way, does anyone know why the Teensy 3.6 USB host port doesn't recognize USB touchscreen input from standard 15" or 17" touchscreen displays such as ASUS or LG, but works with BUYDISPLAY screen products?
 