Teensy 3.6 - How to get started with Register level programming

Status
Not open for further replies.

vimalzxc

New member
Hello everyone,

I just got a Teensy 3.6 for my laser scanning project, which has some very stringent timing requirements. In order to meet them, I feel accessing the registers directly would be faster than using the high-level Arduino APIs. I browsed online but couldn't find any resources for bare-metal programming. Can someone point me to good resources to get started?

Thanks,
Vimal
 
This comes up occasionally.

https://forum.pjrc.com/threads/31350-Maximum-GPIO-toggling-(Teensy-3-2)

https://forum.pjrc.com/threads/28942-Really-fast-SPI-maybe-try-assembly

https://forum.pjrc.com/threads/41857-Options-for-bare-metal-development

http://kevincuzner.com/2014/12/12/teensy-3-1-bare-metal-writing-a-usb-driver/

https://forum.pjrc.com/threads/25395-Teensy-Quick-Reference-Code-Examples-Tips-and-Tricks

There are many more....

Of course, all the datasheets are here:

https://www.pjrc.com/teensy/datasheets.html

To translate the Arduino pin numbers to native port names, refer to the schematic:

https://www.pjrc.com/teensy/schematic.html

If you're really going to get into low-level programming, in addition to Freescale's reference manual, you really must read Joseph Yiu's book. It has tons of essential info about the ARM processor which isn't in Freescale's documentation.

https://www.amazon.com/dp/0124080820

Of course, if you read those prior conversations you'll probably get the notion that doing everything from scratch probably isn't the best way to achieve your goals. But since you didn't give us any info at all about what you're trying to accomplish, this is the best answer I can give you.
 
laser scanning project which has some very stringent timing requirements.

Are you using some off-the-shelf device, like Lidar-Lite or XV-11, or rolling your own circuitry from scratch? If trying to do any kind of TOF [time of flight] stuff, I shouldn't imagine the Teensy will be fast enough for that, as I assume getting nsec [or less] timing resolution isn't gonna happen on a Teensy.
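
To put rough numbers on that, here is a back-of-envelope sketch. It assumes the Teensy 3.6's stock 180 MHz clock and that the best timing resolution software can hope for is about one CPU cycle. The names are made up for illustration.

```cpp
// Back-of-envelope time-of-flight resolution. All names are hypothetical.

// Speed of light in vacuum, in metres per second.
constexpr double kSpeedOfLight = 299792458.0;

// Length of one CPU clock cycle in nanoseconds.
constexpr double cycle_ns(double cpu_hz) {
    return 1e9 / cpu_hz;
}

// Distance to the target that corresponds to one clock cycle of
// round-trip time. The light travels out and back, hence the / 2.
constexpr double range_per_cycle_m(double cpu_hz) {
    return kSpeedOfLight / cpu_hz / 2.0;
}

// Assumption: stock Teensy 3.6 clock of 180 MHz.
constexpr double kTeensy36Hz = 180e6;

// Roughly 5.56 ns per cycle, i.e. roughly 0.83 m of range per cycle.
static_assert(cycle_ns(kTeensy36Hz) > 5.5 && cycle_ns(kTeensy36Hz) < 5.6,
              "one cycle at 180 MHz is about 5.56 ns");
static_assert(range_per_cycle_m(kTeensy36Hz) > 0.83 &&
              range_per_cycle_m(kTeensy36Hz) < 0.84,
              "one cycle is about 0.83 m of range");
```

So even if every single cycle could be counted, one tick is most of a metre of range. Centimetre-level range would need a resolution of roughly 67 ps.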
 
It is sort of interesting to see that these types of requests come up reasonably often. Hopefully, when the forum finally moves and we have a wiki, a bunch of these can be summarized on a wiki page.

To me, it often depends on where you want to spend your time: do you wish to spend months just trying to get the basic hardware to work, or would you rather spend your time on the particulars of your project?

If it were me, I would start with something that works well, like the Arduino setup. You can use the Arduino IDE, set things up to use makefiles, or use one of the Eclipse setups...

Then, if you find parts of that setup interfering with your code, you always have the option to trim them out.

As for using registers, you are totally free to use any or all of them in Arduino code. But as for being faster? Hard to say in all cases.

For example, if you do something like digitalWriteFast(3, LOW); (with a constant pin number)
The code generated will be: CORE_PIN3_PORTCLEAR = CORE_PIN3_BITMASK;
which is simply one store to one register...
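
Here is a host-side sketch of why one store is enough. It assumes the Kinetis GPIO behaves as the reference manual describes: writing a mask to the port's clear register (PCOR) clears just those bits of the output register (PDOR), and the set (PSOR) and toggle (PTOR) registers work likewise. The functions below are a made-up model of that behaviour for illustration, not the Teensy core's code, and the bit position is only an example.

```cpp
#include <cstdint>

// Hypothetical model of a Kinetis GPIO port. Each function returns the
// value the port data output register (PDOR) holds after `mask` is
// written to the named register.
constexpr uint32_t after_psor_write(uint32_t pdor, uint32_t mask) {
    return pdor | mask;   // PSOR: 1 bits set the pin, 0 bits do nothing
}
constexpr uint32_t after_pcor_write(uint32_t pdor, uint32_t mask) {
    return pdor & ~mask;  // PCOR: 1 bits clear the pin, 0 bits do nothing
}
constexpr uint32_t after_ptor_write(uint32_t pdor, uint32_t mask) {
    return pdor ^ mask;   // PTOR: 1 bits toggle the pin, 0 bits do nothing
}

// Example only: a pin on bit 12 of its port, with neighbouring pins high.
constexpr uint32_t kPinMask = 1u << 12;
constexpr uint32_t kBefore  = 0x0000F00Fu;

// Clearing bit 12 leaves every other bit untouched, and the CPU never
// has to read the port first.
static_assert(after_pcor_write(kBefore, kPinMask) == 0x0000E00Fu, "clear");
static_assert(after_psor_write(0x0000E00Fu, kPinMask) == 0x0000F00Fu, "set");
static_assert(after_ptor_write(kBefore, kPinMask) == 0x0000E00Fu, "toggle");
```

Because the hardware does the bit-level modify, the CPU side is a single write of a constant mask, with no read-modify-write in software.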

There also may be some other good starting points. Like recently there has been a thread or two talking about progress on making some RTOS setups work on the T3.6. Not sure if the new stuff is available yet or not.

But again this really depends on what you wish to do and where you want to spend your time.
 
The read/write I/O speed is something one eventually runs into after doing some work on a project, if the project is a serious one. As you say, I usually do all the hard work quickly using the IDE commands, and then I realize how much speed I have lost because of their limitations. And then I get into low-level programming. But, for example, yesterday I opened a topic asking the same questions, and until then I didn't know about digitalWriteFast/digitalReadFast! They may be a fraction slower than the low-level programming in my code, but they do the job very fast as well, and I would not have bothered with all the PORTX and PINX etc. registers in the past if I had known they existed.

So I agree that a knowledge base or wiki, as said, is the best way to go and the fastest for the end user seeking a solution. Or maybe something like a "Teensy vs Arduino main differences FAQ" stating the basic important differences, like "digitalWrite is there for compatibility reasons, but we encourage you to use digitalWriteFast instead".
 
I agree with all that has been said, but would like to add:
- First get a functioning program at slowed-down speed / reduced performance (standard C/C++). This will take most of your time.
- Use a logic analyser and unused digital pins with digitalWriteFast to monitor the time taken by key parts of your program.
- Modify your program (different optimization levels) and watch how the distribution of time changes, or doesn't. You will realize that not everything you assumed about programming is still correct (compilers are better nowadays than what was said in school).
- If you find that some routines are very time critical, get the assembly listing and try to write your own version.
- Your data-flow logic may also not be optimal, e.g. have an event-driven real-time interface running at higher priority and a data-processing back end at lower priority.

You will be surprised what can be done easily by changing the design, and how hard it is to improve performance with assembler and by interfering with the compiler.
OK, first do your timing-requirement calculation, as oric_dan suggested.
 
I would not have bothered with all the PORTX and PINX etc registers in the past if I knew its existence..

It's nice that Paul added those Fast read/write commands for the Teensy. Even on a 16 MHz Arduino, there is a speedup of 5X or 10X in using the direct PORT/PIN syntax over digitalRead/Write. A lot of people have commented on the profound snailishness of the classical Arduino pin mapping, but they did it in order to create a standard usable over dozens of different processors, so they could use a standard header arrangement for them, and so people could design compatible shields and libraries conveniently. All in all, it was really a brilliant insight, I think. "Convenience" is the crucial word.

For general compatibility, and for code that just "may" possibly run on both Arduino and Teensy 3.x, it's easier to use the Arduino syntax at least initially, and then go to the fast syntax where you really need it.

Of all the bizarre things, I've just been writing an ISR to run on a ATtiny84 for use as a coprocessor to a T3.5 [strange as it may seem], because I didn't want the T3.5 reading ADC channels continuously, and also being interrupted 100s of times per sec by another signal, and I used 2 sets of macros, as follows:
Code:
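// Fast versions: direct port access, used inside the ISR.
#define led1on()     (PORTA |= 0x80)        // set PA7
#define led1off()    (PORTA &= 0x7F)        // clear PA7
// Convenience versions, used during initial development (D3 is PA7):
//#define led1on()   { digitalWrite(3, HIGH); }
//#define led1off()  { digitalWrite(3, LOW); }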
I first used the lower ones for convenience during initial development, but then coded the upper ones after things settled down, and since they are used inside the ISR. So, like Kurt indicated, a 2-level coding approach can be both convenient, and also refined for speed.
 
I started to learn the ARM Cortex-M4 registers and do some assembler programming back in 2015. I wrote a little example which you might find useful: https://forum.pjrc.com/threads/27938-Using-Assembler-with-Arduino-and-Teensy3-1-or-Teensy2 It also has a pointer to a web ARM course on using the registers (it is aimed at the Raspberry Pi, but the basic registers are the same).

Also, I recently purchased Joseph Yiu's book (following Paul's recommendation) and, though a bit expensive, I would endorse the recommendation to anyone interested in bare-metal activity. It really is the bee's knees. Especially like the chapters on floating point and DSP examples in latest edition :) !!
 
digitalWriteFast translates to a single assembler-instruction (ok, some loading is needed, too - but the compiler is pretty good at optimizing). It's hard to find a faster way :)
 
Does digitalWriteFast/digitalReadFast take advantage of bit-banding? I guess that is about the fastest way technically possible.

Bitband is actually slower, because it performs a read-modify-write bus operation, taking at least 2 cycles.

digitalWriteFast uses a single write to the GPIO hardware, which does the bit-level modify all within the GPIO within a single cycle.
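
For reference, the bit-band alias address comes from the standard Cortex-M formula. This is a sketch that assumes the peripheral bit-band region (0x40000000, aliased at 0x42000000) and uses GPIOA_PDOR at 0x400FF000 purely as an example address. bitband_alias is a made-up helper name.

```cpp
#include <cstdint>

// Start of the Cortex-M peripheral bit-band region and of its alias region.
constexpr uint32_t kPeriphBase        = 0x40000000u;
constexpr uint32_t kPeriphBitbandBase = 0x42000000u;

// Hypothetical helper: address of the 32-bit alias word that maps to a
// single bit of a peripheral register. Every byte of the region expands
// to 32 alias bytes, and every bit to one 4-byte word.
constexpr uint32_t bitband_alias(uint32_t reg_addr, uint32_t bit) {
    return kPeriphBitbandBase + (reg_addr - kPeriphBase) * 32u + bit * 4u;
}

// Example (assumed address): bit 12 of GPIOA_PDOR at 0x400FF000.
static_assert(bitband_alias(0x400FF000u, 12) == 0x43FE0030u,
              "alias word for GPIOA_PDOR bit 12");
```

A store to the alias word makes the bus carry out the read-modify-write on the underlying register, which is the extra cost mentioned above. The GPIO set/clear registers avoid it.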
 
First of all, thanks Pedvide for the Git repo on ADC for the Teensy 3.6.
I have a similar question and would be really glad to get your response. I also have very stringent timing requirements for the ADC.
I saw that at the top of "ADC.h" it says "TODO: Function to measure more that 1 pin consecutively". Can you please guide me on how to implement this? Do I need to access the ADC registers directly? If so, can you please share some tips? I would really appreciate any help. Thanks
 