What's the performance hit of unaligned memory reads from program memory on Teensy? I'm porting some code to Cortex-M0+, which simply crashes on unaligned reads, so I have to write this code without them. That imposes some performance overhead because each value takes a double fetch from program memory. So I'm wondering whether I should use this same alignment-safe code on Teensy as well, if unaligned reads carry significant overhead there, or keep the unaligned-read implementation if it's faster. This is in a performance-critical part of the code, hence my interest.
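
To make the comparison concrete, here is a minimal sketch of what an alignment-safe 32-bit read can look like. It is not the actual code from the port: the function names `read_u32_memcpy` and `read_u32_bytes` are hypothetical, and the byte-wise variant assumes the data is stored little-endian.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value in host byte order from any address, aligned or not.
   memcpy has no alignment requirement, so the compiler chooses a load
   sequence that is legal for the target core. */
static inline uint32_t read_u32_memcpy(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Read a 32-bit little-endian value one byte at a time.
   Byte loads are always aligned, so this cannot fault on alignment. */
static inline uint32_t read_u32_bytes(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

On a core that allows unaligned loads, the compiler may reduce the `memcpy` version to a single load instruction, while on Cortex-M0+ it has to fall back to narrower accesses. Which variant is faster on a given board would still need to be measured.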

Thanks, Jarkko