Hardware-based Security for Teensy Firmware Distribution - Can it be bypassed?

alcnonl42

Hello, I'm developing a device and wanted to add a security layer to it. My goal is to be able to distribute updates by publishing hex files after product distribution. However, I have a concern - users could potentially load the hex file onto an empty Teensy device, which I don't want to happen. That's why I added security to my device:
Code:
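// Assumed definition (not shown in the original post), matching the fields used below.
struct TeensyID {
  uint32_t mac1;     // compared against HW_OCOTP_MAC1 (upper 16 bits of the MAC)
  uint32_t mac0;     // compared against HW_OCOTP_MAC0 (lower 32 bits of the MAC)
  const char *name;  // label printed when the board is recognized
};

const TeensyID authorizedTeensys[] = {
  {0x4E9, 0xE51Bxxxx, "x Person"},
  {0x4E9, 0xE51Bxxxx, "y Person"},
  {0x4E9, 0xE518xxxx, "z Person"}
};
// Assumed definition: number of entries in the table above.
const int AUTHORIZED_COUNT = sizeof(authorizedTeensys) / sizeof(authorizedTeensys[0]);

// Returns true if this board's factory MAC matches an entry in the table.
bool isAuthorizedTeensy() {
    uint32_t current_mac1 = HW_OCOTP_MAC1;
    uint32_t current_mac0 = HW_OCOTP_MAC0;
    for (int i = 0; i < AUTHORIZED_COUNT; i++) {
        if (current_mac1 == authorizedTeensys[i].mac1 &&
            current_mac0 == authorizedTeensys[i].mac0) {
            Serial.println(authorizedTeensys[i].name);
            return true;
        }
    }
    return false;
}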
I embedded this code into the firmware and check it in setup(). Are there any ways to bypass this? Is it possible to spoof the values used in this code? I assume they can't reverse the hex file back to code. I'd appreciate any clarification.
 
I can't vouch for your code.

But lockable Teensys do what you want. They can only be programmed with builds made using your key. Once locked, they cannot be loaded with any other code, only your authorized code. Search for this on the PJRC website.
 
Thank you, but I have read that this method can cause temperature increases and delays. My code is timing-sensitive and runs flat out, so I didn't want to use it. As I understand it, an unencrypted hex file can be reverse-engineered into something readable and modifiable. If that is the case, anyone who modifies the hex file could easily disable this check. I couldn't think of an external method, so I came up with this and implemented it. The method you mention looks good, apart from the extra heating and delay it causes. Is there an alternative worth trying? What I'm really curious about is whether the hex can be cracked easily, because these files won't be in everyone's hands.
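To make the point about unencrypted hex files concrete, here is a minimal host-side sketch (plain C++, not Teensy code) that decodes one Intel HEX record into raw bytes and searches them for a 32-bit constant. The names `HexRecord`, `parseHexLine` and `findConstantLE` are made up for this illustration, and the sample record is hand-built, not taken from a real firmware image. The assumption is that a table like `authorizedTeensys` ends up in the image as little-endian 32-bit words, which is the usual layout on a Cortex-M7 but depends on the compiler.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// One decoded Intel HEX record (hypothetical helper type for this sketch).
struct HexRecord {
    uint16_t address;           // 16-bit load offset stored in the record
    uint8_t type;               // 0x00 = data, 0x01 = end of file, others exist
    std::vector<uint8_t> data;  // raw payload bytes
};

// Convert one hex digit to its value, or -1 if it is not a hex digit.
static int hexNibble(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

// Parse a single ":LLAAAATT<data>CC" line.
// Returns false on malformed input, a length mismatch, or a bad checksum.
bool parseHexLine(const std::string &line, HexRecord &out) {
    if (line.size() < 11 || line[0] != ':' || (line.size() - 1) % 2 != 0) return false;
    std::vector<uint8_t> bytes;
    for (std::size_t i = 1; i < line.size(); i += 2) {
        int hi = hexNibble(line[i]);
        int lo = hexNibble(line[i + 1]);
        if (hi < 0 || lo < 0) return false;
        bytes.push_back(static_cast<uint8_t>(hi * 16 + lo));
    }
    // bytes = [LL, addr high, addr low, TT, data..., CC]
    if (bytes.size() != static_cast<std::size_t>(bytes[0]) + 5) return false;
    uint8_t sum = 0;
    for (uint8_t b : bytes) sum = static_cast<uint8_t>(sum + b);
    if (sum != 0) return false;  // all bytes including the checksum must sum to 0 mod 256
    out.address = static_cast<uint16_t>((bytes[1] << 8) | bytes[2]);
    out.type = bytes[3];
    out.data.assign(bytes.begin() + 4, bytes.end() - 1);
    return true;
}

// Return the offset of a little-endian 32-bit constant in the payload, or -1 if absent.
long findConstantLE(const std::vector<uint8_t> &data, uint32_t value) {
    for (std::size_t i = 0; i + 4 <= data.size(); ++i) {
        uint32_t v = static_cast<uint32_t>(data[i]) |
                     (static_cast<uint32_t>(data[i + 1]) << 8) |
                     (static_cast<uint32_t>(data[i + 2]) << 16) |
                     (static_cast<uint32_t>(data[i + 3]) << 24);
        if (v == value) return static_cast<long>(i);
    }
    return -1;
}
```

Calling `parseHexLine(":04000000E90400000F", rec)` on the hand-built record yields the four payload bytes E9 04 00 00, and `findConstantLE(rec.data, 0x4E9)` then locates the constant at offset 0. Nothing in the file format hides such values, which is why a check compiled into a plain hex file can be located and patched by someone with the file and some patience.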
 
temperature increases and delays.
Not sure of the source of the temperature changes, but be aware that any reported slowdown occurred when the code was FORCED to run from flash!
>> 2+ megabytes of code in which 4,000 functions repeatedly called into cloned functions (like recursion, except each was unique code written by code)

That sketch was written to confirm there would be no problems executing encrypted code. It was made to run from flash in a way that would detect any general read or execution errors, and would also spend significant time running flash code far larger than the 32 KB cache, so the code had to be decrypted repeatedly for execution. It is possible the temperature change came from the overhead of that repeated decryption, which also produced the small slowdown. The slowdown was estimated by running a test path from RAM and from flash, comparing normal full-speed RAM execution against decrypted code read from flash.

HOWEVER, locked or not, a Teensy build normally runs all code from RAM1 at full speed, and the only decryption into RAM happens once at startup (startup already has a built-in delay, so that one-time read is unlikely to be a factor). In that case there is unlikely to be ANY measurable change in execution speed (unless code exceeding the cache is kept in flash), and no extra MCU processing overhead that might account for a temperature rise.

If you want to securely ship production code, use a Locked Teensy. @PaulStoffregen can read and confirm. The slowdown came from a purposeful test of locked flash reading for function and speed, not general execution.
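As a rough illustration of the RAM versus flash distinction above: the Teensy 4.x core provides the `FASTRUN` and `FLASHMEM` attributes to choose where a function lives. The function names below are hypothetical, and the empty fallback macros are only there so the sketch also compiles on a PC. On a Teensy 4.x, functions without either attribute already go to RAM1, so `FASTRUN` here just makes the intent explicit.

```cpp
#include <cstdint>

#if defined(ARDUINO)
#include <Arduino.h>  // the Teensy core defines FASTRUN and FLASHMEM
#endif

// Fallbacks for off-target builds (assumption for illustration only).
#ifndef FASTRUN
#define FASTRUN
#endif
#ifndef FLASHMEM
#define FLASHMEM
#endif

// Timing-critical work: kept in RAM1, so it never depends on flash reads
// (or on-the-fly decryption on a locked board) while it runs.
FASTRUN uint32_t sumHot(const uint8_t *buf, uint32_t len) {
    uint32_t total = 0;
    for (uint32_t i = 0; i < len; i++) {
        total += buf[i];
    }
    return total;
}

// Rarely called code: left in flash to save RAM1. Only code like this
// would go through the flash read path discussed above.
FLASHMEM uint32_t firmwareVersion() {
    return 0x00010002u;  // hypothetical version number
}
```

With this split, the hot path behaves the same whether the board is locked or not, and only the occasional call to `firmwareVersion()` touches flash.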
 
Code:
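// Assumed declarations (not shown in the original post).
IntervalTimer pitTimer;
volatile bool pit_flag = false;  // volatile: set in the timer interrupt, polled below

// Timer interrupt callback: signals that the requested interval has elapsed.
void pitCallback() {
    pit_flag = true;
}

void highPrecisionDelay(uint32_t us) {
#if HIGH_PRECISION_TIMING
    if (us < 5) {
        // Very short waits: busy-wait on micros().
        uint32_t start = micros();
        while (micros() - start < us) {
            asm volatile("nop");
        }
    }
    else {
        // Longer waits: let a hardware timer raise the flag.
        pit_flag = false;
        pitTimer.begin(pitCallback, us);

        elapsedMicros elapsed = 0;
        const uint32_t timeout = us + 100;  // safety net if the timer never fires

        while (!pit_flag) {
            if (elapsed > timeout) {
                break;
            }
            yield();
        }
        pitTimer.end();
    }
#else
    // Fixed: the original used "us > 1000", so exactly 1000 us produced no delay at all.
    if (us >= 1000) {
        delay(us / 1000);
    }
    delayMicroseconds(us % 1000);
#endif
}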

It continuously performs timing-sensitive operations like this. I'm running at 450 MHz so the device doesn't overheat and lasts longer. It sits inside a case, which adds heat, and that worries me. What you're saying makes sense, but in my situation the most important factors are speed and temperature. What I had read seemed to confirm both of my fears, so I stayed away from it until you explained it.
 
scared me
Wasn't meant to be scary. I read what Paul wrote about the testing I did 4 years ago: https://github.com/Defragster/T4LockBeta/tree/main/Code4Code

It didn't occur to me that it would turn folks OFF! The goal was to confirm that code and memory access work properly when the code is stuck in flash and has to be decrypted on the fly. That initial testing was on the first beta release of the Locked hardware (some of which is still in use on my desk when posting). The temperature observations were secondary for that extreme case, and the measured slowdown seemed smaller than feared, as best I could measure it.

Also note that the internal core voltage is raised at higher clock speeds. 450 MHz seems to use the same voltage as 528 MHz (1.15 V), versus 1.25 V at 600 MHz. The lower voltage is better for longevity, so results at 528 MHz may be similar: temperature may rise slightly (?), but so will performance.
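For a rough feel of what those numbers mean, the usual first-order approximation is that dynamic power scales with frequency times voltage squared. The sketch below applies it to the voltages quoted above. It ignores leakage and everything outside the core, so treat it as an estimate and not a measurement. `OperatingPoint` and `relativeDynamicPower` are names invented for this example.

```cpp
#include <cassert>

// A clock/voltage pair: core clock in MHz, core voltage in volts.
struct OperatingPoint {
    double mhz;
    double volts;
};

// Dynamic power of `p` relative to `ref`, using P proportional to f * V^2.
// First-order estimate only: leakage and peripheral power are ignored.
double relativeDynamicPower(OperatingPoint p, OperatingPoint ref) {
    assert(ref.mhz > 0.0 && ref.volts > 0.0);
    return (p.mhz / ref.mhz) * (p.volts * p.volts) / (ref.volts * ref.volts);
}
```

Taking 600 MHz at 1.25 V as the reference, 450 MHz at 1.15 V works out to about 0.63 of the dynamic power and 528 MHz at 1.15 V to about 0.74. By this estimate, going from 450 to 528 MHz costs about 17% more dynamic power for about 17% more clock, because the voltage stays the same.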
 