JSON Deserialization slow'ish, blocks Stepper Motors

clockdiv

Member
hello,

I'm working on a project with a Teensy 4.1. I want to send data to the Teensy over the serial port at 60 fps (one packet every 16.6 ms) and use it to control 4 stepper motors and 16 servos. The serial data is sent from Blender (the animation program) by a small Python add-on.

In my first successful tests, I simply sent 20 CSV values over serial. It was fast and worked pretty well, but it was neither flexible nor nicely coded.

To be more flexible, I rewrote this part: I now send a JSON package, receive it (non-blocking, works so far) and parse it with the ArduinoJson 7 library. The bottleneck seems to be the deserializeJson() function: it takes ~69us and heavily blocks the stepper motors.

This is the JSON data, sent via serial at 500000 baud:
JSON:
{"Frame":567,"Mode":"MANUAL","AnimationData":[{"Type":"STEPPER","ID":0,"Position":123},{"Type":"STEPPER","ID":1,"Position":456},{"Type":"STEPPER","ID":2,"Position":789},{"Type":"SERVO","ID":0,"Position":987},{"Type":"SERVO","ID":1,"Position":654},{"Type":"SERVO","ID":2,"Position":321},{"Type":"SERVO","ID":3,"Position":987},{"Type":"SERVO","ID":4,"Position":654},{"Type":"SERVO","ID":5,"Position":321},{"Type":"SERVO","ID":6,"Position":987},{"Type":"SERVO","ID":7,"Position":654},{"Type":"SERVO","ID":8,"Position":321},{"Type":"SERVO","ID":9,"Position":987},{"Type":"SERVO","ID":10,"Position":654},{"Type":"SERVO","ID":11,"Position":321}]}

(I already shortened the JSON data to a minimal version, which cut the parse time by ~20%. But that is still not where I need it to be.)
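
As a rough sanity check on the frame budget, the time the package spends on the wire can be estimated as below. This is only a sketch under assumptions: a plain UART link with 8N1 framing (10 bits per byte) and a payload of 640 characters, which is what the package above comes to. If the port is the Teensy's native USB serial, the baud setting may not be what limits throughput. The function and constant names are made up for illustration.

```cpp
#include <cstdint>

// Time on the wire for `bytes` bytes, in microseconds, assuming 8N1 framing
// (1 start + 8 data + 1 stop = 10 bits per byte). An estimate, not a measurement.
constexpr uint32_t frameTransferMicros(uint32_t bytes, uint32_t baud)
{
    return static_cast<uint32_t>(static_cast<uint64_t>(bytes) * 10u * 1000000u / baud);
}

// One frame slot at 60 fps, rounded down to whole microseconds.
constexpr uint32_t kFrameBudgetMicros = 1000000u / 60u;

// The 640-character package from the post at 500000 baud.
constexpr uint32_t kWireMicros = frameTransferMicros(640, 500000);

static_assert(kWireMicros == 12800, "640 bytes at 500000 baud, 8N1");
static_assert(kWireMicros < kFrameBudgetMicros, "package fits into one 60 fps slot");
```

Under those assumptions the transfer alone takes 12.8 ms of the 16.6 ms frame, so a shorter payload would help the link as well as the parser.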

The code of the parser is as follows:
receivedChars holds the data you see above.

C++:
bool SerialDataHandler::parseJSONData()
{
    // Part 1 takes ~69us
    current_micros = micros(); // start stopwatch
    newData = false;
    DeserializationError error = deserializeJson(doc, receivedChars); // BOTTLENECK

    // Test if parsing succeeds
    if (error)
    {
        Serial.print(F("deserializeJson() failed: "));
        Serial.println(error.f_str());
        return false;
    }

    stopwatch_micros = micros() - current_micros; // stop stopwatch
    Serial.printf("\n\nJSON parse time part 1: %luus (blocking!!)\n", stopwatch_micros);

    // Part 2 takes ~14us, iterating the array could be rewritten to non-blocking,
    // but the bottleneck above is more urgent
    current_micros = micros(); // start stopwatch

    uint32_t frame = doc["Frame"];
    const char *mode_ = doc["Mode"];
    JsonArray frameDataJson = doc["AnimationData"].as<JsonArray>();

    uint8_t stepperIndex = 0;
    uint8_t servoIndex = 0;

    for (JsonVariant actuator : frameDataJson)
    {
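        const char *type = actuator["Type"]; // nullptr if the key is missing
        if (type == nullptr)
            continue; // skip malformed entries instead of passing nullptr to strcmp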
        uint8_t id = actuator["ID"];
        uint32_t position = actuator["Position"];

        if (strcmp(type, "STEPPER") == 0)
        {
            frameData.stepperTargets[stepperIndex].id = id;
            frameData.stepperTargets[stepperIndex].position = position;
            stepperIndex++;
        }
        else if (strcmp(type, "SERVO") == 0)
        {
            frameData.servoTargets[servoIndex].id = id;
            frameData.servoTargets[servoIndex].position = position;
            servoIndex++;
        }
    }
    frameData.stepperCount = stepperIndex;
    frameData.servoCount = servoIndex;

    stopwatch_micros = micros() - current_micros; // stop stopwatch
    Serial.printf("\n\nJSON parse time part 2: %luus (blocking)\n", stopwatch_micros);

    // finished here, only serial output from here:
    Serial.printf("Frame: %lu", frame); // frame is unsigned, so %lu rather than %ld
    Serial.print("\nMode:");
    Serial.println(mode_);

    Serial.printf("%d Steppers:\n", frameData.stepperCount);
    for (int i = 0; i < frameData.stepperCount; i++)
    {
        Serial.print(frameData.stepperTargets[i].id);
        Serial.print(", ");
        Serial.println(frameData.stepperTargets[i].position);
    }

    Serial.printf("%d Servos:\n", frameData.servoCount);
    for (int i = 0; i < frameData.servoCount; i++)
    {
        Serial.print(frameData.servoTargets[i].id);
        Serial.print(", ");
        Serial.println(frameData.servoTargets[i].position);
    }
    return true;
}

Of course, parsing ~640 characters takes some time, and in other contexts 69us would be fast. I see several options:
1) Speed up deserialization (how?)
2) Run the stepper motors from timer interrupts (how?). Currently I'm using the AccelStepper library, which relies on a smooth main loop. With interrupts, the parsing function could block while the stepper motors keep running.
3) Not sending all data at once, but sending tiny packages for each Stepper or Servo and putting it back together on the microcontroller.
4) any more ideas?
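
One possible direction for options 1 and 3, sketched here as an assumption and not measured on the Teensy: replace the JSON tree with a fixed line format that is parsed in a single pass with strtoul(), similar to the earlier CSV version but with a tag per value. The line format ("F567;S0:123;V0:987", S = stepper, V = servo), the FrameData stand-in (the real struct is not shown above) and all function names are hypothetical.

```cpp
#include <cstdint>
#include <cstdlib>

// Stand-in for the frameData struct used in parseJSONData(); the array
// sizes and field layout are assumptions.
constexpr uint8_t kMaxSteppers = 4;
constexpr uint8_t kMaxServos = 16;
struct Target { uint8_t id; uint32_t position; };
struct FrameData {
    uint32_t frame = 0;
    Target stepperTargets[kMaxSteppers];
    Target servoTargets[kMaxServos];
    uint8_t stepperCount = 0;
    uint8_t servoCount = 0;
};

// Reads an unsigned decimal number at p and advances p past it.
// Returns false if p does not start with a digit.
static bool readUint(const char *&p, uint32_t &value)
{
    if (*p < '0' || *p > '9') return false;
    char *end = nullptr;
    value = static_cast<uint32_t>(std::strtoul(p, &end, 10));
    p = end;
    return true;
}

// Parses one line such as "F567;S0:123;S1:456;V0:987".
// Returns false on malformed input or when an array would overflow.
bool parseCompactFrame(const char *line, FrameData &out)
{
    out.stepperCount = 0;
    out.servoCount = 0;
    const char *p = line;
    if (*p++ != 'F' || !readUint(p, out.frame)) return false;
    while (*p == ';') {
        const char kind = p[1];
        if (kind != 'S' && kind != 'V') return false;
        p += 2; // skip ';' and the kind tag
        uint32_t id = 0, position = 0;
        if (!readUint(p, id) || *p++ != ':' || !readUint(p, position)) return false;
        if (kind == 'S') {
            if (out.stepperCount >= kMaxSteppers) return false;
            out.stepperTargets[out.stepperCount++] = {static_cast<uint8_t>(id), position};
        } else {
            if (out.servoCount >= kMaxServos) return false;
            out.servoTargets[out.servoCount++] = {static_cast<uint8_t>(id), position};
        }
    }
    return *p == '\0'; // anything left over means the line was malformed
}
```

Whether this is actually faster than deserializeJson() would need to be checked with the same micros() stopwatch as in parseJSONData(). The sketch also leaves out the Mode field, which would need its own tag.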

Thank you for your thoughts and all the best,
clockdiv
 
Did you try simply calling the AccelStepper::run() functions from a timer interrupt?
 
Actually, AccelStepper::run() is not very expensive and the T4.1 processor is quite fast. Here is a simple test which moves 4 steppers with random parameters. tick() calls the run() functions of the steppers from a timer interrupt every 100µs. While it executes, it sets pin 12 HIGH so the execution time can be checked with an LA/scope:

C++:
#include "AccelStepper.h"
#include "Arduino.h"

AccelStepper steppers[] = {             // array of 4 steppers
    {AccelStepper::DRIVER, 0, 1},
    {AccelStepper::DRIVER, 2, 3},
    {AccelStepper::DRIVER, 4, 5},
    {AccelStepper::DRIVER, 6, 7}};

IntervalTimer timer;

void tick()
{
    digitalWriteFast(12, HIGH);  // monitor with LA to check for execution duration
    for (AccelStepper& stepper : steppers)
    {
        if (stepper.distanceToGo() == 0)
            stepper.moveTo(-stepper.currentPosition());
        stepper.run();
    }
    digitalWriteFast(12, LOW);
}

void setup()
{
    pinMode(12, OUTPUT);
    for (AccelStepper& stepper : steppers)  // random stepper parameters
    {
        stepper.setMaxSpeed(random(1000, 2000));
        stepper.setAcceleration(random(50, 4000));
        stepper.moveTo(random(10000, 20000));
    }
    timer.begin(tick, 100);
}

void loop(){
}

The measured execution time of the tick() function is between 0.1µs and about 6µs, depending on how many steps need to be done. The duty cycle is below 1.5%, i.e. the processor spends <1.5% of its time in the interrupt. I would be very surprised if this generated any stability issues.

1707038076195.png


Generally, the processor has a nested interrupt controller where you can choose the priority of an interrupt. AFAIK the system-relevant interrupts (USB, systick etc.) all run with higher priority and will be able to interrupt the timer interrupt at its default priority setting.

I'd just give it a try. Moving the calls to the run functions from loop() to a timer function for testing shouldn't be a big deal.
 
Wow, thanks luni for your response and this test, that looks really promising! Thanks!!
Is this a screenshot from your oscilloscope or did you measure it somehow software-wise?
 