Thread: Tensorflow on Teensy

    Hi all, I have some exciting news; Tensorflow Lite for Microcontrollers can be run fairly easily on the Teensy!

    For a bit of background, my research centers on music technology and embedded systems, but I had consciously avoided AI/ML for years because it felt too much like a buzzword and not at all aligned with my interests. My mind began to change in the first half of this year, though, with the advent of things like the Sparkfun Edge, Google Coral, and Nvidia Jetson Nano. The SF Edge piqued my interest in particular because microcontrollers are my jam, but I was disappointed by the maturity of software support (e.g. having to drop down almost to assembly to toggle an LED), so I decided to take a shot at porting TFLite to the Teensy in late May 2019.

    Long story short, I succeeded in getting the darn thing to compile, but it was a rough experience; lots of thrashing around with Makefiles, spurious clobbered newlines, and other evils. In the end my proof of concept was to hijack the TF code's main function with a blinking LED loop, showing that at least the function was getting called. Cool, but not very substantive and not well-documented so I'll leave it at that.

    Fast-forward to last week, I took this project off the back burner and decided to try again; things have gotten a lot better since then (TF is always under heavy development)! Here's what I've got so far:

    0) Intro / Caveats

    -> The hello_world example only runs on the T4.0 at the moment. I'm getting a linker error on T3.x that I could use some help from the community on; I'll detail it at the bottom of my post. The micro_speech example, however, runs on the T4.0 as well as the 3.2 and 3.5 (I don't have a 3.6 with stacking headers handy at the moment, but I'm confident it will work).

    -> We will be modifying code within an Arduino library generated for each TF project. It is simpler initially to download and install the nightly build .zip of each project (hello_world and micro_speech) as described in their respective READMEs, but I had some trouble with the speech example breaking after trying a fresh copy from a few days after my initial success. In the end it's best to clone the entire Tensorflow repo and generate the project .zips from there, again as described in the READMEs; that way one can check out previous commits in case things get funky. Note that when you unzip this generated library there's a top-level folder labeled 'arduino' which contains a recursive copy of the whole codebase. It's weird, and I usually just delete it, but it can be safely ignored; the code the library actually uses is in the top-level 'src' directory.

    -> The existing speech model was trained at 16KHz, so I've had to include code for the T4.0 (from Frank B by way of el_supremo) and for the T3.x to change the sample rate of the Audio Shield to 16KHz. Retraining the model for 44.1KHz is an ugly process which currently includes Docker, building all of TF from source, and *hours* of training time. Yuck. Also, 44.1KHz is kinda overkill for speech recognition (but would make interoperability with the rest of the Audio Library nicer). I'd like to streamline this process in the next few weeks.
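
    To make the T4.0 sample-rate change less of a black box, here is a host-side sketch of the arithmetic in the T4.0 branch of setI2SFreq() (listed in section 2). The struct and function names are hypothetical; only the formulas come from the listing. For freq = 16000: n2 = 1 + 648000000 / 16384000 = 40, C is about 27.3067, so c0 = 27 and c1 = 3066. That puts the PLL at roughly 24 MHz x 27.3066 = 655.36 MHz (inside the 648 to 1296 MHz range noted in the code comment), and dividing by n1 * n2 = 160 gives about 4.096 MHz, i.e. 256 x 16 kHz.

```cpp
#include <cstdint>

// Hypothetical plain-data holder for the values setI2SFreq() computes on T4.0.
struct AudioPllSettings {
  int n1;  // SAI1 pre-divider (fixed at 4 in the listing)
  int n2;  // SAI1 post-divider
  int c0;  // integer part of the PLL multiplier
  int c1;  // fractional numerator
  int c2;  // fractional denominator (fixed at 10000 in the listing)
};

// Same formulas as the T4.0 branch of setI2SFreq(), minus the register writes.
AudioPllSettings ComputeAudioPll(int freq) {
  AudioPllSettings s;
  s.n1 = 4;
  s.n2 = 1 + (24000000 * 27) / (freq * 256 * s.n1);
  const double C =
      (static_cast<double>(freq) * 256 * s.n1 * s.n2) / 24000000;
  s.c0 = static_cast<int>(C);
  s.c2 = 10000;
  s.c1 = static_cast<int>(C * s.c2 - (s.c0 * s.c2));
  return s;
}
```

    Running the numbers on a PC like this is a cheap sanity check before touching CCM registers on the actual board.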

    1) hello_world (A fading LED):

    To get started, let's look at the hello_world page; a neural network is trained to reproduce the sine function and fade an LED according to it. So the great news is that this runs *out of the box* on the T4.0, but when I first ran it the LED seemed solidly illuminated. What gives? Well, in a moderately hilarious twist, it's because the T4.0 is much faster than the reference platform (the just-released Nano 33 BLE Sense, Cortex-M4F @64MHz), so terrifyingly fast that it's fading faster than the eye can see! It seems the TF authors neglected to include any throttling to limit the rate at which the neural network actually reports results.

    The solution is to simply add a delay(1) at the end of the HandleOutput function in hello_world/src/tensorflow/lite/experimental/micro/examples/hello_world/arduino/output_handler.cpp :

    Code:
    /* Copyright 2019 The TensorFlow Authors. All Rights Reserved.
    
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
    ==============================================================================*/
    
    #include "tensorflow/lite/experimental/micro/examples/hello_world/output_handler.h"
    
    #include "Arduino.h"
    #include "tensorflow/lite/experimental/micro/examples/hello_world/constants.h"
    
    // The pin of the Arduino's built-in LED
    int led = LED_BUILTIN;
    
    // Track whether the function has run at least once
    bool initialized = false;
    
    // Fades the built-in LED to represent the current y value
    void HandleOutput(tflite::ErrorReporter* error_reporter, float x_value,
                      float y_value) {
      // Do this only once
      if (!initialized) {
        // Set the LED pin to output
        pinMode(led, OUTPUT);
        initialized = true;
      }
    
      // Calculate the brightness of the LED such that y=-1 is fully off
      // and y=1 is fully on. The LED's brightness can range from 0-255.
      int brightness = (int)(127.5f * (y_value + 1));
    
      // Set the brightness of the LED. If the specified pin does not support PWM,
      // this will result in the LED being on when brightness > 127, off otherwise.
      analogWrite(led, brightness);
    
      // Log the current brightness value for display in the Arduino plotter
      error_reporter->Report("%d\n", brightness);
    
      delay(1); // Slow things down a bit
    }
    This slows the inference loop enough to yield a reasonable, visually apparent sinusoidal fading of the LED.
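
    As a small worked example of the brightness mapping in HandleOutput: y = -1 gives 0, y = 0 gives 127, and y = 1 gives 255. The sketch below reproduces that mapping on a PC. The function name and the clamp are additions that are not in the original code; the clamp is there on the assumption that a network approximating sine can overshoot slightly past +/-1, which would otherwise push the value outside 0-255.

```cpp
#include <algorithm>

// Hypothetical helper: same formula as HandleOutput, plus a clamp to 0-255.
int BrightnessFromY(float y_value) {
  const int brightness = static_cast<int>(127.5f * (y_value + 1));
  return std::min(255, std::max(0, brightness));
}
```

    Without the clamp, an input of y = 1.2 would produce 280, which does not fit in an 8-bit PWM value.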

    2) micro_speech (Speech recognition of "yes" and "no"):

    As described in the README, this demonstrates how to run a model which recognizes keywords from an incoming audio stream. I'm using an electret microphone soldered to a Rev B Audio Shield, with a breadboard for the necessary pin rerouting when using the T4.0.

    Fortunately the TF authors have done a good job on modularity, so on the T4.0 all the necessary changes live in one file: micro_speech/src/tensorflow/lite/experimental/micro/examples/micro_speech/arduino/audio_provider.cpp :

    Code:
    /* Copyright 2018 The TensorFlow Authors. All Rights Reserved.
    
    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at
    
        http://www.apache.org/licenses/LICENSE-2.0
    
    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.
    ==============================================================================*/
    
    #include "tensorflow/lite/experimental/micro/examples/micro_speech/audio_provider.h"
    #include "tensorflow/lite/experimental/micro/examples/micro_speech/micro_features/micro_model_settings.h"
    
    // For Teensy audio
    #include <Audio.h>
    #include <Wire.h>
    #include <SPI.h>
    
    // Needed for changing sample rate on T4.0
    #if defined(__IMXRT1052__) || defined(__IMXRT1062__)
    #include <utility/imxrt_hw.h>
    #endif
    
    // Teensy Audio Library objects
    AudioInputI2S            i2s1;
    AudioRecordQueue         queue1;
    AudioConnection          patchCord1(i2s1, 0, queue1, 0);
    AudioControlSGTL5000     sgtl5000_1;
    
    // which input on the audio shield will be used?
    // const int myInput = AUDIO_INPUT_LINEIN;
    const int myInput = AUDIO_INPUT_MIC;
    
    // For changing audio sample rate
    void setI2SFreq(int freq) {
    #if defined(KINETISK) // Teensy 3.2/3.5/3.6
      typedef struct {
        uint8_t mult;
        uint16_t div;
      } __attribute__((__packed__)) tmclk;
      const int numfreqs = 14;
      const int samplefreqs[numfreqs] = { 8000, 11025, 16000, 22050, 32000, 44100, 44117.64706 , 48000, 88200, 44117.64706 * 2, 96000, 176400, 44117.64706 * 4, 192000};
    
    #if (F_PLL==16000000)
      const tmclk clkArr[numfreqs] = {{16, 125}, {148, 839}, {32, 125}, {145, 411}, {64, 125}, {151, 214}, {12, 17}, {96, 125}, {151, 107}, {24, 17}, {192, 125}, {127, 45}, {48, 17}, {255, 83} };
    #elif (F_PLL==72000000)
      const tmclk clkArr[numfreqs] = {{32, 1125}, {49, 1250}, {64, 1125}, {49, 625}, {128, 1125}, {98, 625}, {8, 51}, {64, 375}, {196, 625}, {16, 51}, {128, 375}, {249, 397}, {32, 51}, {185, 271} };
    #elif (F_PLL==96000000)
      const tmclk clkArr[numfreqs] = {{8, 375}, {73, 2483}, {16, 375}, {147, 2500}, {32, 375}, {147, 1250}, {2, 17}, {16, 125}, {147, 625}, {4, 17}, {32, 125}, {151, 321}, {8, 17}, {64, 125} };
    #elif (F_PLL==120000000)
      const tmclk clkArr[numfreqs] = {{32, 1875}, {89, 3784}, {64, 1875}, {147, 3125}, {128, 1875}, {205, 2179}, {8, 85}, {64, 625}, {89, 473}, {16, 85}, {128, 625}, {178, 473}, {32, 85}, {145, 354} };
    #elif (F_PLL==144000000)
      const tmclk clkArr[numfreqs] = {{16, 1125}, {49, 2500}, {32, 1125}, {49, 1250}, {64, 1125}, {49, 625}, {4, 51}, {32, 375}, {98, 625}, {8, 51}, {64, 375}, {196, 625}, {16, 51}, {128, 375} };
    #elif (F_PLL==180000000)
      const tmclk clkArr[numfreqs] = {{46, 4043}, {49, 3125}, {73, 3208}, {98, 3125}, {183, 4021}, {196, 3125}, {16, 255}, {128, 1875}, {107, 853}, {32, 255}, {219, 1604}, {214, 853}, {64, 255}, {219, 802} };
    #elif (F_PLL==192000000)
      const tmclk clkArr[numfreqs] = {{4, 375}, {37, 2517}, {8, 375}, {73, 2483}, {16, 375}, {147, 2500}, {1, 17}, {8, 125}, {147, 1250}, {2, 17}, {16, 125}, {147, 625}, {4, 17}, {32, 125} };
    #elif (F_PLL==216000000)
      const tmclk clkArr[numfreqs] = {{32, 3375}, {49, 3750}, {64, 3375}, {49, 1875}, {128, 3375}, {98, 1875}, {8, 153}, {64, 1125}, {196, 1875}, {16, 153}, {128, 1125}, {226, 1081}, {32, 153}, {147, 646} };
    #elif (F_PLL==240000000)
      const tmclk clkArr[numfreqs] = {{16, 1875}, {29, 2466}, {32, 1875}, {89, 3784}, {64, 1875}, {147, 3125}, {4, 85}, {32, 625}, {205, 2179}, {8, 85}, {64, 625}, {89, 473}, {16, 85}, {128, 625} };
    #endif
    
      for (int f = 0; f < numfreqs; f++) {
        if ( freq == samplefreqs[f] ) {
          while (I2S0_MCR & I2S_MCR_DUF) ;
          I2S0_MDR = I2S_MDR_FRACT((clkArr[f].mult - 1)) | I2S_MDR_DIVIDE((clkArr[f].div - 1));
          return;
        }
      }
    #elif defined(__IMXRT1062__) // Teensy 4.0
      // PLL between 27*24 = 648MHz and 54*24 = 1296MHz
      int n1 = 4; //SAI prescaler 4 => (n1*n2) = multiple of 4
      int n2 = 1 + (24000000 * 27) / (freq * 256 * n1);
      double C = ((double)freq * 256 * n1 * n2) / 24000000;
      int c0 = C;
      int c2 = 10000;
      int c1 = C * c2 - (c0 * c2);
      set_audioClock(c0, c1, c2, true);
      CCM_CS1CDR = (CCM_CS1CDR & ~(CCM_CS1CDR_SAI1_CLK_PRED_MASK | CCM_CS1CDR_SAI1_CLK_PODF_MASK))
          | CCM_CS1CDR_SAI1_CLK_PRED(n1-1) // &0x07
          | CCM_CS1CDR_SAI1_CLK_PODF(n2-1); // &0x3f
    //Serial.printf("SetI2SFreq(%d)\n",freq);
    #endif
    }
    
    namespace {
    bool g_is_audio_initialized = false;
    // An internal buffer able to fit 16x our sample size
    // AUDIO_BLOCK_SAMPLES is 128 by default in the Teensy Audio Library
    constexpr int kAudioCaptureBufferSize = AUDIO_BLOCK_SAMPLES * 16;
    int16_t g_audio_capture_buffer[kAudioCaptureBufferSize];
    // A buffer that holds our output
    int16_t g_audio_output_buffer[kMaxAudioSampleSize];
    // Mark as volatile so we can check in a while loop to see if
    // any samples have arrived yet.
    volatile int32_t g_latest_audio_timestamp = 0;
    }  // namespace
    
    void CaptureSamples() {
      // This is how many samples of new data we have each time this is called
      const int number_of_samples = AUDIO_BLOCK_SAMPLES;
      // Calculate what timestamp the last audio sample represents
      const int32_t time_in_ms =
        g_latest_audio_timestamp +
        (number_of_samples / (kAudioSampleFrequency / 1000));
      // Determine the index, in the history of all samples, of the last sample
      const int32_t start_sample_offset =
        g_latest_audio_timestamp * (kAudioSampleFrequency / 1000);
      // Determine the index of this sample in our ring buffer
      const int capture_index = start_sample_offset % kAudioCaptureBufferSize;
      // Read the data to the correct place in our buffer
      if (queue1.available()) {
        memcpy(g_audio_capture_buffer + capture_index, queue1.readBuffer(), AUDIO_BLOCK_SAMPLES * sizeof(int16_t));
        queue1.freeBuffer();
      }
      // This is how we let the outside world know that new audio data has arrived.
      g_latest_audio_timestamp = time_in_ms;
    }
    
    TfLiteStatus InitAudioRecording(tflite::ErrorReporter* error_reporter) {
      // Teensy Audio initialization stuff
      // Audio connections require memory, and the record queue
      // uses this memory to buffer incoming audio.
      AudioMemory(60);
    
      // Enable the audio shield, select input, adjust mic gain
      sgtl5000_1.enable();
      sgtl5000_1.inputSelect(myInput);
      //sgtl5000_1.micGain(10);
    
      // This is important, the model was trained at a 16KHz sample rate
      setI2SFreq(16000);
    
      // Start up the recording queue
      queue1.begin();
    
      // Block until we have our first audio sample
      while (!g_latest_audio_timestamp) {
        LatestAudioTimestamp(); // This function polls the queue for incoming blocks
      }
    
      return kTfLiteOk;
    }
    
    TfLiteStatus GetAudioSamples(tflite::ErrorReporter* error_reporter,
                                 int start_ms, int duration_ms,
                                 int* audio_samples_size, int16_t** audio_samples) {
      // Set everything up to start receiving audio
      if (!g_is_audio_initialized) {
        TfLiteStatus init_status = InitAudioRecording(error_reporter);
        if (init_status != kTfLiteOk) {
          return init_status;
        }
        g_is_audio_initialized = true;
      }
      // This next part should only be called when the main thread notices that the
      // latest audio sample data timestamp has changed, so that there's new data
      // in the capture ring buffer. The ring buffer will eventually wrap around and
      // overwrite the data, but the assumption is that the main thread is checking
      // often enough and the buffer is large enough that this call will be made
      // before that happens.
    
      // Determine the index, in the history of all samples, of the first
      // sample we want
      const int start_offset = start_ms * (kAudioSampleFrequency / 1000);
      // Determine how many samples we want in total
      const int duration_sample_count =
          duration_ms * (kAudioSampleFrequency / 1000);
      for (int i = 0; i < duration_sample_count; ++i) {
        // For each sample, transform its index in the history of all samples into
        // its index in g_audio_capture_buffer
        const int capture_index = (start_offset + i) % kAudioCaptureBufferSize;
        // Write the sample to the output buffer
        g_audio_output_buffer[i] = g_audio_capture_buffer[capture_index];
      }
    
      // Set pointers to provide access to the audio
      *audio_samples_size = kMaxAudioSampleSize;
      *audio_samples = g_audio_output_buffer;
    
      return kTfLiteOk;
    }
    
    // main.cpp calls this, checking if the timestamp has advanced; if so it assumes
    // there are new sample blocks available, and tries to invoke the interpreter
    int32_t LatestAudioTimestamp() {
      // Are there new blocks available, and if so how many?
      int num_blocks = queue1.available();
      if (num_blocks > 0) {
        // For all new blocks, call CaptureSamples()
        for (int i = 0; i < num_blocks; i++) {
          CaptureSamples();
        }
      }
    
      // Any successful calls to CaptureSamples() in the above loop will have
      // advanced the timestamp, return it here
      return g_latest_audio_timestamp;
    }
    The code makes the builtin LED flicker as it's listening, and it lights up solid for 3 seconds if it hears the word "yes". More info is available in the Arduino Serial Monitor (e.g. recognized "no", unknowns, timestamps); I'm currently getting a lot of 'Couldn't push_back latest result, too many already!' messages, but I suspect this is largely a limitation of the current model and its handrolled queue data structure, not the fault of the Teensy (although the prodigious speed of the T4.0 might be exacerbating things).
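
    For anyone following the timestamp bookkeeping in CaptureSamples(), here is the arithmetic as a host-side sketch. The constant and function names are stand-ins for AUDIO_BLOCK_SAMPLES, kAudioSampleFrequency, and kAudioCaptureBufferSize, using the values stated in the listing (128 samples per block, 16 kHz). Each block covers 128 / 16 = 8 ms, and the 2048-sample ring buffer holds 128 ms of audio before it wraps.

```cpp
#include <cstdint>

constexpr int kBlockSamples = 128;    // AUDIO_BLOCK_SAMPLES default
constexpr int kSampleRateHz = 16000;  // kAudioSampleFrequency for this model
constexpr int kCaptureBufferSize = kBlockSamples * 16;  // 2048 samples

// Milliseconds of audio represented by one block.
constexpr int32_t BlockDurationMs() {
  return kBlockSamples / (kSampleRateHz / 1000);
}

// Ring buffer index at which the block starting at timestamp_ms is written.
constexpr int CaptureIndex(int32_t timestamp_ms) {
  return (timestamp_ms * (kSampleRateHz / 1000)) % kCaptureBufferSize;
}
```

    Note that the integer division only comes out clean because 16000 / 1000 divides 128 exactly. At 44.1KHz the same expression would use 44 samples per millisecond, so the timestamps would drift unless the bookkeeping were changed.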

    For the T3.x, two additional changes must be made:
    a) The builtin LED on pin 13 cannot be used because it's the I2S RX pin on the Audio Shield. The LED pin can be changed by modifying micro_speech/src/tensorflow/lite/experimental/micro/examples/micro_speech/arduino/command_responder.cpp. I've omitted the code here for brevity since it's a rather simple change, but I can share it if anyone wants.
    b) Template argument deduction fails (double-float mismatch) in a call to std::min() in micro_speech/src/tensorflow/lite/kernels/internal/quantization_util.cpp on line 291. The fix is to do a static_cast<double> to both args, as follows:

    Code:
    #else   // TFLITE_EMULATE_FLOAT
    // const double input_beta_real_multiplier = std::min(
    //     beta * input_scale * (1 << (31 - input_integer_bits)), (1ll << 31) - 1.0);
      const double input_beta_real_multiplier = std::min(
          static_cast<double>(beta * input_scale * (1 << (31 - input_integer_bits))),
          static_cast<double>((1ll << 31) - 1.0));
    #endif  // TFLITE_EMULATE_FLOAT
    I've included a .zip of the example with all the necessary changes if anyone would like to take it for a spin without diving headfirst into wrangling with TF itself.
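
    Some background on why the casts help: std::min is a template taking two arguments of the same type T, so a call with one double and one float cannot deduce T. A likely explanation for the mismatch (an assumption, not confirmed here) is that the T3.x build uses -fsingle-precision-constant, which appears in the link command in section 3 and would make the literal 1.0 a float. The sketch below shows the same failure in isolation and an equivalent alternative to the two casts, namely naming the template argument explicitly. The function name is hypothetical.

```cpp
#include <algorithm>

// Hypothetical stand-in for the quantization_util.cpp expression:
// one operand is double, the other float.
double ClampMultiplier(double scaled, float limit) {
  // std::min(scaled, limit) would fail to compile here, because T would be
  // deduced as double from the first argument and float from the second.
  // Naming T explicitly converts both operands to double.
  return std::min<double>(scaled, limit);
}
```

    Either form (explicit casts or std::min<double>) should behave the same; the casts have the advantage of matching the style of the surrounding TF code.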

    3) The T3.x Linker Error:

    As I mentioned, I'm getting an error with the hello_world example when compiling for the T3.x (MacOS Mojave 10.14.6, Arduino 1.8.9, Teensyduino 1.47). After adding the aforementioned delay(1) and std::min() casts, I get the following error:

    Code:
    /Applications/Arduino.app/Contents/Java/hardware/teensy/../tools/arm/bin/arm-none-eabi-gcc-ar rcs /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/core/core.a /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/core/yield.cpp.o
    Archiving built core (caching) in: /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_cache_745649/core/core_teensy_avr_teensy31_usb_serial,speed_96,opt_o2std,keys_en-us_cee3a1d70ca5f7f18ab44a012a95ee10.a
    Linking everything together...
    /Applications/Arduino.app/Contents/Java/hardware/teensy/../tools/arm/bin/arm-none-eabi-gcc -O2 -Wl,--gc-sections,--relax,--defsym=__rtc_localtime=1567176756 -T/Applications/Arduino.app/Contents/Java/hardware/teensy/avr/cores/teensy3/mk20dx256.ld -lstdc++ -mthumb -mcpu=cortex-m4 -fsingle-precision-constant -o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/hello_world.ino.elf /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/sketch/hello_world.ino.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/c/c_api_internal.c.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/core/api/error_reporter.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/core/api/flatbuffer_conversions.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/core/api/op_resolver.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/arduino/debug_log.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/debug_log_numbers.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/examples/hello_world/arduino/constants.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/examples/hello_world/arduino/output_handler.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/examples/hello_world/main.cpp.o 
/var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/examples/hello_world/sine_model_data.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/all_ops_resolver.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/arg_min_max.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/ceil.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/comparisons.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/conv.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/depthwise_conv.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/elementwise.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/floor.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/fully_connected.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/logical.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/maximum_minimum.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/pack.cpp.o 
/var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/pooling.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/prelu.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/reshape.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/round.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/softmax.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/split.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/strided_slice.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/kernels/unpack.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/micro_allocator.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/micro_error_reporter.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/micro_interpreter.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/micro_mutable_op_resolver.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/experimental/micro/simple_tensor_allocator.cpp.o 
/var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/kernels/internal/quantization_util.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/libraries/hello_world/tensorflow/lite/kernels/kernel_util.cpp.o /var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345/core/core.a -L/var/folders/gh/k9281zxj157gwxrs7fnbz8940000gn/T/arduino_build_181345 -larm_cortexM4l_math -lm
    /Applications/Arduino.app/Contents/Java/hardware/tools/arm/bin/../lib/gcc/arm-none-eabi/5.4.1/../../../../arm-none-eabi/lib/armv7e-m/libc.a(lib_a-writer.o): In function `_write_r':
    writer.c:(.text._write_r+0x12): undefined reference to `_write'
    collect2: error: ld returned 1 exit status
    Using library hello_world at version 1.13 in folder: /Users/andrew/Documents/Arduino/libraries/hello_world 
    Error compiling for board Teensy 3.2 / 3.1.
    From the poking around I've done on the PJRC forums and elsewhere, it seems like this sometimes has to do with printf() stuff, but that doesn't make sense because the much more complex micro_speech example is printing to the serial port just fine on T3.x. So is there some subtle bug buried within the compiler flags that would cause this behavior on T3.x but not T4.0? Note I've also tried downgrading Arduino to 1.8.7 and/or Teensyduino to 1.46 (versions which didn't give me these issues during my initial efforts back in May). I've included a .zip of the example which reproduces this issue. Any ideas?

    4) Future Directions

    This is all very exciting, but there's still work to be done. For one thing, speech recognition accuracy is not that great on the T3.x; I suspect it may take some tweaking of the model's thresholding and averaging strategies to get rid of chattering duplicate results and missed commands. For another, the FFT routines being used are written for portability and do not take advantage of any of the DSP acceleration available on ARM. There's some talk in the TFLite README of how to make target-optimized kernels using CMSIS-NN, but a cursory try at building and running a library generated this way did not seem to yield any improved results, so I'll have to dig deeper.
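
    To illustrate what "thresholding and averaging" means in practice, here is a minimal host-side sketch of the general idea: average the recent scores for a keyword, require the average to clear a threshold, and suppress repeat detections for a while afterwards. The class and its parameters are hypothetical and are not the TF implementation. It also uses std::deque for brevity, which allocates on the heap; a fixed-size array would be the more natural choice on a microcontroller.

```cpp
#include <cstdint>
#include <deque>
#include <utility>

// Hypothetical helper, not part of TensorFlow: smooths the scores for one
// keyword and reports a detection at most once per suppression window.
class KeywordSmoother {
 public:
  KeywordSmoother(int32_t window_ms, int threshold, int32_t suppression_ms)
      : window_ms_(window_ms),
        threshold_(threshold),
        suppression_ms_(suppression_ms) {}

  // score is assumed to be 0-255; returns true when a new detection fires.
  bool Update(int32_t now_ms, int score) {
    history_.emplace_back(now_ms, score);
    // Drop scores older than the averaging window.
    while (history_.front().first < now_ms - window_ms_) {
      history_.pop_front();
    }
    int32_t sum = 0;
    for (const auto& entry : history_) sum += entry.second;
    const int32_t average = sum / static_cast<int32_t>(history_.size());
    if (average < threshold_) return false;
    // Suppress repeats that arrive too soon after the previous detection.
    if (has_fired_ && now_ms - last_fired_ms_ < suppression_ms_) return false;
    has_fired_ = true;
    last_fired_ms_ = now_ms;
    return true;
  }

 private:
  int32_t window_ms_;
  int threshold_;
  int32_t suppression_ms_;
  std::deque<std::pair<int32_t, int>> history_;
  bool has_fired_ = false;
  int32_t last_fired_ms_ = 0;
};
```

    A longer window trades responsiveness for fewer false positives, and a longer suppression period trades missed back-to-back commands for less chattering, so these are the knobs worth experimenting with first.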

    Needless to say, Tensorflow is a comprehensive general purpose framework, so it's fertile ground for many AI/ML applications outside speech recognition; the robustness and maturity of the Teensy ecosystem may prove invaluable for some of these efforts. Thank you Paul and the entire PJRC community!

    Best,
    Andrew