
Thread: Neural Network Example - ArduinoANN

  1. #1
    Join Date
    Mar 2014
Übersee, Germany

    Neural Network Example - ArduinoANN

I have compiled a demo for an artificial neural net, ArduinoANN.
On Arduino 1.0.5 and Teensyduino 1.18 I have found that one ANN training cycle takes
~26 secs on a Leonardo board and
~27 secs on a Teensy 3.0 @ 48 MHz.
Compilation is a bit slower for the Teensy, and the code size is about 200%.

After reinstalling Arduino 1.0.5-r2 and Teensyduino 1.19 I get
~100 secs on a Leonardo board !!!
~28 secs on a Teensy 3.0 @ 48 MHz and
~15 secs on a Teensy 3.1 @ 72 MHz.

The first results were strange, showing a Leonardo that was faster than a Teensy 3.0. I believe I did not mix things up; however, I do not want to step back to Teensyduino 1.18 to test.
The second result is a bit more logical in terms of speed. Still, an 8-bit µC with a clock that is 5 times slower should be far more than 7 times slower at float calculations than a 32-bit µC. Something is wrong here.

Does anybody have a hint why the Teensy 3.1 runs below the expected speed?
    Last edited by eduardo; 07-03-2014 at 11:41 AM.

  2. #2
    Senior Member
    Join Date
    Jan 2013
If you look at the code you'll find that it makes extensive use of the "float" data type. I am not aware of any Arduino or Arduino-compatible board that has hardware floating-point support in the form of an FPU. As such, these microcontrollers are not the most efficient hardware for a neural network, and the results you posted are not such a surprise. The toolchains, including the actual compilers, for the Leonardo (Atmel AVR) and the Teensy 3.x (ARM Cortex) are also quite different.

If you're looking for a relatively inexpensive platform with far more performant float number crunching, then perhaps look into a Raspberry Pi or, better, the BeagleBone Black. Both of these run at or in excess of 1 GHz, have orders of magnitude more memory, and have a real FPU that supports this kind of computing.

What speed did you expect the Teensy 3.1 to run at?
    Last edited by Headroom; 07-03-2014 at 12:17 PM.

  3. #3
    Join Date
    Mar 2014
Übersee, Germany
I am fully aware of the float variables, thanks.
I was trying to test Teensy 3.1 performance on a real-world example, and I believe the improvement should be higher. Float emulation consists of 32-bit operations, and those should be a lot slower on the AVR relative to its clock speed.
However, I may stop here, since a speed improvement could be seen in my second experiment, and that is OK so far.
Neural nets can run perfectly well on an Arduino, by the way, even if this demo is a bit slow. It is simply a demo.

  4. #4
    Senior Member+ MichaelMeissner's Avatar
    Join Date
    Nov 2012
Ayer, Massachusetts
Well, there is the Attoduino, which has an 80 MHz Cortex-M4F processor with 32-bit hardware float (the Teensy 3.x has the M4 without floating point). It was a Kickstarter project that was successfully funded on May 13th, but I don't think it has reached the rewards stage yet, so you would have to wait for all of the Kickstarter backers to get theirs before you can buy it retail. It is also somewhat pricey at $85, for which you could probably buy a Teensy, a BeagleBone Black, and a Bluetooth module separately.

There is the Navspark, which has a SPARC V9 processor inside, complete with 64-bit floating point. It was an Indiegogo campaign that was successfully funded, and it appears to be selling its boards retail.

I can't figure out whether the Galileo, which has an x86 inside, has a floating-point unit, but given that all of its I/O goes through I2C devices, it is on the slow side for actually doing anything you would want to use an Arduino for. In addition, it has rather high power requirements. You might as well get one of the Linux system-on-a-chip boxes (Raspberry Pi, BeagleBone Black, pcDuino), which are cheaper than the Galileo.

    It would be nice if there was a Teensy variant with hardware floating point.

For my day job, I am currently adding software emulation of IEEE 128-bit floating point to the PowerPC GCC toolchain, and I just did some measurements last night. On a Power7 machine, software-emulated IEEE 128-bit is about 60-130 times slower than the native hardware 64-bit floating point. I was surprised it was as fast as it was; I was expecting it to be even slower.
    Last edited by MichaelMeissner; 07-03-2014 at 01:31 PM.
