CMSIS-NN (Neural Network) Library Now Working on Teensy 3.6


Hi all.

Been curious about neural networks for a while (still don't have a clue how they work, but will one of these days), and I came across the CMSIS-NN library that is part of CMSIS-5. According to the hype:
CMSIS-NN library consists of two parts: NNFunctions and NNSupportFunctions. NNFunctions include the functions that implement popular neural network layer types, such as convolution, depthwise separable convolution, fully-connected (i.e. inner-product), pooling and activation. These functions are used by the application code to implement the neural network inference applications. The kernel APIs are also kept simple, so that it can be easily retargeted for any machine learning framework. NNSupportFunctions include different utility functions, such as data conversion and activation function tables, which are used in NNFunctions. These utility functions can also be used by the application code to construct more complex NN modules, e.g. Long Short-Term Memory (LSTM) or Gated Recurrent Unit (GRU).
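
To make the "fully-connected" and "activation" terms in that blurb a bit more concrete, here is a rough sketch in plain C of what such layers compute on 8-bit fixed-point (q7) data. This is not the CMSIS-NN API: the function names fc_q7_sketch and relu_q7_sketch, the parameter order, and the shift handling are all made up for illustration.

```c
#include <stdint.h>
#include <stddef.h>

/* Clamp a 32-bit accumulator to the signed 8-bit (q7) range. */
int8_t saturate_q7(int32_t v)
{
    if (v > 127)  return 127;
    if (v < -128) return -128;
    return (int8_t)v;
}

/* Hypothetical fully-connected (inner-product) layer on q7 data.
 * weights is row-major: num_out rows of num_in values each.
 * bias_shift scales the bias up to the accumulator format and
 * out_shift scales the accumulator back down before saturation.
 * Note: >> on a negative int32_t is implementation-defined in C;
 * this sketch assumes the usual arithmetic shift. */
void fc_q7_sketch(const int8_t *in, const int8_t *weights,
                  const int8_t *bias, size_t num_in, size_t num_out,
                  unsigned bias_shift, unsigned out_shift, int8_t *out)
{
    for (size_t o = 0; o < num_out; ++o) {
        int32_t acc = (int32_t)bias[o] * (1 << bias_shift);
        for (size_t i = 0; i < num_in; ++i) {
            acc += (int32_t)in[i] * weights[o * num_in + i];
        }
        out[o] = saturate_q7(acc >> out_shift);
    }
}

/* Hypothetical in-place ReLU activation: negative values become zero. */
void relu_q7_sketch(int8_t *data, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (data[i] < 0) data[i] = 0;
    }
}
```

A small inference step is then just a call to fc_q7_sketch followed by relu_q7_sketch on the output buffer. As I understand it, the real library does the same kind of arithmetic with kernels optimized for the Cortex-M cores.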

So I got curious and decided to see if I could get it working on a Teensy 3.6. Bottom line up front: I did manage to get it working on the Teensy 3.6 with one of the provided examples. It does produce output, but I still have to figure out whether that output is correct, since I cannot find what it should be.

Anyway, the first thing you have to do is update the DSP math library to the latest CMSIS version, which I did following the instructions on the forum. Then I created a CMSIS-NN library for the Teensy which has the include files and the source files in one spot. Then I modified the examples only slightly and it worked.

If anyone is interested in the CMSIS-NN library, I can put together the consolidated instructions for updating CMSIS-DSP to v5 and post the library and a working example on GitHub. I will probably do this tomorrow and post back when completed.


UPDATE: The library is now on GitHub. I will update the readme shortly.
Steps to Update CMSIS-DSP to v5

Here is a summary of the key posts and steps needed to update CMSIS-DSP (arm_math.h) to v5:

Step 1. Taken from Teensy Convolution SDR (Software Defined Radio), Thank you @Canoe

Two important items to check for the Teensy Convolution project:
1. use the correct Si5351 library for Arduino environment from Jason Milldrum here:

2. Follow these install guidelines for the required CMSIS version 4.5.0 library functions:
A) download the CMSIS v4.5.0 library from
B) unzip and copy these three files from the unzipped sub directory /CMSIS/Include:
arm_common_tables.h ; arm_const_structs.h ; arm_math.h
C) save these files to /--/Arduino/hardware/teensy/avr/cores/teensy3 (/--/ being the root of your Arduino environment)
D) Make these modifications to the arm_math.h file. Add:
#include <stdint.h>
#define __ASM __asm
#define __INLINE inline
#define __STATIC_INLINE static inline
#define __CORTEX_M 4
#define __FPU_USED 1
#define ARM_MATH_CM4
#include "core_cmInstr.h"
#include "core_cm4_simd.h"

comment out:
#if defined(ARM_MATH_CM7)
#include "core_cm7.h"
#elif defined (ARM_MATH_CM4)
#include "core_cm4.h"
#elif defined (ARM_MATH_CM3)
#include "core_cm3.h"
#elif defined (ARM_MATH_CM0)
#include "core_cm0.h"
#elif defined (ARM_MATH_CM0PLUS)
#include "core_cm0plus.h"
#else
#error "Define according the used Cortex core ARM_MATH_CM7, ARM_MATH_CM4, ARM_MATH_CM3, ARM_MATH_CM0PLUS or ARM_MATH_CM0"
#endif
#undef __CMSIS_GENERIC /* enable NVIC and Systick functions */
E) Copy two more files from the unzipped CMSIS library, folder CMSIS/Lib/GCC
libarm_cortexM4l_math.a ; libarm_cortexM4lf_math.a
F) save these files to /--/Arduino/hardware/tools/arm/arm-none-eabi/lib/

Step 2. Taken from: Request: update CMSIS-DSP (arm_math.h), Thank you @willie.from.texas
I followed that procedure in installing CMSIS Version 5.3. I modified step A with:
A) download the CMSIS v5.3.0 library from

I had to add one additional step:
G) In file core_cm4_simd.h
1) Comment out:
#define __PKHBT (ARG1,ARG2,ARG3), and
#define __PKHTB (ARG1,ARG2,ARG3)

Good luck!

Step 3. CMSIS-DSP library supports, Thank you @manitou

For V5.3 I had to ifdef out all of hardware/teensy/avr/cores/teensy3/core_cm4_simd.h for compile to work

To make it easier, I posted the updated files in the GitHub repository. All you have to do is copy and paste them "to /--/Arduino/hardware/teensy/avr/cores/teensy3 (/--/ being the root of your Arduino environment)".

EDIT: Note there is also this:
Back again. Did some more reading on CMSIS-NN, what it is and how to build models, and am providing some links if you want to learn more:

Deploying Convolutional Neural Network on Cortex-M with CMSIS-NN - this is a good place to start.

Tutorial: Low Power Deep Learning on the OpenMV Cam - This uses CMSIS-NN on the OpenMV product. While not specific to the Teensy, it is a good tutorial on deep learning.

and a couple more:

How to run deep learning model on microcontroller with CMSIS-NN (Part 2)
How to run deep learning model on microcontroller with CMSIS-NN (Part 3)

To build training models (and this still confuses me) you need to have Python 2.7 and Caffe installed. There are some prebuilt models around that you can use but this will be in another post.
With the T_4 pending and one of PJRC's ideas being machine learning - I suppose that M7 will come with the newer appropriate CMSIS?

Wondering if this interim time would be right to integrate and update the newer CMSIS for current ARM Teensys to have some parity and 'improvements' for those elements that cross over?
Afternoon @defragster. Just my thoughts:
Wondering if this interim time would be right to integrate and update the newer CMSIS for current ARM Teensys to have some parity and 'improvements' for those elements that cross over?
Think this would be a good idea. I saw several posts on at least updating the DSP library to the latest; think it's at v5.3 now. Not sure what it would take to do that though.

With the T_4 pending and one of PJRC's ideas being machine learning
Didn't know that one.
From Yet Another T4 post:
I am really digging my 3.2 I've been messing with. It's so tiny and adorable.

Teensy 4.0?! I don't know what I could possibly do with such power. I thought I would need a Teensy 3.6 to do FFT and update 2000 WS2812B LEDs without lagging behind, but the 3.2 is capable of that. I don't know what projects I can do with 3.6 that 3.2 can't do.

I don't know what I could possibly do with such power.

Machine Learning

what projects I can do with 3.6 that 3.2 can't do

Stereo Freeverb

Frequency domain based pitch shifting (not in the audio lib yet... but on my todo list)

More than 3 serial
Thanks Tim. Didn't realize that. Well - at least now I can tell you that CMSIS-NN does work on a T3.6 - haven't tried it on a T3.5 though. :)
Good to know the T_3.6 can play along as well. Was going to add my just posted link/note when I saw your OP as there have been several other posts on updates to CMSIS but it takes some PJRC effort from what I saw with custom integer integration for Audio Lib.
Neural Network and Obstacle Avoidance for a Rover

One of the first sites I was using as a guide for my rover obstacle avoidance functions was the PiRobot implementations. One of the things on that site I wanted to implement using the Teensy was their Learning Obstacle Avoidance by Example. Just never got around to it. Guess something else for my to do list. Anyway forgot to add this one to my previous list.


Forgot another one: Neuroduino: A neural network library for Arduino
Folks - To avoid flooding the forum with new links on neural networks and available software libraries, I decided to create and share an MS Notebook with the stuff I have found interesting. Here is the link: Let me know if you have problems.

NOTE: I checked and MS Notebook is available for the Apple. I already know it is available for Android devices. That is if you want to download it. Otherwise I think it opens up in a web browser so you don't need to install anything.

Hi mjs513,

I was installing Arduino onto a new laptop and also installed the CMSIS V5.3.0 library. In trying to run a sketch that uses that library, I encountered a problem that stems from the original instructions for installing CMSIS V4.5.0 and is responsible for the problems encountered with core_cm4_simd.h. I went back to my original installation on my desktop and noticed that I had commented out the last of the lines that were added by @Canoe. IN OTHER WORDS, DO NOT ADD THE LINE #include "core_cm4_simd.h" TO arm_math.h. There is no reason to modify core_cm4_simd.h if you don't include it.

Sorry about that. It must have been late and I was tired when I discovered that problem because I honestly don't recall commenting it out. Thanks for creating a github repository for those changes. Recommend you take out that one line and remove the core_cm4_simd.h file.
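
For reference, with that change the block added to arm_math.h in step D would look roughly like this. It is just the same lines listed earlier in the thread minus the core_cm4_simd.h include; I have not re-checked it against every CMSIS version, so treat it as a sketch.

```c
/* Lines added to arm_math.h in step D, without core_cm4_simd.h */
#include <stdint.h>
#define __ASM __asm
#define __INLINE inline
#define __STATIC_INLINE static inline
#define __CORTEX_M 4
#define __FPU_USED 1
#define ARM_MATH_CM4
#include "core_cmInstr.h"
// #include "core_cm4_simd.h"   (left out on purpose, per the note above)
```

With the include left out, step G and the ifdef workaround from step 3 should no longer be needed, since core_cm4_simd.h is never pulled in.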

Hi Willie

Never noticed that when I was looking at the files. I will make the change as soon as I get my computer back up and running.

Train a self-made Caffe model with non-image data

Hello everybody,

Can you create a neural network in Caffe and train it with numerical data, or is Caffe only suitable for image recognition?

I would like to create the network myself and train it with values from two acceleration sensors, previously processed via FFT. After that, an estimate should be output, for example for the fill level of a container.

I hope you can give me an answer.
Thank you. Right, I should take a look.

Is there actually now a way to implement the whole thing via TensorFlow? That is, a model trained in TensorFlow and then ported to a Teensy?