Future Teensy features & pinout

Voltage range: 0-3.3 V
Speed/resolution: low end 16-bit at 44.1 kHz; high end 16-bit at 500 kHz
Analog Devices, TI, Maxim, Microchip and others make multi-channel ADCs that can be easily managed over SPI or I2S. These are available in 2-8 channel versions at 1+ MSPS per converter. A search on Mouser, Digikey, or LCSC will show options.

Design & layout of a 3.3V 16-bit converter is not trivial. At 50uV/bit, it requires knowledge of analog PCB design practices for good noise performance. If you can reduce the bit depth or increase the voltage range, it will make the design process much easier.
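To put numbers on the 50uV/bit figure, here is a minimal sketch of the LSB arithmetic for an ideal converter. The function name and the 12-bit and 10 V comparison cases are my own illustration, not something proposed in the thread.

```python
def lsb_volts(v_full_scale: float, bits: int) -> float:
    """Size of one ADC code step (1 LSB) in volts for an ideal converter."""
    return v_full_scale / (2 ** bits)

# 3.3 V full scale at 16 bits: roughly 50 uV per code.
lsb_16 = lsb_volts(3.3, 16)
# Same range at 12 bits: roughly 806 uV per code, 16x more headroom for noise.
lsb_12 = lsb_volts(3.3, 12)
# 16 bits over a hypothetical 10 V range: roughly 153 uV per code.
lsb_10v = lsb_volts(10.0, 16)

print(f"{lsb_16 * 1e6:.1f} uV, {lsb_12 * 1e6:.1f} uV, {lsb_10v * 1e6:.1f} uV")
```

Either change (fewer bits or a wider range) raises the voltage that one code represents, which is why it relaxes the layout and noise requirements.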
 
One question:
Would it be possible to make the board solderable as an SMD part? The problem with through-hole is that the holes go through not only the top layer but also the internal ones, which makes the carrier PCB harder to design and also harder to produce. The solution is easy: extend the board a little to add SMD pads at the edge next to the pins, so the same PCB could be soldered either as SMD or with pins.
 

Normally castellated holes are used for SMD soldering a PCB down to another one, but that only works when there are no components on the bottom side, so that unfortunately isn't an option for Teensy.

If your concern is having to put through-holes through your carrier PCB, and you can afford the height of the female headers, you can also just use 0.1" SMD female headers rather than PTH.

Otherwise, I think the upcoming MicroMod version will solve some of these issues as the M.2 connector is SMD (presuming the M.2 breaks out all the pins you need).
 
One question:
It would be interesting make possible to solder as SMD?
This topic was discussed in some depth earlier in this thread. The only practical ways to do this are for the board to have no components on the bottom, or to require a cutout in the baseboard. Single-sided component placement has been done by at least one manufacturer. Among other things, there is an impact on noise performance due to the increased distance between the decoupling caps and the BGA pins. In my opinion, I would rather have a more robust board than a single-sided (components) board.
 

Agreed about the stability.
The greater distance between capacitors and pins would not likely affect performance at stock (or lower) clock speeds. However, for those of us (and I'm sure there are a large number of us) that overclock the CPU core to accelerate processing of certain time sensitive tasks this would introduce a VERY undesirable instability during these brief over-clocks. Another feature that would be harmed would be the ADC performance since the decoupling on the ADC power inputs is critical to clean conversions.
 
Otherwise, I think the upcoming MicroMod version will solve some of these issues as the M.2 connector is SMD (presuming the M.2 breaks out all the pins you need).

I got a couple of the M.2 connectors and they look to be pretty tricky to hand solder. Lots of solder wick, for sure. I'd go for reflow. Definitely not for your average hobbyist. Costs a lot more than 2 rows of 2.54mm pitch headers, too. Micromod doesn't give me more GPIOs than a T4.1 and no Ethernet so I probably won't be using it. Seems like a step back from the T4.1.
 
Agreed about the stability.
Another feature that would be harmed would be the ADC performance since the decoupling on the ADC power inputs is critical to clean conversions.

Which is why I have been using I2C ADCs - you can get a lot cleaner signal into (and out of) one of those. An ADC in a microcontroller is subject to a lot of compromises. I like it when the conversions happen far from the EMI factory. Makes my paltry analog skills look positively brilliant.
 
You are right. ADCs in MCUs are in VERY close proximity to many other peripherals, which can introduce their own noise and compromise the integrity of an analog signal. Discrete ADCs are still made for that very reason.
 
Also note that the silicon process for low-power, high-speed CMOS is not necessarily ideal for a high-performance sigma-delta modulator, which is essentially a precision analog circuit. Laser trimming is sometimes used for the highest-performing analog chips; it requires feature sizes far larger than modern high-speed CMOS and often takes up significant die area.

If you do have an ADC on an MCU, it will perform best if you halt the processor during acquisitions.
 
Sorry, will the M.2 connector allow inserting a Teensy into a PCB the way a DDR memory module is inserted?

About using 0.1" SMD connectors: it is a good idea, but I would use male ones, so I will not have trouble with height, and some components could also be placed under the Teensy.
 
@ PaulStoffregen: Have there been any new developments regarding the RT1170 based Teensy? (Like any decisive decisions about board layout, optional memory footprints, ...)
 
MIMXRT1170-EVK i'll update this post as additional tests are performed ... 1170 benchmarks

1170.jpg

Just received my MIMXRT1170-EVK board from mouser ($208). Using NXP MCUXpresso and NXP SDK, I have run a few of the examples on both cores, cm7 (M7@1 GHz) and cm4 (M4@400MHz). Peripheral IO (includes fast GPIO, XBAR, daisy) and timers (GPT 6, PIT 2x4, quad 4x4, flex PWM 4x8) look a lot like T4. I haven't figured out multicore usage yet.
Code:
  NXP 1170 memory

    512 Mbit SDRAM memory
    512 Mbit Octal Flash
    128 Mbit QSPI Flash
    2 Gbit   Raw NAND Flash
    64 Mbit  LPSPI Flash

cm7           location    size
BOARD_FLASH   0x30000000 0x1000000
SRAM_DTC_cm7  0x20000000 0x40000
SRAM_ITC_cm7  0x0        0x40000
SRAM_OC1      0x20240000 0x80000
SRAM_OC2      0x202c0000 0x40000
NCACHE_REGION 0x20300000 0x40000
SRAM_OC_ECC1  0x20340000 0x10000
SRAM_OC_ECC2  0x20350000 0x10000
BOARD_SDRAM   0x80000000 0x4000000

cm4  
BOARD_FLASH      0x8000000  0x1000000
OCRAM_DTCM_ALIAS 0x20220000 0x20000
SRAM_ITC_cm4     0x1ffe0000 0x20000
SRAM_OC1         0x20240000 0x40000
NCACHE_REGION    0x20280000 0x40000
SRAM_OC2         0x202c0000 0x80000
SRAM_OC_ECC1     0x20340000 0x10000
SRAM_OC_ECC2     0x20350000 0x10000
BOARD_SDRAM      0x80000000 0x4000000

 cm7 stack 2003ff58  ext 200014b8  const 3000a21c  fcn 30003fe1 malloc 200015f0
 cm4 stack 2023ff50  ext 2022a748  const 20227fd8  fcn 20221d49 malloc 2022a880
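As a sanity check on the cm7 table above, here is a rough sketch that converts the sizes to KiB and looks for overlapping ranges. The region values are copied from the listing; the helper names are mine and have nothing to do with the NXP SDK.

```python
# (name, base address, size in bytes), copied from the cm7 table above.
CM7_REGIONS = [
    ("BOARD_FLASH",   0x30000000, 0x1000000),
    ("SRAM_DTC_cm7",  0x20000000, 0x40000),
    ("SRAM_ITC_cm7",  0x00000000, 0x40000),
    ("SRAM_OC1",      0x20240000, 0x80000),
    ("SRAM_OC2",      0x202C0000, 0x40000),
    ("NCACHE_REGION", 0x20300000, 0x40000),
    ("SRAM_OC_ECC1",  0x20340000, 0x10000),
    ("SRAM_OC_ECC2",  0x20350000, 0x10000),
    ("BOARD_SDRAM",   0x80000000, 0x4000000),
]

def size_kib(size_bytes: int) -> int:
    """Region size in KiB."""
    return size_bytes // 1024

def find_overlaps(regions):
    """Sort by base address and flag any region that runs into its next neighbour."""
    ordered = sorted(regions, key=lambda r: r[1])
    overlaps = []
    for (name_a, base_a, size_a), (name_b, base_b, _) in zip(ordered, ordered[1:]):
        if base_a + size_a > base_b:
            overlaps.append((name_a, name_b))
    return overlaps

for name, base, size in CM7_REGIONS:
    print(f"{name:14s} 0x{base:08X}-0x{base + size - 1:08X} {size_kib(size):6d} KiB")
print("overlaps:", find_overlaps(CM7_REGIONS))
```

Within the cm7 view the on-chip regions butt up against each other without overlapping. Comparing the two tables by eye, the cm4 NCACHE_REGION at 0x20280000 sits inside the range the cm7 table calls SRAM_OC1, so the two linker views appear to carve up the same OCRAM differently.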

Code:
coremark gcc 9.3.1 -O3
cm7@996 4073 iterations/sec   275 ma
cm4@400  728 iterations/sec

  Datasheet power set points
                         MHz
        set point   cm7      cm4   DCDC_IN(ma)
          1         996      400    132.4
          0         700      240     79.3
          5         240      120     42.2
          7         200      100     19.7
          9           0      100     11.8
         11           0      200     24.2
corebar.png
Other MCU coremark results at end of perf.txt.
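For comparing cores at different clocks, it can help to normalize the CoreMark scores per MHz. A minimal sketch using only the two results above (the function name is just for illustration):

```python
def per_mhz(score: float, mhz: float) -> float:
    """Benchmark score normalized by core clock in MHz."""
    return score / mhz

cm7_per_mhz = per_mhz(4073, 996)  # about 4.09 CoreMark/MHz
cm4_per_mhz = per_mhz(728, 400)   # 1.82 CoreMark/MHz

print(f"cm7 {cm7_per_mhz:.2f}  cm4 {cm4_per_mhz:.2f}  ratio {cm7_per_mhz / cm4_per_mhz:.2f}")
```

By this measure the cm7 does a bit more than twice the work per clock of the cm4 on this benchmark, on top of its higher clock.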

Old dhrystone.c v2.1
Code:
     Dhrystone 2.1 (DMIPS)
  1170@996mhz      2277   SDK -O3
  T4@600mhz        2175   gcc -O3
  M7@600mhz        2033   ARM CC -O3
  T3.6@256mhz      1120   Fastest+pure+LTO
  T3.6@180mhz       287   Faster
  T3.6@120mhz       191   Faster
  T3.5@120mhz       138   Faster
  T3.2@120mhz       106   Faster
  ESP32@240mhz      255   -O2
  adaM4F@120mhz     168   -O2  SAMD51
  STM32L4@80mhz      63
  STM32F405@168mhz  198   -O2
  F767ZI@216mhz     773   -O3
  F446RE@180mhz     351  -O3
  pico@125mhz        20
  maple@72mhz        48 
  DUE@84mhz          49
  ZERO@48mhz         24
  UNO@16mhz           6

1170 crypto acceleration (CAAM)
TRNG and crypto accel for AES, SHA, DES, and asymmetric cryptography (RSA) supported in mbedtls lib.

mbedtls SDK benchmarks -O3
Code:
cm7
mbedTLS version 2.16.6
fsys=996000000
Using following implementations:
  SHA: CAAM HW accelerated
  AES: CAAM HW accelerated
  AES GCM: CAAM HW accelerated
  DES: CAAM HW accelerated
  Asymmetric cryptography: CAAM HW accelerated

  MD5                      :  5834.39 KB/s,  145.64 cycles/byte
  SHA-1                    :  24142.75 KB/s,   29.24 cycles/byte
  SHA-256                  :  22746.25 KB/s,   27.73 cycles/byte
  SHA-512                  :  813.76 KB/s,  1188.33 cycles/byte
  3DES                     :  11885.05 KB/s,   19.72 cycles/byte
  DES                      :  43668.04 KB/s,   11.38 cycles/byte
  AES-CBC-128              :  34505.66 KB/s,   17.85 cycles/byte
  AES-CBC-192              :  32308.14 KB/s,   19.94 cycles/byte
  AES-CBC-256              :  30238.82 KB/s,   22.00 cycles/byte
  AES-GCM-128              :  32428.12 KB/s,   19.85 cycles/byte
  AES-GCM-192              :  30358.25 KB/s,   21.82 cycles/byte
  AES-GCM-256              :  27575.46 KB/s,   24.01 cycles/byte
  AES-CCM-128              :  22045.39 KB/s,   33.92 cycles/byte
  AES-CCM-192              :  20150.00 KB/s,   38.10 cycles/byte
  AES-CCM-256              :  18526.98 KB/s,   42.31 cycles/byte
  CTR_DRBG (NOPR)          :  1956.96 KB/s,  486.43 cycles/byte
  CTR_DRBG (PR)            :  1349.88 KB/s,  722.21 cycles/byte
  HMAC_DRBG SHA-1 (NOPR)   :  570.32 KB/s,  1697.48 cycles/byte
  HMAC_DRBG SHA-1 (PR)     :  527.95 KB/s,  1835.51 cycles/byte
  HMAC_DRBG SHA-256 (NOPR) :  797.10 KB/s,  1210.97 cycles/byte
  HMAC_DRBG SHA-256 (PR)   :  797.12 KB/s,  1211.01 cycles/byte
  RSA-1024                 :  4545.33  public/s
  RSA-1024                 :  240.00 private/s
  RSA-2048                 :  1794.33  public/s
  RSA-2048                 :   94.67 private/s
  DHE-2048                 :   26.00 handshake/s
  DH-2048                  :   48.00 handshake/s
  ECDSA-secp256r1          :  236.33 sign/s
  ECDSA-secp256r1          :  165.00 verify/s
  ECDHE-secp256r1          :  187.00 handshake/s
  ECDH-secp256r1           :  351.00 handshake/s

cm4
  MD5                      :  1850.37 KB/s,  205.67 cycles/byte
  SHA-1                    :  5603.39 KB/s,   66.75 cycles/byte
  SHA-256                  :  5618.67 KB/s,   66.55 cycles/byte
  SHA-512                  :  162.21 KB/s,  2377.27 cycles/byte
  3DES                     :  23137.90 KB/s,   14.78 cycles/byte
  DES                      :  31683.35 KB/s,   10.36 cycles/byte
  AES-CBC-128              :  26819.54 KB/s,   12.55 cycles/byte
  AES-CBC-192              :  25331.83 KB/s,   13.39 cycles/byte
  AES-CBC-256              :  23907.89 KB/s,   14.29 cycles/byte
  AES-GCM-128              :  22565.69 KB/s,   15.15 cycles/byte
  AES-GCM-192              :  21500.71 KB/s,   15.99 cycles/byte
  AES-GCM-256              :  20493.28 KB/s,   16.87 cycles/byte
  AES-CCM-128              :  13030.69 KB/s,   27.51 cycles/byte
  AES-CCM-192              :  12365.10 KB/s,   29.10 cycles/byte
  AES-CCM-256              :  11697.15 KB/s,   30.87 cycles/byte
  CTR_DRBG (NOPR)          :  731.82 KB/s,  522.87 cycles/byte
  CTR_DRBG (PR)            :  483.29 KB/s,  793.33 cycles/byte
  HMAC_DRBG SHA-1 (NOPR)   :  126.43 KB/s,  3055.65 cycles/byte
  HMAC_DRBG SHA-1 (PR)     :  116.22 KB/s,  3326.44 cycles/byte
  HMAC_DRBG SHA-256 (NOPR) :  171.17 KB/s,  2251.73 cycles/byte
  HMAC_DRBG SHA-256 (PR)   :  171.18 KB/s,  2251.73 cycles/byte
  RSA-1024                 :  1037.33  public/s
  RSA-1024                 :   48.33 private/s
  RSA-2048                 :  1794.33  public/s
  RSA-2048                 :   94.67 private/s
  DHE-2048                 :   23.00 handshake/s
  DH-2048                  :   37.00 handshake/s
  ECDSA-secp256r1          :   59.00 sign/s
  ECDSA-secp256r1          :   43.67 verify/s
  ECDHE-secp256r1          :   49.67 handshake/s
  ECDH-secp256r1           :  120.33 handshake/s

mbedtls without and with crypto acceleration on 1170@996MHz. Acceleration disables the DCache.
Code:
                     no accel              crypto accel
        100!           324 us                 11780 us
        DH           27316 us                  2849 us
        RSA private 195755 us                 19114 us
        RSA pub       2376 us                   486 us
        RSA CRT      54820 us                  6178 us   
        SHA256         114 us  8982 KBs           8 us 128000 KBs

wolfssl performance on cm7@996mhz, -O3 SDK, no crypto accel
Code:
        100! 85 us 933262154439441526816992388562...
        N 2048 bits
        DH 19627 us
        RSA private 129262 us
        RSA pub 4563 us  comp 0
        RSA CRT 40022 us  comp 0
        MD5 111 us 9225 KBs
        SHA256 110 us 9309 KBs
        RC4 17 us 60235 KBs
        AESCBC 64 3 us 21333 KBs

mini-gmp performance cm7@996mhz, no crypto accel
Code:
          100! 57 us  20 chars 93326215443944152681
          DH 75439 us  1024 bits
          RSA priv 373023 us
          RSA pub 2932 us compare 0
          RSA CRT 104778 us compare 0
See other gmp performance numbers

RSAsign
I measured the performance of 1170 using Paul's RSA-2048 signature benchmark RSAsign
Code:
RSAsign        seconds
T3.6@180MHz    0.474
T4@600MHz      0.085
F767ZI@216     0.332   mbed -O3   0.203 with mbedtls
1170@996MHz    0.0577    paul's tls, 32KB heap, -O3 NXP SDK
1170@996MHz    0.0069    NXP mbedtls +crypto accel   (8x)
 cm4@400MHz    0.344     mbedtls+accel  0.0134
rsasign.png
The signature failed under the NXP SDK (MCUXpresso) because the heap was only 4KB. In the IDE I increased the heap to 32KB, and the signature was good. The NXP SDK has an mbedtls library that supports the crypto acceleration hardware, so I tested the RSA-2048 signature with the NXP lib and accelerated crypto. The crypto acceleration improved performance by a factor of 8 on the 1170. More RSAsign results.
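The "factor of 8" can be reproduced from the RSAsign table with simple ratios. A small sketch, with the timings copied from the table and the dictionary keys chosen by me:

```python
# RSA-2048 signature times in seconds, from the RSAsign table above.
RSASIGN_SECONDS = {
    "T3.6@180MHz": 0.474,
    "T4@600MHz": 0.085,
    "1170@996MHz software": 0.0577,
    "1170@996MHz mbedtls+accel": 0.0069,
}

def speedup(baseline_s: float, candidate_s: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s

accel = RSASIGN_SECONDS["1170@996MHz mbedtls+accel"]
accel_gain = speedup(RSASIGN_SECONDS["1170@996MHz software"], accel)  # about 8.4x
vs_t4 = speedup(RSASIGN_SECONDS["T4@600MHz"], accel)                  # about 12.3x

print(f"accel vs software: {accel_gain:.1f}x, accel vs T4: {vs_t4:.1f}x")
```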

Random numbers (CAAM)
The 1170 crypto unit (CAAM) can generate hardware random numbers (CAAM_RNG_GetRandomData()). The NXP SDK mbedtls lib utilizes the hardware random number generator (DCache disabled). Whether you ask for one random byte or 1000, the generator takes 125 ms, slower than the Teensy 4 TRNG. There is little documentation (need NDA), so there may be speed optimizations.

You could use the 1170 hardware random number generator to get a good initial seed, then use your favorite PRNG/hashing function (MD5, SHA, RC4, Mersenne Twister, LFSR, LCG ...) to generate subsequent random bits.
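The seed-once-then-PRNG idea looks roughly like this on a desktop. This is only a sketch: `os.urandom` stands in for the slow hardware generator (it is not the CAAM call), and Python's `random.Random` happens to be a Mersenne Twister, which is fine for simulation but is not a cryptographically secure generator.

```python
import os
import random

def seeded_prng(seed_bytes: int = 32) -> random.Random:
    """Pay the slow entropy cost once, then hand back a fast software PRNG."""
    # os.urandom is a stand-in for the hardware TRNG in this sketch.
    seed = int.from_bytes(os.urandom(seed_bytes), "big")
    return random.Random(seed)  # Mersenne Twister (MT19937)

prng = seeded_prng()
# 1000 32-bit values, the same workload as the PRNG timing table.
values = [prng.getrandbits(32) for _ in range(1000)]
print(len(values), hex(values[0]))
```

On the 1170 the same pattern would pay the 125 ms hardware cost once at startup instead of on every request.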

Mersenne PRNG and TinyMT, 1000 32-bit random numbers
Code:
mersenne   PRNG 1000 32-bit  (microseconds)
                               TinyMT
   NXP 1170@996MHz  41 us        18  us
   T4@600MHz        67           61 
   T3.6@180MHz     462          349
   T3.5@120MHz     694          526
   T3.2@120MHz     697          527
   LC@48MHz       2341         1864
   T2++@16MHz    38680        20636
   ESP32@240MHz    349          288
   F767ZI@216MHz   210           83
   F446RE@180MHz   417          130
   32F405@168MHz   388          411
   32L476RE@80MHz  982          812    dragonfly
   pico@125MHz     797          344
   M4@120MHz       519          502    SAMD51
   artemis@96MHz   748          851
   DUE@84MHz      1519         1204
   maple@72MHz    1443         1114
   ZERO@48MHz     2522         2084    SAMD21
   cpx@48MHz      2390         2017
mersenne.png
Here are some DSP performance results:
Code:
DSP FFT benchmark  1024  radix4 REVERSEBITS 0  (microseconds)
                q15     q31      f32       opt         arm_math.h
NXP 1170 1GHz   44.3     89.4     66.9    gcc -O3      v1.6.0 SDK
  T4@600mhz     77.4    147.0     87.0    gcc -O2      v1.5.1
  M7@600mhz     77.4    147.8     88.0    gcc -O3      v1.5.1 SDK
  M7@600mhz     74.5    126.9     95.6    ARM GCC -O3  v1.5.1  mbed
T3.6@256mhz    291.7    720.4    424.7    Faster       v1.5.3
T3.6@240mhz    311.2    768.8    453.0    Faster       v1.5.3
T3.6@180mhz    463.1   1215.2    703.7    Faster       v1.1.0
T3.6@180mhz    414.7   1010.7    598.2    Faster       v1.5.3
T3.5@120mhz    784.7   1947.9   1079.8    Faster       v1.1.0
T3.5@120mhz    658.5   1577.9    919.5    Faster       v1.5.3
K64F@120mhz    635.7   1273.8    827.2    ARM GCC -O3  v1.4.5 mbed
T3.2@120mhz    869.8   2498.5  18182.5    Faster       v1.1.0    no FPU
adaM4F@120mhz  701.3   1756.1    781.0    Faster       v1.1.0   SAMD51
STM32L4@80mhz  917.3   1953.8   1150.4    Faster       v1.4.5
STM32F405@168  466.5   1135.1    556.1    gcc -O2      v1.6.0
F767ZI@216mhz  206.9    352.7    262.7    arm gcc -O3  v1.5.1
and CMSIS-NN (neural network, CIFAR10)
Code:
1170@996mhz  13818 us  SDK -O3  arm_math.h 1.6.0
T4.0@600mhz  71102 us  Faster arm_math.h 1.5.1
T3.6@180mhz 445994 us  Faster arm_math.h 1.5.3
T3.5@120mhz 669922 us

float/double linear algebra
Code:
Linpack 100x100 mflops
               double    float
  1170@996mhz  120.3     289     NXP SDK -O3  10/19/21
   cm4@400mhz    2.7      46.2   NXP SDK -O3
  T4@600mhz     71.4     166.3   gcc -O3
  M7@600mhz     66.97    125.5   ARM CC -O3       
  T3.6@256mhz    2.85     41.1   Fastest
  T3.6@180mhz    2.13     28.4   Faster
  T3.5@120mhz    0.88     19.2   Faster
  T3.2@120mhz    0.65      1.0   Faster  no FPU
  ESP32@240mhz   2.8      44.5
  adaM4F@120mhz  1.4      20.1   SAMD51
  STM32L4@80mhz  0.88     15.4   dragonfly -O2
  STM32F405@168  1.8      28.3   -O2 adafruit
  F767ZI@216mhz 24.1      47.5   ARM CC -O3

linpackbar.png

Floating point interpolation, raytrace, and finite difference:
Code:
           float interpolate (us)  8x8 to 70x70
               bilinear  bicubic
T3.2@120mhz      18773    223109     no FPU
T3.5@120mhz       1944     26618
T3.6@180mhz       1294     16712
T4@600mhz          255      6406
1170@996mhz        158      2048   SDK -O3
adaM4F@120        1905    207326
ESP32@240mhz      1983    114813
STM32L4@80mhz     2897     37962
STM32F405@168mhz  1356    157633   -O2
F446RE@180mhz     1692    161939
F767ZI@216mhz      875    20149   -O3

 raytrace 8x8 float  -O2
                 microseconds
  1170@996mhz         28960   NXP SDK -O3
  T4@600mhz           45372
  T3.6@180mhz        186409
  T3.5@120mhz        301454
  T3.2@120mhz       6634437   no FPU
  ESP32@240mhz       204252
  adaM4F@120mhz      328686
  STM32F405@168mhz   225093
  F446RE@180mhz      213244
  STM32L4@80mhz      546230
  F767ZI@216mhz      134194   -O3

  finite difference 51x21 float -O2  fd
                 microseconds
    T4@600mhz         42305     double 84134
  T3.6@180mhz        169559          3662799
  T3.5@120mhz        257085
  T3.2@120mhz       4541625     no FPU
  1170@996mhz         24672     double 49391
  ESP32@240mhz       521806
  adaM4F@120mhz      250132
  STM32F405@168mhz   179313
  F446RE@180         227017
  STM32L4@80mhz      424934
  F767ZI@216         130689     double 25987
fd.png



See stochastic simulation performance and Cortex M7 superscalar speedup

FastCRC benchmark, table-driven
Code:
         CRC Benchmark  length: 16384 bytes
         Maxim (iButton) FastCRC: Value:0x 000000f6  89 us 1472.719101 mbs
         Maxim (iButton) builtin: Value:0x 000000f6  871 us 150.484501 mbs
         MODBUS FastCRC: Value:0x 00007029  121 us 1083.239669 mbs
         MODBUS builtin:  Value:0x 00007029  803 us 163.227895 mbs
         XMODEM FastCRC: Value:0x 000098d9  109 us 1202.495413 mbs
         XMODEM builtin:  Value:0x 000098d9  919 us 142.624592 mbs
         MCRF4XX FastCRC: Value:0x 00004a29  132 us 992.969697 mbs
         MCRF4XX builtin: Value:0x 00004a29  165 us 794.375758 mbs
         KERMIT FastCRC: Value:0x 0000b259  49 us 2674.938776 mbs
         Ethernet FastCRC: Value:0x 1271457f  444 us 295.207207 mbs
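The "mbs" column in the CRC output works out to megabits per second: buffer length in bits divided by the elapsed microseconds. A short sketch that reproduces two of the printed figures (the function name is mine):

```python
def throughput_mbit_s(num_bytes: int, elapsed_us: float) -> float:
    """Bits per microsecond, which is numerically the same as Mbit/s."""
    return num_bytes * 8 / elapsed_us

BUFFER_BYTES = 16384  # "length: 16384 bytes" in the benchmark header

maxim_fast = throughput_mbit_s(BUFFER_BYTES, 89)      # FastCRC, 89 us
maxim_builtin = throughput_mbit_s(BUFFER_BYTES, 871)  # builtin, 871 us

print(f"FastCRC {maxim_fast:.6f} Mbit/s, builtin {maxim_builtin:.6f} Mbit/s")
print(f"FastCRC is {871 / 89:.1f}x faster for the Maxim CRC")
```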



Notes
  • In the NXP SDK, the GPT timer clock sources are only 24 MHz and probably RC based: drift is 980 ppm against GPS PPS. Drift measured with quad timer PWM was 34 ppm. The quad timer and PIT timer use the 240 MHz bus clock. Also tested the 24 MHz crystal using the 64-bit PIT timer (34.67 ppm). GPT FIX: configure the GPTx clocks with the IDE's clock tool, or hack clock_config.c to make GPT2 use kCLOCK_GPT2_ClockRoot_MuxOsc24MOut (or call GPT_SetClockSource(GPTx,2)); the 24 MHz crystal drift is then 34.67 ppm. The GPT 32 kHz clock source (GPT_SetClockSource(GPTx,4)) measures -47 ppm.
  • Still not clear how memory banks are shared/protected between the cm4 and cm7 for a multicore app.
  • 12-bit DAC (1.8v, 1 ma, 4 us settle time) is available on test pad (TP18) on EVK board. 8-bit internal DAC can be routed internally to ADC or comparator/ACMP.
  • max ADC voltage is 1.8v on EVK board. 1.2us/sample with 24mhz ADC clock (12-bit resolution, average 1)
  • EVK power set points in SDK example power_mode_switch, running cm7 coremark: 275 ma (meter J38 1-2), Compare: T4 106 ma, 1060 EVK 184 ma
  • The NXP SDK mbedtls benchmark example disables the DCache with SCB_DisableDCache() and uses SysTick for timing. The test harness ensures the code being timed runs for at least 3 seconds. I've also done timing with GPT micros().
  • The 1170 eval board has Gig and 100T Ethernet jacks. Tested 100T UDP, TCP, and ping on the 1170. Uses lwIP (v2.1.1) polling and callbacks. See comparative performance table. To improve TCP receive performance, edit lwipopts.h to set TCP_WND to (6 * TCP_MSS).
  • Tested onboard microSD: read rate 39 mbs (2048-byte reads).
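For anyone wanting to repeat the drift measurements in the first note, the ppm arithmetic is just the tick count captured between two GPS PPS edges compared against the nominal clock. A minimal sketch; the tick count below is a made-up value chosen to land near the 34.67 ppm crystal figure, not a logged measurement.

```python
def drift_ppm(measured_ticks: int, nominal_hz: int, interval_s: float = 1.0) -> float:
    """Clock error in parts per million over a reference interval (e.g. one PPS period)."""
    expected_ticks = nominal_hz * interval_s
    return (measured_ticks - expected_ticks) / expected_ticks * 1e6

def seconds_per_day(ppm: float) -> float:
    """Accumulated time error per day for a given drift."""
    return ppm * 1e-6 * 86400

# Hypothetical capture: 24 MHz timer counted between two PPS edges.
crystal = drift_ppm(24_000_832, 24_000_000)

print(f"{crystal:.2f} ppm -> {seconds_per_day(crystal):.2f} s/day")
print(f"980 ppm -> {seconds_per_day(980):.1f} s/day")
```

At 980 ppm the default GPT clock would be off by well over a minute a day, versus about 3 seconds a day on the crystal.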

References
NXP MIMXRT1170-EVK: i.MX RT1170 Evaluation Kit
1170 datasheet
1060 to 1170 migration guide
Cortex-M7 instruction cycle counts, timings, and dual-issue combinations

I'll add other results of 1170 experiments to this post ....
 
Sorry, will the M.2 connector allow inserting a Teensy into a PCB the way a DDR memory module is inserted?
No, the Teensy is not and never will be used as DDR memory. Nor will it ever be compatible with a PC computer motherboard socket. Forum members have been discussing the possibility of co-opting the M.2 connector as a means of breaking out large numbers of IO pins. M.2 is dense, ubiquitous, cheap, and relatively easy to solder. So it makes a good choice.
 
On the subject of high-density connectors...

Have you designed any base boards for other products using the high-density connectors?

I'm not normally on this forum and I actually created an account just now, so don't stress too much about this post :rolleyes:.

I just want to share my experience as I lately designed both the module and the motherboard using an E Key M.2 connector with a custom 48mm length. I did it mostly because it looks nice and also because the assembled module+motherboard is lower profile.

PCB-wise it is for sure harder to design, in my opinion, as the side 0.1" header pins are nicely spread out, whereas the M.2 connector makes all traces bottleneck to one side. I haven't characterized my board in this regard yet, but I'm sure I have a decent amount of cross-talk on my module, as all traces need to travel the length of the board to the edge. It also limits the thickness of the board to 0.8mm, if this is a concern for you. I did the module on an OSH Park 2-layer board and this was not comfortable, but as you are on 6 layers, this should already be a lot better.

So in conclusion, I would vote in favor of the M.2 connector for the next teensy :)
 
Oh I forgot to add, I'm with you with the no debugger thing, as this is a controversial topic I will just put that blind vote against the motion out here without justification.
 
Those boards look great! I am curious, though, why did you decide to use the M.2 daughterboard arrangement rather than just soldering the module on to the host board?
 

Thank you, there are actually a couple of things I like about this design.

Mostly it gives me easy swapping to ease debugging. I keep a batch of the processing modules handy, and sometimes during motherboard bring-up I'm not sure if I blew something up (or I'm sure I did :)), so I can quickly swap the module and try with a known-good one.

Also, I can do the firmware development before the motherboard is designed, so when I receive the motherboard I can just install the same module I did the prototyping with, without unsoldering/resoldering.

Lastly, it makes for a clear interface of responsibility: the motherboard just needs to provide something in the range of 3 V to 6 V, so it can be USB 5 V or 3x lithium AA (5.4 V at full charge), and the module is in charge of the rest.
 
Hm, you have easy swapping with any kind of connector.
M2 does not have enough pins.

PLEASE tell us you don't have this in a child's room:

Edit : [Picture removed] - it is here.

Oh my... there is so much that is dangerous. Mains cable without strain relief, easy to pull out. Thin plastic over the screws. A housing which is an invitation for children to put things in... etc. etc. etc. NO!!!
 