K66 Beta Test

Status
Not open for further replies.
Update: I just created a pull request for the Wire library that supports the additional IO pins. I tested on an Adafruit BNO055, where I tried 4 different combinations for SCL, SDA:
(16, 17), (19, 18), (33, 34), (7, 8)

Test program:
Code:
#include <Wire.h>
#include <Adafruit_Sensor.h>
#include <Adafruit_BNO055.h>
#include <utility/imumaths.h>
  
Adafruit_BNO055 bno = Adafruit_BNO055(55);

void setup(void) 
{
  uint32_t time_start = millis();
  while (!Serial && ((millis()-time_start) < 4000)) ;
  
  Serial.begin(9600);
  Serial.println("Orientation Sensor Test"); Serial.println("");
  
  /* Initialise the sensor */
  // On Test board 
  // SCL 19, 16, 33, 7
  // SDA 18, 17, 34, 8
  Wire.setSCL(33);
  Wire.setSDA(34);
  
  if(!bno.begin())
  {
    /* There was a problem detecting the BNO055 ... check your connections */
    Serial.print("Ooops, no BNO055 detected ... Check your wiring or I2C ADDR!");
    while(1);
  }
  
  delay(1000);
    
  bno.setExtCrystalUse(true);
  Serial.println("Setup Complete");
}

void loop(void) 
{
  /* Get a new sensor event */ 
  sensors_event_t event; 
  bno.getEvent(&event);
  
  /* Display the floating point data */
  Serial.print("X: ");
  Serial.print(event.orientation.x, 4);
  Serial.print("\tY: ");
  Serial.print(event.orientation.y, 4);
  Serial.print("\tZ: ");
  Serial.print(event.orientation.z, 4);
  Serial.println("");
  
  delay(100);
}
 
Paul:
> Need to add to Kinetis.h a define for this line 1169 - based on the FM it is bit 6 like SIM_SCGC4_I2C0:
Code:
#define SIM_SCGC4_I2C2			((uint32_t)0x00000040)		// I2C2 Clock Gate Control

> Does it make sense to make the edits above to Adafruit_SSD1306 in the Teensy library to allow easy usage of I2C_t3 on those devices?
Warning: this is not correct!
I2C2 is not defined in the SCGC4 register; it is defined in the SCGC1 register. And it is already defined in kinetis.h:
Code:
#define SIM_SCGC1_I2C3			((uint32_t)0x00000080)		// I2C3 Clock Gate Control
#define SIM_SCGC1_I2C2			((uint32_t)0x00000040)		// I2C2 Clock Gate Control


UPDATE: Tested a hacked version of the Adafruit_BNO055 code and ran it using Wire1 and Wire2. I pushed my change to the i2c_t3 library, as well as a new branch of Adafruit_bno055 that can use i2ct3.h

The test program (minus setSCL/setSDA) worked for Wire1 (37/38) and Wire2 (3/4) :D
So line 217 in the i2c_t3.cpp (updated) should be:
Code:
       SIM_SCGC1 |= SIM_SCGC1_I2C2;

I am hacking on the Adafruit BNO055 code to optionally use i2ct3 so that I can test Wire1 and Wire2...
 
> Warning this is not correct!
> I2C2 is not defined on SCGC4 register it is defined on SCGC1 register. And it defined in kinetis.h:
Code:
#define SIM_SCGC1_I2C3			((uint32_t)0x00000080)		// I2C3 Clock Gate Control
#define SIM_SCGC1_I2C2			((uint32_t)0x00000040)		// I2C2 Clock Gate Control
...

Thanks KurtE - closed my issue and put your correct info in place of my misread of the FM in that post.

Put that in my code and it still never returns or starts the display after the begin call: "display.begin(SSD1306_SWITCHCAPVCC, 0x3C);"

So something more is still not right; deferring to nox771.

<edit>: nox771 is doing a revamp of the i2c_t3 to accommodate larger # of bus options and the RATE - I'll go drop my pull request.
 
I read through most of this thread, but didn't see anything about the DSP instruction set. Will the K66 model allow for 24- or 32-bit DSP instructions (i.e., 96 kHz/24-bit or 192 kHz/32-bit sound capabilities)?

Re: DSP speed
Just confirming what Paul said: "For DSP, it's exactly the same as K20 we have now, except the larger memory, faster clock rate,...".
So I measured the speed of the DSP instruction sequence in the Audio lib's filter_biquad.cpp (looped x 1000). Both Teensy 3.2 and K66 @ 120 MHz each took 201 us for 1000 iterations. Bumping the K66 to 240 MHz, the time dropped to 100 us.
 
Today's random test brought to you by this post ...

Making a sketch with TimerOne or TimerThree and I get these errors on K66 compile (IDE 1.6.9 w/TD_1.29b3) - this does it too "...\teensy\avr\libraries\TimerOne\examples\Interrupt":

I:\arduino169\hardware\teensy\avr\libraries\TimerThree/TimerThree.h: In member function 'void TimerThree::setPwmDuty(char, unsigned int)':

I:\arduino169\hardware\teensy\avr\libraries\TimerThree/TimerThree.h:245:13: error: 'TIMER3_A_PIN' was not declared in this scope

if (pin == TIMER3_A_PIN) {

...
Using library TimerThree at version 1.1 in folder: I:\arduino169\hardware\teensy\avr\libraries\TimerThree

I:\arduino169\hardware\teensy\avr\libraries\TimerOne/TimerOne.h: In member function 'void TimerOne::setPwmDuty(char, unsigned int)':

I:\arduino169\hardware\teensy\avr\libraries\TimerOne/TimerOne.h:248:13: error: 'TIMER1_A_PIN' was not declared in this scope

if (pin == TIMER1_A_PIN) {

...
Using library TimerOne at version 1.1 in folder: I:\arduino169\hardware\teensy\avr\libraries\TimerOne
 
Hi, try:
TimerOne: https://github.com/PaulStoffregen/TimerOne/pull/16/commits/e99aab5d8d0643c17562fbca68c34c09113ef112
TimerThree: https://github.com/PaulStoffregen/TimerThree/pull/3/commits/1ea5be04f3cbb26ab78443cc26ef7d9c25541c98
 
I2S compatibility

I am trying to get I2S running on the K66, but have some problems.
I took a program that runs as expected on T3.2, but it does not work on T3.5 (1.29b3 on 1.6.9).

Code:
// adapted from output_i2s.cpp

  #define MCLK_MULT 1
  #define MCLK_DIV  12 
  #define TCR2_DIV  4  // bitclock = MCLK/(2*TCR2_DIV)
#if F_CPU >= 20000000
  #define MCLK_SRC  3  // the PLL
#else
  #define MCLK_SRC  0  // system clock
#endif

void config_i2s(void)
{
	SIM_SCGC6 |= SIM_SCGC6_I2S;

	// if either transmitter or receiver is enabled, do nothing
	if (I2S0_TCSR & I2S_TCSR_TE) return;

	// enable MCLK output
	I2S0_MCR = I2S_MCR_MICS(MCLK_SRC) | I2S_MCR_MOE;
	I2S0_MDR = I2S_MDR_FRACT((MCLK_MULT-1)) | I2S_MDR_DIVIDE((MCLK_DIV-1));

	// configure transmitter
	I2S0_TMR = 0;
	I2S0_TCR1 = I2S_TCR1_TFW(1);  // watermark at half fifo size
	I2S0_TCR2 = I2S_TCR2_SYNC(0) | I2S_TCR2_BCP | I2S_TCR2_MSEL(1)
		| I2S_TCR2_BCD | I2S_TCR2_DIV(TCR2_DIV-1);

// transmit on single channel
	I2S0_TCR3 = I2S_TCR3_TCE; // enable single transmit channel
//
	I2S0_TCR4 = I2S_TCR4_FRSZ(1) | I2S_TCR4_SYWD(15) | I2S_TCR4_MF
		| I2S_TCR4_FSE | I2S_TCR4_FSP | I2S_TCR4_FSD;
	I2S0_TCR5 = I2S_TCR5_WNW(15) | I2S_TCR5_W0W(15) | I2S_TCR5_FBT(15);

	// configure pin mux for 3 clock signals
	CORE_PIN23_CONFIG = PORT_PCR_MUX(6); // pin 23, PTC2, I2S0_TX_FS (LRCLK)
	CORE_PIN9_CONFIG  = PORT_PCR_MUX(6); // pin  9, PTC3, I2S0_TX_BCLK
	CORE_PIN11_CONFIG = PORT_PCR_MUX(6); // pin 11, PTC6, I2S0_MCLK

  I2S0_TCSR |= I2S_TCSR_TE | I2S_TCSR_BCE; // TX clock enable, 

}


void setup()
{
//  while(!Serial);
  
  config_i2s();
}

void loop()
{ 
}

Watching the MCLK, BCLK, and FS lines, I get what I expect for T3.2:
I2S_T32_screenshot.jpg
that is, MCLK = 8 MHz, BCLK = 1 MHz, LR = 31.25 kHz,
but for T3.5 I get
I2S_T35_screenshot.jpg
or
MCLK (pin 11) = 4.x MHz, BCLK = 12.5 MHz, LR = 374 kHz

For T3.2 the dividers are MCLK = F_CPU/12, BCLK = MCLK/8, LR = BCLK/32.
For T3.5 it seems that only LR = about BCLK/32 holds; MCLK and BCLK are not correct, and in particular BCLK > MCLK (yes, I double-checked the pins).

Has anyone already found a solution?
It may be that the T3.5 needs additional mode parameter settings, but the manual is difficult to read.
From what I can see, MCLK in T3.5 is derived in the same way as in T3.1 (same SIM_SOPT2 settings).
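For reference, the expected clocks from the divider chain in the sketch can be worked out on a PC. This is a minimal sketch of the arithmetic only, assuming a 96 MHz clock source and a 32-bit frame (2 words of 16 bits); the struct and function names are hypothetical.

```cpp
struct I2sClocks {
    double mclk_hz;   // master clock
    double bclk_hz;   // bit clock
    double lrclk_hz;  // frame sync (LRCLK)
};

// Expected clocks for the divider chain used in the sketch:
//   MCLK  = source * MCLK_MULT / MCLK_DIV
//   BCLK  = MCLK / (2 * TCR2_DIV)
//   LRCLK = BCLK / bits_per_frame
I2sClocks expected_i2s_clocks(double source_hz, unsigned mclk_mult, unsigned mclk_div,
                              unsigned tcr2_div, unsigned bits_per_frame) {
    I2sClocks c;
    c.mclk_hz = source_hz * mclk_mult / mclk_div;
    c.bclk_hz = c.mclk_hz / (2.0 * tcr2_div);
    c.lrclk_hz = c.bclk_hz / bits_per_frame;
    return c;
}
```

Calling expected_i2s_clocks(96e6, 1, 12, 4, 32) gives 8 MHz, 1 MHz and 31.25 kHz, which matches the T3.2 measurement. Any measured BCLK above MCLK therefore means the divider settings did not take effect as written.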
 
I2S works for me, with the audioshield (with F_CPU < 200MHz).
Perhaps take a look @ the existing code (output_i2s)

That was the basis for my code, which is working in T3.2 (to be deleted: , so audioshield may only work by chance)
 
Good news on the ethernet front, I'm making good progress on bringing up the hardware. Packet reception is working and I can parse incoming ARP and ping requests. Tomorrow will try sending replies!

Here's the current test code, if anyone's interested to take a peek.

https://github.com/PaulStoffregen/k66_ethernet/blob/master/k66_ethernet.ino

Wow, your receiving sketch looks elegant and simpler than I had hoped. So are you planning on rolling your own TCP/IP stack, or lwIP, or ? I recall you hoped to avoid any RTOS. Besides TCP/IP/UDP, there's all the housekeeping: memory mgt, interrupt mgt, timer mgt ... the fun begins.
 
> That was the basis for my code, which is working in T3.2

Correcting myself:
there is a small difference between the current version and the one I was using, and that makes the difference (the while loop below):
Code:
	I2S0_MCR = I2S_MCR_MICS(MCLK_SRC) | I2S_MCR_MOE;
	while (I2S0_MCR & I2S_MCR_DUF) ;  // wait for the divider update to finish
	I2S0_MDR = I2S_MDR_FRACT((MCLK_MULT-1)) | I2S_MDR_DIVIDE((MCLK_DIV-1));

Seems K66 is executing faster than K20 so a wait is required.

F_CPU 96 MHz
 
More great news, I managed to transmit an ARP reply! My Linux machine now knows the mac address and I can receive ping packets without using broadcasting or promiscuous mode. Just committed the transmit code to github.

> So are you planning on rolling your own TCP/IP stack, or lwIP, or ?

At this moment, my focus is just testing the hardware. I want to be absolutely sure ethernet works, so we don't end up in a situation like Arduino Due, where the chip has a 100 Mbit/sec ethernet capability that nobody can use because the board doesn't make it available. Realistically, I'm going to get ping reply working and test MDIO communication, and build up the other 2 boards (I will personally solder these 2 since Erin doesn't work until Tuesday) and order materials for about a dozen more, and then pretty much stop working on the ethernet stuff for quite a while.

LwIP certainly does seem like the path of least resistance. I'm willing to consider others, but as I explained earlier, I don't feel right about shipping GPL source to people I believe will end up violating the GPL terms. Seems unlikely I'll write a TCP/IP stack from scratch. But hacking and tweaking LwIP (perhaps more than most people do) seems likely.

> .... there's all the housekeeping: memory mgt, interrupt mgt, timer mgt ... the fun begins.

One thing I have considered is memory allocation from a pool of 512 byte blocks, to be used by SDIO, ethernet and USBHS. SDIO and USBHS (sans isync) need 512 byte buffers. The ethernet mac does have the ability to automatically segment received packets into multiple buffers, and combine multiple buffers when transmitting. A max size ethernet frame fits nicely in 3 buffers. In fact, that's how I've got the code configured now. Well, except it only deals with small packets and would need more complexity to handle the 2 and 3 buffer cases. LwIP looks like it wants single buffers...
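To make the idea of a shared pool of 512 byte blocks concrete, here is a minimal free-list sketch. It is not taken from the k66_ethernet code; the class, the function names and the pool depth of 8 are assumptions for illustration, and a real version would need protection against interrupts touching the free list.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t kBlockSize = 512;   // block size named in the post
constexpr std::size_t kBlockCount = 8;    // hypothetical pool depth

// Number of 512-byte blocks needed to hold a frame of the given length.
constexpr std::size_t blocks_for(std::size_t frame_bytes) {
    return (frame_bytes + kBlockSize - 1) / kBlockSize;
}

class BlockPool {
public:
    BlockPool() : free_head_(0) {
        // Thread every block onto the free list; -1 marks the end.
        for (std::size_t i = 0; i < kBlockCount; ++i) {
            next_[i] = (i + 1 < kBlockCount) ? static_cast<int>(i + 1) : -1;
        }
    }

    // Returns a 512-byte block, or nullptr when the pool is exhausted.
    std::uint8_t* alloc() {
        if (free_head_ < 0) return nullptr;
        int i = free_head_;
        free_head_ = next_[i];
        return storage_[i];
    }

    // Returns a block previously obtained from alloc() to the pool.
    void release(std::uint8_t* block) {
        std::size_t offset = static_cast<std::size_t>(block - &storage_[0][0]);
        int i = static_cast<int>(offset / kBlockSize);
        next_[i] = free_head_;
        free_head_ = i;
    }

private:
    std::uint8_t storage_[kBlockCount][kBlockSize];
    int next_[kBlockCount];
    int free_head_;
};
```

A maximum size ethernet frame of 1518 bytes gives blocks_for(1518) == 3, which is the "fits nicely in 3 buffers" case, while a minimum size frame needs a single block.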

But other than thinking ahead and tentatively planning on a fixed 512 byte buffer size for all 3 high speed peripherals, I'm really trying to resist the urge to do more on the ethernet. My purpose is to verify the hardware really works, that I didn't forget a crucial pin or there isn't some silicon bug or other unforeseen issue which must be addressed in hardware before we start manufacturing these boards. Unless a serious problem comes up, I probably won't touch ethernet again until after the boards are shipping and all the major distributors have them in stock.
 
> Wow, your receiving sketch looks elegant and simpler than I had hoped. So are you planning on rolling your own TCP/IP stack, or lwIP, or ? I recall you hoped to avoid any RTOS. Besides TCP/IP/UDP, there's all the housekeeping: memory mgt, interrupt mgt, timer mgt ... the fun begins.

I need to start learning more advanced stuff :(
 
Here's a screenshot of the very first packet transmitted (well, other than the flow control ones the hardware sends automatically if you turn on flow control).

file.png

I believe the part at the beginning where TXD0 is high and TXD1 is low is probably the 7 byte ethernet preamble, which the PHYs use for clock recovery and sync. This ARP reply is 42 bytes, which has 14 bytes for the ethernet frame header and 28 bytes for the ARP message. You can see in the waveforms where the mac automatically adds zero padding to extend to the minimum ethernet frame size, and then adds the 32 bit CRC check on the end.
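The frame size arithmetic can be written out as follows. This is a sketch using the standard Ethernet II numbers (a 14 byte header made of two 6 byte MAC addresses and a 2 byte EtherType, a 60 byte minimum frame before the 4 byte CRC); the constant and function names are made up for illustration.

```cpp
#include <cstddef>

constexpr std::size_t kEthHeader = 14;      // dst MAC (6) + src MAC (6) + EtherType (2)
constexpr std::size_t kArpPayload = 28;     // ARP message for IPv4 over ethernet
constexpr std::size_t kMinFrameNoFcs = 60;  // minimum frame length before the CRC
constexpr std::size_t kFcs = 4;             // 32 bit CRC appended by the mac

// Zero padding the mac has to add to reach the minimum frame size.
constexpr std::size_t pad_bytes(std::size_t frame_len) {
    return frame_len < kMinFrameNoFcs ? kMinFrameNoFcs - frame_len : 0;
}

// Bytes on the wire after padding and CRC (preamble and start delimiter not counted).
constexpr std::size_t wire_bytes(std::size_t frame_len) {
    return frame_len + pad_bytes(frame_len) + kFcs;
}
```

For the 42 byte ARP reply this gives 18 bytes of zero padding and 64 bytes on the wire, which is the padding and CRC visible at the end of the waveform.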
 
crypto acceleration unit (CAU) sticky post

K66 and K64 have a crypto acceleration unit (CAU), and NXP/Freescale provides an assembler library:
http://www.freescale.com/products/a...ion-unit-cau-and-mmcau-software-library:CAUAP
or https://github.com/PaulStoffregen/CryptoAccel
The library only provides acceleration for the compute-intensive transforms for MD5, SHA, AES, and DES, so additional software would be needed to do hash blocking/padding and the various encryption modes (e.g., CBC). It was messy to build the CAU assembler code (had to paste in include's), but Freescale provides a lib_mmcau.a, so here is how that might be used with the IDE:

cp lib_mmcau.a hardware/tools/arm/arm-none-eabi/lib/libcau.a
in boards.txt change to
teensy36.build.flags.libs=-larm_cortexM4lf_math -lm -lcau
teensy35.build.flags.libs=-larm_cortexM4lf_math -lm -lcau
Here's a comparison of crypto lib to C and TLS on the K66 @120MHz and K64 @120MHz (see cryptolib.ino)
Code:
                    K66 @120mhz        |     K64 @120mhz
                 CAU        C     TLS  |   CAU    C    TLS
MD5 (KBs)      11023      5688   7366  | 10964 4654   6168
SHA256(KBs)     3081      1447   2133  |  3074 1410   1950
AES set key(us)    3        48      7  |     3   48      7   128-bit key
AES CBC (us)      18       198     68  |    16  238     83   64 bytes
The C codes are byte-oriented (UNO), whereas the TLS codes are word-oriented, based on mbed TLS. It would be nice to integrate the CAU transforms into the TLS AES/MD5/SHA. I did integrate the K66 hardware random number generator as an entropy source to the mbed TLS, see tls.ino
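Since the CAU library only supplies the block transforms, an encryption mode such as CBC has to be wrapped around it in software. Below is a minimal sketch of that wrapper, not the code from cryptolib.ino. The BlockFn signature is hypothetical (it is not the CAU library's API), and toy_block is an insecure stand-in so the sketch can run on a PC; on the K66 a CAU-based AES block routine would take its place.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kBlock = 16;  // AES block size in bytes

// Hypothetical signature of a single-block transform (encrypt or decrypt).
using BlockFn = void (*)(const std::uint8_t* in, std::uint8_t* out, const void* key);

// CBC encrypt: XOR each plaintext block with the previous ciphertext block
// (the IV for the first block), then run the block transform.
// len must be a multiple of kBlock; padding is left to the caller.
void cbc_encrypt(BlockFn enc, const void* key, const std::uint8_t* iv,
                 const std::uint8_t* in, std::uint8_t* out, std::size_t len) {
    std::uint8_t chain[kBlock];
    std::memcpy(chain, iv, kBlock);
    for (std::size_t off = 0; off < len; off += kBlock) {
        std::uint8_t x[kBlock];
        for (std::size_t i = 0; i < kBlock; ++i) {
            x[i] = static_cast<std::uint8_t>(in[off + i] ^ chain[i]);
        }
        enc(x, out + off, key);
        std::memcpy(chain, out + off, kBlock);
    }
}

// CBC decrypt: run the inverse block transform, then XOR with the previous
// ciphertext block (the IV for the first block).
void cbc_decrypt(BlockFn dec, const void* key, const std::uint8_t* iv,
                 const std::uint8_t* in, std::uint8_t* out, std::size_t len) {
    std::uint8_t chain[kBlock];
    std::memcpy(chain, iv, kBlock);
    for (std::size_t off = 0; off < len; off += kBlock) {
        std::uint8_t saved[kBlock];
        std::memcpy(saved, in + off, kBlock);  // keep ciphertext in case in == out
        std::uint8_t x[kBlock];
        dec(saved, x, key);
        for (std::size_t i = 0; i < kBlock; ++i) {
            out[off + i] = static_cast<std::uint8_t>(x[i] ^ chain[i]);
        }
        std::memcpy(chain, saved, kBlock);
    }
}

// Toy stand-in for a real block cipher: XOR with a 16-byte key.
// NOT secure; it is its own inverse, which keeps the example short.
void toy_block(const std::uint8_t* in, std::uint8_t* out, const void* key) {
    const std::uint8_t* k = static_cast<const std::uint8_t*>(key);
    for (std::size_t i = 0; i < kBlock; ++i) {
        out[i] = static_cast<std::uint8_t>(in[i] ^ k[i]);
    }
}
```

The 64 byte case in the table above corresponds to four passes through the loop. Passing the block transform as a function pointer keeps the chaining logic identical whether the transform is the C, TLS or CAU version.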

also see mbed K64F CAU performance

EDIT: 4/15/17, .s with .include will assemble OK with Paul's github library, see
https://forum.pjrc.com/threads/43363-Crypto-acceleration-unit

Added DES to test sketch.
Code:
  DES performance (us) for 64-byte CBC encrypt
            byte     TLS  CAU
T3.6@180     4321     58   11
T3.5@120     6679    139   13
T3.2@96      8311    159
LC@48       26548    389
DUE@84      16137    222
STM32L4@80  11203    174         dragonfly
maple@72    18481    259
1284p@16   142284   4892         avr
"byte" implementation is from https://github.com/Octoate/ArduinoDES, using PROGMEM on Arduino AVRs.
TLS is a port from mbed TLS.
 
Works for me! I replaced "I:\arduino169\hardware\teensy\avr\libraries\TimerOne\config\known_16bit_timers.h" and \TimerThree\ with that edited file and it compiles and runs.

Oddly, the K66's failure to detect pulses is worse than on a T_3.1; it is as bad as or worse than a T_LC. That is at 96 and 240 MHz. Like the T_3.1, it doesn't seem to fail at 24 MHz given a pulse detect timeout of 2.5 times the pulse to HIGH time. At a 2.0 multiple it is failing 29%. I can post my sketch if desired . . . later
 
Yes, please post the sketch, and how to use it :)
 
> Here's a screenshot of the very first packet transmitted (well, other than the flow control ones the hardware sends automatically if you turn on flow control).

Looks like you've got ARP and ICMP/ping reply. It won't take much to send/recv UDP, then you can do some packet-rate and bandwidth measurements (things will only get slower with a real IP stack). I have a TCP-over-UDP that I developed a few years ago with one of my students, that I'd use for experiments ...
 
Yup, since #569, I got ping and MDIO working. I'm pretty sure the ping replies have wrong checksums, but my Linux machine doesn't seem to mind. On MDIO, I only read the PHY's ID registers.
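For the ping reply checksum, the ICMP header uses the standard Internet checksum from RFC 1071. Here is a minimal sketch of it for checking a reply on a PC; it is not the code from the k66_ethernet sketch, and the function name is made up for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Internet checksum (RFC 1071): one's-complement sum of 16-bit big-endian
// words, folded to 16 bits and inverted. The checksum field inside the data
// must be zero while computing.
std::uint16_t inet_checksum(const std::uint8_t* data, std::size_t len) {
    std::uint32_t sum = 0;
    std::size_t i = 0;
    for (; i + 1 < len; i += 2) {
        sum += (static_cast<std::uint32_t>(data[i]) << 8) | data[i + 1];
    }
    if (i < len) {
        // Odd trailing byte is treated as the high byte of a zero-padded word.
        sum += static_cast<std::uint32_t>(data[i]) << 8;
    }
    while (sum >> 16) {
        sum = (sum & 0xFFFFu) + (sum >> 16);  // fold carries back in
    }
    return static_cast<std::uint16_t>(~sum & 0xFFFFu);
}
```

A quick way to check a received or generated ICMP message is to run the function over the whole message with the checksum field left in place: a correct message yields 0.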

I ordered 12 more boards from OSH Park.

I made some minor cosmetic improvements on these 12, mainly just adding labels in the silkscreen. I also added solderable pads to connect MDIO & MDC. On the first 3 boards, they're always connected. On the next 12, they're disconnected by default. If you want to use the (not really necessary) PHY registers, use those to connect the signals. They consume Teensy pins 16 & 17 when connected. Of course the 8 necessary PHY signals are wired to pins 3, 4, 24-28, 39.

Later today I'll build the other 2 boards. One is set aside for you, and we can talk here about who ought to get the 3rd board (assuming they both work). Actually, I might give my first ethernet board up, since I'm really not planning to do anything more for a while. We'll have 12 more ethernet boards in a couple weeks. Hopefully 15 will be enough for everyone who really wants to play with raw ethernet before there's a working TCP/IP stack to make it actually useful.
 