Teensyduino 1.54 Beta #2

Status
Not open for further replies.
I should have mentioned, another property some people are expecting is identical results for "reverse" mapping.

In other words, ideally these should give the same result for all inputs:

map(x, 0, 5, 0, 1023);
map(x, 5, 0, 1023, 0);

And these 2 should give the same result as the 2 above, but "reversed":

map(x, 0, 5, 1023, 0);
map(x, 5, 0, 0, 1023);

Of course Arduino's map() doesn't do any of this, and I do not believe any of the proposed map() functions on their issue trackers do either, especially when considering extrapolating beyond the mapped range.
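
To make the two expectations concrete, here is a minimal sketch of how they could be checked on a desktop compiler. `mapClassic` reproduces the classic Arduino formula; `MapFn`, `countSwapMismatches` and `countMirrorMismatches` are hypothetical names used only for illustration. This is a test harness under those assumptions, not a proposed fix.

```cpp
// Classic Arduino map() formula (integer division truncates toward zero).
long mapClassic(long x, long in_min, long in_max, long out_min, long out_max)
{
  return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}

// Any candidate map function with the usual five-argument signature.
using MapFn = long (*)(long, long, long, long, long);

// Expectation 1: swapping both ranges must not change the result.
// Returns how many x in [lo, hi] violate that for 0..5 -> 0..1023.
int countSwapMismatches(MapFn f, long lo, long hi)
{
  int mismatches = 0;
  for (long x = lo; x <= hi; x++) {
    if (f(x, 0, 5, 0, 1023) != f(x, 5, 0, 1023, 0)) mismatches++;
  }
  return mismatches;
}

// Expectation 2: reversing only the output range must mirror the result.
// Returns how many x in [lo, hi] violate that for 0..5 -> 0..1023.
int countMirrorMismatches(MapFn f, long lo, long hi)
{
  int mismatches = 0;
  for (long x = lo; x <= hi; x++) {
    if (f(x, 0, 5, 1023, 0) != 1023 - f(x, 0, 5, 0, 1023)) mismatches++;
  }
  return mismatches;
}
```

With the classic formula over x = -5 to 10, the swap check fails for x = 1 to 4 (for example 204 versus 205 at x = 1), while the mirror check passes, because truncation toward zero is symmetric in the sign of the numerator but not in the sign of the divisor.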

Thanks Paul, will check that next. Out of curiosity I was looking for a mathematical approach to map and found a map function on Rosetta Code: https://rosettacode.org/wiki/Map_range. I added this to the code just to see what it would give me:
Code:
#include <utility>  // std::pair
template<typename tVal>
tVal map_value(std::pair<tVal,tVal> a, std::pair<tVal, tVal> b, tVal inVal)
{
  tVal inValNorm = inVal - a.first;
  tVal aUpperNorm = a.second - a.first;
  tVal normPosition = inValNorm / aUpperNorm;
 
  tVal bUpperNorm = b.second - b.first;
  tVal bValNorm = normPosition * bUpperNorm;
  tVal outVal = b.first + bValNorm;
 
  return outVal;
}

// This is the original map() function in Arduino Core
long map1(long x, long in_min, long in_max, long out_min, long out_max)
{
  return (x - in_min) * (out_max - out_min) / (in_max - in_min) + out_min;
}

// This is the same but with the +1 "range extension" as suggested by st42
long mapPlus1(long x, long in_min, long in_max, long out_min, long out_max)
{
  return (x - in_min) * (out_max - out_min + 1) / (in_max - in_min + 1) + out_min;
}

// This is another version of map with rounding done only with integer calculations
// as suggested by M.Kooijman
long mapRound(long x, long in_min, long in_max, long out_min, long out_max)
{
  return ((x - in_min) * (out_max - out_min) + (in_max - in_min)/2) / (in_max - in_min) + out_min;
}

void setup(void) {

        Serial.begin(115200);
        delay(2000);
        long x;
        Serial.printf("Range 0-20 to 0-4\n");
        //rosetta version
        //https://rosettacode.org/wiki/Map_range#C.2B.2B
        Serial.printf("     x     map  map1  map(+1) map(round)      Rosetta\n");
        std::pair<float,float> a(0,20), b(0,4);
        for (x=-10; x<=30; x++) {
                Serial.printf("%6ld %6ld %6ld %6ld %6ld            %f\n", x,
                        map(x, 0, 20, 0, 4),
                        map1(x, 0, 20, 0, 4),
                        mapPlus1(x, 0, 20, 0, 4),
                        mapRound(x, 0, 20, 0, 4),
                        map_value(a, b, (float)x));
        }

        Serial.printf("\n\n");
        Serial.printf("Range 0-5 to 0-1023\n");
        std::pair<float,float> c(0,5), d(0,1023);

        Serial.printf("     x     map  map1  map(+1) map(round)      Rosetta\n");
        for (x=-5; x<=10; x++) {
                Serial.printf("%6ld %6ld %6ld %6ld %6ld          %f\n", x,
                        map(x, 0, 5, 0, 1023),
                        map1(x, 0, 5, 0, 1023),
                        mapPlus1(x, 0, 5, 0, 1023),
                        mapRound(x, 0, 5, 0, 1023),
                        map_value(c, d, (float)x));
        }
}

void loop() {}

Output:
Code:
Range 0-20 to 0-4
     x     map  map1  map(+1) map(round)      Rosetta
   -10     -2     -2     -2     -1            -2.000000
    -9     -2     -1     -2     -1            -1.800000
    -8     -2     -1     -1     -1            -1.600000
    -7     -1     -1     -1      0            -1.400000
    -6     -1     -1     -1      0            -1.200000
    -5     -1     -1     -1      0            -1.000000
    -4     -1      0      0      0            -0.800000
    -3     -1      0      0      0            -0.600000
    -2      0      0      0      0            -0.400000
    -1      0      0      0      0            -0.200000
     0      0      0      0      0            0.000000
     1      0      0      0      0            0.200000
     2      0      0      0      0            0.400000
     3      1      0      0      1            0.600000
     4      1      0      0      1            0.800000
     5      1      1      1      1            1.000000
     6      1      1      1      1            1.200000
     7      1      1      1      1            1.400000
     8      2      1      1      2            1.600000
     9      2      1      2      2            1.800000
    10      2      2      2      2            2.000000
    11      2      2      2      2            2.200000
    12      2      2      2      2            2.400000
    13      3      2      3      3            2.600000
    14      3      2      3      3            2.800000
    15      3      3      3      3            3.000000
    16      3      3      3      3            3.200000
    17      3      3      4      3            3.400000
    18      4      3      4      4            3.600000
    19      4      3      4      4            3.800000
    20      4      4      4      4            4.000000
    21      4      4      5      4            4.200000
    22      4      4      5      4            4.400000
    23      5      4      5      5            4.600000
    24      5      4      5      5            4.800000
    25      5      5      5      5            5.000000
    26      5      5      6      5            5.200000
    27      5      5      6      5            5.400000
    28      6      5      6      6            5.600000
    29      6      5      6      6            5.800000
    30      6      6      7      6            6.000000


Range 0-5 to 0-1023
     x     map  map1  map(+1) map(round)      Rosetta
    -5  -1023  -1023   -853  -1022          -1023.000000
    -4   -819   -818   -682   -818          -818.400024
    -3   -614   -613   -512   -613          -613.800049
    -2   -409   -409   -341   -408          -409.200012
    -1   -205   -204   -170   -204          -204.600006
     0      0      0      0      0          0.000000
     1    205    204    170    205          204.600006
     2    409    409    341    409          409.200012
     3    614    613    512    614          613.800049
     4    818    818    682    818          818.400024
     5   1023   1023    853   1023          1023.000000
     6   1228   1227   1024   1228          1227.600098
     7   1432   1432   1194   1432          1432.199951
     8   1637   1636   1365   1637          1636.800049
     9   1841   1841   1536   1841          1841.399902
    10   2046   2046   1706   2046          2046.000000
And it looks like it is working, unless you used the same method, in which case we would get identical results anyway :)
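
One detail visible in the first table: map(round) is lopsided below zero, e.g. x = -7 gives 0 where the floating point result is -1.4. That comes from adding half the divisor before an integer division that truncates toward zero. A minimal sketch of rounding that treats negative values symmetrically follows. `divRoundNearest` and `mapRoundSymmetric` are hypothetical names used only for illustration, and this is not claimed to be the method behind the new map().

```cpp
// Integer division rounded to the nearest integer, ties away from zero.
// Works on magnitudes so that negative quotients round like positive ones.
long divRoundNearest(long num, long den)
{
  if (den < 0) {          // normalize so the divisor is positive
    num = -num;
    den = -den;
  }
  long half = den / 2;
  return (num >= 0) ? (num + half) / den : -((-num + half) / den);
}

// Same structure as mapRound above, but using the symmetric division.
long mapRoundSymmetric(long x, long in_min, long in_max, long out_min, long out_max)
{
  return divRoundNearest((x - in_min) * (out_max - out_min), in_max - in_min) + out_min;
}
```

For the ranges tested here this gives -1 at x = -7 (0-20 to 0-4) and 205 / -205 at x = 1 / -1 (0-5 to 0-1023). Exact ties are rounded away from zero, so swapped ranges could still disagree on a tie; neither test range produces one.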

Next up - reverse range.
 
OK, here is a side-by-side comparison. Looks like it's working both forward and reversed!


Code:
	Range 0-20 to 4-0						Range 0-20 to 0-4
x	map	map1	map(+1)	map(round)	Rosetta		x	map	map1	map(+1)	map(round)	Rosetta
-10	6	6	5	6	      6			-10	-2	-2	-2	-1	-2
-9	6	5	5	6	     5.8		-9	-2	-1	-2	-1	-1.8
-8	6	5	5	6	     5.6		-8	-2	-1	-1	-1	-1.6
-7	5	5	5	5	     5.4		-7	-1	-1	-1	0	-1.4
-6	5	5	4	5	     5.2		-6	-1	-1	-1	0	-1.2
-5	5	5	4	5	      5			-5	-1	-1	-1	0	-1
-4	5	4	4	5	     4.8		-4	-1	0	0	0	-0.8
-3	5	4	4	5	     4.6		-3	-1	0	0	0	-0.6
-2	4	4	4	4	     4.4		-2	0	0	0	0	-0.4
-1	4	4	4	4	     4.2		-1	0	0	0	0	-0.2
0	4	4	4	4	      4			0	0	0	0	0	0
1	4	4	4	4	     3.8		1	0	0	0	0	0.2
2	4	4	4	4	     3.6		2	0	0	0	0	0.4
3	3	4	4	4	     3.4		3	1	0	0	1	0.6
4	3	4	4	4	     3.2		4	1	0	0	1	0.8
5	3	3	4	4	      3			5	1	1	1	1	1
6	3	3	4	4	     2.8		6	1	1	1	1	1.2
7	3	3	3	4	     2.6		7	1	1	1	1	1.4
8	2	3	3	3	     2.4		8	2	1	1	2	1.6
9	2	3	3	3	     2.2		9	2	1	2	2	1.8
10	2	2	3	3	      2			10	2	2	2	2	2
11	2	2	3	3	     1.8		11	2	2	2	2	2.2
12	2	2	3	3	     1.6		12	2	2	2	2	2.4
13	1	2	3	2	     1.4		13	3	2	3	3	2.6
14	1	2	2	2	     1.2		14	3	2	3	3	2.8
15	1	1	2	2	      1			15	3	3	3	3	3
16	1	1	2	2	     0.8		16	3	3	3	3	3.2
17	1	1	2	2	     0.6		17	3	3	4	3	3.4
18	0	1	2	1	     0.4		18	4	3	4	4	3.6
19	0	1	2	1	     0.2		19	4	3	4	4	3.8
20	0	0	2	1	      0			20	4	4	4	4	4
21	0	0	1	1	     -0.2		21	4	4	5	4	4.2
22	0	0	1	1	     -0.4		22	4	4	5	4	4.4
23	-1	0	1	0	     -0.6		23	5	4	5	5	4.6
24	-1	0	1	0	     -0.8		24	5	4	5	5	4.8
25	-1	-1	1	0	     -1			25	5	5	5	5	5
26	-1	-1	1	0	     -1.2		26	5	5	6	5	5.2
27	-1	-1	1	0	     -1.4		27	5	5	6	5	5.4
28	-2	-1	0	-1	     -1.6		28	6	5	6	6	5.6
29	-2	-1	0	-1	     -1.8		29	6	5	6	6	5.8
30	-2	-2	0	-1	     -2			30	6	6	7	6	6

Code:
	Range 0-5 to 1023-0						Range 0-5 to 0-1023
x	map	map1	map(+1)	map(round)	Rosetta		x	map	map1	map(+1)	map(round)	Rosetta
-5	2046	2046	1874	2046	2046		-5	-1023	-1023	-853	-1022	-1023
-4	1842	1841	1704	1841	1841.400024		-4	-819	-818	-682	-818	-818.400024
-3	1637	1636	1534	1637	1636.800049		-3	-614	-613	-512	-613	-613.800049
-2	1432	1432	1363	1432	1432.199951		-2	-409	-409	-341	-408	-409.200012
-1	1228	1227	1193	1228	1227.599976		-1	-205	-204	-170	-204	-204.600006
0	1023	1023	1023	1023	1023		0	0	0	0	0	0
1	818	819	853	819	818.400024		1	205	204	170	205	204.600006
2	614	614	683	615	613.799988		2	409	409	341	409	409.200012
3	409	410	512	410	409.199982		3	614	613	512	614	613.800049
4	205	205	342	205	204.599991		4	818	818	682	818	818.400024
5	0	0	172	1	0		5	1023	1023	853	1023	1023
6	-205	-204	1	-204	-204.600052		6	1228	1227	1024	1228	1227.600098
7	-409	-409	-169	-408	-409.199982		7	1432	1432	1194	1432	1432.199951
8	-614	-613	-339	-613	-613.800049		8	1637	1636	1365	1637	1636.800049
9	-818	-818	-510	-818	-818.399963		9	1841	1841	1536	1841	1841.399902
10	-1023	-1023	-680	-1022	-1023		10	2046	2046	1706	2046	2046
 
I'm not eager to extend the SPI library public API this way, so as written, no, probably not.

But I would be happy to merge this if it didn't change the public API. Maybe some trickery with inline methods and possibly __builtin_constant_p could let these be private functions that get automatically used when the inputs are conducive?
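
A rough sketch of what that kind of trickery might look like, using GCC's `__builtin_constant_p`. `SketchSPI`, `transferBytes`, `transferWords` and the counters are hypothetical stand-ins made up for this example; the real SPI class is not shown, and the private functions here only count bytes instead of touching hardware.

```cpp
#include <cstddef>
#include <cstdint>

class SketchSPI {
public:
  // The public API stays a single byte-count entry point.
  inline void transfer(const void *buf, size_t count)
  {
    // __builtin_constant_p is a GCC/Clang builtin. It yields 1 only when the
    // compiler can prove 'count' is a compile-time constant at this call
    // site, which typically requires inlining and optimization.
    if (__builtin_constant_p(count) && count % 4 == 0 &&
        reinterpret_cast<uintptr_t>(buf) % 4 == 0) {
      transferWords(static_cast<const uint32_t *>(buf), count / 4);
    } else {
      transferBytes(static_cast<const uint8_t *>(buf), count);
    }
  }

  size_t bytesSent = 0;  // total bytes handed to either private path
  size_t wordCalls = 0;  // how often the word-oriented path was chosen

private:
  // Placeholder for the byte-oriented block transfer.
  void transferBytes(const uint8_t *, size_t count) { bytesSent += count; }

  // Placeholder for the 32 bit word-oriented block transfer.
  void transferWords(const uint32_t *, size_t words)
  {
    bytesSent += words * 4;
    wordCalls++;
  }
};
```

Which path a given call takes depends on the optimization level, so callers could not rely on the word path being used; they would only see the same public `transfer()` as before.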

Paul, I commented in the PR that I also wanted to add these (the 16 bit versions) earlier, and we removed them from the PR back then. And as I commented on this PR, if we do add the 16 bit extensions we should add them to 3.x and LC as well.
 

I believe inttypes.h was added in C99, and I think C++11. I left the C standards committee between the C90 and C99 standards, so I don't recall exactly when it got added. It relies on string concatenation that was in the original C standard (and in fact was one of the changes added by the committee that weren't in the original K&R C and later pcc, GNU compilers).

String concatenation is when the lexer sees two or more adjacent string literals together, and pastes them into one string. I tend to use it for printf formats to break long lines, but it is also useful for substitution of the formats:
Code:
    printf ("This is the first line\n"
              "This is the second line\n"
              "This is the third line\n");
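
For the format substitution case, a short sketch: the `PRId32` and `PRIu16` macros from `<cinttypes>` expand to string literals, and the compiler pastes them into the surrounding format string. `formatReading` is a hypothetical helper made up for this example; it uses `snprintf` so it can be tried on a desktop compiler, and the same format string could be passed to Serial.printf.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a 32 bit signed value and a 16 bit unsigned value without
// hard-coding "%d" or "%ld", which differ between platforms.
std::string formatReading(int32_t value, uint16_t raw)
{
  char buf[64];
  // "value=%" PRId32 " raw=%" PRIu16 "\n" is four adjacent literals
  // that the compiler concatenates into one format string.
  snprintf(buf, sizeof buf, "value=%" PRId32 " raw=%" PRIu16 "\n", value, raw);
  return std::string(buf);
}
```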

IMHO, most of the code written here is tightly coupled to the used processor anyway. So, using all this complicated syntax just to get platform independent printfs for platform dependent code seems a bit, say, overengineered to me.
I appreciate having the ability for the times when an item needs to be a given size (since I have used many different platforms over the years). While I haven't programmed an AVR 32u4 (such as the Leonardo) or AVR 328p (such as the Uno) in a few years, I have moved some of my code between the platforms. There, int was 16 bits instead of 32 bits. And who knows, maybe the Teensy 5 systems will have different defaults.

 
MichaelMeissner said:
I believe inttypes.h were added in C99, and I think C++11. I left the C standards committee between the C90 and C99 standards, so I don't recall exactly when it got added. It relies on string concatenation that was in the original C standard (and in fact was one of the changes added by the committee that weren't in the original K&R C and later pcc, GNU compilers).
You are correct Michael. When I was looking at the c++ man pages it showed C99 and a tab for C++11.
 
Thanks for testing!

Hopefully we will finally have a map() function that meets all expectations.

No problem, glad to help. This is deja vu. I seem to remember you working on the map() function a couple of years ago as well.

If there is anything else you need checked let me know.
 
I tried making use of the SPI 16 and 32 bit block transfer functions. The Ethernet library fails to run if they're used.

Can anyone see a mistake?

https://github.com/PaulStoffregen/SPI/commit/aedb543cb374217c1ee4c731ffcb8f06f7533afa

I'm about to turn on my oscilloscope and start checking if these functions actually work...

I am guessing byte ordering.

For example, earlier when I had transfer16 with buffers, I could use it to do an easy, faster display driver for 16 bit colors, which worked great. It did not work correctly with the transfer(...) buffer version, as the LSB and MSB bytes were swapped.
 
Note: In the T3.x code, when you call transfer with buffers, we try packing the data into words and unpacking it as we push the stuff onto the FIFO and pull it off. I don't think we ever did that in the T4 code yet.
 
Looking at SPI.h, it looks like you still have some 1052 defines (line 1139 or so). Not sure they are still needed, unless for posterity :)

I don't really use Ethernet much, but looking at the lib, it looks like the only place that uses SPI is W5100.cpp, especially the read and write functions (line 298). Not sure I even have a WIZnet board around, but I might. The 32 bit write looks like this (not sure what chip == 52 or 51 or 55 means though):
Code:
	} else if (chip == 52) {
		setSS();
		cmd[0] = addr >> 8;
		cmd[1] = addr & 0xFF;
		cmd[2] = ((len >> 8) & 0x7F) | 0x80;
		cmd[3] = len & 0xFF;
		SPI.transfer(cmd, 4);
#ifdef SPI_HAS_TRANSFER_BUF
		SPI.transfer(buf, NULL, len);
#else
		// TODO: copy 8 bytes at a time to cmd[] and block transfer
		for (uint16_t i=0; i < len; i++) {
			SPI.transfer(buf[i]);
		}
#endif
I don't remember what we did on the transfer16 in the display libs, so I will defer to Kurt.
 
Turns out this optimization makes only a small improvement.

Here's a little test program which transmits 12 bytes at 33 Mbit/sec.

Code:
#include <SPI.h>

void setup() {
  SPI.begin();
  pinMode(10, OUTPUT);
  digitalWrite(10, HIGH);
}

void loop() {
  uint32_t data[] = {0x12345678, 0xDEADC0DE, 0x55AA964C};
  SPI.beginTransaction(SPISettings(33000000, MSBFIRST, SPI_MODE0));
  digitalWriteFast(10, LOW);
  SPI.transfer(data, sizeof(data));
  digitalWriteFast(10, HIGH);
  SPI.endTransaction();
  delay(10);
}

With the original byte-oriented block transfer, total time taken is 3.67 us.

file.png

Using the 32 bit word-oriented block transfer, total time taken is 3.54 us.

file2.png

You can see the bytes are in the wrong order, but otherwise it seems to be working. Not sure if this modest speedup is worth the complexity? Any thoughts?
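
A small sketch of why the order changes, assuming little-endian storage (as on the Cortex-M7) and a 32 bit frame that is shifted out most significant bit first. The function names are hypothetical and nothing here touches SPI hardware; it only computes the byte sequence each approach would put on the wire.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Byte-oriented transfer: bytes leave in memory order. With little-endian
// storage the least significant byte of each word has the lowest address.
std::vector<uint8_t> wireOrderBytewise(const uint32_t *words, size_t count)
{
  std::vector<uint8_t> wire;
  for (size_t i = 0; i < count; i++) {
    for (int shift = 0; shift <= 24; shift += 8) {
      wire.push_back(static_cast<uint8_t>(words[i] >> shift));
    }
  }
  return wire;
}

// Word-oriented transfer: each 32 bit frame goes out MSB first, so the
// most significant byte of each word is sent first.
std::vector<uint8_t> wireOrderWordwise(const uint32_t *words, size_t count)
{
  std::vector<uint8_t> wire;
  for (size_t i = 0; i < count; i++) {
    for (int shift = 24; shift >= 0; shift -= 8) {
      wire.push_back(static_cast<uint8_t>(words[i] >> shift));
    }
  }
  return wire;
}

// Reverses the four bytes of a word.
uint32_t swapBytes32(uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
         ((v << 8) & 0x00FF0000u) | (v << 24);
}
```

Under these assumptions 0x12345678 goes out as 78 56 34 12 byte-wise and as 12 34 56 78 word-wise. Swapping each word first would restore the byte-wise order, but that extra pass may eat into the small time saving measured above.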
 
I will try to take a look through some of the code for the fun of it.

Some of the more recent changes were sort of a hack to get around some interesting features of the TCR register.

That is if you do something like:
Code:
    spi_registers->TCR = <Set to 16 bit mode>
    spi_registers->TDR = my_word;
    spi_registers->TCR = <Set to 8 bit mode>
    ....

    Serial.print(spi_registers->TCR, HEX);

The print will show you are still in 16 bit mode. This was the issue we had a while ago if someone did: transfer(x); transfer16(y); transfer16(z); transfer(a);
The a would output 16 bits. Why? Because transfer16 reads the current TCR, sets it to 16 bit mode, outputs its data, and sets it back to the saved TCR. But when you call transfer16 the
second time, it reads that it is in 16 bit mode and so restores it to 16 bit mode.
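
The sequence can be played back with a toy model. Everything here is an assumption for illustration: `MockSpi` and the function names are hypothetical, and the rule that a TCR write only becomes readable once the next frame starts is a simplification of the behavior described in this post, not a description of the real LPSPI hardware.

```cpp
#include <cstdint>

// Toy model: a TCR write is queued and only shows up in reads (and takes
// effect) when the next frame starts. Values are "bits per frame - 1".
struct MockSpi {
  uint32_t tcrVisible = 7;     // what reading TCR returns
  uint32_t tcrQueued = 7;      // last value written to TCR
  uint32_t lastFrameBits = 0;  // size of the most recent frame

  uint32_t readTcr() const { return tcrVisible; }
  void writeTcr(uint32_t v) { tcrQueued = v; }
  void sendFrame()
  {
    tcrVisible = tcrQueued;    // queued setting takes effect now
    lastFrameBits = tcrVisible + 1;
  }
};

// Plain 8 bit transfer: sends a frame with whatever setting is in effect.
void transfer8(MockSpi &spi) { spi.sendFrame(); }

// Save/restore by reading the register back. On a back-to-back call the
// read can still return the previous 16 bit setting.
void transfer16ReadBack(MockSpi &spi)
{
  uint32_t saved = spi.readTcr();
  spi.writeTcr(15);
  spi.sendFrame();
  spi.writeTcr(saved);
}

// Restore from a software copy of the default setting instead of a read.
void transfer16Shadow(MockSpi &spi, uint32_t defaultTcr)
{
  spi.writeTcr(15);
  spi.sendFrame();
  spi.writeTcr(defaultTcr);
}
```

In this model, transfer8, transfer16ReadBack, transfer16ReadBack, transfer8 ends with a 16 bit frame for the last call, while the same sequence with transfer16Shadow ends with an 8 bit frame.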

And yes, the KinetisK code originally always reversed the bytes when it packed them to the PUSHR register, but that screwed things up when code was LSBFIRST
 
@KurtE is the expert here, but my guess is that although the speedup for a single 32 bit (or 16 bit) word transfer is small, if you are doing a lot of transfers, as for a display, it may add up to a greater savings. Not sure I said that correctly.
 
Turns out this optimization makes only a small improvement.

You can see the bytes are in the wrong order, but otherwise it seems to be working. Not sure if this modest speedup is worth the complexity? Any thoughts?

Yep, that was sort of what I saw earlier on, which is why I punted on it at the time...

But I did like the transfer16(buf, retbuf, cnt) style functions, as one could pretty quickly rewrite functions like the ones we use in display libraries. For example, the first block below could be replaced by the second:
Code:
void ILI9341_t3::writeRect(int16_t x, int16_t y, int16_t w, int16_t h, const uint16_t *pcolors)
{
   	beginSPITransaction(_clock);
	setAddr(x, y, x+w-1, y+h-1);
	writecommand_cont(ILI9341_RAMWR);
	for(y=h; y>0; y--) {
		for(x=w; x>1; x--) {
			writedata16_cont(*pcolors++);
		}
		writedata16_last(*pcolors++);
	}
	endSPITransaction();
}
Code:
void ILI9341_t3::writeRect(int16_t x, int16_t y, int16_t w, int16_t h, const uint16_t *pcolors)
{
   	beginSPITransaction(_clock);
	setAddr(x, y, x+w-1, y+h-1);
	writecommand_cont(ILI9341_RAMWR);
        _pspi->transfer16(pcolors, nullptr, w*h);
	endSPITransaction();
}
Note: I did not do the full conversion here to show what the other functions would look like without touching hardware registers. Again, on T3.x it would still be reasonably slower than the current code, in that with T3.x the PUSHR can handle the automatic switching of the DC pin, whereas here the transfers have to complete before the pin changes. But it still would not be bad...

Side note: the above is why the ESP32's SPI has the methods:
Code:
    void writePixels(const void * data, uint32_t size);//ili9341 compatible
    void writePattern(uint8_t * data, uint8_t size, uint32_t repeat);
 
For my use case I was using transfer32 to write 2 pixels at a time to 6 displays which offered a significant increase in performance. Of course for the pixel data the byte order didn’t have to be changed so I could just write all the buffers as is and the displays showed up correctly. I know Paul doesn’t like making changes to public API, but it would be nice to access these functions in some type of way for the special cases that can make use of them.
 
From : File-abstraction-and-SdFat-integration

@KurtE / @mjs513:: ... @Paul
the Println() is ugly - it prints like:
Code:
size_t Print::println(void)
{
	uint8_t buf[2]={'\r', '\n'};
	return write(buf, 2);
}

I suppose the fix is to have the Windows IDE SerMon act like it does under Mac/Linux and turn "\r\n" into a single "\n" for GUI display?

Or maybe the '\r' and '\n' are not needed for Mac/Linux either and they just happen to eat the Return [ like a good line printer ] then do the New Line without showing a double line feed like on Windows.
 
the Println() is ugly

You need to be much more specific about how to reproduce this problem!

I tried just now with Arduino 1.8.13 and 1.54-beta2 on Windows 10. I believe you can see in this screenshot it looks perfectly fine.

capture.jpg

I'm sure you're doing something else that does bring out some sort of problem. But how am I supposed to guess what you're actually doing that's giving ugly printing?
 
Oops - guess I forgot the other piece of the puzzle.

When you dump to the SerMon everything looks fine, as you have shown, but now try and copy the contents of the window and paste it into your editor of choice including the forum. You will see extra blank lines. Using your example:
Code:
Hello World 1

Hello World 2

Hello World 3

Hello World 4

Hello World 5

Hello World 6
 
@Paul and all - yep that is the issue I mentioned earlier.

As for stuff for the next beta - Wonder about the Hardware Serial Half duplex support I did earlier in the Pull Request: https://github.com/PaulStoffregen/cores/pull/489
It was sort of fun to do on T3.x, as there is no real additional overhead to the Serial writes and the ISR: in this case I simply changed the bitband register value we were saving, so that instead of setting or clearing the IO pins it changes the bit in the IO register that sets the TX pin to transmit or receive.
 
Ah, yes, I see it now. Double spacing with copy & paste. This problem is on the Arduino side. To see this, select the non-Teensy COM port from the Tools > Ports menu. That will use Arduino's normal (slow) serial monitor. If you copy & paste from that one, you'll get single spaced lines.
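
For anyone post-processing pasted logs in the meantime, a small sketch of collapsing each "\r\n" pair into a single "\n". `collapseCrLf` is a hypothetical helper for illustration only; it is not the serial monitor's code and says nothing about how the eventual fix will work.

```cpp
#include <cstddef>
#include <string>

// Drops every '\r' that is immediately followed by '\n'.
// A lone '\r' is kept unchanged.
std::string collapseCrLf(const std::string &in)
{
  std::string out;
  out.reserve(in.size());
  for (size_t i = 0; i < in.size(); i++) {
    if (in[i] == '\r' && i + 1 < in.size() && in[i + 1] == '\n') {
      continue;  // skip the carriage return, keep the line feed
    }
    out.push_back(in[i]);
  }
  return out;
}
```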

I've put this on my bug list. But I'm going to wait until at least beta4. I want to get beta3 wrapped up for testing just as soon as the new File and SD / SdFat stuff is stable enough for widespread testing.
 
Installed on macOS Catalina 10.15.7

No new issues so far. Haven't looked at any of the new features but all old stuff seems to work.
 