Thread: CircuitPython on Teensy 4!

  1. #51
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    23,087
    I may consider a way for future bootloaders to preserve the filesystem portion of Circuit Python. But hacking the bootloader is a very serious matter, as there is a very real risk of bricked boards if things go wrong. That's why much of the design goal of the bootloader is to keep things simple & low-risk. Whether I put engineering time into this and other ways to support Circuit Python will really depend upon whether PJRC sees significant Teensy sales for people using Circuit Python.

  2. #52
    Junior Member
    Join Date
    Nov 2019
    Posts
    14
    The latest (2nd) firmware from @tannewt is a lot faster than the 1st one :
    - performanceTest with firmware CircuitPython 5.0.0-beta.3-6-g926375d99-dirty on 2020-01-10 @ Teensy 4.0 : 300871
    - performanceTest with firmware CircuitPython 5.0.0-beta.3-69-g1c3960634 on 2020-01-18 @ Teensy 4.0 : 4005428 # 13.3x, greater is better
    - hsquare with firmware CircuitPython 5.0.0-beta.3-6-g926375d99-dirty on 2020-01-10 @ Teensy 4.0 : 155.3999519348145 us
    - hsquare with firmware CircuitPython 5.0.0-beta.3-69-g1c3960634 on 2020-01-18 @ Teensy 4.0 : 10.7399976253510 us # 6.91% of the earlier time, lower is better
    Using benchmark scripts adapted from "Benchmark comparison of MicroPython boards" topic in MicroPython forum.

    But this 2nd firmware has bugs in REPL and when reading files (sometimes), like the pystone benchmark.
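For readers who want to reproduce numbers of this kind, a counting benchmark can be sketched as below. This is a minimal sketch, not the actual performanceTest script from the MicroPython forum topic: the function names `count_for` and `speedup` and the default duration are assumptions made for illustration. It uses only `time.monotonic()`, which exists in both CircuitPython and desktop Python.

```python
import time


def count_for(duration_s=10.0):
    """Count loop iterations completed within duration_s seconds (greater is better)."""
    count = 0
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        count += 1
    return count


def speedup(new_count, old_count):
    """Ratio of two counter results taken over the same duration."""
    return new_count / old_count


# Ratios recomputed from the figures quoted in the post above.
print(round(speedup(4005428, 300871), 1))                    # prints 13.3
print(round(10.7399976253510 / 155.3999519348145 * 100, 2))  # prints 6.91
```

On a board, `count_for()` with the default 10 s would be called once per firmware build and the two results compared with `speedup()`. Because the loop calls `time.monotonic()` on every pass, the count includes the cost of that call, so absolute numbers will not match the figures above.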

  3. #53
    Quote Originally Posted by PaulStoffregen View Post
    I may consider a way for future bootloaders to preserve the filesystem portion of Circuit Python. But hacking the bootloader is a very serious matter, as there is a very real risk of bricked boards if things go wrong. That's why much of the design goal of the bootloader is to keep things simple & low-risk. Whether I put engineering time into this and other ways to support Circuit Python will really depend upon whether PJRC sees significant Teensy sales for people using Circuit Python.
    It's not my intent to hack the Teensy Bootloader at all. My intention would be to provide another bootloader alongside CircuitPython. How to enter the bootloader may be an issue though. Does the button on Teensy reset the main chip every press? We usually recognize a double reset as meant to enter the bootloader.

    I don't expect you to put time into it until you feel it is worthwhile.

    Quote Originally Posted by rcolistete View Post
    The latest (2nd) firmware from @tannewt is a lot faster than the 1st one :
    - performanceTest with firmware CircuitPython 5.0.0-beta.3-6-g926375d99-dirty on 2020-01-10 @ Teensy 4.0 : 300871
    - performanceTest with firmware CircuitPython 5.0.0-beta.3-69-g1c3960634 on 2020-01-18 @ Teensy 4.0 : 4005428 # 13.3x, greater is better
    - hsquare with firmware CircuitPython 5.0.0-beta.3-6-g926375d99-dirty on 2020-01-10 @ Teensy 4.0 : 155.3999519348145 us
    - hsquare with firmware CircuitPython 5.0.0-beta.3-69-g1c3960634 on 2020-01-18 @ Teensy 4.0 : 10.7399976253510 us # 6.91% of the earlier time, lower is better
    Using benchmark scripts adapted from "Benchmark comparison of MicroPython boards" topic in MicroPython forum.

    But this 2nd firmware has bugs in REPL and when reading files (sometimes), like the pystone benchmark.
    Thanks for testing this! I believe the bugs are due to the DCache interacting with TinyUSB. I've turned it off in my PR and attached the build from the GitHub Actions CI. (All boards are built for pending PRs and the files can be downloaded from GitHub. To do so, click the red x or green checkmark next to a commit, get details on a build and then click artifacts. The dropdown will have every board and it'll have a zip of all languages for that board.) I'd expect it to be similar speed and less buggy.

  4. #54
    Junior Member
    Join Date
    Nov 2019
    Posts
    14
    Thanks for the CircuitPython for Teensy 4.0 v2020-01-19. The benchmarks show that it's as fast as v2020-01-18.

    The REPL is better (no bugs so far).

    But the same pystone benchmark (see attached file) works OK on firmware v2020-01-10:
    Code:
    Adafruit CircuitPython 5.0.0-beta.3-6-g926375d99-dirty on 2020-01-10; Teensy 4.0 with IMXRT1062DVJ6A
    
    >>> import pystone_lowmem_monotonic
    >>> pystone_lowmem_monotonic.main()
    Pystone(1.2) time for 500 passes = 761000000 ms
    This machine benchmarks at 657 pystones/second
    But not on v2020-01-18 and v2020-01-19 :
    Code:
    Adafruit CircuitPython 5.0.0-beta.3-70-g7f960151b on 2020-01-19; Teensy 4.0 with IMXRT1062DVJ6A
    
    >>> import pystone_lowmem_monotonic
    >>> pystone_lowmem_monotonic.main()
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    AttributeError: 'module' object has no attribute 'main'

  5. #55
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,594
    @tannewt your latest hex file in post #53 only shows a build date of 20200119; i was hoping for something newer. That .hex file fails on T4 as before (post #45) when porting my longer .py scripts (llutm.py, raytrace.py, pystone.py, wator.py). T4 does run hsquare.py faster (11.2 us), and the 10 s counting script reaches 3247698. I still have the most success with your first .hex (2020-01-10). I've added some more performance numbers for that .hex in post #14.

  6. #56
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    12,760
    Quote Originally Posted by manitou View Post
    @tannewt your latest hex file in post #53 only shows a build date of 20200119; i was hoping for something newer. That .hex file fails on T4 as before (post #45) when porting my longer .py scripts (llutm.py, raytrace.py, pystone.py, wator.py). T4 does run hsquare.py faster (11.2 us), and the 10 s counting script reaches 3247698. I still have the most success with your first .hex (2020-01-10). I've added some more performance numbers for that .hex in post #14.
    Will be cool to see the numbers with improved build in the table.
    @manitou - can you note if High/Low is better for each subset.

    Would be interesting to see something of the same benchmark using a cpp build to gauge overhead of interpreter.

    As far as adding bootloader reserved space in FLASH - would be nice for Arduino sketch to be able to have reserved space for rarely updated 'config' info that might be too big for EEPROM - one user wanted 4K IIRC. I suppose that may fall out on future Teensy 4.1 with larger FLASH where space for filesystem may be reserved if the T_4.0's 2MB doesn't justify that. OR altered EEPROM logic and numbers could free up reserved space there compromising the overall size or the backing rewrite ratio. But EEPROM edit doesn't help this thread for store of python code.

  7. #57
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,594
    Quote Originally Posted by defragster View Post
    Will be cool to see the numbers with improved build in the table.
    @manitou - can you note if High/Low is better for each subset.

    Would be interesting to see something of the same benchmark using a cpp build to gauge overhead of interpreter.
    yeah, the high/low is left to the reader. most of early columns are time (us or s) but there are pystones and counter values where bigger is better.

    to post #14 i've added a "T4 C" line for a rough comparison of interpreted python vs compiled C. The sketches can take advantage of hardware float, i think the python core uses double everywhere. The interpreter takes a lot of space too. On the M0+ circuit playground express with circuitpython, there is little room left for user scripts.

    https://forum.micropython.org/viewtopic.php?t=2659

  8. #58
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    12,760
    Quote Originally Posted by manitou View Post
    yeah, the high/low is left to the reader. most of early columns are time (us or s) but there are pystones and counter values where bigger is better.

    to post #14 i've added a "T4 C" line for a rough comparison of interpreted python vs compiled C. The sketches can take advantage of hardware float, i think the python core uses double everywhere. The interpreter takes a lot of space too. On the M0+ circuit playground express with circuitpython, there is little room left for user scripts.

    https://forum.micropython.org/viewtopic.php?t=2659
    C looks like an unfair comparison - I didn't expect it to be that extreme. Will be interesting to see how T4 comes along. I saw linked table that includes T_3.x's - odd they don't stand out there.

    'left to the reader' … okay forum isn't always good for tables anyhow ...

  9. #59
    Quote Originally Posted by rcolistete View Post
    Thanks for the CircuitPython for Teensy 4.0 v2020-01-19. The benchmarks show that it's as fast as v2020-01-18.

    The REPL is better (no bugs so far).

    But the same pystone benchmark (see attached file) works OK on firmware v2020-01-10:
    Code:
    Adafruit CircuitPython 5.0.0-beta.3-6-g926375d99-dirty on 2020-01-10; Teensy 4.0 with IMXRT1062DVJ6A
    
    >>> import pystone_lowmem_monotonic
    >>> pystone_lowmem_monotonic.main()
    Pystone(1.2) time for 500 passes = 761000000 ms
    This machine benchmarks at 657 pystones/second
    But not on v2020-01-18 and v2020-01-19 :
    Code:
    Adafruit CircuitPython 5.0.0-beta.3-70-g7f960151b on 2020-01-19; Teensy 4.0 with IMXRT1062DVJ6A
    
    >>> import pystone_lowmem_monotonic
    >>> pystone_lowmem_monotonic.main()
    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    AttributeError: 'module' object has no attribute 'main'
    Interesting! What does `dir(pystone_lowmem_monotonic)` show? That will list all names within the module.
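The `dir()` check suggested here can be wrapped in a small helper that reports which expected names an imported module lacks. This is only a sketch: the helper name `missing_names` is hypothetical, and the standard `math` module stands in for the benchmark module so the example runs anywhere.

```python
import math


def missing_names(module, expected):
    """Return the expected names that the imported module does not define."""
    present = dir(module)  # every name the module currently exposes
    return [name for name in expected if name not in present]


# math defines sqrt but has no main, so only 'main' is reported.
print(missing_names(math, ["sqrt", "main"]))  # prints ['main']
```

On the board the equivalent call would be `missing_names(pystone_lowmem_monotonic, ["main"])`. A non-empty result confirms what the AttributeError already says; printing `dir(pystone_lowmem_monotonic)` in full then shows how much of the file was actually loaded.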

    Quote Originally Posted by manitou View Post
    @tannewt your latest hex file in post #53 only shows a build date of 20200119; i was hoping for something newer. That .hex file fails on T4 as before (post #45) when porting my longer .py scripts (llutm.py, raytrace.py, pystone.py, wator.py). T4 does run hsquare.py faster (11.2 us), and the 10 s counting script reaches 3247698. I still have the most success with your first .hex (2020-01-10). I've added some more performance numbers for that .hex in post #14.
    There are no newer changes. This port isn't my top priority currently. (My top priority is fixing bugs for 5.0.0 stable and making sure Bluetooth on the nRF52840 is solid.) The syntax error problem sounds like the file didn't transfer successfully. I'm surprised you are still hitting it with the DCache turned off.

    Quote Originally Posted by defragster View Post
    C looks like an unfair comparison - I didn't expect it to be that extreme. Will be interesting to see how T4 comes along. I saw linked table that includes T_3.x's - odd they don't stand out there.

    'left to the reader' okay forum isn't always good for tables anyhow ...
    C will always be much faster to run than Python. However, Python is much faster to write and iterate on. CircuitPython/MicroPython is the best of both worlds: use Python to connect together C bits that are fast.

  10. #60
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,594
    @tannewt I cloned your latest teensy4-dev branch and was able to run all of my benchmark CircuitPython scripts. I updated results in post #14. As I understand it, DCache is disabled? So things could get faster with additional memory/cache tuning.

    ref https://github.com/adafruit/circuitpython/pull/2532

  11. #61
    Quote Originally Posted by manitou View Post
    @tannewt I cloned your latest teensy4-dev branch and was able to run all of my benchmark CircuitPython scripts. I updated results in post #14. As I understand it, DCache is disabled? So things could get faster with additional memory/cache tuning.

    ref https://github.com/adafruit/circuitpython/pull/2532
    Thanks for testing it out! I've merged that PR in so any changes can go on the master adafruit branch now. It does have the DCache disabled so we could get a bit of a speed boost from turning it back on. The build isn't using link time optimization yet either which should be able to speed things up as well.

    Will let you know when I circle back around to it. Thanks!

  12. #62
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,594
    Hack alert:

    OK, I've updated post #14 again. I added SCB_EnableDCache() to cpu.voltage, and then invoke cpu.voltage at the start of each of my .py tests, so I should have DCache enabled, and performance (post #14) did improve. To get fact.py times i used GPT micros (also hacked into cpu.voltage), other tests are using time.monotonic()
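A sketch of how a test script might trigger this hack and then time a function with `time.monotonic()` follows. The helper names `trigger_cache_hack` and `time_us` are hypothetical. `microcontroller.cpu.voltage` is a standard CircuitPython property, but reading it only enables DCache on a firmware build patched as described in this post; on stock firmware it has no such side effect.

```python
import time

try:
    import microcontroller  # present on CircuitPython boards only
except ImportError:
    microcontroller = None  # desktop Python: skip the hardware access


def trigger_cache_hack():
    """Read cpu.voltage; on the patched build described above this enables DCache."""
    if microcontroller is not None:
        return microcontroller.cpu.voltage
    return None


def time_us(func, repeats=1000):
    """Average time per call of func in microseconds, using time.monotonic()."""
    start = time.monotonic()
    for _ in range(repeats):
        func()
    return (time.monotonic() - start) * 1e6 / repeats


trigger_cache_hack()  # call once at the start of each test script
print(time_us(lambda: sum(range(100))))
```

Averaging over many repeats keeps the result meaningful even when a single call is shorter than the timer's resolution.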

    Are you using ITCM/DTCM? The Teensy 4 core copies all of FLASH to RAM and runs instructions out of ITCM (0-delay), and most of the data is in DTCM.
    ref: https://www.pjrc.com/store/teensy40.html

  13. #63
    Quote Originally Posted by manitou View Post
    Hack alert:

    OK, I've updated post #14 again. I added SCB_EnableDCache() to cpu.voltage, and then invoke cpu.voltage at the start of each of my .py tests, so I should have DCache enabled, and performance (post #14) did improve. To get fact.py times i used GPT micros (also hacked into cpu.voltage), other tests are using time.monotonic()
    Whoa! Those numbers do look better! It'd be great to hunt down the TinyUSB bug so we can leave it on.

    Quote Originally Posted by manitou View Post
    Are you using ITCM/DTCM? The Teensy 4 core copies all of FLASH to RAM and runs instructions out of ITCM (0-delay), and most of the data is in DTCM.
    ref: https://www.pjrc.com/store/teensy40.html
    My PR adds basic support for the ITCM and the DTCM. I allocate 32k to each. Our first board will be the 1010 so I wanted to focus on that limited amount first. I tried to move all of the core VM stuff there to speed things up. The stack also lives in the DTCM now. You can play with adding things to the TCMs using the macros here: https://github.com/adafruit/circuitp...r/linker.h#L32 It'd be worth validating that we're using the hardware float support too. I haven't looked into it yet.

  14. #64
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    23,087
    Quote Originally Posted by tannewt View Post
    I allocate 32k to each. Our first board will be the 1010 so I wanted to focus on that limited amount first.
    "Our" means Adafruit, right?

  15. #65
    Quote Originally Posted by PaulStoffregen View Post
    "Our" means Adafruit, right?
    Correct. Here is a preview: https://twitter.com/adafruit/status/1221182041267560449

  16. #66
    Senior Member+ defragster's Avatar
    Join Date
    Feb 2015
    Posts
    12,760
    Quote Originally Posted by tannewt View Post
    Good pics - thanks for the preview link - I wondered what the 1010 referred to.

  17. #67
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,594
    Quote Originally Posted by tannewt View Post
    It'd be worth validating that we're using the hardware float support too. I haven't looked into it yet.
    i'm pretty sure FPU is configured (by SDK SystemInit()) and compiler switches are correct, and that double and float are working for your Teensy 4 branch. Many of my .py tests were float intensive (raytrace hsquare llutm)

    i also embedded a C floating point test in cpu.temperature and speeds suggest FPU is working in circuitpython core
    Code:
            ddot 2.12534e+10 93 ms   8.60 Mflops   double
            daxpy 2.12534e+10 61 ms  13.11 Mflops
             
            sdot 2.12578e+10 31 ms  25.81 Mflops   float
            saxpy 2.12578e+10 34 ms  23.53 Mflops
    
            cache enabled   SCB_EnableDCache();
            ddot 2.12534e+10 67 ms  11.94 Mflops
            daxpy 2.12534e+10 33 ms  24.24 Mflops
    
            sdot 2.12578e+10 4 ms 200.00 Mflops
            saxpy 2.12578e+10 7 ms 114.29 Mflops
    Last edited by manitou; 01-30-2020 at 01:24 PM.
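The Mflops column in the listing above can be reproduced from the times with simple arithmetic. The sketch below assumes each run performs 800,000 floating point operations. That figure is inferred from the posted numbers (every time multiplied by its Mflops value gives about 800,000) and is not stated in the post; the names `FLOPS_PER_RUN` and `mflops` are hypothetical.

```python
FLOPS_PER_RUN = 800000  # assumption: inferred from the posted times and rates


def mflops(flops, elapsed_ms):
    """Millions of floating point operations per second for one timed run."""
    return flops / (elapsed_ms * 1000.0)


# (time with DCache off, time with DCache on) in ms, copied from the listing.
runs = {
    "ddot": (93, 67),
    "daxpy": (61, 33),
    "sdot": (31, 4),
    "saxpy": (34, 7),
}

for name, (off_ms, on_ms) in runs.items():
    print(
        name,
        round(mflops(FLOPS_PER_RUN, off_ms), 2),
        round(mflops(FLOPS_PER_RUN, on_ms), 2),
        "cache speedup x" + str(round(off_ms / on_ms, 2)),
    )
```

Laid out this way, the numbers show the single precision runs gaining far more from the cache (31 ms to 4 ms for sdot) than the double precision runs (93 ms to 67 ms for ddot).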

  18. #68
    Quote Originally Posted by manitou View Post
    i'm pretty sure FPU is configured (by SDK SystemInit()) and compiler switches are correct, and that double and float are working for your Teensy 4 branch. Many of my .py tests were float intensive (raytrace hsquare llutm)

    i also embedded a C floating point test in cpu.temperature and speeds suggest FPU is working in circuitpython core
    Code:
            ddot 2.12534e+10 93 ms   8.60 Mflops   double
            daxpy 2.12534e+10 61 ms  13.11 Mflops
             
            sdot 2.12578e+10 31 ms  25.81 Mflops   float
            saxpy 2.12578e+10 34 ms  23.53 Mflops
    
            cache enabled   SCB_EnableDCache();
            ddot 2.12534e+10 67 ms  11.94 Mflops
            daxpy 2.12534e+10 33 ms  24.24 Mflops
    
            sdot 2.12578e+10 4 ms 200.00 Mflops
            saxpy 2.12578e+10 7 ms 114.29 Mflops
    Very cool! It looks like we always use float internally as well: https://github.com/adafruit/circuitp...mpconfig.h#L69
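Float width can also be checked from the REPL without reading the build configuration. The sketch below halves a value until adding it to 1.0 no longer changes the result, which counts the fraction bits the interpreter's float carries. The function name `fraction_bits` is hypothetical, and the value a Teensy 4 build reports is not given in this thread, so none is assumed here.

```python
def fraction_bits():
    """Count how many times eps can be halved while 1.0 + eps / 2 still differs from 1.0."""
    bits = 0
    eps = 1.0
    while 1.0 + eps / 2 != 1.0:
        eps /= 2
        bits += 1
    return bits


# Desktop CPython uses 64-bit doubles and reports 52 here.
print(fraction_bits())
```

A result well below 52 would be consistent with single precision floats inside the interpreter.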

  19. #69
    Senior Member+ mjs513's Avatar
    Join Date
    Jul 2014
    Location
    New York
    Posts
    5,912

    Micropython and Teensy 4.0

    I know this thread is CircuitPython specific, but I just wanted to share something I came across on the MicroPython forum regarding Teensy 4.0: https://forum.micropython.org/viewto...00c80&start=10.

    Someone seems to have started a port of MicroPython to the Teensy 4.0 (https://forum.micropython.org/viewto...tart=10#p43549). The MicroPython PR for the T4.0 is at https://github.com/micropython/micropython/pull/5558.

    Looks kind of interesting in case anyone is interested.
    Last edited by mjs513; 02-03-2020 at 11:11 PM. Reason: Spelling !!!!!!

  20. #70
    Senior Member
    Join Date
    Jan 2015
    Location
    UK
    Posts
    146
    Which one is best to use on Teensy 4.0 ? Circuitpython or Micropython ?

  21. #71
    Senior Member+ mjs513's Avatar
    Join Date
    Jul 2014
    Location
    New York
    Posts
    5,912
    Not sure what you mean by best. Right now both are in their beginning stages of development. From what I am reading, maybe CircuitPython is a bit farther ahead of the curve? See https://github.com/adafruit/circuitpython/pull/2532. I haven't dug into it. Right now I have a few other projects in the fire.

  22. #72
    Senior Member PaulStoffregen's Avatar
    Join Date
    Nov 2012
    Posts
    23,087
    I also haven't had time to do much with this. Focusing on hardware stuff right now. Hoping to get in some serious time to work with Python by mid-March...

  23. #73
    Senior Member+ manitou's Avatar
    Join Date
    Jan 2013
    Posts
    2,594
    Quote Originally Posted by mjs513 View Post
    Micropythong and Teensy 4.0
    Please don't post any photos of your micropythong!

    My fingers also like to add the "g"

  24. #74
    Senior Member+ mjs513's Avatar
    Join Date
    Jul 2014
    Location
    New York
    Posts
    5,912
    Quote Originally Posted by manitou View Post
    Please don't post any photos of your micropythong!

    My fingers also like to add the "g"
    Promise no photos of micropythong's! (fixed it anyway).

    What happens when you have fat fingers and eyes half closed.

  25. #75
    Senior Member
    Join Date
    May 2015
    Posts
    402
    All the drivers you get with CircuitPython sure make it seem easier for a noobie. I haven't started my embedded Python journey yet but will very soon. The 4 is the first microcontroller I feel is fast enough to run almost any script for almost any project.
