Forum Rule: Always post complete source code & details to reproduce any issue!
Results 1 to 18 of 18

Thread: Profiling using gprof on Teensy 4

  1. #1
    Senior Member
    Join Date
    Sep 2013
    Location
    Boston, MA
    Posts
    130

    Profiling using gprof on Teensy 4

    I created a project to partially support gprof-style profiling of applications on Teensy 4. This shows how much time is spent in each function. In theory, this could also work on Teensy 3, but it doesn't have enough memory to do it well so I didn't implement that yet. This is the repository: https://github.com/ftrias/TeensyProf

    It will only support the histogram feature of gprof. The Teensyduino arm cross-compiler does not support more advanced features (but it could! ...if Paul includes gprof executable and libc_p.a). You will need to make a few changes to the Arduino setup described in the README.md file.

    Overview
    -------------

    The profiler samples the current instruction every 1 millisecond. That is at each millisecond it looks to see what function is currently running and keeps a counter of how many times this happens. Over time, this gives an approximate value of how much time each function consumes.

    On desktops, gprof will also keep track of which function calls which, but that is not supported on Teensy 4.

    After a while, a file with the function counters is sent out to the serial port. A python program is listening on the serial port for this and writes out "gmon.out". This file is then cross-referenced by gprof with the original executable to generate a table of execution times.

    For an interesting overview of gprof: http://wwwcdf.pd.infn.it/localdoc/gprof.pdf

    Output of Tests Example
    ------------------

    Code:
    Flat profile:
    
    Each sample counts as 0.001 seconds.
      %   cumulative   self              self     total           
     time   seconds   seconds    calls  Ts/call  Ts/call  name    
     50.00      5.08     5.08                             stall_3()
     33.33      8.46     3.38                             stall_2()
     16.66     10.15     1.69                             stall_1()
      0.01     10.15     0.00                             endpoint0_receive
    Edit: Explained serial port.
    Last edited by ftrias; 08-23-2019 at 03:02 PM.

  2. #2
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    546
    Sounds interesting, I'll give it a try.

    The output you show is generated by the Linux/Mac/Win executable which is listening on a serial port right?

  3. #3
    Senior Member
    Join Date
    Sep 2013
    Location
    Boston, MA
    Posts
    130
    Quote Originally Posted by luni View Post
    Sounds interesting, I'll give it a try.

    The output you show is generated by the Linux/Mac/Win executable which is listening on a serial port right?
    I should have explained more. When compiling ends, it copies the elf file to /tmp/build.elf. Then the program uploads and runs as normal. When the profiling ends, the Teensy (using the library) sends a specially formatted data to the serial port. There is a python program listening to the serial port on the desktop. It detects the special codes and writes out the gmon.out file. Then it executes gprof on it cross-referencing the /tmp/build.elf binary.

  4. #4
    Senior Member
    Join Date
    Sep 2013
    Location
    Boston, MA
    Posts
    130
    Quote Originally Posted by ftrias View Post
    I created a project to partially support gprof-style profiling of applications on Teensy 4. This shows how much time is spent in each function. In theory, this could also work on Teensy 3, but it doesn't have enough memory to do it well so I didn't implement that yet. This is the repository: https://github.com/ftrias/TeensyProf

    It will only support the histogram feature of gprof. The Teensyduino arm cross-compiler does not support more advanced features (but it could! ...if Paul includes gprof executable and libc_p.a). You will need to make a few changes to the Arduino setup described in the README.md file.

    Overview
    -------------

    The profiler samples the current instruction every 1 millisecond. That is at each millisecond it looks to see what function is currently running and keeps a counter of how many times this happens. Over time, this gives an approximate value of how much time each function consumes.

    On desktops, gprof will also keep track of which function calls which, but that is not supported on Teensy 4.

    After a while, a file with the function counters is sent out to the serial port. A python program is listening on the serial port for this and writes out "gmon.out". This file is then cross-referenced by gprof with the original executable to generate a table of execution times.

    For an interesting overview of gprof: http://wwwcdf.pd.infn.it/localdoc/gprof.pdf

    Output of Tests Example
    ------------------

    Code:
    Flat profile:
    
    Each sample counts as 0.001 seconds.
      %   cumulative   self              self     total           
     time   seconds   seconds    calls  Ts/call  Ts/call  name    
     50.00      5.08     5.08                             stall_3()
     33.33      8.46     3.38                             stall_2()
     16.66     10.15     1.69                             stall_1()
      0.01     10.15     0.00                             endpoint0_receive
    Edit: Explained serial port.
    I forgot to mention this other modification:

    Modify `imxrt1062.ld` in the directory Arduino.../Contents/Java/hardware/teensy/avr/cores/teensy4. All references to `.text.itcm` must be changed to `.text`. Gprof expects the code to be in a segment named `.text`.

  5. #5
    Senior Member
    Join Date
    Sep 2013
    Location
    Boston, MA
    Posts
    130
    Unsatisfied with the limits of this library, I decided to do a more complete implementation of gprof. I used this previous work by Erich Styger as a guide: https://mcuoneclipse.com/2015/08/23/...-arm-cortex-m/. Clearly, I have too much time on my hands.

    The result works for Teensy 3 & 4 and is in: https://github.com/ftrias/TeensyGProf

    It works similarly to my last attempt. The main difference is that it also implements the call hierarchy and thus you can see which function calls which and the total cumulative time used by functions. This is very helpful.

    Setup is a little more involved and requires modifying Teensyduino files `boards.txt`, `platform.txt` and `imxrt1062.ld`. See README.md for details.

    `imxrt1062.ld` is modified to put all the code in a segment called `.text`. gprof is hard-coded to expect this because it's an almost universal convention. Teensy 3 does this. But for some reason, Teensy 4 does not.

    Please post here if you try. I'm curious to know how it works for people.

    Here is a partial output of the Tests example.

    Code:
    #include "TeensyGProf.h"
    
    int stall_3() {
      volatile int x;
      for(int i=0; i<300; i++) {
        for(int j=0; j<10000; j++) {
          x = 0;
        }
      }
      return x;
    }
    int stall_2() {
      volatile int x;
      for(int i=0; i<200; i++) {
        for(int j=0; j<10000; j++) {
          x = 0;
        }
      }
      return x;
    }
    int stall_1() {
      volatile int x;
      for(int i=0; i<100; i++) {
        for(int j=0; j<10000; j++) {
          x = 0;
        }
      }
      return x;
    }
    
    void runtests() {
      stall_1();
      stall_2();
      stall_3();
    }
    
    void setup() {
      Serial.begin(115200);
    }
    
    long start = millis();
    
    void loop() {
      if (start) Serial.println("loop");
      runtests();
      if (start && millis() - start > 10000) {
        gprof_end();
        start = 0;
      }
    }
    OUTPUT:
    Code:
    Flat profile:
    
    Each sample counts as 0.001 seconds.
      %   cumulative   self              self     total           
     time   seconds   seconds    calls   s/call   s/call  name    
     49.28      4.96     4.96      110     0.05     0.05  stall_3()
     32.83      8.26     3.30        1     3.30     3.30  stall_2()
     16.44      9.91     1.65        1     1.65     1.65  stall_1()
      0.61      9.97     0.06                             _mcount_internal
      0.57     10.03     0.06                             usb_transfer_status
      0.11     10.04     0.01   210058     0.00     0.00  yield
      0.05     10.04     0.01   210167     0.00     0.00  EventResponder::runFromYield()
      0.04     10.05     0.00                             __gnu_mcount_nc
      0.03     10.05     0.00                             micros
      0.02     10.05     0.00                             usb_serial_available
      0.01     10.05     0.00                             usb_serial_class::available()
      0.01     10.06     0.00                             delay
      0.01     10.06     0.00                             gprof_end
      0.00     10.06     0.00    10289     0.00     0.00  MillisTimer::runFromTimer()
      0.00     10.06     0.00    10289     0.00     0.00  MillisTimer::enableTimerInterrupt(bool)
      0.00     10.06     0.00    10289     0.00     0.00  MillisTimer::disableTimerInterrupt()
      0.00     10.06     0.00    10288     0.00     0.00  systick_isr
      0.00     10.06     0.00      110     0.00     0.09  runtests()
      0.00     10.06     0.00      110     0.00     0.00  Print::println()
      0.00     10.06     0.00        1     0.00     0.00  _GLOBAL__sub_I__Z7stall_3v
      0.00     10.06     0.00        1     0.00     0.00  __static_initialization_and_destruction_0(int, int)
    
     %         the percentage of the total running time of the
    time       program used by this function.
    
    cumulative a running sum of the number of seconds accounted
     seconds   for by this function and those listed above it.
    
     self      the number of seconds accounted for by this
    seconds    function alone.  This is the major sort for this
               listing.
    
    calls      the number of times this function was invoked, if
               this function is profiled, else blank.
    
     self      the average number of milliseconds spent in this
    ms/call    function per call, if this function is profiled,
    	   else blank.
    
     total     the average number of milliseconds spent in this
    ms/call    function and its descendents per call, if this
    	   function is profiled, else blank.
    
    name       the name of the function.  This is the minor sort
               for this listing. The index shows the location of
    	   the function in the gprof listing. If the index is
    	   in parenthesis it shows where it would appear in
    	   the gprof listing if it were to be printed.
    
    
    Copyright (C) 2012-2015 Free Software Foundation, Inc.
    
    Copying and distribution of this file, with or without modification,
    are permitted in any medium without royalty provided the copyright
    notice and this notice are preserved.
    
    
    		     Call graph (explanation follows)
    
    
    granularity: each sample hit covers 8 byte(s) for 0.01% of 10.06 seconds
    
    index % time    self  children    called     name
                                                     <spontaneous>
    [1]     98.5    0.00    9.91                 loop [1]
                    0.00    9.91     110/110         runtests() [2]
    -----------------------------------------------
                    0.00    9.91     110/110         loop [1]
    [2]     98.5    0.00    9.91     110         runtests() [2]
                    4.96    0.00     110/110         stall_3() [3]
                    3.30    0.00       1/1           stall_2() [4]
                    1.65    0.00       1/1           stall_1() [5]
    -----------------------------------------------
                    4.96    0.00     110/110         runtests() [2]
    [3]     49.3    4.96    0.00     110         stall_3() [3]
    -----------------------------------------------
                    3.30    0.00       1/1           runtests() [2]
    [4]     32.8    3.30    0.00       1         stall_2() [4]
    -----------------------------------------------
                    1.65    0.00       1/1           runtests() [2]
    [5]     16.4    1.65    0.00       1         stall_1() [5]
    -----------------------------------------------
                                                     <spontaneous>
    [6]      0.6    0.06    0.00                 _mcount_internal [6]
    -----------------------------------------------
                                                     <spontaneous>
    [7]      0.6    0.06    0.00                 usb_transfer_status [7]
    -----------------------------------------------
                    0.00    0.00   43836/210058      delay [11]
                    0.01    0.00  166222/210058      usb_serial_write [9]
    [8]      0.2    0.01    0.01  210058         yield [8]
                    0.01    0.00  210167/210167      EventResponder::runFromYield() [10]
    -----------------------------------------------
                                                     <spontaneous>
    [9]      0.1    0.00    0.01                 usb_serial_write [9]
                    0.01    0.00  166222/210058      yield [8]
    -----------------------------------------------
                    0.01    0.00  210167/210167      yield [8]
    [10]     0.0    0.01    0.00  210167         EventResponder::runFromYield() [10]
    -----------------------------------------------
                                                     <spontaneous>
    [11]     0.0    0.00    0.00                 delay [11]
                    0.00    0.00   43836/210058      yield [8]
    -----------------------------------------------
                                                     <spontaneous>
    [12]     0.0    0.00    0.00                 __gnu_mcount_nc [12]
    -----------------------------------------------
                                                     <spontaneous>
    [13]     0.0    0.00    0.00                 micros [13]
    -----------------------------------------------
                                                     <spontaneous>
    [14]     0.0    0.00    0.00                 usb_serial_available [14]
    -----------------------------------------------
                                                     <spontaneous>
    [15]     0.0    0.00    0.00                 gprof_end [15]
    -----------------------------------------------
                                                     <spontaneous>
    [16]     0.0    0.00    0.00                 usb_serial_class::available() [16]
    -----------------------------------------------
                    0.00    0.00   10288/10288       gprof_systick_isr [37]
    [17]     0.0    0.00    0.00   10288         systick_isr [17]
                    0.00    0.00   10289/10289       MillisTimer::runFromTimer() [87]
    -----------------------------------------------
                    0.00    0.00   10289/10289       systick_isr [17]
    [87]     0.0    0.00    0.00   10289         MillisTimer::runFromTimer() [87]
                    0.00    0.00   10289/10289       MillisTimer::disableTimerInterrupt() [89]
                    0.00    0.00   10289/10289       MillisTimer::enableTimerInterrupt(bool) [88]
    -----------------------------------------------
                    0.00    0.00   10289/10289       MillisTimer::runFromTimer() [87]
    [88]     0.0    0.00    0.00   10289         MillisTimer::enableTimerInterrupt(bool) [88]
    -----------------------------------------------
                    0.00    0.00   10289/10289       MillisTimer::runFromTimer() [87]
    [89]     0.0    0.00    0.00   10289         MillisTimer::disableTimerInterrupt() [89]
    -----------------------------------------------
                    0.00    0.00     110/110         Print::println(char const*) [113]
    [90]     0.0    0.00    0.00     110         Print::println() [90]
    -----------------------------------------------
                    0.00    0.00       1/1           __libc_init_array [123]
    [91]     0.0    0.00    0.00       1         _GLOBAL__sub_I__Z7stall_3v [91]
                    0.00    0.00       1/1           __static_initialization_and_destruction_0(int, int) [92]
    -----------------------------------------------
                    0.00    0.00       1/1           _GLOBAL__sub_I__Z7stall_3v [91]
    [92]     0.0    0.00    0.00       1         __static_initialization_and_destruction_0(int, int) [92]
    -----------------------------------------------
    Last edited by ftrias; 08-24-2019 at 03:38 PM. Reason: ask for feedback; add code

  6. #6
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    546
    Wanted to try in steps and started with the following (I'm using a makefile & vsCode not the Arduino IDE)

    Board: Teensy 3.2
    Changed/added compile flags: -O0 -gp

    Result:
    Compiling a test sketch, your library and linking does work without problems.
    But: The program seems to badly mess up something. The Teensy vanishes from the USB bus and a simple blink does not work anymore. Need to press the program button to revive the Board.

    If I link with -gp it complains about the missing libc_p (which was expected I think)

    Any ideas?

  7. #7
    Senior Member
    Join Date
    Sep 2013
    Location
    Boston, MA
    Posts
    130
    Quote Originally Posted by luni View Post
    Wanted to try in steps and started with the following (I'm using a makefile & vsCode not the Arduino IDE)

    Board: Teensy 3.2
    Changed/added compile flags: -O0 -gp

    Result:
    Compiling a test sketch, your library and linking does work without problems.
    But: The program seems to badly mess up something. The Teensy vanishes from the USB bus and a simple blink does not work anymore. Need to press the program button to revive the Board.

    If I link with -gp it complains about the missing libc_p (which was expected I think)

    Any ideas?
    One theory: I believe this happens when you use the "-pg" option in the link stage. This is why the "boards.txt" file adds a new parameter "profile" with "-O0 -pg" and then only adds it in "platform.txt" in the compile stage of C++ files. If you just add it to the "optimize" parameter in "boards.txt" it won't work.

  8. #8
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    546
    Unfortunately not. I only set the -pg for the compiler not the linker

    Here the makefile. See the section Flags and Defines.
    Code:
    #******************************************************************************
    # Generated by VisualTeensy (https://github.com/luni64/VisualTeensy)
    #
    # Board              Teensy 3.2 / 3.1
    # USB Type           Serial
    # CPU Speed          96 MHz (overclock)
    # Optimize           Faster
    # Keyboard Layout    US English
    #
    # 24.08.2019 19:29
    #******************************************************************************
    SHELL            := cmd.exe
    export SHELL
    
    TARGET_NAME      := profTest
    BOARD_ID         := TEENSY31
    MCU              := mk20dx256
    
    LIBS_SHARED_BASE := C:\Users\lutz\Documents\Arduino\libraries
    LIBS_SHARED      := TeensyGProf
    
    LIBS_LOCAL_BASE  := lib
    LIBS_LOCAL       :=
    
    CORE_BASE        := C:\PROGRA~1\Arduino\ARDUIN~1.8\hardware\teensy\avr\cores\teensy3
    GCC_BASE         := C:\PROGRA~1\Arduino\ARDUIN~1.8\hardware\tools\arm\bin
    UPL_PJRC_B       := C:\PROGRA~1\Arduino\ARDUIN~1.8\hardware\tools
    UPL_TYCMD_B      := C:\toolchain\TyTools
    UPL_JLINK_B      := C:\PROGRA~2\SEGGER\JLINK_~1
    
    #******************************************************************************
    # Flags and Defines
    #******************************************************************************
    FLAGS_CPU   := -mthumb -mcpu=cortex-m4 -fsingle-precision-constant
    FLAGS_OPT   := -O0
    FLAGS_COM   := -g -pg -Wall -ffunction-sections -fdata-sections -nostdlib -MMD
    FLAGS_LSP   :=
    
    FLAGS_CPP   := -fno-exceptions -felide-constructors -std=gnu++14 -Wno-error=narrowing -fno-rtti
    FLAGS_C     :=
    FLAGS_S     := -x assembler-with-cpp
    FLAGS_LD    := -Wl,--gc-sections,--relax,--defsym=__rtc_localtime=$(shell powershell [int](Get-Date -UFormat +%s)[0]) -T$(CORE_BASE)/mk20dx256.ld -lstdc++
    
    LIBS        := -larm_cortexM4l_math -lm
    
    DEFINES     := -D__MK20DX256__ -DTEENSYDUINO=145 -DARDUINO=10807
    DEFINES     += -DF_CPU=96000000 -DUSB_SERIAL -DLAYOUT_US_ENGLISH
    
    CPP_FLAGS   := $(FLAGS_CPU) $(FLAGS_OPT) $(FLAGS_COM) $(DEFINES) $(FLAGS_CPP)
    C_FLAGS     := $(FLAGS_CPU) $(FLAGS_OPT) $(FLAGS_COM) $(DEFINES) $(FLAGS_C)
    S_FLAGS     := $(FLAGS_CPU) $(FLAGS_OPT) $(FLAGS_COM) $(DEFINES) $(FLAGS_S)
    LD_FLAGS    := $(FLAGS_CPU) $(FLAGS_OPT) $(FLAGS_LSP) $(FLAGS_LD)
    AR_FLAGS    := rcs
    
    #******************************************************************************
    # Colors
    #******************************************************************************
    COL_CORE    := 
    COL_LIB     := 
    COL_SRC     := 
    COL_LINK    := 
    COL_ERR     := 
    COL_OK      := 
    COL_RESET   := 
    
    #******************************************************************************
    # Folders and Files
    #******************************************************************************
    USR_SRC     := src
    LIB_SRC     := lib
    CORE_SRC    := $(CORE_BASE)
    
    BIN         := .vsteensy/build
    USR_BIN     := $(BIN)/src
    CORE_BIN    := $(BIN)/core
    LIB_BIN     := $(BIN)/lib
    CORE_LIB    := $(BIN)/core.a
    TARGET_HEX  := $(BIN)/$(TARGET_NAME).hex
    TARGET_ELF  := $(BIN)/$(TARGET_NAME).elf
    TARGET_LST  := $(BIN)/$(TARGET_NAME).lst
    
    #******************************************************************************
    # BINARIES
    #******************************************************************************
    CC          := $(GCC_BASE)/arm-none-eabi-gcc
    CXX         := $(GCC_BASE)/arm-none-eabi-g++
    AR          := $(GCC_BASE)/arm-none-eabi-gcc-ar
    OBJCOPY     := $(GCC_BASE)/arm-none-eabi-objcopy
    SIZE        := $(GCC_BASE)/arm-none-eabi-size
    OBJDUMP     := $(GCC_BASE)/arm-none-eabi-objdump
    UPL_PJRC    := "$(UPL_PJRC_B)/teensy_post_compile" -test -file=$(TARGET_NAME) -path=$(BIN) -tools="$(UPL_PJRC_B)" -board=$(BOARD_ID) -reboot
    UPL_TYCMD   := $(UPL_TYCMD_B)/tyCommanderC upload $(TARGET_HEX) --autostart --wait --multi
    UPL_CLICMD  := $(UPL_CLICMD_B)/teensy_loader_cli -mmcu=$(MCU) -v $(TARGET_HEX)
    UPL_JLINK   := $(UPL_JLINK_B)/jlink -commanderscript .vsteensy/flash.jlink
    
    #******************************************************************************
    # Source and Include Files
    #******************************************************************************
    # Recursively create list of source and object files in USR_SRC and CORE_SRC
    # and corresponding subdirectories.
    # The function rwildcard is taken from http://stackoverflow.com/a/12959694)
    
    rwildcard =$(wildcard $1$2) $(foreach d,$(wildcard $1*),$(call rwildcard,$d/,$2))
    
    #User Sources -----------------------------------------------------------------
    USR_C_FILES      := $(call rwildcard,$(USR_SRC)/,*.c)
    USR_CPP_FILES    := $(call rwildcard,$(USR_SRC)/,*.cpp)
    USR_S_FILES      := $(call rwildcard,$(USR_SRC)/,*.S)
    USR_OBJ          := $(USR_S_FILES:$(USR_SRC)/%.S=$(USR_BIN)/%.o) $(USR_C_FILES:$(USR_SRC)/%.c=$(USR_BIN)/%.o) $(USR_CPP_FILES:$(USR_SRC)/%.cpp=$(USR_BIN)/%.o)
    
    # Core library sources --------------------------------------------------------
    CORE_CPP_FILES   := $(call rwildcard,$(CORE_SRC)/,*.cpp)
    CORE_C_FILES     := $(call rwildcard,$(CORE_SRC)/,*.c)
    CORE_S_FILES     := $(call rwildcard,$(CORE_SRC)/,*.S)
    CORE_OBJ         := $(CORE_S_FILES:$(CORE_SRC)/%.S=$(CORE_BIN)/%.o) $(CORE_C_FILES:$(CORE_SRC)/%.c=$(CORE_BIN)/%.o) $(CORE_CPP_FILES:$(CORE_SRC)/%.cpp=$(CORE_BIN)/%.o)
    
    
    # User library sources  (see) https://github.com/arduino/arduino/wiki/arduino-ide-1.5:-library-specification
    LIB_DIRS_SHARED  := $(foreach d, $(LIBS_SHARED), $(LIBS_SHARED_BASE)/$d/ $(LIBS_SHARED_BASE)/$d/utility/)  # base and /utility
    LIB_DIRS_SHARED  += $(foreach d, $(LIBS_SHARED), $(LIBS_SHARED_BASE)/$d/src/ $(dir $(call rwildcard,$(LIBS_SHARED_BASE)/$d/src/,*/.)))                           # src and all subdirs of base
    
    LIB_DIRS_LOCAL   := $(foreach d, $(LIBS_LOCAL), $(LIBS_LOCAL_BASE)/$d/ $(LIBS_LOCAL_BASE)/$d/utility/ )    # base and /utility
    LIB_DIRS_LOCAL   += $(foreach d, $(LIBS_LOCAL), $(LIBS_LOCAL_BASE)/$d/src/ $(dir $(call rwildcard,$(LIBS_LOCAL_BASE)/$d/src/,*/.)))                             # src and all subdirs of base
    
    LIB_CPP_SHARED   := $(foreach d, $(LIB_DIRS_SHARED),$(call wildcard,$d*.cpp))
    LIB_C_SHARED     := $(foreach d, $(LIB_DIRS_SHARED),$(call wildcard,$d*.c))
    LIB_S_SHARED     := $(foreach d, $(LIB_DIRS_SHARED),$(call wildcard,$d*.S))
    
    LIB_CPP_LOCAL    := $(foreach d, $(LIB_DIRS_LOCAL),$(call wildcard,$d/*.cpp))
    LIB_C_LOCAL      := $(foreach d, $(LIB_DIRS_LOCAL),$(call wildcard,$d/*.c))
    LIB_S_LOCAL      := $(foreach d, $(LIB_DIRS_LOCAL),$(call wildcard,$d/*.S))
    
    LIB_OBJ          := $(LIB_CPP_SHARED:$(LIBS_SHARED_BASE)/%.cpp=$(LIB_BIN)/%.o)  $(LIB_CPP_LOCAL:$(LIBS_LOCAL_BASE)/%.cpp=$(LIB_BIN)/%.o)
    LIB_OBJ          += $(LIB_C_SHARED:$(LIBS_SHARED_BASE)/%.c=$(LIB_BIN)/%.o)  $(LIB_C_LOCAL:$(LIBS_LOCAL_BASE)/%.c=$(LIB_BIN)/%.o)
    LIB_OBJ          += $(LIB_S_SHARED:$(LIBS_SHARED_BASE)/%.S=$(LIB_BIN)/%.o)  $(LIB_S_LOCAL:$(LIBS_LOCAL_BASE)/%.S=$(LIB_BIN)/%.o)
    
    # Includes -------------------------------------------------------------
    INCLUDE          := -I./$(USR_SRC) -I$(CORE_SRC)
    INCLUDE          += $(foreach d, $(LIB_DIRS_SHARED), -I$d)
    INCLUDE          += $(foreach d, $(LIB_DIRS_LOCAL), -I$d)
    
    # Generate directories --------------------------------------------------------
    DIRECTORIES :=  $(sort $(dir $(CORE_OBJ) $(USR_OBJ) $(LIB_OBJ)))
    generateDirs := $(foreach d, $(DIRECTORIES), $(shell if not exist "$d" mkdir "$d"))
    
    
    #******************************************************************************
    # Rules:
    #******************************************************************************
    
    .PHONY: directories all rebuild upload uploadTy uploadCLI clean cleanUser cleanCore
    
    
    all:  $(TARGET_LST) $(TARGET_HEX)
    
    rebuild: cleanUser all
    
    clean:   cleanUser cleanCore cleanLib
    	@echo $(COL_OK)cleaning done$(COL_RESET)
    
    upload: all
    	@$(UPL_PJRC)
    
    uploadTy: all
    	@$(UPL_TYCMD)
    
    uploadCLI: all
    	@$(UPL_CLICMD)
    
    uploadJLink: all
    	@$(UPL_JLINK)
    
    
    # Core library ----------------------------------------------------------------
    $(CORE_BIN)/%.o: $(CORE_SRC)/%.S
    	@echo $(COL_CORE)CORE [ASM] $(notdir $<) $(COL_ERR)
    	@"$(CC)" $(S_FLAGS) $(INCLUDE) -o $@ -c $<
    
    $(CORE_BIN)/%.o: $(CORE_SRC)/%.c
    	@echo $(COL_CORE)CORE [CC]  $(notdir $<) $(COL_ERR)
    	@"$(CC)" $(C_FLAGS) $(INCLUDE) -o $@ -c $<
    
    $(CORE_BIN)/%.o: $(CORE_SRC)/%.cpp
    	@echo $(COL_CORE)CORE [CPP] $(notdir $<) $(COL_ERR)
    	@"$(CXX)" $(CPP_FLAGS) $(INCLUDE) -o $@ -c $<
    
    $(CORE_LIB) : $(CORE_OBJ)
    	@echo $(COL_LINK)CORE [AR]  $@ $(COL_ERR)
    	@$(AR) $(AR_FLAGS) $@ $^
    	@echo $(COL_OK)Teensy core built successfully &&echo.
    
    # Shared Libraries ------------------------------------------------------------
    $(LIB_BIN)/%.o: $(LIBS_SHARED_BASE)/%.S
    	@echo $(COL_LIB)LIB  [ASM] $(notdir $<) $(COL_ERR)
    	@"$(CC)" $(S_FLAGS) $(INCLUDE) -o $@ -c $<
    
    $(LIB_BIN)/%.o: $(LIBS_SHARED_BASE)/%.cpp
    	@echo $(COL_LIB)LIB  [CPP] $(notdir $<) $(COL_ERR)
    	@"$(CXX)" $(CPP_FLAGS) $(INCLUDE) -o $@ -c $<
    
    $(LIB_BIN)/%.o: $(LIBS_SHARED_BASE)/%.c
    	@echo $(COL_LIB)LIB  [CC]  $(notdir $<) $(COL_ERR)
    	@"$(CC)" $(C_FLAGS) $(INCLUDE) -o $@ -c $<
    
    # Local Libraries -------------------------------------------------------------
    $(LIB_BIN)/%.o: $(LIBS_LOCAL_BASE)/%.S
    	@echo $(COL_LIB)LIB  [ASM] $(notdir $<) $(COL_ERR)
    	@"$(CC)" $(S_FLAGS) $(INCLUDE) -o $@ -c $<
    
    $(LIB_BIN)/%.o: $(LIBS_LOCAL_BASE)/%.cpp
    	@echo $(COL_LIB)LIB  [CPP] $(notdir $<) $(COL_ERR)
    	@"$(CXX)" $(CPP_FLAGS) $(INCLUDE) -o $@ -c $<
    
    $(LIB_BIN)/%.o: $(LIBS_LOCAL_BASE)/%.c
    	@echo $(COL_LIB)LIB  [CC]  $(notdir $<) $(COL_ERR)
    	@"$(CC)" $(C_FLAGS) $(INCLUDE) -o $@ -c $<
    
    # Handle user sources ---------------------------------------------------------
    $(USR_BIN)/%.o: $(USR_SRC)/%.S
    	@echo $(COL_SRC)USER [ASM] $< $(COL_ERR)
    	@"$(CC)" $(S_FLAGS) $(INCLUDE) -o $@ -c $<
    
    $(USR_BIN)/%.o: $(USR_SRC)/%.c
    	@echo $(COL_SRC)USER [CC]  $(notdir $<) $(COL_ERR)
    	@"$(CC)" $(C_FLAGS) $(INCLUDE) -o "$@" -c $<
    
    $(USR_BIN)/%.o: $(USR_SRC)/%.cpp
    	@echo $(COL_SRC)USER [CPP] $(notdir $<) $(COL_ERR)
    	@"$(CXX)" $(CPP_FLAGS) $(INCLUDE) -o "$@" -c $<
    
    # Linking ---------------------------------------------------------------------
    $(TARGET_ELF): $(CORE_LIB) $(LIB_OBJ) $(USR_OBJ)
    	@echo $(COL_LINK)
    	@echo [LD]  $@ $(COL_ERR)
    	@$(CC) $(LD_FLAGS) -o "$@" $(USR_OBJ) $(LIB_OBJ) $(CORE_LIB) $(LIBS)
    	@echo $(COL_OK)User code built and linked to libraries &&echo.
    
    %.lst: %.elf
    	@echo $(COL_LINK)
    	@echo [LST] $@ $(COL_ERR)
    	@$(OBJDUMP) -d -S --demangle --no-show-raw-insn --syms "$<"  > "$@"
    	@echo $(COL_OK)Sucessfully built project$(COL_RESET) &&echo.
    
    %.hex: %.elf
    	@echo $(COL_LINK)[HEX] $@
    	@$(SIZE) "$<"
    	@$(OBJCOPY) -O ihex -R.eeprom "$<" "$@"
    
    
    # Cleaning --------------------------------------------------------------------
    cleanUser:
    	@echo $(COL_LINK)Cleaning user binaries...$(COL_RESET)
    	@if exist $(USR_BIN) rd /s/q "$(USR_BIN)"
    	@if exist "$(TARGET_LST)" del $(subst /,\,$(TARGET_LST))
    
    cleanCore:
    	@echo $(COL_LINK)Cleaning core binaries...$(COL_RESET)
    	@if exist $(CORE_BIN) rd /s/q "$(CORE_BIN)"
    	@if exist $(CORE_LIB) del  $(subst /,\,$(CORE_LIB))
    
    cleanLib:
    	@echo $(COL_LINK)Cleaning user library binaries...$(COL_RESET)
    	@if exist $(LIB_BIN) rd /s/q "$(LIB_BIN)"
    
    # compiler generated dependency info ------------------------------------------
    -include $(CORE_OBJ:.o=.d)
    -include $(USR_OBJ:.o=.d)
    -include $(LIB_OBJ:.o=.d)

  9. #9
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    546
    Tried it with a T3.6, same result, Teensy crashes and vanishes from the bus.

  10. #10
    Senior Member
    Join Date
    Sep 2013
    Location
    Boston, MA
    Posts
    130
    Quote Originally Posted by luni View Post
    Tried it with a T3.6, same result, Teensy crashes and vanishes from the bus.
    That's a sophisticated Makefile.

    As a first pass, remove FLAGS_COM from C_FLAGS and S_FLAGS. You probably don't want to put instrumentation on the core teensy files.

    Can you remove the "@" silencing prefixes and post the output of a clean build so we can see exactly all the commands and command line options that are run?

    I didn't try it on T3.4/6.

  11. #11
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    546
    That's a sophisticated Makefile.
    Yes, grew somehow over the years. Anyway, the flags are automatically extracted from boards.txt so they should be OK.

    Doesn't link without the Flags_COM on the c and assembler files. Do you want me to just remove the -pg on those files?

    Code:
    CORE [CPP] HardwareSerial4.cpp 
    CORE [CPP] AudioStream.cpp 
    CORE [CPP] new.cpp 
    CORE [CPP] WString.cpp 
    CORE [CPP] IPAddress.cpp 
    LIB  [CC]  gmon.c 
    CORE [CPP] WMath.cpp 
    LIB  [CPP] TeensyFile.cpp 
    LIB  [CC]  profil.c 
    CORE [CPP] HardwareSerial6.cpp 
    CORE [CPP] main.cpp 
    LIB  [ASM] profiler.S 
    CORE [CPP] Stream.cpp 
    USER [CPP] main.cpp 
    CORE [AR]  .vsteensy/build/core.a 
    Teensy core built successfully 
    
    
    [LD]  .vsteensy/build/profTest.elf
    .vsteensy/build/core.a(pins_teensy.o): In function `attachInterrupt':
    pins_teensy.c:(.text+0x1b2): undefined reference to `getPinIndex'
    .vsteensy/build/core.a(pins_teensy.o): In function `detachInterrupt':
    pins_teensy.c:(.text+0x23a): undefined reference to `getPinIndex'
    collect2.exe: error: ld returned 1 exit status
    make.exe: *** [makefile:235: .vsteensy/build/profTest.elf] Error 1
    The terminal process terminated with exit code: 2

  12. #12
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    546
    Ah ha, doesn't crash if I apply the -pg only tho the cpp and .S files. Lets see if it outputs something...

  13. #13
    Senior Member
    Join Date
    Sep 2013
    Location
    Boston, MA
    Posts
    130
    Quote Originally Posted by luni View Post
    Yes, grew somehow over the years. Anyway, the flags are automatically extracted from boards.txt so they should be OK.

    Doesn't link without the Flags_COM on the c and assembler files. Do you want me to just remove the -pg on those files?

    Code:
    CORE [CPP] HardwareSerial4.cpp 
    CORE [CPP] AudioStream.cpp 
    CORE [CPP] new.cpp 
    CORE [CPP] WString.cpp 
    CORE [CPP] IPAddress.cpp 
    LIB  [CC]  gmon.c 
    CORE [CPP] WMath.cpp 
    LIB  [CPP] TeensyFile.cpp 
    LIB  [CC]  profil.c 
    CORE [CPP] HardwareSerial6.cpp 
    CORE [CPP] main.cpp 
    LIB  [ASM] profiler.S 
    CORE [CPP] Stream.cpp 
    USER [CPP] main.cpp 
    CORE [AR]  .vsteensy/build/core.a 
    Teensy core built successfully 
    
    
    [LD]  .vsteensy/build/profTest.elf
    .vsteensy/build/core.a(pins_teensy.o): In function `attachInterrupt':
    pins_teensy.c:(.text+0x1b2): undefined reference to `getPinIndex'
    .vsteensy/build/core.a(pins_teensy.o): In function `detachInterrupt':
    pins_teensy.c:(.text+0x23a): undefined reference to `getPinIndex'
    collect2.exe: error: ld returned 1 exit status
    make.exe: *** [makefile:235: .vsteensy/build/profTest.elf] Error 1
    The terminal process terminated with exit code: 2
    Yes. Don't use -pg on c or S files.

    Have you tried it with the Arduino app, as described in README.md?

    Also, I'd like to see the complete command line options used for each compile to see if it's different than how my system compiles it.

    It doesn't seem like the errors shown have anything to do with TeensyGProf. Are you using different "-O" options for different compilations?

  14. #14
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    546
    Ok, writes weird stuff on Serial after the waiting period.
    Next step will be to generate a file out of this. Don't want to install python for that. Is it just binary data which needs to be stored to a file?

  15. #15
    Senior Member
    Join Date
    Sep 2013
    Location
    Boston, MA
    Posts
    130
    Quote Originally Posted by luni View Post
    Ok, writes weird stuff on Serial after the waiting period.
    Next step will be to generate a file out of this. Don't want to install python for that. Is it just binary data which needs to be stored to a file?
    You can look at TeensyFile.cpp for the very simple format used.

    You can also configure it to write out the file using hexadecimal and then use something like hex2bin to convert it to a file.

  16. #16
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    546
    Perfect, I'll have a look into that. Thanks for your support so far.

  17. #17
    Senior Member
    Join Date
    Apr 2014
    Location
    Germany
    Posts
    546
    One remark:
    Would be great if one could pass a Stream into the file generation instead of the hardcoded Serial as output. I more or less always use Serial for PC/Teensy command communication so, having the file spit out on USB Serial is rather troublesome.

  18. #18
    Senior Member
    Join Date
    Sep 2013
    Location
    Boston, MA
    Posts
    130
    Quote Originally Posted by luni View Post
    One remark:
    Would be great if one could pass a Stream into the file generation instead of the hardcoded Serial as output. I more or less always use Serial for PC/Teensy command communication so, having the file spit out on USB Serial is rather troublesome.
    Very interesting. You can, of course, add your own implementation of the TeensyProf_open(), TeensyProf_close() and TeensyProf_write() functions to send the output to where you want. They're very simple and emulate the standard library open(), close() and write(). I have included 4 implementations: SD card, Serial using a binary format, Serial using hex, and Midi using Sysex messages.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •