MITO8M-AN-001: Advanced multicore debugging, tracing, and energy profiling with Lauterbach TRACE32

Applies to MITO 8M
Warning: This Application Note was validated against specific versions of hardware and software. What is described here may not work with other versions.

History

Version   Date        Notes
1.0.0     June 2021   First public release
1.0.1     June 2021   Added webinar recording

Introduction

This Application Note (AN) is associated with this webinar organized by Lauterbach Italy, DAVE Embedded Systems, and NXP in May 2021.

The webinar showed advanced techniques used to debug, trace, and energy profile the code executed by the NXP i.MX8M SoC powering the Mito8M system-on-module (SoM) with the aid of Lauterbach tools.

Specifically, this article deals with JTAG debugging, which is a stop-mode technique. Unlike run-mode debuggers such as GDB, a stop-mode debugger halts the processor, for example to execute the code step by step or when a breakpoint is hit. Another fundamental characteristic of this mode is that the processor cooperates with the debugger in controlling the execution of the code and in accessing internal resources (CPU registers, RAM/flash memories, on-chip peripheral registers, etc.). For more details on JTAG debugging, please refer to this document.

Tracing is an extremely powerful technique that does more than "enhance" debugging. It is also an outstanding troubleshooting tool and enables advanced measurements, profiling, and specific software-related testing such as code coverage. You can think of it as a sort of "movie" that is shot in real time and non-intrusively while the processor is running. Afterwards, it lets engineers scrutinize all of the processor's activity around specific events that occurred in the past. This is especially useful for analyzing errors, bugs, or other situations in which the processor cannot be stopped. Three different tracing strategies are illustrated here:

  • Onchip trace
  • Offchip trace via Trace Port Interface Unit (TPIU)
  • Offchip trace via PCIe.

In this AN, tracing is also used to show how to carry out energy profiling, that is, how to correlate power consumption with the execution of the code.

The hardware platform used for this demonstration consists of the MITO 8M Evaluation Kit. This kit comprises three different boards: the Mito8M SoM, the SBCX carrier board, and an adapter board.

It is also worth remembering that the Mito8M SoM is electrically and mechanically compatible with the i.MX6-based AxelLite SoM. Therefore, Mito8M can be mated with the SBCX carrier board directly. Nevertheless, an adapter board is required to fully leverage all the functionalities it provides: the adapter board gives access to the physical interfaces used for the purposes discussed here.

Regarding the software configuration, a typical Linux SMP setup was used: Linux kernel 4.14.98 together with a root file system built with Yocto Sumo.

The debugging/tracing/energy profiling tools

This chapter details the Lauterbach tools used for this demonstration.

Prerequisites

Debug and Onchip trace

  • LA-3500 PowerDebug USB3 or LA-3505 PowerDebug PRO Ethernet
  • LA-3743 Debugger for Cortex-A/R (Armv8 and Armv9)
  • TRACE32 PowerView for ARM (Release ≥ Sep 2020, Software Version: R.2020.09.000128638)
  • Optional: LA-7844X Debug Cortex-M (ARMv6/7/8 32-bit) Ext.
  • Optional: LA-7970X Trace License for ARM (Debug Cable)

Debug and offchip trace via TPIU (parallel Trace Port Interface Unit)

  • LA-3505 PowerDebug PRO Ethernet + LA-7692 PowerTrace II 1 GigaByte
  • LA-3743 Debugger for Cortex-A/R (Armv8 and Armv9)
  • LA-7992 Preproc. for ARM-ETM/AUTOFOCUS II 600 Flex or LA-7993 Preproc. for ARM-ETM/AUTOFOCUS 600 MIPI
  • TRACE32 PowerView for ARM (Release ≥ Sep 2020, Software Version: R.2020.09.000128638)
  • Optional: LA-7844X Debug Cortex-M (ARMv6/7/8 32-bit) Ext.
  • Optional: LA-7949 Analog Probe for PI/PT-II/CP/MicroTrace

TRACE32 PowerDebug PRO, PowerTrace II, LA-7992 preprocessor for ARM/ETM, MITO 8M board

Debug and offchip trace via PCIe (PCI Express)

  • LA-3505 PowerDebug PRO Ethernet + LA-3520 PowerTrace Serial 4 GigaByte for ARM-ETM
  • LA-3743 Debugger for Cortex-A/R (Armv8 and Armv9)
  • LA-3550X License for PCI Express
  • LA-3522 Accessories for PTSerial for ARM-ETM 7-8Lanes
  • LA-3527 PTSERIAL-PCIe x1 Slot-Card-Converter
  • TRACE32 PowerView for ARM (Release ≥ Sep 2020, Software Version: R.2020.09.000128638)
  • Optional: LA-7844X Debug Cortex-M (ARMv6/7/8 32-bit) Ext.
  • Optional: LA-7949 Analog Probe for PI/PT-II/CP/MicroTrace

TRACE32 PowerDebug PRO, PowerTrace Serial, PTSERIAL-PCIe x1 Slot-Card-Converter, MITO 8M board

For a general introduction to debug features provided by TRACE32 tools, please refer to

  • Debugger Basics – Training manual (training_debugger.pdf)
  • Training HLL Debugging manual (training_hll.pdf)

Debugging and tracing

The following sections describe how to configure the Lauterbach TRACE32® debugger to support debug and trace of Linux running on the quad-core i.MX 8M Application Processor by NXP Semiconductors.

TRACE32 configuration

In an SMP system, the operating system dynamically assigns tasks to the cores. To debug an SMP system, only one TRACE32 instance is opened, and all cores are controlled from this instance.

TRACE32 PowerView distinguishes two types of information:

  • Core-specific information, which is displayed on a colored background. Typical core-specific information includes register contents, the source listing of the code currently executed by the core, and the stack frame. TRACE32 PowerView uses predefined color settings for the cores.
  • Information common to all cores, which is displayed on a white background. Typical common information includes memory contents, values of variables, and breakpoint settings.

A selector is available to switch the core-specific information to a selected core.

[Figure: TRACE32 selector]

On an SMP system, program execution is started on all cores with Go and stopped with Break. The debugger programs the same onchip breakpoints on all cores at the same time. If a breakpoint is hit, TRACE32 selects the core on which the breakpoint occurred.

Startup scripts

Reference scripts for startup and for configuring the Analog Probe are available for download: please refer to the Resources section below. For more details about the PRACTICE batch language, please refer to

  • Training PRACTICE manual (training_practice.pdf)
  • PRACTICE Script Language User's Guide (practice_user.pdf)
  • PRACTICE Script Language Reference Guide (practice_ref.pdf)

Setting up the Linux debug configuration

Symbolic information is needed for HLL debugging: setting breakpoints, stepping through the code, viewing variables, and many other debugging tasks.

The compiler must be configured to generate debug symbols, and the vmlinux file of the running kernel must be available in order to load the kernel debug symbols. No instrumentation of the kernel source code is needed for debugging with Lauterbach debuggers, but it is important that the vmlinux file comes from the same kernel build as the kernel image (for example Image, zImage, or uImage) running on the system.

Specific TrOnchip options must be configured to prevent the TRACE32 debugger from breaking automatically when prefetch abort or data abort exceptions occur during normal Linux operation (for example, because of demand paging).

In the following paragraphs, the basic TRACE32 Linux configuration will be introduced. For more details, please refer to

  • Training Linux Debugging manual (training_rtos_linux.pdf)
  • RTOS Debugger for Linux - Stop Mode manual (rtos_linux_stop.pdf)

Kernel awareness

The TRACE32 RTOS kernel awareness technology makes the debugger aware of the OS running in the target system. Debugging is significantly simplified, as the user can immediately access all the components of the OS and the applications. The Executable and Linkable Format (ELF) binary image, created at kernel build time, is also used by the TRACE32 kernel awareness for Linux.

While configuring the TRACE32 kernel awareness for Linux, a specific menu file for Linux can be loaded which includes many useful menu items developed for the TRACE32 GUI to ease Linux debugging.

[Figure: TRACE32 Linux menu]

MMU support

On embedded Linux, the Lauterbach debuggers integrate very tightly with the OS. The kernel awareness supports the Linux MMU format and is able to handle virtual memory addressing.

To provide full debugging capabilities, the debugger has to know how virtual addresses are translated to physical addresses. With an OS that runs several processes at the same logical addresses (e.g. Linux), the hardware MMU in the CPU only holds the translation tables for the kernel and the currently running process, so through the hardware MMU the debugger can access only their code and data. The OS itself maintains the translation tables for all processes, because it is responsible for reprogramming the hardware MMU on a process switch. Using the information in these OS MMU tables, the debugger can also access the code and data of a process that is not currently running. TRACE32 provides an automatic table-walk method that walks through the OS MMU tables to find a valid logical-to-physical translation whenever it is not already cached in TRACE32.
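The table-walk idea described above can be illustrated with a minimal Python sketch. This is a toy model, not TRACE32 code: the page tables are nested dictionaries, the layout (4 KiB pages, 4 levels, 9 index bits per level) is an assumption that mirrors one common AArch64 configuration, and the names `table_walk`, `Translator`, `proc_a`, and `proc_b` are hypothetical.

```python
# Toy model of an OS page-table walk with a small translation cache.
# Assumptions (illustrative only): 4 KiB pages, 4 table levels, 9 index bits
# per level; tables are nested dicts, leaf entries are physical frame numbers.
PAGE_SHIFT = 12
INDEX_BITS = 9
LEVELS = 4
OFFSET_MASK = (1 << PAGE_SHIFT) - 1
INDEX_MASK = (1 << INDEX_BITS) - 1


def table_walk(root, vaddr):
    """Walk the tables from the top level down; return the frame number or None."""
    vpn = vaddr >> PAGE_SHIFT
    entry = root
    for level in reversed(range(LEVELS)):
        index = (vpn >> (INDEX_BITS * level)) & INDEX_MASK
        entry = entry.get(index)
        if entry is None:  # no valid translation at this level
            return None
    return entry


class Translator:
    """Caches translations per (process, virtual page); walks the tables on a miss."""

    def __init__(self, tables_by_process):
        self.tables = tables_by_process
        self.cache = {}

    def translate(self, process, vaddr):
        key = (process, vaddr >> PAGE_SHIFT)
        if key not in self.cache:
            frame = table_walk(self.tables[process], vaddr)
            if frame is None:
                return None
            self.cache[key] = frame
        return (self.cache[key] << PAGE_SHIFT) | (vaddr & OFFSET_MASK)


# Two hypothetical processes map the same virtual page to different frames.
tables = {
    "proc_a": {0: {0: {2: {1: 0x80123}}}},
    "proc_b": {0: {0: {2: {1: 0x90456}}}},
}
mmu = Translator(tables)
print(hex(mmu.translate("proc_a", 0x401234)))  # 0x80123234
print(hex(mmu.translate("proc_b", 0x401234)))  # 0x90456234
```

The same virtual address resolves to two different physical addresses depending on whose tables are walked, which is why the debugger needs the OS-maintained tables to reach a process that is not currently running.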

[Figure: TRACE32 MMU page tables]

Debug of kernel modules

The Linux kernel can be compiled to allow additional modules (kernel objects) to be linked at runtime. The Lauterbach debuggers also support debugging of kernel modules, starting from the initialization function.

[Figure: TRACE32 debug of kernel modules]

Debug of user processes, threads, shared objects

User process debugging is also available, starting from the very beginning of the process. If the process loads shared objects, they are loaded in the process address space when the related instructions are executed for the first time (demand paging).

The Lauterbach debuggers also support debugging of threads in multithreaded processes. In this case, the threads share the same address space, so the symbolic information needs to be loaded only once per process.

In general, the same techniques used for debugging kernel code, such as setting breakpoints, stepping through code, watching variables, and viewing memory contents, can be performed in the same way for processes and tasks.

Virtual address spaces are distinguished in TRACE32 using the concept of spaceID. Memory addresses are extended with the lower 16 bits of the process ID (PID), which makes it possible to distinguish between identical virtual addresses belonging to different processes.
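A minimal Python sketch of this address extension is shown below. It only models the arithmetic stated above (lower 16 bits of the PID); the function names, the example PIDs, and the printed `spaceID:address` layout are illustrative assumptions, not the exact TRACE32 notation.

```python
def space_id(pid):
    """Return the lower 16 bits of the PID, used here as the space ID."""
    return pid & 0xFFFF


def extended_address(pid, vaddr):
    """Pair a virtual address with the space ID of the owning process."""
    return (space_id(pid), vaddr)


def format_address(pid, vaddr):
    """Illustrative text form only; the real debugger notation may differ."""
    return f"0x{space_id(pid):04X}:0x{vaddr:08X}"


# Two hypothetical processes using the same virtual address.
same_vaddr = 0x00400000
print(format_address(412, same_vaddr))   # 0x019C:0x00400000
print(format_address(1377, same_vaddr))  # 0x0561:0x00400000
```

Although both processes use the virtual address 0x00400000, the extended addresses differ, so symbols and breakpoints can be attached to the right process.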

[Figure: TRACE32 debug of user process]

Program trace

The ETM (Embedded Trace Macrocell) is an ARM CoreSight component which outputs program (and optionally data) trace information about the cores’ activity. The ETM trace information can be stored internally (onchip trace) or externally (offchip trace) using an external recording device.

Many trace views are available in TRACE32. If the /Track option is used, a trace window follows the time/record synchronization of the other trace windows. Many statistical views are also available.

For more details, please refer to

  • ARM-ETM Training manual (training_arm_etm.pdf)

Onchip trace

The NXP i.MX 8M implements the ETF (Embedded Trace FIFO) and the ETR (Embedded Trace Router), which are CoreSight hardware components providing onchip trace functionality.

  • The ETF is a FIFO for trace data on the chip to moderate the peak bandwidth the trace sinks need to handle. The ETF can alternatively be used as ETB (Embedded Trace Buffer) to store trace data on chip.
  • The ETR can send the trace data stream to a memory location on the AXI bus: in this way you can use the DRAM as a big onchip trace memory.
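How a FIFO moderates the peak bandwidth seen by a trace sink can be sketched with a small Python simulation. The numbers are made up and the units are abstract "trace units per tick"; `max_fifo_fill` is a hypothetical helper, not a model of the actual ETF hardware.

```python
def max_fifo_fill(arrivals, drain_per_tick):
    """Return the highest FIFO fill level for a bursty producer and a steady sink.

    arrivals: trace units produced in each tick.
    drain_per_tick: units the sink can absorb in each tick.
    """
    fill = 0
    peak = 0
    for produced in arrivals:
        fill += produced                       # burst enters the FIFO
        peak = max(peak, fill)
        fill = max(0, fill - drain_per_tick)   # sink drains at a fixed rate
    return peak


# Hypothetical bursty trace source: peaks of 8 units per tick, long idle gaps.
bursts = [8, 8, 0, 0, 0, 0, 8, 0, 0, 0]
print(max_fifo_fill(bursts, drain_per_tick=3))  # 13
print(max_fifo_fill(bursts, drain_per_tick=8))  # 8
```

In this toy example a sink that handles only 3 units per tick keeps up with bursts of 8 units per tick, provided the FIFO can hold 13 units.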

The trace data can be read out via JTAG, when the trace recording has ended.

[Figure: TRACE32 onchip trace]

Offchip trace: TPIU

The ETM trace information can be combined with other trace sources or passed directly offchip through the Trace Port Interface Unit (TPIU), where it is captured by a trace port analyzer (ETM preprocessor).

The Lauterbach ETM preprocessors implement an AutoFocus technology that calibrates the trace-data sampling points in order to compensate for effects such as wave reflections, component tolerances, different trace lengths, limited pad driver capabilities, and signal coupling, all of which affect the sampling of high-speed parallel buses.

[Figure: TRACE32 offchip trace]

[Figure: TRACE32 offchip trace, part 2]

Offchip trace: PCIe

PowerTrace Serial is an extension to the debug module TRACE32 PowerDebug PRO and offers a trace memory of up to 4 GByte, which can record trace data conveyed offchip via a serial trace port.

The i.MX 8M Application Processor implements a PCIe Gen 2 interface, which supports a speed of 5 GT/s. The PowerTrace Serial can be configured as a PCIe endpoint, and its PCIe memory can be mapped in the ETR. In this way, the program flow trace exported via the ETR can be stored offchip in the PowerTrace Serial memory.
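To put the 5 GT/s figure into perspective, the following Python sketch estimates an upper bound for the trace bandwidth. It relies on the 8b/10b line coding used by PCIe Gen 2; the single-lane default, the reading of "4 GByte" as 4 GiB, and the function names are assumptions, and the result ignores PCIe protocol overhead, so real throughput is lower.

```python
def pcie_payload_bytes_per_s(gt_per_s, lanes=1, data_bits=8, line_bits=10):
    """Upper bound on payload bytes per second for an 8b/10b-coded link.

    Each transfer carries one line bit per lane; 8 of every 10 line bits
    are payload. Packet and protocol overhead are ignored.
    """
    line_bits_per_s = gt_per_s * 1_000_000_000 * lanes
    return line_bits_per_s * data_bits // (line_bits * 8)


def seconds_to_fill(buffer_bytes, bytes_per_s):
    """Time needed to fill a trace memory at a constant data rate."""
    return buffer_bytes / bytes_per_s


rate = pcie_payload_bytes_per_s(5)   # PCIe Gen 2, one lane (assumed)
print(rate)                          # 500000000 bytes/s, i.e. 500 MB/s
print(seconds_to_fill(4 * 2**30, rate))
```

At this idealized rate, a 4 GiB trace memory would fill in under nine seconds of continuous streaming, which gives a feeling for how much program flow can be captured in one recording.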

[Figure: TRACE32 PCIe]

Energy profiling

Using a Lauterbach Analog Probe, it’s possible to measure up to 4 voltage channels (1 at a time, shared with current channels) and 3 current channels (1 at a time, shared with voltage channels; shunt resistor required).

The MITO 8M carrier board is equipped with a connector compatible with the Analog Probe layout. This makes it possible to profile the energy consumed by the running application by analyzing the instantaneous power consumption, which can be correlated with the program flow trace. The purpose is:

  • to detect unexpected consumption peaks
  • to check the power saving modes
  • to optimize the code to reduce energy consumption

Specific views are available in TRACE32 to analyze the instantaneous power and the energy consumed by a given function. Additional settings are available for triggering, e.g. when the voltage or current rises above an upper limit or falls below a lower limit.
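The underlying arithmetic can be sketched in a few lines of Python: the current is derived from the voltage drop across the shunt resistor, power is voltage times current, and energy is power integrated over time (here with the trapezoidal rule). All sample values, the shunt value, the function spans, and the helper names are hypothetical; this is not how TRACE32 computes its views internally.

```python
def power_samples(volts, shunt_volts, r_shunt):
    """Instantaneous power: P = V * I, with I = V_shunt / R_shunt."""
    return [v * (vs / r_shunt) for v, vs in zip(volts, shunt_volts)]


def energy_joules(times, power):
    """Integrate power over time with the trapezoidal rule."""
    return sum((t1 - t0) * (p0 + p1) / 2
               for t0, t1, p0, p1 in zip(times, times[1:], power, power[1:]))


def energy_by_function(times, power, spans):
    """Sum energy per function; spans are (name, first_index, last_index)."""
    totals = {}
    for name, first, last in spans:
        part = energy_joules(times[first:last + 1], power[first:last + 1])
        totals[name] = totals.get(name, 0.0) + part
    return totals


def above_limit(power, limit):
    """Indices of samples whose power exceeds an upper limit (trigger idea)."""
    return [i for i, p in enumerate(power) if p > limit]


# Hypothetical measurement: 3.3 V rail, 0.1 ohm shunt, 1 ms sample period.
times = [0.000, 0.001, 0.002, 0.003, 0.004]
volts = [3.3] * 5
shunt_volts = [0.010, 0.010, 0.030, 0.030, 0.010]
power = power_samples(volts, shunt_volts, r_shunt=0.1)

# Hypothetical mapping from the program flow trace to sample indices.
spans = [("idle_loop", 0, 1), ("fft_compute", 1, 4)]
print(energy_joules(times, power))
print(energy_by_function(times, power, spans))
print(above_limit(power, 0.5))  # [2, 3]
```

The per-function totals are what makes the correlation with the program flow trace useful: a consumption peak can be attributed to the code that was executing when it occurred.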

[Figure: TRACE32 energy profiling]

Summary view

The picture below summarizes all the concepts discussed above in a single global view of the TRACE32 debugger.

[Figure: TRACE32 summary view]

Resources

The TRACE32 scripts used for this demonstration can be downloaded here.

The webinar recording is also available. It is in Italian, but English subtitles can be enabled.

Webinar recording

References


This article was written mainly by the Lauterbach Italian branch office.

Contact information:

Lauterbach SRL

Via Caldera 21

20153 Milan (Italy)

Tel. +39 02 45490282