BELK-AN-001: Asymmetric Multiprocessing (AMP) on Bora – Linux FreeRTOS

From DAVE Developer's Wiki
Revision as of 13:18, 15 September 2015 by U0001 (talk | contribs) (Advanced debugging techniques for AMP Linux+FreeRTOS configuration)

Applies to: Bora, BORA Xpress

History

Version | Date | BELK version | Notes
1.0.0 | November 2013 | 1.1.0 | First release
1.0.1 | November 2013 | 1.1.0 | Added UART0 pinout information; minor fixes
1.1.0 | November 2013 | 1.1.0 | Added support for RPMsg example
1.5.0 | December 2013 | 1.1.0 | Added chapter related to Lauterbach debugger
1.5.1 | January 2014 | 1.1.0 | Minor fixes
1.6.0 | April 2014 | 2.0.0 | Minor fixes; updated for BELK 2.0.0 release

Introduction

This application note describes how to build the software components required for an asymmetric multiprocessing (AMP) configuration that runs Linux on the first Cortex-A9 core and FreeRTOS on the second Cortex-A9 core of the Zynq SoC.

Asymmetric Multiprocessing (AMP) allows a multiprocessor/multicore system to run multiple Operating Systems (OS) that are independent of each other. In other words, each CPU has its own private memory space, which contains the OS and the applications that are to run on that CPU. In addition, there can be some shared memory space that is used for multiprocessor communication. This is contrasted with Symmetric Multiprocessing (SMP), in which one OS runs on multiple CPUs using a public shared memory space. Thanks to AMP, developers can use open-source Linux and FreeRTOS operating systems and the RPMsg Inter Processor Communication (IPC) framework between the Zynq's two high-performance ARM® Cortex™-A9 processors to quickly implement applications that need to deliver deterministic, real-time responsiveness for markets such as automotive, industrial and others with similar requirements. For further information, please refer to this link.

Two examples are provided. The first, HelloWorld, shows basic functionality; the second, an RPMsg-based application, uses more sophisticated techniques to handle inter-processor communication and synchronization. The latter configuration is based on the RPMsg mechanism described in Xilinx document UG978 (v2013.04, April 22, 2013).

A PDF version of this application note can be downloaded here.

AMP on Bora

The following sections detail how to build the software components required for the AMP configuration, with Linux running on the first Cortex-A9 core and FreeRTOS on the second. The prerequisites are:

Building the software components

Vivado project

  • Log in to the development host
  • Assuming that a local repository has not been created yet, clone the remote Bora git repository (the -b option checks out the bora branch directly):
git clone git@git.dave.eu:dave/bora/bora.git -b bora
  • Enter the git directory
  • Switch to bora branch (not required if this is already the current branch):
git checkout bora

Set project directory variable:

export PROJ_DIR=$(pwd)/../bora-build-YYYYMMDD-nobk

Configure Vivado settings (1):

. /opt/Xilinx/Vivado/2013.3/settings64.sh

Launch Vivado with build_project script (2):

vivado -mode tcl -source build_project.tcl -notrace -tclargs "-bitstream"

(1) On a 32-bit system, source the 32-bit settings script instead: . /opt/Xilinx/Vivado/2013.3/settings32.sh

(2) Passing the -tclargs "-bitstream" parameters allows for automatic building of the FPGA bitstream.

FSBL

Once the Vivado project build is complete, export the hardware configuration and start the SDK to build the FSBL. From the SDK GUI:

  • Create a new application project, as shown in the picture below:
AN-BELK-001 01.jpg
  • Configure the application settings as shown in the pictures below:
AN-BELK-001 02.jpg
AN-BELK-001 03.jpg
  • Click Finish to launch the FSBL build process
  • Create the binary from the FSBL ELF by choosing one of the following options:
    • manually launch the command: arm-xilinx-eabi-objcopy -v -O binary $PROJ_DIR/bora.sdk/SDK/SDK_Export/bora_FSBL/Debug/bora_FSBL.elf $PROJ_DIR/bora.sdk/SDK/SDK_Export/bora_FSBL/Debug/bora_FSBL.bin
    • configure automatic binary generation on project build: in Project Explorer, right-click on the bora_FSBL project, select C/C++ Build Settings, and add the command arm-xilinx-eabi-objcopy -v -O binary ${ProjName}.elf ${ProjName}.bin to the Post-build steps

N.B. Whenever the Vivado project is modified, the FPGA bitstream binary must be regenerated with the following command:

python fpga-bit-to-bin.py --flip $PROJ_DIR/bora.runs/bora_run_impl/bora_design_wrapper.bit $PROJ_DIR/bora.runs/bora_run_impl/bora_design_wrapper.bin

FreeRTOS applications

The following sections describe the steps required to configure and build both the HelloWorld and the RPMsg-based examples.

Importing the FreeRTOS repository into the SDK
  • Assuming that a local repository has not been created yet, clone the remote FreeRTOS git repository:
git clone git@git.dave.eu:dave/bora/freertos.git
  • Enter the git directory
  • Switch to freertos-AMP branch:
git checkout freertos-AMP
  • In the SDK GUI, add the new repository via Xilinx Tools->Repositories:
AN-BELK-001 04.jpg
  • Click New... to add a new repository under Local or Global Repositories, and select the FreeRTOS repository directory:
AN-BELK-001 05.jpg
  • Click Rescan Repositories, then Apply and OK
  • At the end of the procedure, applications based on the FreeRTOS operating system can be built

Building Example #1: HelloWorld application

The first example shows basic AMP functionality. On the FreeRTOS side, UART0 is used to implement a simple console. This port is routed via EMIO signals to a pin-strip connector of the BoraEVB. Since these signals are driven by FPGA Bank #34, they use 3.3 V levels. An RS232 transceiver or a USB/UART bridge is therefore required to connect the console to a PC. The signals are routed to the JP17 connector of the BoraEVB as reported below:

  • JP17.4 – UART0_TX
  • JP17.6 – UART0_RX

Please follow the steps listed below to build a HelloWorld application that prints a message on UART0 (via EMIO) on FreeRTOS running on Bora core #2.

  • From the SDK GUI, create a new application project:
AN-BELK-001 01.jpg
  • Configure the application settings as shown in the pictures and table below:
AN-BELK-001 07.jpg
AN-BELK-001 08.jpg
    • Project name: helloworld_freeRTOS
    • Hardware Platform: hw_platform_0
    • Processor: ps7_cortexa9_1
    • OS Platform: freertos_zynq
    • Language: C
    • Board Support Package: Create New
    • Type: FreeRTOS Hello World AMP template
  • Click Finish to launch the application build process
  • Create the binary from the application ELF by choosing one of the following options:
    • manually launch the command: arm-xilinx-eabi-objcopy -v -O binary $PROJ_DIR/bora.sdk/SDK/SDK_Export/helloworld_freeRTOS/Debug/helloworld_freeRTOS.elf $PROJ_DIR/bora.sdk/SDK/SDK_Export/helloworld_freeRTOS/Debug/helloworld_freeRTOS.bin
    • configure the automatic binary generation on project build. In Project Explorer, right-click on helloworld_freeRTOS project and select C/C++ Build Settings and add the command arm-xilinx-eabi-objcopy -v -O binary ${ProjName}.elf ${ProjName}.bin on Post-build steps.

Building Example #2: RPMsg-based application

The procedure for building this application is similar to the one used for the HelloWorld application. The only difference is that the FreeRTOS Latency AMP template must be selected. In this case, please note that:

  • the standard Linux infrastructure is used to load the firmware for the second core
  • Linux starts in SMP mode, running on both cores; CPU1 is then shut down, and the FreeRTOS firmware is loaded and run.

This example application uses the TTC1 timer to measure IRQ latencies, as described in Xilinx UG978. In addition, GPIO0 (pin JP21.16 on the BoraEVB) is toggled every time the ISR is invoked.

Once the build process is complete, the executable file in .elf format is generated (we suggest naming it freertos). Creating the .bin file is not required.

  • Project name: RPMsg_freeRTOS
  • Hardware Platform: hw_platform_0
  • Processor: ps7_cortexa9_1
  • OS Platform: freertos_zynq
  • Language: C
  • Board Support Package: Create New
  • Type: FreeRTOS Latency AMP

To run this example, the Linux kernel (1) must be rebuilt as well (2). First, copy the FreeRTOS executable file in .elf format (freertos) into the firmware directory of the Linux kernel tree (3). Then configure the kernel using bora_amp_defconfig as the configuration file and enter the following command line, which changes the default load address of the kernel and builds both the kernel image and the modules:

bash# make UIMAGE_LOADADDR=0x10008000 uImage modules
[...]
  OBJCOPY arch/arm/boot/zImage
  Kernel: arch/arm/boot/zImage is ready
  UIMAGE  arch/arm/boot/uImage
Image Name:   Linux-3.9.0-bora-1.1.0-xilinx-00
Created:      Thu Nov 21 15:55:07 2013
Image Type:   ARM Linux Kernel Image (uncompressed)
Data Size:    3217192 Bytes = 3141.79 kB = 3.07 MB
Load Address: 10008000
Entry Point:  10008000
  Image arch/arm/boot/uImage is ready

The file arch/arm/boot/uImage is the binary image of the kernel that must be used to boot the system. The following kernel modules, resulting from the kernel build procedure, must be copied from the building directory to the root file system (usually into /lib/modules/<kernel version>/kernel, but any other directory can be used):

  LD [M]  drivers/remoteproc/remoteproc.ko
  LD [M]  drivers/remoteproc/zynq_remoteproc.ko
  LD [M]  drivers/rpmsg/rpmsg_freertos_statistic.ko
  LD [M]  drivers/rpmsg/virtio_rpmsg_bus.ko
  LD [M]  drivers/virtio/virtio.ko
  LD [M]  drivers/virtio/virtio_ring.ko
  LD [M]  net/rpmsg/rpmsg_proto.ko

For further details on kernel modules, please refer to this link.

(1) The kernel branch must be bora.

(2) It is assumed that the development environment is already set up as described in BELK Quick Start Guide.

(3) The name of the binary file copied into the firmware directory must be freertos.

Linux Device Tree

The Flattened Device Tree (FDT) is a data structure for describing the hardware in a system (for further information, please refer to http://elinux.org/Device_Tree). Both Example #1 and Example #2 require some modifications to the standard Bora device tree (to initialize the UART0 port and to properly initialize the RPMsg infrastructure, respectively). Please use the kernel branch bora, which already includes the aforementioned patches (for further details, please refer to the arch/arm/boot/dts/bora.dts file and the commit descriptions in the Linux git repository). For detailed instructions on how to build the Linux kernel and the Device Tree, please refer to the BELK Quick Start Guide TBD.

Running the demo applications

Example #1: HelloWorld application

This section describes how to run the FreeRTOS HelloWorld example application on BORA using AMP (Linux + FreeRTOS). Please follow the steps listed below:

  • Place all the binary files into the host tftp directory:
    • Kernel (1): uImage
    • Device Tree: bora.dtb
    • First stage bootloader: bora_FSBL.bin
    • FPGA bitstream: bora_design_wrapper.bin
    • FreeRTOS application: helloworld_freeRTOS.bin
  • Start the Bora system
  • From the U-Boot shell, update the FSBL with the following commands:
run load_fsbl
run update_fsbl
  • Reset the board to reboot with the new FSBL
  • Add the following U-Boot environment variables (2):
setenv addcons 'setenv bootargs ${bootargs} console=${console},115200n8 cma=16M debug maxcpus=${nr_cpus}'
setenv addmem 'setenv bootargs ${bootargs} mem=$(kernel_mem)'
setenv kernel_mem 1008M
setenv nr_cpus 1
setenv net_nfs 'run program_fpga; run load_freertos; run loadk nfsargs addip addcons addmem; bootm ${loadaddr_kern} - ${loadaddr_ftd}'
setenv load_freertos 'tftp ${freertos_addr} ${freertos_file};mw.l 0xFFFFFFF0 ${freertos_addr}'
setenv freertos_addr 0x3F000000
setenv freertos_file bora/BELK/helloworld_freeRTOS.bin
setenv fpga_file BELK/bora_design_wrapper.bin

Boot the system by running the following command:

run net_nfs

(1) The kernel must be built with the UIMAGE_LOADADDR=0x8000 option. Please refer to section 3.4.3 of the BELK Quick Start Guide.

(2) program_fpga: loads the FPGA binary from TFTP and programs the bitstream

load_freertos: loads the FreeRTOS application binary from TFTP and writes the application start address for core #2

mem=${kernel_mem}: sets the maximum kernel memory (1008M = 1024M - 16M)

maxcpus=${nr_cpus}: sets the maximum number of Linux cores to 1
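
The memory split implied by the mem and freertos_addr settings can be checked with a few lines of arithmetic. The following is a minimal Python sketch to run on the development host. It assumes 1 GiB of DDR mapped from address 0, which this note does not state explicitly but which the 1024M figure above suggests. The function name memory_split is hypothetical.

```python
MIB = 1024 * 1024

TOTAL_RAM = 1024 * MIB       # assumption: 1 GiB of DDR mapped from address 0
FREERTOS_ADDR = 0x3F000000   # freertos_addr in the U-Boot environment
CPU1_START_PTR = 0xFFFFFFF0  # address written by mw.l in load_freertos;
                             # it receives the start address for core #2


def memory_split(total=TOTAL_RAM, freertos_addr=FREERTOS_ADDR):
    """Return (linux_mib, freertos_mib) for a split at freertos_addr."""
    linux_bytes = freertos_addr             # Linux owns [0, freertos_addr)
    freertos_bytes = total - freertos_addr  # the remainder is left to FreeRTOS
    return linux_bytes // MIB, freertos_bytes // MIB


if __name__ == "__main__":
    linux_mib, freertos_mib = memory_split()
    print("mem={}M for Linux, {}M left for FreeRTOS".format(linux_mib, freertos_mib))
```

Under these assumptions the FreeRTOS binary is loaded exactly where the Linux memory ends (0x3F000000 is the 1008 MiB boundary), so the two operating systems do not overlap.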

Example #2: RPMsg-based application

As stated before, this example shows a more sophisticated approach that allows for:

  • using a standardized communication channel between the two cores
  • exploiting a standardized mechanism to load the firmware of second core.

The example performs IRQ latency measurements on the FreeRTOS side by using a hardware timer. These measurements are collected by the counterpart application running on the Linux side and shown on the console. Please follow the steps listed below:

  • Place all the binary files into the host tftp directory:
    • Kernel: uImage
    • Device Tree: bora.dtb
    • First stage bootloader: bora_FSBL.bin
    • FPGA bitstream: bora_design_wrapper.bin
    • FreeRTOS application: freertos
  • Start the Bora system
  • From the U-Boot shell, update the FSBL with the following commands:
run load_fsbl
run update_fsbl
  • Reset the board to reboot with the new FSBL
  • Add the following U-Boot environment variables (1) (2):
setenv addcons 'setenv bootargs ${bootargs} console=${console},115200n8 cma=16M debug'
setenv addmem 'setenv bootargs ${bootargs} mem=$(kernel_mem)'
setenv kernel_mem 496M
setenv net_nfs 'run program_fpga; run loadk nfsargs addip addcons addmem; bootm ${loadaddr_kern} - ${loadaddr_ftd}'
setenv freertos_addr 0x3F000000
setenv fpga_file BELK/bora_design_wrapper.bin
  • Boot the system by running the following command:

run net_nfs

When booting, the Linux kernel will print out the following message to indicate it has been relocated to address 0x10000000:

[    0.000000] Machine: Xilinx Zynq Platform, model: Bora
[    0.000000] Change memory bank to 10000000-2fffffff
[    0.000000] cma: CMA: reserved 16 MiB at 2f000000
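
The addresses in these boot messages can be cross-checked against the UIMAGE_LOADADDR value used when building the kernel. The sketch below is a minimal Python illustration with hypothetical function names. It only does arithmetic on the values printed above and makes no claim about how the kernel computes them.

```python
MIB = 1024 * 1024

BANK_START = 0x10000000  # from "Change memory bank to 10000000-2fffffff"
BANK_END = 0x2FFFFFFF    # inclusive end address from the same message
CMA_BASE = 0x2F000000    # from "cma: CMA: reserved 16 MiB at 2f000000"
LOAD_ADDR = 0x10008000   # UIMAGE_LOADADDR passed to make


def bank_size_mib(start=BANK_START, end=BANK_END):
    """Size of the memory bank in MiB (end address is inclusive)."""
    return (end - start + 1) // MIB


def cma_size_mib(cma_base=CMA_BASE, end=BANK_END):
    """Size in MiB of the region from the CMA base to the end of the bank."""
    return (end - cma_base + 1) // MIB


def load_offset(load_addr=LOAD_ADDR, start=BANK_START):
    """Offset of the kernel load address from the start of the bank."""
    return load_addr - start


if __name__ == "__main__":
    print("bank: {} MiB, CMA: {} MiB, load offset: {:#x}".format(
        bank_size_mib(), cma_size_mib(), load_offset()))
```

The bank spans 512 MiB, the CMA region occupies its top 16 MiB, and the kernel load address sits 0x8000 above the bank start, the same offset used for the non-relocated kernel of Example #1.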

To start the example, please enter the following commands on Linux side to load the required modules:

insmod  drivers/virtio/virtio.ko
insmod  drivers/virtio/virtio_ring.ko
insmod  drivers/rpmsg/virtio_rpmsg_bus.ko
insmod  net/rpmsg/rpmsg_proto.ko
insmod  drivers/remoteproc/remoteproc.ko
insmod  drivers/remoteproc/zynq_remoteproc.ko
insmod  drivers/rpmsg/rpmsg_freertos_statistic.ko

The Linux kernel will print the following messages, indicating that communication between the two cores has been established:

[ 17.966158] NET: Registered protocol family 41
[ 18.036698] CPU1: shutdown
[ 18.045287] remoteproc0: 0.remoteproc-test is available
[ 18.050522] remoteproc0: Note: remoteproc is still under development and considered experimental.
[ 18.059554] remoteproc0: THE BINARY FORMAT IS NOT YET FINALIZED, and backward compatibility isn't yet guaranteed.
[ 18.077341] remoteproc0: powering up 0.remoteproc-test
[ 18.082668] remoteproc0: Booting fw image freertos, size 2357682
[ 18.103607] remoteproc0: remote processor 0.remoteproc-test is now up
[ 18.113339] virtio_rpmsg_bus virtio0: rpmsg host is online
[ 18.118795] remoteproc0: registered virtio0 (type 7)
[ 18.124417] virtio_rpmsg_bus virtio0: creating channel rpmsg-timer-statistic addr 0x50
[ 18.151586] rpmsg_freertos_statistic rpmsg0: new channel: 0x400 -> 0x50!

Then run the latencystat application as shown below. The typical output looks like this:

root@bora:~# ./latencystat -b
Linux FreeRTOS AMP Demo.
   0: Command 0 ACKed
   1: Command 1 ACKed
Waiting for samples...
   2: Command 2 ACKed
   3: Command 3 ACKed
   4: Command 4 ACKed
-----------------------------------------------------------
Histogram Bucket Values:
        Bucket 323 ns (36 ticks) had 38 frequency
        Bucket 341 ns (38 ticks) had 299 frequency
        Bucket 512 ns (57 ticks) had 1 frequency
        Bucket 746 ns (83 ticks) had 1 frequency
-----------------------------------------------------------
Histogram Data:
        min: 323 ns (36 ticks)
        avg: 332 ns (37 ticks)
        max: 746 ns (83 ticks)
        out of range: 0
        total samples: 339
-----------------------------------------------------------
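
The relationship between ticks and nanoseconds in this output can be reproduced with a short calculation. The Python sketch below is illustrative only: the timer clock of 111111115 Hz is an assumption, not a value stated in this note, chosen because it reproduces the figures above when the division truncates. Check the actual TTC clock of your design before relying on it. The function names are hypothetical.

```python
TIMER_HZ = 111111115  # assumption: TTC input clock, not stated in this note


def ticks_to_ns(ticks, timer_hz=TIMER_HZ):
    """Convert timer ticks to nanoseconds using truncating integer division."""
    return ticks * 1000000000 // timer_hz


def summarize(buckets):
    """Summarize a {ticks: count} histogram as (min, avg, max, total)."""
    total = sum(buckets.values())
    weighted = sum(ticks * count for ticks, count in buckets.items())
    return min(buckets), weighted // total, max(buckets), total


# Buckets copied from the sample output above: ticks -> number of samples.
SAMPLE = {36: 38, 38: 299, 57: 1, 83: 1}

if __name__ == "__main__":
    lo, avg, hi, total = summarize(SAMPLE)
    for label, ticks in (("min", lo), ("avg", avg), ("max", hi)):
        print("{}: {} ns ({} ticks)".format(label, ticks_to_ns(ticks), ticks))
    print("total samples: {}".format(total))
```

With the sample buckets, the script prints the same min, avg, max and total values as the Histogram Data section above.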

This application is extremely useful for evaluating how CPU load on the first core affects IRQ latency. If the latency does not satisfy the real-time requirements, it may be necessary to adjust the arbitration priorities of the processor's interconnect subsystem. For further details, please refer to the Interconnect chapter of the Zynq Technical Reference Manual.

N.B. Before launching the latencystat application, make sure that the cpufreq governor is set to performance with the following command:

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
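
To confirm that the setting took effect, the same sysfs file can be read back. The following minimal Python sketch assumes that a Python interpreter is available on the target root file system, which this note does not guarantee. The function names are hypothetical. Reading the file with cat gives the same information.

```python
import os


def read_governor(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return the current cpufreq governor for a CPU, or None if unavailable."""
    path = os.path.join(sysfs_root, "cpu{}".format(cpu),
                        "cpufreq", "scaling_governor")
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        # cpufreq not enabled, or the CPU does not exist
        return None


def is_performance(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """True if the given CPU currently uses the performance governor."""
    return read_governor(cpu, sysfs_root) == "performance"
```

Calling is_performance() before starting latencystat gives a quick check that frequency scaling will not distort the measurements.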

(1) program_fpga: Loads FPGA binary from TFTP and programs the bitstream

load_freertos: Loads freertos application binary from TFTP and writes application start address for core #2

(2) Please note that, using the RPMsg mechanism, it's not required to set the maxcpus=${nr_cpus} variable.

Advanced debugging techniques for AMP Linux+FreeRTOS configuration

Introduction

When working with complex real-time configurations such as AMP Linux+FreeRTOS, debugging requirements increase dramatically. This chapter, written in collaboration with Lauterbach SRL, shows how these issues can be tackled with the Lauterbach TRACE32® debugger (1). The following picture shows the BoraEVB connected to a Lauterbach PowerDebug Interface/USB3 via the J18 connector. By default, the board is configured to chain the Xilinx PL TAP and the ARM DAP (please refer to chapter “JTAG and DAP Subsystem” of the Zynq Technical Reference Manual for more details).


BoraEVB connected to Lauterbach PowerDebug Interface/USB3

(1) The techniques described in this chapter apply to the Example #1: HelloWorld FreeRTOS application (please refer to section TBD).

Additional resources