ML-TN-006 — Keyword Spotting and Asymmetric Multiprocessing on Orca SBC

Info Box
Applies to: Machine Learning


History

Version   Date            Notes
1.0.0     December 2021   First public release

Introduction

This Technical Note (TN) describes a demo application that shows how to combine an inference algorithm, namely keyword spotting, with an asymmetric multiprocessing (AMP) scheme. This use case can serve as the basis for more complex applications that have to carry out the following tasks:

  • Acquiring data from sensors in real time
  • Executing a computationally expensive inference algorithm on the collected data.

This scenario is quite common in the realm of AI at the edge but, generally, it cannot be addressed with a microcontroller-based solution, because running the inference algorithm would take too long. On the other hand, a classic embedded processor running a complex operating system such as Linux might not be suitable either, because it cannot properly handle tasks with tight real-time constraints.

In such cases, the power and the flexibility of the NXP i.MX8M Plus can be of much help, as this SoC features a heterogeneous architecture — an ARM Cortex-A53 complex and an ARM Cortex-M7 core — and a Neural Processing Unit (NPU).

The idea is to exploit the heterogeneous architecture of the i.MX8M Plus to implement an AMP configuration where

  • The Cortex-A53 complex — running Yocto Linux — is devoted to the inference algorithm, leveraging the NPU hardware acceleration (a minimal inference sketch follows this list)
  • The Cortex-M7 core takes care of data acquisition.
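
To make the first point more concrete, the following is a minimal sketch of how the keyword-spotting inference could be invoked on the Cortex-A53 side using the TensorFlow Lite C API. The model file name, the feature-map shape, and the number of keywords are assumptions made for illustration, not details of this demo; on the i.MX8M Plus, NPU acceleration is normally obtained through an NXP-provided delegate, which is omitted here for brevity.

/*
 * Minimal D1-side inference sketch using the TensorFlow Lite C API.
 * "kws_model.tflite", the 49x10 feature map, and the 12 output classes
 * are assumptions for illustration only. Enabling the NPU (e.g. through
 * NXP's delegate) is not shown.
 */
#include <stdio.h>
#include <stdlib.h>
#include "tensorflow/lite/c/c_api.h"

int main(void)
{
    TfLiteModel *model = TfLiteModelCreateFromFile("kws_model.tflite");
    if (model == NULL) {
        fprintf(stderr, "cannot load model\n");
        return EXIT_FAILURE;
    }

    TfLiteInterpreterOptions *options = TfLiteInterpreterOptionsCreate();
    TfLiteInterpreter *interpreter = TfLiteInterpreterCreate(model, options);
    TfLiteInterpreterAllocateTensors(interpreter);

    /* Feed one preprocessed audio frame (e.g. MFCC features) to the model. */
    TfLiteTensor *input = TfLiteInterpreterGetInputTensor(interpreter, 0);
    float features[49 * 10] = {0};
    TfLiteTensorCopyFromBuffer(input, features, sizeof(features));

    TfLiteInterpreterInvoke(interpreter);

    /* Retrieve the per-keyword scores. */
    const TfLiteTensor *output = TfLiteInterpreterGetOutputTensor(interpreter, 0);
    float scores[12] = {0};
    TfLiteTensorCopyToBuffer(output, scores, sizeof(scores));
    printf("score of keyword 0: %f\n", scores[0]);

    TfLiteInterpreterDelete(interpreter);
    TfLiteInterpreterOptionsDelete(options);
    TfLiteModelDelete(model);
    return EXIT_SUCCESS;
}

In the actual demo the input frame comes from the Cortex-M7 side, as described in the next section.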

Testbed

The testbed is illustrated in the following picture. Basically, it consists of an Orca Single Board Computer


As stated previously, the inference algorithm is keyword spotting. The data being processed are thus audio samples retrieved by the Cortex-M7 and sent to the Cortex-A53 complex.

From a software perspective, we identify two different domains:

  • D1, which refers to the Yocto Linux world running on the Cortex-A53 complex (a minimal D1-side receive sketch follows this list)
  • D2, which refers to the firmware running on the Cortex-M7 core (a D2-side send sketch follows the next paragraph).
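
To illustrate the D1 end of the data path, here is a minimal receive sketch. It assumes the M7 firmware exposes the audio stream through the standard i.MX RPMsg tty channel; the device name /dev/ttyRPMSG0 and the amount of data accumulated per iteration are assumptions rather than details taken from this demo.

/*
 * Minimal D1 (Linux, Cortex-A53) receive sketch, assuming the audio stream
 * arrives over the i.MX RPMsg tty channel. /dev/ttyRPMSG0 and FRAME_BYTES
 * are assumptions. A real application would also configure the tty (raw
 * mode) and loop forever instead of reading a single frame.
 */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define FRAME_BYTES 1024u   /* assumed amount of audio handed to feature extraction */

int main(void)
{
    int fd = open("/dev/ttyRPMSG0", O_RDWR);
    if (fd < 0) {
        perror("open /dev/ttyRPMSG0");
        return EXIT_FAILURE;
    }

    /* Send a short handshake so D2 learns the address of this endpoint (see the D2 sketch). */
    if (write(fd, "start", 5) != 5) {
        perror("write");
        close(fd);
        return EXIT_FAILURE;
    }

    uint8_t frame[FRAME_BYTES];
    size_t got = 0;

    /* Accumulate one full frame: read() may return fewer bytes than requested. */
    while (got < FRAME_BYTES) {
        ssize_t n = read(fd, frame + got, FRAME_BYTES - got);
        if (n <= 0) {
            perror("read");
            close(fd);
            return EXIT_FAILURE;
        }
        got += (size_t)n;
    }

    printf("received %zu bytes of audio from D2\n", got);
    /* ...hand the frame over to the keyword-spotting pipeline... */

    close(fd);
    return EXIT_SUCCESS;
}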

For the sake of simplicity, the audio samples are not captured by a real microphone. They are retrieved from prefilled memory buffers that cannot be accessed by the Cortex-A53 cores. For the purposes of this discussion, this simplification is negligible, as the communication mechanisms between the domains are not affected at all.
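
On the D2 side, a send loop along the following lines could stream the prefilled buffers to D1. The sketch assumes NXP's RPMsg-Lite API as shipped with the MCUXpresso SDK; the shared-memory base address, link ID, endpoint address, channel name, chunk size, and the g_prefilled_audio symbols are all assumptions borrowed from the SDK examples rather than taken from this demo.

/*
 * Minimal D2 (Cortex-M7) send sketch, assuming NXP's RPMsg-Lite API.
 * All addresses, names and sizes below are assumptions; in the SDK they
 * come from the board files and must match the Linux device tree.
 */
#include <stdint.h>
#include "rpmsg_lite.h"
#include "rpmsg_ns.h"

#define RPMSG_SHMEM_BASE   0x55000000U                    /* assumed vring base */
#define RPMSG_LINK_ID      0U                             /* assumed link ID */
#define LOCAL_EPT_ADDR     30U                            /* assumed local endpoint address */
#define CHANNEL_NAME       "rpmsg-virtual-tty-channel-1"  /* name the Linux tty driver binds to */
#define CHUNK_BYTES        256U                           /* kept below the default RPMsg payload size */

extern uint8_t g_prefilled_audio[];        /* assumed prefilled buffer in M7-only memory */
extern uint32_t g_prefilled_audio_len;

static volatile uint32_t remote_addr;
static volatile int has_remote;

/* Learn D1's endpoint address from the first message it sends (SDK echo-demo pattern). */
static int32_t rx_cb(void *payload, uint32_t payload_len, uint32_t src, void *priv)
{
    (void)payload; (void)payload_len; (void)priv;
    remote_addr = src;
    has_remote  = 1;
    return RL_RELEASE;
}

void audio_sender_task(void)
{
    /* Bring up the RPMsg link on the shared memory area agreed with Linux. */
    struct rpmsg_lite_instance *rl =
        rpmsg_lite_remote_init((void *)RPMSG_SHMEM_BASE, RPMSG_LINK_ID, RL_NO_FLAGS);
    while (!rpmsg_lite_is_link_up(rl)) {
        /* wait for the Linux side to probe the rpmsg bus */
    }

    /* Create an endpoint and announce the channel so the tty driver can bind to it. */
    struct rpmsg_lite_endpoint *ept = rpmsg_lite_create_ept(rl, LOCAL_EPT_ADDR, rx_cb, NULL);
    rpmsg_ns_announce(rl, ept, CHANNEL_NAME, RL_NS_CREATE);

    while (!has_remote) {
        /* wait for the handshake from D1 to learn its endpoint address */
    }

    /* Stream the prefilled audio buffer to D1 one chunk at a time. */
    for (uint32_t off = 0; off + CHUNK_BYTES <= g_prefilled_audio_len; off += CHUNK_BYTES) {
        rpmsg_lite_send(rl, ept, remote_addr,
                        (char *)&g_prefilled_audio[off], CHUNK_BYTES, RL_BLOCK);
    }
}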

Implementation

From

Testing

Boot sequence

This example was arranged in order to execute the following boot sequence: