Linux and interrupt latency (Axel)

From DAVE Developer's Wiki
Applies to: Axel Ultra, Axel Lite, AXEL ESATTA

Introduction

Linux is not a real-time operating system and, as such, it cannot guarantee an upper bound on interrupt latency.

When developing applications on embedded platforms, this can lead to unacceptable behavior. System integrators therefore have to be aware of this issue and, where necessary, implement specific strategies to prevent or limit it.

Numerous solutions exist (see for example Application_Notes_(Axel)#AN-XELK-001:_Asymmetric_Multiprocessing_.28AMP.29_on_Axel_.E2.80.93_Linux_.2B_FreeRTOS, BRX-WP001:_Real-timeness,_system_integrity_and_TrustZone®_technology_on_AMP_configuration, https://rt.wiki.kernel.org/index.php/Main_Page, http://elinux.org/CPU_Shielding_capability). An exhaustive discussion of these is beyond the scope of this document. Instead, some practical considerations are illustrated, based on real-world cases involving Axel platforms and XELK.

Disabling interrupts at kernel level

Many device drivers live in kernel space, and many of them need to temporarily disable interrupts to perform specific hardware-related operations. A peripheral can still issue an interrupt request during such an interrupt-disabled time window. If this happens, execution of the associated interrupt service routine is delayed at least until interrupts are enabled again, so the latency of that interrupt is greater than the average. It is virtually impossible to estimate the upper bound, because exploring all the possible execution paths at kernel level is not feasible. Generally speaking, the maximum interrupt latency can be orders of magnitude greater than the average one.
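The mechanism can be illustrated with a toy model. The following Python sketch is not a measurement and does not reflect any specific kernel code: the function name `service_latency_us`, the 10 µs base latency, the 250 µs arrival period and the single 1 ms interrupt-disabled window are all assumptions chosen for illustration only.

```python
import statistics


def service_latency_us(arrival_us, disabled_windows, base_latency_us=10.0):
    """Latency (in µs) of an interrupt request raised at arrival_us.

    disabled_windows is a list of (start_us, end_us) intervals during which
    interrupts are disabled. All values are hypothetical.
    """
    for start_us, end_us in disabled_windows:
        if start_us <= arrival_us < end_us:
            # The request stays pending until interrupts are enabled again,
            # then the usual base latency applies.
            return (end_us - arrival_us) + base_latency_us
    # Interrupts were enabled: only the base latency is paid.
    return base_latency_us


# Hypothetical scenario: one 1 ms window with interrupts disabled,
# and an interrupt request every 250 µs for 10 ms.
windows = [(1000.0, 2000.0)]
arrivals = [i * 250.0 for i in range(40)]
latencies = [service_latency_us(t, windows) for t in arrivals]

print("median latency (us):", statistics.median(latencies))
print("max latency (us):   ", max(latencies))
```

With these made-up numbers the median latency stays at 10 µs, while the worst case reaches 1010 µs, about two orders of magnitude higher. In a real kernel the windows are neither known in advance nor fixed in length, which is why the upper bound cannot be estimated this way.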

A real-world case: UART RX FIFO overrun

This case is a typical example that is hard to analyze: many variables are involved, so it is extremely difficult to isolate cause-effect relationships. In other words, with software-based tools alone it is not easy to find out which drivers cause the UART RX interrupt latency to explode. Advanced debugging tools that provide tracing capabilities, such as Lauterbach TRACE32/PowerDebug, are extremely useful for such investigations. In this specific case, the tool made it possible to find out that the SD controller driver affects interrupt latency dramatically. This is due to a 1 ms delay that the driver performs after disabling interrupts. If the physical SD card interface does not provide the card detect signal, this condition occurs on a regular basis, because a polling mechanism is used to detect the presence of the card. It is thus relatively likely that the RX FIFO interrupt is triggered in the middle of such a delay, so the FIFO is not drained in time and overruns.
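A rough back-of-the-envelope check shows why a 1 ms window can be enough to overrun the FIFO. The sketch below is only an estimate under stated assumptions: the FIFO depth (32 characters), the RX trigger level (16 characters), the 8N1 framing (10 bits per character) and the baud rates are illustrative values, not figures taken from this case, and the function names are hypothetical. Only the 1 ms interrupt-disabled delay comes from the analysis above.

```python
def char_time_us(baud, bits_per_char=10):
    """Time needed to receive one character, in µs (10 bits assumes 8N1)."""
    return bits_per_char * 1_000_000 / baud


def fifo_overruns(baud, irq_off_us, fifo_depth=32, rx_trigger=16):
    """Estimate whether the RX FIFO fills up while the ISR is delayed.

    Worst case assumed: the RX interrupt is raised (FIFO at rx_trigger)
    right when interrupts are disabled, and the line receives characters
    back to back. fifo_depth and rx_trigger are assumed values.
    """
    free_slots = fifo_depth - rx_trigger
    time_to_fill_us = free_slots * char_time_us(baud)
    return irq_off_us > time_to_fill_us


IRQ_OFF_US = 1000.0  # the 1 ms delay with interrupts disabled

for baud in (115200, 921600):
    print(baud, "baud ->", "overrun" if fifo_overruns(baud, IRQ_OFF_US) else "ok")
```

Under these assumptions, 16 free slots take about 1.39 ms to fill at 115200 baud, so the 1 ms delay alone is survivable, whereas at 921600 baud they fill in about 174 µs and the overrun is unavoidable. The actual threshold depends on the real FIFO depth, the trigger level and on any other latency that adds to the 1 ms window.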