MISC-TN-019: Post-mortem analysis of embedded Linux systems — Part 1

10 bytes added, 11:19, 8 February 2022

{{WarningMessage|text=This Technical Note was validated against specific versions of hardware and software. What is described here may not work with other versions.}}

[[Category:MISC-AN-TN]]

[[Category:MISC-TN]]

==Introduction==

One of the most challenging problems related to embedded Linux systems is the so-called post-mortem analysis. Post-mortem is a Latin expression that means "after death." In this context, death refers to an event after which the system becomes unstable or even gets stuck. Post-mortem analysis therefore refers to the tasks carried out after such an event to figure out its root cause.

Post-mortem analysis is even harder when these events occur randomly and it is apparently impossible to trigger them in a controlled fashion. Unfortunately, in spite of thorough testing at the qualification stage, these situations may even occur when the system has already been deployed in the field and is used by end customers, which makes the analysis extremely troublesome.

Several techniques are available for post-mortem analysis. Software tools, hardware tools, or a combination of both can be leveraged. This article is the first of a series of Technical Notes (TN) describing some of these techniques in more detail. Interestingly, some of these articles refer to real-world cases in which DAVE Embedded Systems deployed its expertise to support customers reporting on-field failures they were unable to analyze with traditional debugging tools and approaches. In these cases, the information reported by customers is necessarily so limited and fragmented that it is often impossible to determine a priori whether the root cause is software or hardware related. Thus, no assumption about the root cause domain can be made, and engineers need to be open-minded enough to consider every possible cause.

==Articles in this series==

TBD

U0001

DAVE Developer's Wiki β