{{Applies To Bora}}
{{Applies To BoraX}}
{{AppliesToBORA_TN}}
{{AppliesToBORA_Xpress_TN}}
{{InfoBoxBottom}}
|[[Bora_Embedded_Linux_Kit_(BELK)#BELK_software_components|3.0.0]]
|Internal draft
|-
|1.0.0
|November 2015
|[[Bora_Embedded_Linux_Kit_(BELK)#BELK_software_components|2.2.0, 3.0.0]]
|First public release
|-
|}
<ref name="XAPP1078"> John McDougall, ''XAPP1078 (v1.0) Simple AMP Running Linux and Bare-Metal System on Both Zynq SoC Processors'', 14th February 2013</ref> and
<ref name="XAPP1079">John McDougall, ''XAPP1079 (v1.0.1) Simple AMP: Bare-Metal System Running on Both Cortex-A9 Processors'', 24th January 2014</ref>,
The L2 cache, in contrast to L1, is a shared resource in the Zynq implementation. As such, if both cores have to use it, specific techniques are needed to handle it properly in a dual-OS AMP configuration. In principle, the adopted approach allows different L2 cache management strategies to be implemented. The configuration actually used for the tests described in [[#Characterization and performance tests|this section]] assigns the whole L2 cache to the W2 world, as depicted in the following picture.
<span id="L2 cache usage"></span>
==Characterization and performance tests==
Some basic tests have been conducted to characterize the system configured as described above. The figure the tests focus on is the interrupt latency in the W1 realm. This value has been measured under different system load conditions to verify if and how the non real-time world may influence the real-time world.
On the Linux side, two load conditions have been considered:
* idle
* Google stressapptest (SAT for short)<ref name="SAT">https://code.google.com/p/stressapptest/</ref> running to stress SDRAM memory and SD I/O.
On the RTOS side, two load conditions have been considered as well:
* idle
* memory-intensive task; two subcases, in turn, have been considered to evaluate the impact of L2 cache unavailability.
The RTOS memory task accesses an array in main memory, which can be of two different sizes:
* the smaller is half the L1 size (16 KiB)
* the larger is 4 times the L1 size (128 KiB)
The main task on the RTOS side is the ''latencystat'' demo provided with [[AN-BELK-001:_Asymmetric_Multiprocessing_(AMP)_on_Bora_–_Linux_FreeRTOS|AN-BELK-001]]<ref name="AN-BELK-001"></ref>, which:
* programs the PS TTC timer as free-running, triggering an interrupt on overflow
* inside the overflow ISR the TTC counter is read: this counter reports the number of ticks elapsed between the event (overflow) and the handler itself, in other words the interrupt latency
* after a while the TTC is reprogrammed and the interrupt is enabled again, to trigger another event
* those ''latency counters'' are collected into an array
* the Linux-side application, by default after 10 seconds, stops the RTOS task, which sends the array data over RPMsg
* the Linux-side application collects the data and displays the minimum, maximum and average latency measured
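The last step of the list above can be sketched as follows. This is only an illustrative Python sketch, not the code of the ''latencystat'' demo: the names <code>LatencyStats</code>, <code>ticks_to_ns</code> and <code>summarize</code> are hypothetical, and the TTC clock frequency is passed in as a parameter because the actual value depends on how the timer is clocked on the target.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencyStats:
    """Minimum, average and maximum latency, in nanoseconds."""
    min_ns: float
    avg_ns: float
    max_ns: float


def ticks_to_ns(ticks: int, ttc_hz: float) -> float:
    """Convert a TTC tick count to nanoseconds.

    ttc_hz is the timer input clock in Hz (an assumption of this sketch,
    to be taken from the actual clock configuration).
    """
    return ticks * 1e9 / ttc_hz


def summarize(latency_ticks: Sequence[int], ttc_hz: float) -> LatencyStats:
    """Reduce the array of latency counters to min/avg/max in ns."""
    if not latency_ticks:
        raise ValueError("no latency samples collected")
    if ttc_hz <= 0:
        raise ValueError("ttc_hz must be positive")
    samples_ns = [ticks_to_ns(t, ttc_hz) for t in latency_ticks]
    return LatencyStats(
        min_ns=min(samples_ns),
        avg_ns=sum(samples_ns) / len(samples_ns),
        max_ns=max(samples_ns),
    )
```

For example, with a hypothetical 100 MHz timer clock (10 ns per tick), <code>summarize([30, 40, 50], ttc_hz=100e6)</code> reports a minimum of 300 ns, an average of 400 ns and a maximum of 500 ns.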
The following table summarizes the test results (all timings are given in ''ns''):
{| class="wikitable"
|-
! rowspan="2" | Latency !! Linux idle !! colspan="3" | Linux SAT
|-
! RTOS idle !! RTOS idle !! RTOS 16k !! RTOS 128k
|-
| min || align="right"| 287 || align="right"| 287 || align="right"| 287 || align="right"| 1268
|-
| avg || align="right"| 287 || align="right"| 296 || align="right"| 305 || align="right"| 2024
|-
| max || align="right"| 548 || align="right"| 539 || align="right"| 575 || align="right"| 3050
|}
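To make the figures easier to compare, each column can be expressed relative to the idle/idle baseline. The following Python sketch only re-computes ratios from the values in the table; the dictionary layout and the function name <code>relative_to_baseline</code> are illustrative assumptions, not part of the test software.

```python
# Values copied from the table above, in ns.
# Keys are (Linux load, RTOS load).
LATENCY_NS = {
    ("idle", "idle"): {"min": 287, "avg": 287, "max": 548},
    ("SAT", "idle"): {"min": 287, "avg": 296, "max": 539},
    ("SAT", "16k"): {"min": 287, "avg": 305, "max": 575},
    ("SAT", "128k"): {"min": 1268, "avg": 2024, "max": 3050},
}


def relative_to_baseline(metric, baseline=("idle", "idle")):
    """Return each scenario's latency divided by the baseline's latency."""
    base = LATENCY_NS[baseline][metric]
    return {
        scenario: values[metric] / base
        for scenario, values in LATENCY_NS.items()
    }


if __name__ == "__main__":
    for scenario, ratio in relative_to_baseline("avg").items():
        print(scenario, round(ratio, 2))
```

Read this way, the average latency with the 128 KiB task is roughly seven times the baseline, while the other two SAT columns stay within about 7% of it.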
==Conclusions and future work==
The following conclusions can be drawn from the test results:
* The real-time behavior of the W1 realm is preserved under all the tested conditions, since Linux activity on CPU, memory and SD has virtually no influence on RTOS latency.
* Moderate RTOS activity has no significant impact on latency.
* As expected, when intensive memory activity is performed on the RTOS side, data/instruction cache misses increase significantly, resulting in higher latency.
===Isolation vs performance===
This work confirmed the need to find a trade-off between two requirements that often push in opposite directions: isolation and performance. On one hand, isolation should be pushed to the maximum possible extent to preserve the integrity of the W1 world. On the other hand, overall system performance must not be degraded so much that the product becomes unusable. Generally speaking, strong isolation negatively impacts performance, so finding the optimal balance is not trivial. A "one size fits all" solution does not exist, and the system designer is responsible for choosing in which direction this knob has to be moved. This analysis naturally has to take into account application-specific requirements.
===Future work===
Future work will first focus on an additional feature that has not been included in the requirement list but that is undoubtedly useful in several applications: the possibility of performing a complete reboot of the GPOS under the control of the RTOS, while the latter keeps operating normally. For instance, this can be exploited when the RTOS needs to work as a software watchdog for W2 activity: if no activity is detected for a certain period of time, the GPOS can be shut down and rebooted.
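The timeout logic behind such a software watchdog can be sketched as follows. This is a purely illustrative Python model of the decision rule, under the assumption that W2 periodically signals activity to W1. The class <code>SoftwareWatchdog</code> and its methods are hypothetical, and the actual shutdown/reboot mechanism, which is the subject of the future work, is not modeled.

```python
class SoftwareWatchdog:
    """Tracks the last sign of life received from the monitored world."""

    def __init__(self, timeout_s: float, now_s: float = 0.0) -> None:
        if timeout_s <= 0:
            raise ValueError("timeout_s must be positive")
        self.timeout_s = timeout_s
        self._last_activity_s = now_s

    def kick(self, now_s: float) -> None:
        """Record that activity from the monitored world was seen at now_s."""
        self._last_activity_s = now_s

    def expired(self, now_s: float) -> bool:
        """True if no activity has been seen for longer than timeout_s."""
        return now_s - self._last_activity_s > self.timeout_s
```

Time is passed in explicitly rather than read from a clock, so the rule can be checked deterministically; on the target, the RTOS would call the equivalent of <code>expired()</code> periodically and trigger the GPOS reboot when it returns true.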
Another aspect that should be investigated in more depth concerns the effects of the communication between W1 and W2 on the IRQ latency and on the integrity of the real-time world. This matter is strictly related to the degree of isolation between the two worlds. In this work a strong-isolation approach has been adopted, meaning that
*no data is exchanged during the execution of the IRQ latency measurement
*it has been implicitly assumed that data sent from W2 to W1 cannot compromise the integrity of the trust domain.
These assumptions may not hold in real applications; however, specific techniques can be implemented to manage these situations (see for example <ref name="Sangorrin's thesis"></ref> and <ref name="PreventingInterruptOverload">J. Regehr, U. Duongsaa, ''Preventing Interrupt Overload'', 2nd May 2005, http://www.cs.utah.edu/~regehr/papers/lctes05/regehr-lctes05.pdf</ref>).
-----
{{notelist}}
 
==References==
{{reflist}}