{{InfoBoxTop}}
{{AppliesToSBCX}}
{{AppliesToAxel}}
{{AppliesToAxelEsatta}}
{{AppliesToAxelLite}}
{{AppliesToAXEL Lite TN}}
{{InfoBoxBottom}}
{{WarningMessage|text=This technical note was validated against specific versions of hardware and software. What is described here may not work with other versions.}}
|Frequency
[MHz]
|533(*)
|-
|Bus width
|2048
|}
 
 
(*) It is worth remembering that the i.MX6DualLite/Solo can achieve better memory bandwidth results, even though its SDRAM bus frequency is lower (400 MHz). This is due to an erratum of the ARM PL310 L2 cache controller. The bug is not present in the i.MX6DualLite/Solo SoCs, which integrate a newer version of the controller.
===Software configuration===
|14.9
|}
 
As expected, the efficiency is relatively low. Generally, 32-bit ARM architectures are known to deliver mediocre performance when it comes to memory bandwidth.
 
Please see [https://www.cs.virginia.edu/stream/ this page] for more details about the STREAM benchmark.
|633
|}
 
{| class="wikitable"
|+Memory write bandwidth
!Buffer size
!Bandwidth
[MB/s]
|-
|512B
|3724
|-
|1kB
|3848
|-
|2kB
|3902
|-
|4kB
|3940
|-
|8kB
|3958
|-
|16kB
|3957
|-
|32kB
|3964
|-
|64kB
|3967
|-
|128kB
|3967
|-
|256kB
|3956
|-
|512kB
|3947
|-
|1MB
|2097
|-
|2MB
|2154
|-
|4MB
|2114
|-
|8MB
|2082
|-
|16MB
|2084
|-
|32MB
|2085
|-
|64MB
|2093
|-
|128MB
|2086
|-
|256MB
|2089
|-
|512MB
|2087
|-
|1GB
|2088
|}
 
The most interesting results are those for buffer sizes exceeding 1 MB, which is the size of the L2 cache. Read bandwidth is approximately 630 MB/s (7.8% efficiency), while write bandwidth is approximately 2080 MB/s (25.7% efficiency). These numbers differ significantly from the ones provided by STREAM. This confirms once again that such results depend strongly on the implementation of the test used to determine the bandwidth.
For more information regarding LMbench, please see [http://lmbench.sourceforge.net/ this page].
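
The efficiency figures quoted above can be reproduced with a short calculation. The following Python snippet is only a sketch under assumptions that this note does not state explicitly: a 64-bit SDRAM data bus, double data rate signalling at the 533 MHz bus frequency, and LMbench reporting MB/s in units of 2^20 bytes. The measured values are the 2088 MB/s of the 1GB row of the write table and the 633 MB/s value that precedes the write table, assumed here to be the corresponding read measurement. The function names are illustrative and do not belong to any benchmark tool.

```python
# Sketch: memory bandwidth efficiency = measured bandwidth / theoretical peak.
# Assumptions (not taken from the note): 64-bit bus, DDR, MB = 2**20 bytes.

BUS_CLOCK_HZ = 533_000_000   # SDRAM bus frequency reported in the note
BUS_WIDTH_BYTES = 8          # assumption: 64-bit data bus
TRANSFERS_PER_CLOCK = 2      # assumption: double data rate


def peak_bandwidth_mib_s(clock_hz=BUS_CLOCK_HZ,
                         width_bytes=BUS_WIDTH_BYTES,
                         transfers_per_clock=TRANSFERS_PER_CLOCK):
    """Theoretical peak bandwidth in units of 2**20 bytes per second."""
    return clock_hz * transfers_per_clock * width_bytes / 2**20


def efficiency_percent(measured_mib_s, peak_mib_s):
    """Measured bandwidth as a percentage of the theoretical peak."""
    return 100.0 * measured_mib_s / peak_mib_s


if __name__ == "__main__":
    peak = peak_bandwidth_mib_s()
    # Measured values as described in the text above this snippet.
    for label, measured in (("read", 633), ("write", 2088)):
        eff = efficiency_percent(measured, peak)
        print(f"{label}: {measured} MB/s = {eff:.1f}% of {peak:.0f} MB/s peak")
```

Under these assumptions the script yields 7.8% for reads and 25.7% for writes, in line with the figures above. A different bus width or unit convention would change the percentages accordingly.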
===pmbw===
As defined by the author, <code>pmbw</code> is "a set of assembler routines to measure the parallel memory (cache and RAM) bandwidth of modern multi-core machines." It performs a myriad of tests. Luckily, it comes with a handy tool that plots the results, which are stored in a text file, as a series of charts.
The complete results and the charts are available at the following links:
*http://mirror.dave.eu/axel/SBCX-TN-006/pmbw-stats-AxelLite-i.MX6Q-996MHz.txt
*http://mirror.dave.eu/axel/SBCX-TN-006/pmbw-plots-AxelLite-i.MX6Q-996MHz.pdf
Generally speaking, the charts exhibit significant drops in performance when the array size is around the L1 and the L2 cache sizes.
For more details about <code>pmbw</code>, please refer to [https://panthema.net/2013/pmbw/ this page].
==Useful links==
*Joshua Wyatt Smith and Andrew Hamilton, [http://inspirehep.net/record/1424637/files/1719033_626-630.pdf Parallel benchmarks for ARM processors in the high energy context]
====Building====
To build STREAM:
*clone its git repository
*modify the <code>Makefile</code> as shown below
*issue the <code>make</code> command.
 
<pre class="board-terminal">
git clone https://github.com/jeffhammond/STREAM.git
</pre>
===pmbw===
====Building====
Building pmbw is straightforward. Please click on ''Expand'' to show the box that illustrates the procedure.
<pre class="board-terminal mw-collapsible mw-collapsed">
armbian@sbcx:~/devel/pmbw$ git clone https://github.com/bingmann/pmbw.git
Running nthreads=1 factor=3791455289 areasize=1024 thrsize=1024 testsize=1024 repeats=3702594 testvol=3791456256 testaccess=947864064
...
</pre>
 
 
To generate the charts plotting the results, the following command was issued:
<pre class="board-terminal">
./stats2gnuplot stats.txt | gnuplot
</pre>