== Results ==
The following table lists the prediction time for a single image as a function of the model variant and the threads parameter.
 
{| class="wikitable" style="margin: auto;"
|+ Inference times
!Model
!Threads parameter
!Inference time [ms]
!Notes
|-
| rowspan="3" |'''Floating-point'''
|unspecified
|220
|
|-
|1
|220
|
|-
|2
|390
|
|-
|'''Half-quantized'''
|unspecified
|330
|
|-
| rowspan="2" |'''Fully-quantized'''
|unspecified
|200
|Four threads are created beside the main process (presumably, this number is set according to the number of physical cores available). Nevertheless, they seem to be constantly in a sleep state.
|-
|4
|80
|Interestingly, seven processes are created beside the main one. Four of them, however, seem to be constantly in a sleep state.
|}
 
The prediction time '''takes into account the time needed to fill the input tensor with the image'''. Furthermore, it is averaged over several predictions.
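The measurement procedure described above can be sketched as a small timing harness. This is an illustrative reconstruction, not the actual benchmark code: the <code>fill_input</code> and <code>invoke</code> callables are hypothetical stand-ins for copying the image into the input tensor and running the model, respectively.

```python
import time

def average_inference_ms(fill_input, invoke, runs=10):
    """Average end-to-end prediction time in ms over several runs.

    The input-tensor fill is timed together with the model invocation,
    matching the methodology described in the text.
    """
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        fill_input()  # copy the image into the input tensor (counted)
        invoke()      # run the prediction
        total += time.perf_counter() - start
    return total / runs * 1000.0
```

In practice the two callables would be bound to the runtime's tensor-fill and invoke functions; averaging over several runs smooths out scheduler and cache noise.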
 
The same tests were also repeated with the model served from a network file system (NFS) over an Ethernet connection. No significant variation in the prediction times was observed.
 
In conclusion, to minimize execution time, the model has to be fully quantized and the number of threads has to be specified explicitly.
 
In addition, this document compares the results achieved to the ones produced by the platforms that were considered in the [[ML-TN-001 - AI at the edge: comparison of different embedded platforms - Part 1#Articles in this series|previous articles of this series]].