== Results ==
The following table lists the prediction time for a single image as a function of the model variant and the threads parameter.
 
{| class="wikitable" style="margin: auto;"
|+ Inference times
!Model
!Threads parameter
!Inference time [ms]
!Notes
|-
| rowspan="3" |'''Floating-point'''
|unspecified
|220
|
|-
|1
|220
|
|-
|2
|390
|
|-
|'''Half-quantized'''
|unspecified
|330
|
|-
| rowspan="2" |'''Fully-quantized'''
|unspecified
|200
|Four threads are created beside the main process (presumably, this number is set according to the number of physical cores available). Nevertheless, they seem to be constantly in a sleep state.
|-
|4
|80
|Interestingly, seven processes are created beside the main one. Four of them, however, seem to be constantly in a sleep state.
|}
 
The prediction time '''takes into account the time needed to fill the input tensor with the image'''. Furthermore, it is averaged over several predictions.
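The measurement procedure described above can be sketched as a small timing harness. This is an illustrative reconstruction, not the actual benchmark code: the <code>fill_input</code> and <code>invoke</code> callables are hypothetical stand-ins for copying the image into the input tensor and running the model, respectively.

```python
import time

def average_inference_ms(fill_input, invoke, runs=10):
    """Average end-to-end prediction time in ms over several runs.

    The input-tensor fill is timed together with the model invocation,
    matching the methodology described in the text.
    """
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        fill_input()  # copy the image into the input tensor (counted)
        invoke()      # run the prediction
        total += time.perf_counter() - start
    return total / runs * 1000.0
```

In practice the two callables would be bound to the runtime's tensor-fill and invoke functions; averaging over several runs smooths out scheduler and cache noise.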
 
The same tests were also repeated with the model served from a network file system (NFS) over an Ethernet connection. No significant variation in the prediction times was observed.
 
In conclusion, to minimize execution time, the model has to be fully quantized and the number of threads has to be specified explicitly.
 
In addition, this document compares the results achieved to the ones produced by the platforms that were considered in the [[ML-TN-001 - AI at the edge: comparison of different embedded platforms - Part 1#Articles in this series|previous articles of this series]].