DAVE Developer's Wiki β

{{InfoBoxTop}}
{{AppliesToMachineLearning}}
{{AppliesTo Machine Learning TN}}
{{InfoBoxBottom}}
==Building the application==
The starting point for the application is the model, in the form of a Keras .h5 file, described [[ML-TN-001_-_AI_at_the_edge:_comparison_of_different_embedded_platforms_-_Part_1#Reference_application_.231:_fruit_classifier|here]]. Incidentally, the '''same''' model structure was used as the starting point for [[ML-TN-001_-_AI_at_the_edge:_comparison_of_different_embedded_platforms_-_Part_2|this other test]] as well (*). This makes the comparison of the two tests straightforward, even though they were run on SoC's that differ significantly from the architectural standpoint.
 
 
(*) The two models share the same structure but, as they are trained independently, their weights differ.
===Training the model===
Model training is performed with the help of the Docker container provided by Vitis AI.
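As a hedged sketch of that workflow, the commands below are illustrative only: the repository URL, image tag, and conda environment name follow the Vitis AI conventions at the time of writing and may differ for other releases, and the training script name is hypothetical.

```shell
# Get the Vitis AI repository, which ships the helper script
# used to launch the Docker container
git clone https://github.com/Xilinx/Vitis-AI.git
cd Vitis-AI

# Start the CPU Docker image; the host workspace is mounted
# inside the container
./docker_run.sh xilinx/vitis-ai-cpu:latest

# Inside the container: activate the TensorFlow conda environment
# and run the training script that produces the Keras .h5 model
# (train_fruit_classifier.py is a hypothetical name)
conda activate vitis-ai-tensorflow
python train_fruit_classifier.py
```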
|}
Interestingly, having four threads (i.e. the same number of CPU cores) allows the throughput to be further increased by a factor of almost 2, while keeping the occupation of the DPU cores low. It should not be forgotten, in fact, that part of the algorithm makes use of the CPU computational power as well.
=====Six threads=====
It is worth mentioning that:
*When the number of threads is greater than 1, the latency of DPU_0 is higher than the latency of DPU_1, although they are equivalent in terms of hardware configuration. To date, this fact is still unexplained.
*Increasing the number of threads of the VART-based application beyond 6 does not further increase the achieved throughput.
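The scaling behaviour discussed above can be illustrated with a small, hardware-independent Python sketch. No VART calls are used here: <code>run_inference</code> is a stand-in for the execute/wait cycle of a DPU runner, with a sleep modelling the per-frame latency, so the absolute numbers are meaningless and only the relative scaling with the thread count is of interest.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_inference(frame):
    # Stand-in for submitting a frame to a DPU runner and waiting
    # for completion; the sleep models the per-frame latency.
    time.sleep(0.01)
    return frame

def throughput(num_threads, num_frames=40):
    # Process num_frames frames with a pool of num_threads workers
    # and return the resulting frames-per-second figure.
    start = time.time()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        list(pool.map(run_inference, range(num_frames)))
    return num_frames / (time.time() - start)

fps1 = throughput(1)
fps4 = throughput(4)
print(f"1 thread:  {fps1:.0f} fps")
print(f"4 threads: {fps4:.0f} fps")
```

Because the worker spends its time waiting (as a CPU thread does while a DPU core processes a frame), four threads yield close to four times the single-thread throughput; on real hardware the gain saturates once the DPU cores and the CPU-side portion of the algorithm are fully occupied, as observed above with six threads.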