NELK Power Management

734 bytes added, 11:36, 14 January 2014
{{InfoBoxTop}}
{{AppliesToNaonFamily}}
{{InfoBoxBottom}}
 
== Introduction ==
In this article we'll see how Power Management is implemented in [[Naon Embedded Linux Kit (NELK)|NELK]], which options are available to the user, and which results can be reached in terms of power consumption.
Power Management (PM) can be divided into two major sections:
# PM in suspend mode (also called '''suspend to RAM'''): in this mode the CPU is halted, most of its internal clocks are gated, DDR memories are put in self-refresh state and [[:Category:Naon|Naon]] can wake up only when a peripheral from a user-selected set of peripherals (usually internal timer, GPIO or UART) is activated.
# PM in running mode: in this mode most of the SOM is working, only a few clocks are gated and the others are slowed to reduce power consumption. There are different levels of clock/power supply settings that the user can select to comply with the application requirements.
== Runtime Power Management Support ==
Runtime power management support allows the userspace application to choose the most appropriate level of performance/power consumption according to what the application itself needs to do.
This PM can be divided into two sections:
# DSP, HDVICP2 and CORE (which groups together HDVPSS, dual Cortex M3 and L3 bus) PM
The main difference between the two is that the Linux kernel itself knows pretty well how much the A8 is loaded at a given point in time (because it is the kernel that schedules its processes). Having a configurable CPU governor is a standard feature of the Linux kernel that can also be found on PCs and laptops. The kernel changes the A8 frequency/voltage on its own in accordance with user settings.
All the other subsystems (DSP, HDVICP2 and CORE) are not managed directly by the kernel (e.g. the dual M3 cores run their own independent RTOS); for this reason it cannot choose the optimal working set from a PM point of view. Choosing the correct OPP (Operating Performance Point) to obtain the desired result (e.g. Full-HD H264 encoding vs 720p H264 decoding) is the userspace application's responsibility.
This is even more true when considering video management applications: in these applications runtime frequency scaling is critical because the computational load depends on the data stream and can change very quickly. What is usually done in this situation is to use a sub-optimal configuration that can handle the ''worst'' input stream without losing frames and without wasting too much power. For this reason the A8 can also be configured statically with an OPP, thus disabling standard Linux governor support.
=== CPU Governor Usage ===
CPU governor is a standard feature of recent Linux kernels. The kernel itself selects the best working point depending on CPU load. The user can select between various predefined governors or choose the working frequency himself. In the latter configuration the kernel will only select the best allowed OPP for the frequency chosen by the user. Please note that this also allows the Cortex-A8 to be treated like the rest of the PM subsystem (DSP, CORE, HDVICP2), thus allowing direct OPP setup (see next section).
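As a minimal sketch, governor selection can be scripted through the standard Linux cpufreq sysfs entries (''scaling_available_governors'', ''scaling_governor'' and ''scaling_setspeed'', the latter expressed in kHz). The function names and the <code>CPUFREQ_DIR</code> variable below are illustrative assumptions and are not part of NELK; the set of governors actually available depends on the kernel configuration.

```shell
#!/bin/sh
# Standard Linux cpufreq directory for the first CPU.
# CPUFREQ_DIR is an illustrative variable (not part of NELK); override it for testing.
CPUFREQ_DIR="${CPUFREQ_DIR:-/sys/devices/system/cpu/cpu0/cpufreq}"

# Print the governors compiled into the running kernel.
list_governors() {
    cat "$CPUFREQ_DIR/scaling_available_governors"
}

# Print the governor currently in use.
current_governor() {
    cat "$CPUFREQ_DIR/scaling_governor"
}

# Select a governor, refusing names the kernel does not advertise.
set_governor() {
    gov="$1"
    case " $(list_governors) " in
        *" $gov "*) echo "$gov" > "$CPUFREQ_DIR/scaling_governor" ;;
        *) echo "governor '$gov' is not available" >&2; return 1 ;;
    esac
}

# With the userspace governor selected, request a fixed frequency in kHz.
set_frequency_khz() {
    echo "$1" > "$CPUFREQ_DIR/scaling_setspeed"
}
```

After sourcing the file on the target, <code>set_governor userspace</code> followed by <code>set_frequency_khz</code> with one of the values listed in ''scaling_available_frequencies'' would fix the A8 working point; <code>set_governor ondemand</code> hands control back to the kernel if that governor is available.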
Pre-configured governors are the ones listed in the following table:
{| class="wikitable"
=== OPP configuration ===
As stated before, automatic configuration by measuring system load cannot be performed for some elements or is not useful in many situations (e.g. realtime video stream processing).
A suggested approach is to choose a set-point for each subsystem that has enough processing power to do the required work (e.g. without losing video frames) while minimizing the power consumption.
The situation is even harder to manage because of internal system constraints: changing the frequency of a subsystem requires changing its power supply (as a rule of thumb, a higher supply voltage is required to have a stable higher frequency). There are some other restrictions that do not allow the user to select arbitrary settings on different subsystems (which would lead to system instability).
For its [[:Category:Naon|Naon]] platform, '''DAVE Embedded Systems''' provides a ''custom'' PM driver that allows for selection, at runtime, of an OPP for each subsystem (Cortex A8, CORE, DSP, HDVICP), each providing a default stable voltage/frequency setting. Some OPPs are directly derived from the ones officially supported by TI (see the DVFS section of the DM814x TRM, for example) and some other custom OPPs are provided by '''DAVE Embedded Systems'''.
{| class="wikitable"
|}
{{ImportantMessage|text=As described in the table above, not all OPPs can be used with video output enabled. In fact OPP0 does not have enough processing power on HDVPSS to display even static images on HDMI (VOUT1) @1080p60. For streaming video processing (e.g. H264 decoding) OPP100 or higher is required.}}
{{Board Specific Information|text=Please note that A8 settings can be overridden by the Linux kernel governor (see previous section). Choose the ''userspace'' governor to configure A8 performance with a static OPP.}}
OPP settings can be changed globally (in other words: the same OPP for each subsystem) or individually (a different OPP for each subsystem).
OPP setup can be performed via the sysfs interface. For example, to configure ''OPP50'' globally, enter the following command at the [[:Category:Naon|Naon]] console:
<pre class="board-terminal">
</pre>
The user can select a different OPP for each subsystem using the same sysfs entry, by specifying the OPPs separated by commas. E.g. for an application that does not require the DSP and the hardware encoder/decoder, the user can choose:
* OPP120 for ARM Cortex-A8
* OPP120 for CORE (HDVPSS and M3)
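The comma-separated form can be sketched with a small helper. This is only an illustration: <code>join_opps</code> and <code>set_opp</code> are hypothetical names, the sysfs entry path is deliberately left as a parameter because it is not given here, and the order in which the driver expects the subsystems must be taken from the Naon PM driver documentation.

```shell
#!/bin/sh
# Join per-subsystem OPP names with commas, e.g. "OPP120,OPP120,OPP50".
# Hypothetical helper; the subsystem order is defined by the PM driver.
join_opps() {
    result=""
    for opp in "$@"; do
        if [ -z "$result" ]; then
            result="$opp"
        else
            result="$result,$opp"
        fi
    done
    printf '%s' "$result"
}

# Write the joined OPP list to the PM driver sysfs entry, as a plain echo would.
# $1 is the sysfs entry path (an assumption: supply the real path yourself),
# the remaining arguments are the OPP names, one per subsystem.
set_opp() {
    entry="$1"
    shift
    printf '%s\n' "$(join_opps "$@")" > "$entry"
}
```

Passing a single OPP name to <code>set_opp</code> produces the same string used for the global setting, so one helper covers both cases.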
=== Manually choosing working frequencies ===
To have more in-depth control over power management and performance, the [[:Category:Naon|Naon]] PM driver allows the user to specify the frequency for each configurable clock known to the Linux kernel.
{{WarningMessage|text=Please note that changing frequency also requires a power supply change to keep the system stable. Changing power supply is a critical issue for the module and may damage it permanently. For this reason the [[:Category:Naon|Naon]] PM driver allows the user to customize only the frequency and not the power supply, which is automatically adjusted by choosing the appropriate OPP.}}
Naon PM driver can manage a list of clocks and configure, for each of them:
We will look only at the former, while the latter will be detailed in the ''Standby'' section of this article.
First of all, the user should find the correct clock name, e.g. by looking inside ''/sys/kernel/debug/clock'' (DEBUGFS must be enabled and mounted). We will suppose that the user wants to configure the HDVPSS (and thus M3) frequency, which is derived from the ''iss_dpll_ck'' PLL (only PLLs are configurable, of course).
Now the user has to ''add'' the chosen clock to the list managed by Naon PM driver:
Please note the usage of '''-n''' (to suppress the trailing newline emitted by the echo command) and the '''+''' added to the clock name.
A clock can be removed from the list by replacing ''+'' with ''-'', e.g.
<pre>
</pre>
Before using the clock, it must be selected from the list:
<pre class="board-terminal">
</pre>
The user can now read the current frequency value:
<pre class="board-terminal">
</pre>
Setting the frequency is just a matter of writing its value, in Hz, into the same sysfs entry:
<pre class="board-terminal">
== Standby Support ==
Entering standby means putting the [[:Category:Naon|Naon]] module into a ''sleep'' state, where the power consumption is minimal and no processing is allowed. All processes are halted, the ARM Cortex A8 goes into a specific mode called WFI (''Wait For Interrupt'') and DDR2 RAM goes into self-refresh mode (the lowest power consumption mode that does not lose memory contents).
An introduction to Standby support on the DM814x SOC can also be found on the [http://processors.wiki.ti.com/index.php/DM814x_AM387x_PM_Suspend_resume_overview TI wiki].
Standby mode can be activated by any userspace application, while wakeup is triggered only from:
* internal timer (which, of course, should be enabled and configured before entering standby)
* an interrupt source (e.g. GPIO or UART). Please note that having an interrupt source enabled means that its clocks cannot be disabled when entering standby mode.
By default timer and UART (Linux serial console) wakeup are enabled.
{{Board Specific Information|text=To allow UART wakeup from Linux console, the user should add '''no_console_suspend''' parameter to kernel command line. See [[Change Linux Command Line Parameter from U-boot]] for more information on how to do this.}}
Due to system complexity and user application dependency, standby mode requires a bit of configuration for optimal performance. The user can (and should) set the standby configuration for various clocks, depending on the application. However, '''DAVE Embedded Systems''' provides a default clock configuration and system setup that is usually both functional and best performing for common [[:Category:Naon|Naon]]-based platforms. Clocks are managed by the same Naon PM driver described in the previous sections.
From the '''user point of view''', entering standby is a matter of:
# configuring suspend/standby clocks (by slowing down their frequencies and/or gating them completely)
# configuring wakeup sources
# entering standby
The first step is the most complex one, but is required to obtain the best performance in relation to power consumption. '''DAVE Embedded Systems''' provides a script that correctly configures most of the unused clocks and gives the best power result, without too much impact on suspend/wakeup latency.
Here is the bash script provided with NELK:
<pre>
</pre>
See also the following table to understand what is done on each subsystem:
{| class="wikitable"
|}
Configuring wakeup sources is a matter of choosing which IRQs are left unmasked when entering standby and enabling/configuring the wakeup timer. We will consider only the latter, because the former is highly driver dependent.
The wakeup timer is described in detail in the [http://processors.wiki.ti.com/index.php/DM814x_AM387x_PM_Suspend_resume_overview TI wiki]. In brief, the user should just enter the timeout (in seconds and/or milliseconds) via sysfs. E.g. to wake up 2.5 seconds after standby mode is entered, the following command can be used:
<pre class="board-terminal">
</pre>
{{Board Specific Information|text=Please note that the above commands require debugfs enabled and mounted (e.g. ''mount -t debugfs debugfs /sys/kernel/debug'')}}
After all of the above has been set up, the user can enter standby by running the following command:
</pre>
If the wakeup timer has been configured, the user will see, among the other messages, something like the following:
<pre class="board-terminal">
From the '''kernel point of view''' entering standby means:
* syncing filesystems and freezing processes
* calling the suspend() function of each registered device driver (e.g. the suspend() function of the USB driver will put the USB PHY in suspend too)
* saving the current OPP and entering OPP0 (see above)
* reconfiguring and/or gating the user defined clocks
* (if needed) configuring the wakeup timer
* (if needed) masking all interrupts except for the peripherals used for wakeup
* putting DDR in self-refresh
* putting the Cortex A8 in WFI
When an interrupt is issued or when the wakeup timer elapses:
* the Cortex A8 goes out of WFI
* DDR RAM is put back in working mode
* interrupt masks are restored
* the previously saved OPP is restored
* clocks are un-gated and/or restored
* all processes are unfrozen
=== Evaluate Wakeup Latency ===
* the wakeup latency is the running time of the ''echo mem'' command minus the configured wakeup timer value.
This is not very precise but gives an order-of-magnitude measure of the latency. Please note that the time for entering standby is also taken into account.
Here is a sample result:
== PM performance and power consumption summary ==
In the following table we summarize the power consumption of the whole [[:Category:Naon|Naon]] module (3.3V power supply) in the different situations.
{| class="wikitable"
|-
! OPP !! Overall Status !! MPU (A8) Processing !! Video Processing !! Shunt Voltage [mV] !! Current [mA] !! Power Consumption [mW]
|-
| StandBy || suspend to RAM || WFI || Off || 2.5 || 250 || 825
* in standby mode the on-module 10/100Mbit Ethernet PHY is kept in reset
* 0% on both ''Processing'' columns means that the subsystem is turned on but without input
* shunt voltage has been measured on R422 (10mOhm), in [[:Category:NaonEVB-Mid|NaonEVB-Mid]], which feeds the [[:Category:Naon|Naon]] module 3.3V power supply

== Tools ==

=== Measurements script ===

Measurements are performed by accessing the INA226 device connected to the I2C bus. The following commands can be saved as a shell script (e.g. read_power_values) and run to collect the measurement data:

<pre>
#!/bin/sh
# hwmon sysfs entries exposed by the INA226 on I2C bus 3, address 0x41
cd /sys/devices/platform/omap/omap_i2c.3/i2c-3/3-0041 || exit 1
# in*_input values are reported in millivolts, curr1_input in milliamperes
echo "input voltage $(cat in1_input)mV"
echo "shunt voltage $(cat in0_input)mV"
echo "current sink $(cat curr1_input)mA"
# power1_input is reported in microwatts: convert to milliwatts
echo "power consumption $(expr $(cat power1_input) / 1000) mW"
</pre>

<pre class="board-terminal">
root@dm814x-evm:~# ./read_power_values
input voltage 3289mV
shunt voltage 10mV
current sink 978mA
power consumption 3200 mW
</pre>
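The relationship between the ''Shunt Voltage'', ''Current'' and ''Power Consumption'' columns of the summary table can be sketched as follows, assuming the 10mOhm shunt resistor and the nominal 3.3V supply stated above. The function names are illustrative, and integer shell arithmetic is used, so shunt voltage is expressed in microvolts.

```shell
#!/bin/sh
# Current in mA from shunt voltage in microvolts and shunt resistance in milliohms
# (uV / mOhm = mA).
shunt_current_ma() {
    echo $(( $1 / $2 ))
}

# Power in mW from supply voltage in mV and current in mA (mV * mA = uW).
power_mw() {
    echo $(( $1 * $2 / 1000 ))
}

# StandBy row of the summary table: 2.5 mV measured across the 10 mOhm shunt.
shunt_uv=2500
shunt_mohm=10
supply_mv=3300

current=$(shunt_current_ma "$shunt_uv" "$shunt_mohm")
echo "current: ${current} mA"                                    # current: 250 mA
echo "power: $(power_mw "$supply_mv" "$current") mW"   # power: 825 mW
```

The result matches the StandBy row (250 mA, 825 mW). Values read back from the INA226 can differ slightly from this calculation, because the device reports current and power from its own calibrated registers.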