DAVE Developer's Wiki β

NELK Power Management

21,892 bytes added, 15:25, 22 November 2012
== Introduction ==
This article describes how Power Management is implemented in [[Naon Embedded Linux Kit (NELK)|NELK]], which options are available to the user and which power consumption figures can be reached.

Power Management (PM) can be divided into two major areas:
# PM in suspend mode (also called suspend to RAM): in this mode the CPU is halted, most of its internal clocks are gated, the DDR memories are put in self-refresh state and [[:Category:Naon|Naon]] can wake up only from a user-selected set of peripherals (usually the internal timer, a GPIO or a UART)
# PM in running mode: in this mode most of the SOM is working, only a few clocks are gated and the others are slowed down to reduce power consumption. The user can choose among different levels of clock/power supply settings according to what the application needs to do.

== Runtime Power Management Support ==

Runtime power management support allows the userspace application to choose the most appropriate trade-off between performance and power consumption, according to what the application itself needs to do.

This PM can be divided in two sections:
# ARM Cortex-A8 PM
# DSP, HDVICP2 and CORE (which groups HDVPSS, the dual Cortex M3 and the L3 bus) PM

The main difference between the two is that the Linux kernel itself knows quite well how loaded the A8 is at any given point in time, because the kernel is what schedules both kernel and userspace processes. A configurable CPU governor is a standard Linux kernel feature that can be found on PCs and laptops too. According to the user settings, the kernel changes the A8 frequency/voltage on its own.

The other subsystems (DSP, HDVICP2 and CORE) are not managed directly by the kernel (e.g. the dual M3 cores run their own independent RTOS), so the kernel cannot choose the optimal working point for them from a PM point of view. It is the userspace application's responsibility to choose the correct OPP (Operating Performance Point) to reach a given result (e.g. Full-HD H264 encoding vs 720p H264 decoding).

This is even more true for video management applications: here runtime frequency scaling is critical, because the computational load depends on the data stream and can change very quickly. What is usually done in this situation is to use a fixed configuration that is not optimal at every instant but can handle the ''worst'' input stream without losing frames and without wasting too much power. For this reason the A8 too can be configured statically with an OPP, thus disabling the standard Linux governor support.

=== CPU Governor Usage ===

The CPU governor is a standard feature of recent Linux kernels. The kernel itself chooses the best working point depending on the CPU load. The user can choose between various predefined governors or choose the working frequency directly. In the latter configuration the kernel only chooses the best allowed OPP for the frequency selected by the user. Please note that this also allows the Cortex-A8 to be treated like the rest of the PM subsystem (DSP, CORE, HDVICP2), so that its OPP can be set up directly (see next section).

The pre-configured governors are listed in the following table:

{| class="wikitable"
|-
! Governor !! Brief description
|-
| powersave || configure the lowest CPU frequency
|-
| performance || configure the highest CPU frequency
|-
| ondemand || set the CPU frequency depending on current usage, fast change between slowest and highest frequencies
|-
| conservative || much like ''ondemand'' but change CPU frequency more gracefully
|-
| userspace || allow root processes to configure CPU frequency. No changes are done automatically
|}

The user can change the current governor through sysfs, e.g.:

<pre class="board-terminal">
root@naon:~# echo powersave > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
[ 402.660000] cpufreq-omap: frequency transition: 600000 --> 200000
[ 402.660000] cpufreq-omap: voltage transition: 1200000 --> 1000000
</pre>

<pre class="board-terminal">
root@naon:~# echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
[ 678.540000] cpufreq-omap: frequency transition: 200000 --> 1000000
[ 678.540000] cpufreq-omap: voltage transition: 1000000 --> 1350000
</pre>

The default governor behavior can be changed through the other sysfs entries inside ''/sys/devices/system/cpu/cpu0/cpufreq''. For more information on the standard Linux governors see the file ''Documentation/cpu-freq/governors.txt'' inside the Linux kernel source tree.
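
The governor switch shown above can be wrapped in a small shell helper. The following is only a minimal sketch and is not part of NELK: the function name ''set_governor'' and the ''CPUFREQ_DIR'' variable are hypothetical, while ''scaling_available_governors'' and ''scaling_governor'' are the standard cpufreq sysfs entries.

```shell
#!/bin/sh
# Hypothetical helper (not part of NELK): select a cpufreq governor only if
# the kernel reports it as available.
# CPUFREQ_DIR is an assumption of this sketch; it can be overridden for testing.
CPUFREQ_DIR="${CPUFREQ_DIR:-/sys/devices/system/cpu/cpu0/cpufreq}"

set_governor()
{
    gov="$1"
    # scaling_available_governors holds a space-separated list of names
    for g in $(cat "$CPUFREQ_DIR/scaling_available_governors"); do
        if [ "$g" = "$gov" ]; then
            printf '%s' "$gov" > "$CPUFREQ_DIR/scaling_governor"
            return 0
        fi
    done
    echo "governor '$gov' is not available" >&2
    return 1
}
```

For example, ''set_governor userspace'' would hand the frequency choice over to userspace, which is the setting needed before configuring the A8 through a static OPP.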

=== OPP configuration ===

As stated before, automatic configuration based on measuring the system load cannot be done for some elements, or is not useful in many situations (e.g. realtime video stream processing).

What is useful is to choose, for each subsystem, a set-point that has enough processing power to do the required work (e.g. without losing video frames) while minimizing the power consumption.

The situation is even harder to manage due to internal system constraints: changing the frequency of a subsystem requires changing its power supply (as a rule of thumb, a higher supply voltage is required to run stably at a higher frequency). Other restrictions prevent the user from selecting arbitrary settings on different subsystems, because some combinations would lead to system instability.

For its [[:Category:Naon|Naon]] platform, Dave provides a ''custom'' PM driver that allows the user to choose, at runtime, an OPP for each subsystem (Cortex A8, CORE, DSP, HDVICP); each OPP already configures a stable voltage/frequency setting. Some OPPs are directly derived from those officially supported by TI (see, for example, the DVFS section of the DM814x TRM), while others are custom OPPs provided by Dave.

{| class="wikitable"
|-
! OPP !! Brief Description !! Known Restrictions
|-
| OPP0 || minimum power consumption without standby || Video subsystem is disabled (both VOUTx and VINx)
|-
| OPP50 || minimum power consumption with static video output (HDMI 1080p) || only static (FB-based) output is allowed, VIN disabled
|-
| OPP100 || base performance with video working, A8@600MHz || video processing subsystem performance is limited
|-
| OPP120 || medium performance, A8@720MHz || not allowed on base Naon module
|-
| OPP166 || high performance, A8@1GHz, DSP@700MHz || top CPU/video performance, requires Naon module DAxxxxx (TDB)
|-
| OPP166x || top performance, A8@1GHz, DSP@700MHz || maximum video processing performance, requires Naon module DAxxxxx (TDB)
|}

{{ImportantMessage|text=As described in the table above, not all OPPs can be used with video output enabled. In fact OPP0 does not provide enough processing power on HDVPSS to display even static images on HDMI (VOUT1) @1080p60. For streaming video processing (e.g. H264 decoding) at least OPP100 is required.}}


{{Board Specific Information|text=Please note that the A8 settings can be overridden by the Linux kernel governor (see previous section). Choose the ''userspace'' governor to configure A8 performance with a static OPP.}}

OPP settings can be changed globally (in other words: the same OPP for each subsystem) or locally (a different OPP for each subsystem).

OPP setup is done through the sysfs interface. For example, to configure ''OPP50'' globally enter the following command at the [[:Category:Naon|Naon]] console:

<pre class="board-terminal">
root@naon:~# echo -n OPP50 > /sys/devices/platform/naon_power/opp
[ 1640.460000] naon_power naon_power: entering OPP50 for domain ARM
[ 1640.460000] naon_power naon_power: [OPP50] slowing down clock arm_dpll_ck from 600000000 to 200000000
[ 1640.470000] naon_power naon_power: [OPP50] setting power for ARM to 1050 [mV]
[ 1640.480000] naon_power naon_power: entering OPP50 for domain CORE
[ 1640.490000] naon_power naon_power: [OPP50] slowing down clock iss_dpll_ck from 400000000 to 200000000
[ 1640.500000] naon_power naon_power: [OPP50] slowing down clock l3_dpll_ck from 200000000 to 50000000
[ 1640.510000] naon_power naon_power: [OPP50] setting power for CORE to 1050 [mV]
[ 1640.530000] naon_power naon_power: [OPP50] speed-up clock hdvpss_dpll_ck from 20000000 to 150000000
[ 1640.540000] naon_power naon_power: entering OPP50 for domain DSP
[ 1640.540000] naon_power naon_power: [OPP50] slowing down clock dsp_dpll_ck from 500000000 to 100000000
[ 1640.560000] naon_power naon_power: [OPP50] setting power for DSP to 1050 [mV]
[ 1640.590000] naon_power naon_power: entering OPP50 for domain HDVICP
[ 1640.590000] naon_power naon_power: [OPP50] slowing down clock hdvicp_dpll_ck from 266000000 to 50000000
[ 1640.610000] naon_power naon_power: [OPP50] setting power for HDVICP to 1050 [mV]
</pre>

To choose a different OPP for each subsystem, the user can write a comma-separated list of OPPs to the same sysfs entry. E.g. for an application that does not require the DSP and the hardware encoder/decoder, the user can choose:
* OPP120 for ARM Cortex-A8
* OPP120 for CORE (HDVPSS and M3)
* OPP0 for DSP
* OPP0 for HDVICP2

By entering:
<pre class="board-terminal">
root@naon:~# echo -n OPP120,OPP120,OPP0,OPP0 > /sys/devices/platform/naon_power/opp
[ 2183.050000] naon_power naon_power: entering OPP120 for domain ARM
[ 2183.060000] naon_power naon_power: [OPP120] setting power for ARM to 1200 [mV]
[ 2183.070000] naon_power naon_power: [OPP120] speed-up clock arm_dpll_ck from 200000000 to 720000000
[ 2183.080000] naon_power naon_power: entering OPP120 for domain CORE
[ 2183.090000] naon_power naon_power: [OPP120] setting power for CORE to 1200 [mV]
[ 2183.100000] naon_power naon_power: [OPP120] speed-up clock iss_dpll_ck from 200000000 to 400000000
[ 2183.110000] naon_power naon_power: [OPP120] speed-up clock hdvpss_dpll_ck from 150000000 to 200000000
[ 2183.120000] naon_power naon_power: [OPP120] speed-up clock l3_dpll_ck from 50000000 to 220000000
[ 2183.130000] naon_power naon_power: entering OPP0 for domain DSP
[ 2183.140000] naon_power naon_power: [OPP0] slowing down clock dsp_dpll_ck from 100000000 to 10000000
[ 2183.150000] naon_power naon_power: [OPP0] setting power for DSP to 1050 [mV]
[ 2183.150000] naon_power naon_power: entering OPP0 for domain HDVICP
[ 2183.160000] naon_power naon_power: [OPP0] slowing down clock hdvicp_dpll_ck from 50000000 to 10000000
[ 2183.170000] naon_power naon_power: [OPP0] setting power for HDVICP to 1050 [mV]
</pre>

The user can also see the currently configured OPPs by looking inside the same sysfs entry:

<pre class="board-terminal">
root@naon:~# cat /sys/devices/platform/naon_power/opp
ARM: OPP120 CORE: OPP120 DSP: OPP0 HDVICP: OPP0
</pre>
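
The write and read operations above can be sketched as two shell functions. This is only an illustrative sketch, not part of NELK: the names ''set_opp'', ''get_opp'' and ''NAON_PM_DIR'' are hypothetical, and the parsing assumes the ''ARM: ... CORE: ... DSP: ... HDVICP: ...'' layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NELK) around the naon_power "opp" entry.
# NAON_PM_DIR is an assumption of this sketch; override it for testing.
NAON_PM_DIR="${NAON_PM_DIR:-/sys/devices/platform/naon_power}"

# set_opp OPP                   -> same OPP for every domain
# set_opp ARM CORE DSP HDVICP   -> one OPP per domain, in this order
set_opp()
{
    case $# in
        1) printf '%s' "$1" > "$NAON_PM_DIR/opp" ;;
        4) printf '%s,%s,%s,%s' "$1" "$2" "$3" "$4" > "$NAON_PM_DIR/opp" ;;
        *) echo "usage: set_opp OPP | set_opp ARM CORE DSP HDVICP" >&2
           return 1 ;;
    esac
}

# get_opp DOMAIN: print the OPP currently reported for one domain, assuming
# the "ARM: OPP120 CORE: OPP120 ..." layout shown above.
get_opp()
{
    awk -v dom="$1:" \
        '{ for (i = 1; i < NF; i++) if ($i == dom) print $(i + 1) }' \
        "$NAON_PM_DIR/opp"
}
```

With these helpers, ''set_opp OPP120 OPP120 OPP0 OPP0'' writes the same string used in the example above, and ''get_opp DSP'' prints only the DSP entry of the current configuration.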

{{ImportantMessage|text=The user can change the OPP setting at runtime without putting the machine or its application in any ''special'' mode. However, changing the CORE frequency, which involves HDVPSS and thus the video output (VOUTx) encoders, may lead to display flickering. }}

=== Manually choosing working frequencies ===

For finer control over power management and performance, the [[:Category:Naon|Naon]] PM driver allows the user to specify the frequency of each configurable clock known to the Linux kernel.

{{WarningMessage|text=Please note that a frequency change also requires a power supply change to keep the system stable. However, changing the power supply is a critical operation for the module and may damage it permanently. For this reason the [[:Category:Naon|Naon]] PM driver allows the user to customize only the frequency and not the power supply, which is adjusted automatically by choosing the right OPP}}

Naon PM driver can manage a list of clocks and configure, for each of them:
* the running frequency
* the standby frequency

We will look only at the former, while the latter will be detailed in the ''Standby'' section of this article.

First of all, the user should find the correct clock name, e.g. by looking inside ''/sys/kernel/debug/clock'' (debugfs must be enabled and mounted). We will suppose that the user wants to configure the HDVPSS (and thus M3) frequency, which is derived from the ''iss_dpll_ck'' PLL (only PLLs are configurable).

Now the user has to ''add'' the chosen clock to the list managed by Naon PM driver:

<pre class="board-terminal">
echo -n +iss_dpll_ck > /sys/devices/platform/naon_power/clk_list
</pre>

Please note the use of '''-n''' (which suppresses the trailing newline that echo would otherwise add) and the '''+''' prepended to the clock name.

A clock can be removed from the list by replacing ''+'' with ''-'', e.g.

<pre class="board-terminal">
root@naon:~/pm# echo -n -iss_dpll_ck > /sys/devices/platform/naon_power/clk_list
[ 1295.590000] naon_power naon_power: iss_dpll_ck removed
</pre>

Before it can be used, the clock must be selected among those in the list:

<pre class="board-terminal">
root@naon:~/pm# echo -n iss_dpll_ck > /sys/devices/platform/naon_power/selected_clk
[ 1060.570000] naon_power naon_power: iss_dpll_ck finded
</pre>

The user can now query the current frequency value:

<pre class="board-terminal">
root@naon:~/pm# cat /sys/devices/platform/naon_power/selected_clk
iss_dpll_ck
root@naon:~/pm# cat /sys/devices/platform/naon_power/rate
200000000
</pre>

Setting the frequency is just a matter of writing its value, in Hz, into the same sysfs entry:

<pre class="board-terminal">
root@naon:~/pm# echo -n 300000000 > /sys/devices/platform/naon_power/rate
[ 1389.570000] naon_power naon_power: setting rate to 300000000
</pre>
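
Since the ''rate'' entry works in Hz while frequencies are usually quoted in MHz, a thin wrapper can do the conversion. The following is a minimal sketch, not part of NELK: ''set_clk_rate_mhz'', ''get_clk_rate_mhz'' and ''NAON_PM_DIR'' are hypothetical names, and the sequence (add, select, write rate) simply follows the steps described above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of NELK): set or read the running frequency
# of a clock, using MHz on the command line and Hz in sysfs.
# NAON_PM_DIR is an assumption of this sketch; override it for testing.
NAON_PM_DIR="${NAON_PM_DIR:-/sys/devices/platform/naon_power}"

# set_clk_rate_mhz CLOCK MHZ
set_clk_rate_mhz()
{
    clk="$1"
    mhz="$2"
    printf '%s' "+$clk" > "$NAON_PM_DIR/clk_list"           # add to the managed list
    printf '%s' "$clk" > "$NAON_PM_DIR/selected_clk"        # select it
    printf '%s' "$((mhz * 1000000))" > "$NAON_PM_DIR/rate"  # the rate entry is in Hz
}

# get_clk_rate_mhz: print the rate of the currently selected clock in MHz
get_clk_rate_mhz()
{
    echo "$(( $(cat "$NAON_PM_DIR/rate") / 1000000 ))"
}
```

For example, ''set_clk_rate_mhz iss_dpll_ck 300'' writes 300000000 to ''rate'', as in the session above. As the warning at the top of this section explains, only the frequency is changed; the supply voltage still depends on the selected OPP.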

== Standby Support ==

Entering standby means putting the [[:Category:Naon|Naon]] module in a ''sleep'' state, where the power consumption is minimal and no processing is allowed. All processes are halted, the ARM Cortex A8 enters a specific mode called WFI (''Wait For Interrupt'') and the DDR2 RAM goes into self-refresh mode (the lowest power consumption mode that does not lose the memory contents).

An introduction to Standby support on DM814x SOC can be found on [http://processors.wiki.ti.com/index.php/DM814x_AM387x_PM_Suspend_resume_overview TI wiki] too.

Entering standby mode can be done by any userspace application, while wakeup is allowed only from:
* internal timer (which, of course, should be enabled and configured before entering standby)
* an interrupt source (e.g. GPIO or UART). Please note that having an interrupt source enabled means that its clocks cannot be disabled when entering standby mode.

By default, timer and UART (Linux serial console) wakeup are enabled.

{{Board Specific Information|text=To allow UART wakeup from Linux console, the user should add '''no_console_suspend''' parameter to kernel command line. See [[Change Linux Command Line Parameter from U-boot]] for more information on how to do this.}}

Due to the system complexity and the dependency on the user application, standby mode requires a bit of configuration for optimal results. The user can (and should) set the standby configuration of the various clocks depending on the application. However, Dave provides a default clock configuration and system setup that is usually both functional and best performing for common [[:Category:Naon|Naon]]-based platforms. Clocks are managed by the same Naon PM driver described in the previous sections.

From the '''user point of view''', entering standby is a matter of:
# configuring the suspend/standby clocks (by slowing down their frequencies and/or gating them completely)
# configuring the wakeup sources
# entering standby

The first step is the most complex one, but it is required to get the best result in terms of power consumption. Dave provides a script that configures most of the unused clocks and gives the best power result without too much impact on suspend/wakeup latency.

Here is the shell script provided with NELK:

<pre>
#!/bin/sh
setup_pll()
{
echo configuring $1 to $2 MHz
echo -n +$1 > /sys/devices/platform/naon_power/clk_list
echo -n $1 > /sys/devices/platform/naon_power/selected_clk
echo -n $(expr $2 \* 1000 \* 1000) > /sys/devices/platform/naon_power/rate
}

setup_suspend_clock()
{
echo configuring $1 to $2 MHz
echo -n +$1 > /sys/devices/platform/naon_power/clk_list
echo -n $1 > /sys/devices/platform/naon_power/selected_clk
echo -n $(expr $2 \* 1000 \* 1000) > /sys/devices/platform/naon_power/suspend_rate
}

gate_clock_on_suspend()
{
echo gating $1 on suspend
echo -n +$1 > /sys/devices/platform/naon_power/clk_list
echo -n $1 > /sys/devices/platform/naon_power/selected_clk
echo -n 0 > /sys/devices/platform/naon_power/suspend_rate
}

gate_clock_on_suspend iss_dpll_ck
gate_clock_on_suspend hdvicp_dpll_ck
gate_clock_on_suspend dsp_dpll_ck

setup_pll hdvpss_dpll_ck 20
gate_clock_on_suspend hdvpss_dpll_ck
gate_clock_on_suspend sgx_dpll_ck
gate_clock_on_suspend audio_dpll_ck
gate_clock_on_suspend uart6_fck
gate_clock_on_suspend uart5_fck
gate_clock_on_suspend uart4_fck
gate_clock_on_suspend uart3_fck
gate_clock_on_suspend uart2_fck
#gate_clock_on_suspend uart1_fck
</pre>

The following table summarizes what is done on each subsystem:

{| class="wikitable"
|-
! Clock !! Affected subsystem !! Standby State
|-
| iss_dpll_ck || M3 || gated
|-
| hdvicp_dpll_ck || HDVICP2|| gated
|-
| dsp_dpll_ck || DSP || gated
|-
| hdvpss_dpll_ck || HDVPSS|| gated
|-
| sgx_dpll_ck|| SGX || gated
|-
| audio_dpll_ck || audio (McASP, McBSP..) || gated
|-
| uart[2..6]_fck || UART [2..6] || gated
|}

Configuring the wakeup sources is a matter of choosing which IRQs are left unmasked when entering standby, and whether and how to enable the wakeup timer. We will look only at the latter, because the former is highly driver dependent.

The wakeup timer is described in detail in the [http://processors.wiki.ti.com/index.php/DM814x_AM387x_PM_Suspend_resume_overview TI wiki]. In brief, the user just enters the timeout (in seconds and/or milliseconds) via sysfs. E.g. to wake up 2.5 s after standby mode is entered, the following commands can be used:

<pre class="board-terminal">
root@naon:~# echo 2 > /sys/kernel/debug/pm_debug/wakeup_timer_seconds
root@naon:~# echo 500 > /sys/kernel/debug/pm_debug/wakeup_timer_milliseconds
</pre>

{{Board Specific Information|text=Please note that the above commands require debugfs to be enabled and mounted (e.g. ''mount -t debugfs debugfs /sys/kernel/debug'')}}
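
The split of a timeout into its seconds and milliseconds parts can be done with integer arithmetic. The following is only a sketch, not part of NELK: ''set_wakeup_timer_ms'' and ''PM_DEBUG_DIR'' are hypothetical names, while the two file names are the ones used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not part of NELK): program the wakeup timer from a
# single timeout expressed in milliseconds.
# PM_DEBUG_DIR is an assumption of this sketch; override it for testing.
PM_DEBUG_DIR="${PM_DEBUG_DIR:-/sys/kernel/debug/pm_debug}"

# set_wakeup_timer_ms TIMEOUT_MS
set_wakeup_timer_ms()
{
    total_ms="$1"
    # whole seconds go to one entry, the remainder to the other
    echo "$((total_ms / 1000))" > "$PM_DEBUG_DIR/wakeup_timer_seconds"
    echo "$((total_ms % 1000))" > "$PM_DEBUG_DIR/wakeup_timer_milliseconds"
}
```

For example, ''set_wakeup_timer_ms 2500'' writes 2 and 500, which matches the 2.5 s example above.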

Once all of the above has been set up, the user can enter standby by running the following command:

<pre class="board-terminal">
root@naon:~# echo mem > /sys/power/state
</pre>

If the wakeup timer has been configured, the user will see, among other messages, something like the following:

<pre class="board-terminal">
root@naon:~# echo 2 > /sys/kernel/debug/pm_debug/wakeup_timer_seconds
root@naon:~# echo 500 > /sys/kernel/debug/pm_debug/wakeup_timer_milliseconds
root@naon:~# echo mem > /sys/power/state
[snip]
[ 322.670000] PM: Resume timer in 2.500 secs (50000000 ticks at 20000000 ticks/sec.)
[snip]
</pre>

From the '''kernel point of view''', entering standby means:
* syncing filesystems and freezing processes
* calling the suspend() function of each registered device driver (e.g. the suspend() function of the USB driver puts the USB PHY in suspend too)
* saving the current OPP and entering OPP0 (see above)
* reconfiguring and/or gating the user-defined clocks
* (if needed) configuring the wakeup timer
* (if needed) masking all interrupts apart from those of the peripherals used for wakeup
* putting the DDR in self-refresh
* putting the Cortex A8 in WFI

When an interrupt is raised or when the wakeup timer elapses:
* the Cortex A8 leaves WFI
* the DDR RAM is put back in working mode
* the interrupt masks are restored
* the previously saved OPP is restored
* the clocks are un-gated and/or restored
* all processes are thawed

=== Evaluate Wakeup Latency ===

Wakeup latency, in other words the time a suspended system needs to get back to a fully functional state, can be evaluated by:
* configuring the wakeup timer to a given value
* using the ''time'' command to measure the running time of the ''echo mem > /sys/power/state'' command
* subtracting the wakeup timer value from the running time of the ''echo mem'' command

This is not very precise, but it gives an order-of-magnitude estimate of the latency. Please note that the time needed to enter standby is also included in the result.

Here is a sample result:

<pre class="board-terminal">
root@naon:~# echo -n OPP100 > /sys/devices/platform/naon_power/opp
root@naon:~# echo 5 > /sys/kernel/debug/pm_debug/wakeup_timer_seconds
root@naon:~# echo 0 > /sys/kernel/debug/pm_debug/wakeup_timer_milliseconds
root@naon:~# time echo mem > /sys/power/state
real 0m 5.86s
user 0m 0.00s
sys 0m 0.03s
</pre>

In the example the total running time of ''echo mem'' command was 5.86s, while wakeup timer was configured with a timeout of 5s. '''The total latency is 860 ms'''.
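
The subtraction can be scripted as follows. This is a minimal sketch, not part of NELK; ''wakeup_latency_ms'' is a hypothetical name, and ''awk'' is used because shell arithmetic handles integers only.

```shell
#!/bin/sh
# Hypothetical helper (not part of NELK): wakeup latency in milliseconds,
# given the measured "real" time of the suspend command and the programmed
# wakeup timeout, both in seconds.
# Usage: wakeup_latency_ms REAL_S TIMER_S
wakeup_latency_ms()
{
    awk -v total="$1" -v timer="$2" \
        'BEGIN { printf "%.0f\n", (total - timer) * 1000 }'
}

# values from the sample run above
wakeup_latency_ms 5.86 5
```

With the values of the sample run the function prints 860, i.e. the 860 ms quoted above.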

== PM performance and power consumption summary ==

The following table summarizes the power consumption of the whole [[:Category:Naon|Naon]] module (3.3V power supply) in the different situations.

{| class="wikitable"
|-
! OPP !! Overall Status !! MPU (A8) Processing !! Video Processing !! Shunt Voltage [mV] !! Current [mA] !! Power Consumption [mW]
|-
| StandBy || suspend to RAM || WFI || Off || 2.5 || 250 || 825
|-
| OPP0 || CPU working (slowly) || 0%-100% || Off || 4.9 || 490 || 1617
|-
| OPP50 || CPU/Static video working || 0% || HDMI 1080p60, static || 6.1 || 610 || 2013
|-
| OPP100 || all subsystem working, A8@600MHz || 0% || 0% || 7.6 || 760 || 2508
|-
| OPP120 || all subsystem working, A8@720MHz || 0% || 0% || 8.5 || 850 || 2805
|-
| OPP166 || all subsystem working, A8@1GHz || 0% || 0% || 10.4 || 1040 || 3432
|-
| OPP166x || all subsystem working, A8@1GHz || 0% || 0% || 10.7 || 1070 || 3531
|}

NOTE:
* in standby mode the on-module 10/100Mbit Ethernet PHY is kept in reset
* 0% in both ''Processing'' columns means that the subsystem is turned on but has no input
* the shunt voltage was measured across R422 (10mOhm) on [[:Category:NaonEVB-Mid|NaonEVB-Mid]], which feeds the [[:Category:Naon|Naon]] module 3.3V power supply
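
The ''Current'' and ''Power Consumption'' columns follow from the measured voltage by Ohm's law: the current is the voltage across R422 divided by its resistance, and the power is that current multiplied by the 3.3V supply. The following is a minimal sketch of the calculation, not part of NELK; ''shunt_to_power'' is a hypothetical name and the default values are the 10mOhm and 3.3V figures stated above.

```shell
#!/bin/sh
# Hypothetical helper (not part of NELK): derive current and power from the
# voltage measured across the sense resistor.
# Usage: shunt_to_power VOLTAGE_mV [RESISTOR_mOhm] [SUPPLY_V]
shunt_to_power()
{
    awk -v mv="$1" -v mohm="${2:-10}" -v vsup="${3:-3.3}" 'BEGIN {
        ma = mv / mohm * 1000     # I [mA] = V [mV] / R [mOhm] * 1000
        mw = ma * vsup            # P [mW] = I [mA] * V_supply [V]
        printf "%.0f mA, %.0f mW\n", ma, mw
    }'
}
```

For example, ''shunt_to_power 2.5'' reproduces the StandBy row of the table (250 mA, 825 mW).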