sections/*: updates
parent
9f31bf0180
commit
40e25cc7ff
|
@ -41,7 +41,7 @@ Our basic design is to set the CPU speed to 70\% for general case, unless overri
|
|||
Hints from userspace generally involve an upcoming performance-prioritized period where the CPUs should be set to 100\%.
|
||||
We also support a userspace hint for pending memory-bound workload, where the CPU should be set to a lower speed of 40\%.
|
||||
Userspace can also indicate that performance-priority or memory-bound optimized periods have ended, and that default behavior should resume.
|
||||
For stability and security, we implement a time-out of \fixme{what} -- we have observed these periods are less than this in practice, and userspace can always re-supply the hint.
|
||||
For stability and security, we implement a time-out of 10s -- we have observed these periods are typically much less than this in practice, and userspace can always re-supply the hint.
|
||||
When \systemname receives conflicting hints from different apps, it prioritizes both performance-optimized and memory-bound hints over default behavior (energy-centric) hints.
|
||||
We reserve resolution between performance-prioritized (speed 100) and memory-bound (speed 40) hints to future work.
|
||||
We do not address different simultaneous CPU speeds: On our device, the speeds of the 4 big and 4 little core clusters must be set as a block.
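The hint handling described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the \systemname governor itself: the names \texttt{Hint}, \texttt{HintGovernor}, \texttt{set\_hint}, and \texttt{speed\_percent} are hypothetical, a single hint source is assumed (the most recent hint wins), and arbitration between hints from different apps is left out.

```python
import time
from enum import Enum


class Hint(Enum):
    """Hint kinds mapped to the CPU speed (percent) named in the text."""
    DEFAULT = 70        # general case
    PERFORMANCE = 100   # upcoming performance-prioritized period
    MEMORY_BOUND = 40   # pending memory-bound workload


HINT_TIMEOUT_S = 10.0   # time-out after which a hint lapses


class HintGovernor:
    """Hypothetical single-source model: the latest hint wins until it expires."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so the time-out can be tested
        self._hint = Hint.DEFAULT
        self._expires_at = 0.0

    def set_hint(self, hint):
        """Record a userspace hint; DEFAULT signals that the period has ended."""
        self._hint = hint
        if hint is not Hint.DEFAULT:
            # Re-supplying a hint simply restarts the time-out window.
            self._expires_at = self._clock() + HINT_TIMEOUT_S

    def speed_percent(self):
        """Return the speed to apply now, falling back to default on time-out."""
        if self._hint is not Hint.DEFAULT and self._clock() >= self._expires_at:
            self._hint = Hint.DEFAULT
        return self._hint.value
```

Passing the clock in as a parameter is only a convenience for exercising the time-out without waiting; a kernel implementation would rely on its own timer facilities.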
|
||||
|
@ -54,7 +54,7 @@ The Android AOSP 10 platform forms the base of our system.
|
|||
We modded the Linux 4.4.210 kernel with the new \systemname governor to implement the design logic.
|
||||
The kernel can then use this information at its discretion to set an appropriate CPU speed.
|
||||
%(modulo very low battery etc.)
|
||||
An additional syscall API, with native calldown support from the Android platform, allows userspace to communicate hints about pending system needs from userspace.
|
||||
A syscall API, with native calldown support from the Android platform, allows userspace to communicate hints about pending system needs and suggested CPU speed settings.
|
||||
|
||||
We modded the AOSP framework to expose the new syscall interface to userlevel Java code.
|
||||
\fixme{todo}
|
||||
|
|
|
@ -22,6 +22,8 @@ The current default CPU policy, \schedutil, bases speed from the proportion of r
|
|||
|
||||
\subsection{Idle overrides any speed}
|
||||
|
||||
\fixme{ADD: discussion re: p-states and c-states -- we rely on the latter to shut off cores but do not delve into tweaking it}
|
||||
|
||||
%An obvious question with running an app at a fixed speed is what happens when work finishes.
|
||||
Running a CPU with no work wastes energy, and slowing the CPU saves energy.
|
||||
%This complex speed-selection system is not the only way, however.
|
||||
|
@ -81,7 +83,7 @@ With history-driven dynamic policies such as \schedutil, this triggers constantl
|
|||
Figure \ref{fig:speed_time} shows CPU speed, over time, for 1s of scrolling through the Facebook app feed.
|
||||
The top plot shows nominal CPU speed, ignoring idling -- this is the setting requested by the default policy.
|
||||
|
||||
Previous studies have noted that interactive workloads significantly harm smartphone performance.\cite{nuessle2019benchmarking}
|
||||
Previous studies have noted that intermittent workloads significantly harm smartphone performance.\cite{nuessle2019benchmarking}
|
||||
Figure \ref{fig:speed_time_delay} shows how the combination of intermittent loads with lagging ramp-up speeds picked by the \schedutil policy increases runtime significantly.
|
||||
We ran the same fixed workload with 2 different delay settings.
|
||||
In both cases, the 2 righthand graphs are time zooms of the 2 left graphs to show detail.
|
||||
|
@ -168,14 +170,16 @@ Some system of changing CPU speed is necessary to achieve the base goals of furn
|
|||
Instead of a system that blindly uses the history of past CPU usage to make frequent fine adjustments to CPU speed, is there an alternative?
|
||||
|
||||
Our experiments suggest a less complex approach for phone CPU management.
|
||||
When performance needs warrant, immediately run the CPU at maximum speed.
|
||||
Aside from performance-critical periods -- such as when the user is waiting on a compute-bound task -- CPUs should run at or near a midspeed setting that conserves energy.
|
||||
\fixme{phraseology correct?}
|
||||
In both cases, let the idle subsystem turn off any unneeded CPUs.
|
||||
This avoids the twin pitfalls of picking a speed that is too low -- wasting either performance, energy or both -- or too high, wasting energy.
|
||||
Previous studies have acknowledged the policy goal for the smartphone platform should be to minimize energy usage, subject to meeting performance targets.\cite{rao2017application}
|
||||
|
||||
Performance targets fall into 2 broad categories.
|
||||
When the user is interacting with the phone but the device is computation bound -- that is, when the user is waiting on the phone -- immediately run the CPU at maximum speed.
|
||||
Otherwise, run the CPU at a speed that conserves energy while preserving user experience.
|
||||
For both cases, let the idle subsystem turn off any unneeded CPUs.
|
||||
|
||||
We have found that, for common apps, a simple midspeed setting fulfils the latter general-case goal: It saves energy relative to the system default policy, while maintaining acceptable user experience as measured in frame drops.
|
||||
It does so by avoiding the twin pitfalls of picking a speed that is too low -- sacrificing performance, energy or both -- or too high, wasting energy.
|
||||
In section \ref{sec:design}, we present the \systemname system that implements this.
|
||||
This design, intuitive for non-interactive periods, also proves suitable for running typical interactive apps.
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
|
@ -51,6 +51,11 @@ Our results were obtained using Google Pixel 2 devices running Android AOSP 10 w
|
|||
One of the phones was modified to obtain energy measurements using the Monsoon HVPM power meter.\cite{monsoon}
|
||||
Our evaluation system consists of a series of shell scripts running on both the phone and offline.
|
||||
|
||||
The main offline script sleeps for 10s, then starts another script that controls the Monsoon meter as well as the on-phone script.
|
||||
The latter, after sleeping another 20s, sets the desired governor policy and starts the experiment, sleeps another 10s, and notifies the offline script of completion.
|
||||
The sleeps ensure that the evaluated device has reached a state of quiescence, and that system state artifacts from previous runs do not bleed over to the next evaluation.
|
||||
They also exclude the workload effects of post-experiment data collection and transfer from the measured data.
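The ordering of the sleeps can be illustrated with a short sketch. It is written in Python rather than shell purely for illustration, and every callback name here (\texttt{start\_meter}, \texttt{start\_phone\_script}, \texttt{set\_governor}, \texttt{run\_experiment}, \texttt{notify\_done}) is a hypothetical stand-in for the real scripts. It also assumes that the experiment call blocks until the workload has finished.

```python
import time

OFFLINE_SETTLE_S = 10   # offline script waits before starting anything
PHONE_SETTLE_S = 20     # on-phone script waits before touching the governor
POST_RUN_S = 10         # on-phone script waits before signalling completion


def run_phone_side(governor, set_governor, run_experiment, notify_done,
                   sleep=time.sleep):
    """Hypothetical on-phone sequence: settle, configure, run, settle, notify."""
    sleep(PHONE_SETTLE_S)
    set_governor(governor)
    run_experiment()        # assumed to block until the workload finishes
    sleep(POST_RUN_S)
    notify_done()


def run_offline_side(start_meter, start_phone_script, sleep=time.sleep):
    """Hypothetical offline sequence: settle, then start meter and phone script."""
    sleep(OFFLINE_SETTLE_S)
    start_meter()
    start_phone_script()
```

Injecting \texttt{sleep} as a parameter lets the sequence be checked without real delays.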
|
||||
|
||||
We ran 2 types of workloads: First, standalone native microbenchmarks in C performed fine-control experiments with differing CPU policies and fore- and background workloads.
|
||||
Second, we used the Android UI Automator testing framework to perform scripted simulated interactions with real-world apps.\cite{uiautomator}
|
||||
A UI Automator testing app mimicked typical user interactions, such as scrolling through the Facebook friends lists and feed.
|
||||
|
|
|
@ -1,4 +1,10 @@
|
|||
% -*- root: ../main.tex -*-
|
||||
|
||||
Lorem Ipsum
|
||||
Rao et al.\ acknowledge the need for going beyond a blind general-purpose governor, and tuning performance to particular apps.\cite{rao2017application}
|
||||
They do not...
|
||||
|
||||
The Polaris system also tunes CPU speed to pending workloads, using userspace information.\cite{korkmaz2018workload}
|
||||
This system requires knowledge of the pending amount of work and deadline target, information that is tied to and derivable from a specific type of workload, viz. databases.
|
||||
|
||||
|
||||
|
||||
|
|
|
@ -1,32 +1,48 @@
|
|||
% -*- root: ../main.tex -*-
|
||||
|
||||
Different types of workloads:
|
||||
\begin{itemize}
|
||||
\item[]{IO bound}
|
||||
\item[]{CPU bound}
|
||||
\item[]{Memory bound}
|
||||
\item[]{UI bound}
|
||||
\end{itemize}
|
||||
Our system design partly reflects the nature of our use platform: mobile phones.
|
||||
Phones are embedded devices with limited resources and specific use cases.
|
||||
A general-purpose system design, including general-case Linux CPU policies, is inappropriate, as it often misses behavior patterns that are common and predictable on the phone platform.
|
||||
Conversely, uses that may be frequent on other systems -- such as running at full system saturation for indefinite periods -- do not occur.
|
||||
A CPU policy that supports ramping up to saturation indefinitely is thus not appropriate.
|
||||
|
||||
\tinysection{UI and IO bound loads}
|
||||
While they must run many different types of tasks -- such as social media, videogames, web browsers, and financial apps -- they typically only do one at a time.
|
||||
That is, only one app is visible and potentially interactive.
|
||||
This greatly reduces the need to resolve priority and resource conflicts amongst apps.
|
||||
Our system thus focuses on the needs and demands of this foreground app, and avoids distractions that other information may induce.
|
||||
|
||||
\begin{itemize}
|
||||
\item{These benefit from an energy-favorable midspeed. Most of the time, the phone is blocking (either interactively or screenoff). No need to run the CPU faster or lower.}
|
||||
\item{This is the general case.}
|
||||
\end{itemize}
|
||||
Phone apps, being interactive in nature, spend the bulk of their time blocking on user input, waiting for button presses or screen swipes.
|
||||
Most of the time, the user is \textit{not} waiting for the phone to become ready for input.
|
||||
Secondly, the main product of phones is screendraws, whether scrolling a list or animating a graphic.
|
||||
Our system takes advantage of this by using CPU settings that save energy while maintaining user experience.
|
||||
|
||||
\fixme{add: download experiment (IO)}
|
||||
\tinysection{UI bound loads}
|
||||
|
||||
This is the common case by far.
|
||||
Most of the time, the main app thread is blocking on user input, whether interactively with the screen on or while dozing with the screen off.
|
||||
While there are background threads running, they are precomputing work for some future use, particularly pending screendraws.
|
||||
The CPUs spend the bulk of their time in idle -- that is, there are plenty of potential compute resources available.
|
||||
Thus, there is typically no need to run CPUs anywhere close to full speed.
|
||||
Rather, we have found an identifiable midspeed setting to be quite sufficient.
|
||||
|
||||
\fixme{add: download / audiostream}
|
||||
|
||||
\tinysection{CPU bound loads}
|
||||
|
||||
\begin{itemize}
|
||||
\item[]{These obviously benefit, from a performance vantage, from setting the CPU to 100\%.}
|
||||
\item[]{These situations are infrequent and readily identifiable.}
|
||||
\end{itemize}
|
||||
Even phones do have periods when the user is waiting.
|
||||
App installs, app coldstarts -- after an installed app gets killed due to memory pressure -- and new browser tabs fit this case.
|
||||
In such cases, the system should prioritize performance immediately, and there is no reason to run the CPU at any less than 100\% (as the default policy often does).
|
||||
While these situations are infrequent, they are not rare, so they must not be ignored.
|
||||
Happily, the bulk of them are also readily identifiable: The system knows when it needs to do a lot of work before it can present a foreground app ready to receive input.
|
||||
We design our system to use this information and show that it offers better performance than the default case.
|
||||
|
||||
\tinysection{Memory bound loads}
|
||||
|
||||
\begin{itemize}
|
||||
\item{These benefit from a slower speed.}
|
||||
\item{In practice, we have not identified any use cases.}
|
||||
\end{itemize}
|
||||
A workload that is memory bound -- say, pointer chasing or sorting over a sparse array -- presents a corner case.
|
||||
The CPU is necessarily running but stalling on loads and stores.
|
||||
An even lower speed than the common case, dictated by the new bottleneck of memory access rather than UI screendraws, can offer additional energy savings.
|
||||
However, we have not identified any real-world workloads that fall into this case and thus exclude it from our analysis.
|
||||
|
||||
\fixme{mention I/O bound loads? e.g. DNA processing}
|
||||
|
||||
|
||||
|
|