sections/*: updates

master
carlnues@buffalo.edu 2023-04-11 13:49:58 -04:00
parent 9f31bf0180
commit 40e25cc7ff
5 changed files with 63 additions and 32 deletions

View File

@ -41,7 +41,7 @@ Our basic design is to set the CPU speed to 70\% for general case, unless overri
Hints from userspace generally involve an upcoming performance-prioritized period where the CPUs should be set to 100\%.
We also support a userspace hint for pending memory-bound workload, where the CPU should be set to a lower speed of 40\%.
Userspace can also indicate that performance-priority or memory-bound optimized periods have ended, and that default behavior should resume.
For stability and security, we implement a time-out of \fixme{what} -- we have observed these periods are less than this in practice, and userspace can always re-supply the hint.
For stability and security, we implement a time-out of 10s -- we have observed that these periods are typically much shorter than this in practice, and userspace can always re-supply the hint.
When \systemname receives conflicting hints from different apps, it prioritizes both performance-optimized and memory-bound hints over the default, energy-centric behavior.
We leave resolution between performance-prioritized (100\%) and memory-bound (40\%) hints to future work.
We do not address different simultaneous CPU speeds: On our device, the speeds of the 4 big and 4 little core clusters must be set as a block.
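As a rough illustration of this policy, the sketch below encodes the speed table and time-out in a few lines of C; the identifiers and layout are placeholders rather than the actual governor source, but the mapping from hints to speed settings follows the description above.
\begin{verbatim}
/* Illustrative sketch only: the identifiers and layout are placeholders,
 * not the governor source shipped in our kernel.  It encodes the speed
 * table and the 10s hint time-out described above. */
#include <stdint.h>

enum hint { HINT_NONE, HINT_PERF, HINT_MEMBOUND };

#define SPEED_DEFAULT    70u                    /* percent of max frequency */
#define SPEED_PERF      100u
#define SPEED_MEMBOUND   40u
#define HINT_TIMEOUT_NS (10ull * 1000000000ull) /* 10s */

static enum hint active_hint = HINT_NONE;
static uint64_t hint_expires_ns;

/* Called whenever userspace supplies (or clears) a hint. */
void note_hint(enum hint h, uint64_t now_ns)
{
        /* Both hint types take priority over the default, energy-centric
         * setting; resolving PERF vs. MEMBOUND conflicts is future work. */
        active_hint = h;
        hint_expires_ns = now_ns + HINT_TIMEOUT_NS;
}

/* Called by the governor when it needs a target speed. */
unsigned int pick_speed(uint64_t now_ns)
{
        if (active_hint != HINT_NONE && now_ns > hint_expires_ns)
                active_hint = HINT_NONE;        /* stale hint: fall back */

        switch (active_hint) {
        case HINT_PERF:     return SPEED_PERF;     /* user is waiting */
        case HINT_MEMBOUND: return SPEED_MEMBOUND; /* stalled on memory */
        default:            return SPEED_DEFAULT;  /* energy-centric */
        }
}
\end{verbatim}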
@ -54,7 +54,7 @@ The Android AOSP 10 platform forms the base of our system.
We modded the Linux 4.4.210 kernel with the new \systemname governor to implement the design logic.
The kernel can then use this information at its discretion to set an appropriate CPU speed.
%(modulo very low battery etc.)
An additional syscall API, with native calldown support from the Android platform, allows userspace to communicate hints about pending system needs from userspace.
A syscall API, with native calldown support from the Android platform, allows userspace to communicate hints about pending system needs and suggested CPU speed settings.
We modded the AOSP framework to expose the new syscall interface to userlevel Java code.
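As a hypothetical illustration of the userspace side, a native helper could wrap the hint syscall as below; the syscall number, constants, and names are placeholders, not the exact interface exposed by our kernel and framework.
\begin{verbatim}
/* Hypothetical userspace call-down.  The syscall number, constants, and
 * names below are placeholders, not the exact interface in our kernel. */
#define _GNU_SOURCE
#include <unistd.h>

#define __NR_cpu_speed_hint 400                 /* placeholder number */

enum { CPU_HINT_DEFAULT = 0, CPU_HINT_PERF = 1, CPU_HINT_MEMBOUND = 2 };

static long cpu_speed_hint(int hint)
{
        /* The kernel treats the hint as advisory and may ignore it. */
        return syscall(__NR_cpu_speed_hint, hint);
}

/* Example: bracket a performance-critical period such as a cold start.
 *   cpu_speed_hint(CPU_HINT_PERF);   ... heavy setup work ...
 *   cpu_speed_hint(CPU_HINT_DEFAULT);                                  */
\end{verbatim}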
\fixme{todo}

View File

@ -22,6 +22,8 @@ The current default CPU policy, \schedutil, bases speed from the proportion of r
\subsection{Idle overrides any speed}
\fixme{ADD: discussion re: p-states and c-states -- we rely on the latter to shut off cores but do not delve into tweaking it}
%An obvious question with running an app at a fixed speed is what happens when work finishes.
Running a CPU with no work wastes energy, and slowing the CPU saves energy.
%This complex speed-selection system is not the only way, however.
@ -81,7 +83,7 @@ With history-driven dynamic policies such as \schedutil, this triggers constantl
Figure \ref{fig:speed_time} shows CPU speed, over time, for 1s of scrolling through the Facebook app feed.
The top plot shows nominal CPU speed, ignoring idling -- this is the setting requested by the default policy.
Previous studies have noted that interactive workloads significantly harm smartphone performance.\cite{nuessle2019benchmarking}
Previous studies have noted that intermittent workloads significantly harm smartphone performance.\cite{nuessle2019benchmarking}
Figure \ref{fig:speed_time_delay} shows how the combination of intermittent loads with lagging ramp-up speeds picked by the \schedutil policy increases runtime significantly.
We ran the same fixed workload with 2 different delay settings.
In both cases, the 2 right-hand graphs are time zooms of the 2 left-hand graphs to show detail.
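For reference, the frequency-selection heuristic in mainline \schedutil{} is approximately
\[
f_{\mathrm{next}} \approx 1.25 \cdot f \cdot \frac{\mathit{util}}{\mathit{util}_{\max}},
\]
where $f$ is the current frequency (or the maximum frequency on platforms with frequency-invariant utilization accounting) and $\mathit{util}$ is a decayed average of recent CPU utilization; details vary across kernel versions.
Because $\mathit{util}$ decays during idle gaps and rebuilds only while a burst is already running, each new burst in an intermittent workload is first served at a low frequency until the history catches up.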
@ -168,14 +170,16 @@ Some system of changing CPU speed is necessary to achieve the base goals of furn
Instead of a system that blindly uses the history of past CPU usage to make frequent fine adjustments to CPU speed, is there an alternative?
Our experiments suggest a less complex approach for phone CPU management.
When performance needs warrant, immediately run the CPU at maximum speed.
Aside from performance-critical periods -- such as when the user is waiting on a compute-bound task -- CPUs should run at or near a midspeed setting that conserves energy.
\fixme{phraseology correct?}
In both cases, let the idle subsystem turn off any unneeded CPUs.
This avoids the twin pitfalls of picking a speed that is too low -- wasting either performance, energy or both -- or too high, wasting energy.
Previous studies have acknowledged the policy goal for the smartphone platform should be to minimize energy usage, subject to meeting performance targets.\cite{rao2017application}
Performance targets fall into 2 broad categories.
When the user is interacting with the phone but the device is computation bound -- that is, when the user is waiting on the phone -- immediately run the CPU at maximum speed.
Otherwise, run the CPU at a speed that conserves energy while preserving user experience.
For both cases, let the idle subsystem turn off any unneeded CPUs.
We have found that, for common apps, a simple midspeed setting fulfils the latter, general-case goal: It saves energy relative to the system default policy, while maintaining acceptable user experience as measured in frame drops.
It does so by avoiding the twin pitfalls of picking a speed that is too low -- sacrificing performance, energy or both -- or too high, wasting energy.
In section \ref{sec:design}, we present the \systemname system that implements this.
This design, intuitive for non-interactive periods, also proves suitable for running typical interactive apps.

View File

@ -51,6 +51,11 @@ Our results were obtained using Google Pixel 2 devices running Android AOSP 10 w
One of the phones was modified to obtain energy measurements using the Monsoon HVPM power meter.\cite{monsoon}
Our evaluation system consists of a series of shell scripts running on both the phone and offline.
The main offline script sleeps for 10s, then starts another script that controls the Monsoon meter as well as the on-phone script.
The latter, after sleeping another 20s, sets the desired governor policy and starts the experiment, sleeps another 10s, and notifies the offline script of completion.
The sleeps ensure that the evaluated device has reached a state of quiescence, and that system state artifacts from previous runs do not bleed over to the next evaluation.
They also exclude the workload effects of post-experiment data collection and transfer from the measured data.
We ran 2 types of workloads: First, standalone native microbenchmarks written in C performed tightly controlled experiments with differing CPU policies and fore- and background workloads.
Second, we used the Android UI Automator testing framework to perform scripted simulated interactions with real-world apps.\cite{uiautomator}
A UI Automator testing app mimicked typical user interactions, such as scrolling through the Facebook friends list and feed.
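The following skeleton, shown only to illustrate the kind of native microbenchmark used for the controlled experiments, pins a cpufreq governor via sysfs (which requires root) and times a fixed busy loop; the loop count, governor name, and structure are illustrative rather than our exact harness.
\begin{verbatim}
/* Skeleton of a fine-control native microbenchmark.  The sysfs write
 * needs root; the loop count and governor name are illustrative. */
#include <stdio.h>
#include <time.h>

static void set_governor(const char *gov)
{
        FILE *f = fopen("/sys/devices/system/cpu/cpu0/cpufreq/scaling_governor", "w");
        if (!f) { perror("scaling_governor"); return; }
        fprintf(f, "%s\n", gov);
        fclose(f);
}

int main(void)
{
        struct timespec t0, t1;
        volatile unsigned long sink = 0;

        set_governor("schedutil");              /* policy under test */
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (unsigned long i = 0; i < 400000000UL; i++)
                sink += i;                      /* fixed CPU-bound work */
        clock_gettime(CLOCK_MONOTONIC, &t1);

        printf("elapsed: %.3f s\n",
               (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9);
        return 0;
}
\end{verbatim}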

View File

@ -1,4 +1,10 @@
% -*- root: ../main.tex -*-
Lorem Ipsum
Rao et al.\ acknowledge the need to go beyond a blind general-purpose governor and to tune performance to particular apps.\cite{rao2017application}
They do not...
The Polaris system also tunes CPU speed to pending workloads, using userspace information.\cite{korkmaz2018workload}
This system requires knowledge of the pending amount of work and deadline target, information that is tied to and derivable from a specific type of workload, viz. databases.

View File

@ -1,32 +1,48 @@
% -*- root: ../main.tex -*-
Different types of workloads:
\begin{itemize}
\item[]{IO bound}
\item[]{CPU bound}
\item[]{Memory bound}
\item[]{UI bound}
\end{itemize}
Our system design partly reflects the nature of our use platform: mobile phones.
Phones are embedded devices with limited resources and specific use cases.
A general-purpose system design, including the general-case Linux CPU policies, is inappropriate, as it often misses behavior patterns that are common and predictable on the phone platform.
Conversely, uses that may be frequent on other systems -- such as running at full system saturation for indefinite periods -- do not occur.
A CPU policy that supports ramping up to saturation indefinitely is thus not appropriate.
\tinysection{UI and IO bound loads}
While phones must run many different types of tasks -- such as social media, videogames, web browsers, and financial apps -- they typically only run one at a time.
That is, only one app is visible and potentially interactive.
This greatly reduces the need to resolve priority and resource conflicts amongst apps.
Our system thus focuses on the needs and demands of this foreground app, and avoids distractions that other information may induce.
\begin{itemize}
\item{These benefit from an energy-favorable midspeed. Most of the time, the phone is blocking (either interactively or screenoff). No need to run the CPU faster or lower.}
\item{This is the general case.}
\end{itemize}
Phone apps, being interactive in nature, spend the bulk of their time blocking on user input, waiting for button presses or screen swipes.
Most of the time, the user is \textit{not} waiting for the phone to become ready for input.
Second, the main product of phones is screendraws, whether scrolling a list or animating a graphic.
Our system takes advantage of this by using CPU settings that save energy while maintaining user experience.
\fixme{add: download experiment (IO)}
\tinysection{UI bound loads}
This is the common case by far.
Most of the time, the main app thread is blocking on user input, whether interactively with the screen on or while dozing with the screen off.
While there are background threads running, they are precomputing work for some future use, particularly pending screendraws.
The CPUs spend the bulk of their time idle -- that is, there is plenty of compute capacity to spare.
Thus, there is typically no need to run CPUs anywhere close to full speed.
Rather, we have found an identifiable midspeed setting to be quite sufficient.
\fixme{add: download / audiostream}
\tinysection{CPU bound loads}
\begin{itemize}
\item[]{These obviously benefit, from a performance vantage, from setting the CPU to 100\%.}
\item[]{These situations are infrequent but readily identifiable.}
\end{itemize}
Even phones do have periods when the user is waiting.
App installs, app coldstarts -- after an installed app gets killed due to memory pressure -- and new browser tabs fit this case.
In such cases, the system should prioritize performance immediately, and there is no reason to run the CPU at any less than 100\% (as the default policy often does).
While these situations are infrequent, they are not rare, so they must not be ignored.
Happily, the bulk of them are also readily identifiable: The system knows when it needs to do a lot of work before it can present a foreground app ready to receive input.
We design our system to use this information and show that it offers better performance than the default case.
\tinysection{Memory bound loads}
\begin{itemize}
\item{These benefit from a slower speed.}
\item{In practice, we have not identified any use cases.}
\end{itemize}
A workload that is memory bound -- say, pointer chasing or sorting over a sparse array -- presents a corner case.
The CPU is necessarily running but stalling on loads and stores.
An even lower speed than the common case, dictated by the new bottleneck of memory access rather than UI screendraws, can offer additional energy savings.
However, we do not identify any real-world workloads in this category and thus exclude it from our analysis.
\fixme{mention I/O bound loads? e.g. DNA processing}
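To make the memory-bound corner case concrete, the sketch below shows a dependent-load (pointer-chasing) kernel of the kind described; the array size and structure are illustrative, not a workload from our evaluation. Every access must wait for the previous one to miss in the cache, so raising the clock mostly increases stall time rather than throughput.
\begin{verbatim}
/* Illustrative memory-bound kernel: every load depends on the previous
 * one, so the core mostly stalls on cache misses.  Sizes and structure
 * are placeholders, not a workload from our evaluation. */
#include <stdlib.h>

#define N (1u << 22)            /* ~4M entries, well beyond the LLC */

static size_t next_idx[N];

size_t chase(size_t steps)
{
        /* Sattolo's algorithm: a single random cycle over all entries,
         * defeating hardware prefetchers. */
        for (size_t i = 0; i < N; i++)
                next_idx[i] = i;
        for (size_t i = N - 1; i > 0; i--) {
                size_t j = (size_t)rand() % i;
                size_t tmp = next_idx[i];
                next_idx[i] = next_idx[j];
                next_idx[j] = tmp;
        }

        size_t p = 0;
        for (size_t s = 0; s < steps; s++)
                p = next_idx[p];        /* serialized, cache-missing loads */
        return p;
}
\end{verbatim}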