edits (primarily eval)
parent
62207ec3ed
commit
7133a48c6f
|
@ -53,13 +53,16 @@
|
|||
|
||||
We now evaluate the \systemname and truncated \schedutil governors by comparing their performance on a range of representative workloads against the default Android \schedutil governor.
|
||||
Concretely, we evaluate the claims that on normal workloads:
|
||||
(i) truncated \schedutil achieves significantly better performance than regular \schedutil without significantly increasing energy consumption, and
|
||||
(ii) \systemname achieves significantly better energy consumption than \schedutil, without significantly increasing screen jank.
|
||||
(i) truncated \schedutil achieves better performance than regular \schedutil without significantly increasing energy consumption, and
|
||||
(ii) \systemname achieves better energy consumption than \schedutil, without significantly increasing screen jank.
|
||||
|
||||
We further conduct several experiments to confirm our observations from \Cref{sec:wasted}, namely that:
|
||||
(iii) the adaptive app pattern is not unique to Facebook,
|
||||
(iv) apps spend significant time below $\fenergy$.
|
||||
|
||||
\todo{FIXME}
|
||||
\fixme{Change clustergraph labels: Kiss 70 => Kiss}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\paragraph{Evaluation platform}
|
||||
|
||||
|
@ -78,31 +81,48 @@ We also used \texttt{ftrace} to log testing parameter and state.
|
|||
Information on screen performance including framedrops came from the Android \texttt{dumpsys gfxinfo} service.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\paragraph{CPU Policies}
|
||||
We evaluate six different CPU policies under different workloads:
|
||||
(i) the system default, \schedutil,
|
||||
(ii) a truncated \schedutil implemented by lower-bounding the CPU speed using the existing API discussed in Subsection~\ref{subsec:signal_perf_needs},
|
||||
(iii) a fixed 70\% speed using the existing \texttt{userspace} governor,
|
||||
(iv) a truncated \schedutil implemented with \systemname,
|
||||
(v) unmodified \systemname, and
|
||||
(vi) the \texttt{performance} governor.
|
||||
We include (ii) and (iii) to compare the truncated \schedutil policy and a general-case $\sim$70\% speed policy, as implemented under the existing API, with their counterparts implemented using \systemname.
|
||||
Under default Linux, a requested CPU speed is implemented as the next-highest speed in a preset series of supported speeds, listed in \texttt{scaling\_available\_frequencies} in \texttt{sysfs}.
|
||||
We follow this behavior with our system.
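The rounding rule just described can be sketched as follows. This is a minimal illustration, not the kernel's or \systemname's implementation: the function name and the frequency table are hypothetical, and we assume that a request matching a supported step exactly is left unchanged and that a request above the top step is clamped to the maximum.

```python
from bisect import bisect_left


def resolve_frequency(requested_khz, available_khz):
    """Round a requested speed up to the nearest supported frequency step."""
    steps = sorted(available_khz)
    if not steps:
        raise ValueError("no supported frequencies")
    # Index of the first supported step that is >= the requested speed.
    idx = bisect_left(steps, requested_khz)
    # Assumption: requests above the top step are clamped to the maximum.
    return steps[min(idx, len(steps) - 1)]


# Hypothetical frequency table in kHz, for illustration only; a real device
# would list its own steps in scaling_available_frequencies.
EXAMPLE_STEPS_KHZ = [300_000, 600_000, 900_000, 1_200_000, 1_500_000]

# A request for 70% of the maximum speed, computed with integer arithmetic.
target_khz = max(EXAMPLE_STEPS_KHZ) * 70 // 100
chosen_khz = resolve_frequency(target_khz, EXAMPLE_STEPS_KHZ)
print(target_khz, chosen_khz)
```

With this example table, a 70\% request falls between two steps and is served at the higher one, so the effective speed of a nominal 70\% policy can sit somewhat above 70\%.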
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\paragraph{Workloads}
|
||||
We consider three separate workloads: (i) Facebook, (ii) YouTube, and (iii) Spotify.
|
||||
We consider four separate workloads, the first three involving individual apps: (i) Facebook, (ii) YouTube, and (iii) Spotify.
|
||||
The fourth (iv) workload combines the Facebook and Spotify loads.
|
||||
These were designed to mimic common user phone interactions.
|
||||
The \textbf{Facebook} workload was described in \Cref{sec:low-speed-in-practice}.
|
||||
The \textbf{YouTube} workload starts the app and searches for a popular video by name.
|
||||
The workload selects the first hit, starts the video, and waits for 30 seconds.
|
||||
The specific video was selected because it is served randomly chosen full-motion video ads at the start at a predictably high rate.
|
||||
The \textbf{Spotify} workload...
|
||||
\todo{Fill in details}.
|
||||
The \textbf{Spotify} workload starts the app and searches for a common musical selection.
|
||||
It starts the first suggestion and waits for 30 seconds while the audio plays with the app in the foreground.
|
||||
Lastly, the \textbf{Combined} workload examines the system under additional stress.
|
||||
It runs the original Facebook workload in the foreground while the Spotify app streams audio continuously in the background.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Screen Jank}
|
||||
|
||||
%\begin{figure}
|
||||
%\centering
|
||||
%\includegraphics[width=.95\linewidth]{figures/graph_jank_perspeed_yt.pdf}
|
||||
%\bfcaption{Display framedrop proportion for a :30 Youtube interaction under different CPU policies (10 runs, 90\% confidence)}
|
||||
%\label{fig:screendrops_per_freq_yt}
|
||||
%\end{figure}
|
||||
\Cref{fig:jank_allapps} shows frame drop rates for the four workloads.
|
||||
These graphs address the performance aspect of claims (i) and (ii).
|
||||
On all workloads, \systemname and truncated \schedutil offer nearly identical or notably better performance than regular \schedutil.
|
||||
The Facebook load under \systemname costs an additional 0.3\%, or $\sim$0.2 frames per second at 60fps.
|
||||
We argue this does not noticeably affect user experience and is more than acceptable given the greater than 10\% energy savings.
|
||||
The truncated \schedutil policies and the fixed-speed 70\% policy similarly offer significant energy savings at little to no performance cost.
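As a quick check of the Facebook figure above, assuming a steady 60fps target, an additional 0.3\% frame drop rate corresponds to
\[
  0.003 \times 60\,\mathrm{fps} = 0.18\,\mathrm{fps} \approx 0.2\,\mathrm{fps},
\]
that is, roughly one additional dropped frame every five seconds.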
|
||||
|
||||
\Cref{fig:jank_allapps} shows frame drop rates for the three workloads.
|
||||
\todo{discuss}
|
||||
|
||||
These graphs confirm the performance aspect of claims (i) and (ii).
|
||||
On all workloads, \systemname and truncated \schedutil both outperform regular \schedutil.
|
||||
\systemname has a 10--25\% lower frame drop rate, varying by workload.
|
||||
YouTube shows a clear performance win for \systemname compared to the default.
|
||||
The truncated \schedutil policies and the fixed-speed 70\% policy also offer improved performance relative to the default.
|
||||
Performance under \systemname for both the Spotify and Combined workloads, like that for Facebook, costs 0.3\% in frame rate compared to the default, a cost we again argue is minimal and acceptable.
|
||||
The other non-default policies for both Spotify and Combined also offer either essentially the same or even somewhat better performance than the default.
|
||||
In particular, the increased background load of Combined does not change the frame drop rate appreciably.
|
||||
In summary, \systemname, with a considerably simpler policy mechanism, offers essentially the same performance as \schedutil, measured in user-visible frame drops, on common app workloads.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Ramp-Up Times}
|
||||
|
@ -134,6 +154,10 @@ The leftmost part of the graph, with the smallest circles (representing a normal
|
|||
Up to a sustained load of about 70\% across \emph{all} CPU cores, the system is able to keep up with screen redraw events, with a significant effect on jank only at the lowest 2 CPU frequencies.
|
||||
In actual usage, a user would likely never encounter this level of background usage; it takes significant, and unrealistic, additional workload to degrade the user experience.
|
||||
|
||||
A more representative high-load case is our fourth, Combined workload: browsing through Facebook while listening to Spotify music in the background.
|
||||
As discussed above, \Cref{fig:jank_allapps} shows that the cost of the additional background load is quite small in terms of frame drops; two of the non-default policies offer improvements.
|
||||
In common settings, background load does not pose a threat to the performance of \systemname.
|
||||
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\subsection{Energy Usage}
|
||||
|
|
|
@ -22,7 +22,6 @@ Launch screen on; idle & 130 \\
|
|||
\end{figure}
|
||||
|
||||
\todo{FIXME}
|
||||
\fixme{Ensure paper builds *without* draft mode for hyperlinks}
|
||||
|
||||
CPUs consume considerable energy on mobile phones.
|
||||
As Table \ref{fig:item_energy_cost} shows, a single (big) CPU core on a Pixel 2, running at full speed with the screen off, consumes almost three times the energy of the display, and a second core running at full speed almost doubles that.
|
||||
|
|
|
@ -129,6 +129,7 @@ However, even at the CPU's maximum frequency, more work is created than the CPU
|
|||
\end{figure}
|
||||
|
||||
\Cref{fig:u_micro_fb} shows power consumption for the Facebook workload, padded with idle time to a fixed 40s period.
|
||||
\todo{FIXME}
|
||||
Operating the CPU at maximum frequency imposes an energy overhead of approximately $1$\,mAh compared to operating at $\fenergy \approx 70\%$ of its maximum.
|
||||
This represents about $\frac{1}{2700}$ of the typical Pixel 2's maximum battery capacity.
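For scale, taking the Pixel 2's nominal battery capacity to be 2700\,mAh, this overhead amounts to
\[
  \frac{1\,\mathrm{mAh}}{2700\,\mathrm{mAh}} \approx 3.7 \times 10^{-4} \approx 0.037\%
\]
of a full charge per 40s interaction.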
|
||||
|
||||
|
@ -139,12 +140,13 @@ If added performance is desirable in this use case and others like it, then the
|
|||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
|
||||
\subsection{Signaling Performance Needs}
|
||||
\label{subsec:signal_perf_needs}
|
||||
|
||||
The more interesting systems design question is how to select CPU speeds in the presence of adaptive applications, when the additional energy investment does not provide value.
|
||||
Specifically, adaptive apps (while in use, e.g., scrolling through a list) create a functionally infinite source of work.
|
||||
The CPU usage profiles presented by an adaptive app and a user legitimately waiting on a CPU-bound task (e.g., cold-start) are identical, rendering them indistinguishable to \schedutil.
|
||||
|
||||
Fortunately, the Linux maintainers have already recognized the need for better user-space signalling of performance needs.
|
||||
Fortunately, the Linux maintainers have already recognized the need for better user-space signaling of performance needs.
|
||||
In 2015, the Linux kernel added a virtual filesystem, mounted at \texttt{/dev/stune/}, that provides virtual file hooks:
|
||||
\begin{itemize}
|
||||
\item[]{\texttt{schedtune.boost}}
|
||||
|
|