% -*- root: ../main.tex -*-
%!TEX root=../main.tex
\begin{figure*}
\centering
\includegraphics[width=.90\linewidth]{figures/graph_jank_allapps.pdf}
\bfcaption{Display framedrop for apps under different CPU policies (10 runs, 90\% confidence)}
\label{fig:jank_allapps}
\end{figure*}
\begin{figure*}
\centering
\includegraphics[width=.90\linewidth]{figures/graph_energy_allapps.pdf}
\bfcaption{Energy usage for apps under different CPU policies (10 runs, 90\% confidence)}
\label{fig:energy_allapps}
\end{figure*}
\begin{figure*}
\centering
\includegraphics[width=.90\linewidth]{figures/graph_time_per_freq_yt.pdf}
\bfcaption{Average time spent per CPU under the default policy for YouTube (average of 10 runs, 90\% confidence)}
\label{fig:time_per_freq_yt}
\end{figure*}
\begin{figure*}
\centering
\includegraphics[width=.90\linewidth]{figures/graph_time_per_freq_spot.pdf}
\bfcaption{Average time spent per CPU under the default policy for Spotify (average of 10 runs, 90\% confidence)}
\label{fig:time_per_freq_spot}
\end{figure*}
\begin{figure*}
\centering
\includegraphics[width=.90\linewidth]{figures/graph_nonidletime_yt.pdf}
\bfcaption{CPU non-idle time for YouTube under different CPU policies (10 runs, 90\% confidence)}
\label{fig:nonidle_yt}
\end{figure*}
\begin{figure*}
\centering
\includegraphics[width=.90\linewidth]{figures/graph_nonidletime_spot.pdf}
\bfcaption{CPU non-idle time for Spotify under different CPU policies (10 runs, 90\% confidence)}
\label{fig:nonidle_spot}
\end{figure*}
\begin{figure*}
\centering
\includegraphics[width=.90\linewidth]{figures/graph_idlejank_heavyload.pdf}
\bfcaption{The effect of additional background loads on user experience for given CPU policies}
\label{fig:idlejank}
\end{figure*}
We now evaluate the \systemname and truncated \schedutil governors by comparing their performance on a range of representative workloads against the default Android \schedutil governor.
Concretely, we evaluate the claims that, on normal workloads:
(i) truncated \schedutil achieves significantly better performance than regular \schedutil without significantly increasing energy consumption, and
(ii) \systemname achieves significantly lower energy consumption than \schedutil without significantly increasing screen jank.

We further conduct several experiments to confirm our observations from \Cref{sec:wasted}, namely that:
(iii) the adaptive app pattern is not unique to Facebook, and
(iv) apps spend significant time below $\fenergy$.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Evaluation platform}

Our results were obtained using Google Pixel 2 devices with the Snapdragon 835 chipset~\cite{snapdragon-835}, 4 GB of RAM, and 128 GB of SSD storage, running Android AOSP 10.
Standalone microbenchmarks were implemented in C, while end-to-end macrobenchmarks used the Android UI Automator testing framework to script simulated interactions with real-world apps~\cite{uiautomator}.
One of the phones was modified to obtain energy measurements using the Monsoon HVPM power meter~\cite{monsoon}.
Our evaluation system consists of a pair of shell scripts, one running on the phone and one on an external monitor.

The external script sleeps for 10\,s to ensure quiescence and prevent inter-trial artifacts, then initializes both the Monsoon meter and the on-phone script.
The on-phone script sleeps for 20\,s to ensure that the Monsoon meter is capturing data, sets the desired governor policy, and starts the experiment.
When the experiment concludes, the on-phone script sleeps for a further 10\,s to ensure that the Monsoon meter captures the full trace, and then notifies the external script.
The external script finishes by retrieving relevant artifacts from the phone; this data transfer is excluded from all energy and performance measurements.

We collected information on CPU speed and idle state from both the Linux \texttt{ftrace} framework and \texttt{sysfs}, and on CPU cycles from the \texttt{perf\_event\_open} syscall~\cite{perf-event}.
We also used \texttt{ftrace} to log testing parameters and state.
Information on screen performance, including frame drops, came from the Android \texttt{dumpsys gfxinfo} service.
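For concreteness, \texttt{dumpsys gfxinfo} reports, among other statistics, a count of total frames rendered and a count of janky frames.
One plausible reading of the frame drop rate, assuming it is derived from these two aggregate counters and treating each janky frame as a dropped frame (the symbols $r_{\mathrm{drop}}$, $N_{\mathrm{janky}}$, and $N_{\mathrm{total}}$ are illustrative notation only), is
\[
  r_{\mathrm{drop}} \approx \frac{N_{\mathrm{janky}}}{N_{\mathrm{total}}}.
\]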
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\paragraph{Workloads}
We consider three separate workloads: (i) Facebook, (ii) YouTube, and (iii) Spotify.
The \textbf{Facebook} workload was described in \Cref{sec:low-speed-in-practice}.
The \textbf{YouTube} workload starts the app and searches for a popular video by name.
It then selects the first hit, starts the video, and waits for 30 seconds.
The specific video was chosen because it is served random motion video ads at the start at a predictably high rate.
The \textbf{Spotify} workload...
\todo{Fill in details}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Screen Jank}

%\begin{figure}
%\centering
%\includegraphics[width=.95\linewidth]{figures/graph_jank_perspeed_yt.pdf}
%\bfcaption{Display framedrop proportion for a :30 Youtube interaction under different CPU policies (10 runs, 90\% confidence)}
%\label{fig:screendrops_per_freq_yt}
%\end{figure}

\Cref{fig:jank_allapps} shows frame drop rates for the three workloads.
\todo{discuss}

These results confirm the performance aspect of claims (i) and (ii).
On all workloads, \systemname and truncated \schedutil both outperform regular \schedutil.
\systemname has a 10--25\% lower frame drop rate, varying by workload.
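The error bars in \Cref{fig:jank_allapps} are reported at 90\% confidence over 10 runs.
As a guide to reading them, one standard construction, assumed here purely for illustration, is a two-sided Student-$t$ interval on the mean.
For a metric with sample mean $\bar{x}$ and sample standard deviation $s$ over $n = 10$ runs (all three symbols are illustrative notation), this would give
\[
  \bar{x} \pm t_{0.95,\,n-1}\,\frac{s}{\sqrt{n}}, \qquad t_{0.95,\,9} \approx 1.833.
\]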
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Ramp-Up Times}

To attribute the improvement in performance, we measure the CPU frequencies selected by \schedutil and \systemname, respectively.
\Cref{fig:time_per_freq_fb,fig:time_per_freq_yt,fig:time_per_freq_spot} plot a CDF of the difference between these two selections.
We note that for a significant fraction of the workload (5\% for Facebook, 15\% for YouTube), the frequency selected by \schedutil is significantly (up to 50\%) lower.
This is \schedutil's ramp-up period, where it selects frequencies lower than $\fenergy$.
We attribute the improved performance of both governors to eliminating this ramp-up period, during which \schedutil selects speeds below $\fenergy$.
Although each workload spends part of its time at a higher frequency under \schedutil than under \systemname, it spends more time ramping up to $\fenergy$ than at a higher speed.
In summary, the improved performance of both truncated \schedutil and \systemname can be attributed to avoiding \schedutil's ramp-up period.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Jank Under High-Load}

We next explore the level of additional load required to degrade the user experience.
For this experiment, we run the Facebook workload in the presence of background tasks.
These tasks generate additional load by performing simple arithmetic, with sleeps injected periodically at varying intervals.
%We collect non-idle time through sysfs and framedrop rate through Android GFX as before.
%We pin one load-producing task to each of the 8 CPU cores.
\Cref{fig:idlejank} illustrates the effect of the added CPU load on the measured jank.
The x-axis shows the average load across all 8 CPU cores (based on the injected sleeps), and the y-axis shows the frame drop rate.
Note that a smaller sleep interval equates to a higher load.

The leftmost part of the graph, with the smallest circles (representing a normal interaction with no additional background load), shows that a fixed speed of 70\% or greater produces a measured frame drop rate that is essentially identical to that of the system default.
Up to a sustained load of about 70\% across \emph{all} CPU cores, the system is able to keep up with screen redraw events, with a significant effect on jank only at the two lowest CPU frequencies.
In actual usage, a user would likely never encounter this level of background load; it takes a significant, and unrealistic, additional workload to degrade the user experience.
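The relationship between sleep interval and load can be sketched with a simple duty-cycle model.
Assuming each background task alternates a burst of arithmetic lasting $t_{\mathrm{busy}}$ with a sleep of duration $t_{\mathrm{sleep}}$ (both symbols are illustrative, and wakeup latency is ignored), the load $\ell$ it offers to its core is approximately
\[
  \ell \approx \frac{t_{\mathrm{busy}}}{t_{\mathrm{busy}} + t_{\mathrm{sleep}}},
\]
so that shrinking $t_{\mathrm{sleep}}$ toward zero drives $\ell$ toward 1, while a sleep as long as the burst gives $\ell \approx 0.5$.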
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Energy Usage}

\Cref{fig:energy_allapps} shows energy usage for the three workloads.
\todo{discuss}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Idle Time}

We next revisit our finding from \Cref{sec:adaptiveApps} that typical apps increase their offered load as CPU capacity increases.
\Cref{fig:nonidle_fb,fig:nonidle_yt,fig:nonidle_spot} illustrate the fraction of time the CPU spends doing work in each workload as CPU frequency increases.
Recall that, if the amount of work stayed constant in a fixed-duration workload, the non-idle time would be inversely proportional to the CPU frequency.
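To make this baseline explicit, suppose (as illustrative notation) that a workload of fixed duration $T$ requires a constant $W$ CPU cycles.
At frequency $f$, with $W \le fT$, the non-idle fraction $u(f)$ would then be
\[
  u(f) = \frac{W}{f\,T},
\]
so that doubling $f$ halves $u(f)$; an app that increases its offered load with CPU capacity instead yields a curve flatter than $1/f$.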
As with Facebook, the YouTube workload shows a much flatter relationship, particularly on the big cores.
\todo{Discuss Spotify}