added eval section
This commit is contained in:
parent
120b4c619d
commit
755e1e7c01
115
sections/evaluation.tex
% -*- root: ../main.tex -*-

\begin{figure}
\centering
\includegraphics[width=.45\textwidth]{figures/graph_u.png} %test123.pdf}
\bfcaption{Energy consumed by a fixed amount of compute at different CPU speeds}
\label{fig:u_micro}
\end{figure}

\begin{figure}
\centering
\includegraphics[width=.45\textwidth]{figures/graph_u_fb.png}
\bfcaption{Energy consumed by a fixed set of interactions at different CPU speeds}
\label{fig:u_micro_fb}
\end{figure}

\begin{figure}
\centering
\includegraphics[width=.45\textwidth]{figures/graph_drops.png}
\bfcaption{Screen drops for a given interactive workload, run with different CPU policies}
\label{fig:drops}
\end{figure}

\begin{figure}
\centering
\includegraphics[width=.45\textwidth]{figures/graph_idlejank.png}
\bfcaption{The relationship between available CPU resources and user experience, for given CPU policies}
\label{fig:idlejank}
\end{figure}

We evaluate \systemname by comparing the performance of illustrative and representative workloads on our system.
We compare against results obtained using default system settings, as well as with other CPU speed settings.

\tinysection{Evaluation platform}

Our results were obtained on Google Pixel 2 phones running Android \fixme{WHAT} with \fixme{specify} CPU and RAM.
One of the phones was modified to obtain energy measurements using the Monsoon HVPM power meter~\cite{monsoon}.
We modified the Linux \fixme{VERSION} kernel with \fixme{governor implementation synopsis} so that it can act on a hint about pending system usage, passed from userspace via a syscall.
The kernel can then use this information at its discretion to set an appropriate CPU speed.

Our evaluation system consists of parts to \fixme{Discuss evaluation setup -- scripts, UIAutomator, ftrace etc.}

\tinysection{There is an energy-optimal speed}

Previous work~\cite{nuessle2019benchmarking, what} has suggested that, for a given workload, there is an energy-optimal speed.
This speed falls somewhere between the CPU's minimum and maximum settings.
Figure~\ref{fig:u_micro} shows the energy consumed by a fixed amount of compute (\todo{discuss setup}) under different CPU policies.
In particular, the system default policy consumes notably more energy than a mid-speed setting.
\fixme{Etc.\ etc.}

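The existence of an interior optimum is consistent with a simple first-order model, which we sketch here as an illustrative assumption rather than a fit to our measurements.
Suppose a workload requires $W$ cycles, and that power at speed $f$ is a speed-independent term $P_s$ plus a dynamic term $c f^3$ (the symbols $W$, $P_s$ and $c$, and the cubic form, are assumptions of this sketch).
The energy needed to complete the workload is then
\[
E(f) = \left(P_s + c f^3\right) \frac{W}{f} = W \left( \frac{P_s}{f} + c f^2 \right),
\]
and setting $dE/df = 0$ gives
\[
f^\ast = \left( \frac{P_s}{2c} \right)^{1/3}.
\]
Below $f^\ast$ the static term dominates because the workload takes longer to finish; above it the dynamic term dominates.
Whether $f^\ast$ lies strictly between the minimum and maximum settings depends on the device.
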
Next, we study real-world apps under different CPU policies.
The key question is whether our previous observation, that there is an energy-optimal speed, still holds.
We run scripts that simulate typical user interactions on the Facebook app under each CPU policy: the system default, various fixed speeds, and \systemname.

Figure~\ref{fig:u_micro_fb} shows the results.
As before, a mid-speed CPU policy consumes less energy than the system default policy.
\systemname also offers better energy performance.\todo{SHOW THIS}

\tinysection{The cost of optimal speeds in interactive apps}

While a simpler fixed-speed policy yields optimal energy, this potentially comes at a cost.
The output of phone apps is largely a visual display.
Previous studies that constrained the system resources available to interactive apps have evaluated the cost in terms of screen display metrics, in particular what Android terms screen jank: the proportion of dropped frames~\cite{THIS, THAT}.

As apps are closed source, we are unable to control the exact amount of compute they perform.
However, apps spend the vast bulk of their time waiting for user input~\cite{ANY??}.
While background tasks do run, they typically come nowhere near saturating CPU resources.\todo{SHOW: per-core CPU idle\% graph}

Hence, adjusting CPU speed within reason does not appreciably affect user experience.
Figure~\ref{fig:drops} shows the cost, in energy and in screen drops, of each CPU policy.
Policies \fixme{WHAT} show \fixme{WHAT} savings in energy over the system default.
For these, even fixed speeds above 40\% produce drop rates (jank) within 50\% of the system default.
In practice, this averages to approximately 1 extra dropped frame per second, a figure we argue is acceptable.\todo{CITE OTHER STUDIES}
For fixed speeds over 70\%, the proportion of drops is typically lower than under the system default policy.
\systemname likewise shows equivalent user experience while saving \fixme{WHAT} on energy.\todo{SHOW: with 1 our governor and 2 energy metrics}

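As a rough sense of scale for one extra dropped frame per second, assume a display refresh rate of 60 frames per second (an assumption of this estimate, to be confirmed for the evaluation device).
The added jank is then approximately
\[
\Delta j \approx \frac{1\ \mathrm{frame/s}}{60\ \mathrm{frames/s}} \approx 1.7\%
\]
of all frames.
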
The key observation is that, at the CPU usage level imposed by apps, there are still plenty of unused resources to ensure a quality user experience with \systemname.
The question then arises: at what level of resource constraint does the experience begin to suffer?
To answer this, Figure~\ref{fig:idlejank} shows the \facebook experiment with additional background tasks that consume CPU cycles.
The results show the CPU idle time and screen jank rate for each of the CPU policies and for each of several levels of background work.
The background tasks are do-nothing loads, run on each of the 8 CPU cores for the duration of the interactive experiment.

We can achieve acceptable cost, that is, a jank rate below the blue bar on the graph, with CPU speeds above 40\% \fixme{verify} and background loads with sleep intervals of 20\,ms and above.
\systemname falls in this category.
\todo{SHOW: with download app in background}

To characterize the amount of background work this represents, Table~\fixme{SHOW: DO THIS} shows the proportion of CPU usage these loads consume when run by themselves.\todo{Good idea? I think this helps}
In actual usage, a user would likely never encounter this level of background usage.
Downloading a large file consumes approximately 50\% of a single core, far below the microloads we imposed.\todo{SHOW: with download background task}

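For intuition about how sleep intervals relate to load, suppose each do-nothing load alternates a busy phase of length $t_b$ with a sleep of length $t_s$ (both symbols are introduced here for illustration; the busy length is not specified above).
Ignoring scheduling overhead, the fraction of one core that such a load occupies is roughly
\[
u \approx \frac{t_b}{t_b + t_s},
\]
so longer sleep intervals correspond to lighter background load.
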
\tinysection{When energy-optimal is not optimal}

An energy-optimal policy is not always best.
In particular, when the user is waiting and computation is the main bottleneck, the system should in most cases prioritize latency.
Figure~\fixme{WHAT} shows the latencies of two such common situations, app installation and cold start, under different CPU policies.
\systemname identifies these situations and outperforms the default in both latency and energy.\todo{SHOW this}

\fixme{GRAPHS TODO:}
\begin{itemize}
\item[1]{Figure~\ref{fig:u_micro_fb} with \systemname}
\item[2]{NEW: Per-core stacked idle\% graph}
\item[3]{Figure~\ref{fig:drops} with \systemname policy and with energy numbers for all policies}
\item[4]{Figure~\ref{fig:idlejank} with \systemname policy and with download background load}
\item[5]{NEW: latency-energy graph for CPU-bound tasks (cold start, install)}
\end{itemize}

%CYCLE COUNT: Show that the work done is, approximately, the same