diff --git a/sections/evaluation.tex b/sections/evaluation.tex
new file mode 100644
index 0000000..2d70713
--- /dev/null
+++ b/sections/evaluation.tex
@@ -0,0 +1,115 @@
+% -*- root: ../main.tex -*-
+
+\begin{figure}
+\centering
+\includegraphics[width=.45\textwidth]{figures/graph_u.png} %test123.pdf}
+\bfcaption{Energy consumed for a fixed amount of compute at different CPU speeds}
+\label{fig:u_micro}
+\end{figure}
+
+\begin{figure}
+\centering
+\includegraphics[width=.45\textwidth]{figures/graph_u_fb.png}
+\bfcaption{Energy consumed for a fixed set of interactions at different CPU speeds}
+\label{fig:u_micro_fb}
+\end{figure}
+
+\begin{figure}
+\centering
+\includegraphics[width=.45\textwidth]{figures/graph_drops.png}
+\bfcaption{Screen drops for a given interactive workload under different CPU policies}
+\label{fig:drops}
+\end{figure}
+
+\begin{figure}
+\centering
+\includegraphics[width=.45\textwidth]{figures/graph_idlejank.png}
+\bfcaption{The relation between available CPU resources and user experience, for given CPU policies}
+\label{fig:idlejank}
+\end{figure}
+
+We evaluate \systemname by measuring the performance of illustrative and representative workloads on our system.
+We compare these results against those obtained using default system settings, as well as with other CPU speed settings.
+
+\tinysection{Evaluation platform}
+
+Our results were obtained using the Google Pixel 2 running Android \fixme{WHAT} with \fixme{specify} CPU and RAM.
+One of the phones was modified to obtain energy measurements using the Monsoon HVPM power meter~\cite{monsoon}.
+We modified the Linux \fixme{VERSION} kernel with \fixme{governor implementation synopsis} to act on a hint about pending system usage, passed via a syscall from userspace.
+The kernel can then use this information at its discretion to set an appropriate CPU speed.
+
+Our evaluation system consists of parts to \fixme{Discuss evaluation setup -- scripts, UIAutomator, ftrace etc.}
+
+\tinysection{There is an energy-optimal speed}
+
+Previous work~\cite{nuessle2019benchmarking, what} has suggested that, for a given workload, there is an energy-optimal speed.
+This speed falls somewhere between the CPU's minimum and maximum settings.
+Figure~\ref{fig:u_micro} shows the energy consumed by a fixed amount of compute (\todo{discuss setup}) under different CPU policies.
+In particular, the system default policy consumes notably more energy than a mid-speed setting.
+Etc. etc.
+
+Next, we studied real-world apps under different CPU policies.
+The key question is whether our previous observation -- that there is an energy-optimal speed -- still holds.
+We ran scripts to simulate typical user interactions on the Facebook app under different CPU policies: the system default, various fixed speeds, and
+\systemname.
+
+Figure~\ref{fig:u_micro_fb} shows the results.
+As before, a mid-speed CPU policy consumes less energy than the system default policy.
+Additionally, \systemname offers better energy performance.\todo{SHOW THIS}
+
+\tinysection{The cost of optimal speeds in interactive apps}
+
+While a simpler fixed-speed policy yields optimal energy, this potentially comes at a cost.
+The output of phone apps is largely visual.
+Previous studies that constrained the system resources available to interactive apps have evaluated the cost in terms of screen display metrics, in particular what Android terms screen jank -- the proportion of dropped frames~\cite{THIS, THAT}.
+
+As apps are closed source, we are unable to control the exact amount of compute.
+However, apps spend the vast bulk of their time waiting for user input~\cite{ANY??}.
+While there are background tasks going on, they typically come nowhere near saturating CPU resources.\todo{SHOW: per-core CPU idle\% graph}
+
+Hence, adjusting CPU speed within reason does not appreciably affect user experience.
+Figure~\ref{fig:drops} shows the cost, in energy and in screen drops, of each CPU policy.
+Policies \fixme{WHAT} show \fixme{WHAT} savings in energy over the system default.
+For these, even fixed speeds above 40\% produce drop rates (jank) within 50\% of the system default.
+In practice, this averages to approximately 1 extra dropped frame per second, a figure we argue is acceptable.\todo{CITE OTHER STUDIES}
+The proportion of drops is typically lower for a fixed speed over 70\% than for the system default policy.
+\systemname likewise shows equivalent user experience -- while saving \fixme{WHAT} on energy.\todo{SHOW: with 1 our governor and 2 energy metrics}
+
+The key observation is that, at the CPU usage level imposed by apps, there are still plenty of unused resources to ensure a quality user experience with \systemname.
+The question then arises -- at what level of resource constraint does experience begin to suffer?
+To answer this, Figure~\ref{fig:idlejank} shows the \facebook experiment with additional background tasks that consume CPU cycles.
+Results show the CPU idle time and screen jank rate for each of the CPU policies and for each of several levels of background work.
+The background tasks are do-nothing loads, run on each of the 8 CPU cores throughout the duration of the interactive experiment.
+
+We can achieve acceptable cost -- a jank rate below the blue bar on the graph -- with CPU speeds above 40\% \fixme{verify} and background loads with sleep intervals of 20\,ms and above.
+\systemname falls in this category.
+\todo{SHOW: with download app in background}
+
+To characterize the amount of background work this represents, Table~\fixme{SHOW: DO THIS} shows the proportion of CPU usage these loads consume when run by themselves.\todo{Good idea? I think this helps}
+In actual usage, a user would likely never encounter this level of background usage.
+Downloading a large file consumes approximately 50\% of a single core -- far below the microloads we imposed.\todo{SHOW: with download background task}
+
+\tinysection{When energy-optimal is not optimal}
+
+An energy-optimal policy is not always best.
+In particular, when the user is waiting and computation is the major bottleneck, the system should in most cases prioritize latency.
+Figure~\fixme{WHAT} shows the latencies of two common such situations, app installation and cold start, for different CPU policies.
+\systemname identifies these situations and outperforms the default in both latency and energy.\todo{SHOW this}
+
+
+\fixme{GRAPHS TODO:}
+\begin{itemize}
+\item[1]{Figure~\ref{fig:u_micro_fb} with \systemname}
+\item[2]{NEW: Per-core stacked idle\% graph}
+\item[3]{Figure~\ref{fig:drops} with \systemname policy and with energy numbers for all policies}
+\item[4]{Figure~\ref{fig:idlejank} with \systemname policy and with download background load}
+\item[5]{NEW: latency-energy graph for CPU-bound tasks (cold start, install)}
+\end{itemize}
+
+%CYCLE COUNT: Show that the work done is, approximately, the same
+
+
+
+
+
+