% -*- root: ../main.tex -*-
\begin{figure}
\centering
\includegraphics[width=.90\linewidth]{figures/graph_freqtime_micro.pdf}
\bfcaption{Intermittent workloads hurt runtime in two ways: directly, by sleeping, and indirectly, by inducing slower CPU speeds}
\label{fig:speed_time_delay}
\end{figure}
%\subsection{The cost and problems of complex speed micromanagement}
%\label{complexity_cost}
CPU speed on phones is controlled by a set of fairly intricate subsystems: the scheduler, the idle policy, the drivers, and the governor itself.
Under common circumstances, they adjust the speed constantly.
Despite this complexity, the system often makes bad choices and picks speeds that hurt both energy and performance.
This is because past CPU utilization, the bedrock metric of all dynamic governors, has little to do with the ideal present CPU speed.
\fixme{because: idle thread, slow rx ...}
\tinysection{Dynamic governors can hurt responsiveness}

% N.b. scheduling classes: stop-dl-rt-cfs-idle. DVFS only applies to cfs tasks (dl / rt tasks run at 100)
% N.b. schedutil, unlike others, estimates load per-task rather than per-core. So handles task migration better.
% N.b. But schedutil still calculates based on "_recent_ load"

% schedutil and greedy: zhou, p.3

The default governor policy, \schedutil, hurts responsiveness.
The \schedutil policy sets the CPU speed based on a rolling window of recent runqueue utilization.
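For reference, the mainline Linux documentation describes the \schedutil frequency choice as roughly proportional to utilization with 25\% headroom; vendor kernels on phones may differ, and the symbols $f_{\mathrm{next}}$, $f_{\mathrm{max}}$, and $u$ are our notation:
\[
  f_{\mathrm{next}} = \min\left(f_{\mathrm{max}},\; 1.25 \cdot u \cdot f_{\mathrm{max}}\right), \qquad u \in [0,1],
\]
where $u$ is the recent utilization as a fraction of capacity.
Under this rule, any lag in $u$ translates directly into a lag in the selected speed.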
On a phone, workloads typically do not saturate the CPUs but vary constantly in demand.
With history-driven dynamic policies such as \schedutil, this triggers constantly changing speeds~\cite{nuessle2019benchmarking}.
Figure \ref{fig:missed_opportunities} shows that ramp-up time alone already hurts performance.
Previous studies have additionally noted that intermittent workloads make this problem significantly worse~\cite{nuessle2019benchmarking}.
Figure \ref{fig:speed_time_delay} illustrates this: we ran the same fixed workload with and without intermittent 5ms sleeps.
With no sleep intervals (the top graph), the workload takes $\sim$7.1s to complete.
Adding 1000 5ms sleeps (the bottom graph) induces the governor to keep the speed much lower, hovering around 40\% of maximum throughout the run.
Of the additional 18.2s of runtime, 5s stems from sleeping itself, and $\sim$13.1s from running at a slower CPU speed.
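As a rough accounting, treating the two costs as additive and writing $T_{\mathrm{sleep}}$ and $T_{\mathrm{slow}}$ (our shorthand) for the time spent sleeping and the time lost to reduced speed:
\[
  T_{\mathrm{sleep}} = 1000 \times 5\,\mathrm{ms} = 5\,\mathrm{s},
  \qquad
  T_{\mathrm{sleep}} + T_{\mathrm{slow}} \approx 5\,\mathrm{s} + 13.1\,\mathrm{s} \approx 18.2\,\mathrm{s}
  \quad \mbox{(up to rounding)}.
\]
If we assume that the uninterrupted run executes near maximum speed and that runtime scales inversely with speed (neither is established by the figure alone), the busy time grows from $\sim$7.1s to $\sim$20.2s, an average effective speed of about $7.1/20.2 \approx 35\%$ of maximum, broadly in line with the $\sim$40\% seen in the bottom graph.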
We will show that real-world apps, when running the default policy, similarly spend significant time at unnecessarily low speeds.