temporary push

main
Oliver Kennedy 2019-02-20 14:42:47 -05:00
parent cc9c8c0305
commit c87fe946d7
7 changed files with 169 additions and 75 deletions


\section{Infrastructure Description}
\input{sections/infrastructure_description.tex}
\subsection{Smartphone Deployment Testbed}
\input{sections/description_testbed.tex}
\subsection{Reproducible Evaluation of Mobile Data Management Systems}
\label{sec:eval}
\input{sections/description_runner.tex}
\subsection{Benchmarking Suite}
\input{sections/description_benchmark.tex}
%\subsection{\PocketData Workshop and Outreach Efforts}
%\input{sections/workshop.tex}


%!TEX root = ../proposal.tex
\begin{figure*}
\centering
\begin{subfigure}{.5\textwidth}
\centering
\vspace{-0.3cm}
\includegraphics[width=.8\textwidth]{graphics/new_elbow_c.pdf}
\vspace{-0.3cm}
\bfcaption{\small YCSB Benchmark C on a Nexus 6.}
\label{fig:elbow}
\trimfigurespacing
\end{subfigure}%
\begin{subfigure}{.49\textwidth}
\centering
\vspace*{-5mm}
\includegraphics[width=.8\textwidth,trim={5mm 0 10mm 15mm},clip]{graphics/example_workload_trace.pdf}
\vspace{-0.3cm}
\bfcaption{\small 400 second trace of query load for one user.}
\label{fig:loadtrace}
\trimfigurespacing
\vspace*{-4mm}
\end{subfigure}
\bfcaption{Measuring solely throughput on mobile systems produces misleading results.}
\trimfigurespacing
\end{figure*}
Numerous benchmarks presently exist for databases~\cite{Curino:2012:BOD:2390021.2390025,Ahmed:2010:MPA:1878537.1878641,Malkowski:2010:EAD:1774088.1774449,Erling:2015:LSN:2723372.2742786,Frank:2012:EUD:2188286.2188315}, distributed databases~\cite{Kuhlenkamp:2014:BSE:2732977.2732995,Baumgartel:2013:BMS:2939301.2939314}, and key value stores~\cite{ycsb,Atikoglu:2012:WAL:2318857.2254766, Tomas:2017:FRB:3064889.3064897}.
However, such benchmarks invariably target server-class database systems.
This makes sense for server-class databases, which typically serve many clients and where throughput is the dominant concern.
However, typical mobile data management happens at much lower rates~\cite{DBLP:conf/tpctc/KennedyACZ15}, and latency and power consumption are far more important.
Hence, using existing benchmarks \emph{directly} on mobile databases can produce misleading results and obscure relevant details.
\tinysection{Mobile Phones (and Android) Are Different} \label{android_different}
Mobile phones are very different systems, by design, from server-class database hardware. They are embedded and resource constrained on several fronts: processing, memory, and energy. They are also interactive multi-tenant devices.
This governor was also the Android default through the Nexus 5.
Recent Android phones, starting with the Nexus 6, moved to the Interactive governor, which, as its name suggests, responds better to user threads by ramping CPU speed up more quickly when needed.
Both of these defaults, roughly, base their CPU frequency choice on how busy the CPU has been recently.
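Roughly, this heuristic can be sketched as a toy model (illustrative only: real governors such as ondemand and Interactive use kernel tunables, sampling windows, and per-core policies not modeled here, and the frequency steps below are invented):

```python
# Toy model of a load-based CPU frequency governor (ondemand-style).
# All thresholds and frequency steps are illustrative, not real kernel values.

FREQ_STEPS = [300, 600, 1200, 1800, 2650]  # MHz, hypothetical steps

def next_freq(current_idx, recent_utilization):
    """Pick the next frequency index from recent CPU utilization (0.0-1.0)."""
    if recent_utilization > 0.8 and current_idx < len(FREQ_STEPS) - 1:
        return current_idx + 1          # busy: ramp up one step
    if recent_utilization < 0.2 and current_idx > 0:
        return current_idx - 1          # idle: scale down one step
    return current_idx                  # otherwise hold

# A bursty, mostly-idle workload keeps dragging the frequency back down,
# so each new burst starts on a slow core.
idx = 0
trace = []
for util in [0.05, 0.05, 0.95, 0.95, 0.05, 0.05, 0.95]:
    idx = next_freq(idx, util)
    trace.append(FREQ_STEPS[idx])
print(trace)  # -> [300, 300, 600, 1200, 600, 300, 600]
```

Note how the final burst (utilization 0.95) executes at only 600\,MHz: the intervening idle periods already pulled the core back down.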
Android systems, however, elicit markedly different responses from these governors than do traditional DB systems.
Server-class database workloads try to extract as much work as possible from the hardware.
Repeated idling, such as from lower loads or IO-blocked operations, is interpreted by the governor as a signal to scale the CPU frequency down.
The frequency scaling operation is expensive: No activity can be scheduled for several milliseconds while the core is scaled up or down.
Hence, when the CPU is running at a low frequency, a database with a burst of work takes a double performance hit: first from having an initially slower CPU and second from waiting while the core scales up\footnote{Ironically, this means that a database running on a non-saturated CPU could significantly improve latencies by simply busy-waiting to keep the CPU pinned at a high frequency.}.
\begin{figure*}
\centering
\begin{subfigure}{0.30\textwidth}
\centering
\includegraphics[width=\textwidth,trim={30mm 10mm 0 0}]{graphics/Stacked_C.pdf}
\bfcaption{Saturated v. unsaturated}
\label{fig:clean_dirty_C}
\end{subfigure}
\begin{subfigure}{0.34\textwidth}
\centering
\includegraphics[width=\textwidth]{graphics/YCSB_WorkloadC_TimingA-freq_timeline.pdf}
\bfcaption{CPU frequency over time.}
\label{fig:C_frequency_shifts}
\end{subfigure}
\begin{subfigure}{0.32\textwidth}
\centering
\includegraphics[width=\textwidth,trim={5mm 10mm 0 0}]{graphics/C_latencies.pdf}
\bfcaption{Saturated v. unsaturated latency.}
\label{fig:C_governor_latencies}
\end{subfigure}
\vspace*{-3mm}
\bfcaption{YCSB Workload C}
\trimfigurespacing
\end{figure*}
\tinysection{No Frequency Scaling at Saturation} \label{no_saturation_scaling}
However, the effects of frequency scaling only manifest themselves when CPUs are unsaturated -- something that would be missed by traditional database benchmarks.
Figure~\ref{fig:clean_dirty_C} illustrates the effect of frequency scaling on database performance by injecting artificial delays in between queries.
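The delay-injection experiment of Figure~\ref{fig:clean_dirty_C} can be sketched as follows; `run_query` is a hypothetical stand-in for a real benchmark client, and the query counts and delay values are arbitrary:

```python
import time

def run_query():
    """Hypothetical stand-in for one benchmark query; replace with a real client call."""
    sum(range(10_000))  # a small, fixed amount of CPU work

def measure_latencies(n_queries, inter_query_delay_s):
    """Issue n queries, sleeping between them, and record per-query latency."""
    latencies = []
    for _ in range(n_queries):
        start = time.perf_counter()
        run_query()
        latencies.append(time.perf_counter() - start)
        time.sleep(inter_query_delay_s)  # injected idle time between queries
    return latencies

# Saturated (no delay) vs. unsaturated (idle gaps between queries).
saturated = measure_latencies(20, 0.0)
unsaturated = measure_latencies(20, 0.02)
```

On a phone governed by load-based frequency scaling, the idle gaps in the unsaturated run let the CPU clock down, so its per-query latencies inflate even though each query does identical work; on a tethered desktop the two runs can look nearly identical.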


Unsurprisingly, the way embedded databases are used on smartphones is quite different:
\item Data accesses on a smartphone may be triggered in response to a variety of touch or camera gestures, specific sensor inputs, recognized activities, network connectivity, or any of a range of other events, making it difficult to synthesize realistic workloads.
\item Small differences in hardware or operating system can lead to significant changes in system performance, but it is not reasonable to expect researchers to individually obtain the dozens of smartphones necessary to test on an appropriate range of possible configurations.
\end{compactitem}
\tinysection{Infrastructure Overview}
This proposal aims to create infrastructure that makes it easier to measure and interpret the performance of data management systems on mobile platforms. If funded, this proposal will:
(1) lower barriers to entry for research on \PocketData, and
(2) make it easier for researchers to obtain reproducible performance measurements of \PocketData systems.
Specifically, we will develop one core infrastructure component and three additional software components as follows.
\tinysection{Fundamental Infrastructure}
\textit{Component 1:} \textit{Deploy a diverse set of smartphones into a publicly accessible smartphone testbed.} \\[0.5mm]
As of September 2017, the Google play store registered 13 different versions of Android~\cite{google:playstorestats}
running on tens of thousands of Android-compatible devices~\cite{opensignal:androidfragmentation}.
Small hardware differences can lead to significant changes in system performance.
We will develop such an array as a resource for the community.
%Goal 3 will enable consistent, reproducible, low-cost performance evaluations for \PocketData research.
\tinysection{Tools, Resources, and Data Sets}
\textit{Component 2:} \textit{Tooling required to reproducibly evaluate the performance of mobile data management systems.}\\[0.5mm]
\PocketData systems rarely operate at saturation, so classical performance measures like throughput are less useful than latency or power usage~\cite{DBLP:conf/tpctc/KennedyACZ15,nuessle2019notyourfather}.
Worse still, at low throughputs, noise from background activities, frequency scaling, and environmental factors can disrupt measurements.
We will develop a modular toolkit for obtaining reproducible measurements of \PocketData systems and techniques.
%measurement under multiple settings.
% A clean-room environment with most self-management features disabled will produce consistent baseline measures to supplement a variety of environments that more closely replicate real-world conditions.
% Through the toolkit and documentation on best-practices for performance evaluation on mobile phones, Goal 1 will enable consistent,
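As one illustration of latency-oriented reporting (a sketch only; the toolkit's actual metrics and data format are not specified here), per-query latencies can be summarized by nearest-rank percentiles, which expose the tail behavior that a mean or throughput figure hides:

```python
# Summarizing a latency sample by percentiles instead of mean throughput.
# The sample values are invented; a few background-noise outliers dominate the tail.

def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending-sorted, non-empty list."""
    k = max(0, min(len(sorted_values) - 1, round(p / 100 * len(sorted_values)) - 1))
    return sorted_values[k]

latencies_ms = sorted([1.2, 1.3, 1.1, 9.8, 1.2, 1.4, 22.0, 1.3, 1.2, 1.3])
summary = {p: percentile(latencies_ms, p) for p in (50, 95, 99)}
print(summary)  # -> {50: 1.3, 95: 22.0, 99: 22.0}
```

The median is steady at 1.3\,ms, but the two outliers (plausibly background activity or a frequency-scaling stall) own the entire tail, which is exactly what a throughput-only report would average away.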
\textit{Component 3:} \textit{Establish realistic benchmark workloads for pocket-scale data management systems.} \\[0.5mm]
In the wild, data access is triggered by a variety of stimuli: varying network conditions, numerous onboard sensors, power status, and a range of gestures~\cite{DBLP:journals/pvldb/NandiJM13}, making it difficult to create reliably representative synthetic workloads.
As part of the NSF-funded \PhoneLab{} testbed~\cite{DBLP:conf/sensys/NandugudiMKBDKQ13}, during the planning phase of this proposal, we collected and analyzed query logs from phones in the wild (an extension of \cite{DBLP:conf/tpctc/KennedyACZ15}).
These logs will allow us to create synthetic workloads that are verifiably representative of real-world behavior.
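One simple way such logs can seed a generator (a sketch under an assumed `(timestamp, query-kind)` log format; the real trace schema may differ) is to bootstrap-resample the empirical inter-arrival gaps and query kinds:

```python
import random

# Hypothetical log format: (timestamp_s, query_kind) pairs from a collected trace.
log = [(0.0, "SELECT"), (0.4, "SELECT"), (0.5, "UPSERT"),
       (3.1, "SELECT"), (3.2, "SELECT"), (9.0, "UPSERT")]

gaps = [b[0] - a[0] for a, b in zip(log, log[1:])]   # empirical inter-arrival times
kinds = [kind for _, kind in log]                     # empirical query-kind mix

def synthesize(n, rng=random.Random(42)):
    """Resample gaps and kinds to build an n-event synthetic workload."""
    t, events = 0.0, []
    for _ in range(n):
        t += rng.choice(gaps)            # bootstrap-resampled inter-arrival time
        events.append((round(t, 2), rng.choice(kinds)))
    return events

workload = synthesize(5)
```

Resampling preserves the burstiness of the trace (short gaps within a burst, long gaps between bursts) without replaying any individual user's log verbatim.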
\textit{Component 4:} \textit{Package components 2 and 3 into a plug-and-play benchmarking tool.} \\[0.5mm]
Simply having the technology to benchmark data management innovations is insufficient. These tools must be simple, easy to deploy, and ideally should include resources to aid in interpreting the results.
\tinysection{User Services}
We will provide an automated test submission system for testbed users.
A single command-and-control server will be connected to an array of mobile phones.
To launch a test, the command-and-control server will flash the target phone with the desired operating system version, install any instrumentation and user-provided software and libraries, and trigger the test.
We plan initially to target two basic use cases: (1) new database systems, and (2) new benchmark workloads.
Users will be able to upload new database systems either by providing source code that we will cross-compile and upload to the target phones, or by providing precompiled binaries.
We will deploy software components, including source code, documentation, and binaries through both the website developed during the pre-proposal phase~\cite{pocketdata:website}, as well as public open-data repositories.
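The launch sequence above might be orchestrated roughly as follows; every command name (`flash`, `install`, `exec`) and device identifier here is hypothetical, and a real deployment would shell out to device tooling such as fastboot and adb:

```python
# Sketch of the command-and-control test launch sequence described above.
# The command vocabulary is invented; `run` is injected so the sequence can
# be exercised without real hardware.

def launch_test(device_id, os_image, packages, test_cmd, run):
    """Flash the target phone, install software, then trigger the test."""
    run(["flash", device_id, os_image])          # reimage with the desired OS
    for pkg in packages:
        run(["install", device_id, pkg])         # instrumentation + user software
    return run(["exec", device_id, test_cmd])    # trigger the test itself

# Dry run with a recording stub instead of real device tooling.
calls = []
def fake_run(cmd):
    calls.append(cmd)
    return "ok"

result = launch_test("nexus6-03", "android-7.1.img",
                     ["instrumentation.apk", "userdb.apk"], "run_ycsb", fake_run)
```

Injecting the command runner keeps the orchestration logic testable in isolation, which matters when the real commands reimage physical phones.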
\tinysection{Target Community}
Research on mobile devices, as well as the more general space of the Internet of Things (IoT), cuts across communities that work on data management systems, real-time and embedded devices, programming languages, software engineering, and operating and mobile systems.
Research and development in this space is actively being pursued by both academia and industry.



\begin{center}
{\LARGE
\textsc{Community Outreach Documentation}
}
\end{center}
\hrule
Finally, we have budgeted a small amount of travel funding for the PIs to visit
\begin{center}
\begin{tabular}{>{\bf}r|%
>{\small}c%
>{\small}c%
>{\small}c%
}
\textbf{\large Activity}
Tutorial
& 1-2 Conference Participants
& 2-3 Conference Participants
& n/a
\\[1.2em]
Workshop
& n/a
& n/a
& 3-4 Conference Participants
\\[1.2em]
Website/Hosting



\input{header}
\fancyfoot[C]{I-\thepage}
\usepackage{paralist}
\usepackage{array}
\usepackage{multirow}
\newcommand{\PocketData}{\textsc{PocketData}}
\begin{document}
\begin{center}
{\LARGE
\textsc{Project Roles and Responsibilities}
}
\end{center}
\hrule
\newcommand{\allyears}[1]{
\multicolumn{3}{c}{ \hspace{5mm}
\dotfill
~~{\small #1}~~
\dotfill
\hspace*{5mm}
}
}
\begin{center}
\begin{tabular}{>{\bf\begin{flushright}}m{0.2\textwidth}<{\end{flushright}}|%
>{\small\centering\arraybackslash}p{0.23\textwidth}%
>{\small\centering\arraybackslash}p{0.23\textwidth}%
>{\small\centering\arraybackslash}p{0.23\textwidth}%
}
\textbf{\large Team Member}
& \textbf{\large Year 1}
& \textbf{\large Year 2}
& \textbf{\large Year 3}\\[1em] \hline
\multirow{3}{*}{Kennedy}
& \allyears{Coordinate UB participants}\\
& Organize Tutorial
&
& Co-Chair Workshop
\\\hline
Ziarek
&
& Organize Tutorial
& Co-Chair Workshop
\\\hline
\multirow{3}{*}{Kul}
& \allyears{Coordinate DESU participants}\\
&
&
& Co-Chair Workshop
\\\hline
UB Student
& \allyears{Develop \& Package Benchmarking Tool}
\\\hline
UB Postdoc
& \allyears{Build, Manage \& Maintain Testbed}
\\\hline
DESU Student
& \allyears{Develop \& Implement Synthetic Workload Generator}\\
\end{tabular}
\end{center}
\end{document}