Incorporating some of Geoff's suggestions

master
Oliver Kennedy 2015-06-22 13:37:46 -04:00
parent c1947ba304
commit b12976fea6
5 changed files with 54 additions and 39 deletions

@ -40,10 +40,6 @@ Website: \texttt{http://odin.cse.buffalo.edu/research/}
\section{Introduction}
\input{sections/1-introduction}
\section{Overview: Why TPC-MOBILE?}
\label{sec:overview}
\input{sections/2-overview}
\section{Experimental Setup}
\label{sec:experimental}
\input{sections/3-experimental}
@ -56,10 +52,14 @@ Website: \texttt{http://odin.cse.buffalo.edu/research/}
\label{sec:dba}
\input{sections/5-dba}
\section{Pocket Data, TPC-MOBILE, and Related Work}
\section{Pocket Data and Related Work}
\label{sec:pocketdata}
\input{sections/6-pocketdata}
\section{Why TPC-MOBILE?}
\label{sec:overview}
\input{sections/2-overview}
\section{Conclusions}
\label{sec:conc}
\input{sections/7-conclusions}

@ -1,7 +1,7 @@
% Mobile systems are important.
The world's 2~billion smartphones represent the most powerful and pervasive
distributed system ever built. And open application marketplaces, such as the
distributed system ever built. Open application marketplaces, such as the
Google Play Store, have resulted in a vibrant software ecosystem comprising
millions of smartphone and tablet apps in hundreds of different categories
that both meet existing user needs and provide exciting novel capabilities.
@ -14,9 +14,9 @@ private data, a task that is frequently performed using an \textit{embedded
database} such as SQLite~\cite{sqlite}. Android, the open-source and
widely-used smartphone platform, provides interfaces that simplify the
process of accessing private SQLite databases, and many apps make use of
SQLite for this purpose. In addition, the Android platform services that
provide the app interface make heavy use of SQLite, as do built-in (Mail,
Contacts) and popular apps (Gmail, Maps) and libraries (Google Play Services)
SQLite for this purpose. In addition, Android platform services themselves
make heavy use of SQLite, as do built-in apps (Mail,
Contacts), popular apps (Gmail, Maps), and libraries (Google Play Services)
distributed by Google. As a result, the large and growing number of mobile
apps using embedded databases represent a new and important class of database
clients.
@ -25,10 +25,11 @@ Unsurprisingly, mobile app usage of embedded databases is quite different
from the workloads experienced by database servers supporting websites or big
data applications. For example, while database servers are frequently tested
and tuned for continuous high-throughput query processing, embedded databases
experience lower-volume but bursty workloads produced by interactive use. And
while enterprise database servers are frequently provisioned to have
exclusive access to an entire machine, apps using embedded databases compete
for shared system resources with other apps and may be effected by
experience lower-volume but bursty workloads produced by interactive use.
As another example,
enterprise database servers are frequently provisioned to have
exclusive access to an entire machine, while apps using embedded databases compete
for shared system resources with other apps and may be affected by
system-wide policies that attempt to conserve limited energy on
battery-constrained mobile devices. So while the fundamental challenges
experienced by mobile apps using embedded databases---minimizing energy
@ -42,23 +43,32 @@ smartphone platform. Our analysis shows that the workloads experienced by
SQLite on these phones differ substantially from the database workloads
expressed by popular database benchmarking suites. We argue that a new
benchmark for mobile embedded databases is required to effectively measure
their performance could spur innovation in this area, and outline the
workload characteristics of such a benchmark. The main contributions of the
paper are the following:
%
\begin{itemize}
%
\item The synthesis of an open source data set containing all database
queries and query performance statistics generated by the personal
smartphones of 11~\PhoneLab{} participants over one month.
%
\item A detailed examination of real-world Android SQLite~\cite{sqlite}
usage including a comparison to traditional TPC benchmark workloads.
%
\item An outline of workload characteristics for a
proposed TPC-MOBILE benchmark.
%
\end{itemize}
their performance, and that such a benchmark could spur innovation in this
area.
% \begin{itemize}
% %
% \item The synthesis of an open source data set containing all database
% queries and query performance statistics generated by the personal
% smartphones of 11~\PhoneLab{} participants over one month.
% %
% \item A detailed examination of real-world Android SQLite~\cite{sqlite}
% usage including a comparison to traditional TPC benchmark workloads.
% %
% \item An outline of workload characteristics for a
% proposed TPC-MOBILE benchmark.
% %
% \end{itemize}
Our specific contributions are as follows: (a)~A month-long trace of
SQLite usage under real-world conditions (details in
Section~\ref{sec:experimental}), (b)~An in-depth analysis of the complexity
(Section~\ref{sec:queryc}) and runtime (Section~\ref{sec:dba})
characteristics of SQL statements evaluated by SQLite during this
trace, (c)~A comparison of these characteristics to existing benchmarking
strategies (Section~\ref{sec:pocketdata}), and (d)~An overview of the
requirements for a new ``pocket data'' benchmark: TPC-MOBILE
(Section~\ref{sec:overview}).
%% LocalWords: smartphone Android Android's SQLite io smartphones
%% LocalWords: testbed PhoneLab TPC

@ -1,12 +1,15 @@
Our primary observation is that an embedded database workload in a modern mobile device includes a mix of both OLTP and OLAP characteristics. The majority of operations performed by SQLite are simple key-value manipulations and look-ups. However, a substantial fraction of the (comparatively read-heavy) workload consists of far more complex OLAP-style operations involving wide, multi-table joins, nested sub-queries, complex selection predicates, and aggregation.
Our primary observation was that a pocket data workload includes a mix of both OLTP and OLAP characteristics. The majority of operations performed by SQLite were simple key-value manipulations and look-ups. However, a substantial fraction of the (comparatively read-heavy) workload consisted of far more complex OLAP-style operations involving wide, multi-table joins, nested sub-queries, complex selection predicates, and aggregation.
Many of these workload characteristics are motivated by factors unique to embedded databases. For example, SQLite uses single-file databases that have a standard, platform-independent format. As a consequence, it is common to see entire databases, indexes and all, transported in their entirety through web downloads or as attachments to other files~\cite{Dit2015CIDR}. A common pattern we observed was for a cloud service to package a fragment of its state into a SQLite database, which could then be cached locally on the device for lower-latency and offline access.
Many of these workload characteristics appeared to be motivated by factors unique to embedded databases. For example, SQLite uses single-file databases that have a standard, platform-independent format. As a consequence, we saw indications of databases being transported in their entirety, indexes and all, through web downloads or as attachments to other files~\cite{Dit2015CIDR}. A common pattern we observed was for a cloud service to package a fragment of its state into a SQLite database, which could then be cached locally on the device for lower-latency and offline access.
Query optimization goals also differ substantially. For example, latency is a primary concern, but at vastly different scales. Over our one-month trial, the average SQL statement took 2 ms to evaluate, and even complex \texttt{SELECT} queries with 4-level deep nesting only took an average of 120 ms.
Query optimization goals also differ substantially for pocket data workloads. For example, latency is a primary concern, but at vastly different scales. Over our one-month trial, the average SQL statement took 2 ms to evaluate, and even complex \texttt{SELECT} queries with 4-level deep nesting only took an average of 120 ms.
Finally, unlike typical server-class benchmark workloads where throughput is a key factor, embedded databases have fixed, ``small data"~\cite{Dit2015CIDR} workloads and need to share computing resources fairly with other processes on the same device. This means that in stark contrast to server-class workloads, the database is idle more frequently. Periods of low-utilization are opportunities for background optimization, but must be managed against the needs of other applications running on the device, as well as the device's limited power budget. We use the term ``pocket data" to refer to data management settings that exhibit such characteristics.
Finally, unlike typical server-class benchmark workloads where throughput is a key factor, embedded databases have smaller workloads --- on the order of hundreds of rows at most. Moreover, embedded databases
need to share computing resources fairly with other processes on the same device. This means that, in stark contrast to server-class workloads, an embedded database is idle more frequently. Periods of low utilization are opportunities for background optimization, but must be managed against the needs of other applications running on the device, as well as the device's limited power budget.
Pocket data workloads represent a growing, and extremely important class of database consumers. Unfortunately, research and development on embedded databases (\textit{e.g.},~\cite{jeong2013iostack,kang2013xftl}) is presently obligated to rely on micro-benchmarks or anecdotal observations about the needs and requirements of embedded database engines. In this paper, we lay out the characteristics of a one month trace of SQLite operations performed on eleven Android smartphones participating in the \PhoneLab{} experimental platform~\cite{phonelab}. We believe that a new TPC-MOBILE benchmark that captures these characteristics can provide a principled, standardized way to evaluate advances in mobile database technology, which will in turn, help to drive the development of such advances.
Pocket data workloads represent a growing and extremely important class of database consumers. Unfortunately, research and development on embedded databases (\textit{e.g.},~\cite{jeong2013iostack,kang2013xftl}) is presently obligated to rely on micro-benchmarks or anecdotal observations about the needs and requirements of embedded database engines.
%In this paper, we laid out the characteristics of a one month trace of SQLite operations performed on eleven Android smartphones participating in the \PhoneLab{} experimental platform~\cite{phonelab}.
We believe that a new TPC-MOBILE benchmark that captures the characteristics observed in this paper can provide a principled, standardized way to evaluate advances in mobile database technology, which will, in turn, help to drive the development of such advances.
%% LocalWords: OLTP OLAP SQLite ms Android smartphones PhoneLab TPC

@ -6,7 +6,7 @@ receive discounted service in return for providing data to smartphone
experiments. \PhoneLab{} participants are balanced between genders and
distributed across ages, and thus representative of the broader smartphone
user population. \PhoneLab{} smartphones run a modified version of the
Android Open Source Platform (AOSP) 4.4.4 "KitKat" including instrumentation
Android Open Source Project (AOSP) 4.4.4 ``KitKat'' including instrumentation
and logging developed in collaboration with the mobile systems community.
Participating smartphones log experimental results which are uploaded to a
centralized server when the device is charging.
@ -30,10 +30,10 @@ Our trace data-set is drawn from publicly-available data provided by
released\footnote{\url{https://phone-lab.org/static/experiment/sample_dataset.tgz}}
complete trace data for their phones for March 2015. Of the eleven
participants, seven had phones that were participating in the SQLite
experiment every day for the full month, with the remaining phones were
experiment every day for the full month, with the remaining phones
active for 1, 3, 14, and 19 days. A total of 254 phone/days of data were
collected including 45,399,550 SQL statements. Of these, we were unable to
interpret 308,752 statements (~0.5\%) due to a combination of data corruption
interpret 308,752 statements ($\sim$0.5\%) due to a combination of data corruption
and the use of unusual SQL syntax. Results presented in this paper that
include SQL interpretation are based on the 45,090,798 queries that were
successfully parsed.

@ -1,4 +1,6 @@
In spite of the prevalence of mobile devices, relatively little attention has been paid to pocket-scale data management. We believe that this is, in large part, due to the lack of a common, overarching mechanism to evaluate potential solutions to known challenges in the space. In this section, we first explore some existing research on mobile databases, and in particular focus on how the authors evaluate their solutions. Then, we turn to existing benchmarking suites and identify specific disconnects that prevent them from being applied directly to model pocket data. In the process, we explore aspects of these benchmarks that could be drawn into a potential pocket-data benchmark.
In spite of the prevalence of SQL on mobile devices, and an increasing interest in so-called ``small data''~\cite{Dit2015CIDR}, relatively little attention has been paid to the rapidly growing \textit{pocket data} space.
%We believe that this is, in large part, due to the lack of a common, overarching mechanism to evaluate potential solutions to known challenges in the space.
In this section, we first explore some existing research on mobile databases, with a focus on how the authors evaluate their solutions. Then, we turn to existing benchmarking suites and identify specific disconnects that prevent them from being applied directly to model pocket data. In the process, we explore aspects of these benchmarks that could be drawn into a potential pocket data benchmark.
\subsection{Pocket Data Management}
\label{sec:pocketdata:related}