\documentclass[10pt,article,hidelinks]{nsfcnsproposal}
\usepackage{hyperref}
\usepackage{enumitem}
\setlist[itemize]{leftmargin=*,partopsep=5pt}
\setlist[enumerate]{leftmargin=*,partopsep=5pt}
\input{.xxxnote}
\pagestyle{plain}
\tightlists
\firmlists
\newcommand{\PhoneLab}{\textsc{PhoneLab}}
\hyphenation{Phone-Lab}
\newcommand{\PocketData}{\textsc{PocketData}}
\hyphenation{Pocket-Data}
\def\thetitle{CRI Pre-proposal: CI-New: Supporting Pocket-Scale Data Management Research}
\def\shorttitle{Pocket-Scale Data Management Research}
\def\theauthors{PI: Oliver Kennedy, Co-PIs: Geoffrey Challen and
Lukasz Ziarek (Univ. of Buffalo, Dept. of Comp. Sci. and Eng.)}
\def\shortauthors{Kennedy, Challen, Ziarek}
\def\submissiondate{10 Nov 2015}
\let\OLDthebibliography\thebibliography
\renewcommand\thebibliography[1]{
\OLDthebibliography{#1}
\setlength{\parskip}{0pt}
\setlength{\itemsep}{0pt plus 0.3ex}
}
\begin{document}
\chapterstyle{summary}
\chapter{\thetitle}
\chapterstyle{proposal}
\section{Infrastructure Description}
% A concise description of the infrastructure to be developed, enhanced, or
% sustained. This includes a description of major equipment needs for the
% project as well as other significant costs.
The world's 2~billion smartphones and 4~million apps have become a large part
of most people's computing experiences.
%
A common requirement of apps is persisting structured data, a task frequently
performed using an \textit{embedded database} such as SQLite.
%
These are heavily used, with Android smartphones generating an average of
more than two SQLite queries per second~\cite{pocketdata}.
%
We propose to enable research on smartphone embedded databases through two
infrastructure efforts: instrumentation and benchmarking.

\textbf{Goal 1:} \textit{Develop the infrastructure required to understand
how apps use embedded databases.} \\[0.5mm]
%
As part of a preliminary investigation, we used the \PhoneLab{} smartphone
testbed~\cite{phonelab} to deploy a version of SQLite instrumented to log
queries and statistics, such as per-query runtime and number of rows
returned.
%
Even with this minimal level of instrumentation, our preliminary
analysis~\cite{pocketdata} identified numerous opportunities for improvement:
both in SQLite itself, and in libraries and apps that invoke SQLite.
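
As a rough illustration of the kind of statistics mentioned above, the
following sketch wraps Python's standard \texttt{sqlite3} module so that each
statement is logged with its runtime and the number of rows it returns.
It is not the instrumentation deployed on \PhoneLab{}, which modified SQLite
itself; the \texttt{LoggingConnection} class, its log fields, and the
\texttt{contacts} table are hypothetical names chosen only for this example.

\begin{verbatim}
```python
import sqlite3
import time


class LoggingConnection:
    """Hypothetical wrapper (not the PhoneLab code) that records
    per-statement runtime and row counts at the application level."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.log = []  # one dict per executed statement

    def execute(self, sql, params=()):
        start = time.perf_counter()
        # fetchall() forces the statement to run to completion,
        # so the timing covers the whole query.
        rows = self.conn.execute(sql, params).fetchall()
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        self.log.append({"sql": sql, "rows": len(rows), "ms": elapsed_ms})
        return rows


db = LoggingConnection()
db.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO contacts (name) VALUES (?)", ("Ada",))
db.execute("INSERT INTO contacts (name) VALUES (?)", ("Grace",))
result = db.execute("SELECT name FROM contacts ORDER BY name")

# Print one line per logged statement: runtime, row count, SQL text.
for entry in db.log:
    print(f"{entry['ms']:.3f} ms, {entry['rows']} rows: {entry['sql']}")
```
\end{verbatim}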
We propose to widen our analytics
efforts to more smartphone platforms and embedded databases
and to expand our initial instrumentation.
%
For example, monitoring event handlers or performing program analysis will
allow us to link individual queries to higher-level app semantics.
%
% Similarly, filesystem instrumentation will allow us to monitor indirect
% database updates such as an app simply downloading an entire embedded
% database.
%
The result %output of this part of the project
will be a %n instrumentation and analysis
library that can be used
by app and platform developers
to collect and analyze data about embedded database performance.

\textbf{Goal 2:} \textit{Establish a benchmark for pocket-scale data
management systems.} \\[0.5mm]
%
Understanding the challenges faced by mobile apps in their use of
embedded databases will allow us to address our second goal.
%
We will establish a set of standards, benchmarks, and metrics that can be
used to evaluate data management systems for mobile devices.
%
% We propose to establish a \PocketData{} benchmark suite for evaluating
% data management solutions that persist and query structured app state.
%
This \PocketData{} benchmark suite will evaluate
embedded databases, as well as higher-level primitives such
as object-relational mappers and lower-level primitives such as key-value
stores.

Our approach differs from current benchmarks in three important ways:
%
% \begin{enumerate}
%
% \item
(1)
We will use our traces to build as many specialized benchmarks as
needed to reflect the many ways apps store and access
structured data;
%
% \item
(2)
We will evaluate systems based on steady-state load, rather than
pushing them to saturation;
%
% \item
and (3)
We will allow for non-traditional database interfaces by using
semantic rather than syntactic workloads.
%
% \end{enumerate}
%
We will also work to enable transparent per-app
selection and optimization of different embedded database engines.
%
This will both help developers improve their apps by determining the most
appropriate way to store structured data and allow real-world
evaluation of approaches to pocket-scale data management.
%
We also anticipate that this feature will help drive data collection and
benchmark adoption by harnessing developers eager to improve their apps'
performance.

We have budgeted for two graduate students, each responsible for one project
goal.
%
Our budget also supports effort for all three investigators, who will jointly
advise the graduate students.
%
The budget includes support for travel and publicity, allowing us to continue
to build external support from app developers and the database community
(see Community Involvement below).
%
We anticipate needing minimal hardware: two personal computers for the
graduate students, and mobile phones to drive development efforts and
testing.
%
Software and benchmarking results produced by this project will be
disseminated through a combination of public software repositories such as
GitHub and public websites.
\section{Research Focus}
% The CISE Research Focus. This section describes the research focus that is
% enabled by the infrastructure, the importance of the research problems to
% advancing CISE research frontiers, and the expertise of the research team
% relative to the focused research thrust. The description should identify
% the project team and detail each member's contributions to the project as
% well as specific expertise relative to the proposed focused research
% agenda.
Unsurprisingly, the workloads that apps place on embedded databases are quite
different from those handled by database servers supporting websites, data
analytics, or cloud applications.
%
For example, while database servers are tested and tuned for high-throughput
continuous query processing, embedded databases experience lower-throughput
bursts of queries due to interactive use.
%
While the fundamental challenges faced by embedded databases---such as
minimizing energy consumption, query latency, and disk utilization---are
familiar ground for database researchers, the specific tradeoffs produced by
each app's unique workload characteristics are not well understood.
%
A natural first step in smartphone embedded database research is collecting
the data required to identify efficiency bottlenecks, common usage patterns,
and other opportunities for improvement.
%
We have already released a preliminary dataset collected on \PhoneLab{} and
an accompanying analysis~\cite{pocketdata}.
%
NSF support will allow us to continue this work by expanding our data
collection to produce more and higher-quality data.
%
We will make this data freely available to the database research community so
that it can be used to understand and improve embedded databases.

As opportunities for improvement are identified, it will become necessary to
evaluate proposed solutions.
%
Our second goal is to develop and maintain a benchmark generator and suite of
benchmarks for pocket-scale data management.
%
We will also play an active role in publishing benchmark results for novel
approaches to smartphone data management.
%
In this way, we hope to spur further research on embedded databases and on
related topics.
%
Performance results will also help encourage app developers to improve their
apps, and embedded database developers to improve their libraries.

The project team unites researchers with complementary expertise in databases,
mobile systems, and program analysis.
%
PI Kennedy will contribute expertise in databases, and in particular the
synthesis of databases and compilers to produce highly specialized data
management solutions.
%
Co-PI Challen will contribute expertise in mobile systems and
research infrastructure.
%
Co-PI Ziarek will contribute expertise in compilers, program analysis, and embedded
devices.
%
Team members have collaborated successfully on multiple projects in the past,
including a language construct for runtime
adaptation~\cite{Challen:2015:MWE:2699343.2699361} and adaptive index
structures~\cite{kennedy2015just}.
\section{Sample Research Project}
In our preliminary analysis~\cite{pocketdata}, we noted that known
limitations of object-relational mappers were causing a significant increase
in the number of queries processed by SQLite.
%
Compiler-based database integrations like
LMS~\cite{Rompf:2015:FPS:2784731.2784760} or StatusQuo~\cite{StatusQuo}
begin to address these limitations.
%
Evaluating these ideas requires an instrumentation and benchmarking suite,
such as the one that we propose to provide.
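
One commonly cited limitation of object-relational mappers is the ``N+1''
pattern, in which related rows are loaded lazily with one query per object
instead of a single join.
Whether this is the specific limitation observed in the preliminary analysis
is an assumption made here purely for illustration.
The sketch below uses Python's standard \texttt{sqlite3} module and
hand-written helper functions in place of a real mapper; the \texttt{author}
and \texttt{post} tables and the helper names are hypothetical.

\begin{verbatim}
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)"
)
conn.executemany(
    "INSERT INTO author VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace"), (3, "Edsger")],
)
conn.executemany(
    "INSERT INTO post VALUES (?, ?, ?)",
    [(1, 1, "a"), (2, 2, "b"), (3, 3, "c")],
)

issued = []  # SQL text of every query sent through run()


def run(sql, params=()):
    """Execute a query and record that it was issued."""
    issued.append(sql)
    return conn.execute(sql, params).fetchall()


def titles_lazy():
    """Mimic lazy loading: one query for posts, then one per post."""
    posts = run("SELECT title, author_id FROM post ORDER BY id")
    return [
        (title, run("SELECT name FROM author WHERE id = ?", (aid,))[0][0])
        for title, aid in posts
    ]


def titles_joined():
    """Fetch the same data with a single join."""
    return run(
        "SELECT post.title, author.name FROM post "
        "JOIN author ON author.id = post.author_id ORDER BY post.id"
    )


lazy_rows = titles_lazy()
lazy_count = len(issued)

issued.clear()
joined_rows = titles_joined()
joined_count = len(issued)

print(f"lazy: {lazy_count} queries, joined: {joined_count} query")
```
\end{verbatim}

Both functions return the same rows, but the lazy variant issues one query
per post in addition to the initial query, so its query count grows with the
size of the result.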
\section{Community Involvement}
Our preliminary analysis~\cite{pocketdata} was presented to the Transaction
Processing Performance Council (TPC) during its annual workshop.
%
The TPC is an organization that oversees many of the popular database
benchmarks, and can provide feedback and publicity for our efforts.
%
We believe the TPC is excited about this project. We have been working with
TPC leadership, including
%Meikel Poess of Oracle and
Raghunath Nambiar of Cisco, to obtain buy-in from smartphone vendors.
%
We have also been discussing our project with potential users and
contributors.
%
For example, Arnab Nandi from Ohio State has stated an interest in
contributing traces from his efforts on interactive smartphone data
management~\cite{Jiang:2015:SPI:2809974.2809986}, and Sharad Agarwal at
Microsoft Research has expressed interest in utilizing the benchmark.
%
We have also received expressions of interest from other embedded database
and mobile systems researchers, both in academia and in industry, and the
requested funding will allow us to continue our outreach efforts.
\section{Relevance to CISE}
All three investigators are active in CISE research communities, including
databases (Kennedy), mobile systems (Challen), and programming languages
(Ziarek).
%
We anticipate that the proposed project will be of primary benefit to the
database and mobile systems communities.
%
Finally, the project has not received any previous support from the NSF or
any other funding source.
{\scriptsize
\bibliographystyle{nsf}
\bibliography{preliminary}
}
\end{document}