Revising intro

master
Oliver Kennedy 2018-04-15 15:13:40 -04:00
parent e823f33a3f
commit 6eb35b31cb
25 changed files with 66 additions and 30 deletions

0
.gitignore vendored Executable file → Normal file

0
ACM-Reference-Format.bst Executable file → Normal file

Binary file not shown.

0
acmart.cls Executable file → Normal file

0
acmart.dtx Executable file → Normal file

0
acmart.ins Executable file → Normal file


@ -13,4 +13,39 @@
booktitle={USENIX Annual Technical Conference},
pages={309--320},
year={2013}
}
}
@inproceedings{DBLP:conf/vldb/ChaudhuriN07,
author = {Surajit Chaudhuri and Vivek R. Narasayya},
booktitle = {{VLDB}},
pages = {3--14},
publisher = {{ACM}},
title = {Self-Tuning Database Systems: {A} Decade of Progress},
year = 2007
}
@inproceedings{DBLP:conf/vldb/AgrawalCN00,
author = {Sanjay Agrawal and Surajit Chaudhuri and Vivek R. Narasayya},
booktitle = {{VLDB}},
pages = {496--505},
publisher = {Morgan Kaufmann},
title = {Automated Selection of Materialized Views and Indexes in {SQL} Databases},
year = 2000
}
@inproceedings{DBLP:conf/sigmod/AkenPGZ17,
author = {Van Aken, Dana and Pavlo, Andrew and Gordon, Geoffrey J. and Zhang, Bohan},
booktitle = {{SIGMOD} Conference},
pages = {1009--1024},
publisher = {{ACM}},
title = {Automatic Database Management System Tuning Through Large-scale Machine Learning},
year = 2017
}
@inproceedings{DBLP:conf/edbt/IdreosMG12,
author = {Stratos Idreos and Stefan Manegold and Goetz Graefe},
booktitle = {{EDBT}},
pages = {566--569},
publisher = {{ACM}},
title = {Adaptive indexing in modern database kernels},
year = 2012
}


@ -1,12 +1,15 @@
\documentclass[sigconf, anonymous]{acmart}
% \documentclass[sigconf]{acmart}
%\documentclass[sigconf]{acmart}
%\documentclass{vldb}
\documentclass{vldb}
\input{preamble}
\usepackage{balance} % for \balance command ON LAST PAGE (only there!)
% \toappear{}
\newtheorem{example}{Example}
\usepackage[dvipsnames]{xcolor}
%\numberofauthors{1}
\begin{document}
@ -14,29 +17,27 @@
\title{Summarizing Small Data Workloads}
% \author{
% \alignauthor
% Gokhan Kul, Gourab Mitra, Oliver Kennedy, Lukasz Ziarek\\
% \affaddr{University at Buffalo, SUNY}\\
% \email{\{gokhanku, gourabmi, okennedy, lziarek\}@buffalo.edu}
% }
\author{
Anonymous authors
\alignauthor
Gokhan Kul, Gourab Mitra, Oliver Kennedy, Lukasz Ziarek\\
\affaddr{University at Buffalo, SUNY}\\
\email{\{gokhanku, gourabmi, okennedy, lziarek\}@buffalo.edu}
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\maketitle
\begin{abstract}
\input{sections/0-abstract.tex}
\end{abstract}
%>>>> Include a list of keywords after the abstract
\keywords{Benchmark, Database, Workload, Mobile Systems}
% \keywords{Benchmark, Database, Workload, Mobile Systems}
\maketitle
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Introduction}

0
sections/0-abstract.tex Executable file → Normal file

33
sections/1-introduction.tex Executable file → Normal file

@ -12,35 +12,34 @@
%Although there is a limited number of choices for database management systems available for smartphones, we anticipate the release of alternative systems soon.
%Mobile phones have become ubiquitous in the last few decades.
Many modern smartphone apps, operating systems, and services need to persist structured data.
Many modern smartphone applications (apps), operating systems, and services need to persist structured data.
For this task, developers typically turn to an embedded database like SQLite, which is a part of most modern smartphone operating systems.
Embedded databases play a significant role in the performance of smartphone apps and can easily become sources of user-perceived latency~\cite{yang-icse15}.
Crafting apps with good user experiences thus often requires tuning indexes, schemas, or other configuration options to the needs of the app.
Unfortunately, these needs can be hard to characterize and optimize for.
The server-class workloads that the database community is familiar with are typically high-volume streams of homogeneous queries from a mix of simultaneous users.
In contrast, each smartphone app has a dedicated database, and is typically used by only one user for a variety of tasks that are usually performed one at a time.
As a consequence, the database workload created by a typical app (e.g., Figure~\ref{fig:sampleFacebook}) is bursty, variable, noisy, and as a result can be hard to summarize.
Hence, it is more common for researchers and app developers to synthesize workloads to experimentally evaluate tuning options~\cite{kim2012androbench}.
Unfortunately, these synthetic workloads are typically created in controlled settings, often without any guarantees that they are representative of real world usage.
Providing database support for good user experiences presently requires tuning indexes, schemas, or other configuration options to the needs of the app.
While the process of (automated) database tuning has received significant attention~\cite{DBLP:conf/vldb/ChaudhuriN07,DBLP:conf/vldb/AgrawalCN00,DBLP:conf/sigmod/AkenPGZ17}, each solution relies on a representative model of the database workload.
\begin{figure}[h!]
In the server-class systems that the database community is familiar with, workloads are typically high-volume streams of homogeneous queries from a mix of simultaneous users.
Hence, while there may be shifts in workload frequency, the workload itself can be modeled by a representative sample of queries.
Conversely, each smartphone app has a dedicated database, and is typically used by only one user for a variety of tasks that are usually performed one at a time.
Simple workload samples are not representative of the bursty, variable, and noisy database access patterns of a typical app (e.g., Figure~\ref{fig:sampleFacebook}).
\begin{figure}
\centering
\includegraphics[width=0.45\textwidth]{graphics/ChangeOverTimeData}
\caption{Sample Facebook Workload}
\label{fig:sampleFacebook}
\end{figure}
The problems of workload synthesis and summarization are linked: a good summary identifies features that need to be reproduced in a synthetic workload.
In this paper we tackle both problems, creating a process for extracting representative summaries from smartphone database workload traces, which in turn can be used to synthesize representative workloads.
Nominally, this requires us first to understand how users interact with the app, and second how these interactions translate into database activity.
Naively, we might do this by instrumenting the app: monitoring user interactions and the app's database activity.
However, this is not always feasible.
In this paper, we develop a process for modeling smartphone database workload activity.
Nominally, this requires us to understand (1) how users interact with the app, and (2) how these interactions translate into database activity.
The most direct way to do this would be to instrument the app to monitor user interactions, as well as the resulting database activity.
Even assuming that it is possible to modify the app --- which is not always the case --- such instrumentation is not always productive.
For example, latency-sensitive operations like list scrolling can trigger rapid sequences of single-row or range queries~\cite{DBLP:conf/sigmod/EbensteinKN16}, but can be hard to instrument without affecting user experience.
Similarly, queries triggered by a list scroll may be offloaded to a worker thread, making it difficult to associate them with the scrolling action.
In short, the direct approach of app instrumentation is hard and needs to be repeated for each app nearly from scratch.
Such queries are frequently offloaded to background worker threads, making it hard to attribute these queries to any specific user action.
In short, directly instrumenting the app is not always feasible.
We propose a more straightforward summarization technique that only requires a log of the app's queries.
This in turn can be obtained by simply linking the app against an appropriately instrumented embedded database library~\cite{kennedy2015pocket}.
Such a log can be obtained by simply linking the app against an appropriately instrumented embedded database library~\cite{kennedy2015pocket}.
Overtly, our approach is similar to the direct one: We first summarize user interactions with the app and then the effect of these interactions on the database.
To summarize user interactions, we treat the query log as a collection of \emph{sessions}, or bursts of database activity typically triggered by self-contained user activities, such as checking a Facebook feed or composing an email.
After partitioning a log into sessions, we attempt to recover the specific class of interaction associated with each sequence, mapping each session to one of a set of \emph{session categories}.

0
sections/2-background.tex Executable file → Normal file

0
sections/3-systemoverview.tex Executable file → Normal file

0
sections/3a-clustering.tex Executable file → Normal file

0
sections/3b-patternmatching.tex Executable file → Normal file

0
sections/3d-resourceutilization.tex Executable file → Normal file

3
sections/4-experiments.tex Executable file → Normal file

@ -1,3 +1,4 @@
%!TEX root=../paper.tex
In this section, we describe the datasets, the environment we performed our experiments in, and the experiment designs along with their results.
All of our experiments were run on a machine with a 3.6 GHz 6th-generation Intel i7 processor and 16 GB of RAM. We used the Java SE 1.8 Runtime Environment and R v3.3.2 on the Ubuntu 16.04 operating system.
@ -274,7 +275,7 @@ The aim of this experiment is to investigate if detected session clusters corres
\begin{figure}[h!]
\centering
\includegraphics[width=0.45\textwidth]{graphics/activityRecognition}
\includegraphics[width=0.45\textwidth]{graphics/ActivityRecognition}
\vspace{-0.5cm}
\caption{Activity recognition performance for different profiler methods}
\label{fig:activityRecognition}

0
sections/4-experiments.tex.bak.tex Executable file → Normal file

0
sections/4-sessionclustering.tex Executable file → Normal file

0
sections/4a-sessionidentification.tex Executable file → Normal file

0
sections/4b-profiler.tex Executable file → Normal file

0
sections/4c-analyzer.tex Executable file → Normal file

0
sections/5-conclusion.tex Executable file → Normal file

0
sections/6-discussion.tex Executable file → Normal file

0
sections/6-futurework.tex Executable file → Normal file

0
vldb.cls Executable file → Normal file