Sec 3 - Oliver's review

master
gourabmi 2017-11-02 14:51:23 -04:00
parent 161b27c59d
commit 9eea5d0047
3 changed files with 11 additions and 2 deletions

@@ -162,4 +162,13 @@
booktitle={Proceedings of the BCS-IRSG 22nd annual colloquium on information retrieval research},
pages={57--66},
year={2000}
}
@article{rabl2009generating,
title={Generating Shifting Workloads to Benchmark Adaptability in Relational Database Systems},
author={Rabl, Tilmann and Lang, Andreas and Hackl, Thomas and Sick, Bernhard and Kosch, Harald},
journal={TPCTC},
volume={9},
pages={116--131},
year={2009},
publisher={Springer}
}

@@ -136,7 +136,7 @@ NoSQL database benchmarks~\cite{cooper2010YCSB, council2017tpcxiot}, appear to b
However, the queries are still pre-set in TPCx-IoT~\cite{council2017tpcxiot}. YCSB~\cite{cooper2010YCSB} only measures the performance of key-value stores, and must be extended to process more complex queries.
Additionally, the workload created is still homogenous and sequential.
Most benchmarks like the TPC-C focus on emulating homogenous query workloads of an OLTP system. Their goal is to analyze throughput for these homogenous workloads. But it is not correct to truly emulate smartphone query workloads without emulating the intermittent bursts of query activity. These bursts can only be detected by looking at the chronological attributes like query timestamp and query interarrival time.
Most benchmarks, such as TPC-C, focus on testing the peak performance of an OLTP system on homogenous query workloads~\cite{rabl2009generating}. Their goal is to analyze throughput for these homogenous workloads. However, smartphone query workloads cannot be emulated faithfully without also emulating the intermittent bursts of query activity. These bursts can only be detected by examining chronological attributes such as query timestamp and query interarrival time.
Another level of abstraction is needed to extract meaningful patterns from the query log.
There are also some mobile system database micro benchmarks such as AndroBench~\cite{liu2014application}, which was designed to evaluate the storage performance of the device, and not the database management system itself.

@@ -12,7 +12,7 @@ interval of time. Such bursts of intermittent activity are captured in database
\label{fig:approach}
\end{figure*}
A logical task, called an activity, performed by a user on a smartphone, such as checking for new email, might produce multiple queries to the database. Since smartphone applications keep switching between foreground and background, these queries could be arbitrarily spaced out in time. Hence, one database user session might contain one or more logical user activities. Similarly, a logical user activity might be spread across multiple database user sessions. These sessions are useful in capture subset of logical activities which are repetitive. Since there is no discrete indicator of the start and end of a database user session for smartphones --- most users keeps apps open continuously, we use a heuristic to help define one.
We define an activity as a logical task performed by a user on a smartphone, such as checking for new email; a single activity might produce multiple queries to the database. Since smartphone applications keep switching between foreground and background, these queries could be arbitrarily spaced out in time. In our approach, a database session is a logical unit of user interaction: it spans a period of time and comprises sequential queries. Hence, one database user session might contain one or more logical user activities. Similarly, a logical user activity might be spread across multiple database user sessions. These sessions are useful for capturing the subset of logical activities that are repetitive. Since there is no discrete indicator of the start and end of a database user session for smartphones (most users keep apps open continuously), we use a heuristic to define one.
If two queries in a log have timestamps whose difference in time exceeds a specified threshold, we consider them to be a part of different sessions.
After partitioning the query log into sessions, we cluster the sessions that consist of \emph{similar} activities, describing each session by the frequency of each query pattern detected in it.
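As a sketch of the two steps above, using notation that is introduced here only for illustration: let the log contain queries $q_1, \dots, q_n$ with timestamps $t_1 \le \dots \le t_n$, and let $\tau > 0$ be the threshold. Assuming the threshold is applied to consecutive queries, a session boundary falls between $q_i$ and $q_{i+1}$ exactly when
\[
t_{i+1} - t_i > \tau .
\]
Assuming further that $m$ query patterns $p_1, \dots, p_m$ have been detected and that raw counts serve as frequencies, a session $S$ can be summarized by the vector
\[
f(S) = \bigl(c_1(S), \dots, c_m(S)\bigr), \qquad c_j(S) = \bigl|\{\, q \in S : q \mbox{ matches } p_j \,\}\bigr| ,
\]
and two sessions are treated as similar when their vectors are close under whichever distance the clustering uses. For example, with $\tau = 60$ seconds and timestamps $0, 5, 12, 200, 203$ (in seconds), the only gap exceeding $\tau$ is $200 - 12 = 188$, so the log splits into the sessions $\{q_1, q_2, q_3\}$ and $\{q_4, q_5\}$.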