\begin{figure*}[t] \begin{subfigure}[t]{0.33\textwidth} \includegraphics[width=\textwidth]{./graphs/activity/All_Devices__All_Apps__All_Queries__AllPreviousQueryCDFGraph.pdf} \caption{} \label{fig-overview-interarrival} \end{subfigure}% \begin{subfigure}[t]{0.33\textwidth} \includegraphics[width=\textwidth]{./graphs/activity/All_Devices__All_Apps__All_Queries__AllRuntimeCDFGraph.pdf} \caption{} \label{fig-overview-runtime} \end{subfigure}% \begin{subfigure}[t]{0.33\textwidth} \includegraphics[width=\textwidth]{./graphs/activity/All_Devices__All_Apps__All_Queries__AllRowcountCDFGraph.pdf} \caption{} \label{fig-overview-rowcount} \end{subfigure}% \caption{\textbf{Summary Statistics for Android SQLite Queries. Distributions of (a) inter-query arrival times, (b) query runtimes, and (c) rows returned per query.}} \label{fig-overview} \end{figure*} Next, we look at overall runtime characteristics of the query workload observed during our study. We examine how often queries arrive, how long they run, and how many rows they return---all important inputs into designing the TPC-Mobile embedded database benchmark. \subsubsection{General Characteristics} Figure~\ref{fig-overview} shows query interarrival times, runtimes, and returned row counts (for \texttt{SELECT} statements) for all users, applications, and non-informational query types (\texttt{SELECT}, \texttt{UPDATE}, \texttt{INSERT}, \texttt{DELETE}) included in our dataset. Given that each mobile application is really generating an isolated workload to its own embedded database, we measure query interarrival time only between queries issued by the same application. Examining the interarrival times shown in Figure~\ref{fig-overview-interarrival}, it is interesting to observe that many queries seem to arrive much more quickly than the minimum query runtime shown in Figure~\ref{fig-overview-runtime}. Part of this may be due to apps that use multiple separate databases, which is not yet captured by our analysis. However, our logging is also done above any locking performed by SQLite, and so this may demonstrate that there are many cases where multiple application threads are issuing overlapping queries in parallel, even if the queries are eventually serialized before results are returned. Figure~\ref{fig-overview-interarrival} also shows that, in addition to a standard long-tailed distribution of query inter-arrival times, about 20\% of the workload is very periodic, arriving at a rate of 0.01~Hz. The runtime CDF shown in Figure~\ref{fig-overview-runtime} shows while overall query runtimes show variation over several orders of magnitude, a large fraction of queries are executed in between 100~and~1000~$\mu$s. Further investigation into the small fraction of extremely slow queries may discover areas for database or application improvement. Finally, the row count CDF shown in Figure~\ref{fig-overview-rowcount} shows that 80\% of queries return only one row, further supporting our observation that many applications seem to be using the SQLite database almost as a key-value store. \begin{figure*}[t] \begin{subfigure}[t]{0.33\textwidth} \includegraphics[width=\textwidth]{./graphs/activity/All_Devices__All_Apps__All_Queries__ByTypePreviousQueryCDFGraph.pdf} \caption{} \label{fig-type-interarrival-prev} \end{subfigure}% \begin{subfigure}[t]{0.33\textwidth} \includegraphics[width=\textwidth]{./graphs/activity/All_Devices__All_Apps__All_Queries__ByTypeNextQueryCDFGraph.pdf} \caption{} \label{fig-type-interarrival-next} \end{subfigure}% \begin{subfigure}[t]{0.33\textwidth} \includegraphics[width=\textwidth]{./graphs/activity/All_Devices__All_Apps__All_Queries__ByTypeRuntimeCDFGraph.pdf} \caption{} \label{fig-type-runtime} \end{subfigure}% \caption{\textbf{By-Query-Type Statistics for Android SQLite Queries. Distribution of times since the query (a) immediately preceding, and (b) immediately following the query in question. (c) Distribution of runtimes for each query.}} \label{fig-type} \end{figure*} \subsubsection{Runtime Characteristics by Query Type} Figure~\ref{fig-type} shows runtime characteristics for each of the four types of SQL statement. Figure~\ref{fig-type-interarrival-prev} and \ref{fig-type-interarrival-next} in particular show the time since the last query to be issued and the time until the next query is issued (respectively), while Figure~\ref{fig-type-runtime} shows the distribution of runtimes for each type of query. Examining the differences between Figures~\ref{fig-type-interarrival-prev} and \ref{fig-type-interarrival-next}, we observe that \texttt{INSERT} queries are far more likely to arrive shortly before another query than shortly after. Almost 80\% of \texttt{INSERT}s are followed by another query within 100$\mu$s. A similar, but far more subdued pattern can be seen for \texttt{UPDATE} statements. Conversely, both \texttt{SELECT} and \texttt{DELETE} statements are slightly more likely to arrive shortly before, rather than shortly after another query. Figure~\ref{fig-type-runtime} shows significant deviations from the global average runtime for \texttt{DELETE} and \texttt{UPDATE} statements. \texttt{UPDATE} statements in particular have a bimodal distribution of runtimes, spiking at 100$\mu$s and 10ms. We suspect that this performance distribution is related to SQLite's use of filesystem primitives for locking and write-ahead logging~ \cite{jeong2013iostack,kang2013xftl}. This could also help to explain the 0.01Hz query periodicity we observed above. \begin{figure*}[t] \centering \includegraphics[width=0.6\textwidth]{./graphs/activity/All_Devices__Top_10__Key} \begin{subfigure}[t]{0.33\textwidth} \includegraphics[width=\textwidth]{./graphs/activity/All_Devices__Top_10__All_Queries__ByAppPreviousQueryCDFGraph_noKey.pdf} \caption{} \label{fig-app-interarrival} \end{subfigure}% \begin{subfigure}[t]{0.33\textwidth} \includegraphics[width=\textwidth]{./graphs/activity/All_Devices__Top_10__All_Queries__ByAppRuntimeCDFGraph_noKey.pdf} \caption{} \label{fig-app-runtime} \end{subfigure}% \begin{subfigure}[t]{0.33\textwidth} \includegraphics[width=\textwidth]{./graphs/activity/All_Devices__Top_10__All_Queries__ByAppRowcountCDFGraph_noKey.pdf} \caption{} \label{fig-app-rowcount} \end{subfigure}% \caption{\textbf{Per-App Summary Statistics for Android SQLite Queries. Distributions of (a) inter-query arrival times, (b) query runtimes, and (c) rows returned per query.}} \label{fig-app} \end{figure*} \subsubsection{Runtime Characteristics by Application} Figure~\ref{fig-app} shows query interarrival times, runtimes, and returned row counts for ten of the most active SQLite clients. As seen in Figure~\ref{fig-app-interarrival}, the 0.01Hz periodicity is not unique to any one application, further suggesting filesystem locking as a culprit. Two of the most prolific SQLite clients, \textit{Google Play services} and \textit{Media Storage} appear to be very bursty: 70\% of all statements for these applications are issued within 0.1ms of the previous statement. Also interesting is the curve for queries issued by the \textit{Android System} itself. The interarrival time CDF appears to be almost precisely logarithmic for rates above 10$\mu$s, but has a notable lack of interarrival times in the 1ms to 10ms range. This could suggest caching effects, with the cache expiring after 1ms. As seen in Figure~\ref{fig-app-runtime}, most apps hold to the average runtime of 100$\mu$s, with several notable exceptions. Over 50\% of the \textit{Android System}'s statements take on the order of 1ms. Just under 20\% of \textit{Hangouts} statements take 10ms, suggesting an update-heavy workload. Also, \textit{Contacts Storage} has a heavier-duty workload, with 30\% of statements taking between 100$\mu$s and 1ms. Figure~\ref{fig-app-rowcount} shows that the \textit{Android System} and \textit{Media Storage} issue almost exclusively single-row lookup queries. The remaining apps issue a large number of single-row queries --- Even \textit{Contacts Storage} has a workload consisting of 45\% single-row reads --- the number of rows returned in general varies much more widely. Many of these apps' user interfaces have both a list and a search view that show multiple records at a time, suggesting that these views are backed directly by SQLite. Although all apps have long tails, two apps in particular: \textit{Gmail} and \textit{Google+} are notable for regularly issuing queries that return on the order of 100 rows. %% LocalWords: Android SQLite runtimes TPC interarrival apps CDF ms %% LocalWords: SQLite's App bursty Gmail