Minor edits

master
Oliver Kennedy 2018-04-29 00:23:25 -04:00
parent 23819edfb7
commit c282dfb2d9
1 changed files with 16 additions and 17 deletions

@ -115,11 +115,11 @@ We apply our session identification methodology, and perform session clustering
We apply our session identification methodology, and perform session clustering on a distance matrix built by comparing the feature appearance frequencies of each pair of sessions with JS-Divergence.
% \end{itemize}
\begin{figure}[h!]
\begin{figure}
\centering
\includegraphics[width=0.45\textwidth]{graphics/WorkloadPredictibility}
\vspace{-0.5cm}
\caption{Prediction Accuracy Comparison}
\caption{Session Prediction Accuracy}
\label{fig:averagesimilarity}
\trimfigurespacing
\end{figure}
@ -140,11 +140,11 @@ The experiment results confirm our argument by consistently showing comparable c
%We believe that a random session selection among the 5184 sessions can provide us with a representative query set of the workloads for all users since 90\% average similarity means the query set represents 90\% of all the sessions in the dataset. The average, minimum and maximum session lengths are given in Table~\ref{tab:sessionlength}.
\begin{figure}[h!]
\begin{figure}
\centering
\includegraphics[width=0.45\textwidth]{graphics/SessionIdentificationComp}
\vspace{-0.5cm}
\caption{Session Identification Impact Comparison}
\caption{Session Identification Impact}
\label{fig:sessionIdentification}
\end{figure}
@ -161,11 +161,8 @@ For our experiments, we selected Facebook to be our example app. For visual purp
For this specific user, the log contains 8856 rows of parsable queries, of which only 431 are unique.
We prepare ground truth cluster labels by manually inspecting all the unique queries within a user's query log for the Facebook app. The accuracy of the clustering result is measured by comparing each query's cluster assignment against the ground truth.
\begin{table}[h!]
\begin{figure}
\centering
\caption{Clustering accuracy for a random user}
\label{tab:clusteringAccuracy}
\vspace{-0.2cm}
\begin{tabular}{ccc}
~ & \textbf{Facebook} \\ \hline
\# of queries & 8856 \\
@ -175,23 +172,22 @@ We prepare ground truth cluster labels by manually inspecting all the unique que
\# of inaccurately placed queries & 511 \\
Accuracy & 94.2\% \\ \hline
\end{tabular}
\end{table}
\caption{Clustering accuracy for a random user}
\label{tab:clusteringAccuracy}
\trimfigurespacing
\end{figure}
\begin{figure}[h!]
\vspace{-1cm}
\begin{figure}
\centering
\includegraphics[width=0.5\textwidth]{graphics/dendrogram}
\vspace{-1.5cm}
\caption{Query Clustering Dendrogram of Facebook usage for a user}
\label{fig:dendrogram}
\vspace{-0.5cm}
\trimfigurespacing
\end{figure}
\begin{table}[h!]
\begin{figure}
\centering
\caption{Clusters extracted from a user's Facebook workload}
\label{tab:clusteringresult}
\vspace{-0.2cm}
\begin{tabular}{cc}
\hline
\textbf{Cluster} & \textbf{Explanation} \\ \hline
@ -205,7 +201,10 @@ We prepare ground truth cluster labels by manually inspecting all the unique que
7 & Consistency check \\
8 & Housekeeping \\ \hline
\end{tabular}
\end{table}
\caption{Clusters extracted from a user's Facebook workload}
\label{tab:clusteringresult}
\trimfigurespacing
\end{figure}
%Keep in mind that PocketData dataset is an anonymized dataset where most of the constant values are replaced with ``?'', which reduces the number of distinct queries greatly.
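
Note on the first hunk: the context line there describes session clustering over a distance matrix of JS-Divergence values computed from per-session feature appearance frequencies. The paper source shown in this commit does not include the computation itself, so the following is only a minimal sketch of one way it could look. The function names (`js_divergence`, `normalize`, `distance_matrix`), the representation of a session as a list of feature counts over a shared feature vocabulary, and the use of base-2 logarithms are all assumptions made for illustration, not the authors' implementation.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions.

    p and q are equal-length lists of probabilities that each sum to 1.
    With base-2 logarithms the result lies in [0, 1].
    """
    def kl(a, b):
        # Kullback-Leibler divergence; terms with a[i] == 0 contribute 0.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    # Mixture distribution; strictly positive wherever p or q is positive.
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def normalize(counts):
    """Turn raw feature counts into appearance frequencies.

    Assumes the session has at least one feature occurrence (hypothetical
    precondition; an empty session would need separate handling).
    """
    total = sum(counts)
    return [c / total for c in counts]


def distance_matrix(sessions):
    """Pairwise JS-Divergence matrix over sessions given as feature counts."""
    dists = [normalize(s) for s in sessions]
    n = len(dists)
    return [[js_divergence(dists[i], dists[j]) for j in range(n)]
            for i in range(n)]
```

Under these assumptions, two sessions with identical feature frequencies are at distance 0 and two sessions that share no features are at distance 1. The resulting matrix could then be handed to any clustering routine that accepts precomputed distances.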