%!TEX root=../main.tex
\section{Experiments}
\label{sec:experiments}
\begin{figure}
\includegraphics[width=\columnwidth]{figures/tpch.pdf}
\trimmedcaption{Compile times on TPC-H using a set of 7 rules.}
\label{fig:results}
\end{figure}
Evaluation was performed on a 3.5\,GHz AMD Ryzen 9 5950X 16-core CPU, running Linux 6.2.6, OpenJDK 11.0.19, and Scala 2.12.15.
Reported results are averaged over 10 runs, after 4 discarded burn-in runs to trigger JIT compilation.

We implemented the Astral compiler, as well as its work-sharing optimization, as a rewriter for Spark 3.4.1.
We evaluated Astral's work-sharing optimization by manually translating 7 rewrite rules (selected for relevance to the 22 queries of the TPC-H workload) into Astral:
\textbf{PushProjectionThroughUnion},
\textbf{PushProjectionThroughLimit},
\textbf{ReorderJoin},
\textbf{EliminateOuterJoin},
\textbf{PushDownPredicates},
\textbf{PushDownLeftSemiAntiJoin}, and
\textbf{CollapseProject}.
\Cref{fig:results} compares the runtime of a naive rule-at-a-time optimizer (\textbf{Astral-Standalone}) to an optimizer leveraging the work-sharing optimization (\textbf{Astral-Shared}).
Results are normalized to the rule-at-a-time runtime, which ranged from $35\,\mu s$ to $4\,ms$; these runtimes were comparable to those of Spark's optimizer using the same rules, and both optimizers produced comparable query plans.
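To make the normalization explicit, each bar in the figure can be read as the ratio (a minimal formalization; the symbols $t_{\text{shared}}$ and $t_{\text{standalone}}$ are our notation, not taken from the measurement harness):
\[
  \text{normalized runtime} \;=\; \frac{t_{\text{shared}}}{t_{\text{standalone}}},
\]
so a value of $0.25$ corresponds to the shared optimizer running $4\times$ faster than the rule-at-a-time baseline.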
In general, work sharing can reduce runtimes significantly, by up to a factor of~4.
The main limiting factor of the work-sharing optimization is that the merged rewrite queries deviate from Spark's carefully selected rule evaluation order.
This can be seen in query 2, on which the shared optimizer does not converge, and in queries 8 and 9, where the shared optimizer requires twice as many iterations to converge.