%!TEX root=../main.tex
\section{Experiments}
\label{sec:experiments}
\begin{figure}
\includegraphics[width=\columnwidth]{figures/tpch.pdf}
\trimmedcaption{Compile times on TPC-H using a set of 7 rules.}
\label{fig:results}
\end{figure}
Evaluation was performed on a 3.5 GHz AMD Ryzen 9 5950X 16-Core CPU, with Linux 6.2.6, OpenJDK 11.0.19, Scala 2.12.15.
Results shown are averaged over 10 runs, each preceded by 4 discarded burn-in trials to trigger JIT compilation.
We implemented the Astral compiler, as well as its work-sharing optimization, as a rewriter for Spark 3.4.1.
We evaluated Astral's work-sharing optimization by manually translating 7 rewrite rules (selected for their relevance to the 22 queries of the TPC-H workload) into Astral:
\textbf{PushProjectionThroughUnion},
\textbf{PushProjectionThroughLimit},
\textbf{ReorderJoin},
\textbf{EliminateOuterJoin},
\textbf{PushDownPredicates},
\textbf{PushDownLeftSemiAntiJoin}, and
\textbf{CollapseProject}.
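To make the flavor of these rules concrete, the following toy Scala sketch illustrates the spirit of \textbf{CollapseProject} on a minimal plan algebra; the type and function names are our illustration, not Astral's rule language or Spark's actual API, and the toy rule is only valid because these projections merely select columns.

```scala
// Toy plan algebra (illustrative only, not Spark's or Astral's types).
sealed trait Plan
case class Scan(table: String) extends Plan
case class Project(cols: Seq[String], child: Plan) extends Plan

// Collapse adjacent projections into one, keeping the outer column list.
// Safe here because the toy projections only select columns; a real rule
// must also handle expressions, aliasing, and non-determinism.
def collapseProject(p: Plan): Plan = p match {
  case Project(outer, Project(_, inner)) =>
    collapseProject(Project(outer, inner))
  case Project(cols, child) =>
    Project(cols, collapseProject(child))
  case other => other
}
```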
\Cref{fig:results} compares the runtime of a naive rule-at-a-time optimizer (\textbf{Astral-Standalone}) to an optimizer leveraging the work-sharing optimization (\textbf{Astral-Shared}).
Results are normalized to the rule-at-a-time runtime, which varied from $35\,\mu s$ to $4\,ms$; these runtimes were comparable to those of Spark's optimizer using the same rules, and both produced comparable query plans.
In general, work sharing reduces runtimes significantly, by up to a factor of 4.
The main limiting factor of the work-sharing optimization is that the merged rewrite queries deviate from Spark's carefully selected rule evaluation order.
This can be seen in query 2, on which the shared optimizer does not converge, and in queries 8 and 9, where the shared optimizer requires twice as many iterations to converge.
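For intuition on why rule order and iteration counts matter, the following minimal Scala sketch (our illustration of a generic rule-at-a-time fixpoint loop, not the paper's actual implementation) applies rules in a fixed order until the plan stops changing; merging rules into a shared rewrite query changes both that order and the number of iterations needed to converge.

```scala
// Generic rule-at-a-time fixpoint evaluation (illustrative sketch):
// apply every rule in order, repeating until the plan stops changing
// or an iteration cap is hit (guarding against non-convergence, as
// observed on query 2).
def fixpoint[P](plan: P, rules: Seq[P => P], maxIters: Int = 100): P = {
  var current = plan
  var iters = 0
  var changed = true
  while (changed && iters < maxIters) {
    val next = rules.foldLeft(current)((p, rule) => rule(p))
    changed = next != current
    current = next
    iters += 1
  }
  current
}
```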