main
Oliver Kennedy 2023-07-13 21:36:18 -04:00
parent 9972e79863
commit 02f7e67df4
Signed by: okennedy
GPG Key ID: 3E5F9B3ABD3FDB60
4 changed files with 65 additions and 7 deletions

2
.gitignore vendored

@ -6,3 +6,5 @@
*.out
/main.pdf
*.synctex.gz
*.fls
*.fdb_latexmk


@ -45,13 +45,14 @@
%% for your publication.
%%
%%
\documentclass[sigconf,anonymous]{acmart}
\documentclass[sigconf]{acmart}
\usepackage{todonotes}
\definecolor{lightorange}{RGB}{255,230,200}
\newcommand{\OK}[1]{\todo[color=lightorange]{#1}}
\newcommand{\DB}[1]{\todo[color=green]{#1}}
\newcommand{\LZ}[1]{\todo[color=blue]{#1}}
\input{sections/macros}
%%
@ -149,7 +150,6 @@
\country{USA}
}
\author{???}
\email{???@amazon.com}


@ -1,12 +1,65 @@
%!TEX root=../main.tex
\section{Introduction}
Compiler theory is often expressed in terms of production rules: if a pattern of a given form is found in a program, the rule generates an output based on that pattern.
For example, the classical selection push-down rule from database compilers may be expressed as:
$$\production{\sigma_\theta(\pi_{A}(R))}{\pi_{A}(\sigma_\theta(R))}$$
In other words, given a relational algebra subtree consisting of a selection operator with a projection operator child, the entire subtree can be replaced by a projection operator with the selection operator child.
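As an illustrative instance (the relation $\textit{Emp}$ and its attributes $\textit{name}$ and $\textit{age}$ are hypothetical, chosen only for this example), the rule would rewrite a filter over a projected relation as:
$$\production{\sigma_{\textit{age} > 30}(\pi_{\textit{name}, \textit{age}}(\textit{Emp}))}{\pi_{\textit{name}, \textit{age}}(\sigma_{\textit{age} > 30}(\textit{Emp}))}$$
The rewrite is safe in this instance because the predicate refers only to $\textit{age}$, an attribute that the projection retains.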
As several prior works have observed~\cite{balakrishnan:2021:sigmod:treetoaster}\OK{there were a few HN posts some time ago about similar ideas...}, there is considerable overlap in how a compiler and a database are constructed.
For example, a key part of any compiler is the optimizer, which incrementally rewrites a program into an equivalent, albeit more efficient, program.
As observed by Balakrishnan et al.~\cite{}
Although production rules are elegantly declarative, this structure is not, as yet, exploited by production compilers.
In this paper, we propose a new ``declarative'' approach to writing compilers that achieves increased performance and scalability by viewing the compiler as a database.
For example, Apache Spark's Catalyst Optimizer implements its relational algebra optimization rewrites as a collection of about 400 structural \texttt{match} patterns.
As in most compilers, the optimizer iteratively searches the tree for matches, one pattern at a time.
Matches are rewritten, and the search starts over from scratch.
Some rules contain manually inserted heuristics: for example, applying one rule may immediately trigger a specific second rule.
However, searching for rewrite opportunities consumes a significant portion of the optimizer's runtime~\cite{balakrishnan:2021:sigmod:treetoaster}.
Conversely, manually inserted heuristics make the code less maintainable.
\paragraph{Declarative Compilers}
Logic in a declarative compiler is specified through a set of declarative production rules.
In this paper, we argue that program compilation is analogous to query processing, and that this relationship may be leveraged to create compilers that are maintainable, scalable, and performant.
In such a ``declarative'' compiler, logic for optimization and transformation is expressed through declarative production rules.
When the compiler itself is compiled, classical database optimizations (e.g., inlining, indexing, materialized views, and join order selection) are applied to these rules to derive logic
is expressed
, fundamentally
the database community
As others have noted, multiple
Compiler writers occasionally include heuristics to accelerate this search.
For example, if applying rule 1 frequently creates opportunities to apply rule 2, the rewrite might immediately force a search for matches to rule 2's pattern.
However, such heuristics
(e.g., rule 1 frequently creates opportunities to apply rule 2).
However,
% Ideas
% - SIGMOD '21
% - BDD optimization
% - + Recursion in future work
% - Aggregate properties
% - e.g., Spark's query feature bitvector
% - Recursive scans
% - Inlining opportunities
% - e.g., synthesize new rules by collapsing common sequences together
% - Rule development IDE
% - e.g., monitor predicted runtimes for queries
% - Lens-style parallel maintenance of logical and physical plans ()

3
sections/macros.tex Normal file

@ -0,0 +1,3 @@
%!TEX root=../main.tex
\newcommand{\production}[2]{\frac{#2}{#1}}