Stuff
parent
9972e79863
commit
02f7e67df4
|
@ -6,3 +6,5 @@
|
|||
*.out
|
||||
/main.pdf
|
||||
*.synctex.gz
|
||||
*.fls
|
||||
*.fdb_latexmk
|
||||
|
|
4
main.tex
|
@ -45,13 +45,14 @@
|
|||
%% for your publication.
|
||||
%%
|
||||
%%
|
||||
\documentclass[sigconf,anonymous]{acmart}
|
||||
\documentclass[sigconf]{acmart}
|
||||
|
||||
\usepackage{todonotes}
|
||||
\definecolor{lightorange}{RGB}{255,230,200}
|
||||
\newcommand{\OK}[1]{\todo[color=lightorange]{#1}}
|
||||
\newcommand{\DB}[1]{\todo[color=green]{#1}}
|
||||
\newcommand{\LZ}[1]{\todo[color=blue]{#1}}
|
||||
\input{sections/macros}
|
||||
|
||||
|
||||
%%
|
||||
|
@ -149,7 +150,6 @@
|
|||
\country{USA}
|
||||
}
|
||||
|
||||
|
||||
\author{???}
|
||||
\email{???@amazon.com}
|
||||
|
||||
|
|
|
@ -1,12 +1,65 @@
|
|||
%!TEX root=../main.tex
|
||||
\section{Introduction}
|
||||
Compiler theory is often expressed in terms of production rules: if a pattern of a given form is found in a program, the rule generates an output based on that pattern.
|
||||
For example, the classical selection push-down rule from database compilers may be expressed as:
|
||||
$$\production{\sigma_\theta(\pi_{A}(R))}{\pi_{A}(\sigma_\theta(R))}$$
|
||||
In other words, given a relational algebra subtree consisting of a selection operator with a projection operator as its child, the entire subtree can be replaced by a projection operator with the selection operator as its child.
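As a concrete instance, take $A = \{a, b\}$ and $\theta = (a > 5)$; these choices are purely illustrative and are not drawn from any particular workload. Under that assumption, the rule instantiates to:
$$\production{\sigma_{a > 5}(\pi_{a,b}(R))}{\pi_{a,b}(\sigma_{a > 5}(R))}$$
The rewrite preserves the result here because $\theta$ refers only to attributes retained by the projection.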
|
||||
|
||||
As several prior works have observed~\cite{balakrishnan:2021:sigmod:treetoaster}\OK{there were a few HN posts some time ago about similar ideas...}, there is considerable overlap in how a compiler and a database are constructed.
|
||||
For example, a key part of any compiler is the optimizer, which incrementally rewrites a program into an equivalent but more efficient program.
|
||||
As observed by Balakrishnan et al.~\cite{}
|
||||
Although production rules are elegantly declarative, this structure is not, as yet, exploited by production compilers.
|
||||
In this paper, we propose a new ``declarative'' approach to writing compilers that achieves increased performance and scalability by viewing the compiler as a database.
|
||||
|
||||
For example, Apache Spark's Catalyst Optimizer implements its relational algebra optimization rewrites as a collection of about 400 structural \texttt{match} patterns.
|
||||
As in most compilers, the optimizer iteratively searches the tree for matches, one pattern at a time.
|
||||
Matches are rewritten, and the search starts over from scratch.
|
||||
Some rules contain manually inserted heuristics: for example, applying one rule may immediately trigger a specific second rule.
|
||||
However, searching for rewrite opportunities consumes a significant portion of the optimizer's runtime~\cite{balakrishnan:2021:sigmod:treetoaster}.
|
||||
At the same time, manually inserted heuristics make the code less maintainable.
|
||||
|
||||
\paragraph{Declarative Compilers}
|
||||
Logic in a declarative compiler is specified through a set of declarative production rules.
|
||||
|
||||
|
||||
In this paper, we argue that program compilation is analogous to query processing, and that this relationship may be leveraged to create compilers that are maintainable, scalable, and performant.
|
||||
In such a ``declarative'' compiler, logic for optimization and transformation is expressed through declarative production rules.
|
||||
When the compiler itself is compiled, classical database optimizations (e.g., inlining, indexing, materialized views, and join order selection) are applied to these rules to derive logic
|
||||
|
||||
|
||||
is expressed
|
||||
|
||||
|
||||
|
||||
, fundamentally
|
||||
|
||||
the database community
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
As others have noted, multiple
|
||||
|
||||
|
||||
|
||||
|
||||
Compiler writers occasionally include heuristics to accelerate this search.
|
||||
For example, if applying rule 1 frequently creates opportunities to apply rule 2, the rewrite might immediately force a search for matches to rule 2's pattern.
|
||||
However, such heuristics
|
||||
|
||||
|
||||
|
||||
|
||||
(e.g., rule 1 frequently creates opportunities to apply rule 2).
|
||||
However,
|
||||
|
||||
|
||||
|
||||
% Ideas
|
||||
% - SIGMOD '21
|
||||
% - BDD optimization
|
||||
% - + Recursion in future work
|
||||
% - Aggregate properties
|
||||
% - e.g., Spark's query feature bitvector
|
||||
% - Recursive scans
|
||||
% - Inlining opportunities
|
||||
% - e.g., synthesize new rules by collapsing common sequences together
|
||||
% - Rule development IDE
|
||||
% - e.g., monitor predicted runtimes for queries
|
||||
% - Lens-style parallel maintenance of logical and physical plans ()
|
|
@ -0,0 +1,3 @@
|
|||
%!TEX root=../main.tex
|
||||
|
||||
\newcommand{\production}[2]{\frac{#2}{#1}}
|