Stuff

2023-07-13 21:36:18 -04:00 · 2023-07-13 21:36:18 -04:00 · 02f7e67df4
parent 9972e79863
commit 02f7e67df4
4 changed files with 65 additions and 7 deletions
--- a/.gitignore
+++ b/.gitignore
@ -6,3 +6,5 @@
 *.out
 /main.pdf
 *.synctex.gz
+*.fls
+*.fdb_latexmk
--- a/main.tex
+++ b/main.tex
@ -45,13 +45,14 @@
 %% for your publication.
 %%
 %%
-\documentclass[sigconf,anonymous]{acmart}
+\documentclass[sigconf]{acmart}

 \usepackage{todonotes}
 \definecolor{lightorange}{RGB}{255,230,200}
 \newcommand{\OK}[1]{\todo[color=lightorange]{#1}}
 \newcommand{\DB}[1]{\todo[color=green]{#1}}
 \newcommand{\LZ}[1]{\todo[color=blue]{#1}}
+\input{sections/macros}


 %%
@ -149,7 +150,6 @@
  \country{USA}
 }

-
 \author{???}
 \email{???@amazon.com}

--- a/sections/introduction.tex
+++ b/sections/introduction.tex
@ -1,12 +1,65 @@
 %!TEX root=../main.tex
-\section{Introduction}
+Compiler theory is often expressed in terms of production rules: if a pattern of a given form is found in a program, the rule generates an output based on that pattern.
+For example, the classical selection push-down rule from database compilers may be expressed as:
+\[\production{\sigma_\theta(\pi_{A}(R))}{\pi_{A}(\sigma_\theta(R))}\]
+In other words, given a relational algebra subtree consisting of a selection operator with a projection operator child, the entire subtree can be replaced by a projection operator whose child is the selection operator.

-As several prior works have observed~\cite{balakrishnan:2021:sigmod:treetoaster}\OK{there were a few HN posts some time ago about similar ideas...}, there is considerable overlap in how a compiler and a database are constructed.
-For example, a key part of any compiler is the optimizer, which incrementally rewrites a program into an equivalent, albeit more efficient program.
-As observed by Balakrishnan et. al.~\cite{}
+Although production rules are elegantly declarative, this structure is not, as yet, exploited by production compilers.
+In this paper, we propose a new ``declarative'' approach to writing compilers that achieves increased performance and scalability by viewing the compiler as a database.
+
+For example, Apache Spark's Catalyst Optimizer implements its relational algebra optimization rewrites as a collection of about 400 structural \texttt{match} patterns.
+As in most compilers, the optimizer iteratively searches the tree for matches, one pattern at a time;
+matches are rewritten, and the search starts over from scratch.
+Some rules contain manually inserted heuristics: for example, one rule's rewrite may directly trigger a search for matches to a specific second rule.
+However, searching for rewrite opportunities consumes a significant portion of the optimizer's runtime~\cite{balakrishnan:2021:sigmod:treetoaster}.
+Moreover, manually inserted heuristics make the code less maintainable.
+
+\paragraph{Declarative Compilers}
+Logic in a declarative compiler is specified through a set of declarative production rules.  
+
+
+In this paper, we argue that program compilation is analogous to query processing, and that this relationship may be leveraged to create compilers that are maintainable, scalable, and performant.
+In such a ``declarative'' compiler, logic for optimization and transformation is expressed through declarative production rules.
+When the compiler itself is compiled, classical database optimizations (e.g., inlining, indexing, materialized views, and join order selection) are applied to these rules to derive logic 
+
+
+is expressed 
+
+
+
+, fundamentally 
+
+the database community 





-As others have noted, multiple
+
+
+
+
+Compiler writers occasionally include heuristics to accelerate this search. 
+For example, if applying rule 1 frequently creates opportunities to apply rule 2, the rewrite might immediately force a search for matches to rule 2's pattern.
+However, such heuristics 
+
+
+
+
+ (e.g., rule 1 frequently creates opportunities to apply rule 2).
+However, 
+
+
+
+% Ideas
+% - SIGMOD '21
+% - BDD optimization
+%   - + Recursion in future work
+% - Aggregate properties
+%   - e.g., Spark's query feature bitvector
+%   - Recursive scans
+% - Inlining opportunities
+%   - e.g., synthesize new rules by collapsing common sequences together
+% - Rule development IDE
+%   - e.g., monitor predicted runtimes for queries
+% - Lens-style parallel maintenance of logical and physical plans ()
--- a/sections/macros.tex
+++ b/sections/macros.tex
@ -0,0 +1,3 @@
+%!TEX root=../main.tex
+
+\newcommand{\production}[2]{\frac{#2}{#1}}
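
Note on the selection push-down rule added to sections/introduction.tex: the sketch below shows one way such a production rule could be applied as a pattern match over a plan tree. It is an illustration only, not the paper's implementation and not Catalyst code. The class names (`Relation`, `Project`, `Select`) and the function `push_down_selection` are hypothetical, and predicates are kept as opaque strings.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# Hypothetical, minimal relational algebra tree (names are assumptions).
@dataclass(frozen=True)
class Relation:
    name: str


@dataclass(frozen=True)
class Project:
    attrs: Tuple[str, ...]
    child: "Plan"


@dataclass(frozen=True)
class Select:
    predicate: str  # opaque predicate text; never inspected here
    child: "Plan"


Plan = Union[Relation, Project, Select]


def push_down_selection(plan: Plan) -> Plan:
    """Rewrite every Select(Project(R)) subtree into Project(Select(R))."""
    if isinstance(plan, Relation):
        return plan
    # Normalize the child first, so a match created below is visible here.
    child = push_down_selection(plan.child)
    if isinstance(plan, Select) and isinstance(child, Project):
        # Pattern matched: swap the two operators, then keep pushing the
        # selection down in case another projection sits underneath.
        pushed = push_down_selection(Select(plan.predicate, child.child))
        return Project(child.attrs, pushed)
    if isinstance(plan, Select):
        return Select(plan.predicate, child)
    return Project(plan.attrs, child)
```

For instance, `push_down_selection(Select("a > 1", Project(("a",), Relation("R"))))` yields a `Project` node whose child is the `Select`. The function hard-codes a single rule and re-walks the whole tree on every call, which is the search cost the introduction describes for rule-at-a-time optimizers.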