Compare commits

...

2 Commits

Author SHA1 Message Date
Kristopher Micinski b60a3e39df Merge branch 'main' of https://git.odin.cse.buffalo.edu/Declarative-Compilers/paper-Declarative-Compilers 2023-07-17 19:53:57 -04:00
Kristopher Micinski 1d6403f821 progress on beginning of 1 2023-07-17 19:53:00 -04:00
2 changed files with 120 additions and 19 deletions

View File

@ -1,3 +1,79 @@
@article{Cocke:1977,
  author     = {Cocke, John and Kennedy, Ken},
  title      = {An Algorithm for Reduction of Operator Strength},
  year       = {1977},
  issue_date = {Nov. 1977},
  publisher  = {Association for Computing Machinery},
  address    = {New York, NY, USA},
  volume     = {20},
  number     = {11},
  issn       = {0001-0782},
  url        = {https://doi.org/10.1145/359863.359888},
  doi        = {10.1145/359863.359888},
  abstract   = {A simple algorithm which uses an indexed temporary table to perform reduction of operator strength in strongly connected regions is presented. Several extensions, including linear function test replacement, are discussed. These algorithms should fit well into an integrated package of local optimization algorithms.},
  journal    = {Communications of the ACM},
  month      = nov,
  pages      = {850--856},
  numpages   = {7},
  keywords   = {operator strength reduction, optimization of compiled code, compilers, program analysis, test replacement, strongly connected region}
}
@inproceedings{Tate:2009,
  author    = {Tate, Ross and Stepp, Michael and Tatlock, Zachary and Lerner, Sorin},
  title     = {Equality Saturation: A New Approach to Optimization},
  year      = {2009},
  isbn      = {9781605583792},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  url       = {https://doi.org/10.1145/1480881.1480915},
  doi       = {10.1145/1480881.1480915},
  abstract  = {Optimizations in a traditional compiler are applied sequentially, with each optimization destructively modifying the program to produce a transformed program that is then passed to the next optimization. We present a new approach for structuring the optimization phase of a compiler. In our approach, optimizations take the form of equality analyses that add equality information to a common intermediate representation. The optimizer works by repeatedly applying these analyses to infer equivalences between program fragments, thus saturating the intermediate representation with equalities. Once saturated, the intermediate representation encodes multiple optimized versions of the input program. At this point, a profitability heuristic picks the final optimized program from the various programs represented in the saturated representation. Our proposed way of structuring optimizers has a variety of benefits over previous approaches: our approach obviates the need to worry about optimization ordering, enables the use of a global optimization heuristic that selects among fully optimized programs, and can be used to perform translation validation, even on compilers other than our own. We present our approach, formalize it, and describe our choice of intermediate representation. We also present experimental results showing that our approach is practical in terms of time and space overhead, is effective at discovering intricate optimization opportunities, and is effective at performing translation validation for a realistic optimizer.},
  booktitle = {Proceedings of the 36th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages},
  pages     = {264--276},
  numpages  = {13},
  keywords  = {equality reasoning, compiler optimization, intermediate representation},
  location  = {Savannah, GA, USA},
  series    = {POPL '09}
}
@inproceedings{Luc:2008,
  author    = {Maranget, Luc},
  title     = {Compiling Pattern Matching to Good Decision Trees},
  year      = {2008},
  isbn      = {9781605580623},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  url       = {https://doi.org/10.1145/1411304.1411311},
  doi       = {10.1145/1411304.1411311},
  abstract  = {We address the issue of compiling ML pattern matching to compact and efficient decisions trees. Traditionally, compilation to decision trees is optimized by (1) implementing decision trees as dags with maximal sharing; (2) guiding a simple compiler with heuristics. We first design new heuristics that are inspired by necessity, a concept from lazy pattern matching that we rephrase in terms of decision tree semantics. Thereby, we simplify previous semantic frameworks and demonstrate a straightforward connection between necessity and decision tree runtime efficiency. We complete our study by experiments, showing that optimizing compilation to decision trees is competitive with the optimizing match compiler of Le Fessant and Maranget (2001).},
  booktitle = {Proceedings of the 2008 ACM SIGPLAN Workshop on ML},
  pages     = {35--46},
  numpages  = {12},
  keywords  = {heuristics, match compilers, decision trees},
  location  = {Victoria, BC, Canada},
  series    = {ML '08}
}
@inproceedings{Keep:2013,
  author    = {Keep, Andrew W. and Dybvig, R. Kent},
  title     = {A Nanopass Framework for Commercial Compiler Development},
  year      = {2013},
  isbn      = {9781450323260},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  url       = {https://doi.org/10.1145/2500365.2500618},
  doi       = {10.1145/2500365.2500618},
  abstract  = {Contemporary compilers must typically handle sophisticated high-level source languages, generate efficient code for multiple hardware architectures and operating systems, and support source-level debugging, profiling, and other program development tools. As a result, compilers tend to be among the most complex of software systems. Nanopass frameworks are designed to help manage this complexity. A nanopass compiler is comprised of many single-task passes with formally defined intermediate languages. The perceived downside of a nanopass compiler is that the extra passes will lead to substantially longer compilation times. To determine whether this is the case, we have created a plug replacement for the commercial Chez Scheme compiler, implemented using an updated nanopass framework, and we have compared the speed of the new compiler and the code it generates against the original compiler for a large set of benchmark programs. This paper describes the updated nanopass framework, the new compiler, and the results of our experiments. The compiler produces faster code than the original, averaging 15-27% depending on architecture and optimization level, due to a more sophisticated but slower register allocator and improvements to several optimizations. Compilation times average well within a factor of two of the original compiler, despite the slower register allocator and the replacement of five passes of the original 10 with over 50 nanopasses.},
  booktitle = {Proceedings of the 18th ACM SIGPLAN International Conference on Functional Programming},
  pages     = {343--350},
  numpages  = {8},
  keywords  = {nanopass, compiler, scheme},
  location  = {Boston, Massachusetts, USA},
  series    = {ICFP '13}
}
@inproceedings{balakrishnan:2021:sigmod:treetoaster,

View File

@ -8,40 +8,65 @@
\section{Introduction}
\label{sec:introduction}
Scalability poses a challenge for compilers and program analysis tools, which require large programs to be broken down into small, manageable chunks.
Even breaking down programs to smaller, more manageable analysis tasks can be slow and preclude potential analyses from being written idiomatically.
We propose a new relational-algebra-based language (called \systemlang) that reframes common compilation (and analysis) tasks as database operations.
\systemlang gives us a framework with which to unify existing database optimizations (e.g., work sharing from streaming systems)
with existing compiler tricks (e.g., Tree Toasting~\cite{balakrishnan:2021:sigmod:treetoaster}), laying the groundwork for a truly scalable, `declarative' compiler allowing
for greater scalability.
Compilers and program analyses grapple with exceedingly large state
spaces, a struggle which has historically necessitated that high-level
analysis tasks be broken into small, manageable chunks
(\emph{e}.\emph{g}., separate compilation and context-insensitive
program analyses). Unfortunately, such a granular analysis methodology
can often prohibit the idiomatic expression of useful analyses, and
the loss of precision may preclude useful optimizations. We propose a
new relational-algebra-based language (called \systemlang) that
reframes common compilation and analysis tasks as database
operations. \systemlang unifies existing database optimizations (e.g.,
work sharing from streaming systems) with existing compiler tricks
(e.g., Tree Toasting~\cite{balakrishnan:2021:sigmod:treetoaster}),
laying the groundwork for a truly scalable, ``declarative'' compiler
leveraging the rich history of database optimization.
\paragraph{Production Rules}
Compiler analyses are often expressed in terms of production rules: If a pattern of one form is found in a program, it generates an output based on that pattern.
For example, the classical selection push-down rule from database compilers may be expressed as:
Compiler transformations and optimizations are often expressed in
terms of production rules: if a pattern matching a certain form is
found, it is transformed according to the production rule. For
example, the classic push-down rule common in query plan optimization
may be expressed as:
$$\production{\sigma_\theta(\pi_{A}(R))}{\pi_{A}(\sigma_\theta(R))}$$
In other words, any relational algebra subtree of the form $\sigma_{\theta}(\pi_{A}(R))$ (where $R$ is any subtree) may be safely replaced by a new subtree of the form $\pi_{A}(\sigma_{\theta}(R))$ (where $A$, $\theta$, and $R$ take their values from the original tree).
In other words, any relational algebra expression
$\sigma_{\theta}(\pi_{A}(R))$ may be safely replaced by an equivalent
expression of the form $\pi_{A}(\sigma_{\theta}(R))$. Expressions are
often realized as trees, and thus this style of optimization may be
viewed as a tree transformation.
\paragraph{Match Patterns}
Production rules are typically implemented through a common construct in functional programming languages called `match'.
For example, the analogous rewrite could be expressed using match as:
Production rules are often implemented via ``match patterns'' in
functional languages; match patterns are ubiquitous in such languages
due to their enabling programming with algebraic data types~\cite{Luc:2008}.
For example, the push-down rule above could be expressed using
\texttt{match}:
\begin{lstlisting}
plan match {
case Filter(condition, Project(targetList, child)) =>
Project(targetList, Filter(condition, child))
}
\end{lstlisting}
The pattern checks to see if \lstinline{plan} represents a \lstinline{Filter} ($\sigma$) node.
If so, it binds its first field to the variable \lstinline{condition}, and checks to see if its second field is a \lstinline{Project} ($\pi$) node.
If so, it binds the two child variables, and runs the code to the right of the \lstinline{=>} operator to generate a replacement tree.
If so, it binds its first field to \lstinline{condition}, and checks to see if its second field is a \lstinline{Project} ($\pi$) node.
If so, it binds two child variables and runs the code to the right of \lstinline{=>} to build a replacement tree.
A typical relational query optimizer iterates over every operator in an entire relational algebra tree, trying out dozens or even hundreds of such pattern matching rules for each.
Whenever a match is found, the matched operator (and any relevant children) are replaced, and the search continues.
When no further rewrites apply (i.e., when all match patterns leave the tree untouched) --- a so-called fixed point --- or when a timeout is reached, the optimizer is done.
The key insights driving \systemlang are that:
A key design methodology for optimizers of relational query languages (and other paradigms) is to execute a large set of production rules simultaneously, iteratively inferring more optimal join plans until a fixed-point is reached (or time dictates we must give up). The key insights driving \systemlang are that:
(i) these match patterns are effectively queries over the relational algebra tree, and
(ii) reframing them as such allows us to employ classical database optimizations to create more scalable compilers.
% A typical relational query optimizer iterates over every operator in an entire relational algebra tree, trying out dozens or even hundreds of such pattern matching rules for each.
% Whenever a match is found, the matched operator (and any relevant children) are replaced, and the search continues.
% When no further rewrites apply (i.e., when all match patterns leave the tree untouched) --- a so-called fixed point --- or when a timeout is reached, the optimizer is done.
\subsection{Tree Toasting}
In \cite{balakrishnan:2021:sigmod:treetoaster}, we leveraged a similar insight to realize a compiler technique called Tree Toasting.
A toasted optimizer pre-computes the set of subtrees that match a rewrite rule's predicate.
@ -65,4 +90,4 @@ In this paper, we make the following contributions:
(iii) We develop a runtime for \systemlang based on work sharing in stream processing systems in \Cref{sec:queryEvaluation};
(iv) We adapt tree toasting to \systemlang in \Cref{sec:treetoasting};
(v) We evaluate \systemlang by re-implementing a fragment of Spark's Catalyst Optimizer in \Cref{sec:experiments}; and
(vi) We explore potential further ways to leverage the declarative nature of \systemlang in \Cref{sec:conclusions}.
(vi) We explore potential further ways to leverage the declarative nature of \systemlang in \Cref{sec:conclusions}.