finishing upstream and adding downstream

main
Oliver Kennedy 2023-03-20 10:32:24 -04:00
parent dd809554b3
commit 6724670efa
Signed by: okennedy
GPG Key ID: 3E5F9B3ABD3FDB60
1 changed files with 20 additions and 8 deletions

@ -64,8 +64,9 @@ The pattern for a specific cell is obtained by looking up the cell's column in t
\State $(\column_{d}, \rowRange_{d}) \leftarrow (\column_{d}, \rowRange_{d}) - \texttt{upstream}$
\If{$(\column_{d}, \rowRange_{d})$ is non-empty}
\State $\texttt{upstream} \leftarrow \texttt{upstream} + (\column_{d}, \rowRange_{d})$
\State $\texttt{queue}.\textbf{enqueue}( \column_{d}, \rowRange_{d},$
\State \hfill$\texttt{lineage} \cup \{\texttt{pattern} \rightarrow \texttt{offset}\} )$
\State $\texttt{queue}.\textbf{enqueue}( \column_{d}, \rowRange_{d},$\\
\hfill$\comprehension{ \texttt{p}' \rightarrow (\texttt{o}'+\texttt{offset})}{ (\texttt{p}' \rightarrow \texttt{o}') \in \texttt{lineage} }$\\
\hfill $\cup \{\texttt{pattern} \rightarrow \texttt{offset}\} )$
\EndIf
\EndFor
\EndFor
@ -117,16 +118,27 @@ Observe that this dependency chain is defined entirely by a single pattern: Each
We refer to a pattern that references cells defined by the same pattern as \emph{recursive}.
Note that the pattern's value may be self-referential, even if there is not a dependency cycle between the individual cells that the pattern defines.
When we encounter a recursive pattern, it may be possible to compute a closed form representation of its dependencies without visiting each individual dependency.
Continuing the example above, each of the pattern's cells depends on all of the preceding cells in the pattern.
Patterns allow absolute references or offset references, but the former cannot trigger a recursive pattern without creating a cycle in the dependency graph.
Thus, recursive dependencies must be at fixed offsets, and the transitive closure must have a closed form representation.
For example, consider a cell $\cellRef{\texttt{total}}{500}$ defined by a recursive pattern over rows $[1,1000]$, with a recursive pattern dependency on $\cellRef{\texttt{total}}{@-2}$.
The transitive closure of the cell's dependencies thus includes exactly the set of even rows (given the offset of $-2$) in the range $[1,500]$ (the cell through the start of the pattern's range).
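As a sketch of this closure, assuming the cell itself is counted (as the range $[1,500]$ above suggests), the reachable rows are
\[
  \{\, 500 - 2k \mid 0 \leq k \leq 249 \,\} = \{ 500, 498, \ldots, 4, 2 \},
\]
that is, 250 rows, no two of which are adjacent.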
TODO
Unfortunately, the size of the encoding of the range set needed to represent these dependencies scales with the number of rows, due to the gaps.
However, with the more common offset of $-1$, the entire set of rows can be defined by a single range.
We address this more common case here, and leave the more general case to future work.
Specifically, \textbf{getDeps} (\Cref{alg:getDeps}, line 3) tracks the offset of each dependency, while \textbf{upstream} (\Cref{alg:upstream}, lines 10--11) maintains a record of which patterns have been seen at which offsets in a lineage record.
In its naive implementation, \textbf{upstream} attempts to advance the frontier by one hop with each work unit (lines 5--14).
However, prior to line 5, we can check the lineage object to determine if the pattern defining the cell we are currently examining has previously been encountered along the path being advanced at an offset of $\pm 1$.
If so, we add the remainder of the range over which the pattern is defined in the direction indicated by the offset to the active range.
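For instance, under our reading of this rule, suppose a hypothetical cell $\cellRef{\texttt{total}}{500}$ is defined by a pattern over rows $[1,1000]$ that depends on $\cellRef{\texttt{total}}{@-1}$.
Once the pattern is found in the lineage at an offset of $-1$, a single step extends the active range to
\[
  [500,500] \cup [1,499] = [1,500],
\]
whereas an offset of $+1$ would instead add the rows $[501,1000]$.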
\paragraph{Downstream Reachability}
TODO:
Same algorithm as above, but use a reverse index.
Talk about maintaining the reverse index efficiently.
When a cell's expression is updated, cells that depend on it (even transitively) must be recomputed.
The index must thus also support downstream reachability queries.
To support these efficiently, we maintain a backward index that relates cell ranges to the ranges of patterns that depend on them.
Analogous to $\textbf{getDeps}$ inferring the cells immediately upstream of a range of cells, we can infer the cells downstream of any cell or set of cells, with one caveat.
When the cell identified by an absolute reference in a pattern is modified, all cells using the pattern are invalidated, so we track the set of ranges over which any given pattern is defined.
\paragraph{Column Insertions and Deletions}