Done with pass on App A

master
Atri Rudra 2021-09-20 23:27:03 -04:00
parent 14cd63ac12
commit 1d3a255539
2 changed files with 6 additions and 6 deletions

View File

@ -5,23 +5,23 @@
\subsection{\abbrTIDB{}s}
\label{sec:abbrtidbs}
In our definition of \abbrTIDBs (\Cref{subsec:tidbs-and-bidbs}), we assumed a model of \abbrTIDBs where each input tuple is assigned a probability $p$ of having multiplicity $1$. That is, we assumed inputs to be sets, but interpreted queries under bag semantics. Other sensible generalizations of \abbrTIDBs from set semantics to bag semantics also exist.
One very natural such generalization is to assign each input tuple $\tup$ a multiplicity $m_\tup$ and a probability $p$: the tuple exists with multiplicity $m_\tup$ with probability $p$, and otherwise has multiplicity $0$. If the maximal multiplicity of all input tuples in the \abbrTIDB is bounded by some constant, then a generalization of our hardness results and approximation algorithm can be achieved by changing the construction of lineage polynomials in \Cref{fig:nxDBSemantics} as follows (all other cases remain the same as in \Cref{fig:nxDBSemantics}):
\begin{align*}
\polyqdt{\rel}{\dbbase}{\tup} =&\begin{cases}
m_\tup X_\tup & \text{if }\dbbase.\rel\inparen{\tup} = m_\tup \\
0 &\text{otherwise.}\end{cases}
\end{align*}
That is, the variable representing a tuple is multiplied by $m_\tup$ to encode the tuple's multiplicity $m_\tup$. We note that our lower bounds still hold for this model, since they only need $m_\tup=1$ for all tuples $\tup$. Further, it can be argued that our proofs for the approximation algorithms also work for this model as is. The only change is that, since we now allow $m_\tup>1$, some of the constants in the runtime analysis of our algorithms change, but the overall asymptotic runtime bound remains the same.
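As a quick illustration (a minimal sketch, assuming that, as in \Cref{fig:nxDBSemantics}, a join multiplies the lineages of the joined tuples), consider a tuple $\tup$ with $m_\tup = 2$ and probability $p$, and a query that joins $\rel$ with itself on $\tup$. The modified construction gives
\begin{align*}
\poly = \inparen{2 X_\tup} \cdot \inparen{2 X_\tup} = 4 X_\tup^2,
\end{align*}
and since $X_\tup \in \{0,1\}$ we have $X_\tup^2 = X_\tup$, so the expected multiplicity is $4p$: with probability $p$ the tuple exists with multiplicity $2$ and the self-join produces $2 \cdot 2 = 4$ output copies, and otherwise it produces none.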
Yet another option would be to assign each tuple a probability distribution over multiplicities. It seems very unlikely that our results would extend to a model that allows arbitrary probability distributions over multiplicities (our current proof techniques break down in this setting). However, we note that the special case of a Poisson binomial distribution over multiplicities (i.e., a sum of independent but not necessarily identically distributed Bernoulli trials) can be handled as follows: we add an additional identifier attribute to each relation in the database. For a tuple $\tup$ with maximal multiplicity $m_\tup$, we create $m_\tup$ copies of $\tup$ with different identifiers. To answer a query over this encoding, we first project away the identifier attribute (note that, as per \Cref{fig:nxDBSemantics}, this projection adds up all the variables in $\poly$ that correspond to the same tuple $\tup$).
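For concreteness, here is a minimal sketch of this encoding (the probabilities $p_1, p_2, p_3$ below are hypothetical): a tuple $\tup$ with maximal multiplicity $m_\tup = 3$ is replaced by the three copies $\inparen{\tup, 1}, \inparen{\tup, 2}, \inparen{\tup, 3}$, where copy $\inparen{\tup, i}$ exists independently with probability $p_i$. After projecting away the identifier attribute, the lineage of $\tup$ is
\begin{align*}
X_{\tup, 1} + X_{\tup, 2} + X_{\tup, 3},
\end{align*}
whose value follows the Poisson binomial distribution over $\{0, 1, 2, 3\}$ with parameters $p_1, p_2, p_3$, and whose expectation is $p_1 + p_2 + p_3$.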
\subsection{\abbrBIDB{}s}
\label{sec:abbrbidbs}
The approach described above works for \abbrBIDB\xplural as well if we define the bag version of \abbrBIDB{}s to associate with each tuple $\tup$ a multiplicity $m_\tup$. Recall that we associate each tuple in a block with a unique variable. Thus, the modified lineage polynomial construction shown above can be applied to \abbrBIDB{}s too (and our approximation results also hold).
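As an illustrative sketch (the multiplicities and probabilities below are hypothetical), consider a block containing two mutually exclusive tuples $\tup_1$ and $\tup_2$ with multiplicities $m_{\tup_1} = 2$ and $m_{\tup_2} = 5$ and probabilities $p_1$ and $p_2$, where $p_1 + p_2 \leq 1$. Under the modified construction, a projection that merges the two tuples into a single output tuple has lineage
\begin{align*}
2 X_{\tup_1} + 5 X_{\tup_2},
\end{align*}
and, by linearity of expectation, the expected multiplicity of that output tuple is $2 p_1 + 5 p_2$ (at most one of $X_{\tup_1}, X_{\tup_2}$ is $1$ in any possible world, since the two tuples belong to the same block).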

View File

@ -331,7 +331,7 @@ Respectively, these are: \\
%\end{proof}
\section{Higher Moments}
\label{sec:momemts}
%
We make a simple observation to conclude the presentation of our results.
So far we have only focused on the expectation of $\poly$.