Finished pass on rebuttal.

master
Aaron Huber 2021-09-20 23:20:58 -04:00
parent c9e1b68d9c
commit 8a1f4c3b35
2 changed files with 11 additions and 30 deletions


@ -86,7 +86,7 @@ In this work, we study the complexity of \Cref{prob:bag-pdb-poly-expected} for s
\mypar{\abbrTIDB\xplural}
We initially focus on tuple-independent probabilistic bag-databases\footnote{See \cite{DBLP:series/synthesis/2011Suciu} for a survey of set-\abbrTIDBs; the bag encoding is analogous~\cite{DBLP:conf/pods/GreenKT07}.} (\abbrTIDB\xplural), a compressed encoding of probabilistic databases where the presence of each individual tuple (out of a total of $\numvar$ input tuples) in a possible world is modeled as an independent probabilistic event.\footnote{
This model is exactly the definition of \abbrTIDB{}s \cite{VS17} under set semantics. Note that this is only one possible definition of \abbrTIDB{}s under bag semantics. In \Cref{sec:gener-results-beyond} we discuss alternatives and to what degree our results extend to these alternatives.
This model is exactly the definition of \abbrTIDB{}s \cite{VS17} under set semantics. Note that this is only one possible definition of \abbrTIDB{}s under bag semantics. In \Cref{sec:gener-results-beyond} we discuss alternatives and to what degree our results extend to these alternatives.\label{footnote:set-not-limit}
% Mirroring the implementation of bag relations in production database systems (e.g., Postgresql, DB2), tuple multiplicities are modeled by retaining copies of each tuple (up to its largest possible multiplicity).
% % To make each duplicate tuple unique in a set-\abbrTIDB we can assign unique keys across all duplicates.
% When the multiplicity of input tuple is bound by some constant,


@ -43,7 +43,7 @@ We have made an explicit mention of data complexity when alluding to Dalvi and S
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{l.51 "Consider ... and tuples are independent random event": so this is actually a set PDB... You might want to use an example where the input PDB is actually a bag PDB. The last sentence before the example makes the reader *expect* that the example will be of a bag PDB that is not a set PDB}
Our revision has removed the example referred to above. While the paper considers inputs to queries that are equivalent to set-\abbrPDB, this is not limiting. Please see \Cref{footnote:set-not-limit} on \Cpageref{footnote:set-not-limit}. Furthermore, we have added a discussion to the appendix that expands on why our results do extend beyond set inputs.\BG{Add reference}
Our revision has removed the example referred to above. While the paper considers inputs to queries that are equivalent to set-\abbrPDB, this is not limiting. Please see \Cref{footnote:set-not-limit} on \Cpageref{footnote:set-not-limit}. Furthermore, we have added a discussion to the appendix that expands on why our results do extend beyond set inputs (\Cref{sec:gener-results-beyond}).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{- In the case of set semantics, the lineage of a tuple can be defined for *any* query: it is the unique Boolean function that satisfies the if and only if property that you mention on line 70. For bag semantics however, to the best of my knowledge there is no general definition of what is a lineage for an arbitrary query. On line 73, it is not clear at all how the polynomial should be defined, since this will depend on the type of query that you consider}
@ -54,7 +54,7 @@ We also note that these semantics are not novel (e.g., similar semantics appear
However, as we were unable to find a formal proof of the equivalence between the expectation of the query multiplicity and of the lineage polynomial in related work, we have included a proof of \Cref{prop:expection-of-polynom}.
\RCOMMENT{l.75 "evaluating the lineage of t over an assignment corresponding to a possible world": here, does the assignment assigns each tuple to true or false? In other words, do the variables X still represent individual tuples? From what I see later in the article it seems that no, so this is confusing if we compare to what is explained in the previous paragraph about set TIDB}
The discussion after \Cref{prob:bag-pdb-poly-expected} (in particular, the paragraph \textbf{\abbrTIDB\xplural}) specifically address these questions. While values for possible worlds assigned are from $\{0, 1\}$, which is analog to Boolean, but note \Cref{footnote:set-not-limit} and the new appendix \BG{add reference} \BG{REMOVED: which describes the encoding of a bag as a set.}
The discussion after \Cref{prob:bag-pdb-poly-expected} (in particular, the paragraph \textbf{\abbrTIDB\xplural}) specifically addresses these questions. While the values assigned in possible worlds are from $\{0, 1\}$, which is analogous to Boolean, this is not limiting. Please see \Cref{footnote:set-not-limit} $\inparen{\Cpageref{footnote:set-not-limit}}$ and the new appendix \Cref{sec:gener-results-beyond}.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{- l.135 "polynomial Q(X)": Q should be reserved for queries... You could use $\varphi$ or $\phi$ or... anything else but Q really}
@ -67,13 +67,11 @@ We have rewritten \Cref{sec:intro} in a way to stress that we are are primarily
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{A discussion is missing about the difference between the approach usually taken in PDB literature and your approach. In which case would one be more interested in the expected multiplicity or in the marginal probability of a tuple? This should be discussed clearly in the introduction, as currently there is no clear "motivation" to what you do. There is a section about related work at the end but it is mostly a set of facts and there is no insightful comparison to what you do.}
We provide more motivating examples in the first paragraph, and include a more detailed discussion of the relationship to sets in paragraph \textbf{Relationship to Set-Probabilistic Query Evaluation} after \Cref{prob:informal}.
\AH{We need to maybe talk about the motivation for computing expected multiplicity.}
\AR{Agree a line summarizing the stuff below is needed in the intro: I also modified the next line, so pls. chk!}
For example, expected multiplicities can model expectation of a \lstinline{COUNT(*)} query, while in many context computing the probability that this count is non-zero is not that useful.
For example, expected multiplicities can model expectation of a \lstinline{COUNT(*)} query, while in many contexts computing the probability that this count is non-zero is not that useful.
%As a trivial (albeit relevant) example, consider a model of a contact network.
%The probability that there exists at least one new COVID infection in the graph is far less informative than the expected number of new infections.
As we now explain in the introduction, another motivation for generalizing marginal probability to expected multiplicity is that the marginal probability of a tuple $t$ is the expectation of a Boolean random variable that is assigned 1 in every world where tuple $t$ exists and $0$ otherwise. For bag-PDBs the multiplicity of a query result tuple can be modeled as a natural-number random variable that for a world $\db$ is assigned the multiplicity of the tuple in $\db$. Thus, a natural generalization of the marginal probability (expectation of a Boolean random variable) to bags is the expectation of this variable: the tuple's expected multiplicity.
As we now explain in the introduction, another motivation for studying expected multiplicity is that it is a natural generalization of marginal probability. The marginal probability of a tuple $t$ is the expectation of a Boolean random variable that is assigned $1$ in every world where tuple $t$ exists and $0$ otherwise. For bag-PDBs the multiplicity of a query result tuple can be modeled as a natural-number random variable that for a world $\db$ is assigned the multiplicity of the tuple in $\db$. Thus, a natural generalization of the marginal probability (expectation of a Boolean random variable) to bags is the expectation of this variable: the tuple's expected multiplicity.
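For concreteness, the two quantities can be written side by side. This is only a sketch: the symbols $I_t$ (indicator that $t$ exists), $M_t$ (multiplicity of $t$), and $\Pr[\db]$ (probability of world $\db$) are illustrative notation assumed here, not notation taken from the paper.
\[
\expct\pbox{I_t} = \sum_{\db} \Pr[\db] \cdot I_t(\db)
\qquad\mbox{and}\qquad
\expct\pbox{M_t} = \sum_{\db} \Pr[\db] \cdot M_t(\db),
\]
with $I_t(\db) \in \{0,1\}$ and $M_t(\db) \in \semN$. The two expectations coincide whenever $t$ appears at most once in every world.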
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{l.176 "N[X] relations are closed under RA+": is this a *definition* of what it means to take an RA+ query and evaluate it over an N[X] database, or does this sentence say something more? Also, I think it would be clearer to use UCQs in the whole paper instead of constantly changing between UCQs, RA+ and SPJU formalisms}
@ -93,12 +91,11 @@ The text now refers to latter as an \abbrNXPDB\xplural.
Our objective is to establish the feasibility of bag-probabilistic databases as compared to existing deterministic query processing systems.
Accordingly, we take our input model from production database systems like PostgreSQL, Oracle, DB2, SQL Server, etc. (e.g., see \Cref{footnote:set-not-limit} on \Cpageref{footnote:set-not-limit}), where duplicate tuples are represented as independent entities.
As a convenient benefit, this leads to a direct translation of TIDBs (which are defined over $\{0,1\}$ inputs).
Finally, as we mention earlier, an easy generalization exists to encode a \abbrBPDB in a set-\abbrPDB (which then allows for bag inputs).\AR{This parts needs to be updated after the new appendix section}
Finally, as we mention earlier, an easy generalization exists to encode a \abbrBPDB in a set-\abbrPDB (which then allows for bag inputs); see \Cref{sec:gener-results-beyond}.
\RCOMMENT{- l.656 "Thus, from now on we will solely use such vectors...": this seems to
be false. Moreover you keep switching notation which makes it very hard to read... Sometimes it is $\varphi$, sometimes it is small w, sometimes it is big W (l.174 or l.722), sometimes the database is $\varphi(D)$, sometimes it is $\varphi_w(D)$, other times it is $D_{[w]}$ (l.671), and so on.}
We have made an effort to be consistent with the use of notation, following standard usage whenever possible.
\AH{We need to be sure this is taken care of in the appendix.}
\RCOMMENT{l.658 "we use $\varphi(D)$ to denote the semiring homomorphism $\semNX \rightarrow \semN$
that...": I don't understand why you need a database to extend an assignment to its semiring homomorphism from $\semNX \rightarrow \semN$}
@ -114,7 +111,7 @@ We have updated \Cref{fig:nxDBSemantics} (originally figure 2) to not need $K$.
We have reserved $\query$ to mean an $\raPlus$ query and nothing else.
\RCOMMENT{Section 2.1.1: here you are considering set semantics no? Otherwise, one would think that for bag semantics the annotation of a tuple could be 0 or something of the form c $\times$ X, where X is a variable and c is a natural number}
The semantics for the polynomial as seen in \Cref{eq:sop-form} is specified indeed as the reviewer has pointed out.\AR{\Cref{eq:sop-form} has nothing to do with bag/set semantics. Update this part once new app section on this is done.}
Please see \Cref{sec:gener-results-beyond} for a discussion on going beyond set inputs.
\RCOMMENT{Proof of Proposition A.3. I seems the proof should end after l.687, since you already proved everything from the statement of the proposition. I don't understand what it is that you do after this line.}
@ -123,7 +120,7 @@ We agree that this should not be part of the proof of the later, and have remove
\RCOMMENT{l.686 "The closure of ... over K-relations": you should give more details on this part. It is not obvious to me that the relations from l.646 hold.}
The core of this (otherwise trivial) argument, that semiring homomorphisms commute through queries, was already proven in \cite{DBLP:conf/pods/GreenKT07}.\AR{This ref shows as dangling to me} We now make this reference explicit.
The core of this (otherwise trivial) argument, that semiring homomorphisms commute through queries, was already proven in \cite{DBLP:conf/pods/GreenKT07}. We now make this reference explicit.
We apologize for not explaining this in more detail. In universal algebra~\cite{graetzer-08-un}, it has been proven (the HSP theorem) that for any variety, i.e., the set of all structures (called objects) with a certain signature that obey a set of equational laws, there exists a ``most general'' object called the \emph{free object}. The elements of the free object are equivalence classes (with respect to the laws of the variety) of symbolic expressions over a set of variables $\vct{X}$ that are built from the operations of the structure. The operations of the free object combine symbolic expressions using the corresponding operation. It has been shown that for any other object $K$ of the variety, any assignment $\phi: \vct{X} \to K$ uniquely extends to a homomorphism from the free object to $K$ by substituting variables according to $\phi$ in a symbolic expression and then evaluating the resulting expression in $K$.
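As a small worked instance of this extension (an illustrative example chosen here, not one from the paper), take $K = \semN$ and the assignment $\phi(X) = 2$, $\phi(Y) = 3$. The induced homomorphism from $\semNX$ to $\semN$ maps
\[
X^2 + 2XY \;\mapsto\; 2^2 + 2 \cdot 2 \cdot 3 = 16,
\]
and, because it is a homomorphism, the same value is obtained by mapping $X^2 \mapsto 4$ and $2XY \mapsto 12$ separately and adding the results in $\semN$.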
@ -154,28 +151,23 @@ As alluded to above, we have incorporated the reviewer's suggestion, c.f. \Cref{
Multiplicity problem restricted to conjunctive queries is \#W[1]-hard, parameterized by query size. Indeed if I look at the proof, all you need is the queries $Q^k_G$. The problem is \#W[1]-hard and it should not matter how one tries to solve it: using an approach with lineages or using anything else.
Currently it is confusing because you make it look like the problem is hard only when you consider general arithmetic circuits, but your hardness proof has nothing to do with circuits. Moreover, it is not surprising that computing the expected output of an arithmetic circuit is hard: it is trivial, given a CNF $\phi$, to build an arithmetic circuit C such that for any valuation $\nu$ of the variables the formula $\phi$ evaluates to True under $\nu$ if C evaluates to 1 and the formula $\phi$ evaluates to False under $\nu$ if C evaluates to 0, so this problem is \sharpphard anyways.}
\AR{Changed the answer completely to the one below.}
The reviewer is correct. Our hardness results are now stated independently of circuits. We note that the hardness result alluded to at the end of the comment above is not applicable in our case since for fixed queries $\query$, \Cref{prob:bag-pdb-query-eval} and \Cref{prob:bag-pdb-poly-expected} can be solved in polynomial time.
Further, as we point out in \Cref{sec:intro} what is new in our hardness results is show a query $Q^k$ such that $\qruntime{\query^k,\dbbase}$ is small (linear in $\abs{\dbbase}$ but solving \Cref{prob:bag-pdb-query-eval} and \Cref{prob:bag-pdb-poly-expected} is hard. We note that it is well-known that one can reduce the problem of counting $k$-cliques or $k$-matchings to a query $\query$ for which computing $\query(\dbbase)$ is $\sharpwone$-hard. So our contribution to come up with a different reduction from counting $k$-matchings so that the hardness manifests itself in the probabilistic computing part of our problem.
Further, as we point out in \Cref{sec:intro}, what is new in our hardness results is that we show a query $\query^k$ such that $\qruntime{\query^k,\dbbase}$ is small (linear in $\abs{\dbbase}$) but solving \Cref{prob:bag-pdb-query-eval} and \Cref{prob:bag-pdb-poly-expected} is hard. We note that it is well-known that one can reduce the problem of counting $k$-cliques or $k$-matchings to a query $\query$ for which computing $\query(\dbbase)$ is $\sharpwone$-hard. So our contribution is to come up with a different reduction from counting $k$-matchings so that the hardness manifests itself in the probabilistic computing part of our problem.
%We have rewritten \Cref{sec:intro} with a series of refined problem statements to show that the problem we explore and the results we obtain directly involve lineage polynomials. The reviewer is correct that the output is the expected multiplicity, and we hope that our updated presentation of the paper makes it clear that $\expct_{\vct{\randWorld}\sim\pdassign}\pbox{\apolyqdt\inparen{\vct{\randWorld}}}$ is indeed the expected multiplicity spoken of. We have also addressed the ambiguity in the complexity we are focusing on, both explicitly in the intro and in the revised definition, \Cref{def:the-expected-multipl}.
%
%Regarding the use of circuits, it is true that our hardness results do not require circuits while our approximation algorithm and cost model both rely on circuits. We have adjusted our presentation (e.g. the segway between \Cref{prob:informal} and \Cref{prob:big-o-joint-steps}) to make this distinction clear and eliminate any confusion.
\RCOMMENT{Section 3.3. It seems to me the important part of this section is not so much the fact that we have fixed values of p but that the query is now fixed and that you are looking at the fine-grained complexity. If what you really cared about was having fixed value of p, then the result of this section should be exactly like the one in Theorem 3.4, but starting with "fix p". So something like "Fix p. Computing $\tilde{Q}^k_G$ for arbitrary G is \#W1-hard".}
%\AH{Need help in responding to this one.}
\AR{Added stuff below.}
We agree with the reviewer that the result on a fixed value of $p$ is mostly of (narrow) theoretical interest. We have added a discussion summarizing the reviewer's point after \Cref{th:single-p-hard}.
\RCOMMENT{General remark: The story of the paper I think should be this: we can always compute the expected multiplicity for a UCQ Q and N[X]-database D and tuple t by first computing the lineage in SOP form and then using linearity of expectation, which gives an upper bound of (roughly) $O(|D|^|Q|)$. We show that this exponential dependence in |Q| is unavoidable by proving that this problem is \#W1 hard parameterized by |Q| (which implies that we cannot solve it in $f(|Q|) |D|^c$ ). Furthermore we obtain fine-grained superlinear lower bounds for a fix conjunctive query Q. (Observe how up to here, there is no need to talk about lineages at all). We then obtain an approximation algorithm for this problem for [this class of queries] and [that class of bag PDBs] with [that running time (Q,D)]. The method is to first compute the lineage as an arithmetic circuit C in [this running time (Q,D)], and then from the arithmetic circuit C compute in [running time(C)] an approximation of its expected output. Currently I don't understand to which queries your approximation algorithm can be applied (see later comments).}
%We have followed the suggestions of the reviewer to delineate between the `coarse' polynomial time and the fine grained complexity analysis. We found it necessary to introduce polynomials earlier since our hard query, hardness results, and their proofs are easier to present (and we feel make the paper more accessible) than doing so without the lineage polynomials.
%We have taken pains to be very clear that this work only considers $\raPlus$ queries, adding a reminder to this end in the first paragraph of \Cref{sec:algo}.
%\AH{We need to address the last line of the reviewer's comment. Also, not sure if I answered the comment perfectly.}
\AR{Replaced previous ans with stuff below.}
We have restructured \Cref{sec:intro} to more or less follow the reviewer's outline above. The only deviation is that we still introduce lineage polynomials. We do this because the polynomial view is very helpful in the proofs of our hardness results (in addition to the obvious relevance for the approximation algorithm). We have also clarified that our approximation result applies to all $\raPlus$ queries (see \Cref{cor:approx-algo-punchline}).
\RCOMMENT{l.381: Here again, I think it would be simpler to consider that the input of the problem is the query, the database and a tuple and claim that you can compute an approximation of the expected multiplicity in linear time. The algo is to first compute the lineage as an arithmetic circuit, and then to use what you currently use (which could be put in a lemma or in a proposition).}
\AR{Replaced previous ans with stuff below.}
We have implemented the above overview in \Cref{sec:intro} when we move from \Cref{prob:informal} to \Cref{prob:intro-stmt}. For the approximation algorithm we focus on \Cref{prob:intro-stmt}, which still takes a circuit as an input.
%Our appoximation algorithm assumes an input circuit \circuit that has been computed via an arbitrary $\raPlus$ query $\query$ and arbitrary \abbrBIDB $\pdb$. We have included prose to describe this at the beginning of {sec:algo:sub:main-result}.
@ -183,7 +175,6 @@ We have implemented the above overview in \Cref{sec:intro} when we move from \Cr
We have provided an example directly after \Cref{def:expand-circuit} as well as a sentence pointing out why this definition is useful. %\AR{This is not enough: we need a line for {\em why} this notation is useful. {\bf TODO for Atri}}
\RCOMMENT{- l.409: how does it matter that the circuit C is the lineage of a UCQ? Doesn't this work for any arithmetic circuit?}
\AR{Replaced previous ans with stuff below.}
The reviewer is correct that the earlier Theorem 4.9 works for any circuit (this result is now in the appendix).
%The reviewer is correct that our approximation results apply to $\raPlus$ queries over \abbrBIDB\xplural. This we specify this in the formal statements of \Cref{sec:algo}, e.g. see \Cref{def:param-gamma} and \Cref{cor:approx-algo-const-p}.
%More specifically, our proofs rely on (i) circuits with a bounded polynomial degree (we use a slightly non standard definition of degree --- \Cref{def:degree}), which is the case for any circuit resulting from an $\raPlus$ query; and (ii) specific assumptions about variable independence, which hold when the input to the query is a BIDB.
@ -195,14 +186,10 @@ We clarify this overloaded notation immediately after \Cref{def:positive-circuit
As alluded to previously, we have followed the reviewer's suggestion and have found $\raPlus$ queries to be most amenable for this work.
\RCOMMENT{l.432 what is an FAQ query?}
\AR{Replaced previous ans with stuff below.}
%We have added a reference. Please see \Cref{lem:val-ub}.
We actually no longer need that result since \Cref{lem:val-ub} now has a bound on $|\circuit|(1,\dots,1)$ in terms of $\depth(\circuit)$ and the latter is used in \Cref{cor:approx-algo-punchline} for all $\raPlus$ queries. Please see \Cref{lem:val-ub} and the follow-up discussion for more on this.
\RCOMMENT{Generally speaking, I think I don't understand much about Section 4, and the convolutedness of the appendix does not help to understand. I don't even see in which result you get a linear runtime and to which queries the linear runtime applies. Somewhere there should be a corollary that clearly states a linear time approximation algorithm for some queries.}
%\AH{Needs to be addressed.}
\AR{Added ans below.}
We have re-organized \Cref{sec:algo} to address the above comments as follows:
\begin{itemize}
\item We now start off \Cref{sec:algo:sub:main-result} with the algorithm idea.
@ -251,9 +238,6 @@ We have fixed this mistake. Unfortunately, because of the changes in the paper (
We agree with the reviewer that this notation is confusing;
$\eta$ is meant to cope with the fact that tuples from the same group in a BIDB cannot co-exist, even though our $\{0,1\}$-input vectors can encode such worlds.
We now address this constraint by embedding it directly into the reduced polynomial with \Cref{def:reduced-bi-poly}.
\AH{Needs to be addressed.}
\OK{We have addressed this... but may still need to elide $\eta$ out of the appendices}
\AR{I thought $\eta$ was altogether eliminated?}
\RCOMMENT{line 305: please define what is an "occurrence of H in G". It could mean: a homomorphic image, a subgraph of G isomorphic to H, an induced subgraph of G isomorphic to H, or maybe something else.}
We agree with the reviewer's suggestion and have rephrased the wording to be clear. Please see the beginning of \Cref{sec:hard:sub:pre}.
@ -263,7 +247,6 @@ We have implemented the reviewer's suggestion. Please see the last sentence of
\RCOMMENT{line 177: what is $\Omega_{\semNX}$?}
We have eliminated the use of $\semNX$-DBs in the paper proper, using them only when necessary in the proofs of the appendix.
\AH{Need to address what is $\idb_\semNX$}
\RCOMMENT{line 217. The polynomial $X^2 + 2XY + Y^2$ is a poor choice to illustrate the degree. There are two standard definitions of the degree of a multivariate polynomial, and one has to always clarify which one is meant. One definition is the total degree (which is Def. 2.3 in the paper), the other is the maximum degree of any single variable. It is nice that you are trying to clarify for the reader which definition you are using, but the polynomial $X^2 + 2XY + Y^2$ is worst choice, since here the two coincide.}
We have adjusted the example to account for the reviewer's correct observation.
@ -273,7 +256,6 @@ We have removed the redundant terminology the reviewer has pointed out, and refi
\RCOMMENT{"Note that our hardness results even hold for the expression trees". At this point we haven't seen the hardness results, nor their proofs, and we don't know what expression trees are. It's unclear what we can note.}
%We have accounted for the reviewer's concern in the rewrite of \Cref{sec:hard} adjusting the prose accordingly.
\AR{Changed the ans to the one below.}
Our hardness results are now stated independently of circuits so the above statement no longer appears in the paper.
\RCOMMENT{paragraph at the top of pp.10 is confusing. My guess is that it is trying to this say: "there exists a query Q, such that, for each graph G, there exists a database D s.t. the lineage of Q on D is the polynomial $Q_G$."}
@ -282,13 +264,12 @@ Our revision has eliminated this statement.
\subsection{Reviewer 3}
\RCOMMENT{The overall study is then extended to a multiplicative approximation algorithm for the expectation of polynomial circuits in linear time in the size of the polynomial. It was much harder to read this part, and I found the examples and flow in the appendix quite helpful. I suggest to include these examples into the body of the paper. }
%\AH{Need to address this.}
\AR{Added the ans below.}
In our revision we expanded \Cref{sec:intro} to give a better overview of the problems we are considering in this paper. This meant we had to cut material from later sections, which unfortunately left no space in \Cref{sec:algo} for the examples that the reviewer suggested above. However, we have tried to make \Cref{sec:algo} more readable as a whole.
\RCOMMENT{While ApproximateQ is linear in the size of the circuit, it is quadratic in epsilon and so we need quadratically many samples for the desired accuracy -- overall runtime is not linear therefore and it may be better to elaborate this. It may also be helpful to comment on how this relates to Karp, Luby, Madras algorithm [1] for \#DNF which is also quadratic in epsilon.}
%\AH{Need to elaborate on this.}
\AR{Added the ans below. Part on the set-PDB approx has not been addressed below}
In \Cref{prob:big-o-joint-steps} we note explicitly that we care about linear dependence on $\qruntime(\query,\dbbase)$ and do not care about the exact dependence on $\epsilon$. While it would be nice to design an approximation algorithm that is linear in $1/\epsilon$ as well, we believe it is out of scope of this initial work.
In \Cref{prob:big-o-joint-steps} we note explicitly that we care about linear dependence on $\qruntime{\inparen{\query,\dbbase}}$ and do not care about the exact dependence on $\epsilon$. While it would be nice to design an approximation algorithm that is linear in $1/\epsilon$ as well, we believe it is out of scope of this initial work.
\RCOMMENT{The coverage of related work is adequate. Fink et. al seems as the closest related work to me and I would appreciate a more elaborate comparison with this paper. My understanding is that Fink et. al considers exact evaluation only and focuses on knowledge compilation techniques based on decompositions. They also note that "Expected values can lead to unintuitive query answers, for instance when data values and their probabilities follow skewed and non-aligned distributions" attributed to [2]. Does this apply to the current work? Can you please comment on this?}