\documentclass[sigconf,9pt]{acmart}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% COMMENTS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \newcommand{\BG}[1]{\todo[inline]{\textbf{Boris says:$\,$} #1}}
% \newcommand{\SF}[1]{\todo[inline]{\textbf{Su says:$\,$} #1}}
% \newcommand{\OK}[1]{\todo[inline]{\textbf{Oliver says:$\,$} #1}}
% \newcommand{\AH}[1]{\todo[inline]{\textbf{Aaron says:$\,$} #1}}
%\newcommand{\comment}[1]{}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% ACMART settings
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\setcopyright{none}
\settopmatter{printacmref=false, printccs=false, printfolios=false}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% PACKAGES
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{comment}
\usepackage{amsmath}
%\usepackage{amssymb}
% \let\proof\relax
% \let\endproof\relax
\usepackage{amsthm}
\usepackage{mathtools}
\usepackage{etoolbox}
\usepackage{url}
\def\UrlBreaks{\do\/\do-}
\usepackage{hyperref}
\hypersetup{breaklinks=true}
\usepackage{stmaryrd}
\usepackage[normalem]{ulem}
\usepackage{subcaption}
\usepackage{booktabs}
\usepackage{graphicx}
\usepackage{listings}
\usepackage{fancyvrb}
\usepackage{caption}
\usepackage{subcaption}
\usepackage{braket}
\usepackage[inline]{enumitem}
\usepackage{xspace}
\usepackage{colortbl}
\usepackage{hyphenat}
% \usepackage{bbold}
%\usepackage[breaklinks]{hyperref}
%\allowdisplaybreaks
\usepackage{multirow}
%\usepackage{makecell}
\usepackage{cleveref}
% \usepackage{footnote}
% \makesavenoteenv{tabular}
\usepackage{todonotes}
%\usepackage[disable]{todonotes}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Colors
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\definecolor{black}{rgb}{0,0,0}
\definecolor{grey}{rgb}{0.8,0.8,0.8}
\definecolor{red}{rgb}{1,0,0}
\definecolor{green}{rgb}{0,1,0}
\definecolor{darkgreen}{rgb}{0,0.5,0}
\definecolor{darkpurple}{rgb}{0.5,0,0.5}
\definecolor{darkdarkpurple}{rgb}{0.3,0,0.3}
\definecolor{blue}{rgb}{0,0,1}
\definecolor{shadegreen}{rgb}{0.95,1,0.95}
\definecolor{shadeblue}{rgb}{0.95,0.95,1}
\definecolor{shadered}{rgb}{1,0.85,0.85}
\definecolor{shadegrey}{rgb}{0.85,0.85,0.85}
\definecolor{oddRowGrey}{rgb}{0.80,0.80,0.80}
\definecolor{evenRowGrey}{rgb}{0.85,0.85,0.85}
%%%%% workaround for citations spanning multiple pages breaking pdflatex
%%%%% see: https://tex.stackexchange.com/questions/1522/pdfendlink-ended-up-in-different-nesting-level-than-pdfstartlink
\usepackage{etoolbox}
% \makeatletter
% \patchcmd\@combinedblfloats{\box\@outputbox}{\unvbox\@outputbox}{}{\errmessage{\noexpand patch failed}}
% \makeatother
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% DOCS
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\title{Revision}
\subtitle{Efficient Uncertainty Tracking for Complex Queries with Attribute-level Bounds}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Include information below and uncomment for camera ready
\definecolor{GrayRew}{gray}{0.85}
\newcommand{\RCOMMENT}[1]{\medskip\noindent \begin{tabular}{|p{\linewidth-3ex}}\rowcolor{GrayRew} #1 \end{tabular}\smallskip\\}
\newcommand{\MREV}[1]{\noindent {\color{blue}{#1}}}
\newcommand{\MREVO}[1]{\noindent {\color{darkgreen}{#1}}}
\newcommand{\FillInValue}{\textbf{\underline{XXX}}\xspace}
% REVISION COLORING
\definecolor{revgreen}{rgb}{0,0.5,0}
\newrobustcmd{\reva}[1]{\textcolor{blue}{{#1}}}
\newrobustcmd{\revb}[1]{\textcolor{revgreen}{{#1}}}
\newrobustcmd{\revc}[1]{\textcolor{magenta}{{#1}}}
\newrobustcmd{\revm}[1]{\textcolor{red}{{#1}}}
\newcommand{\todoMarker}{\textcolor{red}{TODO}}
\newcommand{\todoC}[1]{\textcolor{red}{TODO[{#1}]}}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% ALLOW REFERENCES TO MAIN DOCUMENT
\usepackage{xcite}
\usepackage{xr-hyper}
\externalcitedocument{./main}
%\externaldocument[techrep-]{../techreport}
\externaldocument{../main}
\begin{document}
\maketitle
We thank the reviewers for their detailed reviews and constructive suggestions. Our changes in the paper are highlighted:
\reva{blue} for changes addressing comments of reviewer 1, \revb{green} for changes addressing comments of reviewer 2, \revc{magenta} for changes addressing comments of reviewer 3, and \revm{red} for general changes or changes addressing comments from more than one reviewer.
To improve readability, we do not highlight fixed typos or deletions.
% This revision report explains in detail how we
% addressed the comments and edited the paper.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section*{Meta Reviewer}\label{sec:meta-reviewer}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{R1} We look forward to a revision that incorporates the authors responses in the rebuttal in the final revised version. This set of promised revisions was agreed to be a minimal acceptable set of revisions and the reviewers have faith that the authors will deliver. A major concern is how dense the paper is and the lack of space to provide further details. After discussion, the reviewers agreed to leave it to the authors to find the right balance and given the limited space, the reviewers hope the authors can address all the comments as much as they can.
}
We addressed the ``opportunities for improvement'' of all reviewers and incorporated all promised changes while keeping as much essential content as possible. Please see the detailed responses to the individual reviews below.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Reviewer 1}
\label{sec:reviewer-1}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{Response to author feedback}
The author feedback is convincing in that I believe the authors will be able to handle my comments in a revision
\textbf{List required changes for a revision}
opportunities for improvement (e.g., O1, O3, O6).
Please clean up the model and add all details requested in my comments in the "opportunities for improvement" part.
}
Thank you for the detailed comments. We explain below how we have addressed each of your concerns.
%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O1} Section 7 claims to define a query semantics for RA but it only focuses on selection. Please include all operators.
}
Note that $\uaaN$ is a semiring, and the standard K-relational query semantics for $\raPlus$ preserves bounds on relations over $\db$ annotated with $\uaaN$. As we note at the start of Section 7, this also holds for relations over range-annotated values, with the exception of selection. Since the remaining operators have standard K-relational semantics, we did not repeat their definitions in Section 7 (they are the same as in Section 3.1). We now state more clearly in Section 7 that $\uaaN$ is a semiring and that the standard K-relational query semantics applies.
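To illustrate (a sketch, not the formal definition from the paper: we write annotations in $\uaaN$ as triples of a lower bound, a selected-guess value, and an upper bound on a tuple's multiplicity, combined component-wise):
\begin{align*}
(1,2,3) +_{\uaaN} (0,1,2) &= (1,3,5) &
(1,2,3) \cdot_{\uaaN} (0,1,2) &= (0,2,6)
\end{align*}
Union adds and join multiplies the annotations of input tuples; since the operations are applied component-wise, the result again bounds the possible multiplicities of the output tuple.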
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O2} Specifically, is there a way to handle joins that would not often degenerate into cartesian products? With C-tables we still need to maintain all combinations but at least we can recover the join condition; here it seems that we both "pay" in impreciseness caused by lose the predicates and "pay" in size. There is a discussion of join optimizations in p10 but I think it is too late and too terse, and I propose the authors include both the basic join construction and the optimization in Section 7 along with examples.
}
% In the variant of C-tables where variables are allowed as attribute values (as defined in~\cite{DBLP:journals/jacm/ImielinskiL84}), joins may degenerate to cartesian products. For instance, if we are joining two tables on attributes whose values are unconstrained variables (no restrictions on the values of the variables through global constraints allow us to filter out any join results).
Without our optimization, joins over AU-DBs may degenerate to cross products in the worst case (when the bounds on the join attribute values of all tuples from both inputs overlap). The join optimization separates attribute-level uncertainty and possible answers from the selected-guess world (SGW) plus an under-approximation of the certain answer. The cross product is restricted to the (ideally already small) possible part of both inputs, which we can further compress in a bound-preserving way (albeit at a loss of accuracy). The larger SGW part can be joined using a regular join with the join condition from the user's query. We moved the discussion of join optimizations to Section 7.1 and added more details. Furthermore, we added experiments evaluating performance for queries with multiple joins (Figure 14) while controlling the aggressiveness of compression.
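To sketch one possible shape of this decomposition (the notation here is illustrative and not the exact operator pipeline from the paper: $R_{sg}$ denotes the selected-guess part of $R$ together with the under-approximation of the certain answer, $R_{poss}$ the remaining possible-only part, and $c(\cdot)$ bound-preserving compression), a join $R \join_{\theta} S$ can be evaluated as
\[
(R_{sg} \join_{\theta} S_{sg}) \,\cup\, (R_{sg} \times c(S_{poss})) \,\cup\, (c(R_{poss}) \times S_{sg}) \,\cup\, (c(R_{poss}) \times c(S_{poss})),
\]
so that only the (ideally small, compressed) possible parts participate in cross products, while the bulk of the data is joined with the user's join condition $\theta$.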
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O3} Similarly, the formal treatment of aggregates (Section 8) should be improved. I see how it may work for K=N, but the generalization to semirings is deferred to the full version and is unclear. Specifically, the model heavily uses the $*_{M}$ function, that essentially embeds elements of the semi-module back into the monoid. It is not obvious to me that it makes sense to have such an operation for every semiring and monoid: what if the semiring is access control/tropical from the PODS '07 semiring paper? How would such an embedding make sense there?
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O4}
Continuing O3, I think the crucial definition 15 can only be bound-preserving if one makes some monotonicity assumptions on the $*_{M}$ operation. These assumption should be spelled out. Again monotonicity makes sense if K=N (but even then, it must be stated as an assumption), but otherwise I'm not sure: first, there is the issue in O1 of how *{M} would even work; then, what if K is only partially ordered? Etc.
One solution is of course to restrict the whole construction to K=N, i.e. bag semantics. The authors indeed include a paragraph saying that the treatment of other semirings is deferred to the full version. But this hurts the generality of the contribution, so I do hope the authors can do better and formalize their assumptions in a more general fashion than that within the conference paper.
}
We apologize for not clarifying this better; we address O3 and O4 together here. As shown in [6], $*_M$ is indeed only well-behaved for certain combinations of semirings and aggregation monoids. The solution in [6] is to use monoids whose domains are symbolic (bags of pairs of monoid elements and semiring elements). However, these tensors do not correspond to regular monoid elements in the general case. It may be possible to allow such symbolic expressions as bounds, but this could require storing expressions as large as the input (e.g., for aggregation without group-by), which would likely be meaningless to a user. Definition 15 should have been limited to the aggregation monoids mentioned earlier: SUM, MIN, and MAX. We have fixed this definition and now state the assumptions and limitations of our aggregation semantics more clearly upfront.
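For example (an interval-arithmetic style sketch for bag semantics and the SUM monoid, where $k *_{SUM} m = k \cdot m$): for a multiplicity bounded by $[k^{\downarrow}, k^{\uparrow}]$ and a SUM value bounded by $[m^{\downarrow}, m^{\uparrow}]$, a bound-preserving lifting of $*_{SUM}$ has to take minima and maxima over the corner products,
\[
[k^{\downarrow}, k^{\uparrow}] *_{SUM} [m^{\downarrow}, m^{\uparrow}]
= \Big[ \min_{i,j \in \{\downarrow,\uparrow\}} k^{i} \cdot m^{j},\;
        \max_{i,j \in \{\downarrow,\uparrow\}} k^{i} \cdot m^{j} \Big],
\]
because $m$ may be negative. This construction relies on the domains being totally ordered; for a semiring that is only partially ordered (or unordered), such minima and maxima need not exist, which is exactly why we now restrict Definition 15 to SUM, MIN, and MAX.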
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O5} Corrolary 1: What's $RA^{agg}$? I thought it means RA with agg+group-by as last operation, because it seems this the supported class. However the Introduction seems to criticize other approaches for supporting this very same class (please see last paragraph of Page 1), so this needs to be clarified.
}
Note that $RA^{agg}$ denotes any query with aggregation, not just queries where a single aggregation is the last operation. A major advantage of our data model is that it is closed under aggregation, which allows us to support queries with multiple aggregations. We now state this explicitly in the list of contributions at the end of the introduction. Also note that the experiments in Section 9.1 use actual TPC-H benchmark queries, and query Q7 has multiple aggregations.
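As an illustration, consider the (hypothetical) query
\[
\gamma_{\,\mathit{max}(s)}\big(\gamma_{\,d,\,\mathit{sum}(a) \rightarrow s}(R)\big),
\]
which sums up $a$ for each group of $d$ and then takes the maximum over the per-group sums. Because AU-DBs are closed under aggregation, the inner aggregation produces an AU-relation over which the outer aggregation can be evaluated directly.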
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O6} Query evaluation for $RA^{agg}$ using your semantics is PTIME, right? Maybe worthwhile to have a proposition to this effect.
}
We added the statement that $RA^{agg}$ queries can be evaluated in PTIME to Section 8 and added a corresponding proposition to our technical report.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O7} Continuing O2, how "join-intensive" are the queries used in the experiments? I.e. how many joins, how large are the relations being joined, what is the selectivity, etc? It may be worthwhile adding experiments that explicitly measure the effect of the number of joins on the algorithms performance.
}
We have added a benchmark (Figure 12) showing the effect of the number of joins on performance. For this experiment we used tables of a fixed size (4k), varied the number of joins, the uncertainty rate (3\% and 10\%), and the compressed data size (i.e., the size of the compressed possible part of an input table or intermediate query result generated by our join optimization), and measured execution time for a given number of joins. We used equality joins with a single join condition, and the join graph is a chain, e.g., $R \join S \join T$. As expected, the difference in runtime between the uncompressed and the optimized version can be significant, especially for larger numbers of joins (up to $\sim$3 orders of magnitude for 2 joins and up to $\sim$5 orders of magnitude for 4 joins at a 10\% uncertainty rate).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Reviewer 2}\label{sec:reviewer-2}
%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{\textbf{Response to author feedback} O1: Per author feedback, I also
appreciate the blending of systems and theory, and how difficult it is to
condense the work to conference-paper length. I have faith that the authors
will do their best to balance out the material as best they can within the
current space constraints when responding to the reviewers. I hope that there
is a follow-up full journal paper that can fully archive this nice piece of
work.}
\RCOMMENT{O3: I certainly agree that ranges are helpful in communicating uncertainty. I
was concerned that the fine-grained level of information provided could be
hard to assimilate for decision makers. Maybe just indicate that the results
could potentially be fed into either appropriate analytics tools or
uncertainty-sensitive visual displays?
\textbf{List required changes for a revision} I think all of O1-O6 need to be
addressed to some degree. This may require splitting up the paper as
mentioned.
}
\textbf{Regarding O1:} Thank you for your kind words; we tried our best to present the work as clearly as possible within the space constraints.
\textbf{Regarding O3:} Thank you, this is a good suggestion; we have added a comment to this effect to the introduction (last paragraph).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \RCOMMENT{
% \textbf{O1} The paper reads more like a PODS submission, with a lot of formalism (semirings, monoids, semimodules, and so on) and only 1.5 columns devoted to implementation. Im wondering whether the authors should split this into two papers, a PODS paper that fills out the theory in the current draft (right now there are many pointers to a full tech report and a missing result as described in O2), and then writing a second paper going into some interesting implementation and perhaps usability details, which seem to be just hinted at in the current paper. Then both the various theoretical and practical issues can be given the full treatment that they deserve.
% }
% We believe that blending formalism and implementation is one of our paper's strengths. Without the discussion of the implementation, it is difficult to see the practical implications of the formalism (i.e., performance, accuracy in practice). Without the discussion of the formalism, it is not clear that the implementation is computing anything sensible.\BG{Maybe remove, because superseeded by the response ot author feedback}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O2} A related comment is that the authors assert in Sections 1, 4, and 11 that query evaluation has PTIME complexity, but they never discuss or prove this result. This needs to be fleshed out.
}
We have added a statement to Section 7 that our use of K-relations implies PTIME query evaluation, and we added a more detailed argument to our supplementary technical report explaining why aggregation (and set difference) is also in PTIME.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O3} In terms of real-world practicality, I am not sure how digestible a query output such as the one in Figure 1(c) would be to an analyst or decision maker who is trying to make decisions under uncertainty. (There are HCI studies indicating that people can barely deal with confidence intervals.) It would greatly strengthen the paper to provide some convincing scenarios around how the output of an AU-DB query could be used in various important real-world applications.
}
We have added relevant references to the introduction to explain that ranges are established as a user-friendly representation of uncertainty.
% Figure 1.c is playing double-duty as an example and a view "under-the-hood", and as such we agree that a cleaner interface would be preferable, e.g. see [1, 2] for examples. However, we disagree that ranges are unhelpful. [4], as well as many others, demonstrates that communicating uncertainty (via ranges) can lead to better decision-making. \\
% \\
% $[1]$ Hellerstein, Haas, Wang. Online aggregation. SIGMOD, 1997. \\
% $[2]$ Kumari, Achmiz, Kennedy. Communicating data quality in on-demand curation. QDB, 2016. \\
% $[4]$ Jung, Sirkin, Gür, and Steinert. Displayed uncertainty improves driving experience and behavior: The case of range anxiety in an electric car. CHI, 2015.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O4}
The formalism seems to struggle when it comes to categorical attributes. E.g., the required total order on attribute values seems pretty artificial, and simply saying that for some attribute any value is possible seems crude. It is often the case that the value of an uncertain categorical attribute is known to lie in some small set (having more than three possible values). I don't see how this be represented in an AU-DB; if the point is that this level of detail is sacrificed in order to obtain computational efficiency, then this should be made clear.
}
The reviewer is correct that AU-DBs shine for numerical domains and other domains with a clear total order. Note that aggregation, which is one of our main use cases, produces numerical results. The examples have been reworked to use categorical attributes with a hierarchical relationship that defines a sensible total order (e.g., town $<$ city $<$ metro). We now explicitly state that attribute bounds would typically degrade to a binary marker for unordered categorical attributes. We believe there are opportunities for interesting follow-up work that considers other compact representations of sets of possible values, perhaps using the lowest common ancestor of values in a taxonomy for unordered domains (for instance, a set of cities may be replaced by the state or country they belong to).
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O5} I am puzzled by the comparison to probabilistic databases, both in the literature review and in the experiments. PDB's provide a different functionality from AU-DB's, in that AU-DB's efficiently compute and represent sets of possible worlds, but PDB's also provide a probability distribution over the possible worlds. I.e., knowing that some world is possible is not so interesting if we know that the probability of seeing that world is negligible. Given this greater functionality (which of course comes with greater demands on the input data), it is not surprising that AU-DB is more computationally efficient than, say, MCDB. I think a more precise discussion is needed about where each of these types of databases is more appropriate.
}
We compare against probabilistic databases (PDBs) in related work because PDBs also capture uncertainty. Of course, probabilistic databases are more expressive than incomplete databases (on which AU-DBs are based). The choice to use PDBs for the evaluation is motivated by the lack of available incomplete database (IDB) implementations that support possible answers (and aggregation). As we note in Section 9, we skip MayBMS's probability computation step, and MCDB is already independent of probabilities once samples are generated.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O6} It is also unclear how one goes from a PDB representation to an AU-DB representation, which the paper asserts can be done. It seems as if you need to throw away a lot of information. E.g., if an attribute is real-valued with a normal probability distribution. Then the lower and upper bounds are -infinity and +infinity respectively, which is not very informative.
}
To be clear, the claim we are making is that existing schemes designed to create a PDB can be reused to create AU-DBs (e.g., MayBMS' probabilistic repair-key operator). % We now clarify that this comes at a loss of accuracy.
We envision our techniques being applied to PDBs when queries have to be answered that are computationally infeasible over the PDB. In this case, we may map the PDB into an AU-DB at query time and preserve the input PDB for future use. Our approach may only return a coarse approximation (the reviewer's example of a continuous domain where all values have non-zero probability is a worst-case scenario), but it does so within reasonable time.
We now clearly state these limitations in \Cref{sec:creating-abbruaadbs} of the paper.
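As a sketch of such a mapping (illustrative; the exact scheme is given in the technical report): a block of a block-independent PDB with the three alternative tuples $\langle 1 \rangle$, $\langle 2 \rangle$, and $\langle 5 \rangle$, of which at most one can occur, could be bounded by a single AU-tuple with attribute range $[1,5]$, the most probable alternative (say $\langle 2 \rangle$) as selected-guess, and multiplicity triple $(0,1,1)$: the tuple may be absent, occurs once in the selected-guess world, and occurs at most once.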
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Additional remarks
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Reviewer 2 additional remarks}
\RCOMMENT{
\textbf{12. } The example does not involve any cardinalities greater than 1. I was briefly confused about whether, for a tuple with uncertain multiplicity and uncertain attribute values, in a possible world where the multiplicity is, say, 3, all three instances would have the same attribute values. I think that they would, but a reworked example might make this clearer. The terms under- and over-approximate are used in Sec. 1 but not defined until Sec. 2.
}
Note that a tuple in an AU-DB with multiplicity 3 may represent different tuples in different worlds. Otherwise, attribute-level uncertainty would give us only very limited power to compactly encode possible answers.
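For example (illustrative), a single AU-tuple with attribute range $[1,5]$, selected-guess value $2$, and multiplicity triple $(3,3,3)$ bounds a world containing the three distinct tuples $\langle 1 \rangle$, $\langle 2 \rangle$, and $\langle 4 \rangle$ once each just as well as a world containing $\langle 2 \rangle$ three times: in both cases, the matched tuples lie within the attribute range and their total multiplicity is within the multiplicity bounds.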
\RCOMMENT{
\textbf{12. Sec 3} It would be helpful to say more explicitly that UA-DB's handle uncertainty only at the level of tuple multiplicity.
}
We now mention this explicitly.
\RCOMMENT{
\textbf{12. Sec 8} In the paragraph on aggregation monoids, there seems to be a slight notation clash, because M is used both as the domain of a monoid and an element of {SUM, MIN, MAX}. In Section 8.1, the notation $*_{N_{AU},SUM}$ is a bit confusing.
}
Following common practice in work on K-relations, we abuse notation and use $M$ to denote both the structure (the monoid) and the domain over which the structure is defined.
\RCOMMENT{
\textbf{12. Sec 8.2} It appears that the circled-star operator could lead to some rather loose bounds. Is this related to the occasionally high metric values in Fig. 11? (Those high values should be explained in any case.) Does this have something to do with the rather loose-looking upper bounds in Def. 19?
}
With sincere apologies, this was due to a poor choice of datasets on our part:
the errors in our ``real world'' benchmarks, taken from a data cleaning
benchmark (\url{http://db.unibas.it/projects/bart/}), are synthetically
generated. Because of their synthetic origin, the error values in these
datasets had an unrealistically high variance, creating wide bounds on uncertain
attributes and, in turn, leading to the high imprecision in aggregate results
that appeared in the original Figure 11. We have moved the experiments on the
synthetically perturbed datasets to our technical report and identified several
new real-world datasets with conflicting duplicate entries. The precision on
these datasets with real-world errors turns out to be significantly better.
\RCOMMENT{
\textbf{12. p9} bottom left par.: Some more explanation around exactly why the given assumptions are "worst case" would be helpful. In the top right par., change "[6] did extend" -> "[6] extended", and in the bottom left par., change "From Thm. 4,... it follows our main technical result" to "Our main technical result follows from Thm. 4..."
}
Thanks, we have fixed these.
\RCOMMENT{
\textbf{} It would be interesting to know whether the new techniques could be applied to OLAP databases, improving on Sismanis et al., ICDE 2009.
}
In general, there is nothing that prevents applying our work to OLAP databases. We assume the reviewer is referring to the particular type of uncertainty from Sismanis et al. (ICDE 2009), where uncertainty stems from unresolved entity resolution decisions. This type of uncertainty can be represented as a block-independent database / x-DB (each entity is a block, and the tuples that could represent this entity are the tuples belonging to the block). In the supplementary technical report we present a scheme for mapping x-DBs to AU-DBs; this scheme could be used for entity resolution. We now discuss Sismanis et al. (ICDE 2009) in related work as one of the approaches that represent aggregation results using bounds.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Reviewer 3}
\label{sec:reviewer-3}
%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{List required changes for a revision}
O1 and O2 are the most important. Followed by O5.
}
We have addressed all opportunities for improvement; please see below.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O1} a discussion of runtime of the AU-DB method. The details may be in the technical report, but some overall results would be helpful to understand the performance.
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
We added a statement to this effect to Section 7 and at the beginning of Section 8. Since we are using K-relational semantics, query evaluation for $\raPlus$ has PTIME data complexity. We prove in [24] that aggregation (and set difference) also has PTIME data complexity.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O2} I appreciate the real-world data experiment, but that section needs to be better explained. What real world datasets are these? What are their attributes? Why only modify a single tuple per group? Where is the percentage of uncertain tuples shown? (The table is a bit confusing)
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
We modified the real-world data experiments section to add detailed explanations and present additional results. The percentage of uncertain tuples is shown in the table next to the name of each dataset, together with the average number of variations per uncertain tuple. We also added more details about the datasets to the technical report. For possible recall, we used two metrics: one that uses the tuple id and the group-by attributes to identify tuples (all tuple variations in a block are considered the same tuple) and one that uses all values in a tuple to identify it (all variations in a block are considered different tuples). Please also see our response to Reviewer 2 (additional comments, Sec 8.2) for why we used different datasets in the revision.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O3} The paper could use a better discussion of the challenges and cons of UA-DBs. They touch on this at the beginning of section 4, but if they expanded this, it would be more clear as to the real benefits of their approach.
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Critically, UA-DBs support neither aggregation, which requires a compact over-approximation of possible tuples and uncertain attribute values, nor set difference, which also requires such an over-approximation. Please see our response to your detailed comments.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O4} A discussion of the impact of tuple bounds on query results. Perhaps measure the error of your upper/lower bounds one the non-altered deterministic table. The point of this work is clearly not to make the tightest bounds and be highly accurate, but it would be nice to understand the impact of bounds on query results.
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
We added a benchmark (Figure 11) showing the effect of over-estimation of the attribute bounds of input tuples on the over-estimation of the output tuple bounds produced by our query semantics. We construct x-DBs (block-independent incomplete databases) and create corresponding AU-DB instances. We compute the query results over the x-DBs to calculate tight attribute-level bounds as the ground truth and compare these with the bounds produced over the AU-DBs.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
\textbf{O5} they discuss that an improvement over UA-DBs is in their compactness, but they do not have experiments/metrics/analysis showing this. This type of result would strengthen their argument.
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
We have added the comparison with UA-DBs to Figure 13.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% additional remarks
\subsection{Reviewer 3 additional remarks}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
High level suggestion: it might be worth changing the font of AU-DB to be different than UA-DB. I often misread one
for the other and got very confused. This is not critical though, just a suggestion as they are very similar acronyms.
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Thanks, we have done this.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
Sec 4: this first part could be improved to really strengthen your argument. Can you expand upon what the
challenges of using UA-DBs are. Some more detailed comments on that section are: (1) is "precise" really the right
term to describe UA-DBs. It seems you are looking more for compactness of representation? Also, can you explain
the UA-DB query semantics not supporting non-monotone operations due to over-approximation statement more?
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
As you point out, we want a representation of possible answers that is more compact than UA-DBs. However, AU-DBs can also be more precise, because they can express the fact that a tuple certainly exists even though some of its attribute values are not precisely known. This has real implications for precision. For example, our aggregation semantics over AU-DBs can often determine that a group in the aggregation result certainly exists even though its aggregation value is unknown: if at least one certain tuple with the group's group-by values exists, then the group certainly exists in the output. With UA-DBs, even if we extended the semantics to keep an over-approximation of possible tuples, we would still have to mark such tuples in the aggregation result as uncertain. We have tried to express this more clearly at the beginning of Section 4. Also note that we only claim that AU-DBs are more precise than UA-DBs; obviously, PDBs or incomplete databases, which encode exactly the set of possible tuples, are often more precise than AU-DBs.
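A small, concrete instance (illustrative, assuming no other tuples can fall into the group): aggregating $\gamma_{\,d,\,\mathit{sum}(a)}(R)$ over a relation $R$ containing a certain tuple with $d = 1$, $a \in [2,4]$, and multiplicity triple $(1,1,1)$ yields an output tuple for group $d = 1$ whose SUM is bounded by $[2,4]$ but whose multiplicity triple is the certain $(1,1,1)$, since the group exists in every world; a UA-DB could only mark this entire output tuple as uncertain.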
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% \RCOMMENT{
% Figure 4: can you explain the annotation of 2 for the D2 tuple 2 row? I also don't know where you use D2 in the
% paper.
% }
% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Note that this is an example incomplete database with two possible worlds ($D_1$ and $D_2$). We are using $D_2$, because the example AU-DB bounds the incomplete database (it bounds both $D_1$ and $D_2$). With just a single possible world $D_1$ it would be hard to explaining the bounding properties of AU-DBs. Regarding your specific question, the annotation means that tuple $<1,3>$ appears twice in the possible world $D_2$.\BG{Maybe not necessary, should we remove?}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
Sec 9: the join optimization section was a bit confusion. I understand you are pressed for space and the details are
in the report, but I struggled to understand at a high level what you were doing. Maybe explain it at a higher level of
abstraction? You also mention that the latter part of the join (upper bound) is expensive, how expensive?
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Following the suggestion of Reviewer 1, we have merged the discussion of the query semantics with the join optimization and have tried to present it at a higher level of abstraction, as suggested. We hope it is now clearer.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
Experiments: my main comment on this is explaining the real word data section. The table did not make much
sense, and I didn't understand the datasets. Where are they coming from, what are the queries, what is your
reasoning for only altering one tuple per group. My other comments on this section are in the "opportunities for
improvement section".
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Originally, we used datasets that are commonly used in data cleaning, but the errors in these datasets were synthetically generated, leading to unrealistically large attribute bounds. We now use three real-world datasets (Netflix, Crimes, Healthcare) that contain real errors. Detailed information about these datasets and all queries we used is given in our technical report; to keep the paper more self-contained, we have also added links to the datasets. We have updated the description in this section accordingly. Please also see our response to Reviewer 2's additional comments.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\RCOMMENT{
- Experiments: can you define scale factor as SF (I don't think you define that acronym).
- Def 9: the bold fact t and t are switched in other definitions of TM (boldface is usually first)
- You sometime use $N^3$ and sometimes use $N_{AU}$ to define the annotation. Can you make this consistent?
- Example 6: it might help to add a sentence about why you don't need the min(0, ...) formulation for t1's participation
in the result. It's because the annotations are not uncertain (3=3=3)
}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Thanks, we have fixed these.
% \bibliographystyle{plain}
% \bibliography{../uaadb}
\end{document}