diff --git a/sections/generalizing.tex b/sections/generalizing.tex
index f4e33f5..2617de1 100644
--- a/sections/generalizing.tex
+++ b/sections/generalizing.tex
@@ -19,7 +19,7 @@ Similar transformations appear in optimizing compilers --- the above equivalence
 \subsection{Generalizing Singletons}
 Singletons allow users to try out hypotheticals, explore cleaning solutions, and conduct small-scale tests.
-It is often easier for users to perform one-off curation steps initially, repairing errors in the data as singletons, rather than expending the mental effort to generalize the repair upfront.
+It is often easier for users to perform one-off curation steps initially, repairing errors in the data as singletons, rather than expending the mental effort to generalize the repair upfront.
 However, when the user needs to adapt their preliminary data cleaning solution to new data, to a larger dataset, or to an updated dataset, these singleton operations can become a burden.
 Although they put more control over the curation process in the user's hands, singleton actions increase the size and complexity of a \langname script, with no benefits beyond the initial dataset.
 In addition to considering readability-enhancing rewrites that preserve semantic equivalence, it will be necessary for \sysname to evaluate how singleton actions can be generalized --- effectively a form of query (or curation, in this case) by example~\cite{Zloof:1975:QE:1499949.1500034}. Concretely, given a set of similar statements with singleton targets, we would like to propose to the user a set of rewrites that apply the same update to a region covering all the singletons.