571aa27554
## What changes were proposed in this pull request? The equi-height histogram is one of the state-of-the-art statistics for cardinality estimation; it provides better estimation accuracy and handles skewed data well. This PR improves join estimation by using equi-height histograms. It differs from the basic (ndv-based) estimation in the logic for computing the join cardinality and the new ndv after the join. The main idea is as follows: 1. find the overlapping ranges between the two histograms of the two join keys; 2. apply the formula `T(A IJ B) = T(A) * T(B) / max(V(A.k1), V(B.k1))` in each overlapping range. ## How was this patch tested? Added new test cases. Author: Zhenhua Wang <wangzhenhua@huawei.com> Closes #19594 from wzhfy/join_estimation_histogram. |
||
---|---|---|
.. | ||
src | ||
pom.xml |
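
The two steps in the commit message (find overlapping histogram ranges, then apply `T(A) * T(B) / max(V(A.k1), V(B.k1))` per range) can be illustrated with a small standalone sketch. This is not the code from the patch: the `Bin` type, the function names, and the assumption that rows and distinct values are spread uniformly inside each bin are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bin:
    """One equi-height histogram bin (hypothetical layout for this sketch)."""
    lo: float   # lower bound of the bin
    hi: float   # upper bound of the bin
    ndv: float  # number of distinct values inside the bin


def overlap_fraction(b: Bin, lo: float, hi: float) -> float:
    """Fraction of bin b covered by [lo, hi], assuming uniform spread."""
    width = b.hi - b.lo
    if width == 0:
        # A single-value bin is either fully inside the range or skipped earlier.
        return 1.0
    return (hi - lo) / width


def estimate_join_rows(bins_a, height_a, bins_b, height_b) -> float:
    """Sum T(A) * T(B) / max(V(A.k1), V(B.k1)) over all overlapping ranges.

    height_a and height_b are the rows per bin, which is constant within
    an equi-height histogram.
    """
    total = 0.0
    for a in bins_a:
        for b in bins_b:
            # Step 1: find the overlapping range of the two bins.
            lo, hi = max(a.lo, b.lo), min(a.hi, b.hi)
            if lo > hi:
                continue  # the bins do not overlap
            frac_a = overlap_fraction(a, lo, hi)
            frac_b = overlap_fraction(b, lo, hi)
            rows_a, ndv_a = height_a * frac_a, a.ndv * frac_a
            rows_b, ndv_b = height_b * frac_b, b.ndv * frac_b
            max_ndv = max(ndv_a, ndv_b)
            if max_ndv <= 0:
                continue  # the bins only touch at a boundary point
            # Step 2: apply the formula inside this range.
            total += rows_a * rows_b / max_ndv
    return total


# Each side has one bin of 100 rows and 10 distinct values; the bins
# overlap on [5, 10], so half of each bin takes part in the join.
side_a = [Bin(0, 10, 10)]
side_b = [Bin(5, 15, 10)]
print(estimate_join_rows(side_a, 100, side_b, 100))  # 50 * 50 / 5 = 500.0
```

With only the basic formula over the full columns, the same inputs would give `100 * 100 / 10 = 1000` rows; restricting the formula to the overlapping range is what lets the histogram-based estimate discount values that cannot match.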