spark-instrumented-optimizer/sql/core
Karen Feng ddc61e62b9 [SPARK-36079][SQL] Null-based filter estimate should always be in the range [0, 1]
### What changes were proposed in this pull request?

Forces the selectivity estimate for null-based filters to be in the range `[0,1]`.

### Why are the changes needed?

I noticed in a few TPC-DS query tests that the column statistic null count can be higher than the table statistic row count. Since the `IsNotNull` selectivity is computed as `1 - nullCount / rowCount`, the current implementation produces a negative estimate whenever `nullCount > rowCount`.
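A minimal sketch of the bounding idea (the helper name `nullSelectivity` is illustrative, not the actual `FilterEstimation` code):

```scala
// Illustrative sketch only: the real logic lives in Catalyst's
// FilterEstimation. Given possibly-inconsistent statistics where
// nullCount may exceed rowCount, clamp the null fraction to [0, 1]
// so both IsNull and IsNotNull selectivities stay in range.
def nullSelectivity(nullCount: BigInt, rowCount: BigInt, isNull: Boolean): Double = {
  val rawPercent =
    if (rowCount == 0) 0.0
    else (BigDecimal(nullCount) / BigDecimal(rowCount)).toDouble
  // Without this clamp, nullCount > rowCount yields rawPercent > 1,
  // and the IsNotNull estimate 1 - rawPercent goes negative.
  val nullPercent = math.min(math.max(rawPercent, 0.0), 1.0)
  if (isNull) nullPercent else 1.0 - nullPercent
}

// Example: stats where nullCount (120) exceeds rowCount (100).
// Unclamped, the IsNotNull estimate would be 1 - 1.2 = -0.2; clamped, it is 0.0.
assert(nullSelectivity(nullCount = 120, rowCount = 100, isNull = false) == 0.0)
```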

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Unit test

Closes #33286 from karenfeng/bound-selectivity-est.

Authored-by: Karen Feng <karen.feng@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2021-07-20 21:32:13 +08:00
| Name | Last commit | Date |
| --- | --- | --- |
| benchmarks | [SPARK-34981][SQL][FOLLOWUP] Use SpecificInternalRow in ApplyFunctionExpression | 2021-05-24 17:25:24 +09:00 |
| src | [SPARK-36079][SQL] Null-based filter estimate should always be in the range [0, 1] | 2021-07-20 21:32:13 +08:00 |
| pom.xml | [SPARK-35996][BUILD] Setting version to 3.3.0-SNAPSHOT | 2021-07-02 13:47:36 -07:00 |