90f2d4d9cf
### What changes were proposed in this pull request?

Replaced the `agg(if (('gid = 1)) 'cat1 else null)` pattern in `RewriteDistinctAggregates` with `agg('cat1) FILTER (WHERE 'gid = 1)`.

### Why are the changes needed?

For aggregate functions that do not ignore NULL values (`First`, `Last`, or `UDAF`s), the current approach can return wrong results, because the `null` substituted for rows of other groups is treated as a regular input value. In the added UT there are no nulls in the input `testData`. The query returned `Row(0, 1, 0, 51, 100)` before this PR.

### Does this PR introduce _any_ user-facing change?

Bugfix

### How was this patch tested?

UT

Closes #31983 from tanelk/SPARK-34882_distinct_agg_filter.

Lead-authored-by: Tanel Kiis <tanel.kiis@gmail.com>
Co-authored-by: tanel.kiis@gmail.com <tanel.kiis@gmail.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>

| Name |
|---|
| benchmarks |
| src |
| pom.xml |
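
The difference between the two patterns named in the commit message can be illustrated with a small pure-Python model. This is only a sketch and not Spark code: the row layout and the helper names (`first`, `count_non_null`, `agg_with_if`, `agg_with_filter`) are hypothetical and exist only for illustration. It shows that an aggregate which does not ignore nulls sees the substituted `null` under the `if ... else null` pattern, while the `FILTER` pattern drops the non-matching rows before aggregation.

```python
# Simplified model of rows tagged with a group id (gid).
# The row layout and all helper names are assumptions for this sketch.
rows = [
    {"gid": 2, "cat1": "x"},  # row that belongs to a different group
    {"gid": 1, "cat1": "a"},
    {"gid": 1, "cat1": "b"},
]


def first(values):
    """Model of a First-style aggregate that does NOT ignore nulls."""
    for v in values:
        return v  # returns the first input, even if it is None
    return None


def count_non_null(values):
    """Model of a Count-style aggregate that ignores nulls."""
    return sum(1 for v in values if v is not None)


def agg_with_if(agg, rows, gid):
    # Old pattern: agg(if (gid = <gid>) cat1 else null)
    # Non-matching rows still reach the aggregate, as None.
    return agg(r["cat1"] if r["gid"] == gid else None for r in rows)


def agg_with_filter(agg, rows, gid):
    # New pattern: agg(cat1) FILTER (WHERE gid = <gid>)
    # Non-matching rows never reach the aggregate.
    return agg(r["cat1"] for r in rows if r["gid"] == gid)


old_first = agg_with_if(first, rows, gid=1)               # None
new_first = agg_with_filter(first, rows, gid=1)           # "a"
old_count = agg_with_if(count_non_null, rows, gid=1)      # 2
new_count = agg_with_filter(count_non_null, rows, gid=1)  # 2
```

In this model the null-ignoring count gives the same answer under both patterns, which is consistent with the commit message's point that only aggregates that do not ignore NULL values are affected.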