spark-instrumented-optimizer/sql/catalyst
Angerszhuuuu 75bffd972d [SPARK-36755][SQL] ArraysOverlap should handle duplicated Double.NaN and Float.NaN
### What changes were proposed in this pull request?
For the query
```
select arrays_overlap(array(cast('nan' as double), 1d), array(cast('nan' as double)))
```
This returns [false], but it should return [true].
This happens because `scala.collection.mutable.HashSet` does not treat `Double.NaN` or `Float.NaN` as equal to itself, so a lookup for `NaN` never matches.
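
The behaviour can be reproduced outside Spark with a minimal sketch. The object `NaNOverlapSketch` and its methods `naiveOverlap`, `isNaN`, and `nanAwareOverlap` are hypothetical names chosen for illustration; this is not the code in the patch, only one possible way to handle `NaN` separately from the hash set.

```scala
import scala.collection.mutable

// Illustrative sketch only; these names are assumptions, not Spark's code.
object NaNOverlapSketch {

  // Naive check: every element goes into the set, so a NaN lookup
  // depends on the set's equality, under which NaN != NaN.
  def naiveOverlap(left: Seq[Any], right: Seq[Any]): Boolean = {
    val seen = mutable.HashSet[Any]()
    left.foreach(seen += _)
    right.exists(seen.contains)
  }

  // True for Double.NaN and Float.NaN, false for anything else.
  def isNaN(v: Any): Boolean = v match {
    case d: Double => java.lang.Double.isNaN(d)
    case f: Float  => java.lang.Float.isNaN(f)
    case _         => false
  }

  // NaN-aware check: NaN is tracked with a flag instead of being stored
  // in the set, so two arrays that both contain NaN overlap.
  def nanAwareOverlap(left: Seq[Any], right: Seq[Any]): Boolean = {
    val seen = mutable.HashSet[Any]()
    var leftHasNaN = false
    left.foreach { v =>
      if (isNaN(v)) leftHasNaN = true else seen += v
    }
    right.exists(v => if (isNaN(v)) leftHasNaN else seen.contains(v))
  }
}
```

With `Seq(Double.NaN, 1d)` and `Seq(Double.NaN)` as inputs, `naiveOverlap` is expected to miss the match while `nanAwareOverlap` reports an overlap. The sketch does not distinguish `Float.NaN` from `Double.NaN`, which is acceptable here only because both arrays passed to `arrays_overlap` share one element type.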

### Why are the changes needed?
To fix a bug: `arrays_overlap` returns the wrong result when both arrays contain `NaN`.

### Does this PR introduce _any_ user-facing change?
Yes. `arrays_overlap` now treats `NaN` values as equal, so two arrays that both contain `NaN` are reported as overlapping.

### How was this patch tested?
Added a unit test.

Closes #34006 from AngersZhuuuu/SPARK-36755.

Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit b665782f0d)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2021-09-15 22:32:18 +08:00
benchmarks [SPARK-34950][TESTS] Update benchmark results to the ones created by GitHub Actions machines 2021-04-03 23:02:56 +03:00
src [SPARK-36755][SQL] ArraysOverlap should handle duplicated Double.NaN and Float.NaN 2021-09-15 22:32:18 +08:00
pom.xml [SPARK-36712][BUILD] Make scala-parallel-collections in 2.13 POM a direct dependency (not in maven profile) 2021-09-13 11:06:58 -05:00