spark-instrumented-optimizer/common
wangguangxin.cn 9a35b93c8a [SPARK-32559][SQL] Fix the trim logic in UTF8String.toInt/toLong didn't handle non-ASCII characters correctly
### What changes were proposed in this pull request?
The trim logic in the Cast expression, introduced in https://github.com/apache/spark/pull/26622, unexpectedly trims non-ASCII characters.

Before this patch
![image](https://user-images.githubusercontent.com/1312321/89513154-caad9b80-d806-11ea-9ebe-17c9e7d1b5b3.png)

After this patch
![image](https://user-images.githubusercontent.com/1312321/89513196-d731f400-d806-11ea-959c-6a7dc29dcd49.png)
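The commit message does not spell out the root cause, so the following is only a minimal sketch of one way a byte-level trim can drop non-ASCII characters. It assumes the trim compares raw UTF-8 bytes against the space character. Java bytes are signed, so every byte of a multi-byte UTF-8 sequence is negative and passes a plain `<= ' '` check. The class `TrimSketch` and both helper methods are hypothetical names for illustration, not Spark's actual code.

```java
import java.nio.charset.StandardCharsets;

public class TrimSketch {
    // Hypothetical helper: trims with a signed byte comparison.
    // Bytes >= 0x80 are negative in Java, so they also satisfy `<= ' '`
    // and are stripped along with real whitespace.
    static String trimSigned(String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        int start = 0;
        int end = b.length - 1;
        while (start <= end && b[start] <= ' ') start++;
        while (end >= start && b[end] <= ' ') end--;
        return new String(b, start, end - start + 1, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: only strips bytes in the ASCII range 0x00..0x20,
    // leaving multi-byte UTF-8 sequences untouched.
    static String trimAsciiOnly(String s) {
        byte[] b = s.getBytes(StandardCharsets.UTF_8);
        int start = 0;
        int end = b.length - 1;
        while (start <= end && b[start] >= 0 && b[start] <= ' ') start++;
        while (end >= start && b[end] >= 0 && b[end] <= ' ') end--;
        return new String(b, start, end - start + 1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String input = "\u4e2d1"; // a CJK character followed by the digit 1
        // The signed comparison silently drops the leading non-ASCII character.
        System.out.println(trimSigned(input).equals("1"));      // prints "true"
        // The range-checked comparison keeps the input intact.
        System.out.println(trimAsciiOnly(input).equals(input)); // prints "true"
    }
}
```

Under this assumption, a string such as `"\u4e2d1"` would be reduced to `"1"` before parsing and could then be cast to an integer, whereas the range-checked variant leaves it as an invalid number.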

### Why are the changes needed?
The behavior described above doesn't make sense. It is also inconsistent with the behavior when casting a string to double/float, and with the behavior of Hive.

### Does this PR introduce _any_ user-facing change?
Yes

### How was this patch tested?
Added more unit tests.

Closes #29375 from WangGuangxin/cast-bugfix.

Authored-by: wangguangxin.cn <wangguangxin.cn@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2020-08-07 05:00:33 +00:00
..
kvstore [SPARK-32350][CORE] Add batch-write on LevelDB to improve performance of HybridStore 2020-07-22 13:27:34 +09:00
network-common [SPARK-32036] Replace references to blacklist/whitelist language with more appropriate terminology, excluding the blacklisting feature 2020-07-15 11:40:55 -05:00
network-shuffle [SPARK-32149][SHUFFLE] Improve file path name normalisation at block resolution within the external shuffle service 2020-07-11 22:55:26 +09:00
network-yarn [SPARK-31611][YARN] Register NettyMemoryMetrics into Node Manager's metrics system 2020-05-08 15:50:19 -07:00
sketch [SPARK-32398][TESTS][CORE][STREAMING][SQL][ML] Update to scalatest 3.2.0 for Scala 2.13.3+ 2020-07-23 16:20:17 -07:00
tags [SPARK-32245][INFRA] Run Spark tests in Github Actions 2020-07-11 13:09:06 -07:00
unsafe [SPARK-32559][SQL] Fix the trim logic in UTF8String.toInt/toLong didn't handle non-ASCII characters correctly 2020-08-07 05:00:33 +00:00