350aa859fe
### What changes were proposed in this pull request?

1. Add the method `getTimeFormatters` to `HiveResult`, which creates the timestamp and date formatters.
2. Move creation of `dateFormatter` and `timestampFormatter` from the constructor of the `HiveResult` object to `HiveResult.hiveResultString()` via `getTimeFormatters`. This allows resolving the time zone ID from Spark's session time zone `spark.sql.session.timeZone` and creating the date/timestamp formatters only once before collecting `java.sql.Timestamp`/`java.sql.Date` values.
3. Create the date/timestamp formatters once in `SparkExecuteStatementOperation`.

### Why are the changes needed?

To fix a performance regression compared to Spark 2.4.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- By existing test suites such as `HiveResultSuite`.
- By regenerating the benchmark results of `DateTimeBenchmark` in the environment:

| Item | Description |
| ---- | ---- |
| Region | us-west-2 (Oregon) |
| Instance | r3.xlarge |
| AMI | ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1 (ami-06f2f779464715dc5) |
| Java | OpenJDK 64-Bit Server VM 1.8.0_252 and OpenJDK 64-Bit Server VM 11.0.7+10 |

Closes #28842 from MaxGekk/opt-toHiveString-oss-master.

Authored-by: Max Gekk <max.gekk@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
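The idea behind the change, building the formatter once per result set instead of once per value, can be illustrated with a minimal sketch. This is not Spark's implementation: it is plain Java on `java.time`, and the class `TimeFormattersSketch`, the method `formatAll`, and the pattern string are hypothetical names chosen for illustration. Only the name `getTimeFormatters` is borrowed from the description above, and here it returns just a timestamp formatter.

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimeFormattersSketch {

    // Hypothetical stand-in for getTimeFormatters: resolve the zone from the
    // session time zone string and build the formatter for that zone.
    static DateTimeFormatter getTimeFormatters(String sessionTimeZone) {
        ZoneId zone = ZoneId.of(sessionTimeZone);
        return DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(zone);
    }

    // Format a whole batch of collected values with a single formatter instance.
    static List<String> formatAll(List<Timestamp> values, String sessionTimeZone) {
        // Created once, outside the loop, rather than once per value.
        DateTimeFormatter formatter = getTimeFormatters(sessionTimeZone);
        List<String> out = new ArrayList<>(values.size());
        for (Timestamp ts : values) {
            out.add(formatter.format(ts.toInstant()));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Timestamp> values = Arrays.asList(Timestamp.from(Instant.EPOCH));
        System.out.println(formatAll(values, "UTC"));
        System.out.println(formatAll(values, "America/Los_Angeles"));
    }
}
```

Because the zone is passed in at call time, the same values render differently for different session time zones, while formatter construction stays out of the per-value loop.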
- src
- v1.2
- v2.3
- pom.xml