spark-instrumented-optimizer

Hyukjin Kwon 9c5bcac61e [SPARK-36626][PYTHON] Support TimestampNTZ in createDataFrame/toPandas and Python UDFs

### What changes were proposed in this pull request?

This PR proposes to implement `TimestampNTZType` support in PySpark's `SparkSession.createDataFrame`, `DataFrame.toPandas`, Python UDFs, and pandas UDFs with and without Arrow.

### Why are the changes needed?

To complete `TimestampNTZType` support.

### Does this PR introduce _any_ user-facing change?

Yes.

- Users now can use `TimestampNTZType` type in `SparkSession.createDataFrame`, `DataFrame.toPandas`, Python UDFs, and pandas UDFs with and without Arrow.
- If `spark.sql.timestampType` is configured to `TIMESTAMP_NTZ`, PySpark will infer the `datetime` without timezone as `TimestampNTZType`. If it has a timezone, it will be inferred as `TimestampType` in `SparkSession.createDataFrame`.
- If `TimestampType` and `TimestampNTZType` conflict during merging inferred schema, `TimestampType` has a higher precedence.
- If the type is `TimestampNTZType`, treat this internally as an unknown timezone, and compute w/ UTC (same as JVM side), and avoid localization externally.

### How was this patch tested?

Manually tested and unittests were added.

Closes #33876 from HyukjinKwon/SPARK-36626.

Lead-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Co-authored-by: Dominik Gehl <dog@open.ch>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>

2021-09-02 14:00:27 +09:00
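
The inference and merge rules described in the commit message can be illustrated with a small sketch in plain Python. This is not PySpark's implementation: the function names `infer_timestamp_type` and `merge_timestamp_types` are hypothetical, the types are represented as plain strings, and the configuration value is passed in as an argument instead of being read from `spark.sql.timestampType`.

```python
from datetime import datetime, timezone

# Hypothetical stand-ins for the Spark SQL type names (plain strings,
# not the real PySpark classes).
TIMESTAMP = "TimestampType"
TIMESTAMP_NTZ = "TimestampNTZType"


def infer_timestamp_type(value: datetime, timestamp_type_conf: str = "TIMESTAMP_LTZ") -> str:
    """Infer a type name for one datetime, following the commit message.

    A datetime without a timezone is inferred as TimestampNTZType only
    when the configuration value is TIMESTAMP_NTZ; every other case is
    inferred as TimestampType.
    """
    if timestamp_type_conf == "TIMESTAMP_NTZ" and value.tzinfo is None:
        return TIMESTAMP_NTZ
    return TIMESTAMP


def merge_timestamp_types(left: str, right: str) -> str:
    """Merge two inferred types; TimestampType takes precedence on conflict."""
    if left == right:
        return left
    return TIMESTAMP


naive = datetime(2021, 9, 2, 14, 0, 27)                       # no tzinfo
aware = datetime(2021, 9, 2, 14, 0, 27, tzinfo=timezone.utc)  # has tzinfo

print(infer_timestamp_type(naive, "TIMESTAMP_NTZ"))
print(infer_timestamp_type(aware, "TIMESTAMP_NTZ"))
print(merge_timestamp_types(TIMESTAMP_NTZ, TIMESTAMP))
```

With the configuration set to `TIMESTAMP_NTZ`, the naive value maps to `TimestampNTZType` and the timezone-aware value to `TimestampType`; merging the two yields `TimestampType`, matching the precedence rule stated above.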
benchmarks	[SPARK-34981][SQL][FOLLOWUP] Use SpecificInternalRow in ApplyFunctionExpression	2021-05-24 17:25:24 +09:00
src	[SPARK-36626][PYTHON] Support TimestampNTZ in createDataFrame/toPandas and Python UDFs	2021-09-02 14:00:27 +09:00
pom.xml	Revert "[SPARK-34309][BUILD][CORE][SQL][K8S] Use Caffeine instead of Guava Cache"	2021-08-22 09:36:15 +09:00