[SPARK-36626][PYTHON][FOLLOW-UP] Use datetime.tzinfo instead of datetime.tzname()

### What changes were proposed in this pull request?

This PR is a small follow-up to https://github.com/apache/spark/pull/33876. It proposes using `datetime.tzinfo` instead of `datetime.tzname()` to check whether timezone information is provided.

This is consistent with other places, such as:

9c5bcac61e/python/pyspark/sql/types.py (L182)

9c5bcac61e/python/pyspark/sql/types.py (L1662)

### Why are the changes needed?

In some cases, `datetime.tzname` can raise an exception (https://docs.python.org/3/library/datetime.html#datetime.datetime.tzname):

> ... raises an exception if the latter doesn’t return None or a string object,

I was able to reproduce this in Jenkins by setting `spark.sql.timestampType` to `TIMESTAMP_NTZ` by default:

```
======================================================================
ERROR: test_time_with_timezone (pyspark.sql.tests.test_serde.SerdeTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/tests/test_serde.py", line 92, in test_time_with_timezone
...
  File "/usr/lib/pypy3/lib-python/3/datetime.py", line 979, in tzname
    raise NotImplementedError("tzinfo subclass must override tzname()")
NotImplementedError: tzinfo subclass must override tzname()
```
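The failure can be sketched outside Spark as well. The snippet below is a minimal illustration, not the actual test code: `OffsetOnlyTimezone` and the two helper functions are hypothetical names, and the class is assumed to stand in for a `tzinfo` subclass that implements `utcoffset()` and `dst()` but does not override `tzname()`.

```python
import datetime


class OffsetOnlyTimezone(datetime.tzinfo):
    """Hypothetical tzinfo subclass: defines utcoffset() and dst(), not tzname()."""

    def utcoffset(self, dt):
        return datetime.timedelta(hours=1)

    def dst(self, dt):
        return datetime.timedelta(0)


def is_naive_via_tzname(dt):
    # Old check: datetime.tzname() delegates to tzinfo.tzname(),
    # which the base tzinfo class does not implement.
    return dt.tzname() is None


def is_naive_via_tzinfo(dt):
    # New check: plain attribute access, never calls into the tzinfo object.
    return dt.tzinfo is None


naive = datetime.datetime(2021, 9, 6, 17, 16, 52)
aware = datetime.datetime(2021, 9, 6, 17, 16, 52, tzinfo=OffsetOnlyTimezone())

print(is_naive_via_tzinfo(naive))  # True
print(is_naive_via_tzinfo(aware))  # False
print(is_naive_via_tzname(naive))  # True: no tzinfo, so tzname() returns None

try:
    is_naive_via_tzname(aware)
except NotImplementedError as exc:
    # The exact message differs between Python implementations.
    print("tzname() raised:", exc)
```

Both checks agree for naive datetimes; they only diverge when a `tzinfo` object is attached that does not implement `tzname()`, which is the case the `tzinfo is None` check avoids.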

### Does this PR introduce _any_ user-facing change?

No to end users, because it has not been released yet.
This is rather a safeguard to prevent potential breakage.

### How was this patch tested?

Manually tested.

Closes #33918 from HyukjinKwon/SPARK-36626-followup.

Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Max Gekk <max.gekk@gmail.com>
Hyukjin Kwon 2021-09-06 17:16:52 +02:00 committed by Max Gekk
parent db95960f4b
commit c6f3a13087

@@ -1045,7 +1045,7 @@ def _infer_type(obj, infer_dict_as_struct=False, prefer_timestamp_ntz=False):
     if dataType is DecimalType:
         # the precision and scale of `obj` may be different from row to row.
         return DecimalType(38, 18)
-    if dataType is TimestampType and prefer_timestamp_ntz and obj.tzname() is None:
+    if dataType is TimestampType and prefer_timestamp_ntz and obj.tzinfo is None:
         return TimestampNTZType()
     elif dataType is not None:
         return dataType()