spark-instrumented-optimizer/python/pyspark/sql
Li Jin d3eed8fd6d [SPARK-24563][PYTHON] Catch TypeError when testing existence of HiveConf when creating pysp…
…ark shell

## What changes were proposed in this pull request?

This PR catches TypeError when testing existence of HiveConf when creating pyspark shell

## How was this patch tested?

Manually tested. Here are the manual test cases:

Build with hive:
```
(pyarrow-dev) Lis-MacBook-Pro:spark icexelloss$ bin/pyspark
Python 3.6.5 | packaged by conda-forge | (default, Apr  6 2018, 13:44:09)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
18/06/14 14:55:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.0-SNAPSHOT
      /_/

Using Python version 3.6.5 (default, Apr  6 2018 13:44:09)
SparkSession available as 'spark'.
>>> spark.conf.get('spark.sql.catalogImplementation')
'hive'
```

Build without hive:
```
(pyarrow-dev) Lis-MacBook-Pro:spark icexelloss$ bin/pyspark
Python 3.6.5 | packaged by conda-forge | (default, Apr  6 2018, 13:44:09)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
18/06/14 15:04:52 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.4.0-SNAPSHOT
      /_/

Using Python version 3.6.5 (default, Apr  6 2018 13:44:09)
SparkSession available as 'spark'.
>>> spark.conf.get('spark.sql.catalogImplementation')
'in-memory'
```

Failed to start shell:
```
(pyarrow-dev) Lis-MacBook-Pro:spark icexelloss$ bin/pyspark
Python 3.6.5 | packaged by conda-forge | (default, Apr  6 2018, 13:44:09)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
18/06/14 15:07:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
/Users/icexelloss/workspace/spark/python/pyspark/shell.py:45: UserWarning: Failed to initialize Spark session.
  warnings.warn("Failed to initialize Spark session.")
Traceback (most recent call last):
  File "/Users/icexelloss/workspace/spark/python/pyspark/shell.py", line 41, in <module>
    spark = SparkSession._create_shell_session()
  File "/Users/icexelloss/workspace/spark/python/pyspark/sql/session.py", line 581, in _create_shell_session
    return SparkSession.builder.getOrCreate()
  File "/Users/icexelloss/workspace/spark/python/pyspark/sql/session.py", line 168, in getOrCreate
    raise py4j.protocol.Py4JError("Fake Py4JError")
py4j.protocol.Py4JError: Fake Py4JError
(pyarrow-dev) Lis-MacBook-Pro:spark icexelloss$
```

Author: Li Jin <ice.xelloss@gmail.com>

Closes #21569 from icexelloss/SPARK-24563-fix-pyspark-shell-without-hive.
2018-06-14 13:16:20 -07:00
..
__init__.py [SPARK-22369][PYTHON][DOCS] Exposes catalog API documentation in PySpark 2017-11-02 15:22:52 +01:00
catalog.py [SPARK-23522][PYTHON] always use sys.exit over builtin exit 2018-03-08 20:38:34 +09:00
column.py [SPARK-23847][PYTHON][SQL] Add asc_nulls_first, asc_nulls_last to PySpark 2018-04-08 12:09:06 +08:00
conf.py [SPARK-23700][PYTHON] Cleanup imports in pyspark.sql 2018-03-26 12:42:32 +09:00
context.py [SPARK-23706][PYTHON] spark.conf.get(value, default=None) should produce None in PySpark 2018-03-18 20:24:14 +09:00
dataframe.py [SPARK-24215][PYSPARK] Implement _repr_html_ for dataframes in PySpark 2018-06-05 08:23:08 +07:00
functions.py [SPARK-22239][SQL][PYTHON] Enable grouped aggregate pandas UDFs as window functions with unbounded window frames 2018-06-13 09:10:52 +08:00
group.py [SPARK-24392][PYTHON] Label pandas_udf as Experimental 2018-05-28 12:56:05 +08:00
readwriter.py [SPARK-23786][SQL] Checking column names of csv headers 2018-06-03 22:02:21 -07:00
session.py [SPARK-24563][PYTHON] Catch TypeError when testing existence of HiveConf when creating pysp… 2018-06-14 13:16:20 -07:00
streaming.py [SPARK-23786][SQL] Checking column names of csv headers 2018-06-03 22:02:21 -07:00
tests.py [SPARK-22239][SQL][PYTHON] Enable grouped aggregate pandas UDFs as window functions with unbounded window frames 2018-06-13 09:10:52 +08:00
types.py [SPARK-24057][PYTHON] put the real data type in the AssertionError message 2018-04-26 14:21:22 -07:00
udf.py [SPARK-23754][PYTHON][FOLLOWUP] Move UDF stop iteration wrapping from driver to executor 2018-06-11 10:15:42 +08:00
utils.py [SPARK-23699][PYTHON][SQL] Raise same type of error caught with Arrow enabled 2018-03-27 20:06:12 -07:00
window.py [SPARK-23861][SQL][DOC] Clarify default window frame with and without orderBy clause 2018-04-07 00:15:54 +08:00