spark-instrumented-optimizer


HyukjinKwon e065e22e5e [SPARK-30861][PYTHON][SQL] Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark

### What changes were proposed in this pull request?

This PR proposes to deprecate the APIs at `SQLContext` removed in SPARK-25908. We should remove the equivalent APIs; however, it seems we missed deprecating them.

While I am here, I fix one more issue. After SPARK-25908, `sc._jvm.SQLContext.getOrCreate` does not exist anymore. So,

```python
from pyspark.sql import SQLContext
from pyspark import SparkContext
sc = SparkContext.getOrCreate()
SQLContext.getOrCreate(sc).range(10).show()
```

throws an exception as below:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/context.py", line 110, in getOrCreate
    jsqlContext = sc._jvm.SQLContext.getOrCreate(sc._jsc.sc())
  File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1516, in __getattr__
py4j.protocol.Py4JError: org.apache.spark.sql.SQLContext.getOrCreate does not exist in the JVM
```

After this PR:

```
/.../spark/python/pyspark/sql/context.py:113: DeprecationWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead.
  DeprecationWarning)
+---+
| id|
+---+
|  0|
|  1|
|  2|
|  3|
|  4|
|  5|
|  6|
|  7|
|  8|
|  9|
+---+
```

In case of the constructor of `SQLContext`, after this PR:

```python
from pyspark.sql import SQLContext
from pyspark import SparkContext
sc = SparkContext.getOrCreate()
SQLContext(sc)
```

```
/.../spark/python/pyspark/sql/context.py:77: DeprecationWarning: Deprecated in 3.0.0. Use SparkSession.builder.getOrCreate() instead.
  DeprecationWarning)
```

### Why are the changes needed?

To promote the use of SparkSession and keep API parity with the Scala side.

### Does this PR introduce any user-facing change?

Yes, it will show a deprecation warning to users.

### How was this patch tested?

Manually tested as described above. Unit tests were also added.

Closes #27614 from HyukjinKwon/SPARK-30861.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>		2020-02-19 11:17:47 +09:00
__init__.py	[SPARK-26032][PYTHON] Break large sql/tests.py files into smaller files	2018-11-14 14:51:11 +08:00
test_arrow.py	[SPARK-30777][PYTHON][TESTS] Fix test failures for Pandas >= 1.0.0	2020-02-11 10:03:01 +09:00
test_catalog.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_column.py	[SPARK-29664][PYTHON][SQL] Column.getItem behavior is not consistent with Scala	2019-11-01 12:25:48 +09:00
test_conf.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_context.py	[SPARK-30861][PYTHON][SQL] Deprecate constructor of SQLContext and getOrCreate in SQLContext at PySpark	2020-02-19 11:17:47 +09:00
test_dataframe.py	[SPARK-30791][SQL][PYTHON] Add 'sameSemantics' and 'semanticHash' methods in Dataset	2020-02-18 09:22:26 +08:00
test_datasources.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_functions.py	[SPARK-30607][SQL][PYSPARK][SPARKR] Add overlay wrappers for SparkR and PySpark	2020-01-23 16:16:47 +09:00
test_group.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_pandas_cogrouped_map.py	[SPARK-28264][PYTHON][SQL] Support type hints in pandas UDF and rename/move inconsistent pandas UDF types	2020-01-22 15:32:58 +09:00
test_pandas_grouped_map.py	[SPARK-30777][PYTHON][TESTS] Fix test failures for Pandas >= 1.0.0	2020-02-11 10:03:01 +09:00
test_pandas_map.py	[SPARK-28264][PYTHON][SQL] Support type hints in pandas UDF and rename/move inconsistent pandas UDF types	2020-01-22 15:32:58 +09:00
test_pandas_udf.py	[SPARK-30812][SQL][CORE] Revise boolean config name to comply with new config naming policy	2020-02-18 20:39:50 +08:00
test_pandas_udf_grouped_agg.py	[SPARK-30777][PYTHON][TESTS] Fix test failures for Pandas >= 1.0.0	2020-02-11 10:03:01 +09:00
test_pandas_udf_scalar.py	[SPARK-27870][PYTHON][FOLLOW-UP] Rename spark.sql.pandas.udf.buffer.size to spark.sql.execution.pandas.udf.buffer.size	2020-02-05 11:38:33 +09:00
test_pandas_udf_typehints.py	[SPARK-28264][PYTHON][SQL] Support type hints in pandas UDF and rename/move inconsistent pandas UDF types	2020-01-22 15:32:58 +09:00
test_pandas_udf_window.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_readwriter.py	[SPARK-28411][PYTHON][SQL] InsertInto with overwrite is not honored	2019-07-18 13:37:59 +09:00
test_serde.py	[SPARK-29041][PYTHON] Allows createDataFrame to accept bytes as binary type	2019-09-12 08:52:25 +09:00
test_session.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_streaming.py	[SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark	2019-06-24 09:58:17 +09:00
test_types.py	[SPARK-30812][SQL][CORE] Revise boolean config name to comply with new config naming policy	2020-02-18 20:39:50 +08:00
test_udf.py	[SPARK-28264][PYTHON][SQL] Support type hints in pandas UDF and rename/move inconsistent pandas UDF types	2020-01-22 15:32:58 +09:00
test_utils.py	[SPARK-19926][PYSPARK] make captured exception from JVM side user friendly	2019-09-18 23:32:10 +09:00