spark-instrumented-optimizer/python/pyspark/tests
reidy-p 4ab9aa0305 [SPARK-33017][PYTHON] Add getCheckpointDir method to PySpark Context
### What changes were proposed in this pull request?

Adding a method to get the checkpoint directory from the PySpark context to match the Scala API

### Why are the changes needed?

To make the Scala and Python APIs consistent and remove the need to use the JavaObject

### Does this PR introduce _any_ user-facing change?

Yes, there is a new method which makes it easier to get the checkpoint directory directly rather than using the JavaObject

#### Previous behaviour:
```python
>>> spark.sparkContext.setCheckpointDir('/tmp/spark/checkpoint/')
>>> sc._jsc.sc().getCheckpointDir().get()
'file:/tmp/spark/checkpoint/63f7b67c-e5dc-4d11-a70c-33554a71717a'
```
This method returns a confusing Scala error if it has not been set
```python
>>> sc._jsc.sc().getCheckpointDir().get()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/paul/Desktop/spark/python/lib/py4j-0.10.9-src.zip/py4j/java_gateway.py", line 1305, in __call__
  File "/home/paul/Desktop/spark/python/pyspark/sql/utils.py", line 111, in deco
    return f(*a, **kw)
  File "/home/paul/Desktop/spark/python/lib/py4j-0.10.9-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o25.get.
: java.util.NoSuchElementException: None.get
        at scala.None$.get(Option.scala:529)
        at scala.None$.get(Option.scala:527)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:748)

```

#### New method:
```python
>>> spark.sparkContext.setCheckpointDir('/tmp/spark/checkpoint/')
>>> spark.sparkContext.getCheckpointDir()
'file:/tmp/spark/checkpoint/b38aca2e-8ace-44fc-a4c4-f4e36c2da2a7'
```

``getCheckpointDir()`` returns ``None`` if it has not been set
```python
>>> print(spark.sparkContext.getCheckpointDir())
None
```

### How was this patch tested?

Added to existing unit tests. But I'm not sure how to add a test for the case where ``getCheckpointDir()`` should return ``None`` since the existing checkpoint tests set the checkpoint directory in the ``setUp`` method before any tests are run as far as I can tell.

Closes #29918 from reidy-p/SPARK-33017.

Authored-by: reidy-p <paul_reidy@outlook.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-10-05 11:48:28 +09:00
..
__init__.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_appsubmit.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_broadcast.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_conf.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_context.py [SPARK-33017][PYTHON] Add getCheckpointDir method to PySpark Context 2020-10-05 11:48:28 +09:00
test_daemon.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_install_spark.py [SPARK-32714][FOLLOW-UP][PYTHON] Address pyspark.install typing errors 2020-09-27 16:21:23 +09:00
test_join.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_pin_thread.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_profiler.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_rdd.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_rddbarrier.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_readwrite.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_serializers.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_shuffle.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_taskcontext.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_util.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
test_worker.py [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00