## What changes were proposed in this pull request?
I made one pass over the Python APIs for barrier mode and updated them to match the Scala doc in #22240. Major changes:
* export the public classes
* expand the docs
* add doc for `BarrierTaskInfo.address`
cc: jiangxb1987
Closes #22261 from mengxr/SPARK-25248.1.
Authored-by: Xiangrui Meng <meng@databricks.com>
Signed-off-by: Xiangrui Meng <meng@databricks.com>
This eliminates some duplication in the code that connects to a server on localhost to talk directly to the JVM. It also gives consistent IPv6 and error handling. Two other incidental changes that shouldn't matter:
1) Python barrier tasks perform authentication immediately (rather than waiting for the BARRIER_FUNCTION indicator).
2) For `rdd._load_from_socket`, the timeout is only increased after authentication.
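The shared connect-then-authenticate flow described above might look roughly like the sketch below. It uses only the standard library and is not the code from this patch: the helper name `connect_and_auth`, the line-based handshake (send the secret, expect `ok`), and the timeout values are all assumptions made for illustration.

```python
import socket


def connect_and_auth(port, secret, host="localhost"):
    """Connect to a local server, authenticate, and return the socket.

    Hypothetical helper: the name and the handshake format are assumptions.
    """
    errors = []
    # AF_UNSPEC lets getaddrinfo return IPv4 and IPv6 candidates alike.
    for family, socktype, proto, _, sockaddr in socket.getaddrinfo(
            host, port, socket.AF_UNSPEC, socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(15)  # short timeout while connecting and authenticating
            sock.connect(sockaddr)
            # Authenticate immediately, before any other traffic.
            sock.sendall(secret.encode("utf-8") + b"\n")
            with sock.makefile("rb") as reader:
                reply = reader.readline().strip()
            if reply != b"ok":
                raise ConnectionError("authentication rejected: %r" % reply)
            # Only after authentication succeeds is the timeout relaxed.
            sock.settimeout(60)
            return sock
        except OSError as exc:
            # Record the failure and fall through to the next candidate address.
            errors.append("tried %r, but an error occurred: %s" % (sockaddr, exc))
            if sock is not None:
                sock.close()
    raise ConnectionError(
        "could not connect to port %d:\n%s" % (port, "\n".join(errors)))
```

Collecting the per-address errors and raising once at the end is one way to get the same error report regardless of which address family was tried.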
Closes #22247 from squito/py_connection_refactor.
Authored-by: Imran Rashid <irashid@cloudera.com>
Signed-off-by: hyukjinkwon <gurwls223@apache.org>
## What changes were proposed in this pull request?
Add methods `barrier()` and `getTaskInfos()` to the Python TaskContext. These two methods are only allowed for barrier tasks.
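As a rough usage sketch, a barrier task could wait for its peers and then list their addresses as shown below. This assumes the released PySpark API, where the two methods are reached through `BarrierTaskContext.get()`; the helper names `sync_then_list_peers` and `run_on_spark` are hypothetical and not part of this patch.

```python
def sync_then_list_peers(ctx):
    """Block at the barrier, then return the addresses of all tasks in the stage."""
    ctx.barrier()  # returns once every task in the barrier stage has reached this call
    return [info.address for info in ctx.getTaskInfos()]


def run_on_spark(sc):
    """Driver-side wiring; needs a live SparkContext with at least 2 free slots."""
    # Imported lazily so the helper above can be exercised without Spark installed.
    from pyspark import BarrierTaskContext

    def per_partition(iterator):
        yield sync_then_list_peers(BarrierTaskContext.get())

    # rdd.barrier() marks the stage as a barrier stage, so all tasks launch together.
    return sc.parallelize(range(4), 2).barrier().mapPartitions(per_partition).collect()
```

Passing the context in as an argument keeps the task logic easy to check with a stand-in object.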
## How was this patch tested?
Add new tests in `tests.py`
Closes #22085 from jiangxb1987/python.barrier.
Authored-by: Xingbo Jiang <xingbo.jiang@databricks.com>
Signed-off-by: Xiangrui Meng <meng@databricks.com>
## What changes were proposed in this pull request?
This adds a new API `TaskContext.getLocalProperty(key)` to the Python TaskContext. It mirrors the Java TaskContext API of returning a string value if the key exists, or None if the key does not exist.
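A minimal sketch of how a task might read such a property, falling back to a default when the key is absent. The helper names and the `job.owner` key are hypothetical; only `getLocalProperty` on the task context and `setLocalProperty` on the SparkContext are assumed to come from PySpark.

```python
def read_property(ctx, key, default):
    """Return the local property for `key`, or `default` if it was never set."""
    value = ctx.getLocalProperty(key)  # None when the key does not exist
    return default if value is None else value


def run_on_spark(sc):
    """Driver-side wiring; needs a live SparkContext."""
    # Imported lazily so the helper above can be exercised without Spark installed.
    from pyspark import TaskContext

    # Local properties set on the driver are propagated to the tasks of later jobs.
    sc.setLocalProperty("job.owner", "alice")
    return (sc.parallelize(range(2), 2)
              .map(lambda _: read_property(TaskContext.get(), "job.owner", "unknown"))
              .collect())
```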
## How was this patch tested?
New test added.
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes #21437 from tdas/SPARK-24397.
## What changes were proposed in this pull request?
Adds basic TaskContext information to PySpark.
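For illustration, the basic task information could be gathered inside a task roughly as follows. The helper names `describe_task` and `run_on_spark` are hypothetical; the sketch assumes the four accessors `stageId()`, `partitionId()`, `attemptNumber()`, and `taskAttemptId()` on the Python `TaskContext`.

```python
def describe_task(ctx):
    """Collect the basic identifiers of the currently running task into a dict."""
    return {
        "stage_id": ctx.stageId(),
        "partition_id": ctx.partitionId(),
        "attempt_number": ctx.attemptNumber(),
        "task_attempt_id": ctx.taskAttemptId(),
    }


def run_on_spark(sc):
    """Driver-side wiring; needs a live SparkContext."""
    # Imported lazily so the helper above can be exercised without Spark installed.
    from pyspark import TaskContext

    def per_partition(iterator):
        # TaskContext.get() is only meaningful inside a running task.
        yield describe_task(TaskContext.get())

    return sc.parallelize(range(4), 2).mapPartitions(per_partition).collect()
```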
## How was this patch tested?
New unit tests in `tests.py`, plus the existing unit tests.
Author: Holden Karau <holden@us.ibm.com>
Closes #16211 from holdenk/SPARK-18576-pyspark-taskcontext.