spark-instrumented-optimizer/python/pyspark
Davies Liu 55349f9fe8 [SPARK-1740] [PySpark] kill the python worker
Kill only the python worker related to cancelled tasks.

The daemon starts a background thread that monitors the open sockets of all workers. If the JVM closes a socket, this thread kills the corresponding worker.

When a task is cancelled, the socket to its worker is closed, and the daemon then kills that worker.
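
The mechanism described above can be sketched roughly as follows. This is an illustration only, not the code in daemon.py: the names `closed_by_peer`, `reap_cancelled`, and the `workers` dict (socket -> worker pid) are hypothetical, and it assumes a Unix system where the daemon holds a handle to each worker's socket.

```python
import os
import select
import signal
import socket


def closed_by_peer(sock):
    """Return True if the remote end (here: the JVM) closed the connection."""
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending, so no end-of-stream yet
    try:
        # MSG_PEEK leaves any real data in the buffer for the worker to read.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except ConnectionError:
        return True


def reap_cancelled(workers):
    """Kill every worker whose socket was closed; return the killed pids.

    `workers` is a hypothetical bookkeeping dict mapping socket -> pid.
    """
    killed = []
    for sock, pid in list(workers.items()):
        if closed_by_peer(sock):
            try:
                os.kill(pid, signal.SIGKILL)
            except ProcessLookupError:
                pass  # the worker had already exited
            killed.append(pid)
            sock.close()
            del workers[sock]
    return killed
```

A background thread in the daemon could call `reap_cancelled(workers)` in a loop with a short sleep between passes, so that only workers tied to cancelled tasks are killed.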

Author: Davies Liu <davies.liu@gmail.com>

Closes #1643 from davies/kill and squashes the following commits:

8ffe9f3 [Davies Liu] kill worker by daemon, because runtime.exec() is too heavy
46ca150 [Davies Liu] address comment
acd751c [Davies Liu] kill the worker when task is canceled
2014-08-03 15:52:00 -07:00
..
mllib [SPARK-2478] [mllib] DecisionTree Python API 2014-08-02 13:07:17 -07:00
__init__.py [SPARK-2724] Python version of RandomRDDGenerators 2014-07-31 20:32:57 -07:00
accumulators.py SPARK-2282: Reuse Socket for sending accumulator updates to Pyspark 2014-07-31 15:31:53 -07:00
broadcast.py Fix some Python docs and make sure to unset SPARK_TESTING in Python 2013-12-29 20:15:07 -05:00
cloudpickle.py [SPARK-791] [PySpark] fix pickle itemgetter with cloudpickle 2014-07-29 01:02:18 -07:00
conf.py [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by default 2014-07-24 18:15:37 -07:00
context.py [SPARK-2454] Do not ship spark home to Workers 2014-08-02 00:45:38 -07:00
daemon.py [SPARK-1740] [PySpark] kill the python worker 2014-08-03 15:52:00 -07:00
files.py Initial work to rename package to org.apache.spark 2013-09-01 14:13:13 -07:00
java_gateway.py [SPARK-2470] PEP8 fixes to PySpark 2014-07-21 22:30:53 -07:00
join.py [SPARK-2470] PEP8 fixes to PySpark 2014-07-21 22:30:53 -07:00
rdd.py [SPARK-2010] [PySpark] [SQL] support nested structure in SchemaRDD 2014-08-01 18:47:41 -07:00
rddsampler.py [SPARK-2656] Python version of stratified sampling 2014-07-24 23:42:08 -07:00
resultiterable.py [SPARK-2470] PEP8 fixes to PySpark 2014-07-21 22:30:53 -07:00
serializers.py [SPARK-2538] [PySpark] Hash based disk spilling aggregation 2014-07-24 22:53:47 -07:00
shell.py [SPARK-2470] PEP8 fixes to PySpark 2014-07-21 22:30:53 -07:00
shuffle.py [SPARK-2538] [PySpark] Hash based disk spilling aggregation 2014-07-24 22:53:47 -07:00
sql.py [SPARK-2784][SQL] Deprecate hql() method in favor of a config option, 'spark.sql.dialect' 2014-08-03 12:28:29 -07:00
statcounter.py StatCounter on NumPy arrays [PYSPARK][SPARK-2012] 2014-08-01 22:33:25 -07:00
storagelevel.py [SPARK-2470] PEP8 fixes to PySpark 2014-07-21 22:30:53 -07:00
tests.py [SPARK-1740] [PySpark] kill the python worker 2014-08-03 15:52:00 -07:00
worker.py [SPARK-2580] [PySpark] keep silent in worker if JVM close the socket 2014-07-29 00:15:45 -07:00