[SPARK-22554][PYTHON] Add a config to control if PySpark should use daemon or not for workers

## What changes were proposed in this pull request?

This PR proposes to add a flag to control whether PySpark should use a daemon or not for its workers.

Actually, SparkR already has a flag for useDaemon:
478fbc866f/core/src/main/scala/org/apache/spark/api/r/RRunner.scala (L362)

It'd be great if we had this flag too. It makes it easier to debug Windows-specific issues.
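
As a hedged usage sketch (not part of this patch), the new configuration could be set through `SparkConf` before the context is created; the master and application name below are hypothetical:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Run Python workers without the daemon (pyspark/daemon.py); Spark then
// launches pyspark/worker.py directly. On Windows the daemon is never used,
// regardless of this setting.
val conf = new SparkConf()
  .setMaster("local[*]")                      // hypothetical local run
  .setAppName("python-worker-debug")          // hypothetical app name
  .set("spark.python.use.daemon", "false")
val sc = new SparkContext(conf)
```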

## How was this patch tested?

Manually tested.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19782 from HyukjinKwon/use-daemon-flag.
Committed by hyukjinkwon on 2017-11-20 13:34:06 +09:00
commit 57c5514de9 (parent b10837ab1a)


```diff
@@ -38,7 +38,12 @@ private[spark] class PythonWorkerFactory(pythonExec: String, envVars: Map[String
   // (pyspark/daemon.py) and tell it to fork new workers for our tasks. This daemon currently
   // only works on UNIX-based systems now because it uses signals for child management, so we can
   // also fall back to launching workers (pyspark/worker.py) directly.
-  val useDaemon = !System.getProperty("os.name").startsWith("Windows")
+  val useDaemon = {
+    val useDaemonEnabled = SparkEnv.get.conf.getBoolean("spark.python.use.daemon", true)
+
+    // This flag is ignored on Windows as it's unable to fork.
+    !System.getProperty("os.name").startsWith("Windows") && useDaemonEnabled
+  }
 
   var daemon: Process = null
   val daemonHost = InetAddress.getByAddress(Array(127, 0, 0, 1))
```
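
For illustration only (the function and names below are made up, not taken from the patch), the resulting behaviour reduces to a simple conjunction: the daemon is used only when the process is not on Windows and the new flag is left at its default of `true`:

```scala
// Illustrative-only sketch of the decision table the patch implements.
def shouldUseDaemon(osName: String, daemonEnabled: Boolean): Boolean =
  !osName.startsWith("Windows") && daemonEnabled

assert(shouldUseDaemon("Linux", daemonEnabled = true))         // daemon forks workers
assert(!shouldUseDaemon("Linux", daemonEnabled = false))       // workers launched directly
assert(!shouldUseDaemon("Windows 10", daemonEnabled = true))   // flag ignored on Windows
```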