diff --git a/docs/core-migration-guide.md b/docs/core-migration-guide.md
index 66a489bcc8..cde6e070c5 100644
--- a/docs/core-migration-guide.md
+++ b/docs/core-migration-guide.md
@@ -38,3 +38,5 @@ license: |
 - Event log file will be written as UTF-8 encoding, and Spark History Server will replay event log files as UTF-8 encoding. Previously Spark wrote the event log file as default charset of driver JVM process, so Spark History Server of Spark 2.x is needed to read the old event log files in case of incompatible encoding.
 
 - A new protocol for fetching shuffle blocks is used. It's recommended that external shuffle services be upgraded when running Spark 3.0 apps. You can still use old external shuffle services by setting the configuration `spark.shuffle.useOldFetchProtocol` to `true`. Otherwise, Spark may run into errors with messages like `IllegalArgumentException: Unexpected message type: `.
+
+- `SPARK_WORKER_INSTANCES` is deprecated in Standalone mode. It's recommended to launch one worker per node and multiple executors per worker, instead of launching multiple workers per node with one executor each.
diff --git a/docs/hardware-provisioning.md b/docs/hardware-provisioning.md
index 4e5d681962..fc87995f98 100644
--- a/docs/hardware-provisioning.md
+++ b/docs/hardware-provisioning.md
@@ -63,10 +63,10 @@ Note that memory usage is greatly affected by storage level and serialization fo
 the [tuning guide](tuning.html) for tips on how to reduce it.
 
 Finally, note that the Java VM does not always behave well with more than 200 GiB of RAM. If you
-purchase machines with more RAM than this, you can run _multiple worker JVMs per node_. In
-Spark's [standalone mode](spark-standalone.html), you can set the number of workers per node
-with the `SPARK_WORKER_INSTANCES` variable in `conf/spark-env.sh`, and the number of cores
-per worker with `SPARK_WORKER_CORES`.
+purchase machines with more RAM than this, you can launch multiple executors on a single node. In
+Spark's [standalone mode](spark-standalone.html), a worker is responsible for launching multiple
+executors according to its available memory and cores, and each executor runs in a
+separate Java VM.
 
 # Network
 
diff --git a/sbin/start-slave.sh b/sbin/start-slave.sh
index 2cb17a04f6..9b3b26b078 100755
--- a/sbin/start-slave.sh
+++ b/sbin/start-slave.sh
@@ -22,7 +22,7 @@
 # Environment Variables
 #
 #   SPARK_WORKER_INSTANCES  The number of worker instances to run on this
-#                           slave. Default is 1.
+#                           slave. Default is 1. Note it has been deprecated since Spark 3.0.
 #   SPARK_WORKER_PORT       The base port number for the first worker. If set,
 #                           subsequent workers will increment this number. If
 #                           unset, Spark will find a valid port number, but
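
A minimal sketch of the fallback described in the migration note above, for clusters whose external shuffle services have not yet been upgraded to 3.0. The master URL, application class, and jar name are placeholders, not part of this change; `spark.shuffle.useOldFetchProtocol` itself is the configuration the note introduces.

```bash
# Pin the pre-3.0 shuffle fetch protocol while old external shuffle
# services are still running (master URL, class, and jar are placeholders).
./bin/spark-submit \
  --master spark://master-host:7077 \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.shuffle.useOldFetchProtocol=true \
  --class com.example.MyApp \
  my-app.jar
```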
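To make the `SPARK_WORKER_INSTANCES` deprecation concrete, here is a hedged before/after sketch of `conf/spark-env.sh`. The node size (16 cores, 56 GiB available to Spark) is invented for illustration; `SPARK_WORKER_CORES` and `SPARK_WORKER_MEMORY` are the existing standalone-mode variables.

```bash
# Before (deprecated): four worker JVMs per node, one executor each.
SPARK_WORKER_INSTANCES=4
SPARK_WORKER_CORES=4      # cores each worker offers
SPARK_WORKER_MEMORY=14g   # memory each worker offers

# After: a single worker per node that owns all of the node's resources
# and carves them into executors on demand.
SPARK_WORKER_CORES=16
SPARK_WORKER_MEMORY=56g
```

Executor granularity then moves to the application side, e.g. `--conf spark.executor.cores=4 --conf spark.executor.memory=14g`, which lets the single worker launch four executors with the same footprint as before.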
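For the hardware-provisioning paragraph, a sketch (with invented numbers: a 512 GiB, 64-core machine) of keeping every JVM well under the roughly 200 GiB point by letting one worker spawn several smaller executors:

```bash
# conf/spark-env.sh on the large node: one worker owns the hardware,
# leaving headroom for the OS and the worker process itself.
SPARK_WORKER_CORES=64
SPARK_WORKER_MEMORY=480g

# Application side: eight 8-core / 60 GiB executors fit in the worker,
# so no single JVM heap approaches the problematic 200 GiB mark.
./bin/spark-submit \
  --master spark://master-host:7077 \
  --conf spark.executor.cores=8 \
  --conf spark.executor.memory=60g \
  my-app.jar
```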