Clarify spark.default.parallelism

It's the task count across the cluster, not per worker, per machine, per core, or anything else.
Andrew Ash 2014-01-21 14:49:35 -08:00
parent f8544981a6
commit 069bb94206

@@ -98,7 +98,7 @@ Apart from these, the following properties are also available, and may be useful
   <td>spark.default.parallelism</td>
   <td>8</td>
   <td>
-    Default number of tasks to use for distributed shuffle operations (<code>groupByKey</code>,
+    Default number of tasks to use across the cluster for distributed shuffle operations (<code>groupByKey</code>,
     <code>reduceByKey</code>, etc) when not set by user.
   </td>
 </tr>
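
For illustration, here is a minimal sketch of the behavior this doc change describes. The app name, input path, and the value 48 are hypothetical; the local master is used only to make the sketch runnable. When reduceByKey is called without an explicit partition count, Spark falls back to spark.default.parallelism, which is the total task count for the shuffle, not a per-worker or per-core figure.

import org.apache.spark.{SparkConf, SparkContext}

object ParallelismExample {
  def main(args: Array[String]): Unit = {
    // spark.default.parallelism is the TOTAL number of tasks across the
    // cluster, not per worker, per machine, or per core. Here 48 means
    // 48 shuffle tasks in total (e.g. ~3 per core on a 16-core cluster).
    val conf = new SparkConf()
      .setAppName("ParallelismExample") // hypothetical app name
      .setMaster("local[4]")            // local master, just for a runnable sketch
      .set("spark.default.parallelism", "48")
    val sc = new SparkContext(conf)

    // reduceByKey takes an optional numPartitions argument; when it is
    // omitted, the shuffle uses spark.default.parallelism instead.
    val counts = sc.textFile("hdfs:///tmp/input.txt") // hypothetical path
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // 48 partitions, i.e. 48 tasks spread across the whole cluster.
    println(counts.partitions.length)
    sc.stop()
  }
}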