[MINOR][SQL][DOCS] Fix some wrong default values in SQL tuning guide's AQE section

### What changes were proposed in this pull request?

- spark.sql.adaptive.coalescePartitions.initialPartitionNum: 200 -> (none)
- spark.sql.adaptive.skewJoin.skewedPartitionFactor: 10 -> 5

### Why are the changes needed?

The wrong docs mislead people.

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

passing doc

Closes #31717 from yaooqinn/minordoc0.

Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
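
To make the two corrected defaults concrete, here is a minimal sketch in plain Scala with no Spark dependency. `AqeDefaultsSketch`, its method names, and the explicit `thresholdInBytes` parameter are hypothetical and exist only for illustration; the real logic lives in `SQLConf` and `OptimizeSkewedJoin.isSkewed`, which read these values from the session configuration.

```scala
// Illustrative sketch only: AqeDefaultsSketch and its members are hypothetical
// names, not Spark APIs.
object AqeDefaultsSketch {
  // Documented default of spark.sql.adaptive.skewJoin.skewedPartitionFactor
  // after this fix (the guide previously said 10).
  val skewedPartitionFactor: Int = 5

  // spark.sql.adaptive.coalescePartitions.initialPartitionNum has no default
  // value of its own; when it is not set, spark.sql.shuffle.partitions is used.
  def initialPartitionNum(configured: Option[Int], shufflePartitions: Int): Int =
    configured.getOrElse(shufflePartitions)

  // A partition counts as skewed only if it is larger than BOTH
  // factor * median size and the byte threshold
  // (spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes).
  def isSkewed(size: Long, medianSize: Long, factor: Int, thresholdInBytes: Long): Boolean =
    size > medianSize * factor && size > thresholdInBytes
}
```

For example, with an assumed `spark.sql.shuffle.partitions` of 200, `AqeDefaultsSketch.initialPartitionNum(None, 200)` falls back to 200, while `initialPartitionNum(Some(1000), 200)` uses the explicitly configured 1000. The threshold is passed in as a parameter here because this commit does not state its value.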
2021-03-03 15:00:09 +09:00 · 499f620037
parent 229d2e0554
commit 499f620037
3 changed files with 6 additions and 6 deletions
--- a/docs/sql-performance-tuning.md
+++ b/docs/sql-performance-tuning.md
@@ -255,9 +255,9 @@ This feature coalesces the post shuffle partitions based on the map output stati
   </tr>
   <tr>
     <td><code>spark.sql.adaptive.coalescePartitions.initialPartitionNum</code></td>
-     <td>200</td>
+     <td>(none)</td>
     <td>
-       The initial number of shuffle partitions before coalescing. By default it equals to <code>spark.sql.shuffle.partitions</code>. This configuration only has an effect when <code>spark.sql.adaptive.enabled</code> and <code>spark.sql.adaptive.coalescePartitions.enabled</code> are both enabled.
+       The initial number of shuffle partitions before coalescing. If not set, it equals to <code>spark.sql.shuffle.partitions</code>. This configuration only has an effect when <code>spark.sql.adaptive.enabled</code> and <code>spark.sql.adaptive.coalescePartitions.enabled</code> are both enabled.
     </td>
     <td>3.0.0</td>
   </tr>
@@ -288,7 +288,7 @@ Data skew can severely downgrade the performance of join queries. This feature d
     </tr>
     <tr>
       <td><code>spark.sql.adaptive.skewJoin.skewedPartitionFactor</code></td>
-       <td>10</td>
+       <td>5</td>
       <td>
         A partition is considered as skewed if its size is larger than this factor multiplying the median partition size and also larger than <code>spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes</code>.
       </td>
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -483,7 +483,7 @@ object SQLConf {

  val COALESCE_PARTITIONS_INITIAL_PARTITION_NUM =
    buildConf("spark.sql.adaptive.coalescePartitions.initialPartitionNum")
-      .doc("The initial number of shuffle partitions before coalescing. By default it equals to " +
+      .doc("The initial number of shuffle partitions before coalescing. If not set, it equals to " +
        s"${SHUFFLE_PARTITIONS.key}. This configuration only has an effect when " +
        s"'${ADAPTIVE_EXECUTION_ENABLED.key}' and '${COALESCE_PARTITIONS_ENABLED.key}' " +
        "are both true.")
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
@@ -63,8 +63,8 @@ object OptimizeSkewedJoin extends CustomShuffleReaderRule {

  /**
   * A partition is considered as a skewed partition if its size is larger than the median
-   * partition size * ADAPTIVE_EXECUTION_SKEWED_PARTITION_FACTOR and also larger than
-   * ADVISORY_PARTITION_SIZE_IN_BYTES.
+   * partition size * SKEW_JOIN_SKEWED_PARTITION_FACTOR and also larger than
+   * SKEW_JOIN_SKEWED_PARTITION_THRESHOLD.
   */
  private def isSkewed(size: Long, medianSize: Long): Boolean = {
    size > medianSize * conf.getConf(SQLConf.SKEW_JOIN_SKEWED_PARTITION_FACTOR) &&