[SPARK-25476][SPARK-25510][TEST] Refactor AggregateBenchmark and add a new trait to better support Dataset and DataFrame API

## What changes were proposed in this pull request?

This PR does two things:
1. Adds a new trait (`SqlBasedBenchmark`) to better support the Dataset and DataFrame APIs.
2. Refactors `AggregateBenchmark` to use a main method. To generate the benchmark results, run:
```
SPARK_GENERATE_BENCHMARK_FILES=1 build/sbt "sql/test:runMain org.apache.spark.sql.execution.benchmark.AggregateBenchmark"
```
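
As a rough illustration of what "use a main method" can look like, the sketch below picks an output stream based on the `SPARK_GENERATE_BENCHMARK_FILES` variable used in the command above. It is a standalone approximation, not the code in this PR: the object name, the `openOutput` helper, and the `example-results.txt` file name are all hypothetical.

```scala
import java.io.{File, FileOutputStream, OutputStream, PrintStream}

// Standalone sketch; none of these names come from Spark.
object MainMethodBenchmarkSketch {

  /** Returns a file stream when the flag is "1", otherwise standard output. */
  def openOutput(flag: Option[String], file: File): OutputStream =
    if (flag.contains("1")) new FileOutputStream(file) else System.out

  def main(args: Array[String]): Unit = {
    // Same environment variable as in the sbt command above.
    val flag = sys.env.get("SPARK_GENERATE_BENCHMARK_FILES")
    val target = new File("example-results.txt") // hypothetical file name
    val out = new PrintStream(openOutput(flag, target))
    try out.println("benchmark results would be written here")
    finally if (flag.contains("1")) out.close() else out.flush()
  }
}
```

With a main method in place, the benchmark can be launched through `runMain` like any other program, and the environment variable only decides where the results go.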

## How was this patch tested?

manual tests

Closes #22484 from wangyum/SPARK-25476.

Lead-authored-by: Yuming Wang <yumwang@ebay.com>
Co-authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Yuming Wang 2018-10-01 07:32:40 -07:00 committed by Dongjoon Hyun
parent 30f5d0f2dd
commit b96fd44f0e
GPG key ID: EDA00CE834F0FC5C
3 changed files with 687 additions and 567 deletions


@@ -0,0 +1,143 @@
================================================================================================
aggregate without grouping
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
agg w/o group:                           Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
agg w/o group wholestage off                65374 / 70665         32.1          31.2       1.0X
agg w/o group wholestage on                   1178 / 1209       1779.8           0.6      55.5X

================================================================================================
stat functions
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
stddev:                                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
stddev wholestage off                         8667 / 8851         12.1          82.7       1.0X
stddev wholestage on                          1266 / 1273         82.8          12.1       6.8X

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
kurtosis:                                Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
kurtosis wholestage off                     41218 / 41231          2.5         393.1       1.0X
kurtosis wholestage on                        1347 / 1357         77.8          12.8      30.6X

================================================================================================
aggregate with linear keys
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w keys:                        Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                   9309 / 9389          9.0         111.0       1.0X
codegen = T hashmap = F                       4417 / 4435         19.0          52.7       2.1X
codegen = T hashmap = T                       1289 / 1298         65.1          15.4       7.2X

================================================================================================
aggregate with randomized keys
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w keys:                        Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                 11424 / 11426          7.3         136.2       1.0X
codegen = T hashmap = F                       6441 / 6496         13.0          76.8       1.8X
codegen = T hashmap = T                       2333 / 2344         36.0          27.8       4.9X

================================================================================================
aggregate with string key
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w string key:                  Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                   4751 / 4890          4.4         226.5       1.0X
codegen = T hashmap = F                       3146 / 3182          6.7         150.0       1.5X
codegen = T hashmap = T                       2211 / 2261          9.5         105.4       2.1X

================================================================================================
aggregate with decimal key
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w decimal key:                 Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                   3029 / 3062          6.9         144.4       1.0X
codegen = T hashmap = F                       1534 / 1569         13.7          73.2       2.0X
codegen = T hashmap = T                         575 / 578         36.5          27.4       5.3X

================================================================================================
aggregate with multiple key types
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w multiple keys:               Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                   7506 / 7521          2.8         357.9       1.0X
codegen = T hashmap = F                       4791 / 4808          4.4         228.5       1.6X
codegen = T hashmap = T                       3553 / 3585          5.9         169.4       2.1X

================================================================================================
max function bytecode size of wholestagecodegen
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
max function bytecode size:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
codegen = F                                     608 / 656          1.1         927.1       1.0X
codegen = T hugeMethodLimit = 10000             402 / 419          1.6         613.5       1.5X
codegen = T hugeMethodLimit = 1500              616 / 619          1.1         939.9       1.0X

================================================================================================
cube
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
cube:                                    Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
cube wholestage off                           3229 / 3237          1.6         615.9       1.0X
cube wholestage on                            1285 / 1306          4.1         245.2       2.5X

================================================================================================
hash and BytesToBytesMap
================================================================================================

OpenJDK 64-Bit Server VM 1.8.0_181-b13 on Linux 3.10.0-862.3.2.el7.x86_64
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
BytesToBytesMap:                         Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------
UnsafeRowhash                                   328 / 330         64.0          15.6       1.0X
murmur3 hash                                    167 / 167        125.4           8.0       2.0X
fast hash                                         84 / 85        249.0           4.0       3.9X
arrayEqual                                      192 / 192        109.3           9.1       1.7X
Java HashMap (Long)                             144 / 147        145.9           6.9       2.3X
Java HashMap (two ints)                         147 / 153        142.3           7.0       2.2X
Java HashMap (UnsafeRow)                        785 / 788         26.7          37.4       0.4X
LongToUnsafeRowMap (opt=false)                  456 / 457         46.0          21.8       0.7X
LongToUnsafeRowMap (opt=true)                   125 / 125        168.3           5.9       2.6X
BytesToBytesMap (off Heap)                      885 / 885         23.7          42.2       0.4X
BytesToBytesMap (on Heap)                       860 / 864         24.4          41.0       0.4X
Aggregate HashMap                                 56 / 56        373.9           2.7       5.8X
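
When reading these tables, it helps to see how the columns appear to relate to one another. Judging from the numbers, `Per Row(ns)` is roughly 1000 divided by `Rate(M/s)`, and `Relative` is the first row's best time divided by the current row's best time. The sketch below checks this against the "agg w/o group" rows; the object and function names are illustrative only. Because the table prints rounded values, recomputing one column from another can differ slightly in the last digit.

```scala
// Illustrative helpers; not part of Spark's Benchmark class.
object BenchmarkColumnsSketch {

  /** Per-row cost in nanoseconds implied by a rate in millions of rows per second. */
  def perRowNs(rateMPerSec: Double): Double = 1000.0 / rateMPerSec

  /** Speedup over the baseline case, computed from best times in milliseconds. */
  def relative(baselineBestMs: Double, bestMs: Double): Double = baselineBestMs / bestMs

  def main(args: Array[String]): Unit = {
    // Inputs copied from the "agg w/o group" table: 32.1 M/s, 65374 ms and 1178 ms.
    println(perRowNs(32.1))          // compare with the 31.2 ns reported for wholestage off
    println(relative(65374, 1178))   // compare with the 55.5X reported for wholestage on
  }
}
```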


@@ -0,0 +1,60 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.spark.sql.execution.benchmark

import org.apache.spark.benchmark.{Benchmark, BenchmarkBase}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.plans.SQLHelper
import org.apache.spark.sql.internal.SQLConf

/**
 * Common base trait to run benchmark with the Dataset and DataFrame API.
 */
trait SqlBasedBenchmark extends BenchmarkBase with SQLHelper {

  protected val spark: SparkSession = getSparkSession

  /** Subclass can override this function to build their own SparkSession */
  def getSparkSession: SparkSession = {
    SparkSession.builder()
      .master("local[1]")
      .appName(this.getClass.getCanonicalName)
      .config(SQLConf.SHUFFLE_PARTITIONS.key, 1)
      .config(SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key, 1)
      .getOrCreate()
  }

  /** Runs function `f` with whole stage codegen on and off. */
  final def codegenBenchmark(name: String, cardinality: Long)(f: => Unit): Unit = {
    val benchmark = new Benchmark(name, cardinality, output = output)

    benchmark.addCase(s"$name wholestage off", numIters = 2) { _ =>
      withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "false") {
        f
      }
    }

    benchmark.addCase(s"$name wholestage on", numIters = 5) { _ =>
      withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true") {
        f
      }
    }

    benchmark.run()
  }
}
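
The core idea in `codegenBenchmark` is that a by-name block `f` is evaluated once per configuration value, with the setting scoped to each run. The sketch below shows that pattern in plain Scala, without Spark on the classpath. Everything in it is a stand-in: `conf` is a simple mutable map rather than a real SQL configuration, `withConf` only approximates what `withSQLConf` is expected to do (set, run, restore), and the key `spark.sql.codegen.wholeStage` is assumed to be the string behind `SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key`.

```scala
import scala.collection.mutable

// Plain-Scala stand-in for the pattern; these names are not Spark APIs.
object CodegenToggleSketch {

  // Hypothetical replacement for a session's mutable configuration.
  val conf: mutable.Map[String, String] = mutable.Map.empty

  /** Sets the given keys, evaluates `body`, then restores the previous values. */
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (k, _) => k -> conf.get(k) }
    pairs.foreach { case (k, v) => conf(k) = v }
    try body
    finally previous.foreach {
      case (k, Some(old)) => conf(k) = old
      case (k, None) => conf.remove(k)
    }
  }

  /** Evaluates `f` once with the flag off and once with it on; returns (case name, elapsed ns). */
  def runBothWays(name: String)(f: => Unit): Seq[(String, Long)] =
    Seq("off" -> "false", "on" -> "true").map { case (label, value) =>
      withConf("spark.sql.codegen.wholeStage" -> value) {
        val start = System.nanoTime()
        f // the by-name block is re-evaluated on every call
        (s"$name wholestage $label", System.nanoTime() - start)
      }
    }

  def main(args: Array[String]): Unit = {
    val results = runBothWays("demo") {
      // The block can observe the setting that is active while it runs.
      println(s"flag = ${conf("spark.sql.codegen.wholeStage")}")
    }
    results.foreach { case (caseName, nanos) => println(s"$caseName: $nanos ns") }
  }
}
```

Taking `f` by name (`f: => Unit`) is what lets the same workload run under both settings. A plain `Unit` argument would be evaluated once, before either case is registered.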