spark-instrumented-optimizer/sql/core/benchmarks/AggregateBenchmark-jdk11-results.txt

144 lines
11 KiB
Plaintext
Raw Normal View History

[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
aggregate without grouping
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
agg w/o group: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
agg w/o group wholestage off 59499 62969 2375 35.2 28.4 1.0X
agg w/o group wholestage on 890 902 10 2357.2 0.4 66.9X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
stat functions
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
stddev: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
stddev wholestage off 8702 8870 237 12.0 83.0 1.0X
stddev wholestage on 1307 1314 9 80.2 12.5 6.7X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
kurtosis: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
kurtosis wholestage off 42656 42797 198 2.5 406.8 1.0X
kurtosis wholestage on 1440 1466 32 72.8 13.7 29.6X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
aggregate with linear keys
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
codegen = F 10963 11035 101 7.7 130.7 1.0X
codegen = T hashmap = F 6852 7080 207 12.2 81.7 1.6X
codegen = T hashmap = T 1377 1421 43 60.9 16.4 8.0X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
aggregate with randomized keys
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
codegen = F 12240 12296 80 6.9 145.9 1.0X
codegen = T hashmap = F 8318 8376 87 10.1 99.2 1.5X
codegen = T hashmap = T 2551 2617 80 32.9 30.4 4.8X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
aggregate with string key
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w string key: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
codegen = F 4442 4504 88 4.7 211.8 1.0X
codegen = T hashmap = F 2685 2778 95 7.8 128.0 1.7X
codegen = T hashmap = T 1181 1203 15 17.8 56.3 3.8X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
aggregate with decimal key
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w decimal key: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
codegen = F 3487 3534 66 6.0 166.3 1.0X
codegen = T hashmap = F 2239 2353 161 9.4 106.8 1.6X
codegen = T hashmap = T 664 693 26 31.6 31.7 5.3X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
aggregate with multiple key types
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
Aggregate w multiple keys: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
codegen = F 6414 6475 86 3.3 305.9 1.0X
codegen = T hashmap = F 3776 3800 34 5.6 180.1 1.7X
codegen = T hashmap = T 2824 2852 40 7.4 134.7 2.3X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
max function bytecode size of wholestagecodegen
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
max function bytecode size: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
codegen = F 654 696 36 1.0 998.6 1.0X
codegen = T hugeMethodLimit = 10000 378 405 24 1.7 577.1 1.7X
codegen = T hugeMethodLimit = 1500 383 403 22 1.7 584.3 1.7X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
cube
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
cube: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
cube wholestage off 3443 3444 1 1.5 656.7 1.0X
cube wholestage on 1797 1815 20 2.9 342.7 1.9X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
================================================================================================
hash and BytesToBytesMap
================================================================================================
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
OpenJDK 64-Bit Server VM 11.0.5+10-post-Ubuntu-0ubuntu1.118.04 on Linux 4.15.0-1044-aws
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00
Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
BytesToBytesMap: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
------------------------------------------------------------------------------------------------------------------------
[SPARK-30409][SPARK-29173][SQL][TESTS] Use `NoOp` datasource in SQL benchmarks ### What changes were proposed in this pull request? In the PR, I propose to replace `.collect()`, `.count()` and `.foreach(_ => ())` in SQL benchmarks and use the `NoOp` datasource. I added an implicit class to `SqlBasedBenchmark` with the `.noop()` method. It can be used in benchmark like: `ds.noop()`. The last one is unfolded to `ds.write.format("noop").mode(Overwrite).save()`. ### Why are the changes needed? To avoid additional overhead that `collect()` (and other actions) has. For example, `.collect()` has to convert values according to external types and pull data to the driver. This can hide actual performance regressions or improvements of benchmarked operations. ### Does this PR introduce any user-facing change? No ### How was this patch tested? Re-run all modified benchmarks using Amazon EC2. | Item | Description | | ---- | ----| | Region | us-west-2 (Oregon) | | Instance | r3.xlarge (spot instance) | | AMI | ami-06f2f779464715dc5 (ubuntu/images/hvm-ssd/ubuntu-bionic-18.04-amd64-server-20190722.1) | | Java | OpenJDK8/10 | - Run `TPCDSQueryBenchmark` using instructions from the PR #26049 ``` # `spark-tpcds-datagen` needs this. (JDK8) $ git clone https://github.com/apache/spark.git -b branch-2.4 --depth 1 spark-2.4 $ export SPARK_HOME=$PWD $ ./build/mvn clean package -DskipTests # Generate data. (JDK8) $ git clone gitgithub.com:maropu/spark-tpcds-datagen.git $ cd spark-tpcds-datagen/ $ build/mvn clean package $ mkdir -p /data/tpcds $ ./bin/dsdgen --output-location /data/tpcds/s1 // This need `Spark 2.4` ``` - Other benchmarks ran by the script: ``` #!/usr/bin/env python3 import os from sparktestsupport.shellutils import run_cmd benchmarks = [ ['sql/test', 'org.apache.spark.sql.execution.benchmark.AggregateBenchmark'], ['avro/test', 'org.apache.spark.sql.execution.benchmark.AvroReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.BloomFilterBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DataSourceReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.DateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ExtractBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.FilterPushdownBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.InExpressionBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.IntervalBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.JoinBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MakeDateTimeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.MiscBenchmark'], ['hive/test', 'org.apache.spark.sql.execution.benchmark.ObjectHashAggregateExecBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.OrcV2NestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.ParquetNestedSchemaPruningBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.RangeBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.UDFBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideSchemaBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.benchmark.WideTableBenchmark'], ['hive/test', 'org.apache.spark.sql.hive.orc.OrcReadBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.csv.CSVBenchmark'], ['sql/test', 'org.apache.spark.sql.execution.datasources.json.JsonBenchmark'] ] print('Set SPARK_GENERATE_BENCHMARK_FILES=1') os.environ['SPARK_GENERATE_BENCHMARK_FILES'] = '1' for b in benchmarks: print("Run benchmark: %s" % b[1]) run_cmd(['build/sbt', '%s:runMain %s' % (b[0], b[1])]) ``` Closes #27078 from MaxGekk/noop-in-benchmarks. Lead-authored-by: Maxim Gekk <max.gekk@gmail.com> Co-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2020-01-12 16:18:19 -05:00
UnsafeRowhash 313 314 2 67.1 14.9 1.0X
murmur3 hash 144 145 2 145.9 6.9 2.2X
fast hash 70 72 5 301.5 3.3 4.5X
arrayEqual 189 190 0 111.0 9.0 1.7X
Java HashMap (Long) 129 132 4 162.4 6.2 2.4X
Java HashMap (two ints) 148 154 7 142.0 7.0 2.1X
Java HashMap (UnsafeRow) 839 840 1 25.0 40.0 0.4X
LongToUnsafeRowMap (opt=false) 462 464 1 45.4 22.0 0.7X
LongToUnsafeRowMap (opt=true) 108 108 1 194.2 5.2 2.9X
BytesToBytesMap (off Heap) 1027 1029 2 20.4 49.0 0.3X
BytesToBytesMap (on Heap) 999 1001 3 21.0 47.6 0.3X
Aggregate HashMap 44 44 0 477.8 2.1 7.1X
[SPARK-29320][TESTS] Compare `sql/core` module in JDK8/11 (Part 1) ### What changes were proposed in this pull request? This PR regenerates the `sql/core` benchmarks in JDK8/11 to compare the result. In general, we compare the ratio instead of the time. However, in this PR, the average time is compared. This PR should be considered as a rough comparison. **A. EXPECTED CASES(JDK11 is faster in general)** - [x] BloomFilterBenchmark (JDK11 is faster except one case) - [x] BuiltInDataSourceWriteBenchmark (JDK11 is faster at CSV/ORC) - [x] CSVBenchmark (JDK11 is faster except five cases) - [x] ColumnarBatchBenchmark (JDK11 is faster at `boolean`/`string` and some cases in `int`/`array`) - [x] DatasetBenchmark (JDK11 is faster with `string`, but is slower for `long` type) - [x] ExternalAppendOnlyUnsafeRowArrayBenchmark (JDK11 is faster except two cases) - [x] ExtractBenchmark (JDK11 is faster except HOUR/MINUTE/SECOND/MILLISECONDS/MICROSECONDS) - [x] HashedRelationMetricsBenchmark (JDK11 is faster) - [x] JSONBenchmark (JDK11 is much faster except eight cases) - [x] JoinBenchmark (JDK11 is faster except five cases) - [x] OrcNestedSchemaPruningBenchmark (JDK11 is faster in nine cases) - [x] PrimitiveArrayBenchmark (JDK11 is faster) - [x] SortBenchmark (JDK11 is faster except `Arrays.sort` case) - [x] UDFBenchmark (N/A, values are too small) - [x] UnsafeArrayDataBenchmark (JDK11 is faster except one case) - [x] WideTableBenchmark (JDK11 is faster except two cases) **B. CASES WE NEED TO INVESTIGATE MORE LATER** - [x] AggregateBenchmark (JDK11 is slower in general) - [x] CompressionSchemeBenchmark (JDK11 is slower in general except `string`) - [x] DataSourceReadBenchmark (JDK11 is slower in general) - [x] DateTimeBenchmark (JDK11 is slightly slower in general except `parsing`) - [x] MakeDateTimeBenchmark (JDK11 is slower except two cases) - [x] MiscBenchmark (JDK11 is slower except ten cases) - [x] OrcV2NestedSchemaPruningBenchmark (JDK11 is slower) - [x] ParquetNestedSchemaPruningBenchmark (JDK11 is slower except six cases) - [x] RangeBenchmark (JDK11 is slower except one case) `FilterPushdownBenchmark/InExpressionBenchmark/WideSchemaBenchmark` will be compared later because it took long timer. ### Why are the changes needed? According to the result, there are some difference between JDK8/JDK11. This will be a baseline for the future improvement and comparison. Also, as a reproducible environment, the following environment is used. - Instance: `r3.xlarge` - OS: `CentOS Linux release 7.5.1804 (Core)` - JDK: - `OpenJDK Runtime Environment (build 1.8.0_222-b10)` - `OpenJDK Runtime Environment 18.9 (build 11.0.4+11-LTS)` ### Does this PR introduce any user-facing change? No. ### How was this patch tested? This is a test-only PR. We need to run benchmark. Closes #26003 from dongjoon-hyun/SPARK-29320. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-03 11:58:25 -04:00