spark-instrumented-optimizer

Marco Gaido 93db7b870d [SPARK-27684][SQL] Avoid conversion overhead for primitive types

## What changes were proposed in this pull request?

As outlined in the JIRA by JoshRosen, our conversion mechanism from catalyst types to scala ones is pretty inefficient for primitive data types. In these cases, most of the time we add useless calls to the `identity` function, or to other functions that return their input unchanged. Using the information available when we generate the code, we can avoid most of this overhead.

## How was this patch tested?

Here is a simple test which shows the benefit that this PR can bring:

```
test("SPARK-27684: perf evaluation") {
  val intLongUdf = ScalaUDF(
    (a: Int, b: Long) => a + b, LongType,
    Literal(1) :: Literal(1L) :: Nil,
    true :: true :: Nil,
    nullable = false)

  val plan = generateProject(
    MutableProjection.create(Alias(intLongUdf, s"udf")() :: Nil),
    intLongUdf)
  plan.initialize(0)

  var i = 0
  val N = 100000000
  val t0 = System.nanoTime()
  while(i < N) {
    plan(EmptyRow).get(0, intLongUdf.dataType)
    plan(EmptyRow).get(0, intLongUdf.dataType)
    plan(EmptyRow).get(0, intLongUdf.dataType)
    plan(EmptyRow).get(0, intLongUdf.dataType)
    plan(EmptyRow).get(0, intLongUdf.dataType)
    plan(EmptyRow).get(0, intLongUdf.dataType)
    plan(EmptyRow).get(0, intLongUdf.dataType)
    plan(EmptyRow).get(0, intLongUdf.dataType)
    plan(EmptyRow).get(0, intLongUdf.dataType)
    plan(EmptyRow).get(0, intLongUdf.dataType)
    i += 1
  }
  val t1 = System.nanoTime()
  println(s"Avg time: ${(t1 - t0).toDouble / N} ns")
}
```

The output before the patch is:

```
Avg time: 51.27083294 ns
```

after, we get:

```
Avg time: 11.85874227 ns
```

which is ~5X faster.

Moreover, a benchmark has been added for Scala UDF. The output after the patch can be seen in this PR; before the patch, the output was:

```
================================================================================================
UDF with mixed input types
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU 2.80GHz
long/nullable int/string to string:                   Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int/string to string wholestage off               257            287          42         0,4        2569,5       1,0X
long/nullable int/string to string wholestage on                158            172          18         0,6        1579,0       1,6X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU 2.80GHz
long/nullable int/string to option:                   Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int/string to option wholestage off               104            107           5         1,0        1037,9       1,0X
long/nullable int/string to option wholestage on                 80             92          12         1,2         804,0       1,3X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU 2.80GHz
long/nullable int to primitive:                       Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int to primitive wholestage off                    71             76           7         1,4         712,1       1,0X
long/nullable int to primitive wholestage on                     64             71           6         1,6         636,2       1,1X

================================================================================================
UDF with primitive types
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU 2.80GHz
long/nullable int to string:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int to string wholestage off                       60             60           0         1,7         600,3       1,0X
long/nullable int to string wholestage on                        55             64           8         1,8         551,2       1,1X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU 2.80GHz
long/nullable int to option:                          Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int to option wholestage off                       66             73           9         1,5         663,0       1,0X
long/nullable int to option wholestage on                        30             32           2         3,3         300,7       2,2X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU 2.80GHz
long/nullable int/string to primitive:                Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int/string to primitive wholestage off             32             35           5         3,2         316,7       1,0X
long/nullable int/string to primitive wholestage on              41             68          17         2,4         414,0       0,8X
```

The improvements are particularly visible in the second case, i.e. when only primitive types are used as inputs.

Closes #24636 from mgaido91/SPARK-27684.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Josh Rosen <rosenville@gmail.com>

2019-05-30 17:09:19 -07:00
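The idea described in the commit message, skipping conversions that would only return their input, can be illustrated with a small standalone sketch. This is not Spark's code: `SketchType`, `ConverterSketch`, `converterFor` and `convertRow` are hypothetical names invented for illustration, and the "conversion" for non-primitive values is just a `toString` stand-in.

```scala
// Hypothetical type tags standing in for a real type system.
sealed trait SketchType
case object SketchInt extends SketchType
case object SketchLong extends SketchType
case object SketchString extends SketchType

object ConverterSketch {
  // Assumption: primitive values have the same representation on both
  // sides, so converting them would be an identity call.
  def isPrimitive(t: SketchType): Boolean = t match {
    case SketchInt | SketchLong => true
    case SketchString           => false
  }

  // Return None when the conversion would be the identity, so the caller
  // can skip the function call entirely instead of invoking `identity`.
  def converterFor(t: SketchType): Option[Any => Any] =
    if (isPrimitive(t)) None
    else {
      val f: Any => Any = v => v.toString // stand-in for a real conversion
      Some(f)
    }

  // The converters are chosen once from the known types, then applied
  // only where one is actually needed.
  def convertRow(types: Seq[SketchType], values: Seq[Any]): Seq[Any] = {
    val converters = types.map(converterFor)
    values.zip(converters).map {
      case (v, Some(f)) => f(v)
      case (v, None)    => v
    }
  }
}
```

In this sketch the decision is made once per column type rather than once per value, which loosely mirrors using type information that is already known at code-generation time.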
AggregateBenchmark-results.txt	[SPARK-25476][SPARK-25510][TEST] Refactor AggregateBenchmark and add a new trait to better support Dataset and DataFrame API	2018-10-01 07:32:40 -07:00
BloomFilterBenchmark-results.txt	[SPARK-25589][SQL][TEST] Add BloomFilterBenchmark	2018-10-03 04:14:07 -07:00
BuiltInDataSourceWriteBenchmark-results.txt	[SPARK-25663][SPARK-25661][SQL][TEST] Refactor BuiltInDataSourceWriteBenchmark, DataSourceWriteBenchmark and AvroWriteBenchmark to use main method	2018-10-31 03:03:42 -07:00
ColumnarBatchBenchmark-results.txt	[SPARK-25481][SQL][TEST] Refactor ColumnarBatchBenchmark to use main method	2018-09-26 20:40:10 -07:00
CompressionSchemeBenchmark-results.txt	[SPARK-25478][SQL][TEST] Refactor CompressionSchemeBenchmark to use main method	2018-09-23 20:46:40 -07:00
CSVBenchmark-results.txt	[SPARK-27533][SQL][TEST] Date and timestamp CSV benchmarks	2019-04-23 11:08:02 +09:00
DatasetBenchmark-results.txt	[SPARK-25479][TEST] Refactor DatasetBenchmark to use main method	2018-10-04 11:58:16 -07:00
DataSourceReadBenchmark-results.txt	[SPARK-26584][SQL] Remove `spark.sql.orc.copyBatchToSpark` internal conf	2019-01-10 08:42:23 -08:00
DateTimeBenchmark-results.txt	[SPARK-27438][SQL] Parse strings with timestamps by to_timestamp() in microsecond precision	2019-04-22 19:41:32 +08:00
ExternalAppendOnlyUnsafeRowArrayBenchmark-results.txt	[SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsafeRowArrayBenchmark	2019-01-09 09:54:21 -08:00
FilterPushdownBenchmark-results.txt	[SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to use the same memory assumption	2018-09-15 17:48:39 -07:00
HashedRelationMetricsBenchmark-results.txt	[SPARK-26337][SQL][TEST] Add benchmark for LongToUnsafeRowMap	2018-12-14 10:50:48 +08:00
InExpressionBenchmark-results.txt	[SPARK-26205][SQL] Optimize InSet Expression for bytes, shorts, ints, dates	2019-03-04 15:40:04 -08:00
JoinBenchmark-results.txt	[SPARK-25664][SQL][TEST] Refactor JoinBenchmark to use main method	2018-10-12 16:08:12 -07:00
JSONBenchmark-results.txt	[SPARK-27535][SQL][TEST] Date and timestamp JSON benchmarks	2019-04-23 11:09:14 +09:00
MiscBenchmark-results.txt	[SPARK-25488][SQL][TEST] Refactor MiscBenchmark to use main method	2018-10-06 08:47:43 -07:00
OrcNestedSchemaPruningBenchmark-results.txt	[SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition	2019-03-19 20:24:22 -07:00
OrcV2NestedSchemaPruningBenchmark-results.txt	[SPARK-27502][SQL][TEST] Update nested schema benchmark result for Orc V2	2019-04-18 08:08:22 -07:00
ParquetNestedSchemaPruningBenchmark-results.txt	[SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition	2019-03-19 20:24:22 -07:00
PrimitiveArrayBenchmark-results.txt	[SPARK-25487][SQL][TEST] Refactor PrimitiveArrayBenchmark	2018-09-21 15:04:47 +09:00
RangeBenchmark-results.txt	[SPARK-25710][SQL] range should report metrics correctly	2018-10-13 13:55:28 +08:00
SortBenchmark-results.txt	[SPARK-25486][TEST] Refactor SortBenchmark to use main method	2018-09-25 11:13:05 -07:00
UDFBenchmark-results.txt	[SPARK-27684][SQL] Avoid conversion overhead for primitive types	2019-05-30 17:09:19 -07:00
UnsafeArrayDataBenchmark-results.txt	[SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to use main method	2018-10-03 04:20:02 -07:00
WideSchemaBenchmark-results.txt	[SPARK-25492][TEST] Refactor WideSchemaBenchmark to use main method	2018-10-20 17:31:13 -07:00
WideTableBenchmark-results.txt	[SPARK-25676][SQL][FOLLOWUP] Use 'foreach(_ => ())'	2018-11-08 23:37:14 +08:00