spark-instrumented-optimizer/sql/core/benchmarks
Marco Gaido 93db7b870d [SPARK-27684][SQL] Avoid conversion overhead for primitive types
## What changes were proposed in this pull request?

As outlined in the JIRA by JoshRosen, our conversion mechanism from Catalyst types to Scala ones is quite inefficient for primitive data types. In these cases, most of the time we add useless calls to the `identity` function, or to other functions that simply return the same value. Using the information available when we generate the code, we can avoid most of this overhead.
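
To make the idea concrete, here is a minimal, self-contained sketch of deciding up front which inputs need a converter and skipping the call when the value would be returned unchanged. It is only an illustration under simplifying assumptions: `SimpleType`, `InternalString`, `converterFor` and `buildEval` are hypothetical names, not Spark APIs, and the sketch works on plain closures instead of generated code.

```scala
// Simplified, hypothetical model of the idea; none of these names are Spark APIs.
sealed trait SimpleType
case object IntT extends SimpleType
case object LongT extends SimpleType
case object StringT extends SimpleType

// Stand-in for an internal string representation that differs from java.lang.String.
final case class InternalString(bytes: Array[Byte])

object ConverterSketch {
  // None means "internal and external values are identical": no call is needed.
  def converterFor(t: SimpleType): Option[Any => Any] = t match {
    case IntT | LongT => None
    case StringT =>
      val decode: Any => Any = {
        case InternalString(bytes) =>
          new String(bytes, java.nio.charset.StandardCharsets.UTF_8)
        case other => other
      }
      Some(decode)
  }

  // Decide once, before any row is processed, which inputs need converting.
  def buildEval(inputTypes: Seq[SimpleType], udf: Seq[Any] => Any): Seq[Any] => Any = {
    val converters = inputTypes.map(converterFor)
    if (converters.forall(_.isEmpty)) {
      udf // only primitives: hand the raw values straight to the UDF
    } else {
      args => udf(args.zip(converters).map {
        case (value, Some(convert)) => convert(value)
        case (value, None) => value
      })
    }
  }
}
```

With only `IntT` and `LongT` inputs, `buildEval` returns the UDF itself, so no per-row wrapper or identity call remains on the hot path. Inputs such as `StringT` still pay for a real conversion.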

## How was this patch tested?

Here is a simple test which shows the benefit that this PR can bring:
```
  test("SPARK-27684: perf evaluation") {
    val intLongUdf = ScalaUDF(
      (a: Int, b: Long) => a + b, LongType,
      Literal(1) :: Literal(1L) :: Nil,
      true :: true :: Nil,
      nullable = false)

    val plan = generateProject(
      MutableProjection.create(Alias(intLongUdf, s"udf")() :: Nil),
      intLongUdf)
    plan.initialize(0)

    var i = 0
    val N = 100000000
    val t0 = System.nanoTime()
    while(i < N) {
      plan(EmptyRow).get(0, intLongUdf.dataType)
      plan(EmptyRow).get(0, intLongUdf.dataType)
      plan(EmptyRow).get(0, intLongUdf.dataType)
      plan(EmptyRow).get(0, intLongUdf.dataType)
      plan(EmptyRow).get(0, intLongUdf.dataType)
      plan(EmptyRow).get(0, intLongUdf.dataType)
      plan(EmptyRow).get(0, intLongUdf.dataType)
      plan(EmptyRow).get(0, intLongUdf.dataType)
      plan(EmptyRow).get(0, intLongUdf.dataType)
      plan(EmptyRow).get(0, intLongUdf.dataType)
      i += 1
    }
    val t1 = System.nanoTime()
    println(s"Avg time: ${(t1 - t0).toDouble / N} ns")
  }
```
The output before the patch is:
```
Avg time: 51.27083294 ns
```
After the patch, we get:
```
Avg time: 11.85874227 ns
```
which is ~4.3X faster.
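
The ratio can be checked directly from the two printed averages. Note that each loop iteration invokes the projection ten times while the total is divided by `N`, so the cost per call is a tenth of the printed value. A small sketch of the arithmetic (the object and value names are illustrative only):

```scala
object SpeedupCheck {
  val beforeNs = 51.27083294   // printed average before the patch
  val afterNs = 11.85874227    // printed average after the patch
  val callsPerIteration = 10   // the loop body invokes the projection ten times

  // Ratio of the two averages; the number of calls per iteration cancels out.
  val speedup = beforeNs / afterNs

  // Approximate cost of a single projection call.
  val perCallBeforeNs = beforeNs / callsPerIteration
  val perCallAfterNs = afterNs / callsPerIteration

  def main(args: Array[String]): Unit =
    println(f"speedup: $speedup%.2fx, per call: $perCallBeforeNs%.2f ns -> $perCallAfterNs%.2f ns")
}
```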

Moreover, a benchmark has been added for Scala UDFs. The output after the patch can be seen in this PR; before the patch, the output was:
```
================================================================================================
UDF with mixed input types
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU  2.80GHz
long/nullable int/string to string:       Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int/string to string wholestage off            257            287          42          0,4        2569,5       1,0X
long/nullable int/string to string wholestage on            158            172          18          0,6        1579,0       1,6X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU  2.80GHz
long/nullable int/string to option:       Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int/string to option wholestage off            104            107           5          1,0        1037,9       1,0X
long/nullable int/string to option wholestage on             80             92          12          1,2         804,0       1,3X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU  2.80GHz
long/nullable int to primitive:           Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int to primitive wholestage off             71             76           7          1,4         712,1       1,0X
long/nullable int to primitive wholestage on             64             71           6          1,6         636,2       1,1X

================================================================================================
UDF with primitive types
================================================================================================

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU  2.80GHz
long/nullable int to string:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int to string wholestage off             60             60           0          1,7         600,3       1,0X
long/nullable int to string wholestage on             55             64           8          1,8         551,2       1,1X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU  2.80GHz
long/nullable int to option:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int to option wholestage off             66             73           9          1,5         663,0       1,0X
long/nullable int to option wholestage on             30             32           2          3,3         300,7       2,2X

Java HotSpot(TM) 64-Bit Server VM 1.8.0_152-b16 on Mac OS X 10.13.6
Intel(R) Core(TM) i7-4558U CPU  2.80GHz
long/nullable int/string to primitive:    Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
long/nullable int/string to primitive wholestage off             32             35           5          3,2         316,7       1,0X
long/nullable int/string to primitive wholestage on             41             68          17          2,4         414,0       0,8X
```
The improvements are particularly visible in the second case, i.e. when only primitive types are used as inputs.

Closes #24636 from mgaido91/SPARK-27684.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Josh Rosen <rosenville@gmail.com>
2019-05-30 17:09:19 -07:00
AggregateBenchmark-results.txt [SPARK-25476][SPARK-25510][TEST] Refactor AggregateBenchmark and add a new trait to better support Dataset and DataFrame API 2018-10-01 07:32:40 -07:00
BloomFilterBenchmark-results.txt [SPARK-25589][SQL][TEST] Add BloomFilterBenchmark 2018-10-03 04:14:07 -07:00
BuiltInDataSourceWriteBenchmark-results.txt [SPARK-25663][SPARK-25661][SQL][TEST] Refactor BuiltInDataSourceWriteBenchmark, DataSourceWriteBenchmark and AvroWriteBenchmark to use main method 2018-10-31 03:03:42 -07:00
ColumnarBatchBenchmark-results.txt [SPARK-25481][SQL][TEST] Refactor ColumnarBatchBenchmark to use main method 2018-09-26 20:40:10 -07:00
CompressionSchemeBenchmark-results.txt [SPARK-25478][SQL][TEST] Refactor CompressionSchemeBenchmark to use main method 2018-09-23 20:46:40 -07:00
CSVBenchmark-results.txt [SPARK-27533][SQL][TEST] Date and timestamp CSV benchmarks 2019-04-23 11:08:02 +09:00
DatasetBenchmark-results.txt [SPARK-25479][TEST] Refactor DatasetBenchmark to use main method 2018-10-04 11:58:16 -07:00
DataSourceReadBenchmark-results.txt [SPARK-26584][SQL] Remove spark.sql.orc.copyBatchToSpark internal conf 2019-01-10 08:42:23 -08:00
DateTimeBenchmark-results.txt [SPARK-27438][SQL] Parse strings with timestamps by to_timestamp() in microsecond precision 2019-04-22 19:41:32 +08:00
ExternalAppendOnlyUnsafeRowArrayBenchmark-results.txt [SPARK-25484][SQL][TEST] Refactor ExternalAppendOnlyUnsafeRowArrayBenchmark 2019-01-09 09:54:21 -08:00
FilterPushdownBenchmark-results.txt [SPARK-25438][SQL][TEST] Fix FilterPushdownBenchmark to use the same memory assumption 2018-09-15 17:48:39 -07:00
HashedRelationMetricsBenchmark-results.txt [SPARK-26337][SQL][TEST] Add benchmark for LongToUnsafeRowMap 2018-12-14 10:50:48 +08:00
InExpressionBenchmark-results.txt [SPARK-26205][SQL] Optimize InSet Expression for bytes, shorts, ints, dates 2019-03-04 15:40:04 -08:00
JoinBenchmark-results.txt [SPARK-25664][SQL][TEST] Refactor JoinBenchmark to use main method 2018-10-12 16:08:12 -07:00
JSONBenchmark-results.txt [SPARK-27535][SQL][TEST] Date and timestamp JSON benchmarks 2019-04-23 11:09:14 +09:00
MiscBenchmark-results.txt [SPARK-25488][SQL][TEST] Refactor MiscBenchmark to use main method 2018-10-06 08:47:43 -07:00
OrcNestedSchemaPruningBenchmark-results.txt [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition 2019-03-19 20:24:22 -07:00
OrcV2NestedSchemaPruningBenchmark-results.txt [SPARK-27502][SQL][TEST] Update nested schema benchmark result for Orc V2 2019-04-18 08:08:22 -07:00
ParquetNestedSchemaPruningBenchmark-results.txt [SPARK-26975][SQL] Support nested-column pruning over limit/sample/repartition 2019-03-19 20:24:22 -07:00
PrimitiveArrayBenchmark-results.txt [SPARK-25487][SQL][TEST] Refactor PrimitiveArrayBenchmark 2018-09-21 15:04:47 +09:00
RangeBenchmark-results.txt [SPARK-25710][SQL] range should report metrics correctly 2018-10-13 13:55:28 +08:00
SortBenchmark-results.txt [SPARK-25486][TEST] Refactor SortBenchmark to use main method 2018-09-25 11:13:05 -07:00
UDFBenchmark-results.txt [SPARK-27684][SQL] Avoid conversion overhead for primitive types 2019-05-30 17:09:19 -07:00
UnsafeArrayDataBenchmark-results.txt [SPARK-25483][TEST] Refactor UnsafeArrayDataBenchmark to use main method 2018-10-03 04:20:02 -07:00
WideSchemaBenchmark-results.txt [SPARK-25492][TEST] Refactor WideSchemaBenchmark to use main method 2018-10-20 17:31:13 -07:00
WideTableBenchmark-results.txt [SPARK-25676][SQL][FOLLOWUP] Use 'foreach(_ => ())' 2018-11-08 23:37:14 +08:00