spark-instrumented-optimizer


yangjie01 a988aaf3fa [SPARK-29454][SQL] Reduce unsafeProjection times when read Parquet file use non-vectorized mode

### What changes were proposed in this pull request?

When a Parquet data file is read in non-vectorized mode, the unsafeProjection conversion runs twice:

1. `ParquetGroupConverter` calls the unsafeProjection function to convert a `SpecificInternalRow` to an `UnsafeRow` every time a Parquet data file is read with `ParquetRecordReader`.
2. `ParquetFileFormat` calls the unsafeProjection function again to convert this `UnsafeRow` to another `UnsafeRow` when partitionSchema is not empty in the DataSourceV1 branch, and `PartitionReaderWithPartitionValues` always performs this conversion in the DataSourceV2 branch.

This PR removes the `unsafeProjection` conversion in `ParquetGroupConverter` and changes `ParquetRecordReader` to produce `SpecificInternalRow` instead of `UnsafeRow`.

### Why are the changes needed?

The first conversion in `ParquetGroupConverter` is redundant; it is enough for `ParquetRecordReader` to return an `InternalRow` (`SpecificInternalRow`).

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Unit tests.

Closes #26106 from LuciferYang/spark-parquet-unsafe-projection.

Authored-by: yangjie01 <yangjie01@baidu.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

2019-10-15 12:42:42 +08:00
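
The effect of the change described in this commit message can be illustrated with a toy model. The sketch below is not Spark code: the classes only borrow the names `SpecificInternalRow` and `UnsafeRow`, and `CountingProjection`, `read_before`, and `read_after` are hypothetical helpers introduced here for illustration. It assumes a non-empty partition schema, so the outer projection always runs, and simply counts how many projection calls each read path makes per record.

```python
from dataclasses import dataclass


@dataclass
class SpecificInternalRow:
    """Toy stand-in for a mutable, object-based row (not the Spark class)."""
    values: list


@dataclass
class UnsafeRow:
    """Toy stand-in for a binary-encoded row (not the Spark class)."""
    values: list


class CountingProjection:
    """Hypothetical projection that copies a row and counts its invocations."""

    def __init__(self):
        self.calls = 0

    def __call__(self, row):
        self.calls += 1
        return UnsafeRow(list(row.values))


def read_before(records, partition_values, projection):
    """Old path: the converter projects first, then the file format projects again."""
    out = []
    for rec in records:
        unsafe = projection(SpecificInternalRow(rec))  # first, redundant conversion
        joined = SpecificInternalRow(unsafe.values + partition_values)
        out.append(projection(joined))  # second conversion, after adding partition values
    return out


def read_after(records, partition_values, projection):
    """New path: the reader hands over the internal row; only one projection runs."""
    out = []
    for rec in records:
        row = SpecificInternalRow(rec)
        joined = SpecificInternalRow(row.values + partition_values)
        out.append(projection(joined))  # the only conversion
    return out
```

Under these assumptions both paths yield the same rows, while the old path invokes the projection twice per record and the new path once.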
benchmarks	[SPARK-25668][SQL][TESTS] Refactor TPCDSQueryBenchmark to use main method	2019-10-08 13:33:42 +09:00
src	[SPARK-29454][SQL] Reduce unsafeProjection times when read Parquet file use non-vectorized mode	2019-10-15 12:42:42 +08:00
v1.2.1/src	[SPARK-28744][SQL][TEST] rename SharedSQLContext to SharedSparkSession	2019-08-19 19:01:56 +08:00
v2.3.5/src	[SPARK-28744][SQL][TEST] rename SharedSQLContext to SharedSparkSession	2019-08-19 19:01:56 +08:00
pom.xml	[SPARK-29296][BUILD][CORE] Remove use of .par to make 2.13 support easier; add scala-2.13 profile to enable pulling in par collections library separately, for the future	2019-10-03 08:56:08 -05:00