eaac60a1e2
## What changes were proposed in this pull request? This is mostly from https://github.com/apache/spark/pull/13775 The wrapper solution is pretty good for string/binary type, as the ORC column vector doesn't keep bytes in a continuous memory region, and has a significant overhead when copying the data to Spark columnar batch. For other cases, the wrapper solution is almost same with the current solution. I think we can treat the wrapper solution as a baseline and keep improving the writing to Spark solution. ## How was this patch tested? existing tests. Author: Wenchen Fan <wenchen@databricks.com> Closes #20205 from cloud-fan/orc. |
||
---|---|---|
.. | ||
compatibility/src/test/scala/org/apache/spark/sql/hive/execution | ||
src | ||
pom.xml |