spark-instrumented-optimizer/sql/core/src/main
Michael Armbrust cc7af86afd [SPARK-12813][SQL] Eliminate serialization for back to back operations
The goal of this PR is to eliminate unnecessary translations when there are back-to-back `MapPartitions` operations.  In order to achieve this I also made the following simplifications:

 - Operators no longer hold encoders; instead they have only the expressions that they need.  The benefits here are twofold: the expressions are visible to transformations, so they go through the normal resolution/binding process, and now that they are visible we can change them on a case-by-case basis.
 - Operators no longer have type parameters.  Since the engine is responsible for its own type checking, having the types visible to the compiler was an unnecessary complication.  We still leverage the Scala compiler in the companion factory when constructing a new operator, but after this the types are discarded.

Deferred to a follow-up PR:
 - Remove as much of the resolution/binding from Dataset/GroupedDataset as possible. We should still eagerly check resolution, though, and throw an error when an `as` operation has mismatches.
 - Eliminate serializations in more cases by adding more cases to `EliminateSerialization`

Author: Michael Armbrust <michael@databricks.com>

Closes #10747 from marmbrus/encoderExpressions.
2016-01-14 17:44:56 -08:00
java/org/apache/spark/sql [SPARK-12785][SQL] Add ColumnarBatch, an in memory columnar format for execution. 2016-01-12 18:21:04 -08:00
resources [SPARK-11206] Support SQL UI on the history server (resubmit) 2015-12-03 16:39:12 -08:00
scala/org/apache/spark/sql [SPARK-12813][SQL] Eliminate serialization for back to back operations 2016-01-14 17:44:56 -08:00