spark-instrumented-optimizer

Tejas Patil 21c7539a52 [SPARK-18038][SQL] Move output partitioning definition from UnaryNodeExec to its children	2016-10-23 13:25:47 +02:00

## What changes were proposed in this pull request?

Jira: https://issues.apache.org/jira/browse/SPARK-18038

This was a suggestion by rxin in one of the dev list discussions: http://apache-spark-developers-list.1001551.n3.nabble.com/Project-not-preserving-child-partitioning-td19417.html

His words:

> It would be better (safer) to move the output partitioning definition into each of the operator and remove it from UnaryExecNode.

With this PR, the following is the output partitioning and ordering for all the impls of `UnaryExecNode`.

UnaryExecNode's impl | outputPartitioning | outputOrdering | comment
------------ | ------------- | ------------ | ------------
AppendColumnsExec | child's | Nil | child's ordering can be used
AppendColumnsWithObjectExec | child's | Nil | child's ordering can be used
BroadcastExchangeExec | BroadcastPartitioning | Nil | -
CoalesceExec | UnknownPartitioning | Nil | -
CollectLimitExec | SinglePartition | Nil | -
DebugExec | child's | Nil | child's ordering can be used
DeserializeToObjectExec | child's | Nil | child's ordering can be used
ExpandExec | UnknownPartitioning | Nil | -
FilterExec | child's | child's | -
FlatMapGroupsInRExec | child's | Nil | child's ordering can be used
GenerateExec | child's | Nil | need to dig more
GlobalLimitExec | child's | child's | -
HashAggregateExec | child's | Nil | -
InputAdapter | child's | child's | -
InsertIntoHiveTable | child's | Nil | terminal node, doesn't need partitioning
LocalLimitExec | child's | child's | -
MapElementsExec | child's | child's | -
MapGroupsExec | child's | Nil | child's ordering can be used
MapPartitionsExec | child's | Nil | child's ordering can be used
ProjectExec | child's | child's | -
SampleExec | child's | Nil | child's ordering can be used
ScriptTransformation | child's | Nil | child's ordering can be used
SerializeFromObjectExec | child's | Nil | child's ordering can be used
ShuffleExchange | custom | Nil | -
SortAggregateExec | child's | sort over grouped exprs | -
SortExec | child's | custom | -
StateStoreRestoreExec | child's | Nil | child's ordering can be used
StateStoreSaveExec | child's | Nil | child's ordering can be used
SubqueryExec | child's | child's | -
TakeOrderedAndProjectExec | SinglePartition | custom | -
WholeStageCodegenExec | child's | child's | -
WindowExec | child's | child's | -

## How was this patch tested?

This does NOT change any existing functionality, so it relies on existing tests.

Author: Tejas Patil <tejasp@fb.com>

Closes #15575 from tejasapatil/SPARK-18038_UnaryNodeExec_output_partitioning.
..
compatibility/src/test/scala/org/apache/spark/sql/hive/execution	[SPARK-17258][SQL] Parse scientific decimal literals as decimals	2016-10-04 23:48:26 -07:00
src	[SPARK-18038][SQL] Move output partitioning definition from UnaryNodeExec to its children	2016-10-23 13:25:47 +02:00
pom.xml	[SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent	2016-07-19 11:59:46 +01:00