[SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

### What changes were proposed in this pull request?

This PR aims to detect possible regressions in `SparkPlan`. To that end, it adds a base test suite called `PlanStabilitySuite`. Its basic workflow is similar to that of `SQLQueryTestSuite`: it also uses `SPARK_GENERATE_GOLDEN_FILES` to decide whether to regenerate the golden files or to compare against the golden result for each input query. The difference is that `PlanStabilitySuite` uses the serialized explain result (`.txt` format) of the `SparkPlan` as the output of a query, instead of the data result.
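The regenerate-or-compare workflow can be sketched as follows. This is a minimal illustration in Python, not the suite's actual Scala code: the helper names `check_against_golden` and `should_regenerate` are hypothetical, and treating the value `"1"` as "regenerate" is an assumption.

```python
import os
from pathlib import Path


def should_regenerate() -> bool:
    # Assumption: the value "1" switches the suite into regeneration mode.
    return os.environ.get("SPARK_GENERATE_GOLDEN_FILES") == "1"


def check_against_golden(golden_file: Path, actual: str, regenerate: bool) -> bool:
    """Overwrite the golden file when regenerating, otherwise compare against it.

    Returns True when the golden file was (re)written or matches `actual`.
    """
    if regenerate:
        # Regeneration mode: the current output becomes the new golden result.
        golden_file.parent.mkdir(parents=True, exist_ok=True)
        golden_file.write_text(actual, encoding="utf-8")
        return True
    # Comparison mode: any textual difference counts as a plan change.
    expected = golden_file.read_text(encoding="utf-8")
    return expected == actual
```

A test would call `check_against_golden(path, plan_text, should_regenerate())` once per query and fail when it returns `False`.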

Since the `SparkPlan` output is non-deterministic for various reasons, e.g., expression ID changes and expression order changes, we reduce the plan to a simplified version that contains only node names and references. We only identify important nodes, e.g., `Exchange` and `SubqueryExec`, in the simplified plan.
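The two sources of instability named above can be neutralized roughly as shown here. This is a hedged Python sketch, not the suite's implementation: `normalize_ids` and `simplify_node` are hypothetical helpers, and the `#<digits>` pattern is an assumption based on the explain output in this commit.

```python
import re

# Assumption: expression IDs appear as '#' followed by digits, e.g. c_customer_sk#1.
_EXPR_ID = re.compile(r"#\d+")


def normalize_ids(plan: str) -> str:
    """Renumber expression IDs in order of first appearance (#1, #2, ...)."""
    mapping = {}

    def renumber(match):
        old = match.group(0)
        if old not in mapping:
            mapping[old] = f"#{len(mapping) + 1}"
        return mapping[old]

    return _EXPR_ID.sub(renumber, plan)


def simplify_node(name: str, references: list) -> str:
    """Keep only the node name and its references, without IDs, sorted."""
    stripped = {re.sub(r"#\d+$", "", ref) for ref in references}
    return f"{name} [{','.join(sorted(stripped))}]"
```

For example, `simplify_node("Filter", ["c_customer_sk#1", "c_current_cdemo_sk#2", "c_current_addr_sk#3"])` gives `Filter [c_current_addr_sk,c_current_cdemo_sk,c_customer_sk]`. Sorting the references makes the result independent of expression order, and dropping the IDs makes it independent of ID allocation.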

We reuse the TPC-DS queries (v1.4, v2.7, modified) to test plan stability. Currently, each TPC-DS query can have only one corresponding simplified golden plan.

This PR also includes a small refactoring that extracts `TPCDSBase` from `TPCDSQuerySuite`, so that `PlanStabilitySuite` can use the TPC-DS queries as well.

### Why are the changes needed?

Spark is becoming increasingly complex, and any change might unintentionally cause a regression. Spark already has some benchmarks to catch performance regressions, but it does not yet have a way to detect regressions inside `SparkPlan`. It would be good to detect possible regressions early, during the compile phase, before the runtime phase.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added `PlanStabilitySuite` and its subclasses.

Closes #29270 from Ngone51/plan-stable.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
yi.wu 2020-08-17 14:22:12 +00:00 committed by Wenchen Fan
parent b94c67b502
commit 9f2893cf2c
627 changed files with 130898 additions and 65 deletions

@ -0,0 +1,286 @@
== Physical Plan ==
TakeOrderedAndProject (52)
+- * HashAggregate (51)
+- Exchange (50)
+- * HashAggregate (49)
+- * Project (48)
+- * BroadcastHashJoin Inner BuildLeft (47)
:- BroadcastExchange (43)
: +- * Project (42)
: +- * BroadcastHashJoin Inner BuildRight (41)
: :- * Project (35)
: : +- SortMergeJoin LeftSemi (34)
: : :- SortMergeJoin LeftSemi (25)
: : : :- * Sort (5)
: : : : +- Exchange (4)
: : : : +- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.customer (1)
: : : +- * Sort (24)
: : : +- Exchange (23)
: : : +- Union (22)
: : : :- * Project (15)
: : : : +- * BroadcastHashJoin Inner BuildRight (14)
: : : : :- * Filter (8)
: : : : : +- * ColumnarToRow (7)
: : : : : +- Scan parquet default.web_sales (6)
: : : : +- BroadcastExchange (13)
: : : : +- * Project (12)
: : : : +- * Filter (11)
: : : : +- * ColumnarToRow (10)
: : : : +- Scan parquet default.date_dim (9)
: : : +- * Project (21)
: : : +- * BroadcastHashJoin Inner BuildRight (20)
: : : :- * Filter (18)
: : : : +- * ColumnarToRow (17)
: : : : +- Scan parquet default.catalog_sales (16)
: : : +- ReusedExchange (19)
: : +- * Sort (33)
: : +- Exchange (32)
: : +- * Project (31)
: : +- * BroadcastHashJoin Inner BuildRight (30)
: : :- * Filter (28)
: : : +- * ColumnarToRow (27)
: : : +- Scan parquet default.store_sales (26)
: : +- ReusedExchange (29)
: +- BroadcastExchange (40)
: +- * Project (39)
: +- * Filter (38)
: +- * ColumnarToRow (37)
: +- Scan parquet default.customer_address (36)
+- * Filter (46)
+- * ColumnarToRow (45)
+- Scan parquet default.customer_demographics (44)
(1) Scan parquet default.customer
Output [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_current_addr_sk), IsNotNull(c_current_cdemo_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_cdemo_sk:int,c_current_addr_sk:int>
(2) ColumnarToRow [codegen id : 1]
Input [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
(3) Filter [codegen id : 1]
Input [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
Condition : ((isnotnull(c_customer_sk#1) AND isnotnull(c_current_addr_sk#3)) AND isnotnull(c_current_cdemo_sk#2))
(4) Exchange
Input [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
Arguments: hashpartitioning(c_customer_sk#1, 5), true, [id=#4]
(5) Sort [codegen id : 2]
Input [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
Arguments: [c_customer_sk#1 ASC NULLS FIRST], false, 0
(6) Scan parquet default.web_sales
Output [2]: [ws_sold_date_sk#5, ws_bill_customer_sk#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/web_sales]
PushedFilters: [IsNotNull(ws_sold_date_sk), IsNotNull(ws_bill_customer_sk)]
ReadSchema: struct<ws_sold_date_sk:int,ws_bill_customer_sk:int>
(7) ColumnarToRow [codegen id : 4]
Input [2]: [ws_sold_date_sk#5, ws_bill_customer_sk#6]
(8) Filter [codegen id : 4]
Input [2]: [ws_sold_date_sk#5, ws_bill_customer_sk#6]
Condition : (isnotnull(ws_sold_date_sk#5) AND isnotnull(ws_bill_customer_sk#6))
(9) Scan parquet default.date_dim
Output [3]: [d_date_sk#7, d_year#8, d_moy#9]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_year,2002), GreaterThanOrEqual(d_moy,4), LessThanOrEqual(d_moy,7), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(10) ColumnarToRow [codegen id : 3]
Input [3]: [d_date_sk#7, d_year#8, d_moy#9]
(11) Filter [codegen id : 3]
Input [3]: [d_date_sk#7, d_year#8, d_moy#9]
Condition : (((((isnotnull(d_moy#9) AND isnotnull(d_year#8)) AND (d_year#8 = 2002)) AND (d_moy#9 >= 4)) AND (d_moy#9 <= 7)) AND isnotnull(d_date_sk#7))
(12) Project [codegen id : 3]
Output [1]: [d_date_sk#7]
Input [3]: [d_date_sk#7, d_year#8, d_moy#9]
(13) BroadcastExchange
Input [1]: [d_date_sk#7]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#10]
(14) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ws_sold_date_sk#5]
Right keys [1]: [d_date_sk#7]
Join condition: None
(15) Project [codegen id : 4]
Output [1]: [ws_bill_customer_sk#6 AS customer_sk#11]
Input [3]: [ws_sold_date_sk#5, ws_bill_customer_sk#6, d_date_sk#7]
(16) Scan parquet default.catalog_sales
Output [2]: [cs_sold_date_sk#12, cs_ship_customer_sk#13]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/catalog_sales]
PushedFilters: [IsNotNull(cs_sold_date_sk), IsNotNull(cs_ship_customer_sk)]
ReadSchema: struct<cs_sold_date_sk:int,cs_ship_customer_sk:int>
(17) ColumnarToRow [codegen id : 6]
Input [2]: [cs_sold_date_sk#12, cs_ship_customer_sk#13]
(18) Filter [codegen id : 6]
Input [2]: [cs_sold_date_sk#12, cs_ship_customer_sk#13]
Condition : (isnotnull(cs_sold_date_sk#12) AND isnotnull(cs_ship_customer_sk#13))
(19) ReusedExchange [Reuses operator id: 13]
Output [1]: [d_date_sk#7]
(20) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [cs_sold_date_sk#12]
Right keys [1]: [d_date_sk#7]
Join condition: None
(21) Project [codegen id : 6]
Output [1]: [cs_ship_customer_sk#13 AS customer_sk#14]
Input [3]: [cs_sold_date_sk#12, cs_ship_customer_sk#13, d_date_sk#7]
(22) Union
(23) Exchange
Input [1]: [customer_sk#11]
Arguments: hashpartitioning(customer_sk#11, 5), true, [id=#15]
(24) Sort [codegen id : 7]
Input [1]: [customer_sk#11]
Arguments: [customer_sk#11 ASC NULLS FIRST], false, 0
(25) SortMergeJoin
Left keys [1]: [c_customer_sk#1]
Right keys [1]: [customer_sk#11]
Join condition: None
(26) Scan parquet default.store_sales
Output [2]: [ss_sold_date_sk#16, ss_customer_sk#17]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int>
(27) ColumnarToRow [codegen id : 9]
Input [2]: [ss_sold_date_sk#16, ss_customer_sk#17]
(28) Filter [codegen id : 9]
Input [2]: [ss_sold_date_sk#16, ss_customer_sk#17]
Condition : (isnotnull(ss_sold_date_sk#16) AND isnotnull(ss_customer_sk#17))
(29) ReusedExchange [Reuses operator id: 13]
Output [1]: [d_date_sk#7]
(30) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ss_sold_date_sk#16]
Right keys [1]: [d_date_sk#7]
Join condition: None
(31) Project [codegen id : 9]
Output [1]: [ss_customer_sk#17 AS customer_sk#18]
Input [3]: [ss_sold_date_sk#16, ss_customer_sk#17, d_date_sk#7]
(32) Exchange
Input [1]: [customer_sk#18]
Arguments: hashpartitioning(customer_sk#18, 5), true, [id=#19]
(33) Sort [codegen id : 10]
Input [1]: [customer_sk#18]
Arguments: [customer_sk#18 ASC NULLS FIRST], false, 0
(34) SortMergeJoin
Left keys [1]: [c_customer_sk#1]
Right keys [1]: [customer_sk#18]
Join condition: None
(35) Project [codegen id : 12]
Output [2]: [c_current_cdemo_sk#2, c_current_addr_sk#3]
Input [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
(36) Scan parquet default.customer_address
Output [2]: [ca_address_sk#20, ca_county#21]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer_address]
PushedFilters: [In(ca_county, [Walker County,Richland County,Gaines County,Douglas County,Dona Ana County]), IsNotNull(ca_address_sk)]
ReadSchema: struct<ca_address_sk:int,ca_county:string>
(37) ColumnarToRow [codegen id : 11]
Input [2]: [ca_address_sk#20, ca_county#21]
(38) Filter [codegen id : 11]
Input [2]: [ca_address_sk#20, ca_county#21]
Condition : (ca_county#21 IN (Walker County,Richland County,Gaines County,Douglas County,Dona Ana County) AND isnotnull(ca_address_sk#20))
(39) Project [codegen id : 11]
Output [1]: [ca_address_sk#20]
Input [2]: [ca_address_sk#20, ca_county#21]
(40) BroadcastExchange
Input [1]: [ca_address_sk#20]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#22]
(41) BroadcastHashJoin [codegen id : 12]
Left keys [1]: [c_current_addr_sk#3]
Right keys [1]: [ca_address_sk#20]
Join condition: None
(42) Project [codegen id : 12]
Output [1]: [c_current_cdemo_sk#2]
Input [3]: [c_current_cdemo_sk#2, c_current_addr_sk#3, ca_address_sk#20]
(43) BroadcastExchange
Input [1]: [c_current_cdemo_sk#2]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#23]
(44) Scan parquet default.customer_demographics
Output [9]: [cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer_demographics]
PushedFilters: [IsNotNull(cd_demo_sk)]
ReadSchema: struct<cd_demo_sk:int,cd_gender:string,cd_marital_status:string,cd_education_status:string,cd_purchase_estimate:int,cd_credit_rating:string,cd_dep_count:int,cd_dep_employed_count:int,cd_dep_college_count:int>
(45) ColumnarToRow
Input [9]: [cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
(46) Filter
Input [9]: [cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Condition : isnotnull(cd_demo_sk#24)
(47) BroadcastHashJoin [codegen id : 13]
Left keys [1]: [c_current_cdemo_sk#2]
Right keys [1]: [cd_demo_sk#24]
Join condition: None
(48) Project [codegen id : 13]
Output [8]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Input [10]: [c_current_cdemo_sk#2, cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
(49) HashAggregate [codegen id : 13]
Input [8]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Keys [8]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#33]
Results [9]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32, count#34]
(50) Exchange
Input [9]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32, count#34]
Arguments: hashpartitioning(cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32, 5), true, [id=#35]
(51) HashAggregate [codegen id : 14]
Input [9]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32, count#34]
Keys [8]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#36]
Results [14]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, count(1)#36 AS cnt1#37, cd_purchase_estimate#28, count(1)#36 AS cnt2#38, cd_credit_rating#29, count(1)#36 AS cnt3#39, cd_dep_count#30, count(1)#36 AS cnt4#40, cd_dep_employed_count#31, count(1)#36 AS cnt5#41, cd_dep_college_count#32, count(1)#36 AS cnt6#42]
(52) TakeOrderedAndProject
Input [14]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cnt1#37, cd_purchase_estimate#28, cnt2#38, cd_credit_rating#29, cnt3#39, cd_dep_count#30, cnt4#40, cd_dep_employed_count#31, cnt5#41, cd_dep_college_count#32, cnt6#42]
Arguments: 100, [cd_gender#25 ASC NULLS FIRST, cd_marital_status#26 ASC NULLS FIRST, cd_education_status#27 ASC NULLS FIRST, cd_purchase_estimate#28 ASC NULLS FIRST, cd_credit_rating#29 ASC NULLS FIRST, cd_dep_count#30 ASC NULLS FIRST, cd_dep_employed_count#31 ASC NULLS FIRST, cd_dep_college_count#32 ASC NULLS FIRST], [cd_gender#25, cd_marital_status#26, cd_education_status#27, cnt1#37, cd_purchase_estimate#28, cnt2#38, cd_credit_rating#29, cnt3#39, cd_dep_count#30, cnt4#40, cd_dep_employed_count#31, cnt5#41, cd_dep_college_count#32, cnt6#42]

@ -0,0 +1,81 @@
TakeOrderedAndProject [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate,cnt1,cnt2,cnt3,cnt4,cnt5,cnt6]
WholeStageCodegen (14)
HashAggregate [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate,count] [cnt1,cnt2,cnt3,cnt4,cnt5,cnt6,count,count(1)]
InputAdapter
Exchange [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate] #1
WholeStageCodegen (13)
HashAggregate [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate] [count,count]
Project [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate]
BroadcastHashJoin [c_current_cdemo_sk,cd_demo_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (12)
Project [c_current_cdemo_sk]
BroadcastHashJoin [c_current_addr_sk,ca_address_sk]
Project [c_current_addr_sk,c_current_cdemo_sk]
InputAdapter
SortMergeJoin [c_customer_sk,customer_sk]
SortMergeJoin [c_customer_sk,customer_sk]
WholeStageCodegen (2)
Sort [c_customer_sk]
InputAdapter
Exchange [c_customer_sk] #3
WholeStageCodegen (1)
Filter [c_current_addr_sk,c_current_cdemo_sk,c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_current_addr_sk,c_current_cdemo_sk,c_customer_sk]
WholeStageCodegen (7)
Sort [customer_sk]
InputAdapter
Exchange [customer_sk] #4
Union
WholeStageCodegen (4)
Project [ws_bill_customer_sk]
BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
Filter [ws_bill_customer_sk,ws_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.web_sales [ws_bill_customer_sk,ws_sold_date_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Project [d_date_sk]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
WholeStageCodegen (6)
Project [cs_ship_customer_sk]
BroadcastHashJoin [cs_sold_date_sk,d_date_sk]
Filter [cs_ship_customer_sk,cs_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.catalog_sales [cs_ship_customer_sk,cs_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #5
WholeStageCodegen (10)
Sort [customer_sk]
InputAdapter
Exchange [customer_sk] #6
WholeStageCodegen (9)
Project [ss_customer_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_customer_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #5
InputAdapter
BroadcastExchange #7
WholeStageCodegen (11)
Project [ca_address_sk]
Filter [ca_address_sk,ca_county]
ColumnarToRow
InputAdapter
Scan parquet default.customer_address [ca_address_sk,ca_county]
Filter [cd_demo_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer_demographics [cd_credit_rating,cd_demo_sk,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate]

@ -0,0 +1,266 @@
== Physical Plan ==
TakeOrderedAndProject (48)
+- * HashAggregate (47)
+- Exchange (46)
+- * HashAggregate (45)
+- * Project (44)
+- * BroadcastHashJoin Inner BuildRight (43)
:- * Project (38)
: +- * BroadcastHashJoin Inner BuildRight (37)
: :- * Project (31)
: : +- * BroadcastHashJoin LeftSemi BuildRight (30)
: : :- * BroadcastHashJoin LeftSemi BuildRight (22)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.customer (1)
: : : +- BroadcastExchange (21)
: : : +- Union (20)
: : : :- * Project (13)
: : : : +- * BroadcastHashJoin Inner BuildRight (12)
: : : : :- * Filter (6)
: : : : : +- * ColumnarToRow (5)
: : : : : +- Scan parquet default.web_sales (4)
: : : : +- BroadcastExchange (11)
: : : : +- * Project (10)
: : : : +- * Filter (9)
: : : : +- * ColumnarToRow (8)
: : : : +- Scan parquet default.date_dim (7)
: : : +- * Project (19)
: : : +- * BroadcastHashJoin Inner BuildRight (18)
: : : :- * Filter (16)
: : : : +- * ColumnarToRow (15)
: : : : +- Scan parquet default.catalog_sales (14)
: : : +- ReusedExchange (17)
: : +- BroadcastExchange (29)
: : +- * Project (28)
: : +- * BroadcastHashJoin Inner BuildRight (27)
: : :- * Filter (25)
: : : +- * ColumnarToRow (24)
: : : +- Scan parquet default.store_sales (23)
: : +- ReusedExchange (26)
: +- BroadcastExchange (36)
: +- * Project (35)
: +- * Filter (34)
: +- * ColumnarToRow (33)
: +- Scan parquet default.customer_address (32)
+- BroadcastExchange (42)
+- * Filter (41)
+- * ColumnarToRow (40)
+- Scan parquet default.customer_demographics (39)
(1) Scan parquet default.customer
Output [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_current_addr_sk), IsNotNull(c_current_cdemo_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_cdemo_sk:int,c_current_addr_sk:int>
(2) ColumnarToRow [codegen id : 9]
Input [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
(3) Filter [codegen id : 9]
Input [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
Condition : ((isnotnull(c_customer_sk#1) AND isnotnull(c_current_addr_sk#3)) AND isnotnull(c_current_cdemo_sk#2))
(4) Scan parquet default.web_sales
Output [2]: [ws_sold_date_sk#4, ws_bill_customer_sk#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/web_sales]
PushedFilters: [IsNotNull(ws_sold_date_sk), IsNotNull(ws_bill_customer_sk)]
ReadSchema: struct<ws_sold_date_sk:int,ws_bill_customer_sk:int>
(5) ColumnarToRow [codegen id : 2]
Input [2]: [ws_sold_date_sk#4, ws_bill_customer_sk#5]
(6) Filter [codegen id : 2]
Input [2]: [ws_sold_date_sk#4, ws_bill_customer_sk#5]
Condition : (isnotnull(ws_sold_date_sk#4) AND isnotnull(ws_bill_customer_sk#5))
(7) Scan parquet default.date_dim
Output [3]: [d_date_sk#6, d_year#7, d_moy#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_year), IsNotNull(d_moy), EqualTo(d_year,2002), GreaterThanOrEqual(d_moy,4), LessThanOrEqual(d_moy,7), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(8) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_moy#8]
(9) Filter [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_moy#8]
Condition : (((((isnotnull(d_year#7) AND isnotnull(d_moy#8)) AND (d_year#7 = 2002)) AND (d_moy#8 >= 4)) AND (d_moy#8 <= 7)) AND isnotnull(d_date_sk#6))
(10) Project [codegen id : 1]
Output [1]: [d_date_sk#6]
Input [3]: [d_date_sk#6, d_year#7, d_moy#8]
(11) BroadcastExchange
Input [1]: [d_date_sk#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#9]
(12) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [ws_sold_date_sk#4]
Right keys [1]: [d_date_sk#6]
Join condition: None
(13) Project [codegen id : 2]
Output [1]: [ws_bill_customer_sk#5 AS customer_sk#10]
Input [3]: [ws_sold_date_sk#4, ws_bill_customer_sk#5, d_date_sk#6]
(14) Scan parquet default.catalog_sales
Output [2]: [cs_sold_date_sk#11, cs_ship_customer_sk#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/catalog_sales]
PushedFilters: [IsNotNull(cs_sold_date_sk), IsNotNull(cs_ship_customer_sk)]
ReadSchema: struct<cs_sold_date_sk:int,cs_ship_customer_sk:int>
(15) ColumnarToRow [codegen id : 4]
Input [2]: [cs_sold_date_sk#11, cs_ship_customer_sk#12]
(16) Filter [codegen id : 4]
Input [2]: [cs_sold_date_sk#11, cs_ship_customer_sk#12]
Condition : (isnotnull(cs_sold_date_sk#11) AND isnotnull(cs_ship_customer_sk#12))
(17) ReusedExchange [Reuses operator id: 11]
Output [1]: [d_date_sk#6]
(18) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [cs_sold_date_sk#11]
Right keys [1]: [d_date_sk#6]
Join condition: None
(19) Project [codegen id : 4]
Output [1]: [cs_ship_customer_sk#12 AS customer_sk#13]
Input [3]: [cs_sold_date_sk#11, cs_ship_customer_sk#12, d_date_sk#6]
(20) Union
(21) BroadcastExchange
Input [1]: [customer_sk#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#14]
(22) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [c_customer_sk#1]
Right keys [1]: [customer_sk#10]
Join condition: None
(23) Scan parquet default.store_sales
Output [2]: [ss_sold_date_sk#15, ss_customer_sk#16]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int>
(24) ColumnarToRow [codegen id : 6]
Input [2]: [ss_sold_date_sk#15, ss_customer_sk#16]
(25) Filter [codegen id : 6]
Input [2]: [ss_sold_date_sk#15, ss_customer_sk#16]
Condition : (isnotnull(ss_sold_date_sk#15) AND isnotnull(ss_customer_sk#16))
(26) ReusedExchange [Reuses operator id: 11]
Output [1]: [d_date_sk#6]
(27) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_sold_date_sk#15]
Right keys [1]: [d_date_sk#6]
Join condition: None
(28) Project [codegen id : 6]
Output [1]: [ss_customer_sk#16 AS customer_sk#17]
Input [3]: [ss_sold_date_sk#15, ss_customer_sk#16, d_date_sk#6]
(29) BroadcastExchange
Input [1]: [customer_sk#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#18]
(30) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [c_customer_sk#1]
Right keys [1]: [customer_sk#17]
Join condition: None
(31) Project [codegen id : 9]
Output [2]: [c_current_cdemo_sk#2, c_current_addr_sk#3]
Input [3]: [c_customer_sk#1, c_current_cdemo_sk#2, c_current_addr_sk#3]
(32) Scan parquet default.customer_address
Output [2]: [ca_address_sk#19, ca_county#20]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer_address]
PushedFilters: [In(ca_county, [Walker County,Richland County,Gaines County,Douglas County,Dona Ana County]), IsNotNull(ca_address_sk)]
ReadSchema: struct<ca_address_sk:int,ca_county:string>
(33) ColumnarToRow [codegen id : 7]
Input [2]: [ca_address_sk#19, ca_county#20]
(34) Filter [codegen id : 7]
Input [2]: [ca_address_sk#19, ca_county#20]
Condition : (ca_county#20 IN (Walker County,Richland County,Gaines County,Douglas County,Dona Ana County) AND isnotnull(ca_address_sk#19))
(35) Project [codegen id : 7]
Output [1]: [ca_address_sk#19]
Input [2]: [ca_address_sk#19, ca_county#20]
(36) BroadcastExchange
Input [1]: [ca_address_sk#19]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#21]
(37) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [c_current_addr_sk#3]
Right keys [1]: [ca_address_sk#19]
Join condition: None
(38) Project [codegen id : 9]
Output [1]: [c_current_cdemo_sk#2]
Input [3]: [c_current_cdemo_sk#2, c_current_addr_sk#3, ca_address_sk#19]
(39) Scan parquet default.customer_demographics
Output [9]: [cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer_demographics]
PushedFilters: [IsNotNull(cd_demo_sk)]
ReadSchema: struct<cd_demo_sk:int,cd_gender:string,cd_marital_status:string,cd_education_status:string,cd_purchase_estimate:int,cd_credit_rating:string,cd_dep_count:int,cd_dep_employed_count:int,cd_dep_college_count:int>
(40) ColumnarToRow [codegen id : 8]
Input [9]: [cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
(41) Filter [codegen id : 8]
Input [9]: [cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Condition : isnotnull(cd_demo_sk#22)
(42) BroadcastExchange
Input [9]: [cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#31]
(43) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [c_current_cdemo_sk#2]
Right keys [1]: [cd_demo_sk#22]
Join condition: None
(44) Project [codegen id : 9]
Output [8]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Input [10]: [c_current_cdemo_sk#2, cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
(45) HashAggregate [codegen id : 9]
Input [8]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Keys [8]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#32]
Results [9]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30, count#33]
(46) Exchange
Input [9]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30, count#33]
Arguments: hashpartitioning(cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30, 5), true, [id=#34]
(47) HashAggregate [codegen id : 10]
Input [9]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30, count#33]
Keys [8]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#35]
Results [14]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, count(1)#35 AS cnt1#36, cd_purchase_estimate#26, count(1)#35 AS cnt2#37, cd_credit_rating#27, count(1)#35 AS cnt3#38, cd_dep_count#28, count(1)#35 AS cnt4#39, cd_dep_employed_count#29, count(1)#35 AS cnt5#40, cd_dep_college_count#30, count(1)#35 AS cnt6#41]
(48) TakeOrderedAndProject
Input [14]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cnt1#36, cd_purchase_estimate#26, cnt2#37, cd_credit_rating#27, cnt3#38, cd_dep_count#28, cnt4#39, cd_dep_employed_count#29, cnt5#40, cd_dep_college_count#30, cnt6#41]
Arguments: 100, [cd_gender#23 ASC NULLS FIRST, cd_marital_status#24 ASC NULLS FIRST, cd_education_status#25 ASC NULLS FIRST, cd_purchase_estimate#26 ASC NULLS FIRST, cd_credit_rating#27 ASC NULLS FIRST, cd_dep_count#28 ASC NULLS FIRST, cd_dep_employed_count#29 ASC NULLS FIRST, cd_dep_college_count#30 ASC NULLS FIRST], [cd_gender#23, cd_marital_status#24, cd_education_status#25, cnt1#36, cd_purchase_estimate#26, cnt2#37, cd_credit_rating#27, cnt3#38, cd_dep_count#28, cnt4#39, cd_dep_employed_count#29, cnt5#40, cd_dep_college_count#30, cnt6#41]

@ -0,0 +1,71 @@
TakeOrderedAndProject [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate,cnt1,cnt2,cnt3,cnt4,cnt5,cnt6]
WholeStageCodegen (10)
HashAggregate [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate,count] [cnt1,cnt2,cnt3,cnt4,cnt5,cnt6,count,count(1)]
InputAdapter
Exchange [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate] #1
WholeStageCodegen (9)
HashAggregate [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate] [count,count]
Project [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate]
BroadcastHashJoin [c_current_cdemo_sk,cd_demo_sk]
Project [c_current_cdemo_sk]
BroadcastHashJoin [c_current_addr_sk,ca_address_sk]
Project [c_current_addr_sk,c_current_cdemo_sk]
BroadcastHashJoin [c_customer_sk,customer_sk]
BroadcastHashJoin [c_customer_sk,customer_sk]
Filter [c_current_addr_sk,c_current_cdemo_sk,c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_current_addr_sk,c_current_cdemo_sk,c_customer_sk]
InputAdapter
BroadcastExchange #2
Union
WholeStageCodegen (2)
Project [ws_bill_customer_sk]
BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
Filter [ws_bill_customer_sk,ws_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.web_sales [ws_bill_customer_sk,ws_sold_date_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
WholeStageCodegen (4)
Project [cs_ship_customer_sk]
BroadcastHashJoin [cs_sold_date_sk,d_date_sk]
Filter [cs_ship_customer_sk,cs_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.catalog_sales [cs_ship_customer_sk,cs_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #3
InputAdapter
BroadcastExchange #4
WholeStageCodegen (6)
Project [ss_customer_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_customer_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #3
InputAdapter
BroadcastExchange #5
WholeStageCodegen (7)
Project [ca_address_sk]
Filter [ca_address_sk,ca_county]
ColumnarToRow
InputAdapter
Scan parquet default.customer_address [ca_address_sk,ca_county]
InputAdapter
BroadcastExchange #6
WholeStageCodegen (8)
Filter [cd_demo_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer_demographics [cd_credit_rating,cd_demo_sk,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate]

@ -0,0 +1,221 @@
== Physical Plan ==
TakeOrderedAndProject (39)
+- * HashAggregate (38)
+- Exchange (37)
+- * HashAggregate (36)
+- * Project (35)
+- * BroadcastHashJoin Inner BuildRight (34)
:- * Project (28)
: +- * BroadcastHashJoin Inner BuildLeft (27)
: :- BroadcastExchange (23)
: : +- * Project (22)
: : +- * BroadcastHashJoin Inner BuildRight (21)
: : :- * Project (16)
: : : +- * BroadcastHashJoin Inner BuildLeft (15)
: : : :- BroadcastExchange (11)
: : : : +- * Project (10)
: : : : +- * BroadcastHashJoin Inner BuildLeft (9)
: : : : :- BroadcastExchange (5)
: : : : : +- * Project (4)
: : : : : +- * Filter (3)
: : : : : +- * ColumnarToRow (2)
: : : : : +- Scan parquet default.date_dim (1)
: : : : +- * Filter (8)
: : : : +- * ColumnarToRow (7)
: : : : +- Scan parquet default.store_sales (6)
: : : +- * Filter (14)
: : : +- * ColumnarToRow (13)
: : : +- Scan parquet default.customer (12)
: : +- BroadcastExchange (20)
: : +- * Filter (19)
: : +- * ColumnarToRow (18)
: : +- Scan parquet default.store (17)
: +- * Filter (26)
: +- * ColumnarToRow (25)
: +- Scan parquet default.customer_address (24)
+- BroadcastExchange (33)
+- * Project (32)
+- * Filter (31)
+- * ColumnarToRow (30)
+- Scan parquet default.item (29)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_moy#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_moy,11), EqualTo(d_year,1999), GreaterThanOrEqual(d_date_sk,2451484), LessThanOrEqual(d_date_sk,2451513), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(2) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(3) Filter [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
Condition : ((((((isnotnull(d_moy#3) AND isnotnull(d_year#2)) AND (d_moy#3 = 11)) AND (d_year#2 = 1999)) AND (d_date_sk#1 >= 2451484)) AND (d_date_sk#1 <= 2451513)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 1]
Output [1]: [d_date_sk#1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(5) BroadcastExchange
Input [1]: [d_date_sk#1]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#4]
(6) Scan parquet default.store_sales
Output [5]: [ss_sold_date_sk#5, ss_item_sk#6, ss_customer_sk#7, ss_store_sk#8, ss_ext_sales_price#9]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451484), LessThanOrEqual(ss_sold_date_sk,2451513), IsNotNull(ss_item_sk), IsNotNull(ss_customer_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_customer_sk:int,ss_store_sk:int,ss_ext_sales_price:decimal(7,2)>
(7) ColumnarToRow
Input [5]: [ss_sold_date_sk#5, ss_item_sk#6, ss_customer_sk#7, ss_store_sk#8, ss_ext_sales_price#9]
(8) Filter
Input [5]: [ss_sold_date_sk#5, ss_item_sk#6, ss_customer_sk#7, ss_store_sk#8, ss_ext_sales_price#9]
Condition : (((((isnotnull(ss_sold_date_sk#5) AND (ss_sold_date_sk#5 >= 2451484)) AND (ss_sold_date_sk#5 <= 2451513)) AND isnotnull(ss_item_sk#6)) AND isnotnull(ss_customer_sk#7)) AND isnotnull(ss_store_sk#8))
(9) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#5]
Join condition: None
(10) Project [codegen id : 2]
Output [4]: [ss_item_sk#6, ss_customer_sk#7, ss_store_sk#8, ss_ext_sales_price#9]
Input [6]: [d_date_sk#1, ss_sold_date_sk#5, ss_item_sk#6, ss_customer_sk#7, ss_store_sk#8, ss_ext_sales_price#9]
(11) BroadcastExchange
Input [4]: [ss_item_sk#6, ss_customer_sk#7, ss_store_sk#8, ss_ext_sales_price#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, true] as bigint)),false), [id=#10]
(12) Scan parquet default.customer
Output [2]: [c_customer_sk#11, c_current_addr_sk#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_current_addr_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_addr_sk:int>
(13) ColumnarToRow
Input [2]: [c_customer_sk#11, c_current_addr_sk#12]
(14) Filter
Input [2]: [c_customer_sk#11, c_current_addr_sk#12]
Condition : (isnotnull(c_customer_sk#11) AND isnotnull(c_current_addr_sk#12))
(15) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_customer_sk#7]
Right keys [1]: [c_customer_sk#11]
Join condition: None
(16) Project [codegen id : 4]
Output [4]: [ss_item_sk#6, ss_store_sk#8, ss_ext_sales_price#9, c_current_addr_sk#12]
Input [6]: [ss_item_sk#6, ss_customer_sk#7, ss_store_sk#8, ss_ext_sales_price#9, c_customer_sk#11, c_current_addr_sk#12]
(17) Scan parquet default.store
Output [2]: [s_store_sk#13, s_zip#14]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_zip), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_zip:string>
(18) ColumnarToRow [codegen id : 3]
Input [2]: [s_store_sk#13, s_zip#14]
(19) Filter [codegen id : 3]
Input [2]: [s_store_sk#13, s_zip#14]
Condition : (isnotnull(s_zip#14) AND isnotnull(s_store_sk#13))
(20) BroadcastExchange
Input [2]: [s_store_sk#13, s_zip#14]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#15]
(21) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#8]
Right keys [1]: [s_store_sk#13]
Join condition: None
(22) Project [codegen id : 4]
Output [4]: [ss_item_sk#6, ss_ext_sales_price#9, c_current_addr_sk#12, s_zip#14]
Input [6]: [ss_item_sk#6, ss_store_sk#8, ss_ext_sales_price#9, c_current_addr_sk#12, s_store_sk#13, s_zip#14]
(23) BroadcastExchange
Input [4]: [ss_item_sk#6, ss_ext_sales_price#9, c_current_addr_sk#12, s_zip#14]
Arguments: HashedRelationBroadcastMode(List(cast(input[2, int, true] as bigint)),false), [id=#16]
(24) Scan parquet default.customer_address
Output [2]: [ca_address_sk#17, ca_zip#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer_address]
PushedFilters: [IsNotNull(ca_address_sk), IsNotNull(ca_zip)]
ReadSchema: struct<ca_address_sk:int,ca_zip:string>
(25) ColumnarToRow
Input [2]: [ca_address_sk#17, ca_zip#18]
(26) Filter
Input [2]: [ca_address_sk#17, ca_zip#18]
Condition : (isnotnull(ca_address_sk#17) AND isnotnull(ca_zip#18))
(27) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [c_current_addr_sk#12]
Right keys [1]: [ca_address_sk#17]
Join condition: NOT (substr(ca_zip#18, 1, 5) = substr(s_zip#14, 1, 5))
(28) Project [codegen id : 6]
Output [2]: [ss_item_sk#6, ss_ext_sales_price#9]
Input [6]: [ss_item_sk#6, ss_ext_sales_price#9, c_current_addr_sk#12, s_zip#14, ca_address_sk#17, ca_zip#18]
(29) Scan parquet default.item
Output [6]: [i_item_sk#19, i_brand_id#20, i_brand#21, i_manufact_id#22, i_manufact#23, i_manager_id#24]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [IsNotNull(i_manager_id), EqualTo(i_manager_id,7), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand_id:int,i_brand:string,i_manufact_id:int,i_manufact:string,i_manager_id:int>
(30) ColumnarToRow [codegen id : 5]
Input [6]: [i_item_sk#19, i_brand_id#20, i_brand#21, i_manufact_id#22, i_manufact#23, i_manager_id#24]
(31) Filter [codegen id : 5]
Input [6]: [i_item_sk#19, i_brand_id#20, i_brand#21, i_manufact_id#22, i_manufact#23, i_manager_id#24]
Condition : ((isnotnull(i_manager_id#24) AND (i_manager_id#24 = 7)) AND isnotnull(i_item_sk#19))
(32) Project [codegen id : 5]
Output [5]: [i_item_sk#19, i_brand_id#20, i_brand#21, i_manufact_id#22, i_manufact#23]
Input [6]: [i_item_sk#19, i_brand_id#20, i_brand#21, i_manufact_id#22, i_manufact#23, i_manager_id#24]
(33) BroadcastExchange
Input [5]: [i_item_sk#19, i_brand_id#20, i_brand#21, i_manufact_id#22, i_manufact#23]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#25]
(34) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_item_sk#6]
Right keys [1]: [i_item_sk#19]
Join condition: None
(35) Project [codegen id : 6]
Output [5]: [ss_ext_sales_price#9, i_brand_id#20, i_brand#21, i_manufact_id#22, i_manufact#23]
Input [7]: [ss_item_sk#6, ss_ext_sales_price#9, i_item_sk#19, i_brand_id#20, i_brand#21, i_manufact_id#22, i_manufact#23]
(36) HashAggregate [codegen id : 6]
Input [5]: [ss_ext_sales_price#9, i_brand_id#20, i_brand#21, i_manufact_id#22, i_manufact#23]
Keys [4]: [i_brand#21, i_brand_id#20, i_manufact_id#22, i_manufact#23]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#9))]
Aggregate Attributes [1]: [sum#26]
Results [5]: [i_brand#21, i_brand_id#20, i_manufact_id#22, i_manufact#23, sum#27]
(37) Exchange
Input [5]: [i_brand#21, i_brand_id#20, i_manufact_id#22, i_manufact#23, sum#27]
Arguments: hashpartitioning(i_brand#21, i_brand_id#20, i_manufact_id#22, i_manufact#23, 5), true, [id=#28]
(38) HashAggregate [codegen id : 7]
Input [5]: [i_brand#21, i_brand_id#20, i_manufact_id#22, i_manufact#23, sum#27]
Keys [4]: [i_brand#21, i_brand_id#20, i_manufact_id#22, i_manufact#23]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#9))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#9))#29]
Results [5]: [i_brand_id#20 AS brand_id#30, i_brand#21 AS brand#31, i_manufact_id#22, i_manufact#23, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#9))#29,17,2) AS ext_price#32]
(39) TakeOrderedAndProject
Input [5]: [brand_id#30, brand#31, i_manufact_id#22, i_manufact#23, ext_price#32]
Arguments: 100, [ext_price#32 DESC NULLS LAST, brand#31 ASC NULLS FIRST, brand_id#30 ASC NULLS FIRST, i_manufact_id#22 ASC NULLS FIRST, i_manufact#23 ASC NULLS FIRST], [brand_id#30, brand#31, i_manufact_id#22, i_manufact#23, ext_price#32]

@ -0,0 +1,58 @@
TakeOrderedAndProject [brand,brand_id,ext_price,i_manufact,i_manufact_id]
WholeStageCodegen (7)
HashAggregate [i_brand,i_brand_id,i_manufact,i_manufact_id,sum] [brand,brand_id,ext_price,sum,sum(UnscaledValue(ss_ext_sales_price))]
InputAdapter
Exchange [i_brand,i_brand_id,i_manufact,i_manufact_id] #1
WholeStageCodegen (6)
HashAggregate [i_brand,i_brand_id,i_manufact,i_manufact_id,ss_ext_sales_price] [sum,sum]
Project [i_brand,i_brand_id,i_manufact,i_manufact_id,ss_ext_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_ext_sales_price,ss_item_sk]
BroadcastHashJoin [c_current_addr_sk,ca_address_sk,ca_zip,s_zip]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (4)
Project [c_current_addr_sk,s_zip,ss_ext_sales_price,ss_item_sk]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [c_current_addr_sk,ss_ext_sales_price,ss_item_sk,ss_store_sk]
BroadcastHashJoin [c_customer_sk,ss_customer_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [ss_customer_sk,ss_ext_sales_price,ss_item_sk,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
Filter [ss_customer_sk,ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_ext_sales_price,ss_item_sk,ss_sold_date_sk,ss_store_sk]
Filter [c_current_addr_sk,c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_current_addr_sk,c_customer_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Filter [s_store_sk,s_zip]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_sk,s_zip]
Filter [ca_address_sk,ca_zip]
ColumnarToRow
InputAdapter
Scan parquet default.customer_address [ca_address_sk,ca_zip]
InputAdapter
BroadcastExchange #6
WholeStageCodegen (5)
Project [i_brand,i_brand_id,i_item_sk,i_manufact,i_manufact_id]
Filter [i_item_sk,i_manager_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_brand_id,i_item_sk,i_manager_id,i_manufact,i_manufact_id]

@ -0,0 +1,221 @@
== Physical Plan ==
TakeOrderedAndProject (39)
+- * HashAggregate (38)
+- Exchange (37)
+- * HashAggregate (36)
+- * Project (35)
+- * BroadcastHashJoin Inner BuildRight (34)
:- * Project (29)
: +- * BroadcastHashJoin Inner BuildRight (28)
: :- * Project (23)
: : +- * BroadcastHashJoin Inner BuildRight (22)
: : :- * Project (17)
: : : +- * BroadcastHashJoin Inner BuildRight (16)
: : : :- * Project (10)
: : : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : : :- * Project (4)
: : : : : +- * Filter (3)
: : : : : +- * ColumnarToRow (2)
: : : : : +- Scan parquet default.date_dim (1)
: : : : +- BroadcastExchange (8)
: : : : +- * Filter (7)
: : : : +- * ColumnarToRow (6)
: : : : +- Scan parquet default.store_sales (5)
: : : +- BroadcastExchange (15)
: : : +- * Project (14)
: : : +- * Filter (13)
: : : +- * ColumnarToRow (12)
: : : +- Scan parquet default.item (11)
: : +- BroadcastExchange (21)
: : +- * Filter (20)
: : +- * ColumnarToRow (19)
: : +- Scan parquet default.customer (18)
: +- BroadcastExchange (27)
: +- * Filter (26)
: +- * ColumnarToRow (25)
: +- Scan parquet default.customer_address (24)
+- BroadcastExchange (33)
+- * Filter (32)
+- * ColumnarToRow (31)
+- Scan parquet default.store (30)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_moy#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_moy,11), EqualTo(d_year,1999), GreaterThanOrEqual(d_date_sk,2451484), LessThanOrEqual(d_date_sk,2451513), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(2) ColumnarToRow [codegen id : 6]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(3) Filter [codegen id : 6]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
Condition : ((((((isnotnull(d_moy#3) AND isnotnull(d_year#2)) AND (d_moy#3 = 11)) AND (d_year#2 = 1999)) AND (d_date_sk#1 >= 2451484)) AND (d_date_sk#1 <= 2451513)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 6]
Output [1]: [d_date_sk#1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(5) Scan parquet default.store_sales
Output [5]: [ss_sold_date_sk#4, ss_item_sk#5, ss_customer_sk#6, ss_store_sk#7, ss_ext_sales_price#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451484), LessThanOrEqual(ss_sold_date_sk,2451513), IsNotNull(ss_item_sk), IsNotNull(ss_customer_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_customer_sk:int,ss_store_sk:int,ss_ext_sales_price:decimal(7,2)>
(6) ColumnarToRow [codegen id : 1]
Input [5]: [ss_sold_date_sk#4, ss_item_sk#5, ss_customer_sk#6, ss_store_sk#7, ss_ext_sales_price#8]
(7) Filter [codegen id : 1]
Input [5]: [ss_sold_date_sk#4, ss_item_sk#5, ss_customer_sk#6, ss_store_sk#7, ss_ext_sales_price#8]
Condition : (((((isnotnull(ss_sold_date_sk#4) AND (ss_sold_date_sk#4 >= 2451484)) AND (ss_sold_date_sk#4 <= 2451513)) AND isnotnull(ss_item_sk#5)) AND isnotnull(ss_customer_sk#6)) AND isnotnull(ss_store_sk#7))
(8) BroadcastExchange
Input [5]: [ss_sold_date_sk#4, ss_item_sk#5, ss_customer_sk#6, ss_store_sk#7, ss_ext_sales_price#8]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#9]
(9) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#4]
Join condition: None
(10) Project [codegen id : 6]
Output [4]: [ss_item_sk#5, ss_customer_sk#6, ss_store_sk#7, ss_ext_sales_price#8]
Input [6]: [d_date_sk#1, ss_sold_date_sk#4, ss_item_sk#5, ss_customer_sk#6, ss_store_sk#7, ss_ext_sales_price#8]
(11) Scan parquet default.item
Output [6]: [i_item_sk#10, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14, i_manager_id#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [IsNotNull(i_manager_id), EqualTo(i_manager_id,7), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand_id:int,i_brand:string,i_manufact_id:int,i_manufact:string,i_manager_id:int>
(12) ColumnarToRow [codegen id : 2]
Input [6]: [i_item_sk#10, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14, i_manager_id#15]
(13) Filter [codegen id : 2]
Input [6]: [i_item_sk#10, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14, i_manager_id#15]
Condition : ((isnotnull(i_manager_id#15) AND (i_manager_id#15 = 7)) AND isnotnull(i_item_sk#10))
(14) Project [codegen id : 2]
Output [5]: [i_item_sk#10, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14]
Input [6]: [i_item_sk#10, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14, i_manager_id#15]
(15) BroadcastExchange
Input [5]: [i_item_sk#10, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#16]
(16) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_item_sk#5]
Right keys [1]: [i_item_sk#10]
Join condition: None
(17) Project [codegen id : 6]
Output [7]: [ss_customer_sk#6, ss_store_sk#7, ss_ext_sales_price#8, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14]
Input [9]: [ss_item_sk#5, ss_customer_sk#6, ss_store_sk#7, ss_ext_sales_price#8, i_item_sk#10, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14]
(18) Scan parquet default.customer
Output [2]: [c_customer_sk#17, c_current_addr_sk#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_current_addr_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_addr_sk:int>
(19) ColumnarToRow [codegen id : 3]
Input [2]: [c_customer_sk#17, c_current_addr_sk#18]
(20) Filter [codegen id : 3]
Input [2]: [c_customer_sk#17, c_current_addr_sk#18]
Condition : (isnotnull(c_customer_sk#17) AND isnotnull(c_current_addr_sk#18))
(21) BroadcastExchange
Input [2]: [c_customer_sk#17, c_current_addr_sk#18]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#19]
(22) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_customer_sk#6]
Right keys [1]: [c_customer_sk#17]
Join condition: None
(23) Project [codegen id : 6]
Output [7]: [ss_store_sk#7, ss_ext_sales_price#8, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14, c_current_addr_sk#18]
Input [9]: [ss_customer_sk#6, ss_store_sk#7, ss_ext_sales_price#8, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14, c_customer_sk#17, c_current_addr_sk#18]
(24) Scan parquet default.customer_address
Output [2]: [ca_address_sk#20, ca_zip#21]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer_address]
PushedFilters: [IsNotNull(ca_address_sk), IsNotNull(ca_zip)]
ReadSchema: struct<ca_address_sk:int,ca_zip:string>
(25) ColumnarToRow [codegen id : 4]
Input [2]: [ca_address_sk#20, ca_zip#21]
(26) Filter [codegen id : 4]
Input [2]: [ca_address_sk#20, ca_zip#21]
Condition : (isnotnull(ca_address_sk#20) AND isnotnull(ca_zip#21))
(27) BroadcastExchange
Input [2]: [ca_address_sk#20, ca_zip#21]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#22]
(28) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [c_current_addr_sk#18]
Right keys [1]: [ca_address_sk#20]
Join condition: None
(29) Project [codegen id : 6]
Output [7]: [ss_store_sk#7, ss_ext_sales_price#8, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14, ca_zip#21]
Input [9]: [ss_store_sk#7, ss_ext_sales_price#8, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14, c_current_addr_sk#18, ca_address_sk#20, ca_zip#21]
(30) Scan parquet default.store
Output [2]: [s_store_sk#23, s_zip#24]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_zip), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_zip:string>
(31) ColumnarToRow [codegen id : 5]
Input [2]: [s_store_sk#23, s_zip#24]
(32) Filter [codegen id : 5]
Input [2]: [s_store_sk#23, s_zip#24]
Condition : (isnotnull(s_zip#24) AND isnotnull(s_store_sk#23))
(33) BroadcastExchange
Input [2]: [s_store_sk#23, s_zip#24]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#25]
(34) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_store_sk#7]
Right keys [1]: [s_store_sk#23]
Join condition: NOT (substr(ca_zip#21, 1, 5) = substr(s_zip#24, 1, 5))
(35) Project [codegen id : 6]
Output [5]: [ss_ext_sales_price#8, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14]
Input [9]: [ss_store_sk#7, ss_ext_sales_price#8, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14, ca_zip#21, s_store_sk#23, s_zip#24]
(36) HashAggregate [codegen id : 6]
Input [5]: [ss_ext_sales_price#8, i_brand_id#11, i_brand#12, i_manufact_id#13, i_manufact#14]
Keys [4]: [i_brand#12, i_brand_id#11, i_manufact_id#13, i_manufact#14]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#8))]
Aggregate Attributes [1]: [sum#26]
Results [5]: [i_brand#12, i_brand_id#11, i_manufact_id#13, i_manufact#14, sum#27]
(37) Exchange
Input [5]: [i_brand#12, i_brand_id#11, i_manufact_id#13, i_manufact#14, sum#27]
Arguments: hashpartitioning(i_brand#12, i_brand_id#11, i_manufact_id#13, i_manufact#14, 5), true, [id=#28]
(38) HashAggregate [codegen id : 7]
Input [5]: [i_brand#12, i_brand_id#11, i_manufact_id#13, i_manufact#14, sum#27]
Keys [4]: [i_brand#12, i_brand_id#11, i_manufact_id#13, i_manufact#14]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#8))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#8))#29]
Results [5]: [i_brand_id#11 AS brand_id#30, i_brand#12 AS brand#31, i_manufact_id#13, i_manufact#14, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#8))#29,17,2) AS ext_price#32]
(39) TakeOrderedAndProject
Input [5]: [brand_id#30, brand#31, i_manufact_id#13, i_manufact#14, ext_price#32]
Arguments: 100, [ext_price#32 DESC NULLS LAST, brand#31 ASC NULLS FIRST, brand_id#30 ASC NULLS FIRST, i_manufact_id#13 ASC NULLS FIRST, i_manufact#14 ASC NULLS FIRST], [brand_id#30, brand#31, i_manufact_id#13, i_manufact#14, ext_price#32]

@ -0,0 +1,58 @@
TakeOrderedAndProject [brand,brand_id,ext_price,i_manufact,i_manufact_id]
WholeStageCodegen (7)
HashAggregate [i_brand,i_brand_id,i_manufact,i_manufact_id,sum] [brand,brand_id,ext_price,sum,sum(UnscaledValue(ss_ext_sales_price))]
InputAdapter
Exchange [i_brand,i_brand_id,i_manufact,i_manufact_id] #1
WholeStageCodegen (6)
HashAggregate [i_brand,i_brand_id,i_manufact,i_manufact_id,ss_ext_sales_price] [sum,sum]
Project [i_brand,i_brand_id,i_manufact,i_manufact_id,ss_ext_sales_price]
BroadcastHashJoin [ca_zip,s_store_sk,s_zip,ss_store_sk]
Project [ca_zip,i_brand,i_brand_id,i_manufact,i_manufact_id,ss_ext_sales_price,ss_store_sk]
BroadcastHashJoin [c_current_addr_sk,ca_address_sk]
Project [c_current_addr_sk,i_brand,i_brand_id,i_manufact,i_manufact_id,ss_ext_sales_price,ss_store_sk]
BroadcastHashJoin [c_customer_sk,ss_customer_sk]
Project [i_brand,i_brand_id,i_manufact,i_manufact_id,ss_customer_sk,ss_ext_sales_price,ss_store_sk]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_customer_sk,ss_ext_sales_price,ss_item_sk,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [d_date_sk]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Filter [ss_customer_sk,ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_ext_sales_price,ss_item_sk,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [i_brand,i_brand_id,i_item_sk,i_manufact,i_manufact_id]
Filter [i_item_sk,i_manager_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_brand_id,i_item_sk,i_manager_id,i_manufact,i_manufact_id]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (3)
Filter [c_current_addr_sk,c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_current_addr_sk,c_customer_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (4)
Filter [ca_address_sk,ca_zip]
ColumnarToRow
InputAdapter
Scan parquet default.customer_address [ca_address_sk,ca_zip]
InputAdapter
BroadcastExchange #6
WholeStageCodegen (5)
Filter [s_store_sk,s_zip]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_sk,s_zip]

@ -0,0 +1,428 @@
== Physical Plan ==
TakeOrderedAndProject (77)
+- Union (76)
:- * HashAggregate (32)
: +- Exchange (31)
: +- * HashAggregate (30)
: +- * Project (29)
: +- * BroadcastHashJoin Inner BuildRight (28)
: :- * Project (23)
: : +- * BroadcastHashJoin Inner BuildRight (22)
: : :- * Project (17)
: : : +- * BroadcastHashJoin Inner BuildRight (16)
: : : :- * Project (10)
: : : : +- * BroadcastHashJoin Inner BuildLeft (9)
: : : : :- BroadcastExchange (5)
: : : : : +- * Project (4)
: : : : : +- * Filter (3)
: : : : : +- * ColumnarToRow (2)
: : : : : +- Scan parquet default.date_dim (1)
: : : : +- * Filter (8)
: : : : +- * ColumnarToRow (7)
: : : : +- Scan parquet default.store_sales (6)
: : : +- BroadcastExchange (15)
: : : +- * Project (14)
: : : +- * Filter (13)
: : : +- * ColumnarToRow (12)
: : : +- Scan parquet default.customer_demographics (11)
: : +- BroadcastExchange (21)
: : +- * Filter (20)
: : +- * ColumnarToRow (19)
: : +- Scan parquet default.store (18)
: +- BroadcastExchange (27)
: +- * Filter (26)
: +- * ColumnarToRow (25)
: +- Scan parquet default.item (24)
:- * HashAggregate (54)
: +- Exchange (53)
: +- * HashAggregate (52)
: +- * Project (51)
: +- * BroadcastHashJoin Inner BuildRight (50)
: :- * Project (48)
: : +- * BroadcastHashJoin Inner BuildRight (47)
: : :- * Project (45)
: : : +- * BroadcastHashJoin Inner BuildRight (44)
: : : :- * Project (38)
: : : : +- * BroadcastHashJoin Inner BuildLeft (37)
: : : : :- ReusedExchange (33)
: : : : +- * Filter (36)
: : : : +- * ColumnarToRow (35)
: : : : +- Scan parquet default.store_sales (34)
: : : +- BroadcastExchange (43)
: : : +- * Project (42)
: : : +- * Filter (41)
: : : +- * ColumnarToRow (40)
: : : +- Scan parquet default.store (39)
: : +- ReusedExchange (46)
: +- ReusedExchange (49)
+- * HashAggregate (75)
+- Exchange (74)
+- * HashAggregate (73)
+- * Project (72)
+- * BroadcastHashJoin Inner BuildRight (71)
:- * Project (66)
: +- * BroadcastHashJoin Inner BuildRight (65)
: :- * Project (63)
: : +- * BroadcastHashJoin Inner BuildRight (62)
: : :- * Project (60)
: : : +- * BroadcastHashJoin Inner BuildLeft (59)
: : : :- ReusedExchange (55)
: : : +- * Filter (58)
: : : +- * ColumnarToRow (57)
: : : +- Scan parquet default.store_sales (56)
: : +- ReusedExchange (61)
: +- ReusedExchange (64)
+- BroadcastExchange (70)
+- * Filter (69)
+- * ColumnarToRow (68)
+- Scan parquet default.item (67)
(1) Scan parquet default.date_dim
Output [2]: [d_date_sk#1, d_year#2]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2000), LessThanOrEqual(d_date_sk,2451910), GreaterThanOrEqual(d_date_sk,2451545), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(2) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#1, d_year#2]
(3) Filter [codegen id : 1]
Input [2]: [d_date_sk#1, d_year#2]
Condition : ((((isnotnull(d_year#2) AND (d_year#2 = 2000)) AND (d_date_sk#1 <= 2451910)) AND (d_date_sk#1 >= 2451545)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 1]
Output [1]: [d_date_sk#1]
Input [2]: [d_date_sk#1, d_year#2]
(5) BroadcastExchange
Input [1]: [d_date_sk#1]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#3]
(6) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451545), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_cdemo_sk), IsNotNull(ss_store_sk), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_cdemo_sk:int,ss_store_sk:int,ss_quantity:int,ss_list_price:decimal(7,2),ss_sales_price:decimal(7,2),ss_coupon_amt:decimal(7,2)>
(7) ColumnarToRow
Input [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
(8) Filter
Input [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Condition : (((((isnotnull(ss_sold_date_sk#4) AND (ss_sold_date_sk#4 >= 2451545)) AND (ss_sold_date_sk#4 <= 2451910)) AND isnotnull(ss_cdemo_sk#6)) AND isnotnull(ss_store_sk#7)) AND isnotnull(ss_item_sk#5))
(9) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#4]
Join condition: None
(10) Project [codegen id : 5]
Output [7]: [ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [9]: [d_date_sk#1, ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
(11) Scan parquet default.customer_demographics
Output [4]: [cd_demo_sk#12, cd_gender#13, cd_marital_status#14, cd_education_status#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer_demographics]
PushedFilters: [IsNotNull(cd_education_status), IsNotNull(cd_gender), IsNotNull(cd_marital_status), EqualTo(cd_gender,F), EqualTo(cd_marital_status,D), EqualTo(cd_education_status,Primary), IsNotNull(cd_demo_sk)]
ReadSchema: struct<cd_demo_sk:int,cd_gender:string,cd_marital_status:string,cd_education_status:string>
(12) ColumnarToRow [codegen id : 2]
Input [4]: [cd_demo_sk#12, cd_gender#13, cd_marital_status#14, cd_education_status#15]
(13) Filter [codegen id : 2]
Input [4]: [cd_demo_sk#12, cd_gender#13, cd_marital_status#14, cd_education_status#15]
Condition : ((((((isnotnull(cd_education_status#15) AND isnotnull(cd_gender#13)) AND isnotnull(cd_marital_status#14)) AND (cd_gender#13 = F)) AND (cd_marital_status#14 = D)) AND (cd_education_status#15 = Primary)) AND isnotnull(cd_demo_sk#12))
(14) Project [codegen id : 2]
Output [1]: [cd_demo_sk#12]
Input [4]: [cd_demo_sk#12, cd_gender#13, cd_marital_status#14, cd_education_status#15]
(15) BroadcastExchange
Input [1]: [cd_demo_sk#12]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#16]
(16) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_cdemo_sk#6]
Right keys [1]: [cd_demo_sk#12]
Join condition: None
(17) Project [codegen id : 5]
Output [6]: [ss_item_sk#5, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [8]: [ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, cd_demo_sk#12]
(18) Scan parquet default.store
Output [2]: [s_store_sk#17, s_state#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [In(s_state, [TN,AL,SD]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_state:string>
(19) ColumnarToRow [codegen id : 3]
Input [2]: [s_store_sk#17, s_state#18]
(20) Filter [codegen id : 3]
Input [2]: [s_store_sk#17, s_state#18]
Condition : (s_state#18 IN (TN,AL,SD) AND isnotnull(s_store_sk#17))
(21) BroadcastExchange
Input [2]: [s_store_sk#17, s_state#18]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#19]
(22) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_store_sk#7]
Right keys [1]: [s_store_sk#17]
Join condition: None
(23) Project [codegen id : 5]
Output [6]: [ss_item_sk#5, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, s_state#18]
Input [8]: [ss_item_sk#5, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, s_store_sk#17, s_state#18]
(24) Scan parquet default.item
Output [2]: [i_item_sk#20, i_item_id#21]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_id:string>
(25) ColumnarToRow [codegen id : 4]
Input [2]: [i_item_sk#20, i_item_id#21]
(26) Filter [codegen id : 4]
Input [2]: [i_item_sk#20, i_item_id#21]
Condition : isnotnull(i_item_sk#20)
(27) BroadcastExchange
Input [2]: [i_item_sk#20, i_item_id#21]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#22]
(28) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_item_sk#5]
Right keys [1]: [i_item_sk#20]
Join condition: None
(29) Project [codegen id : 5]
Output [6]: [i_item_id#21, s_state#18, ss_quantity#8 AS agg1#23, ss_list_price#9 AS agg2#24, ss_coupon_amt#11 AS agg3#25, ss_sales_price#10 AS agg4#26]
Input [8]: [ss_item_sk#5, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, s_state#18, i_item_sk#20, i_item_id#21]
(30) HashAggregate [codegen id : 5]
Input [6]: [i_item_id#21, s_state#18, agg1#23, agg2#24, agg3#25, agg4#26]
Keys [2]: [i_item_id#21, s_state#18]
Functions [4]: [partial_avg(cast(agg1#23 as bigint)), partial_avg(UnscaledValue(agg2#24)), partial_avg(UnscaledValue(agg3#25)), partial_avg(UnscaledValue(agg4#26))]
Aggregate Attributes [8]: [sum#27, count#28, sum#29, count#30, sum#31, count#32, sum#33, count#34]
Results [10]: [i_item_id#21, s_state#18, sum#35, count#36, sum#37, count#38, sum#39, count#40, sum#41, count#42]
(31) Exchange
Input [10]: [i_item_id#21, s_state#18, sum#35, count#36, sum#37, count#38, sum#39, count#40, sum#41, count#42]
Arguments: hashpartitioning(i_item_id#21, s_state#18, 5), true, [id=#43]
(32) HashAggregate [codegen id : 6]
Input [10]: [i_item_id#21, s_state#18, sum#35, count#36, sum#37, count#38, sum#39, count#40, sum#41, count#42]
Keys [2]: [i_item_id#21, s_state#18]
Functions [4]: [avg(cast(agg1#23 as bigint)), avg(UnscaledValue(agg2#24)), avg(UnscaledValue(agg3#25)), avg(UnscaledValue(agg4#26))]
Aggregate Attributes [4]: [avg(cast(agg1#23 as bigint))#44, avg(UnscaledValue(agg2#24))#45, avg(UnscaledValue(agg3#25))#46, avg(UnscaledValue(agg4#26))#47]
Results [7]: [i_item_id#21, s_state#18, 0 AS g_state#48, avg(cast(agg1#23 as bigint))#44 AS agg1#49, cast((avg(UnscaledValue(agg2#24))#45 / 100.0) as decimal(11,6)) AS agg2#50, cast((avg(UnscaledValue(agg3#25))#46 / 100.0) as decimal(11,6)) AS agg3#51, cast((avg(UnscaledValue(agg4#26))#47 / 100.0) as decimal(11,6)) AS agg4#52]
(33) ReusedExchange [Reuses operator id: 5]
Output [1]: [d_date_sk#1]
(34) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451545), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_cdemo_sk), IsNotNull(ss_store_sk), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_cdemo_sk:int,ss_store_sk:int,ss_quantity:int,ss_list_price:decimal(7,2),ss_sales_price:decimal(7,2),ss_coupon_amt:decimal(7,2)>
(35) ColumnarToRow
Input [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
(36) Filter
Input [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Condition : (((((isnotnull(ss_sold_date_sk#4) AND (ss_sold_date_sk#4 >= 2451545)) AND (ss_sold_date_sk#4 <= 2451910)) AND isnotnull(ss_cdemo_sk#6)) AND isnotnull(ss_store_sk#7)) AND isnotnull(ss_item_sk#5))
(37) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#4]
Join condition: None
(38) Project [codegen id : 11]
Output [7]: [ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [9]: [d_date_sk#1, ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
(39) Scan parquet default.store
Output [2]: [s_store_sk#17, s_state#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [In(s_state, [TN,AL,SD]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_state:string>
(40) ColumnarToRow [codegen id : 8]
Input [2]: [s_store_sk#17, s_state#18]
(41) Filter [codegen id : 8]
Input [2]: [s_store_sk#17, s_state#18]
Condition : (s_state#18 IN (TN,AL,SD) AND isnotnull(s_store_sk#17))
(42) Project [codegen id : 8]
Output [1]: [s_store_sk#17]
Input [2]: [s_store_sk#17, s_state#18]
(43) BroadcastExchange
Input [1]: [s_store_sk#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#53]
(44) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ss_store_sk#7]
Right keys [1]: [s_store_sk#17]
Join condition: None
(45) Project [codegen id : 11]
Output [6]: [ss_item_sk#5, ss_cdemo_sk#6, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [8]: [ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, s_store_sk#17]
(46) ReusedExchange [Reuses operator id: 15]
Output [1]: [cd_demo_sk#12]
(47) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ss_cdemo_sk#6]
Right keys [1]: [cd_demo_sk#12]
Join condition: None
(48) Project [codegen id : 11]
Output [5]: [ss_item_sk#5, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [7]: [ss_item_sk#5, ss_cdemo_sk#6, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, cd_demo_sk#12]
(49) ReusedExchange [Reuses operator id: 27]
Output [2]: [i_item_sk#20, i_item_id#21]
(50) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ss_item_sk#5]
Right keys [1]: [i_item_sk#20]
Join condition: None
(51) Project [codegen id : 11]
Output [5]: [i_item_id#21, ss_quantity#8 AS agg1#23, ss_list_price#9 AS agg2#24, ss_coupon_amt#11 AS agg3#25, ss_sales_price#10 AS agg4#26]
Input [7]: [ss_item_sk#5, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, i_item_sk#20, i_item_id#21]
(52) HashAggregate [codegen id : 11]
Input [5]: [i_item_id#21, agg1#23, agg2#24, agg3#25, agg4#26]
Keys [1]: [i_item_id#21]
Functions [4]: [partial_avg(cast(agg1#23 as bigint)), partial_avg(UnscaledValue(agg2#24)), partial_avg(UnscaledValue(agg3#25)), partial_avg(UnscaledValue(agg4#26))]
Aggregate Attributes [8]: [sum#54, count#55, sum#56, count#57, sum#58, count#59, sum#60, count#61]
Results [9]: [i_item_id#21, sum#62, count#63, sum#64, count#65, sum#66, count#67, sum#68, count#69]
(53) Exchange
Input [9]: [i_item_id#21, sum#62, count#63, sum#64, count#65, sum#66, count#67, sum#68, count#69]
Arguments: hashpartitioning(i_item_id#21, 5), true, [id=#70]
(54) HashAggregate [codegen id : 12]
Input [9]: [i_item_id#21, sum#62, count#63, sum#64, count#65, sum#66, count#67, sum#68, count#69]
Keys [1]: [i_item_id#21]
Functions [4]: [avg(cast(agg1#23 as bigint)), avg(UnscaledValue(agg2#24)), avg(UnscaledValue(agg3#25)), avg(UnscaledValue(agg4#26))]
Aggregate Attributes [4]: [avg(cast(agg1#23 as bigint))#71, avg(UnscaledValue(agg2#24))#72, avg(UnscaledValue(agg3#25))#73, avg(UnscaledValue(agg4#26))#74]
Results [7]: [i_item_id#21, null AS s_state#75, 1 AS g_state#76, avg(cast(agg1#23 as bigint))#71 AS agg1#77, cast((avg(UnscaledValue(agg2#24))#72 / 100.0) as decimal(11,6)) AS agg2#78, cast((avg(UnscaledValue(agg3#25))#73 / 100.0) as decimal(11,6)) AS agg3#79, cast((avg(UnscaledValue(agg4#26))#74 / 100.0) as decimal(11,6)) AS agg4#80]
(55) ReusedExchange [Reuses operator id: 5]
Output [1]: [d_date_sk#1]
(56) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451545), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_cdemo_sk), IsNotNull(ss_store_sk), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_cdemo_sk:int,ss_store_sk:int,ss_quantity:int,ss_list_price:decimal(7,2),ss_sales_price:decimal(7,2),ss_coupon_amt:decimal(7,2)>
(57) ColumnarToRow
Input [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
(58) Filter
Input [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Condition : (((((isnotnull(ss_sold_date_sk#4) AND (ss_sold_date_sk#4 >= 2451545)) AND (ss_sold_date_sk#4 <= 2451910)) AND isnotnull(ss_cdemo_sk#6)) AND isnotnull(ss_store_sk#7)) AND isnotnull(ss_item_sk#5))
(59) BroadcastHashJoin [codegen id : 17]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#4]
Join condition: None
(60) Project [codegen id : 17]
Output [7]: [ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [9]: [d_date_sk#1, ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
(61) ReusedExchange [Reuses operator id: 43]
Output [1]: [s_store_sk#17]
(62) BroadcastHashJoin [codegen id : 17]
Left keys [1]: [ss_store_sk#7]
Right keys [1]: [s_store_sk#17]
Join condition: None
(63) Project [codegen id : 17]
Output [6]: [ss_item_sk#5, ss_cdemo_sk#6, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [8]: [ss_item_sk#5, ss_cdemo_sk#6, ss_store_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, s_store_sk#17]
(64) ReusedExchange [Reuses operator id: 15]
Output [1]: [cd_demo_sk#12]
(65) BroadcastHashJoin [codegen id : 17]
Left keys [1]: [ss_cdemo_sk#6]
Right keys [1]: [cd_demo_sk#12]
Join condition: None
(66) Project [codegen id : 17]
Output [5]: [ss_item_sk#5, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [7]: [ss_item_sk#5, ss_cdemo_sk#6, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, cd_demo_sk#12]
(67) Scan parquet default.item
Output [1]: [i_item_sk#20]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int>
(68) ColumnarToRow [codegen id : 16]
Input [1]: [i_item_sk#20]
(69) Filter [codegen id : 16]
Input [1]: [i_item_sk#20]
Condition : isnotnull(i_item_sk#20)
(70) BroadcastExchange
Input [1]: [i_item_sk#20]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#81]
(71) BroadcastHashJoin [codegen id : 17]
Left keys [1]: [ss_item_sk#5]
Right keys [1]: [i_item_sk#20]
Join condition: None
(72) Project [codegen id : 17]
Output [4]: [ss_quantity#8 AS agg1#23, ss_list_price#9 AS agg2#24, ss_coupon_amt#11 AS agg3#25, ss_sales_price#10 AS agg4#26]
Input [6]: [ss_item_sk#5, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, i_item_sk#20]
(73) HashAggregate [codegen id : 17]
Input [4]: [agg1#23, agg2#24, agg3#25, agg4#26]
Keys: []
Functions [4]: [partial_avg(cast(agg1#23 as bigint)), partial_avg(UnscaledValue(agg2#24)), partial_avg(UnscaledValue(agg3#25)), partial_avg(UnscaledValue(agg4#26))]
Aggregate Attributes [8]: [sum#82, count#83, sum#84, count#85, sum#86, count#87, sum#88, count#89]
Results [8]: [sum#90, count#91, sum#92, count#93, sum#94, count#95, sum#96, count#97]
(74) Exchange
Input [8]: [sum#90, count#91, sum#92, count#93, sum#94, count#95, sum#96, count#97]
Arguments: SinglePartition, true, [id=#98]
(75) HashAggregate [codegen id : 18]
Input [8]: [sum#90, count#91, sum#92, count#93, sum#94, count#95, sum#96, count#97]
Keys: []
Functions [4]: [avg(cast(agg1#23 as bigint)), avg(UnscaledValue(agg2#24)), avg(UnscaledValue(agg3#25)), avg(UnscaledValue(agg4#26))]
Aggregate Attributes [4]: [avg(cast(agg1#23 as bigint))#99, avg(UnscaledValue(agg2#24))#100, avg(UnscaledValue(agg3#25))#101, avg(UnscaledValue(agg4#26))#102]
Results [7]: [null AS i_item_id#103, null AS s_state#104, 1 AS g_state#105, avg(cast(agg1#23 as bigint))#99 AS agg1#106, cast((avg(UnscaledValue(agg2#24))#100 / 100.0) as decimal(11,6)) AS agg2#107, cast((avg(UnscaledValue(agg3#25))#101 / 100.0) as decimal(11,6)) AS agg3#108, cast((avg(UnscaledValue(agg4#26))#102 / 100.0) as decimal(11,6)) AS agg4#109]
(76) Union
(77) TakeOrderedAndProject
Input [7]: [i_item_id#21, s_state#18, g_state#48, agg1#49, agg2#50, agg3#51, agg4#52]
Arguments: 100, [i_item_id#21 ASC NULLS FIRST, s_state#18 ASC NULLS FIRST], [i_item_id#21, s_state#18, g_state#48, agg1#49, agg2#50, agg3#51, agg4#52]

@ -0,0 +1,113 @@
TakeOrderedAndProject [agg1,agg2,agg3,agg4,g_state,i_item_id,s_state]
Union
WholeStageCodegen (6)
HashAggregate [count,count,count,count,i_item_id,s_state,sum,sum,sum,sum] [agg1,agg2,agg3,agg4,avg(UnscaledValue(agg2)),avg(UnscaledValue(agg3)),avg(UnscaledValue(agg4)),avg(cast(agg1 as bigint)),count,count,count,count,g_state,sum,sum,sum,sum]
InputAdapter
Exchange [i_item_id,s_state] #1
WholeStageCodegen (5)
HashAggregate [agg1,agg2,agg3,agg4,i_item_id,s_state] [count,count,count,count,count,count,count,count,sum,sum,sum,sum,sum,sum,sum,sum]
Project [i_item_id,s_state,ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [s_state,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_store_sk]
BroadcastHashJoin [cd_demo_sk,ss_cdemo_sk]
Project [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_year]
Filter [ss_cdemo_sk,ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [cd_demo_sk]
Filter [cd_demo_sk,cd_education_status,cd_gender,cd_marital_status]
ColumnarToRow
InputAdapter
Scan parquet default.customer_demographics [cd_demo_sk,cd_education_status,cd_gender,cd_marital_status]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (3)
Filter [s_state,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_state,s_store_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (4)
Filter [i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_item_id,i_item_sk]
WholeStageCodegen (12)
HashAggregate [count,count,count,count,i_item_id,sum,sum,sum,sum] [agg1,agg2,agg3,agg4,avg(UnscaledValue(agg2)),avg(UnscaledValue(agg3)),avg(UnscaledValue(agg4)),avg(cast(agg1 as bigint)),count,count,count,count,g_state,s_state,sum,sum,sum,sum]
InputAdapter
Exchange [i_item_id] #6
WholeStageCodegen (11)
HashAggregate [agg1,agg2,agg3,agg4,i_item_id] [count,count,count,count,count,count,count,count,sum,sum,sum,sum,sum,sum,sum,sum]
Project [i_item_id,ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [cd_demo_sk,ss_cdemo_sk]
Project [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #2
Filter [ss_cdemo_sk,ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #7
WholeStageCodegen (8)
Project [s_store_sk]
Filter [s_state,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_state,s_store_sk]
InputAdapter
ReusedExchange [cd_demo_sk] #3
InputAdapter
ReusedExchange [i_item_id,i_item_sk] #5
WholeStageCodegen (18)
HashAggregate [count,count,count,count,sum,sum,sum,sum] [agg1,agg2,agg3,agg4,avg(UnscaledValue(agg2)),avg(UnscaledValue(agg3)),avg(UnscaledValue(agg4)),avg(cast(agg1 as bigint)),count,count,count,count,g_state,i_item_id,s_state,sum,sum,sum,sum]
InputAdapter
Exchange #8
WholeStageCodegen (17)
HashAggregate [agg1,agg2,agg3,agg4] [count,count,count,count,count,count,count,count,sum,sum,sum,sum,sum,sum,sum,sum]
Project [ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [cd_demo_sk,ss_cdemo_sk]
Project [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #2
Filter [ss_cdemo_sk,ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
ReusedExchange [s_store_sk] #7
InputAdapter
ReusedExchange [cd_demo_sk] #3
InputAdapter
BroadcastExchange #9
WholeStageCodegen (16)
Filter [i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_item_sk]

@ -0,0 +1,428 @@
== Physical Plan ==
TakeOrderedAndProject (77)
+- Union (76)
:- * HashAggregate (32)
: +- Exchange (31)
: +- * HashAggregate (30)
: +- * Project (29)
: +- * BroadcastHashJoin Inner BuildRight (28)
: :- * Project (23)
: : +- * BroadcastHashJoin Inner BuildRight (22)
: : :- * Project (17)
: : : +- * BroadcastHashJoin Inner BuildRight (16)
: : : :- * Project (10)
: : : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : : :- * Filter (3)
: : : : : +- * ColumnarToRow (2)
: : : : : +- Scan parquet default.store_sales (1)
: : : : +- BroadcastExchange (8)
: : : : +- * Project (7)
: : : : +- * Filter (6)
: : : : +- * ColumnarToRow (5)
: : : : +- Scan parquet default.customer_demographics (4)
: : : +- BroadcastExchange (15)
: : : +- * Project (14)
: : : +- * Filter (13)
: : : +- * ColumnarToRow (12)
: : : +- Scan parquet default.date_dim (11)
: : +- BroadcastExchange (21)
: : +- * Filter (20)
: : +- * ColumnarToRow (19)
: : +- Scan parquet default.store (18)
: +- BroadcastExchange (27)
: +- * Filter (26)
: +- * ColumnarToRow (25)
: +- Scan parquet default.item (24)
:- * HashAggregate (54)
: +- Exchange (53)
: +- * HashAggregate (52)
: +- * Project (51)
: +- * BroadcastHashJoin Inner BuildRight (50)
: :- * Project (48)
: : +- * BroadcastHashJoin Inner BuildRight (47)
: : :- * Project (41)
: : : +- * BroadcastHashJoin Inner BuildRight (40)
: : : :- * Project (38)
: : : : +- * BroadcastHashJoin Inner BuildRight (37)
: : : : :- * Filter (35)
: : : : : +- * ColumnarToRow (34)
: : : : : +- Scan parquet default.store_sales (33)
: : : : +- ReusedExchange (36)
: : : +- ReusedExchange (39)
: : +- BroadcastExchange (46)
: : +- * Project (45)
: : +- * Filter (44)
: : +- * ColumnarToRow (43)
: : +- Scan parquet default.store (42)
: +- ReusedExchange (49)
+- * HashAggregate (75)
+- Exchange (74)
+- * HashAggregate (73)
+- * Project (72)
+- * BroadcastHashJoin Inner BuildRight (71)
:- * Project (66)
: +- * BroadcastHashJoin Inner BuildRight (65)
: :- * Project (63)
: : +- * BroadcastHashJoin Inner BuildRight (62)
: : :- * Project (60)
: : : +- * BroadcastHashJoin Inner BuildRight (59)
: : : :- * Filter (57)
: : : : +- * ColumnarToRow (56)
: : : : +- Scan parquet default.store_sales (55)
: : : +- ReusedExchange (58)
: : +- ReusedExchange (61)
: +- ReusedExchange (64)
+- BroadcastExchange (70)
+- * Filter (69)
+- * ColumnarToRow (68)
+- Scan parquet default.item (67)
(1) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451545), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_cdemo_sk), IsNotNull(ss_store_sk), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_cdemo_sk:int,ss_store_sk:int,ss_quantity:int,ss_list_price:decimal(7,2),ss_sales_price:decimal(7,2),ss_coupon_amt:decimal(7,2)>
(2) ColumnarToRow [codegen id : 5]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
(3) Filter [codegen id : 5]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Condition : (((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2451545)) AND (ss_sold_date_sk#1 <= 2451910)) AND isnotnull(ss_cdemo_sk#3)) AND isnotnull(ss_store_sk#4)) AND isnotnull(ss_item_sk#2))
(4) Scan parquet default.customer_demographics
Output [4]: [cd_demo_sk#9, cd_gender#10, cd_marital_status#11, cd_education_status#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer_demographics]
PushedFilters: [IsNotNull(cd_marital_status), IsNotNull(cd_education_status), IsNotNull(cd_gender), EqualTo(cd_gender,F), EqualTo(cd_marital_status,D), EqualTo(cd_education_status,Primary), IsNotNull(cd_demo_sk)]
ReadSchema: struct<cd_demo_sk:int,cd_gender:string,cd_marital_status:string,cd_education_status:string>
(5) ColumnarToRow [codegen id : 1]
Input [4]: [cd_demo_sk#9, cd_gender#10, cd_marital_status#11, cd_education_status#12]
(6) Filter [codegen id : 1]
Input [4]: [cd_demo_sk#9, cd_gender#10, cd_marital_status#11, cd_education_status#12]
Condition : ((((((isnotnull(cd_marital_status#11) AND isnotnull(cd_education_status#12)) AND isnotnull(cd_gender#10)) AND (cd_gender#10 = F)) AND (cd_marital_status#11 = D)) AND (cd_education_status#12 = Primary)) AND isnotnull(cd_demo_sk#9))
(7) Project [codegen id : 1]
Output [1]: [cd_demo_sk#9]
Input [4]: [cd_demo_sk#9, cd_gender#10, cd_marital_status#11, cd_education_status#12]
(8) BroadcastExchange
Input [1]: [cd_demo_sk#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#13]
(9) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_cdemo_sk#3]
Right keys [1]: [cd_demo_sk#9]
Join condition: None
(10) Project [codegen id : 5]
Output [7]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [9]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, cd_demo_sk#9]
(11) Scan parquet default.date_dim
Output [2]: [d_date_sk#14, d_year#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2000), LessThanOrEqual(d_date_sk,2451910), GreaterThanOrEqual(d_date_sk,2451545), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [d_date_sk#14, d_year#15]
(13) Filter [codegen id : 2]
Input [2]: [d_date_sk#14, d_year#15]
Condition : ((((isnotnull(d_year#15) AND (d_year#15 = 2000)) AND (d_date_sk#14 <= 2451910)) AND (d_date_sk#14 >= 2451545)) AND isnotnull(d_date_sk#14))
(14) Project [codegen id : 2]
Output [1]: [d_date_sk#14]
Input [2]: [d_date_sk#14, d_year#15]
(15) BroadcastExchange
Input [1]: [d_date_sk#14]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#16]
(16) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#14]
Join condition: None
(17) Project [codegen id : 5]
Output [6]: [ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, d_date_sk#14]
(18) Scan parquet default.store
Output [2]: [s_store_sk#17, s_state#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [In(s_state, [TN,AL,SD]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_state:string>
(19) ColumnarToRow [codegen id : 3]
Input [2]: [s_store_sk#17, s_state#18]
(20) Filter [codegen id : 3]
Input [2]: [s_store_sk#17, s_state#18]
Condition : (s_state#18 IN (TN,AL,SD) AND isnotnull(s_store_sk#17))
(21) BroadcastExchange
Input [2]: [s_store_sk#17, s_state#18]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#19]
(22) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_store_sk#4]
Right keys [1]: [s_store_sk#17]
Join condition: None
(23) Project [codegen id : 5]
Output [6]: [ss_item_sk#2, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, s_state#18]
Input [8]: [ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, s_store_sk#17, s_state#18]
(24) Scan parquet default.item
Output [2]: [i_item_sk#20, i_item_id#21]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_id:string>
(25) ColumnarToRow [codegen id : 4]
Input [2]: [i_item_sk#20, i_item_id#21]
(26) Filter [codegen id : 4]
Input [2]: [i_item_sk#20, i_item_id#21]
Condition : isnotnull(i_item_sk#20)
(27) BroadcastExchange
Input [2]: [i_item_sk#20, i_item_id#21]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#22]
(28) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_item_sk#2]
Right keys [1]: [i_item_sk#20]
Join condition: None
(29) Project [codegen id : 5]
Output [6]: [i_item_id#21, s_state#18, ss_quantity#5 AS agg1#23, ss_list_price#6 AS agg2#24, ss_coupon_amt#8 AS agg3#25, ss_sales_price#7 AS agg4#26]
Input [8]: [ss_item_sk#2, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, s_state#18, i_item_sk#20, i_item_id#21]
(30) HashAggregate [codegen id : 5]
Input [6]: [i_item_id#21, s_state#18, agg1#23, agg2#24, agg3#25, agg4#26]
Keys [2]: [i_item_id#21, s_state#18]
Functions [4]: [partial_avg(cast(agg1#23 as bigint)), partial_avg(UnscaledValue(agg2#24)), partial_avg(UnscaledValue(agg3#25)), partial_avg(UnscaledValue(agg4#26))]
Aggregate Attributes [8]: [sum#27, count#28, sum#29, count#30, sum#31, count#32, sum#33, count#34]
Results [10]: [i_item_id#21, s_state#18, sum#35, count#36, sum#37, count#38, sum#39, count#40, sum#41, count#42]
(31) Exchange
Input [10]: [i_item_id#21, s_state#18, sum#35, count#36, sum#37, count#38, sum#39, count#40, sum#41, count#42]
Arguments: hashpartitioning(i_item_id#21, s_state#18, 5), true, [id=#43]
(32) HashAggregate [codegen id : 6]
Input [10]: [i_item_id#21, s_state#18, sum#35, count#36, sum#37, count#38, sum#39, count#40, sum#41, count#42]
Keys [2]: [i_item_id#21, s_state#18]
Functions [4]: [avg(cast(agg1#23 as bigint)), avg(UnscaledValue(agg2#24)), avg(UnscaledValue(agg3#25)), avg(UnscaledValue(agg4#26))]
Aggregate Attributes [4]: [avg(cast(agg1#23 as bigint))#44, avg(UnscaledValue(agg2#24))#45, avg(UnscaledValue(agg3#25))#46, avg(UnscaledValue(agg4#26))#47]
Results [7]: [i_item_id#21, s_state#18, 0 AS g_state#48, avg(cast(agg1#23 as bigint))#44 AS agg1#49, cast((avg(UnscaledValue(agg2#24))#45 / 100.0) as decimal(11,6)) AS agg2#50, cast((avg(UnscaledValue(agg3#25))#46 / 100.0) as decimal(11,6)) AS agg3#51, cast((avg(UnscaledValue(agg4#26))#47 / 100.0) as decimal(11,6)) AS agg4#52]
(33) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451545), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_cdemo_sk), IsNotNull(ss_store_sk), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_cdemo_sk:int,ss_store_sk:int,ss_quantity:int,ss_list_price:decimal(7,2),ss_sales_price:decimal(7,2),ss_coupon_amt:decimal(7,2)>
(34) ColumnarToRow [codegen id : 11]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
(35) Filter [codegen id : 11]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Condition : (((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2451545)) AND (ss_sold_date_sk#1 <= 2451910)) AND isnotnull(ss_cdemo_sk#3)) AND isnotnull(ss_store_sk#4)) AND isnotnull(ss_item_sk#2))
(36) ReusedExchange [Reuses operator id: 8]
Output [1]: [cd_demo_sk#9]
(37) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ss_cdemo_sk#3]
Right keys [1]: [cd_demo_sk#9]
Join condition: None
(38) Project [codegen id : 11]
Output [7]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [9]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, cd_demo_sk#9]
(39) ReusedExchange [Reuses operator id: 15]
Output [1]: [d_date_sk#14]
(40) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#14]
Join condition: None
(41) Project [codegen id : 11]
Output [6]: [ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, d_date_sk#14]
(42) Scan parquet default.store
Output [2]: [s_store_sk#17, s_state#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [In(s_state, [TN,AL,SD]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_state:string>
(43) ColumnarToRow [codegen id : 9]
Input [2]: [s_store_sk#17, s_state#18]
(44) Filter [codegen id : 9]
Input [2]: [s_store_sk#17, s_state#18]
Condition : (s_state#18 IN (TN,AL,SD) AND isnotnull(s_store_sk#17))
(45) Project [codegen id : 9]
Output [1]: [s_store_sk#17]
Input [2]: [s_store_sk#17, s_state#18]
(46) BroadcastExchange
Input [1]: [s_store_sk#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#53]
(47) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ss_store_sk#4]
Right keys [1]: [s_store_sk#17]
Join condition: None
(48) Project [codegen id : 11]
Output [5]: [ss_item_sk#2, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [7]: [ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, s_store_sk#17]
(49) ReusedExchange [Reuses operator id: 27]
Output [2]: [i_item_sk#20, i_item_id#21]
(50) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ss_item_sk#2]
Right keys [1]: [i_item_sk#20]
Join condition: None
(51) Project [codegen id : 11]
Output [5]: [i_item_id#21, ss_quantity#5 AS agg1#23, ss_list_price#6 AS agg2#24, ss_coupon_amt#8 AS agg3#25, ss_sales_price#7 AS agg4#26]
Input [7]: [ss_item_sk#2, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, i_item_sk#20, i_item_id#21]
(52) HashAggregate [codegen id : 11]
Input [5]: [i_item_id#21, agg1#23, agg2#24, agg3#25, agg4#26]
Keys [1]: [i_item_id#21]
Functions [4]: [partial_avg(cast(agg1#23 as bigint)), partial_avg(UnscaledValue(agg2#24)), partial_avg(UnscaledValue(agg3#25)), partial_avg(UnscaledValue(agg4#26))]
Aggregate Attributes [8]: [sum#54, count#55, sum#56, count#57, sum#58, count#59, sum#60, count#61]
Results [9]: [i_item_id#21, sum#62, count#63, sum#64, count#65, sum#66, count#67, sum#68, count#69]
(53) Exchange
Input [9]: [i_item_id#21, sum#62, count#63, sum#64, count#65, sum#66, count#67, sum#68, count#69]
Arguments: hashpartitioning(i_item_id#21, 5), true, [id=#70]
(54) HashAggregate [codegen id : 12]
Input [9]: [i_item_id#21, sum#62, count#63, sum#64, count#65, sum#66, count#67, sum#68, count#69]
Keys [1]: [i_item_id#21]
Functions [4]: [avg(cast(agg1#23 as bigint)), avg(UnscaledValue(agg2#24)), avg(UnscaledValue(agg3#25)), avg(UnscaledValue(agg4#26))]
Aggregate Attributes [4]: [avg(cast(agg1#23 as bigint))#71, avg(UnscaledValue(agg2#24))#72, avg(UnscaledValue(agg3#25))#73, avg(UnscaledValue(agg4#26))#74]
Results [7]: [i_item_id#21, null AS s_state#75, 1 AS g_state#76, avg(cast(agg1#23 as bigint))#71 AS agg1#77, cast((avg(UnscaledValue(agg2#24))#72 / 100.0) as decimal(11,6)) AS agg2#78, cast((avg(UnscaledValue(agg3#25))#73 / 100.0) as decimal(11,6)) AS agg3#79, cast((avg(UnscaledValue(agg4#26))#74 / 100.0) as decimal(11,6)) AS agg4#80]
(55) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451545), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_cdemo_sk), IsNotNull(ss_store_sk), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_cdemo_sk:int,ss_store_sk:int,ss_quantity:int,ss_list_price:decimal(7,2),ss_sales_price:decimal(7,2),ss_coupon_amt:decimal(7,2)>
(56) ColumnarToRow [codegen id : 17]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
(57) Filter [codegen id : 17]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Condition : (((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2451545)) AND (ss_sold_date_sk#1 <= 2451910)) AND isnotnull(ss_cdemo_sk#3)) AND isnotnull(ss_store_sk#4)) AND isnotnull(ss_item_sk#2))
(58) ReusedExchange [Reuses operator id: 8]
Output [1]: [cd_demo_sk#9]
(59) BroadcastHashJoin [codegen id : 17]
Left keys [1]: [ss_cdemo_sk#3]
Right keys [1]: [cd_demo_sk#9]
Join condition: None
(60) Project [codegen id : 17]
Output [7]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [9]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, cd_demo_sk#9]
(61) ReusedExchange [Reuses operator id: 15]
Output [1]: [d_date_sk#14]
(62) BroadcastHashJoin [codegen id : 17]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#14]
Join condition: None
(63) Project [codegen id : 17]
Output [6]: [ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, d_date_sk#14]
(64) ReusedExchange [Reuses operator id: 46]
Output [1]: [s_store_sk#17]
(65) BroadcastHashJoin [codegen id : 17]
Left keys [1]: [ss_store_sk#4]
Right keys [1]: [s_store_sk#17]
Join condition: None
(66) Project [codegen id : 17]
Output [5]: [ss_item_sk#2, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [7]: [ss_item_sk#2, ss_store_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, s_store_sk#17]
(67) Scan parquet default.item
Output [1]: [i_item_sk#20]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int>
(68) ColumnarToRow [codegen id : 16]
Input [1]: [i_item_sk#20]
(69) Filter [codegen id : 16]
Input [1]: [i_item_sk#20]
Condition : isnotnull(i_item_sk#20)
(70) BroadcastExchange
Input [1]: [i_item_sk#20]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#81]
(71) BroadcastHashJoin [codegen id : 17]
Left keys [1]: [ss_item_sk#2]
Right keys [1]: [i_item_sk#20]
Join condition: None
(72) Project [codegen id : 17]
Output [4]: [ss_quantity#5 AS agg1#23, ss_list_price#6 AS agg2#24, ss_coupon_amt#8 AS agg3#25, ss_sales_price#7 AS agg4#26]
Input [6]: [ss_item_sk#2, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, i_item_sk#20]
(73) HashAggregate [codegen id : 17]
Input [4]: [agg1#23, agg2#24, agg3#25, agg4#26]
Keys: []
Functions [4]: [partial_avg(cast(agg1#23 as bigint)), partial_avg(UnscaledValue(agg2#24)), partial_avg(UnscaledValue(agg3#25)), partial_avg(UnscaledValue(agg4#26))]
Aggregate Attributes [8]: [sum#82, count#83, sum#84, count#85, sum#86, count#87, sum#88, count#89]
Results [8]: [sum#90, count#91, sum#92, count#93, sum#94, count#95, sum#96, count#97]
(74) Exchange
Input [8]: [sum#90, count#91, sum#92, count#93, sum#94, count#95, sum#96, count#97]
Arguments: SinglePartition, true, [id=#98]
(75) HashAggregate [codegen id : 18]
Input [8]: [sum#90, count#91, sum#92, count#93, sum#94, count#95, sum#96, count#97]
Keys: []
Functions [4]: [avg(cast(agg1#23 as bigint)), avg(UnscaledValue(agg2#24)), avg(UnscaledValue(agg3#25)), avg(UnscaledValue(agg4#26))]
Aggregate Attributes [4]: [avg(cast(agg1#23 as bigint))#99, avg(UnscaledValue(agg2#24))#100, avg(UnscaledValue(agg3#25))#101, avg(UnscaledValue(agg4#26))#102]
Results [7]: [null AS i_item_id#103, null AS s_state#104, 1 AS g_state#105, avg(cast(agg1#23 as bigint))#99 AS agg1#106, cast((avg(UnscaledValue(agg2#24))#100 / 100.0) as decimal(11,6)) AS agg2#107, cast((avg(UnscaledValue(agg3#25))#101 / 100.0) as decimal(11,6)) AS agg3#108, cast((avg(UnscaledValue(agg4#26))#102 / 100.0) as decimal(11,6)) AS agg4#109]
(76) Union
(77) TakeOrderedAndProject
Input [7]: [i_item_id#21, s_state#18, g_state#48, agg1#49, agg2#50, agg3#51, agg4#52]
Arguments: 100, [i_item_id#21 ASC NULLS FIRST, s_state#18 ASC NULLS FIRST], [i_item_id#21, s_state#18, g_state#48, agg1#49, agg2#50, agg3#51, agg4#52]

@ -0,0 +1,113 @@
TakeOrderedAndProject [agg1,agg2,agg3,agg4,g_state,i_item_id,s_state]
Union
WholeStageCodegen (6)
HashAggregate [count,count,count,count,i_item_id,s_state,sum,sum,sum,sum] [agg1,agg2,agg3,agg4,avg(UnscaledValue(agg2)),avg(UnscaledValue(agg3)),avg(UnscaledValue(agg4)),avg(cast(agg1 as bigint)),count,count,count,count,g_state,sum,sum,sum,sum]
InputAdapter
Exchange [i_item_id,s_state] #1
WholeStageCodegen (5)
HashAggregate [agg1,agg2,agg3,agg4,i_item_id,s_state] [count,count,count,count,count,count,count,count,sum,sum,sum,sum,sum,sum,sum,sum]
Project [i_item_id,s_state,ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [s_state,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_sold_date_sk,ss_store_sk]
BroadcastHashJoin [cd_demo_sk,ss_cdemo_sk]
Filter [ss_cdemo_sk,ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [cd_demo_sk]
Filter [cd_demo_sk,cd_education_status,cd_gender,cd_marital_status]
ColumnarToRow
InputAdapter
Scan parquet default.customer_demographics [cd_demo_sk,cd_education_status,cd_gender,cd_marital_status]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [d_date_sk]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_year]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (3)
Filter [s_state,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_state,s_store_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (4)
Filter [i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_item_id,i_item_sk]
WholeStageCodegen (12)
HashAggregate [count,count,count,count,i_item_id,sum,sum,sum,sum] [agg1,agg2,agg3,agg4,avg(UnscaledValue(agg2)),avg(UnscaledValue(agg3)),avg(UnscaledValue(agg4)),avg(cast(agg1 as bigint)),count,count,count,count,g_state,s_state,sum,sum,sum,sum]
InputAdapter
Exchange [i_item_id] #6
WholeStageCodegen (11)
HashAggregate [agg1,agg2,agg3,agg4,i_item_id] [count,count,count,count,count,count,count,count,sum,sum,sum,sum,sum,sum,sum,sum]
Project [i_item_id,ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_sold_date_sk,ss_store_sk]
BroadcastHashJoin [cd_demo_sk,ss_cdemo_sk]
Filter [ss_cdemo_sk,ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
ReusedExchange [cd_demo_sk] #2
InputAdapter
ReusedExchange [d_date_sk] #3
InputAdapter
BroadcastExchange #7
WholeStageCodegen (9)
Project [s_store_sk]
Filter [s_state,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_state,s_store_sk]
InputAdapter
ReusedExchange [i_item_id,i_item_sk] #5
WholeStageCodegen (18)
HashAggregate [count,count,count,count,sum,sum,sum,sum] [agg1,agg2,agg3,agg4,avg(UnscaledValue(agg2)),avg(UnscaledValue(agg3)),avg(UnscaledValue(agg4)),avg(cast(agg1 as bigint)),count,count,count,count,g_state,i_item_id,s_state,sum,sum,sum,sum]
InputAdapter
Exchange #8
WholeStageCodegen (17)
HashAggregate [agg1,agg2,agg3,agg4] [count,count,count,count,count,count,count,count,sum,sum,sum,sum,sum,sum,sum,sum]
Project [ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_sold_date_sk,ss_store_sk]
BroadcastHashJoin [cd_demo_sk,ss_cdemo_sk]
Filter [ss_cdemo_sk,ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
ReusedExchange [cd_demo_sk] #2
InputAdapter
ReusedExchange [d_date_sk] #3
InputAdapter
ReusedExchange [s_store_sk] #7
InputAdapter
BroadcastExchange #9
WholeStageCodegen (16)
Filter [i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_item_sk]

@ -0,0 +1,31 @@
TakeOrderedAndProject [brand,brand_id,d_year,sum_agg]
WholeStageCodegen (4)
HashAggregate [d_year,i_brand,i_brand_id,sum] [brand,brand_id,sum,sum(UnscaledValue(ss_net_profit)),sum_agg]
InputAdapter
Exchange [d_year,i_brand,i_brand_id] #1
WholeStageCodegen (3)
HashAggregate [d_year,i_brand,i_brand_id,ss_net_profit] [sum,sum]
Project [d_year,i_brand,i_brand_id,ss_net_profit]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [i_brand,i_brand_id,ss_net_profit,ss_sold_date_sk]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Filter [ss_item_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_net_profit,ss_sold_date_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [i_brand,i_brand_id,i_item_sk]
Filter [i_item_sk,i_manufact_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_brand_id,i_item_sk,i_manufact_id]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [d_date_sk,d_year]
Filter [d_date_sk,d_moy]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]

@ -0,0 +1,31 @@
TakeOrderedAndProject [brand,brand_id,d_year,sum_agg]
WholeStageCodegen (4)
HashAggregate [d_year,i_brand,i_brand_id,sum] [brand,brand_id,sum,sum(UnscaledValue(ss_net_profit)),sum_agg]
InputAdapter
Exchange [d_year,i_brand,i_brand_id] #1
WholeStageCodegen (3)
HashAggregate [d_year,i_brand,i_brand_id,ss_net_profit] [sum,sum]
Project [d_year,i_brand,i_brand_id,ss_net_profit]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [d_year,ss_item_sk,ss_net_profit]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [d_date_sk,d_year]
Filter [d_date_sk,d_moy]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Filter [ss_item_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_net_profit,ss_sold_date_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [i_brand,i_brand_id,i_item_sk]
Filter [i_item_sk,i_manufact_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_brand_id,i_item_sk,i_manufact_id]

@ -0,0 +1,218 @@
== Physical Plan ==
* Sort (39)
+- Exchange (38)
+- * Project (37)
+- * SortMergeJoin Inner (36)
:- * Sort (30)
: +- Exchange (29)
: +- * Filter (28)
: +- * HashAggregate (27)
: +- Exchange (26)
: +- * HashAggregate (25)
: +- * Project (24)
: +- * BroadcastHashJoin Inner BuildRight (23)
: :- * Project (17)
: : +- * BroadcastHashJoin Inner BuildRight (16)
: : :- * Project (10)
: : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_sales (1)
: : : +- BroadcastExchange (8)
: : : +- * Project (7)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.date_dim (4)
: : +- BroadcastExchange (15)
: : +- * Project (14)
: : +- * Filter (13)
: : +- * ColumnarToRow (12)
: : +- Scan parquet default.store (11)
: +- BroadcastExchange (22)
: +- * Project (21)
: +- * Filter (20)
: +- * ColumnarToRow (19)
: +- Scan parquet default.household_demographics (18)
+- * Sort (35)
+- Exchange (34)
+- * Filter (33)
+- * ColumnarToRow (32)
+- Scan parquet default.customer (31)
(1) Scan parquet default.store_sales
Output [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2450816), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_store_sk:int,ss_ticket_number:int>
(2) ColumnarToRow [codegen id : 4]
Input [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
(3) Filter [codegen id : 4]
Input [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Condition : (((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2450816)) AND (ss_sold_date_sk#1 <= 2451910)) AND isnotnull(ss_store_sk#4)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#6, d_year#7, d_dom#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [Or(And(GreaterThanOrEqual(d_dom,1),LessThanOrEqual(d_dom,3)),And(GreaterThanOrEqual(d_dom,25),LessThanOrEqual(d_dom,28))), In(d_year, [1998,1999,2000]), GreaterThanOrEqual(d_date_sk,2450816), LessThanOrEqual(d_date_sk,2451910), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dom:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
Condition : (((((((d_dom#8 >= 1) AND (d_dom#8 <= 3)) OR ((d_dom#8 >= 25) AND (d_dom#8 <= 28))) AND d_year#7 IN (1998,1999,2000)) AND (d_date_sk#6 >= 2450816)) AND (d_date_sk#6 <= 2451910)) AND isnotnull(d_date_sk#6))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#6]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
(8) BroadcastExchange
Input [1]: [d_date_sk#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#9]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#6]
Join condition: None
(10) Project [codegen id : 4]
Output [4]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Input [6]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5, d_date_sk#6]
(11) Scan parquet default.store
Output [2]: [s_store_sk#10, s_county#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [In(s_county, [Saginaw County,Sumner County,Appanoose County,Daviess County,Fairfield County,Raleigh County,Ziebach County,Williamson County]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_county:string>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [s_store_sk#10, s_county#11]
(13) Filter [codegen id : 2]
Input [2]: [s_store_sk#10, s_county#11]
Condition : (s_county#11 IN (Saginaw County,Sumner County,Appanoose County,Daviess County,Fairfield County,Raleigh County,Ziebach County,Williamson County) AND isnotnull(s_store_sk#10))
(14) Project [codegen id : 2]
Output [1]: [s_store_sk#10]
Input [2]: [s_store_sk#10, s_county#11]
(15) BroadcastExchange
Input [1]: [s_store_sk#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#4]
Right keys [1]: [s_store_sk#10]
Join condition: None
(17) Project [codegen id : 4]
Output [3]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_ticket_number#5]
Input [5]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5, s_store_sk#10]
(18) Scan parquet default.household_demographics
Output [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/household_demographics]
PushedFilters: [IsNotNull(hd_vehicle_count), Or(EqualTo(hd_buy_potential,>10000),EqualTo(hd_buy_potential,Unknown)), GreaterThan(hd_vehicle_count,0), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_buy_potential:string,hd_dep_count:int,hd_vehicle_count:int>
(19) ColumnarToRow [codegen id : 3]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
(20) Filter [codegen id : 3]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
Condition : ((((isnotnull(hd_vehicle_count#16) AND ((hd_buy_potential#14 = >10000) OR (hd_buy_potential#14 = Unknown))) AND (hd_vehicle_count#16 > 0)) AND (CASE WHEN (hd_vehicle_count#16 > 0) THEN (cast(hd_dep_count#15 as double) / cast(hd_vehicle_count#16 as double)) ELSE null END > 1.2)) AND isnotnull(hd_demo_sk#13))
(21) Project [codegen id : 3]
Output [1]: [hd_demo_sk#13]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
(22) BroadcastExchange
Input [1]: [hd_demo_sk#13]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#17]
(23) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#13]
Join condition: None
(24) Project [codegen id : 4]
Output [2]: [ss_customer_sk#2, ss_ticket_number#5]
Input [4]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_ticket_number#5, hd_demo_sk#13]
(25) HashAggregate [codegen id : 4]
Input [2]: [ss_customer_sk#2, ss_ticket_number#5]
Keys [2]: [ss_ticket_number#5, ss_customer_sk#2]
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#18]
Results [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
(26) Exchange
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
Arguments: hashpartitioning(ss_ticket_number#5, ss_customer_sk#2, 5), true, [id=#20]
(27) HashAggregate [codegen id : 5]
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
Keys [2]: [ss_ticket_number#5, ss_customer_sk#2]
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#21]
Results [3]: [ss_ticket_number#5, ss_customer_sk#2, count(1)#21 AS cnt#22]
(28) Filter [codegen id : 5]
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22]
Condition : ((cnt#22 >= 15) AND (cnt#22 <= 20))
(29) Exchange
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22]
Arguments: hashpartitioning(ss_customer_sk#2, 5), true, [id=#23]
(30) Sort [codegen id : 6]
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22]
Arguments: [ss_customer_sk#2 ASC NULLS FIRST], false, 0
(31) Scan parquet default.customer
Output [5]: [c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_salutation:string,c_first_name:string,c_last_name:string,c_preferred_cust_flag:string>
(32) ColumnarToRow [codegen id : 7]
Input [5]: [c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
(33) Filter [codegen id : 7]
Input [5]: [c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
Condition : isnotnull(c_customer_sk#24)
(34) Exchange
Input [5]: [c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
Arguments: hashpartitioning(c_customer_sk#24, 5), true, [id=#29]
(35) Sort [codegen id : 8]
Input [5]: [c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
Arguments: [c_customer_sk#24 ASC NULLS FIRST], false, 0
(36) SortMergeJoin [codegen id : 9]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#24]
Join condition: None
(37) Project [codegen id : 9]
Output [6]: [c_last_name#27, c_first_name#26, c_salutation#25, c_preferred_cust_flag#28, ss_ticket_number#5, cnt#22]
Input [8]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22, c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
(38) Exchange
Input [6]: [c_last_name#27, c_first_name#26, c_salutation#25, c_preferred_cust_flag#28, ss_ticket_number#5, cnt#22]
Arguments: rangepartitioning(c_last_name#27 ASC NULLS FIRST, c_first_name#26 ASC NULLS FIRST, c_salutation#25 ASC NULLS FIRST, c_preferred_cust_flag#28 DESC NULLS LAST, 5), true, [id=#30]
(39) Sort [codegen id : 10]
Input [6]: [c_last_name#27, c_first_name#26, c_salutation#25, c_preferred_cust_flag#28, ss_ticket_number#5, cnt#22]
Arguments: [c_last_name#27 ASC NULLS FIRST, c_first_name#26 ASC NULLS FIRST, c_salutation#25 ASC NULLS FIRST, c_preferred_cust_flag#28 DESC NULLS LAST], true, 0

@ -0,0 +1,63 @@
WholeStageCodegen (10)
Sort [c_first_name,c_last_name,c_preferred_cust_flag,c_salutation]
InputAdapter
Exchange [c_first_name,c_last_name,c_preferred_cust_flag,c_salutation] #1
WholeStageCodegen (9)
Project [c_first_name,c_last_name,c_preferred_cust_flag,c_salutation,cnt,ss_ticket_number]
SortMergeJoin [c_customer_sk,ss_customer_sk]
InputAdapter
WholeStageCodegen (6)
Sort [ss_customer_sk]
InputAdapter
Exchange [ss_customer_sk] #2
WholeStageCodegen (5)
Filter [cnt]
HashAggregate [count,ss_customer_sk,ss_ticket_number] [cnt,count,count(1)]
InputAdapter
Exchange [ss_customer_sk,ss_ticket_number] #3
WholeStageCodegen (4)
HashAggregate [ss_customer_sk,ss_ticket_number] [count,count]
Project [ss_customer_sk,ss_ticket_number]
BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
Project [ss_customer_sk,ss_hdemo_sk,ss_ticket_number]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_customer_sk,ss_hdemo_sk,ss_store_sk,ss_ticket_number]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_dom,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_dom,d_year]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (2)
Project [s_store_sk]
Filter [s_county,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_county,s_store_sk]
InputAdapter
BroadcastExchange #6
WholeStageCodegen (3)
Project [hd_demo_sk]
Filter [hd_buy_potential,hd_demo_sk,hd_dep_count,hd_vehicle_count]
ColumnarToRow
InputAdapter
Scan parquet default.household_demographics [hd_buy_potential,hd_demo_sk,hd_dep_count,hd_vehicle_count]
InputAdapter
WholeStageCodegen (8)
Sort [c_customer_sk]
InputAdapter
Exchange [c_customer_sk] #7
WholeStageCodegen (7)
Filter [c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_customer_sk,c_first_name,c_last_name,c_preferred_cust_flag,c_salutation]

@ -0,0 +1,203 @@
== Physical Plan ==
* Sort (36)
+- Exchange (35)
   +- * Project (34)
      +- * BroadcastHashJoin Inner BuildRight (33)
         :- * Filter (28)
         :  +- * HashAggregate (27)
         :     +- Exchange (26)
         :        +- * HashAggregate (25)
         :           +- * Project (24)
         :              +- * BroadcastHashJoin Inner BuildRight (23)
         :                 :- * Project (17)
         :                 :  +- * BroadcastHashJoin Inner BuildRight (16)
         :                 :     :- * Project (10)
         :                 :     :  +- * BroadcastHashJoin Inner BuildRight (9)
         :                 :     :     :- * Filter (3)
         :                 :     :     :  +- * ColumnarToRow (2)
         :                 :     :     :     +- Scan parquet default.store_sales (1)
         :                 :     :     +- BroadcastExchange (8)
         :                 :     :        +- * Project (7)
         :                 :     :           +- * Filter (6)
         :                 :     :              +- * ColumnarToRow (5)
         :                 :     :                 +- Scan parquet default.date_dim (4)
         :                 :     +- BroadcastExchange (15)
         :                 :        +- * Project (14)
         :                 :           +- * Filter (13)
         :                 :              +- * ColumnarToRow (12)
         :                 :                 +- Scan parquet default.store (11)
         :                 +- BroadcastExchange (22)
         :                    +- * Project (21)
         :                       +- * Filter (20)
         :                          +- * ColumnarToRow (19)
         :                             +- Scan parquet default.household_demographics (18)
         +- BroadcastExchange (32)
            +- * Filter (31)
               +- * ColumnarToRow (30)
                  +- Scan parquet default.customer (29)
(1) Scan parquet default.store_sales
Output [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2450816), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_store_sk:int,ss_ticket_number:int>
(2) ColumnarToRow [codegen id : 4]
Input [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
(3) Filter [codegen id : 4]
Input [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Condition : (((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2450816)) AND (ss_sold_date_sk#1 <= 2451910)) AND isnotnull(ss_store_sk#4)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#6, d_year#7, d_dom#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [Or(And(GreaterThanOrEqual(d_dom,1),LessThanOrEqual(d_dom,3)),And(GreaterThanOrEqual(d_dom,25),LessThanOrEqual(d_dom,28))), In(d_year, [1998,1999,2000]), GreaterThanOrEqual(d_date_sk,2450816), LessThanOrEqual(d_date_sk,2451910), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dom:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
Condition : (((((((d_dom#8 >= 1) AND (d_dom#8 <= 3)) OR ((d_dom#8 >= 25) AND (d_dom#8 <= 28))) AND d_year#7 IN (1998,1999,2000)) AND (d_date_sk#6 >= 2450816)) AND (d_date_sk#6 <= 2451910)) AND isnotnull(d_date_sk#6))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#6]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
(8) BroadcastExchange
Input [1]: [d_date_sk#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#9]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#6]
Join condition: None
(10) Project [codegen id : 4]
Output [4]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Input [6]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5, d_date_sk#6]
(11) Scan parquet default.store
Output [2]: [s_store_sk#10, s_county#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [In(s_county, [Saginaw County,Sumner County,Appanoose County,Daviess County,Fairfield County,Raleigh County,Ziebach County,Williamson County]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_county:string>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [s_store_sk#10, s_county#11]
(13) Filter [codegen id : 2]
Input [2]: [s_store_sk#10, s_county#11]
Condition : (s_county#11 IN (Saginaw County,Sumner County,Appanoose County,Daviess County,Fairfield County,Raleigh County,Ziebach County,Williamson County) AND isnotnull(s_store_sk#10))
(14) Project [codegen id : 2]
Output [1]: [s_store_sk#10]
Input [2]: [s_store_sk#10, s_county#11]
(15) BroadcastExchange
Input [1]: [s_store_sk#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#4]
Right keys [1]: [s_store_sk#10]
Join condition: None
(17) Project [codegen id : 4]
Output [3]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_ticket_number#5]
Input [5]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5, s_store_sk#10]
(18) Scan parquet default.household_demographics
Output [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/household_demographics]
PushedFilters: [IsNotNull(hd_vehicle_count), Or(EqualTo(hd_buy_potential,>10000),EqualTo(hd_buy_potential,Unknown)), GreaterThan(hd_vehicle_count,0), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_buy_potential:string,hd_dep_count:int,hd_vehicle_count:int>
(19) ColumnarToRow [codegen id : 3]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
(20) Filter [codegen id : 3]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
Condition : ((((isnotnull(hd_vehicle_count#16) AND ((hd_buy_potential#14 = >10000) OR (hd_buy_potential#14 = Unknown))) AND (hd_vehicle_count#16 > 0)) AND (CASE WHEN (hd_vehicle_count#16 > 0) THEN (cast(hd_dep_count#15 as double) / cast(hd_vehicle_count#16 as double)) ELSE null END > 1.2)) AND isnotnull(hd_demo_sk#13))
(21) Project [codegen id : 3]
Output [1]: [hd_demo_sk#13]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
(22) BroadcastExchange
Input [1]: [hd_demo_sk#13]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#17]
(23) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#13]
Join condition: None
(24) Project [codegen id : 4]
Output [2]: [ss_customer_sk#2, ss_ticket_number#5]
Input [4]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_ticket_number#5, hd_demo_sk#13]
(25) HashAggregate [codegen id : 4]
Input [2]: [ss_customer_sk#2, ss_ticket_number#5]
Keys [2]: [ss_ticket_number#5, ss_customer_sk#2]
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#18]
Results [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
(26) Exchange
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
Arguments: hashpartitioning(ss_ticket_number#5, ss_customer_sk#2, 5), true, [id=#20]
(27) HashAggregate [codegen id : 6]
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
Keys [2]: [ss_ticket_number#5, ss_customer_sk#2]
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#21]
Results [3]: [ss_ticket_number#5, ss_customer_sk#2, count(1)#21 AS cnt#22]
(28) Filter [codegen id : 6]
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22]
Condition : ((cnt#22 >= 15) AND (cnt#22 <= 20))
(29) Scan parquet default.customer
Output [5]: [c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_salutation:string,c_first_name:string,c_last_name:string,c_preferred_cust_flag:string>
(30) ColumnarToRow [codegen id : 5]
Input [5]: [c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
(31) Filter [codegen id : 5]
Input [5]: [c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
Condition : isnotnull(c_customer_sk#23)
(32) BroadcastExchange
Input [5]: [c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#28]
(33) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#23]
Join condition: None
(34) Project [codegen id : 6]
Output [6]: [c_last_name#26, c_first_name#25, c_salutation#24, c_preferred_cust_flag#27, ss_ticket_number#5, cnt#22]
Input [8]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22, c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
(35) Exchange
Input [6]: [c_last_name#26, c_first_name#25, c_salutation#24, c_preferred_cust_flag#27, ss_ticket_number#5, cnt#22]
Arguments: rangepartitioning(c_last_name#26 ASC NULLS FIRST, c_first_name#25 ASC NULLS FIRST, c_salutation#24 ASC NULLS FIRST, c_preferred_cust_flag#27 DESC NULLS LAST, 5), true, [id=#29]
(36) Sort [codegen id : 7]
Input [6]: [c_last_name#26, c_first_name#25, c_salutation#24, c_preferred_cust_flag#27, ss_ticket_number#5, cnt#22]
Arguments: [c_last_name#26 ASC NULLS FIRST, c_first_name#25 ASC NULLS FIRST, c_salutation#24 ASC NULLS FIRST, c_preferred_cust_flag#27 DESC NULLS LAST], true, 0

@ -0,0 +1,54 @@
WholeStageCodegen (7)
  Sort [c_first_name,c_last_name,c_preferred_cust_flag,c_salutation]
    InputAdapter
      Exchange [c_first_name,c_last_name,c_preferred_cust_flag,c_salutation] #1
        WholeStageCodegen (6)
          Project [c_first_name,c_last_name,c_preferred_cust_flag,c_salutation,cnt,ss_ticket_number]
            BroadcastHashJoin [c_customer_sk,ss_customer_sk]
              Filter [cnt]
                HashAggregate [count,ss_customer_sk,ss_ticket_number] [cnt,count,count(1)]
                  InputAdapter
                    Exchange [ss_customer_sk,ss_ticket_number] #2
                      WholeStageCodegen (4)
                        HashAggregate [ss_customer_sk,ss_ticket_number] [count,count]
                          Project [ss_customer_sk,ss_ticket_number]
                            BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
                              Project [ss_customer_sk,ss_hdemo_sk,ss_ticket_number]
                                BroadcastHashJoin [s_store_sk,ss_store_sk]
                                  Project [ss_customer_sk,ss_hdemo_sk,ss_store_sk,ss_ticket_number]
                                    BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
                                      Filter [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
                                        ColumnarToRow
                                          InputAdapter
                                            Scan parquet default.store_sales [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
                                      InputAdapter
                                        BroadcastExchange #3
                                          WholeStageCodegen (1)
                                            Project [d_date_sk]
                                              Filter [d_date_sk,d_dom,d_year]
                                                ColumnarToRow
                                                  InputAdapter
                                                    Scan parquet default.date_dim [d_date_sk,d_dom,d_year]
                                  InputAdapter
                                    BroadcastExchange #4
                                      WholeStageCodegen (2)
                                        Project [s_store_sk]
                                          Filter [s_county,s_store_sk]
                                            ColumnarToRow
                                              InputAdapter
                                                Scan parquet default.store [s_county,s_store_sk]
                              InputAdapter
                                BroadcastExchange #5
                                  WholeStageCodegen (3)
                                    Project [hd_demo_sk]
                                      Filter [hd_buy_potential,hd_demo_sk,hd_dep_count,hd_vehicle_count]
                                        ColumnarToRow
                                          InputAdapter
                                            Scan parquet default.household_demographics [hd_buy_potential,hd_demo_sk,hd_dep_count,hd_vehicle_count]
              InputAdapter
                BroadcastExchange #6
                  WholeStageCodegen (5)
                    Filter [c_customer_sk]
                      ColumnarToRow
                        InputAdapter
                          Scan parquet default.customer [c_customer_sk,c_first_name,c_last_name,c_preferred_cust_flag,c_salutation]

@ -0,0 +1,122 @@
== Physical Plan ==
TakeOrderedAndProject (21)
+- * HashAggregate (20)
   +- Exchange (19)
      +- * HashAggregate (18)
         +- * Project (17)
            +- * BroadcastHashJoin Inner BuildRight (16)
               :- * Project (10)
               :  +- * BroadcastHashJoin Inner BuildLeft (9)
               :     :- BroadcastExchange (5)
               :     :  +- * Project (4)
               :     :     +- * Filter (3)
               :     :        +- * ColumnarToRow (2)
               :     :           +- Scan parquet default.date_dim (1)
               :     +- * Filter (8)
               :        +- * ColumnarToRow (7)
               :           +- Scan parquet default.store_sales (6)
               +- BroadcastExchange (15)
                  +- * Project (14)
                     +- * Filter (13)
                        +- * ColumnarToRow (12)
                           +- Scan parquet default.item (11)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_moy#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_moy,12), EqualTo(d_year,1998), GreaterThanOrEqual(d_date_sk,2451149), IsNotNull(d_date_sk), LessThanOrEqual(d_date_sk,2451179)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(2) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(3) Filter [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
Condition : ((((((isnotnull(d_moy#3) AND isnotnull(d_year#2)) AND (d_moy#3 = 12)) AND (d_year#2 = 1998)) AND (d_date_sk#1 >= 2451149)) AND isnotnull(d_date_sk#1)) AND (d_date_sk#1 <= 2451179))
(4) Project [codegen id : 1]
Output [2]: [d_date_sk#1, d_year#2]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(5) BroadcastExchange
Input [2]: [d_date_sk#1, d_year#2]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#4]
(6) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451149), LessThanOrEqual(ss_sold_date_sk,2451179), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
(7) ColumnarToRow
Input [3]: [ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
(8) Filter
Input [3]: [ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
Condition : (((isnotnull(ss_sold_date_sk#5) AND (ss_sold_date_sk#5 >= 2451149)) AND (ss_sold_date_sk#5 <= 2451179)) AND isnotnull(ss_item_sk#6))
(9) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#5]
Join condition: None
(10) Project [codegen id : 3]
Output [3]: [d_year#2, ss_item_sk#6, ss_ext_sales_price#7]
Input [5]: [d_date_sk#1, d_year#2, ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
(11) Scan parquet default.item
Output [4]: [i_item_sk#8, i_category_id#9, i_category#10, i_manager_id#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [IsNotNull(i_manager_id), EqualTo(i_manager_id,1), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_category_id:int,i_category:string,i_manager_id:int>
(12) ColumnarToRow [codegen id : 2]
Input [4]: [i_item_sk#8, i_category_id#9, i_category#10, i_manager_id#11]
(13) Filter [codegen id : 2]
Input [4]: [i_item_sk#8, i_category_id#9, i_category#10, i_manager_id#11]
Condition : ((isnotnull(i_manager_id#11) AND (i_manager_id#11 = 1)) AND isnotnull(i_item_sk#8))
(14) Project [codegen id : 2]
Output [3]: [i_item_sk#8, i_category_id#9, i_category#10]
Input [4]: [i_item_sk#8, i_category_id#9, i_category#10, i_manager_id#11]
(15) BroadcastExchange
Input [3]: [i_item_sk#8, i_category_id#9, i_category#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_item_sk#6]
Right keys [1]: [i_item_sk#8]
Join condition: None
(17) Project [codegen id : 3]
Output [4]: [d_year#2, ss_ext_sales_price#7, i_category_id#9, i_category#10]
Input [6]: [d_year#2, ss_item_sk#6, ss_ext_sales_price#7, i_item_sk#8, i_category_id#9, i_category#10]
(18) HashAggregate [codegen id : 3]
Input [4]: [d_year#2, ss_ext_sales_price#7, i_category_id#9, i_category#10]
Keys [3]: [d_year#2, i_category_id#9, i_category#10]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#7))]
Aggregate Attributes [1]: [sum#13]
Results [4]: [d_year#2, i_category_id#9, i_category#10, sum#14]
(19) Exchange
Input [4]: [d_year#2, i_category_id#9, i_category#10, sum#14]
Arguments: hashpartitioning(d_year#2, i_category_id#9, i_category#10, 5), true, [id=#15]
(20) HashAggregate [codegen id : 4]
Input [4]: [d_year#2, i_category_id#9, i_category#10, sum#14]
Keys [3]: [d_year#2, i_category_id#9, i_category#10]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#7))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#7))#16]
Results [4]: [d_year#2, i_category_id#9, i_category#10, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#7))#16,17,2) AS sum(ss_ext_sales_price)#17]
(21) TakeOrderedAndProject
Input [4]: [d_year#2, i_category_id#9, i_category#10, sum(ss_ext_sales_price)#17]
Arguments: 100, [sum(ss_ext_sales_price)#17 DESC NULLS LAST, d_year#2 ASC NULLS FIRST, i_category_id#9 ASC NULLS FIRST, i_category#10 ASC NULLS FIRST], [d_year#2, i_category_id#9, i_category#10, sum(ss_ext_sales_price)#17]

@ -0,0 +1,31 @@
TakeOrderedAndProject [d_year,i_category,i_category_id,sum(ss_ext_sales_price)]
  WholeStageCodegen (4)
    HashAggregate [d_year,i_category,i_category_id,sum] [sum,sum(UnscaledValue(ss_ext_sales_price)),sum(ss_ext_sales_price)]
      InputAdapter
        Exchange [d_year,i_category,i_category_id] #1
          WholeStageCodegen (3)
            HashAggregate [d_year,i_category,i_category_id,ss_ext_sales_price] [sum,sum]
              Project [d_year,i_category,i_category_id,ss_ext_sales_price]
                BroadcastHashJoin [i_item_sk,ss_item_sk]
                  Project [d_year,ss_ext_sales_price,ss_item_sk]
                    BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
                      InputAdapter
                        BroadcastExchange #2
                          WholeStageCodegen (1)
                            Project [d_date_sk,d_year]
                              Filter [d_date_sk,d_moy,d_year]
                                ColumnarToRow
                                  InputAdapter
                                    Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
                      Filter [ss_item_sk,ss_sold_date_sk]
                        ColumnarToRow
                          InputAdapter
                            Scan parquet default.store_sales [ss_ext_sales_price,ss_item_sk,ss_sold_date_sk]
                  InputAdapter
                    BroadcastExchange #3
                      WholeStageCodegen (2)
                        Project [i_category,i_category_id,i_item_sk]
                          Filter [i_item_sk,i_manager_id]
                            ColumnarToRow
                              InputAdapter
                                Scan parquet default.item [i_category,i_category_id,i_item_sk,i_manager_id]

@ -0,0 +1,122 @@
== Physical Plan ==
TakeOrderedAndProject (21)
+- * HashAggregate (20)
   +- Exchange (19)
      +- * HashAggregate (18)
         +- * Project (17)
            +- * BroadcastHashJoin Inner BuildRight (16)
               :- * Project (10)
               :  +- * BroadcastHashJoin Inner BuildRight (9)
               :     :- * Project (4)
               :     :  +- * Filter (3)
               :     :     +- * ColumnarToRow (2)
               :     :        +- Scan parquet default.date_dim (1)
               :     +- BroadcastExchange (8)
               :        +- * Filter (7)
               :           +- * ColumnarToRow (6)
               :              +- Scan parquet default.store_sales (5)
               +- BroadcastExchange (15)
                  +- * Project (14)
                     +- * Filter (13)
                        +- * ColumnarToRow (12)
                           +- Scan parquet default.item (11)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_moy#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_moy,12), EqualTo(d_year,1998), LessThanOrEqual(d_date_sk,2451179), GreaterThanOrEqual(d_date_sk,2451149), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(2) ColumnarToRow [codegen id : 3]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(3) Filter [codegen id : 3]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
Condition : ((((((isnotnull(d_moy#3) AND isnotnull(d_year#2)) AND (d_moy#3 = 12)) AND (d_year#2 = 1998)) AND (d_date_sk#1 <= 2451179)) AND (d_date_sk#1 >= 2451149)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 3]
Output [2]: [d_date_sk#1, d_year#2]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(5) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451149), LessThanOrEqual(ss_sold_date_sk,2451179), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
(6) ColumnarToRow [codegen id : 1]
Input [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
(7) Filter [codegen id : 1]
Input [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
Condition : (((isnotnull(ss_sold_date_sk#4) AND (ss_sold_date_sk#4 >= 2451149)) AND (ss_sold_date_sk#4 <= 2451179)) AND isnotnull(ss_item_sk#5))
(8) BroadcastExchange
Input [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7]
(9) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#4]
Join condition: None
(10) Project [codegen id : 3]
Output [3]: [d_year#2, ss_item_sk#5, ss_ext_sales_price#6]
Input [5]: [d_date_sk#1, d_year#2, ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
(11) Scan parquet default.item
Output [4]: [i_item_sk#8, i_category_id#9, i_category#10, i_manager_id#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [IsNotNull(i_manager_id), EqualTo(i_manager_id,1), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_category_id:int,i_category:string,i_manager_id:int>
(12) ColumnarToRow [codegen id : 2]
Input [4]: [i_item_sk#8, i_category_id#9, i_category#10, i_manager_id#11]
(13) Filter [codegen id : 2]
Input [4]: [i_item_sk#8, i_category_id#9, i_category#10, i_manager_id#11]
Condition : ((isnotnull(i_manager_id#11) AND (i_manager_id#11 = 1)) AND isnotnull(i_item_sk#8))
(14) Project [codegen id : 2]
Output [3]: [i_item_sk#8, i_category_id#9, i_category#10]
Input [4]: [i_item_sk#8, i_category_id#9, i_category#10, i_manager_id#11]
(15) BroadcastExchange
Input [3]: [i_item_sk#8, i_category_id#9, i_category#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_item_sk#5]
Right keys [1]: [i_item_sk#8]
Join condition: None
(17) Project [codegen id : 3]
Output [4]: [d_year#2, ss_ext_sales_price#6, i_category_id#9, i_category#10]
Input [6]: [d_year#2, ss_item_sk#5, ss_ext_sales_price#6, i_item_sk#8, i_category_id#9, i_category#10]
(18) HashAggregate [codegen id : 3]
Input [4]: [d_year#2, ss_ext_sales_price#6, i_category_id#9, i_category#10]
Keys [3]: [d_year#2, i_category_id#9, i_category#10]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#6))]
Aggregate Attributes [1]: [sum#13]
Results [4]: [d_year#2, i_category_id#9, i_category#10, sum#14]
(19) Exchange
Input [4]: [d_year#2, i_category_id#9, i_category#10, sum#14]
Arguments: hashpartitioning(d_year#2, i_category_id#9, i_category#10, 5), true, [id=#15]
(20) HashAggregate [codegen id : 4]
Input [4]: [d_year#2, i_category_id#9, i_category#10, sum#14]
Keys [3]: [d_year#2, i_category_id#9, i_category#10]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#6))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#6))#16]
Results [4]: [d_year#2, i_category_id#9, i_category#10, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#6))#16,17,2) AS sum(ss_ext_sales_price)#17]
(21) TakeOrderedAndProject
Input [4]: [d_year#2, i_category_id#9, i_category#10, sum(ss_ext_sales_price)#17]
Arguments: 100, [sum(ss_ext_sales_price)#17 DESC NULLS LAST, d_year#2 ASC NULLS FIRST, i_category_id#9 ASC NULLS FIRST, i_category#10 ASC NULLS FIRST], [d_year#2, i_category_id#9, i_category#10, sum(ss_ext_sales_price)#17]

@ -0,0 +1,31 @@
TakeOrderedAndProject [d_year,i_category,i_category_id,sum(ss_ext_sales_price)]
  WholeStageCodegen (4)
    HashAggregate [d_year,i_category,i_category_id,sum] [sum,sum(UnscaledValue(ss_ext_sales_price)),sum(ss_ext_sales_price)]
      InputAdapter
        Exchange [d_year,i_category,i_category_id] #1
          WholeStageCodegen (3)
            HashAggregate [d_year,i_category,i_category_id,ss_ext_sales_price] [sum,sum]
              Project [d_year,i_category,i_category_id,ss_ext_sales_price]
                BroadcastHashJoin [i_item_sk,ss_item_sk]
                  Project [d_year,ss_ext_sales_price,ss_item_sk]
                    BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
                      Project [d_date_sk,d_year]
                        Filter [d_date_sk,d_moy,d_year]
                          ColumnarToRow
                            InputAdapter
                              Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
                      InputAdapter
                        BroadcastExchange #2
                          WholeStageCodegen (1)
                            Filter [ss_item_sk,ss_sold_date_sk]
                              ColumnarToRow
                                InputAdapter
                                  Scan parquet default.store_sales [ss_ext_sales_price,ss_item_sk,ss_sold_date_sk]
                  InputAdapter
                    BroadcastExchange #3
                      WholeStageCodegen (2)
                        Project [i_category,i_category_id,i_item_sk]
                          Filter [i_item_sk,i_manager_id]
                            ColumnarToRow
                              InputAdapter
                                Scan parquet default.item [i_category,i_category_id,i_item_sk,i_manager_id]

@ -0,0 +1,122 @@
== Physical Plan ==
TakeOrderedAndProject (21)
+- * HashAggregate (20)
   +- Exchange (19)
      +- * HashAggregate (18)
         +- * Project (17)
            +- * BroadcastHashJoin Inner BuildRight (16)
               :- * Project (10)
               :  +- * BroadcastHashJoin Inner BuildLeft (9)
               :     :- BroadcastExchange (5)
               :     :  +- * Project (4)
               :     :     +- * Filter (3)
               :     :        +- * ColumnarToRow (2)
               :     :           +- Scan parquet default.date_dim (1)
               :     +- * Filter (8)
               :        +- * ColumnarToRow (7)
               :           +- Scan parquet default.store_sales (6)
               +- BroadcastExchange (15)
                  +- * Project (14)
                     +- * Filter (13)
                        +- * ColumnarToRow (12)
                           +- Scan parquet default.store (11)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_day_name#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,1998), GreaterThanOrEqual(d_date_sk,2450816), LessThanOrEqual(d_date_sk,2451179), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_day_name:string>
(2) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_day_name#3]
(3) Filter [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_day_name#3]
Condition : ((((isnotnull(d_year#2) AND (d_year#2 = 1998)) AND (d_date_sk#1 >= 2450816)) AND (d_date_sk#1 <= 2451179)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 1]
Output [2]: [d_date_sk#1, d_day_name#3]
Input [3]: [d_date_sk#1, d_year#2, d_day_name#3]
(5) BroadcastExchange
Input [2]: [d_date_sk#1, d_day_name#3]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#4]
(6) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#5, ss_store_sk#6, ss_sales_price#7]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2450816), LessThanOrEqual(ss_sold_date_sk,2451179), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(7) ColumnarToRow
Input [3]: [ss_sold_date_sk#5, ss_store_sk#6, ss_sales_price#7]
(8) Filter
Input [3]: [ss_sold_date_sk#5, ss_store_sk#6, ss_sales_price#7]
Condition : (((isnotnull(ss_sold_date_sk#5) AND (ss_sold_date_sk#5 >= 2450816)) AND (ss_sold_date_sk#5 <= 2451179)) AND isnotnull(ss_store_sk#6))
(9) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#5]
Join condition: None
(10) Project [codegen id : 3]
Output [3]: [d_day_name#3, ss_store_sk#6, ss_sales_price#7]
Input [5]: [d_date_sk#1, d_day_name#3, ss_sold_date_sk#5, ss_store_sk#6, ss_sales_price#7]
(11) Scan parquet default.store
Output [4]: [s_store_sk#8, s_store_id#9, s_store_name#10, s_gmt_offset#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_gmt_offset), EqualTo(s_gmt_offset,-5.00), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_store_id:string,s_store_name:string,s_gmt_offset:decimal(5,2)>
(12) ColumnarToRow [codegen id : 2]
Input [4]: [s_store_sk#8, s_store_id#9, s_store_name#10, s_gmt_offset#11]
(13) Filter [codegen id : 2]
Input [4]: [s_store_sk#8, s_store_id#9, s_store_name#10, s_gmt_offset#11]
Condition : ((isnotnull(s_gmt_offset#11) AND (s_gmt_offset#11 = -5.00)) AND isnotnull(s_store_sk#8))
(14) Project [codegen id : 2]
Output [3]: [s_store_sk#8, s_store_id#9, s_store_name#10]
Input [4]: [s_store_sk#8, s_store_id#9, s_store_name#10, s_gmt_offset#11]
(15) BroadcastExchange
Input [3]: [s_store_sk#8, s_store_id#9, s_store_name#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_store_sk#6]
Right keys [1]: [s_store_sk#8]
Join condition: None
(17) Project [codegen id : 3]
Output [4]: [d_day_name#3, ss_sales_price#7, s_store_id#9, s_store_name#10]
Input [6]: [d_day_name#3, ss_store_sk#6, ss_sales_price#7, s_store_sk#8, s_store_id#9, s_store_name#10]
(18) HashAggregate [codegen id : 3]
Input [4]: [d_day_name#3, ss_sales_price#7, s_store_id#9, s_store_name#10]
Keys [2]: [s_store_name#10, s_store_id#9]
Functions [7]: [partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Sunday) THEN ss_sales_price#7 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Monday) THEN ss_sales_price#7 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Tuesday) THEN ss_sales_price#7 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Wednesday) THEN ss_sales_price#7 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Thursday) THEN ss_sales_price#7 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Friday) THEN ss_sales_price#7 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Saturday) THEN ss_sales_price#7 ELSE null END))]
Aggregate Attributes [7]: [sum#13, sum#14, sum#15, sum#16, sum#17, sum#18, sum#19]
Results [9]: [s_store_name#10, s_store_id#9, sum#20, sum#21, sum#22, sum#23, sum#24, sum#25, sum#26]
(19) Exchange
Input [9]: [s_store_name#10, s_store_id#9, sum#20, sum#21, sum#22, sum#23, sum#24, sum#25, sum#26]
Arguments: hashpartitioning(s_store_name#10, s_store_id#9, 5), true, [id=#27]
(20) HashAggregate [codegen id : 4]
Input [9]: [s_store_name#10, s_store_id#9, sum#20, sum#21, sum#22, sum#23, sum#24, sum#25, sum#26]
Keys [2]: [s_store_name#10, s_store_id#9]
Functions [7]: [sum(UnscaledValue(CASE WHEN (d_day_name#3 = Sunday) THEN ss_sales_price#7 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Monday) THEN ss_sales_price#7 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Tuesday) THEN ss_sales_price#7 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Wednesday) THEN ss_sales_price#7 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Thursday) THEN ss_sales_price#7 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Friday) THEN ss_sales_price#7 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Saturday) THEN ss_sales_price#7 ELSE null END))]
Aggregate Attributes [7]: [sum(UnscaledValue(CASE WHEN (d_day_name#3 = Sunday) THEN ss_sales_price#7 ELSE null END))#28, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Monday) THEN ss_sales_price#7 ELSE null END))#29, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Tuesday) THEN ss_sales_price#7 ELSE null END))#30, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Wednesday) THEN ss_sales_price#7 ELSE null END))#31, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Thursday) THEN ss_sales_price#7 ELSE null END))#32, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Friday) THEN ss_sales_price#7 ELSE null END))#33, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Saturday) THEN ss_sales_price#7 ELSE null END))#34]
Results [9]: [s_store_name#10, s_store_id#9, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Sunday) THEN ss_sales_price#7 ELSE null END))#28,17,2) AS sun_sales#35, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Monday) THEN ss_sales_price#7 ELSE null END))#29,17,2) AS mon_sales#36, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Tuesday) THEN ss_sales_price#7 ELSE null END))#30,17,2) AS tue_sales#37, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Wednesday) THEN ss_sales_price#7 ELSE null END))#31,17,2) AS wed_sales#38, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Thursday) THEN ss_sales_price#7 ELSE null END))#32,17,2) AS thu_sales#39, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Friday) THEN ss_sales_price#7 ELSE null END))#33,17,2) AS fri_sales#40, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Saturday) THEN ss_sales_price#7 ELSE null END))#34,17,2) AS sat_sales#41]
(21) TakeOrderedAndProject
Input [9]: [s_store_name#10, s_store_id#9, sun_sales#35, mon_sales#36, tue_sales#37, wed_sales#38, thu_sales#39, fri_sales#40, sat_sales#41]
Arguments: 100, [s_store_name#10 ASC NULLS FIRST, s_store_id#9 ASC NULLS FIRST, sun_sales#35 ASC NULLS FIRST, mon_sales#36 ASC NULLS FIRST, tue_sales#37 ASC NULLS FIRST, wed_sales#38 ASC NULLS FIRST, thu_sales#39 ASC NULLS FIRST, fri_sales#40 ASC NULLS FIRST, sat_sales#41 ASC NULLS FIRST], [s_store_name#10, s_store_id#9, sun_sales#35, mon_sales#36, tue_sales#37, wed_sales#38, thu_sales#39, fri_sales#40, sat_sales#41]

@ -0,0 +1,31 @@
TakeOrderedAndProject [fri_sales,mon_sales,s_store_id,s_store_name,sat_sales,sun_sales,thu_sales,tue_sales,wed_sales]
WholeStageCodegen (4)
HashAggregate [s_store_id,s_store_name,sum,sum,sum,sum,sum,sum,sum] [fri_sales,mon_sales,sat_sales,sum,sum,sum,sum,sum,sum,sum,sum(UnscaledValue(CASE WHEN (d_day_name = Friday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Monday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Saturday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Sunday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Thursday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Tuesday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Wednesday) THEN ss_sales_price ELSE null END)),sun_sales,thu_sales,tue_sales,wed_sales]
InputAdapter
Exchange [s_store_id,s_store_name] #1
WholeStageCodegen (3)
HashAggregate [d_day_name,s_store_id,s_store_name,ss_sales_price] [sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum]
Project [d_day_name,s_store_id,s_store_name,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [d_day_name,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [d_date_sk,d_day_name]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_day_name,d_year]
Filter [ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [s_store_id,s_store_name,s_store_sk]
Filter [s_gmt_offset,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_gmt_offset,s_store_id,s_store_name,s_store_sk]

@ -0,0 +1,122 @@
== Physical Plan ==
TakeOrderedAndProject (21)
+- * HashAggregate (20)
+- Exchange (19)
+- * HashAggregate (18)
+- * Project (17)
+- * BroadcastHashJoin Inner BuildRight (16)
:- * Project (10)
: +- * BroadcastHashJoin Inner BuildRight (9)
: :- * Project (4)
: : +- * Filter (3)
: : +- * ColumnarToRow (2)
: : +- Scan parquet default.date_dim (1)
: +- BroadcastExchange (8)
: +- * Filter (7)
: +- * ColumnarToRow (6)
: +- Scan parquet default.store_sales (5)
+- BroadcastExchange (15)
+- * Project (14)
+- * Filter (13)
+- * ColumnarToRow (12)
+- Scan parquet default.store (11)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_day_name#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,1998), GreaterThanOrEqual(d_date_sk,2450816), LessThanOrEqual(d_date_sk,2451179), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_day_name:string>
(2) ColumnarToRow [codegen id : 3]
Input [3]: [d_date_sk#1, d_year#2, d_day_name#3]
(3) Filter [codegen id : 3]
Input [3]: [d_date_sk#1, d_year#2, d_day_name#3]
Condition : ((((isnotnull(d_year#2) AND (d_year#2 = 1998)) AND (d_date_sk#1 >= 2450816)) AND (d_date_sk#1 <= 2451179)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 3]
Output [2]: [d_date_sk#1, d_day_name#3]
Input [3]: [d_date_sk#1, d_year#2, d_day_name#3]
(5) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#4, ss_store_sk#5, ss_sales_price#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2450816), LessThanOrEqual(ss_sold_date_sk,2451179), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(6) ColumnarToRow [codegen id : 1]
Input [3]: [ss_sold_date_sk#4, ss_store_sk#5, ss_sales_price#6]
(7) Filter [codegen id : 1]
Input [3]: [ss_sold_date_sk#4, ss_store_sk#5, ss_sales_price#6]
Condition : (((isnotnull(ss_sold_date_sk#4) AND (ss_sold_date_sk#4 >= 2450816)) AND (ss_sold_date_sk#4 <= 2451179)) AND isnotnull(ss_store_sk#5))
(8) BroadcastExchange
Input [3]: [ss_sold_date_sk#4, ss_store_sk#5, ss_sales_price#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7]
(9) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#4]
Join condition: None
(10) Project [codegen id : 3]
Output [3]: [d_day_name#3, ss_store_sk#5, ss_sales_price#6]
Input [5]: [d_date_sk#1, d_day_name#3, ss_sold_date_sk#4, ss_store_sk#5, ss_sales_price#6]
(11) Scan parquet default.store
Output [4]: [s_store_sk#8, s_store_id#9, s_store_name#10, s_gmt_offset#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_gmt_offset), EqualTo(s_gmt_offset,-5.00), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_store_id:string,s_store_name:string,s_gmt_offset:decimal(5,2)>
(12) ColumnarToRow [codegen id : 2]
Input [4]: [s_store_sk#8, s_store_id#9, s_store_name#10, s_gmt_offset#11]
(13) Filter [codegen id : 2]
Input [4]: [s_store_sk#8, s_store_id#9, s_store_name#10, s_gmt_offset#11]
Condition : ((isnotnull(s_gmt_offset#11) AND (s_gmt_offset#11 = -5.00)) AND isnotnull(s_store_sk#8))
(14) Project [codegen id : 2]
Output [3]: [s_store_sk#8, s_store_id#9, s_store_name#10]
Input [4]: [s_store_sk#8, s_store_id#9, s_store_name#10, s_gmt_offset#11]
(15) BroadcastExchange
Input [3]: [s_store_sk#8, s_store_id#9, s_store_name#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_store_sk#5]
Right keys [1]: [s_store_sk#8]
Join condition: None
(17) Project [codegen id : 3]
Output [4]: [d_day_name#3, ss_sales_price#6, s_store_id#9, s_store_name#10]
Input [6]: [d_day_name#3, ss_store_sk#5, ss_sales_price#6, s_store_sk#8, s_store_id#9, s_store_name#10]
(18) HashAggregate [codegen id : 3]
Input [4]: [d_day_name#3, ss_sales_price#6, s_store_id#9, s_store_name#10]
Keys [2]: [s_store_name#10, s_store_id#9]
Functions [7]: [partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Sunday) THEN ss_sales_price#6 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Monday) THEN ss_sales_price#6 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Tuesday) THEN ss_sales_price#6 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Wednesday) THEN ss_sales_price#6 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Thursday) THEN ss_sales_price#6 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Friday) THEN ss_sales_price#6 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#3 = Saturday) THEN ss_sales_price#6 ELSE null END))]
Aggregate Attributes [7]: [sum#13, sum#14, sum#15, sum#16, sum#17, sum#18, sum#19]
Results [9]: [s_store_name#10, s_store_id#9, sum#20, sum#21, sum#22, sum#23, sum#24, sum#25, sum#26]
(19) Exchange
Input [9]: [s_store_name#10, s_store_id#9, sum#20, sum#21, sum#22, sum#23, sum#24, sum#25, sum#26]
Arguments: hashpartitioning(s_store_name#10, s_store_id#9, 5), true, [id=#27]
(20) HashAggregate [codegen id : 4]
Input [9]: [s_store_name#10, s_store_id#9, sum#20, sum#21, sum#22, sum#23, sum#24, sum#25, sum#26]
Keys [2]: [s_store_name#10, s_store_id#9]
Functions [7]: [sum(UnscaledValue(CASE WHEN (d_day_name#3 = Sunday) THEN ss_sales_price#6 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Monday) THEN ss_sales_price#6 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Tuesday) THEN ss_sales_price#6 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Wednesday) THEN ss_sales_price#6 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Thursday) THEN ss_sales_price#6 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Friday) THEN ss_sales_price#6 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#3 = Saturday) THEN ss_sales_price#6 ELSE null END))]
Aggregate Attributes [7]: [sum(UnscaledValue(CASE WHEN (d_day_name#3 = Sunday) THEN ss_sales_price#6 ELSE null END))#28, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Monday) THEN ss_sales_price#6 ELSE null END))#29, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Tuesday) THEN ss_sales_price#6 ELSE null END))#30, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Wednesday) THEN ss_sales_price#6 ELSE null END))#31, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Thursday) THEN ss_sales_price#6 ELSE null END))#32, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Friday) THEN ss_sales_price#6 ELSE null END))#33, sum(UnscaledValue(CASE WHEN (d_day_name#3 = Saturday) THEN ss_sales_price#6 ELSE null END))#34]
Results [9]: [s_store_name#10, s_store_id#9, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Sunday) THEN ss_sales_price#6 ELSE null END))#28,17,2) AS sun_sales#35, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Monday) THEN ss_sales_price#6 ELSE null END))#29,17,2) AS mon_sales#36, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Tuesday) THEN ss_sales_price#6 ELSE null END))#30,17,2) AS tue_sales#37, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Wednesday) THEN ss_sales_price#6 ELSE null END))#31,17,2) AS wed_sales#38, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Thursday) THEN ss_sales_price#6 ELSE null END))#32,17,2) AS thu_sales#39, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Friday) THEN ss_sales_price#6 ELSE null END))#33,17,2) AS fri_sales#40, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#3 = Saturday) THEN ss_sales_price#6 ELSE null END))#34,17,2) AS sat_sales#41]
(21) TakeOrderedAndProject
Input [9]: [s_store_name#10, s_store_id#9, sun_sales#35, mon_sales#36, tue_sales#37, wed_sales#38, thu_sales#39, fri_sales#40, sat_sales#41]
Arguments: 100, [s_store_name#10 ASC NULLS FIRST, s_store_id#9 ASC NULLS FIRST, sun_sales#35 ASC NULLS FIRST, mon_sales#36 ASC NULLS FIRST, tue_sales#37 ASC NULLS FIRST, wed_sales#38 ASC NULLS FIRST, thu_sales#39 ASC NULLS FIRST, fri_sales#40 ASC NULLS FIRST, sat_sales#41 ASC NULLS FIRST], [s_store_name#10, s_store_id#9, sun_sales#35, mon_sales#36, tue_sales#37, wed_sales#38, thu_sales#39, fri_sales#40, sat_sales#41]

@ -0,0 +1,31 @@
TakeOrderedAndProject [fri_sales,mon_sales,s_store_id,s_store_name,sat_sales,sun_sales,thu_sales,tue_sales,wed_sales]
WholeStageCodegen (4)
HashAggregate [s_store_id,s_store_name,sum,sum,sum,sum,sum,sum,sum] [fri_sales,mon_sales,sat_sales,sum,sum,sum,sum,sum,sum,sum,sum(UnscaledValue(CASE WHEN (d_day_name = Friday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Monday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Saturday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Sunday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Thursday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Tuesday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Wednesday) THEN ss_sales_price ELSE null END)),sun_sales,thu_sales,tue_sales,wed_sales]
InputAdapter
Exchange [s_store_id,s_store_name] #1
WholeStageCodegen (3)
HashAggregate [d_day_name,s_store_id,s_store_name,ss_sales_price] [sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum]
Project [d_day_name,s_store_id,s_store_name,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [d_day_name,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [d_date_sk,d_day_name]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_day_name,d_year]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Filter [ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [s_store_id,s_store_name,s_store_sk]
Filter [s_gmt_offset,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_gmt_offset,s_store_id,s_store_name,s_store_sk]

@ -0,0 +1,281 @@
== Physical Plan ==
TakeOrderedAndProject (51)
+- * Project (50)
+- * SortMergeJoin Inner (49)
:- * Sort (46)
: +- Exchange (45)
: +- * Project (44)
: +- * SortMergeJoin Inner (43)
: :- * Sort (37)
: : +- Exchange (36)
: : +- * HashAggregate (35)
: : +- * HashAggregate (34)
: : +- * Project (33)
: : +- * SortMergeJoin Inner (32)
: : :- * Sort (26)
: : : +- Exchange (25)
: : : +- * Project (24)
: : : +- * BroadcastHashJoin Inner BuildRight (23)
: : : :- * Project (17)
: : : : +- * BroadcastHashJoin Inner BuildRight (16)
: : : : :- * Project (10)
: : : : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : : : :- * Filter (3)
: : : : : : +- * ColumnarToRow (2)
: : : : : : +- Scan parquet default.store_sales (1)
: : : : : +- BroadcastExchange (8)
: : : : : +- * Project (7)
: : : : : +- * Filter (6)
: : : : : +- * ColumnarToRow (5)
: : : : : +- Scan parquet default.date_dim (4)
: : : : +- BroadcastExchange (15)
: : : : +- * Project (14)
: : : : +- * Filter (13)
: : : : +- * ColumnarToRow (12)
: : : : +- Scan parquet default.store (11)
: : : +- BroadcastExchange (22)
: : : +- * Project (21)
: : : +- * Filter (20)
: : : +- * ColumnarToRow (19)
: : : +- Scan parquet default.household_demographics (18)
: : +- * Sort (31)
: : +- Exchange (30)
: : +- * Filter (29)
: : +- * ColumnarToRow (28)
: : +- Scan parquet default.customer_address (27)
: +- * Sort (42)
: +- Exchange (41)
: +- * Filter (40)
: +- * ColumnarToRow (39)
: +- Scan parquet default.customer (38)
+- * Sort (48)
+- ReusedExchange (47)
(1) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [In(ss_sold_date_sk, [2451790,2451609,2451294,2451658,2452099,2451482,2451700,2452035,2452274,2451258,2451847,2451714,2451937,2451860,2451601,2451573,2451686,2452008,2451454,2451882,2451832,2452259,2451671,2451903,2451497,2452162,2451322,2451517,2451434,2451273,2451405,2452105,2451924,2452050,2452126,2452203,2451818,2451559,2451853,2451238,2451209,2451357,2451959,2452239,2451608,2452141,2452252,2451623,2451867,2451504,2451910,2452232,2451874,2451581,2451329,2451223,2451783,2452267,2452042,2451895,2451986,2452091,2451693,2451265,2451678,2451825,2451244,2451490,2451287,2451419,2451546,2451245,2451713,2452070,2451189,2451804,2451468,2451525,2451902,2452077,2452161,2451378,2451567,2451931,2451699,2451251,2451840,2452253,2451938,2451510,2452231,2452036,2451616,2451230,2452112,2451846,2451966,2451538,2451819,2452140,2452183,2451496,2451791,2451595,2451574,2451363,2451994,2451917,2451602,2452273,2451237,2451350,2451685,2451259,2451286,2451972,2452224,2451370,2452245,2451643,2451993,2451315,2451301,2451560,2451433,2452225,2451532,2451755,2451854,2451545,2451210,2451587,2451987,2451447,2452197,2451552,2451896,2451679,2452147,2451735,2452022,2451707,2451868,2451398,2451777,2451181,2451503,2451839,2452175,2451441,2452154,2452029,2452196,2451952,2451805,2451965,2451539,2452001,2451833,2451392,2451524,2451461,2452133,2451448,2451307,2451615,2451769,2451412,2451349,2451651,2451763,2451203,2452064,2451980,2451748,2451637,2452182,2451279,2451231,2451734,2451692,2452071,2451336,2451300,2451727,2451630,2452189,2451875,2451973,2451328,2452084,2451399,2451944,2452204,2451385,2451776,2451384,2451272,2451812,2451749,2451566,2451182,2451945,2451420,2451930,2452057,2451756,2451644,2451314,2451364,2452007,2451798,2451475,2452015,2451440,2452000,2451588,2452148,2451195,2452217,2451371,2452176,2451531,2452134,2452211,2451462,2451188,2451741,2452119,2451342,2451580,2451672,2451889,2451280,2451406,2451293,2451217,2452049,2452106,2451321,2451335,2451483,2452260,2451657,2451979,245
1518,2451629,2451728,2451923,2451861,2451951,2452246,2451455,2451356,2451224,2452210,2452021,2451427,2451202,2452098,2452168,2451553,2451391,2451706,2452155,2451196,2451770,2452127,2451762,2452078,2451958,2451721,2451665,2452120,2451252,2452085,2452092,2451476,2452218,2452169,2451797,2451650,2451881,2451511,2451469,2451888,2452043,2452266,2451664,2452014,2451343,2452056,2452190,2452063,2451636,2451742,2451811,2451720,2451308,2451489,2451413,2451216,2451594,2452238,2451784,2451426,2451622,2451916,2452113,2451909,2451266,2451826,2451377,2452028]), IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_addr_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_addr_sk:int,ss_store_sk:int,ss_ticket_number:int,ss_coupon_amt:decimal(7,2),ss_net_profit:decimal(7,2)>
(2) ColumnarToRow [codegen id : 4]
Input [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
(3) Filter [codegen id : 4]
Input [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Condition : (((((ss_sold_date_sk#1 INSET (2451790,2451609,2451294,2451658,2452099,2451482,2451700,2452035,2452274,2451258,2451847,2451714,2451937,2451860,2451601,2451573,2451686,2452008,2451454,2451882,2451832,2452259,2451671,2451903,2451497,2452162,2451322,2451517,2451434,2451273,2451405,2452105,2451924,2452050,2452126,2452203,2451818,2451559,2451853,2451238,2451209,2451357,2451959,2452239,2451608,2452141,2452252,2451623,2451867,2451504,2451910,2452232,2451874,2451581,2451329,2451223,2451783,2452267,2452042,2451895,2451986,2452091,2451693,2451265,2451678,2451825,2451244,2451490,2451287,2451419,2451546,2451245,2451713,2452070,2451189,2451804,2451468,2451525,2451902,2452077,2452161,2451378,2451567,2451931,2451699,2451251,2451840,2452253,2451938,2451510,2452231,2452036,2451616,2451230,2452112,2451846,2451966,2451538,2451819,2452140,2452183,2451496,2451791,2451595,2451574,2451363,2451994,2451917,2451602,2452273,2451237,2451350,2451685,2451259,2451286,2451972,2452224,2451370,2452245,2451643,2451993,2451315,2451301,2451560,2451433,2452225,2451532,2451755,2451854,2451545,2451210,2451587,2451987,2451447,2452197,2451552,2451896,2451679,2452147,2451735,2452022,2451707,2451868,2451398,2451777,2451181,2451503,2451839,2452175,2451441,2452154,2452029,2452196,2451952,2451805,2451965,2451539,2452001,2451833,2451392,2451524,2451461,2452133,2451448,2451307,2451615,2451769,2451412,2451349,2451651,2451763,2451203,2452064,2451980,2451748,2451637,2452182,2451279,2451231,2451734,2451692,2452071,2451336,2451300,2451727,2451630,2452189,2451875,2451973,2451328,2452084,2451399,2451944,2452204,2451385,2451776,2451384,2451272,2451812,2451749,2451566,2451182,2451945,2451420,2451930,2452057,2451756,2451644,2451314,2451364,2452007,2451798,2451475,2452015,2451440,2452000,2451588,2452148,2451195,2452217,2451371,2452176,2451531,2452134,2452211,2451462,2451188,2451741,2452119,2451342,2451580,2451672,2451889,2451280,2451406,2451293,2451217,2452049,2452106,2451321,2451335,2451483,2452260,2451657,245197
9,2451518,2451629,2451728,2451923,2451861,2451951,2452246,2451455,2451356,2451224,2452210,2452021,2451427,2451202,2452098,2452168,2451553,2451391,2451706,2452155,2451196,2451770,2452127,2451762,2452078,2451958,2451721,2451665,2452120,2451252,2452085,2452092,2451476,2452218,2452169,2451797,2451650,2451881,2451511,2451469,2451888,2452043,2452266,2451664,2452014,2451343,2452056,2452190,2452063,2451636,2451742,2451811,2451720,2451308,2451489,2451413,2451216,2451594,2452238,2451784,2451426,2451622,2451916,2452113,2451909,2451266,2451826,2451377,2452028) AND isnotnull(ss_sold_date_sk#1)) AND isnotnull(ss_store_sk#5)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_addr_sk#4)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#9, d_year#10, d_dow#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [In(d_dow, [6,0]), In(d_year, [1999,2000,2001]), In(d_date_sk, [2451790,2451609,2451294,2451658,2452099,2451482,2451700,2452035,2452274,2451258,2451847,2451714,2451937,2451860,2451601,2451573,2451686,2452008,2451454,2451882,2451832,2452259,2451671,2451903,2451497,2452162,2451322,2451517,2451434,2451273,2451405,2452105,2451924,2452050,2452126,2452203,2451818,2451559,2451853,2451238,2451209,2451357,2451959,2452239,2451608,2452141,2452252,2451623,2451867,2451504,2451910,2452232,2451874,2451581,2451329,2451223,2451783,2452267,2452042,2451895,2451986,2452091,2451693,2451265,2451678,2451825,2451244,2451490,2451287,2451419,2451546,2451245,2451713,2452070,2451189,2451804,2451468,2451525,2451902,2452077,2452161,2451378,2451567,2451931,2451699,2451251,2451840,2452253,2451938,2451510,2452231,2452036,2451616,2451230,2452112,2451846,2451966,2451538,2451819,2452140,2452183,2451496,2451791,2451595,2451574,2451363,2451994,2451917,2451602,2452273,2451237,2451350,2451685,2451259,2451286,2451972,2452224,2451370,2452245,2451643,2451993,2451315,2451301,2451560,2451433,2452225,2451532,2451755,2451854,2451545,2451210,2451587,2451987,2451447,2452197,2451552,2451896,2451679,2452147,2451735,2452022,2451707,2451868,2451398,2451777,2451181,2451503,2451839,2452175,2451441,2452154,2452029,2452196,2451952,2451805,2451965,2451539,2452001,2451833,2451392,2451524,2451461,2452133,2451448,2451307,2451615,2451769,2451412,2451349,2451651,2451763,2451203,2452064,2451980,2451748,2451637,2452182,2451279,2451231,2451734,2451692,2452071,2451336,2451300,2451727,2451630,2452189,2451875,2451973,2451328,2452084,2451399,2451944,2452204,2451385,2451776,2451384,2451272,2451812,2451749,2451566,2451182,2451945,2451420,2451930,2452057,2451756,2451644,2451314,2451364,2452007,2451798,2451475,2452015,2451440,2452000,2451588,2452148,2451195,2452217,2451371,2452176,2451531,2452134,2452211,2451462,2451188,2451741,2452119,2451342,2451580,2451672,2451889,2451280,2451406,2451293,2451217,2452049,2452106,2451321,2
451335,2451483,2452260,2451657,2451979,2451518,2451629,2451728,2451923,2451861,2451951,2452246,2451455,2451356,2451224,2452210,2452021,2451427,2451202,2452098,2452168,2451553,2451391,2451706,2452155,2451196,2451770,2452127,2451762,2452078,2451958,2451721,2451665,2452120,2451252,2452085,2452092,2451476,2452218,2452169,2451797,2451650,2451881,2451511,2451469,2451888,2452043,2452266,2451664,2452014,2451343,2452056,2452190,2452063,2451636,2451742,2451811,2451720,2451308,2451489,2451413,2451216,2451594,2452238,2451784,2451426,2451622,2451916,2452113,2451909,2451266,2451826,2451377,2452028]), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dow:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
Condition : (((d_dow#11 IN (6,0) AND d_year#10 IN (1999,2000,2001)) AND d_date_sk#9 INSET (2451790,2451609,2451294,2451658,2452099,2451482,2451700,2452035,2452274,2451258,2451847,2451714,2451937,2451860,2451601,2451573,2451686,2452008,2451454,2451882,2451832,2452259,2451671,2451903,2451497,2452162,2451322,2451517,2451434,2451273,2451405,2452105,2451924,2452050,2452126,2452203,2451818,2451559,2451853,2451238,2451209,2451357,2451959,2452239,2451608,2452141,2452252,2451623,2451867,2451504,2451910,2452232,2451874,2451581,2451329,2451223,2451783,2452267,2452042,2451895,2451986,2452091,2451693,2451265,2451678,2451825,2451244,2451490,2451287,2451419,2451546,2451245,2451713,2452070,2451189,2451804,2451468,2451525,2451902,2452077,2452161,2451378,2451567,2451931,2451699,2451251,2451840,2452253,2451938,2451510,2452231,2452036,2451616,2451230,2452112,2451846,2451966,2451538,2451819,2452140,2452183,2451496,2451791,2451595,2451574,2451363,2451994,2451917,2451602,2452273,2451237,2451350,2451685,2451259,2451286,2451972,2452224,2451370,2452245,2451643,2451993,2451315,2451301,2451560,2451433,2452225,2451532,2451755,2451854,2451545,2451210,2451587,2451987,2451447,2452197,2451552,2451896,2451679,2452147,2451735,2452022,2451707,2451868,2451398,2451777,2451181,2451503,2451839,2452175,2451441,2452154,2452029,2452196,2451952,2451805,2451965,2451539,2452001,2451833,2451392,2451524,2451461,2452133,2451448,2451307,2451615,2451769,2451412,2451349,2451651,2451763,2451203,2452064,2451980,2451748,2451637,2452182,2451279,2451231,2451734,2451692,2452071,2451336,2451300,2451727,2451630,2452189,2451875,2451973,2451328,2452084,2451399,2451944,2452204,2451385,2451776,2451384,2451272,2451812,2451749,2451566,2451182,2451945,2451420,2451930,2452057,2451756,2451644,2451314,2451364,2452007,2451798,2451475,2452015,2451440,2452000,2451588,2452148,2451195,2452217,2451371,2452176,2451531,2452134,2452211,2451462,2451188,2451741,2452119,2451342,2451580,2451672,2451889,2451280,2451406,2451293,2451217,2452049,24521
06,2451321,2451335,2451483,2452260,2451657,2451979,2451518,2451629,2451728,2451923,2451861,2451951,2452246,2451455,2451356,2451224,2452210,2452021,2451427,2451202,2452098,2452168,2451553,2451391,2451706,2452155,2451196,2451770,2452127,2451762,2452078,2451958,2451721,2451665,2452120,2451252,2452085,2452092,2451476,2452218,2452169,2451797,2451650,2451881,2451511,2451469,2451888,2452043,2452266,2451664,2452014,2451343,2452056,2452190,2452063,2451636,2451742,2451811,2451720,2451308,2451489,2451413,2451216,2451594,2452238,2451784,2451426,2451622,2451916,2452113,2451909,2451266,2451826,2451377,2452028)) AND isnotnull(d_date_sk#9))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#9]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
(8) BroadcastExchange
Input [1]: [d_date_sk#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#9]
Join condition: None
(10) Project [codegen id : 4]
Output [7]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Input [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, d_date_sk#9]
(11) Scan parquet default.store
Output [2]: [s_store_sk#13, s_city#14]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [In(s_city, [Midway,Concord,Spring Hill,Brownsville,Greenville]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_city:string>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [s_store_sk#13, s_city#14]
(13) Filter [codegen id : 2]
Input [2]: [s_store_sk#13, s_city#14]
Condition : (s_city#14 IN (Midway,Concord,Spring Hill,Brownsville,Greenville) AND isnotnull(s_store_sk#13))
(14) Project [codegen id : 2]
Output [1]: [s_store_sk#13]
Input [2]: [s_store_sk#13, s_city#14]
(15) BroadcastExchange
Input [1]: [s_store_sk#13]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#15]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#5]
Right keys [1]: [s_store_sk#13]
Join condition: None
(17) Project [codegen id : 4]
Output [6]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Input [8]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_store_sk#13]
(18) Scan parquet default.household_demographics
Output [3]: [hd_demo_sk#16, hd_dep_count#17, hd_vehicle_count#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/household_demographics]
PushedFilters: [Or(EqualTo(hd_dep_count,5),EqualTo(hd_vehicle_count,3)), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_dep_count:int,hd_vehicle_count:int>
(19) ColumnarToRow [codegen id : 3]
Input [3]: [hd_demo_sk#16, hd_dep_count#17, hd_vehicle_count#18]
(20) Filter [codegen id : 3]
Input [3]: [hd_demo_sk#16, hd_dep_count#17, hd_vehicle_count#18]
Condition : (((hd_dep_count#17 = 5) OR (hd_vehicle_count#18 = 3)) AND isnotnull(hd_demo_sk#16))
(21) Project [codegen id : 3]
Output [1]: [hd_demo_sk#16]
Input [3]: [hd_demo_sk#16, hd_dep_count#17, hd_vehicle_count#18]
(22) BroadcastExchange
Input [1]: [hd_demo_sk#16]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#19]
(23) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#16]
Join condition: None
(24) Project [codegen id : 4]
Output [5]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Input [7]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, hd_demo_sk#16]
(25) Exchange
Input [5]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Arguments: hashpartitioning(ss_addr_sk#4, 5), true, [id=#20]
(26) Sort [codegen id : 5]
Input [5]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Arguments: [ss_addr_sk#4 ASC NULLS FIRST], false, 0
(27) Scan parquet default.customer_address
Output [2]: [ca_address_sk#21, ca_city#22]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer_address]
PushedFilters: [IsNotNull(ca_address_sk), IsNotNull(ca_city)]
ReadSchema: struct<ca_address_sk:int,ca_city:string>
(28) ColumnarToRow [codegen id : 6]
Input [2]: [ca_address_sk#21, ca_city#22]
(29) Filter [codegen id : 6]
Input [2]: [ca_address_sk#21, ca_city#22]
Condition : (isnotnull(ca_address_sk#21) AND isnotnull(ca_city#22))
(30) Exchange
Input [2]: [ca_address_sk#21, ca_city#22]
Arguments: hashpartitioning(ca_address_sk#21, 5), true, [id=#23]
(31) Sort [codegen id : 7]
Input [2]: [ca_address_sk#21, ca_city#22]
Arguments: [ca_address_sk#21 ASC NULLS FIRST], false, 0
(32) SortMergeJoin [codegen id : 8]
Left keys [1]: [ss_addr_sk#4]
Right keys [1]: [ca_address_sk#21]
Join condition: None
(33) Project [codegen id : 8]
Output [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, ca_city#22]
Input [7]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, ca_address_sk#21, ca_city#22]
(34) HashAggregate [codegen id : 8]
Input [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, ca_city#22]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22]
Functions [2]: [partial_sum(UnscaledValue(ss_coupon_amt#7)), partial_sum(UnscaledValue(ss_net_profit#8))]
Aggregate Attributes [2]: [sum#24, sum#25]
Results [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22, sum#26, sum#27]
(35) HashAggregate [codegen id : 8]
Input [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22, sum#26, sum#27]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22]
Functions [2]: [sum(UnscaledValue(ss_coupon_amt#7)), sum(UnscaledValue(ss_net_profit#8))]
Aggregate Attributes [2]: [sum(UnscaledValue(ss_coupon_amt#7))#28, sum(UnscaledValue(ss_net_profit#8))#29]
Results [5]: [ss_ticket_number#6, ss_customer_sk#2, ca_city#22 AS bought_city#30, MakeDecimal(sum(UnscaledValue(ss_coupon_amt#7))#28,17,2) AS amt#31, MakeDecimal(sum(UnscaledValue(ss_net_profit#8))#29,17,2) AS profit#32]
(36) Exchange
Input [5]: [ss_ticket_number#6, ss_customer_sk#2, bought_city#30, amt#31, profit#32]
Arguments: hashpartitioning(ss_customer_sk#2, 5), true, [id=#33]
(37) Sort [codegen id : 9]
Input [5]: [ss_ticket_number#6, ss_customer_sk#2, bought_city#30, amt#31, profit#32]
Arguments: [ss_customer_sk#2 ASC NULLS FIRST], false, 0
(38) Scan parquet default.customer
Output [4]: [c_customer_sk#34, c_current_addr_sk#35, c_first_name#36, c_last_name#37]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_current_addr_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_addr_sk:int,c_first_name:string,c_last_name:string>
(39) ColumnarToRow [codegen id : 10]
Input [4]: [c_customer_sk#34, c_current_addr_sk#35, c_first_name#36, c_last_name#37]
(40) Filter [codegen id : 10]
Input [4]: [c_customer_sk#34, c_current_addr_sk#35, c_first_name#36, c_last_name#37]
Condition : (isnotnull(c_customer_sk#34) AND isnotnull(c_current_addr_sk#35))
(41) Exchange
Input [4]: [c_customer_sk#34, c_current_addr_sk#35, c_first_name#36, c_last_name#37]
Arguments: hashpartitioning(c_customer_sk#34, 5), true, [id=#38]
(42) Sort [codegen id : 11]
Input [4]: [c_customer_sk#34, c_current_addr_sk#35, c_first_name#36, c_last_name#37]
Arguments: [c_customer_sk#34 ASC NULLS FIRST], false, 0
(43) SortMergeJoin [codegen id : 12]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#34]
Join condition: None
(44) Project [codegen id : 12]
Output [7]: [ss_ticket_number#6, bought_city#30, amt#31, profit#32, c_current_addr_sk#35, c_first_name#36, c_last_name#37]
Input [9]: [ss_ticket_number#6, ss_customer_sk#2, bought_city#30, amt#31, profit#32, c_customer_sk#34, c_current_addr_sk#35, c_first_name#36, c_last_name#37]
(45) Exchange
Input [7]: [ss_ticket_number#6, bought_city#30, amt#31, profit#32, c_current_addr_sk#35, c_first_name#36, c_last_name#37]
Arguments: hashpartitioning(c_current_addr_sk#35, 5), true, [id=#39]
(46) Sort [codegen id : 13]
Input [7]: [ss_ticket_number#6, bought_city#30, amt#31, profit#32, c_current_addr_sk#35, c_first_name#36, c_last_name#37]
Arguments: [c_current_addr_sk#35 ASC NULLS FIRST], false, 0
(47) ReusedExchange [Reuses operator id: 30]
Output [2]: [ca_address_sk#21, ca_city#22]
(48) Sort [codegen id : 15]
Input [2]: [ca_address_sk#21, ca_city#22]
Arguments: [ca_address_sk#21 ASC NULLS FIRST], false, 0
(49) SortMergeJoin [codegen id : 16]
Left keys [1]: [c_current_addr_sk#35]
Right keys [1]: [ca_address_sk#21]
Join condition: NOT (ca_city#22 = bought_city#30)
(50) Project [codegen id : 16]
Output [7]: [c_last_name#37, c_first_name#36, ca_city#22, bought_city#30, ss_ticket_number#6, amt#31, profit#32]
Input [9]: [ss_ticket_number#6, bought_city#30, amt#31, profit#32, c_current_addr_sk#35, c_first_name#36, c_last_name#37, ca_address_sk#21, ca_city#22]
(51) TakeOrderedAndProject
Input [7]: [c_last_name#37, c_first_name#36, ca_city#22, bought_city#30, ss_ticket_number#6, amt#31, profit#32]
Arguments: 100, [c_last_name#37 ASC NULLS FIRST, c_first_name#36 ASC NULLS FIRST, ca_city#22 ASC NULLS FIRST, bought_city#30 ASC NULLS FIRST, ss_ticket_number#6 ASC NULLS FIRST], [c_last_name#37, c_first_name#36, ca_city#22, bought_city#30, ss_ticket_number#6, amt#31, profit#32]

@@ -0,0 +1,87 @@
TakeOrderedAndProject [amt,bought_city,c_first_name,c_last_name,ca_city,profit,ss_ticket_number]
  WholeStageCodegen (16)
    Project [amt,bought_city,c_first_name,c_last_name,ca_city,profit,ss_ticket_number]
      SortMergeJoin [bought_city,c_current_addr_sk,ca_address_sk,ca_city]
        InputAdapter
          WholeStageCodegen (13)
            Sort [c_current_addr_sk]
              InputAdapter
                Exchange [c_current_addr_sk] #1
                  WholeStageCodegen (12)
                    Project [amt,bought_city,c_current_addr_sk,c_first_name,c_last_name,profit,ss_ticket_number]
                      SortMergeJoin [c_customer_sk,ss_customer_sk]
                        InputAdapter
                          WholeStageCodegen (9)
                            Sort [ss_customer_sk]
                              InputAdapter
                                Exchange [ss_customer_sk] #2
                                  WholeStageCodegen (8)
                                    HashAggregate [ca_city,ss_addr_sk,ss_customer_sk,ss_ticket_number,sum,sum] [amt,bought_city,profit,sum,sum,sum(UnscaledValue(ss_coupon_amt)),sum(UnscaledValue(ss_net_profit))]
                                      HashAggregate [ca_city,ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number] [sum,sum,sum,sum]
                                        Project [ca_city,ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number]
                                          SortMergeJoin [ca_address_sk,ss_addr_sk]
                                            InputAdapter
                                              WholeStageCodegen (5)
                                                Sort [ss_addr_sk]
                                                  InputAdapter
                                                    Exchange [ss_addr_sk] #3
                                                      WholeStageCodegen (4)
                                                        Project [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number]
                                                          BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
                                                            Project [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_ticket_number]
                                                              BroadcastHashJoin [s_store_sk,ss_store_sk]
                                                                Project [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_store_sk,ss_ticket_number]
                                                                  BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
                                                                    Filter [ss_addr_sk,ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
                                                                      ColumnarToRow
                                                                        InputAdapter
                                                                          Scan parquet default.store_sales [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
                                                                    InputAdapter
                                                                      BroadcastExchange #4
                                                                        WholeStageCodegen (1)
                                                                          Project [d_date_sk]
                                                                            Filter [d_date_sk,d_dow,d_year]
                                                                              ColumnarToRow
                                                                                InputAdapter
                                                                                  Scan parquet default.date_dim [d_date_sk,d_dow,d_year]
                                                                InputAdapter
                                                                  BroadcastExchange #5
                                                                    WholeStageCodegen (2)
                                                                      Project [s_store_sk]
                                                                        Filter [s_city,s_store_sk]
                                                                          ColumnarToRow
                                                                            InputAdapter
                                                                              Scan parquet default.store [s_city,s_store_sk]
                                                            InputAdapter
                                                              BroadcastExchange #6
                                                                WholeStageCodegen (3)
                                                                  Project [hd_demo_sk]
                                                                    Filter [hd_demo_sk,hd_dep_count,hd_vehicle_count]
                                                                      ColumnarToRow
                                                                        InputAdapter
                                                                          Scan parquet default.household_demographics [hd_demo_sk,hd_dep_count,hd_vehicle_count]
                                            InputAdapter
                                              WholeStageCodegen (7)
                                                Sort [ca_address_sk]
                                                  InputAdapter
                                                    Exchange [ca_address_sk] #7
                                                      WholeStageCodegen (6)
                                                        Filter [ca_address_sk,ca_city]
                                                          ColumnarToRow
                                                            InputAdapter
                                                              Scan parquet default.customer_address [ca_address_sk,ca_city]
                        InputAdapter
                          WholeStageCodegen (11)
                            Sort [c_customer_sk]
                              InputAdapter
                                Exchange [c_customer_sk] #8
                                  WholeStageCodegen (10)
                                    Filter [c_current_addr_sk,c_customer_sk]
                                      ColumnarToRow
                                        InputAdapter
                                          Scan parquet default.customer [c_current_addr_sk,c_customer_sk,c_first_name,c_last_name]
        InputAdapter
          WholeStageCodegen (15)
            Sort [ca_address_sk]
              InputAdapter
                ReusedExchange [ca_address_sk,ca_city] #7

@@ -0,0 +1,241 @@
== Physical Plan ==
TakeOrderedAndProject (43)
+- * Project (42)
   +- * BroadcastHashJoin Inner BuildRight (41)
      :- * Project (39)
      :  +- * BroadcastHashJoin Inner BuildRight (38)
      :     :- * HashAggregate (33)
      :     :  +- Exchange (32)
      :     :     +- * HashAggregate (31)
      :     :        +- * Project (30)
      :     :           +- * BroadcastHashJoin Inner BuildRight (29)
      :     :              :- * Project (24)
      :     :              :  +- * BroadcastHashJoin Inner BuildRight (23)
      :     :              :     :- * Project (17)
      :     :              :     :  +- * BroadcastHashJoin Inner BuildRight (16)
      :     :              :     :     :- * Project (10)
      :     :              :     :     :  +- * BroadcastHashJoin Inner BuildRight (9)
      :     :              :     :     :     :- * Filter (3)
      :     :              :     :     :     :  +- * ColumnarToRow (2)
      :     :              :     :     :     :     +- Scan parquet default.store_sales (1)
      :     :              :     :     :     +- BroadcastExchange (8)
      :     :              :     :     :        +- * Project (7)
      :     :              :     :     :           +- * Filter (6)
      :     :              :     :     :              +- * ColumnarToRow (5)
      :     :              :     :     :                 +- Scan parquet default.date_dim (4)
      :     :              :     :     +- BroadcastExchange (15)
      :     :              :     :        +- * Project (14)
      :     :              :     :           +- * Filter (13)
      :     :              :     :              +- * ColumnarToRow (12)
      :     :              :     :                 +- Scan parquet default.store (11)
      :     :              :     +- BroadcastExchange (22)
      :     :              :        +- * Project (21)
      :     :              :           +- * Filter (20)
      :     :              :              +- * ColumnarToRow (19)
      :     :              :                 +- Scan parquet default.household_demographics (18)
      :     :              +- BroadcastExchange (28)
      :     :                 +- * Filter (27)
      :     :                    +- * ColumnarToRow (26)
      :     :                       +- Scan parquet default.customer_address (25)
      :     +- BroadcastExchange (37)
      :        +- * Filter (36)
      :           +- * ColumnarToRow (35)
      :              +- Scan parquet default.customer (34)
      +- ReusedExchange (40)
(1) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [In(ss_sold_date_sk, [2451790,2451609,2451294,2451658,2452099,2451482,2451700,2452035,2452274,2451258,2451847,2451714,2451937,2451860,2451601,2451573,2451686,2452008,2451454,2451882,2451832,2452259,2451671,2451903,2451497,2452162,2451322,2451517,2451434,2451273,2451405,2452105,2451924,2452050,2452126,2452203,2451818,2451559,2451853,2451238,2451209,2451357,2451959,2452239,2451608,2452141,2452252,2451623,2451867,2451504,2451910,2452232,2451874,2451581,2451329,2451223,2451783,2452267,2452042,2451895,2451986,2452091,2451693,2451265,2451678,2451825,2451244,2451490,2451287,2451419,2451546,2451245,2451713,2452070,2451189,2451804,2451468,2451525,2451902,2452077,2452161,2451378,2451567,2451931,2451699,2451251,2451840,2452253,2451938,2451510,2452231,2452036,2451616,2451230,2452112,2451846,2451966,2451538,2451819,2452140,2452183,2451496,2451791,2451595,2451574,2451363,2451994,2451917,2451602,2452273,2451237,2451350,2451685,2451259,2451286,2451972,2452224,2451370,2452245,2451643,2451993,2451315,2451301,2451560,2451433,2452225,2451532,2451755,2451854,2451545,2451210,2451587,2451987,2451447,2452197,2451552,2451896,2451679,2452147,2451735,2452022,2451707,2451868,2451398,2451777,2451181,2451503,2451839,2452175,2451441,2452154,2452029,2452196,2451952,2451805,2451965,2451539,2452001,2451833,2451392,2451524,2451461,2452133,2451448,2451307,2451615,2451769,2451412,2451349,2451651,2451763,2451203,2452064,2451980,2451748,2451637,2452182,2451279,2451231,2451734,2451692,2452071,2451336,2451300,2451727,2451630,2452189,2451875,2451973,2451328,2452084,2451399,2451944,2452204,2451385,2451776,2451384,2451272,2451812,2451749,2451566,2451182,2451945,2451420,2451930,2452057,2451756,2451644,2451314,2451364,2452007,2451798,2451475,2452015,2451440,2452000,2451588,2452148,2451195,2452217,2451371,2452176,2451531,2452134,2452211,2451462,2451188,2451741,2452119,2451342,2451580,2451672,2451889,2451280,2451406,2451293,2451217,2452049,2452106,2451321,2451335,2451483,2452260,2451657,2451979,2451518,2451629,2451728,2451923,2451861,2451951,2452246,2451455,2451356,2451224,2452210,2452021,2451427,2451202,2452098,2452168,2451553,2451391,2451706,2452155,2451196,2451770,2452127,2451762,2452078,2451958,2451721,2451665,2452120,2451252,2452085,2452092,2451476,2452218,2452169,2451797,2451650,2451881,2451511,2451469,2451888,2452043,2452266,2451664,2452014,2451343,2452056,2452190,2452063,2451636,2451742,2451811,2451720,2451308,2451489,2451413,2451216,2451594,2452238,2451784,2451426,2451622,2451916,2452113,2451909,2451266,2451826,2451377,2452028]), IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_addr_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_addr_sk:int,ss_store_sk:int,ss_ticket_number:int,ss_coupon_amt:decimal(7,2),ss_net_profit:decimal(7,2)>
(2) ColumnarToRow [codegen id : 5]
Input [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
(3) Filter [codegen id : 5]
Input [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Condition : (((((ss_sold_date_sk#1 INSET (2451790,2451609,2451294,2451658,2452099,2451482,2451700,2452035,2452274,2451258,2451847,2451714,2451937,2451860,2451601,2451573,2451686,2452008,2451454,2451882,2451832,2452259,2451671,2451903,2451497,2452162,2451322,2451517,2451434,2451273,2451405,2452105,2451924,2452050,2452126,2452203,2451818,2451559,2451853,2451238,2451209,2451357,2451959,2452239,2451608,2452141,2452252,2451623,2451867,2451504,2451910,2452232,2451874,2451581,2451329,2451223,2451783,2452267,2452042,2451895,2451986,2452091,2451693,2451265,2451678,2451825,2451244,2451490,2451287,2451419,2451546,2451245,2451713,2452070,2451189,2451804,2451468,2451525,2451902,2452077,2452161,2451378,2451567,2451931,2451699,2451251,2451840,2452253,2451938,2451510,2452231,2452036,2451616,2451230,2452112,2451846,2451966,2451538,2451819,2452140,2452183,2451496,2451791,2451595,2451574,2451363,2451994,2451917,2451602,2452273,2451237,2451350,2451685,2451259,2451286,2451972,2452224,2451370,2452245,2451643,2451993,2451315,2451301,2451560,2451433,2452225,2451532,2451755,2451854,2451545,2451210,2451587,2451987,2451447,2452197,2451552,2451896,2451679,2452147,2451735,2452022,2451707,2451868,2451398,2451777,2451181,2451503,2451839,2452175,2451441,2452154,2452029,2452196,2451952,2451805,2451965,2451539,2452001,2451833,2451392,2451524,2451461,2452133,2451448,2451307,2451615,2451769,2451412,2451349,2451651,2451763,2451203,2452064,2451980,2451748,2451637,2452182,2451279,2451231,2451734,2451692,2452071,2451336,2451300,2451727,2451630,2452189,2451875,2451973,2451328,2452084,2451399,2451944,2452204,2451385,2451776,2451384,2451272,2451812,2451749,2451566,2451182,2451945,2451420,2451930,2452057,2451756,2451644,2451314,2451364,2452007,2451798,2451475,2452015,2451440,2452000,2451588,2452148,2451195,2452217,2451371,2452176,2451531,2452134,2452211,2451462,2451188,2451741,2452119,2451342,2451580,2451672,2451889,2451280,2451406,2451293,2451217,2452049,2452106,2451321,2451335,2451483,2452260,2451657,2451979,2451518,2451629,2451728,2451923,2451861,2451951,2452246,2451455,2451356,2451224,2452210,2452021,2451427,2451202,2452098,2452168,2451553,2451391,2451706,2452155,2451196,2451770,2452127,2451762,2452078,2451958,2451721,2451665,2452120,2451252,2452085,2452092,2451476,2452218,2452169,2451797,2451650,2451881,2451511,2451469,2451888,2452043,2452266,2451664,2452014,2451343,2452056,2452190,2452063,2451636,2451742,2451811,2451720,2451308,2451489,2451413,2451216,2451594,2452238,2451784,2451426,2451622,2451916,2452113,2451909,2451266,2451826,2451377,2452028) AND isnotnull(ss_sold_date_sk#1)) AND isnotnull(ss_store_sk#5)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_addr_sk#4)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#9, d_year#10, d_dow#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [In(d_dow, [6,0]), In(d_year, [1999,2000,2001]), In(d_date_sk, [2451790,2451609,2451294,2451658,2452099,2451482,2451700,2452035,2452274,2451258,2451847,2451714,2451937,2451860,2451601,2451573,2451686,2452008,2451454,2451882,2451832,2452259,2451671,2451903,2451497,2452162,2451322,2451517,2451434,2451273,2451405,2452105,2451924,2452050,2452126,2452203,2451818,2451559,2451853,2451238,2451209,2451357,2451959,2452239,2451608,2452141,2452252,2451623,2451867,2451504,2451910,2452232,2451874,2451581,2451329,2451223,2451783,2452267,2452042,2451895,2451986,2452091,2451693,2451265,2451678,2451825,2451244,2451490,2451287,2451419,2451546,2451245,2451713,2452070,2451189,2451804,2451468,2451525,2451902,2452077,2452161,2451378,2451567,2451931,2451699,2451251,2451840,2452253,2451938,2451510,2452231,2452036,2451616,2451230,2452112,2451846,2451966,2451538,2451819,2452140,2452183,2451496,2451791,2451595,2451574,2451363,2451994,2451917,2451602,2452273,2451237,2451350,2451685,2451259,2451286,2451972,2452224,2451370,2452245,2451643,2451993,2451315,2451301,2451560,2451433,2452225,2451532,2451755,2451854,2451545,2451210,2451587,2451987,2451447,2452197,2451552,2451896,2451679,2452147,2451735,2452022,2451707,2451868,2451398,2451777,2451181,2451503,2451839,2452175,2451441,2452154,2452029,2452196,2451952,2451805,2451965,2451539,2452001,2451833,2451392,2451524,2451461,2452133,2451448,2451307,2451615,2451769,2451412,2451349,2451651,2451763,2451203,2452064,2451980,2451748,2451637,2452182,2451279,2451231,2451734,2451692,2452071,2451336,2451300,2451727,2451630,2452189,2451875,2451973,2451328,2452084,2451399,2451944,2452204,2451385,2451776,2451384,2451272,2451812,2451749,2451566,2451182,2451945,2451420,2451930,2452057,2451756,2451644,2451314,2451364,2452007,2451798,2451475,2452015,2451440,2452000,2451588,2452148,2451195,2452217,2451371,2452176,2451531,2452134,2452211,2451462,2451188,2451741,2452119,2451342,2451580,2451672,2451889,2451280,2451406,2451293,2451217,2452049,2452106,2451321,2451335,2451483,2452260,2451657,2451979,2451518,2451629,2451728,2451923,2451861,2451951,2452246,2451455,2451356,2451224,2452210,2452021,2451427,2451202,2452098,2452168,2451553,2451391,2451706,2452155,2451196,2451770,2452127,2451762,2452078,2451958,2451721,2451665,2452120,2451252,2452085,2452092,2451476,2452218,2452169,2451797,2451650,2451881,2451511,2451469,2451888,2452043,2452266,2451664,2452014,2451343,2452056,2452190,2452063,2451636,2451742,2451811,2451720,2451308,2451489,2451413,2451216,2451594,2452238,2451784,2451426,2451622,2451916,2452113,2451909,2451266,2451826,2451377,2452028]), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dow:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
Condition : (((d_dow#11 IN (6,0) AND d_year#10 IN (1999,2000,2001)) AND d_date_sk#9 INSET (2451790,2451609,2451294,2451658,2452099,2451482,2451700,2452035,2452274,2451258,2451847,2451714,2451937,2451860,2451601,2451573,2451686,2452008,2451454,2451882,2451832,2452259,2451671,2451903,2451497,2452162,2451322,2451517,2451434,2451273,2451405,2452105,2451924,2452050,2452126,2452203,2451818,2451559,2451853,2451238,2451209,2451357,2451959,2452239,2451608,2452141,2452252,2451623,2451867,2451504,2451910,2452232,2451874,2451581,2451329,2451223,2451783,2452267,2452042,2451895,2451986,2452091,2451693,2451265,2451678,2451825,2451244,2451490,2451287,2451419,2451546,2451245,2451713,2452070,2451189,2451804,2451468,2451525,2451902,2452077,2452161,2451378,2451567,2451931,2451699,2451251,2451840,2452253,2451938,2451510,2452231,2452036,2451616,2451230,2452112,2451846,2451966,2451538,2451819,2452140,2452183,2451496,2451791,2451595,2451574,2451363,2451994,2451917,2451602,2452273,2451237,2451350,2451685,2451259,2451286,2451972,2452224,2451370,2452245,2451643,2451993,2451315,2451301,2451560,2451433,2452225,2451532,2451755,2451854,2451545,2451210,2451587,2451987,2451447,2452197,2451552,2451896,2451679,2452147,2451735,2452022,2451707,2451868,2451398,2451777,2451181,2451503,2451839,2452175,2451441,2452154,2452029,2452196,2451952,2451805,2451965,2451539,2452001,2451833,2451392,2451524,2451461,2452133,2451448,2451307,2451615,2451769,2451412,2451349,2451651,2451763,2451203,2452064,2451980,2451748,2451637,2452182,2451279,2451231,2451734,2451692,2452071,2451336,2451300,2451727,2451630,2452189,2451875,2451973,2451328,2452084,2451399,2451944,2452204,2451385,2451776,2451384,2451272,2451812,2451749,2451566,2451182,2451945,2451420,2451930,2452057,2451756,2451644,2451314,2451364,2452007,2451798,2451475,2452015,2451440,2452000,2451588,2452148,2451195,2452217,2451371,2452176,2451531,2452134,2452211,2451462,2451188,2451741,2452119,2451342,2451580,2451672,2451889,2451280,2451406,2451293,2451217,2452049,2452106,2451321,2451335,2451483,2452260,2451657,2451979,2451518,2451629,2451728,2451923,2451861,2451951,2452246,2451455,2451356,2451224,2452210,2452021,2451427,2451202,2452098,2452168,2451553,2451391,2451706,2452155,2451196,2451770,2452127,2451762,2452078,2451958,2451721,2451665,2452120,2451252,2452085,2452092,2451476,2452218,2452169,2451797,2451650,2451881,2451511,2451469,2451888,2452043,2452266,2451664,2452014,2451343,2452056,2452190,2452063,2451636,2451742,2451811,2451720,2451308,2451489,2451413,2451216,2451594,2452238,2451784,2451426,2451622,2451916,2452113,2451909,2451266,2451826,2451377,2452028)) AND isnotnull(d_date_sk#9))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#9]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
(8) BroadcastExchange
Input [1]: [d_date_sk#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(9) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#9]
Join condition: None
(10) Project [codegen id : 5]
Output [7]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Input [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, d_date_sk#9]
(11) Scan parquet default.store
Output [2]: [s_store_sk#13, s_city#14]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [In(s_city, [Midway,Concord,Spring Hill,Brownsville,Greenville]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_city:string>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [s_store_sk#13, s_city#14]
(13) Filter [codegen id : 2]
Input [2]: [s_store_sk#13, s_city#14]
Condition : (s_city#14 IN (Midway,Concord,Spring Hill,Brownsville,Greenville) AND isnotnull(s_store_sk#13))
(14) Project [codegen id : 2]
Output [1]: [s_store_sk#13]
Input [2]: [s_store_sk#13, s_city#14]
(15) BroadcastExchange
Input [1]: [s_store_sk#13]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#15]
(16) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_store_sk#5]
Right keys [1]: [s_store_sk#13]
Join condition: None
(17) Project [codegen id : 5]
Output [6]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Input [8]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_store_sk#13]
(18) Scan parquet default.household_demographics
Output [3]: [hd_demo_sk#16, hd_dep_count#17, hd_vehicle_count#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/household_demographics]
PushedFilters: [Or(EqualTo(hd_dep_count,5),EqualTo(hd_vehicle_count,3)), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_dep_count:int,hd_vehicle_count:int>
(19) ColumnarToRow [codegen id : 3]
Input [3]: [hd_demo_sk#16, hd_dep_count#17, hd_vehicle_count#18]
(20) Filter [codegen id : 3]
Input [3]: [hd_demo_sk#16, hd_dep_count#17, hd_vehicle_count#18]
Condition : (((hd_dep_count#17 = 5) OR (hd_vehicle_count#18 = 3)) AND isnotnull(hd_demo_sk#16))
(21) Project [codegen id : 3]
Output [1]: [hd_demo_sk#16]
Input [3]: [hd_demo_sk#16, hd_dep_count#17, hd_vehicle_count#18]
(22) BroadcastExchange
Input [1]: [hd_demo_sk#16]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#19]
(23) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#16]
Join condition: None
(24) Project [codegen id : 5]
Output [5]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Input [7]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, hd_demo_sk#16]
(25) Scan parquet default.customer_address
Output [2]: [ca_address_sk#20, ca_city#21]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer_address]
PushedFilters: [IsNotNull(ca_address_sk), IsNotNull(ca_city)]
ReadSchema: struct<ca_address_sk:int,ca_city:string>
(26) ColumnarToRow [codegen id : 4]
Input [2]: [ca_address_sk#20, ca_city#21]
(27) Filter [codegen id : 4]
Input [2]: [ca_address_sk#20, ca_city#21]
Condition : (isnotnull(ca_address_sk#20) AND isnotnull(ca_city#21))
(28) BroadcastExchange
Input [2]: [ca_address_sk#20, ca_city#21]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#22]
(29) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_addr_sk#4]
Right keys [1]: [ca_address_sk#20]
Join condition: None
(30) Project [codegen id : 5]
Output [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, ca_city#21]
Input [7]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, ca_address_sk#20, ca_city#21]
(31) HashAggregate [codegen id : 5]
Input [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, ca_city#21]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#21]
Functions [2]: [partial_sum(UnscaledValue(ss_coupon_amt#7)), partial_sum(UnscaledValue(ss_net_profit#8))]
Aggregate Attributes [2]: [sum#23, sum#24]
Results [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#21, sum#25, sum#26]
(32) Exchange
Input [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#21, sum#25, sum#26]
Arguments: hashpartitioning(ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#21, 5), true, [id=#27]
(33) HashAggregate [codegen id : 8]
Input [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#21, sum#25, sum#26]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#21]
Functions [2]: [sum(UnscaledValue(ss_coupon_amt#7)), sum(UnscaledValue(ss_net_profit#8))]
Aggregate Attributes [2]: [sum(UnscaledValue(ss_coupon_amt#7))#28, sum(UnscaledValue(ss_net_profit#8))#29]
Results [5]: [ss_ticket_number#6, ss_customer_sk#2, ca_city#21 AS bought_city#30, MakeDecimal(sum(UnscaledValue(ss_coupon_amt#7))#28,17,2) AS amt#31, MakeDecimal(sum(UnscaledValue(ss_net_profit#8))#29,17,2) AS profit#32]
(34) Scan parquet default.customer
Output [4]: [c_customer_sk#33, c_current_addr_sk#34, c_first_name#35, c_last_name#36]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_current_addr_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_addr_sk:int,c_first_name:string,c_last_name:string>
(35) ColumnarToRow [codegen id : 6]
Input [4]: [c_customer_sk#33, c_current_addr_sk#34, c_first_name#35, c_last_name#36]
(36) Filter [codegen id : 6]
Input [4]: [c_customer_sk#33, c_current_addr_sk#34, c_first_name#35, c_last_name#36]
Condition : (isnotnull(c_customer_sk#33) AND isnotnull(c_current_addr_sk#34))
(37) BroadcastExchange
Input [4]: [c_customer_sk#33, c_current_addr_sk#34, c_first_name#35, c_last_name#36]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#37]
(38) BroadcastHashJoin [codegen id : 8]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#33]
Join condition: None
(39) Project [codegen id : 8]
Output [7]: [ss_ticket_number#6, bought_city#30, amt#31, profit#32, c_current_addr_sk#34, c_first_name#35, c_last_name#36]
Input [9]: [ss_ticket_number#6, ss_customer_sk#2, bought_city#30, amt#31, profit#32, c_customer_sk#33, c_current_addr_sk#34, c_first_name#35, c_last_name#36]
(40) ReusedExchange [Reuses operator id: 28]
Output [2]: [ca_address_sk#20, ca_city#21]
(41) BroadcastHashJoin [codegen id : 8]
Left keys [1]: [c_current_addr_sk#34]
Right keys [1]: [ca_address_sk#20]
Join condition: NOT (ca_city#21 = bought_city#30)
(42) Project [codegen id : 8]
Output [7]: [c_last_name#36, c_first_name#35, ca_city#21, bought_city#30, ss_ticket_number#6, amt#31, profit#32]
Input [9]: [ss_ticket_number#6, bought_city#30, amt#31, profit#32, c_current_addr_sk#34, c_first_name#35, c_last_name#36, ca_address_sk#20, ca_city#21]
(43) TakeOrderedAndProject
Input [7]: [c_last_name#36, c_first_name#35, ca_city#21, bought_city#30, ss_ticket_number#6, amt#31, profit#32]
Arguments: 100, [c_last_name#36 ASC NULLS FIRST, c_first_name#35 ASC NULLS FIRST, ca_city#21 ASC NULLS FIRST, bought_city#30 ASC NULLS FIRST, ss_ticket_number#6 ASC NULLS FIRST], [c_last_name#36, c_first_name#35, ca_city#21, bought_city#30, ss_ticket_number#6, amt#31, profit#32]

@@ -0,0 +1,63 @@
TakeOrderedAndProject [amt,bought_city,c_first_name,c_last_name,ca_city,profit,ss_ticket_number]
  WholeStageCodegen (8)
    Project [amt,bought_city,c_first_name,c_last_name,ca_city,profit,ss_ticket_number]
      BroadcastHashJoin [bought_city,c_current_addr_sk,ca_address_sk,ca_city]
        Project [amt,bought_city,c_current_addr_sk,c_first_name,c_last_name,profit,ss_ticket_number]
          BroadcastHashJoin [c_customer_sk,ss_customer_sk]
            HashAggregate [ca_city,ss_addr_sk,ss_customer_sk,ss_ticket_number,sum,sum] [amt,bought_city,profit,sum,sum,sum(UnscaledValue(ss_coupon_amt)),sum(UnscaledValue(ss_net_profit))]
              InputAdapter
                Exchange [ca_city,ss_addr_sk,ss_customer_sk,ss_ticket_number] #1
                  WholeStageCodegen (5)
                    HashAggregate [ca_city,ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number] [sum,sum,sum,sum]
                      Project [ca_city,ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number]
                        BroadcastHashJoin [ca_address_sk,ss_addr_sk]
                          Project [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number]
                            BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
                              Project [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_ticket_number]
                                BroadcastHashJoin [s_store_sk,ss_store_sk]
                                  Project [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_store_sk,ss_ticket_number]
                                    BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
                                      Filter [ss_addr_sk,ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
                                        ColumnarToRow
                                          InputAdapter
                                            Scan parquet default.store_sales [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
                                      InputAdapter
                                        BroadcastExchange #2
                                          WholeStageCodegen (1)
                                            Project [d_date_sk]
                                              Filter [d_date_sk,d_dow,d_year]
                                                ColumnarToRow
                                                  InputAdapter
                                                    Scan parquet default.date_dim [d_date_sk,d_dow,d_year]
                                  InputAdapter
                                    BroadcastExchange #3
                                      WholeStageCodegen (2)
                                        Project [s_store_sk]
                                          Filter [s_city,s_store_sk]
                                            ColumnarToRow
                                              InputAdapter
                                                Scan parquet default.store [s_city,s_store_sk]
                              InputAdapter
                                BroadcastExchange #4
                                  WholeStageCodegen (3)
                                    Project [hd_demo_sk]
                                      Filter [hd_demo_sk,hd_dep_count,hd_vehicle_count]
                                        ColumnarToRow
                                          InputAdapter
                                            Scan parquet default.household_demographics [hd_demo_sk,hd_dep_count,hd_vehicle_count]
                          InputAdapter
                            BroadcastExchange #5
                              WholeStageCodegen (4)
                                Filter [ca_address_sk,ca_city]
                                  ColumnarToRow
                                    InputAdapter
                                      Scan parquet default.customer_address [ca_address_sk,ca_city]
            InputAdapter
              BroadcastExchange #6
                WholeStageCodegen (6)
                  Filter [c_current_addr_sk,c_customer_sk]
                    ColumnarToRow
                      InputAdapter
                        Scan parquet default.customer [c_current_addr_sk,c_customer_sk,c_first_name,c_last_name]
        InputAdapter
          ReusedExchange [ca_address_sk,ca_city] #5

@@ -0,0 +1,122 @@
== Physical Plan ==
TakeOrderedAndProject (21)
+- * HashAggregate (20)
   +- Exchange (19)
      +- * HashAggregate (18)
         +- * Project (17)
            +- * BroadcastHashJoin Inner BuildRight (16)
               :- * Project (10)
               :  +- * BroadcastHashJoin Inner BuildLeft (9)
               :     :- BroadcastExchange (5)
               :     :  +- * Project (4)
               :     :     +- * Filter (3)
               :     :        +- * ColumnarToRow (2)
               :     :           +- Scan parquet default.date_dim (1)
               :     +- * Filter (8)
               :        +- * ColumnarToRow (7)
               :           +- Scan parquet default.store_sales (6)
               +- BroadcastExchange (15)
                  +- * Project (14)
                     +- * Filter (13)
                        +- * ColumnarToRow (12)
                           +- Scan parquet default.item (11)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_moy#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_moy,12), EqualTo(d_year,1998), GreaterThanOrEqual(d_date_sk,2451149), IsNotNull(d_date_sk), LessThanOrEqual(d_date_sk,2451179)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(2) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(3) Filter [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
Condition : ((((((isnotnull(d_moy#3) AND isnotnull(d_year#2)) AND (d_moy#3 = 12)) AND (d_year#2 = 1998)) AND (d_date_sk#1 >= 2451149)) AND isnotnull(d_date_sk#1)) AND (d_date_sk#1 <= 2451179))
(4) Project [codegen id : 1]
Output [2]: [d_date_sk#1, d_year#2]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(5) BroadcastExchange
Input [2]: [d_date_sk#1, d_year#2]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#4]
(6) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451149), LessThanOrEqual(ss_sold_date_sk,2451179), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
(7) ColumnarToRow
Input [3]: [ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
(8) Filter
Input [3]: [ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
Condition : (((isnotnull(ss_sold_date_sk#5) AND (ss_sold_date_sk#5 >= 2451149)) AND (ss_sold_date_sk#5 <= 2451179)) AND isnotnull(ss_item_sk#6))
(9) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#5]
Join condition: None
(10) Project [codegen id : 3]
Output [3]: [d_year#2, ss_item_sk#6, ss_ext_sales_price#7]
Input [5]: [d_date_sk#1, d_year#2, ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
(11) Scan parquet default.item
Output [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [IsNotNull(i_manager_id), EqualTo(i_manager_id,1), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand_id:int,i_brand:string,i_manager_id:int>
(12) ColumnarToRow [codegen id : 2]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
(13) Filter [codegen id : 2]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
Condition : ((isnotnull(i_manager_id#11) AND (i_manager_id#11 = 1)) AND isnotnull(i_item_sk#8))
(14) Project [codegen id : 2]
Output [3]: [i_item_sk#8, i_brand_id#9, i_brand#10]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
(15) BroadcastExchange
Input [3]: [i_item_sk#8, i_brand_id#9, i_brand#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_item_sk#6]
Right keys [1]: [i_item_sk#8]
Join condition: None
(17) Project [codegen id : 3]
Output [4]: [d_year#2, ss_ext_sales_price#7, i_brand_id#9, i_brand#10]
Input [6]: [d_year#2, ss_item_sk#6, ss_ext_sales_price#7, i_item_sk#8, i_brand_id#9, i_brand#10]
(18) HashAggregate [codegen id : 3]
Input [4]: [d_year#2, ss_ext_sales_price#7, i_brand_id#9, i_brand#10]
Keys [3]: [d_year#2, i_brand#10, i_brand_id#9]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#7))]
Aggregate Attributes [1]: [sum#13]
Results [4]: [d_year#2, i_brand#10, i_brand_id#9, sum#14]
(19) Exchange
Input [4]: [d_year#2, i_brand#10, i_brand_id#9, sum#14]
Arguments: hashpartitioning(d_year#2, i_brand#10, i_brand_id#9, 5), true, [id=#15]
(20) HashAggregate [codegen id : 4]
Input [4]: [d_year#2, i_brand#10, i_brand_id#9, sum#14]
Keys [3]: [d_year#2, i_brand#10, i_brand_id#9]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#7))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#7))#16]
Results [4]: [d_year#2, i_brand_id#9 AS brand_id#17, i_brand#10 AS brand#18, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#7))#16,17,2) AS ext_price#19]
(21) TakeOrderedAndProject
Input [4]: [d_year#2, brand_id#17, brand#18, ext_price#19]
Arguments: 100, [d_year#2 ASC NULLS FIRST, ext_price#19 DESC NULLS LAST, brand_id#17 ASC NULLS FIRST], [d_year#2, brand_id#17, brand#18, ext_price#19]

@ -0,0 +1,31 @@
TakeOrderedAndProject [brand,brand_id,d_year,ext_price]
WholeStageCodegen (4)
HashAggregate [d_year,i_brand,i_brand_id,sum] [brand,brand_id,ext_price,sum,sum(UnscaledValue(ss_ext_sales_price))]
InputAdapter
Exchange [d_year,i_brand,i_brand_id] #1
WholeStageCodegen (3)
HashAggregate [d_year,i_brand,i_brand_id,ss_ext_sales_price] [sum,sum]
Project [d_year,i_brand,i_brand_id,ss_ext_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [d_year,ss_ext_sales_price,ss_item_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [d_date_sk,d_year]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
Filter [ss_item_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_ext_sales_price,ss_item_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [i_brand,i_brand_id,i_item_sk]
Filter [i_item_sk,i_manager_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_brand_id,i_item_sk,i_manager_id]

@ -0,0 +1,122 @@
== Physical Plan ==
TakeOrderedAndProject (21)
+- * HashAggregate (20)
+- Exchange (19)
+- * HashAggregate (18)
+- * Project (17)
+- * BroadcastHashJoin Inner BuildRight (16)
:- * Project (10)
: +- * BroadcastHashJoin Inner BuildRight (9)
: :- * Project (4)
: : +- * Filter (3)
: : +- * ColumnarToRow (2)
: : +- Scan parquet default.date_dim (1)
: +- BroadcastExchange (8)
: +- * Filter (7)
: +- * ColumnarToRow (6)
: +- Scan parquet default.store_sales (5)
+- BroadcastExchange (15)
+- * Project (14)
+- * Filter (13)
+- * ColumnarToRow (12)
+- Scan parquet default.item (11)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_moy#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_moy,12), EqualTo(d_year,1998), LessThanOrEqual(d_date_sk,2451179), GreaterThanOrEqual(d_date_sk,2451149), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(2) ColumnarToRow [codegen id : 3]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(3) Filter [codegen id : 3]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
Condition : ((((((isnotnull(d_moy#3) AND isnotnull(d_year#2)) AND (d_moy#3 = 12)) AND (d_year#2 = 1998)) AND (d_date_sk#1 <= 2451179)) AND (d_date_sk#1 >= 2451149)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 3]
Output [2]: [d_date_sk#1, d_year#2]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(5) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451149), LessThanOrEqual(ss_sold_date_sk,2451179), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
(6) ColumnarToRow [codegen id : 1]
Input [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
(7) Filter [codegen id : 1]
Input [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
Condition : (((isnotnull(ss_sold_date_sk#4) AND (ss_sold_date_sk#4 >= 2451149)) AND (ss_sold_date_sk#4 <= 2451179)) AND isnotnull(ss_item_sk#5))
(8) BroadcastExchange
Input [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7]
(9) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#4]
Join condition: None
(10) Project [codegen id : 3]
Output [3]: [d_year#2, ss_item_sk#5, ss_ext_sales_price#6]
Input [5]: [d_date_sk#1, d_year#2, ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
(11) Scan parquet default.item
Output [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [IsNotNull(i_manager_id), EqualTo(i_manager_id,1), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand_id:int,i_brand:string,i_manager_id:int>
(12) ColumnarToRow [codegen id : 2]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
(13) Filter [codegen id : 2]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
Condition : ((isnotnull(i_manager_id#11) AND (i_manager_id#11 = 1)) AND isnotnull(i_item_sk#8))
(14) Project [codegen id : 2]
Output [3]: [i_item_sk#8, i_brand_id#9, i_brand#10]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
(15) BroadcastExchange
Input [3]: [i_item_sk#8, i_brand_id#9, i_brand#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_item_sk#5]
Right keys [1]: [i_item_sk#8]
Join condition: None
(17) Project [codegen id : 3]
Output [4]: [d_year#2, ss_ext_sales_price#6, i_brand_id#9, i_brand#10]
Input [6]: [d_year#2, ss_item_sk#5, ss_ext_sales_price#6, i_item_sk#8, i_brand_id#9, i_brand#10]
(18) HashAggregate [codegen id : 3]
Input [4]: [d_year#2, ss_ext_sales_price#6, i_brand_id#9, i_brand#10]
Keys [3]: [d_year#2, i_brand#10, i_brand_id#9]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#6))]
Aggregate Attributes [1]: [sum#13]
Results [4]: [d_year#2, i_brand#10, i_brand_id#9, sum#14]
(19) Exchange
Input [4]: [d_year#2, i_brand#10, i_brand_id#9, sum#14]
Arguments: hashpartitioning(d_year#2, i_brand#10, i_brand_id#9, 5), true, [id=#15]
(20) HashAggregate [codegen id : 4]
Input [4]: [d_year#2, i_brand#10, i_brand_id#9, sum#14]
Keys [3]: [d_year#2, i_brand#10, i_brand_id#9]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#6))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#6))#16]
Results [4]: [d_year#2, i_brand_id#9 AS brand_id#17, i_brand#10 AS brand#18, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#6))#16,17,2) AS ext_price#19]
(21) TakeOrderedAndProject
Input [4]: [d_year#2, brand_id#17, brand#18, ext_price#19]
Arguments: 100, [d_year#2 ASC NULLS FIRST, ext_price#19 DESC NULLS LAST, brand_id#17 ASC NULLS FIRST], [d_year#2, brand_id#17, brand#18, ext_price#19]

@ -0,0 +1,31 @@
TakeOrderedAndProject [brand,brand_id,d_year,ext_price]
WholeStageCodegen (4)
HashAggregate [d_year,i_brand,i_brand_id,sum] [brand,brand_id,ext_price,sum,sum(UnscaledValue(ss_ext_sales_price))]
InputAdapter
Exchange [d_year,i_brand,i_brand_id] #1
WholeStageCodegen (3)
HashAggregate [d_year,i_brand,i_brand_id,ss_ext_sales_price] [sum,sum]
Project [d_year,i_brand,i_brand_id,ss_ext_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [d_year,ss_ext_sales_price,ss_item_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [d_date_sk,d_year]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Filter [ss_item_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_ext_sales_price,ss_item_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [i_brand,i_brand_id,i_item_sk]
Filter [i_item_sk,i_manager_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_brand_id,i_item_sk,i_manager_id]

@ -0,0 +1,180 @@
== Physical Plan ==
TakeOrderedAndProject (32)
+- * Project (31)
+- * Filter (30)
+- Window (29)
+- * Sort (28)
+- Exchange (27)
+- * HashAggregate (26)
+- Exchange (25)
+- * HashAggregate (24)
+- * Project (23)
+- * BroadcastHashJoin Inner BuildRight (22)
:- * Project (16)
: +- * BroadcastHashJoin Inner BuildRight (15)
: :- * Project (10)
: : +- * BroadcastHashJoin Inner BuildLeft (9)
: : :- BroadcastExchange (5)
: : : +- * Project (4)
: : : +- * Filter (3)
: : : +- * ColumnarToRow (2)
: : : +- Scan parquet default.item (1)
: : +- * Filter (8)
: : +- * ColumnarToRow (7)
: : +- Scan parquet default.store_sales (6)
: +- BroadcastExchange (14)
: +- * Filter (13)
: +- * ColumnarToRow (12)
: +- Scan parquet default.store (11)
+- BroadcastExchange (21)
+- * Project (20)
+- * Filter (19)
+- * ColumnarToRow (18)
+- Scan parquet default.date_dim (17)
(1) Scan parquet default.item
Output [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manufact_id#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [Or(And(And(In(i_category, [Books,Children,Electronics]),In(i_class, [personal,portable,reference,self-help])),In(i_brand, [scholaramalgamalg #6,scholaramalgamalg #7,exportiunivamalg #8,scholaramalgamalg #8])),And(And(In(i_category, [Women,Music,Men]),In(i_class, [accessories,classical,fragrances,pants])),In(i_brand, [amalgimporto #9,edu packscholar #9,exportiimporto #9,importoamalg #9]))), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand:string,i_class:string,i_category:string,i_manufact_id:int>
(2) ColumnarToRow [codegen id : 1]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manufact_id#5]
(3) Filter [codegen id : 1]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manufact_id#5]
Condition : ((((i_category#4 IN (Books,Children,Electronics) AND i_class#3 IN (personal,portable,reference,self-help)) AND i_brand#2 IN (scholaramalgamalg #6,scholaramalgamalg #7,exportiunivamalg #8,scholaramalgamalg #8)) OR ((i_category#4 IN (Women,Music,Men) AND i_class#3 IN (accessories,classical,fragrances,pants)) AND i_brand#2 IN (amalgimporto #9,edu packscholar #9,exportiimporto #9,importoamalg #9))) AND isnotnull(i_item_sk#1))
(4) Project [codegen id : 1]
Output [2]: [i_item_sk#1, i_manufact_id#5]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manufact_id#5]
(5) BroadcastExchange
Input [2]: [i_item_sk#1, i_manufact_id#5]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#10]
(6) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#11, ss_item_sk#12, ss_store_sk#13, ss_sales_price#14]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451911), LessThanOrEqual(ss_sold_date_sk,2452275), IsNotNull(ss_item_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(7) ColumnarToRow
Input [4]: [ss_sold_date_sk#11, ss_item_sk#12, ss_store_sk#13, ss_sales_price#14]
(8) Filter
Input [4]: [ss_sold_date_sk#11, ss_item_sk#12, ss_store_sk#13, ss_sales_price#14]
Condition : ((((isnotnull(ss_sold_date_sk#11) AND (ss_sold_date_sk#11 >= 2451911)) AND (ss_sold_date_sk#11 <= 2452275)) AND isnotnull(ss_item_sk#12)) AND isnotnull(ss_store_sk#13))
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [i_item_sk#1]
Right keys [1]: [ss_item_sk#12]
Join condition: None
(10) Project [codegen id : 4]
Output [4]: [i_manufact_id#5, ss_sold_date_sk#11, ss_store_sk#13, ss_sales_price#14]
Input [6]: [i_item_sk#1, i_manufact_id#5, ss_sold_date_sk#11, ss_item_sk#12, ss_store_sk#13, ss_sales_price#14]
(11) Scan parquet default.store
Output [1]: [s_store_sk#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int>
(12) ColumnarToRow [codegen id : 2]
Input [1]: [s_store_sk#15]
(13) Filter [codegen id : 2]
Input [1]: [s_store_sk#15]
Condition : isnotnull(s_store_sk#15)
(14) BroadcastExchange
Input [1]: [s_store_sk#15]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#16]
(15) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#13]
Right keys [1]: [s_store_sk#15]
Join condition: None
(16) Project [codegen id : 4]
Output [3]: [i_manufact_id#5, ss_sold_date_sk#11, ss_sales_price#14]
Input [5]: [i_manufact_id#5, ss_sold_date_sk#11, ss_store_sk#13, ss_sales_price#14, s_store_sk#15]
(17) Scan parquet default.date_dim
Output [3]: [d_date_sk#17, d_month_seq#18, d_qoy#19]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [In(d_month_seq, [1222,1215,1223,1217,1214,1219,1213,1218,1220,1221,1216,1212]), LessThanOrEqual(d_date_sk,2452275), GreaterThanOrEqual(d_date_sk,2451911), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_month_seq:int,d_qoy:int>
(18) ColumnarToRow [codegen id : 3]
Input [3]: [d_date_sk#17, d_month_seq#18, d_qoy#19]
(19) Filter [codegen id : 3]
Input [3]: [d_date_sk#17, d_month_seq#18, d_qoy#19]
Condition : (((d_month_seq#18 INSET (1222,1215,1223,1217,1214,1219,1213,1218,1220,1221,1216,1212) AND (d_date_sk#17 <= 2452275)) AND (d_date_sk#17 >= 2451911)) AND isnotnull(d_date_sk#17))
(20) Project [codegen id : 3]
Output [2]: [d_date_sk#17, d_qoy#19]
Input [3]: [d_date_sk#17, d_month_seq#18, d_qoy#19]
(21) BroadcastExchange
Input [2]: [d_date_sk#17, d_qoy#19]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#20]
(22) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#11]
Right keys [1]: [d_date_sk#17]
Join condition: None
(23) Project [codegen id : 4]
Output [3]: [i_manufact_id#5, ss_sales_price#14, d_qoy#19]
Input [5]: [i_manufact_id#5, ss_sold_date_sk#11, ss_sales_price#14, d_date_sk#17, d_qoy#19]
(24) HashAggregate [codegen id : 4]
Input [3]: [i_manufact_id#5, ss_sales_price#14, d_qoy#19]
Keys [2]: [i_manufact_id#5, d_qoy#19]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#14))]
Aggregate Attributes [1]: [sum#21]
Results [3]: [i_manufact_id#5, d_qoy#19, sum#22]
(25) Exchange
Input [3]: [i_manufact_id#5, d_qoy#19, sum#22]
Arguments: hashpartitioning(i_manufact_id#5, d_qoy#19, 5), true, [id=#23]
(26) HashAggregate [codegen id : 5]
Input [3]: [i_manufact_id#5, d_qoy#19, sum#22]
Keys [2]: [i_manufact_id#5, d_qoy#19]
Functions [1]: [sum(UnscaledValue(ss_sales_price#14))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#14))#24]
Results [3]: [i_manufact_id#5, MakeDecimal(sum(UnscaledValue(ss_sales_price#14))#24,17,2) AS sum_sales#25, MakeDecimal(sum(UnscaledValue(ss_sales_price#14))#24,17,2) AS _w0#26]
(27) Exchange
Input [3]: [i_manufact_id#5, sum_sales#25, _w0#26]
Arguments: hashpartitioning(i_manufact_id#5, 5), true, [id=#27]
(28) Sort [codegen id : 6]
Input [3]: [i_manufact_id#5, sum_sales#25, _w0#26]
Arguments: [i_manufact_id#5 ASC NULLS FIRST], false, 0
(29) Window
Input [3]: [i_manufact_id#5, sum_sales#25, _w0#26]
Arguments: [avg(_w0#26) windowspecdefinition(i_manufact_id#5, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS avg_quarterly_sales#28], [i_manufact_id#5]
(30) Filter [codegen id : 7]
Input [4]: [i_manufact_id#5, sum_sales#25, _w0#26, avg_quarterly_sales#28]
Condition : (CASE WHEN (avg_quarterly_sales#28 > 0.000000) THEN CheckOverflow((promote_precision(abs(CheckOverflow((promote_precision(cast(sum_sales#25 as decimal(22,6))) - promote_precision(cast(avg_quarterly_sales#28 as decimal(22,6)))), DecimalType(22,6), true))) / promote_precision(cast(avg_quarterly_sales#28 as decimal(22,6)))), DecimalType(38,16), true) ELSE null END > 0.1000000000000000)
(31) Project [codegen id : 7]
Output [3]: [i_manufact_id#5, sum_sales#25, avg_quarterly_sales#28]
Input [4]: [i_manufact_id#5, sum_sales#25, _w0#26, avg_quarterly_sales#28]
(32) TakeOrderedAndProject
Input [3]: [i_manufact_id#5, sum_sales#25, avg_quarterly_sales#28]
Arguments: 100, [avg_quarterly_sales#28 ASC NULLS FIRST, sum_sales#25 ASC NULLS FIRST, i_manufact_id#5 ASC NULLS FIRST], [i_manufact_id#5, sum_sales#25, avg_quarterly_sales#28]

@ -0,0 +1,49 @@
TakeOrderedAndProject [avg_quarterly_sales,i_manufact_id,sum_sales]
WholeStageCodegen (7)
Project [avg_quarterly_sales,i_manufact_id,sum_sales]
Filter [avg_quarterly_sales,sum_sales]
InputAdapter
Window [_w0,i_manufact_id]
WholeStageCodegen (6)
Sort [i_manufact_id]
InputAdapter
Exchange [i_manufact_id] #1
WholeStageCodegen (5)
HashAggregate [d_qoy,i_manufact_id,sum] [_w0,sum,sum(UnscaledValue(ss_sales_price)),sum_sales]
InputAdapter
Exchange [d_qoy,i_manufact_id] #2
WholeStageCodegen (4)
HashAggregate [d_qoy,i_manufact_id,ss_sales_price] [sum,sum]
Project [d_qoy,i_manufact_id,ss_sales_price]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [i_manufact_id,ss_sales_price,ss_sold_date_sk]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [i_manufact_id,ss_sales_price,ss_sold_date_sk,ss_store_sk]
BroadcastHashJoin [i_item_sk,ss_item_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Project [i_item_sk,i_manufact_id]
Filter [i_brand,i_category,i_class,i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_category,i_class,i_item_sk,i_manufact_id]
Filter [ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (2)
Filter [s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Project [d_date_sk,d_qoy]
Filter [d_date_sk,d_month_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_month_seq,d_qoy]

@ -0,0 +1,180 @@
== Physical Plan ==
TakeOrderedAndProject (32)
+- * Project (31)
+- * Filter (30)
+- Window (29)
+- * Sort (28)
+- Exchange (27)
+- * HashAggregate (26)
+- Exchange (25)
+- * HashAggregate (24)
+- * Project (23)
+- * BroadcastHashJoin Inner BuildRight (22)
:- * Project (17)
: +- * BroadcastHashJoin Inner BuildRight (16)
: :- * Project (10)
: : +- * BroadcastHashJoin Inner BuildRight (9)
: : :- * Project (4)
: : : +- * Filter (3)
: : : +- * ColumnarToRow (2)
: : : +- Scan parquet default.item (1)
: : +- BroadcastExchange (8)
: : +- * Filter (7)
: : +- * ColumnarToRow (6)
: : +- Scan parquet default.store_sales (5)
: +- BroadcastExchange (15)
: +- * Project (14)
: +- * Filter (13)
: +- * ColumnarToRow (12)
: +- Scan parquet default.date_dim (11)
+- BroadcastExchange (21)
+- * Filter (20)
+- * ColumnarToRow (19)
+- Scan parquet default.store (18)
(1) Scan parquet default.item
Output [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manufact_id#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [Or(And(And(In(i_category, [Books,Children,Electronics]),In(i_class, [personal,portable,reference,self-help])),In(i_brand, [scholaramalgamalg #6,scholaramalgamalg #7,exportiunivamalg #8,scholaramalgamalg #8])),And(And(In(i_category, [Women,Music,Men]),In(i_class, [accessories,classical,fragrances,pants])),In(i_brand, [amalgimporto #9,edu packscholar #9,exportiimporto #9,importoamalg #9]))), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand:string,i_class:string,i_category:string,i_manufact_id:int>
(2) ColumnarToRow [codegen id : 4]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manufact_id#5]
(3) Filter [codegen id : 4]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manufact_id#5]
Condition : ((((i_category#4 IN (Books,Children,Electronics) AND i_class#3 IN (personal,portable,reference,self-help)) AND i_brand#2 IN (scholaramalgamalg #6,scholaramalgamalg #7,exportiunivamalg #8,scholaramalgamalg #8)) OR ((i_category#4 IN (Women,Music,Men) AND i_class#3 IN (accessories,classical,fragrances,pants)) AND i_brand#2 IN (amalgimporto #9,edu packscholar #9,exportiimporto #9,importoamalg #9))) AND isnotnull(i_item_sk#1))
(4) Project [codegen id : 4]
Output [2]: [i_item_sk#1, i_manufact_id#5]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manufact_id#5]
(5) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451911), LessThanOrEqual(ss_sold_date_sk,2452275), IsNotNull(ss_item_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(6) ColumnarToRow [codegen id : 1]
Input [4]: [ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
(7) Filter [codegen id : 1]
Input [4]: [ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
Condition : ((((isnotnull(ss_sold_date_sk#10) AND (ss_sold_date_sk#10 >= 2451911)) AND (ss_sold_date_sk#10 <= 2452275)) AND isnotnull(ss_item_sk#11)) AND isnotnull(ss_store_sk#12))
(8) BroadcastExchange
Input [4]: [ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, false] as bigint)),false), [id=#14]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [i_item_sk#1]
Right keys [1]: [ss_item_sk#11]
Join condition: None
(10) Project [codegen id : 4]
Output [4]: [i_manufact_id#5, ss_sold_date_sk#10, ss_store_sk#12, ss_sales_price#13]
Input [6]: [i_item_sk#1, i_manufact_id#5, ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
(11) Scan parquet default.date_dim
Output [3]: [d_date_sk#15, d_month_seq#16, d_qoy#17]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [In(d_month_seq, [1222,1215,1223,1217,1214,1219,1213,1218,1220,1221,1216,1212]), GreaterThanOrEqual(d_date_sk,2451911), LessThanOrEqual(d_date_sk,2452275), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_month_seq:int,d_qoy:int>
(12) ColumnarToRow [codegen id : 2]
Input [3]: [d_date_sk#15, d_month_seq#16, d_qoy#17]
(13) Filter [codegen id : 2]
Input [3]: [d_date_sk#15, d_month_seq#16, d_qoy#17]
Condition : (((d_month_seq#16 INSET (1222,1215,1223,1217,1214,1219,1213,1218,1220,1221,1216,1212) AND (d_date_sk#15 >= 2451911)) AND (d_date_sk#15 <= 2452275)) AND isnotnull(d_date_sk#15))
(14) Project [codegen id : 2]
Output [2]: [d_date_sk#15, d_qoy#17]
Input [3]: [d_date_sk#15, d_month_seq#16, d_qoy#17]
(15) BroadcastExchange
Input [2]: [d_date_sk#15, d_qoy#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#18]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#10]
Right keys [1]: [d_date_sk#15]
Join condition: None
(17) Project [codegen id : 4]
Output [4]: [i_manufact_id#5, ss_store_sk#12, ss_sales_price#13, d_qoy#17]
Input [6]: [i_manufact_id#5, ss_sold_date_sk#10, ss_store_sk#12, ss_sales_price#13, d_date_sk#15, d_qoy#17]
(18) Scan parquet default.store
Output [1]: [s_store_sk#19]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int>
(19) ColumnarToRow [codegen id : 3]
Input [1]: [s_store_sk#19]
(20) Filter [codegen id : 3]
Input [1]: [s_store_sk#19]
Condition : isnotnull(s_store_sk#19)
(21) BroadcastExchange
Input [1]: [s_store_sk#19]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#20]
(22) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#12]
Right keys [1]: [s_store_sk#19]
Join condition: None
(23) Project [codegen id : 4]
Output [3]: [i_manufact_id#5, ss_sales_price#13, d_qoy#17]
Input [5]: [i_manufact_id#5, ss_store_sk#12, ss_sales_price#13, d_qoy#17, s_store_sk#19]
(24) HashAggregate [codegen id : 4]
Input [3]: [i_manufact_id#5, ss_sales_price#13, d_qoy#17]
Keys [2]: [i_manufact_id#5, d_qoy#17]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#13))]
Aggregate Attributes [1]: [sum#21]
Results [3]: [i_manufact_id#5, d_qoy#17, sum#22]
(25) Exchange
Input [3]: [i_manufact_id#5, d_qoy#17, sum#22]
Arguments: hashpartitioning(i_manufact_id#5, d_qoy#17, 5), true, [id=#23]
(26) HashAggregate [codegen id : 5]
Input [3]: [i_manufact_id#5, d_qoy#17, sum#22]
Keys [2]: [i_manufact_id#5, d_qoy#17]
Functions [1]: [sum(UnscaledValue(ss_sales_price#13))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#13))#24]
Results [3]: [i_manufact_id#5, MakeDecimal(sum(UnscaledValue(ss_sales_price#13))#24,17,2) AS sum_sales#25, MakeDecimal(sum(UnscaledValue(ss_sales_price#13))#24,17,2) AS _w0#26]
(27) Exchange
Input [3]: [i_manufact_id#5, sum_sales#25, _w0#26]
Arguments: hashpartitioning(i_manufact_id#5, 5), true, [id=#27]
(28) Sort [codegen id : 6]
Input [3]: [i_manufact_id#5, sum_sales#25, _w0#26]
Arguments: [i_manufact_id#5 ASC NULLS FIRST], false, 0
(29) Window
Input [3]: [i_manufact_id#5, sum_sales#25, _w0#26]
Arguments: [avg(_w0#26) windowspecdefinition(i_manufact_id#5, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS avg_quarterly_sales#28], [i_manufact_id#5]
(30) Filter [codegen id : 7]
Input [4]: [i_manufact_id#5, sum_sales#25, _w0#26, avg_quarterly_sales#28]
Condition : (CASE WHEN (avg_quarterly_sales#28 > 0.000000) THEN CheckOverflow((promote_precision(abs(CheckOverflow((promote_precision(cast(sum_sales#25 as decimal(22,6))) - promote_precision(cast(avg_quarterly_sales#28 as decimal(22,6)))), DecimalType(22,6), true))) / promote_precision(cast(avg_quarterly_sales#28 as decimal(22,6)))), DecimalType(38,16), true) ELSE null END > 0.1000000000000000)
(31) Project [codegen id : 7]
Output [3]: [i_manufact_id#5, sum_sales#25, avg_quarterly_sales#28]
Input [4]: [i_manufact_id#5, sum_sales#25, _w0#26, avg_quarterly_sales#28]
(32) TakeOrderedAndProject
Input [3]: [i_manufact_id#5, sum_sales#25, avg_quarterly_sales#28]
Arguments: 100, [avg_quarterly_sales#28 ASC NULLS FIRST, sum_sales#25 ASC NULLS FIRST, i_manufact_id#5 ASC NULLS FIRST], [i_manufact_id#5, sum_sales#25, avg_quarterly_sales#28]

@ -0,0 +1,49 @@
TakeOrderedAndProject [avg_quarterly_sales,i_manufact_id,sum_sales]
WholeStageCodegen (7)
Project [avg_quarterly_sales,i_manufact_id,sum_sales]
Filter [avg_quarterly_sales,sum_sales]
InputAdapter
Window [_w0,i_manufact_id]
WholeStageCodegen (6)
Sort [i_manufact_id]
InputAdapter
Exchange [i_manufact_id] #1
WholeStageCodegen (5)
HashAggregate [d_qoy,i_manufact_id,sum] [_w0,sum,sum(UnscaledValue(ss_sales_price)),sum_sales]
InputAdapter
Exchange [d_qoy,i_manufact_id] #2
WholeStageCodegen (4)
HashAggregate [d_qoy,i_manufact_id,ss_sales_price] [sum,sum]
Project [d_qoy,i_manufact_id,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [d_qoy,i_manufact_id,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [i_manufact_id,ss_sales_price,ss_sold_date_sk,ss_store_sk]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [i_item_sk,i_manufact_id]
Filter [i_brand,i_category,i_class,i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_category,i_class,i_item_sk,i_manufact_id]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Filter [ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (2)
Project [d_date_sk,d_qoy]
Filter [d_date_sk,d_month_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_month_seq,d_qoy]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Filter [s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_sk]

@ -0,0 +1,122 @@
== Physical Plan ==
TakeOrderedAndProject (21)
+- * HashAggregate (20)
+- Exchange (19)
+- * HashAggregate (18)
+- * Project (17)
+- * BroadcastHashJoin Inner BuildRight (16)
:- * Project (10)
: +- * BroadcastHashJoin Inner BuildLeft (9)
: :- BroadcastExchange (5)
: : +- * Project (4)
: : +- * Filter (3)
: : +- * ColumnarToRow (2)
: : +- Scan parquet default.date_dim (1)
: +- * Filter (8)
: +- * ColumnarToRow (7)
: +- Scan parquet default.store_sales (6)
+- BroadcastExchange (15)
+- * Project (14)
+- * Filter (13)
+- * ColumnarToRow (12)
+- Scan parquet default.item (11)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_moy#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_moy,11), EqualTo(d_year,2001), GreaterThanOrEqual(d_date_sk,2452215), LessThanOrEqual(d_date_sk,2452244), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(2) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(3) Filter [codegen id : 1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
Condition : ((((((isnotnull(d_moy#3) AND isnotnull(d_year#2)) AND (d_moy#3 = 11)) AND (d_year#2 = 2001)) AND (d_date_sk#1 >= 2452215)) AND (d_date_sk#1 <= 2452244)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 1]
Output [1]: [d_date_sk#1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(5) BroadcastExchange
Input [1]: [d_date_sk#1]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#4]
(6) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2452215), LessThanOrEqual(ss_sold_date_sk,2452244), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
(7) ColumnarToRow
Input [3]: [ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
(8) Filter
Input [3]: [ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
Condition : (((isnotnull(ss_sold_date_sk#5) AND (ss_sold_date_sk#5 >= 2452215)) AND (ss_sold_date_sk#5 <= 2452244)) AND isnotnull(ss_item_sk#6))
(9) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#5]
Join condition: None
(10) Project [codegen id : 3]
Output [2]: [ss_item_sk#6, ss_ext_sales_price#7]
Input [4]: [d_date_sk#1, ss_sold_date_sk#5, ss_item_sk#6, ss_ext_sales_price#7]
(11) Scan parquet default.item
Output [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [IsNotNull(i_manager_id), EqualTo(i_manager_id,48), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand_id:int,i_brand:string,i_manager_id:int>
(12) ColumnarToRow [codegen id : 2]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
(13) Filter [codegen id : 2]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
Condition : ((isnotnull(i_manager_id#11) AND (i_manager_id#11 = 48)) AND isnotnull(i_item_sk#8))
(14) Project [codegen id : 2]
Output [3]: [i_item_sk#8, i_brand_id#9, i_brand#10]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
(15) BroadcastExchange
Input [3]: [i_item_sk#8, i_brand_id#9, i_brand#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_item_sk#6]
Right keys [1]: [i_item_sk#8]
Join condition: None
(17) Project [codegen id : 3]
Output [3]: [ss_ext_sales_price#7, i_brand_id#9, i_brand#10]
Input [5]: [ss_item_sk#6, ss_ext_sales_price#7, i_item_sk#8, i_brand_id#9, i_brand#10]
(18) HashAggregate [codegen id : 3]
Input [3]: [ss_ext_sales_price#7, i_brand_id#9, i_brand#10]
Keys [2]: [i_brand#10, i_brand_id#9]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#7))]
Aggregate Attributes [1]: [sum#13]
Results [3]: [i_brand#10, i_brand_id#9, sum#14]
(19) Exchange
Input [3]: [i_brand#10, i_brand_id#9, sum#14]
Arguments: hashpartitioning(i_brand#10, i_brand_id#9, 5), true, [id=#15]
(20) HashAggregate [codegen id : 4]
Input [3]: [i_brand#10, i_brand_id#9, sum#14]
Keys [2]: [i_brand#10, i_brand_id#9]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#7))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#7))#16]
Results [3]: [i_brand_id#9 AS brand_id#17, i_brand#10 AS brand#18, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#7))#16,17,2) AS ext_price#19]
(21) TakeOrderedAndProject
Input [3]: [brand_id#17, brand#18, ext_price#19]
Arguments: 100, [ext_price#19 DESC NULLS LAST, brand_id#17 ASC NULLS FIRST], [brand_id#17, brand#18, ext_price#19]

@ -0,0 +1,31 @@
TakeOrderedAndProject [brand,brand_id,ext_price]
WholeStageCodegen (4)
HashAggregate [i_brand,i_brand_id,sum] [brand,brand_id,ext_price,sum,sum(UnscaledValue(ss_ext_sales_price))]
InputAdapter
Exchange [i_brand,i_brand_id] #1
WholeStageCodegen (3)
HashAggregate [i_brand,i_brand_id,ss_ext_sales_price] [sum,sum]
Project [i_brand,i_brand_id,ss_ext_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_ext_sales_price,ss_item_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
Filter [ss_item_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_ext_sales_price,ss_item_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [i_brand,i_brand_id,i_item_sk]
Filter [i_item_sk,i_manager_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_brand_id,i_item_sk,i_manager_id]

@ -0,0 +1,122 @@
== Physical Plan ==
TakeOrderedAndProject (21)
+- * HashAggregate (20)
+- Exchange (19)
+- * HashAggregate (18)
+- * Project (17)
+- * BroadcastHashJoin Inner BuildRight (16)
:- * Project (10)
: +- * BroadcastHashJoin Inner BuildRight (9)
: :- * Project (4)
: : +- * Filter (3)
: : +- * ColumnarToRow (2)
: : +- Scan parquet default.date_dim (1)
: +- BroadcastExchange (8)
: +- * Filter (7)
: +- * ColumnarToRow (6)
: +- Scan parquet default.store_sales (5)
+- BroadcastExchange (15)
+- * Project (14)
+- * Filter (13)
+- * ColumnarToRow (12)
+- Scan parquet default.item (11)
(1) Scan parquet default.date_dim
Output [3]: [d_date_sk#1, d_year#2, d_moy#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_moy,11), EqualTo(d_year,2001), GreaterThanOrEqual(d_date_sk,2452215), LessThanOrEqual(d_date_sk,2452244), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(2) ColumnarToRow [codegen id : 3]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(3) Filter [codegen id : 3]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
Condition : ((((((isnotnull(d_moy#3) AND isnotnull(d_year#2)) AND (d_moy#3 = 11)) AND (d_year#2 = 2001)) AND (d_date_sk#1 >= 2452215)) AND (d_date_sk#1 <= 2452244)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 3]
Output [1]: [d_date_sk#1]
Input [3]: [d_date_sk#1, d_year#2, d_moy#3]
(5) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2452215), LessThanOrEqual(ss_sold_date_sk,2452244), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
(6) ColumnarToRow [codegen id : 1]
Input [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
(7) Filter [codegen id : 1]
Input [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
Condition : (((isnotnull(ss_sold_date_sk#4) AND (ss_sold_date_sk#4 >= 2452215)) AND (ss_sold_date_sk#4 <= 2452244)) AND isnotnull(ss_item_sk#5))
(8) BroadcastExchange
Input [3]: [ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7]
(9) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#4]
Join condition: None
(10) Project [codegen id : 3]
Output [2]: [ss_item_sk#5, ss_ext_sales_price#6]
Input [4]: [d_date_sk#1, ss_sold_date_sk#4, ss_item_sk#5, ss_ext_sales_price#6]
(11) Scan parquet default.item
Output [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [IsNotNull(i_manager_id), EqualTo(i_manager_id,48), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand_id:int,i_brand:string,i_manager_id:int>
(12) ColumnarToRow [codegen id : 2]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
(13) Filter [codegen id : 2]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
Condition : ((isnotnull(i_manager_id#11) AND (i_manager_id#11 = 48)) AND isnotnull(i_item_sk#8))
(14) Project [codegen id : 2]
Output [3]: [i_item_sk#8, i_brand_id#9, i_brand#10]
Input [4]: [i_item_sk#8, i_brand_id#9, i_brand#10, i_manager_id#11]
(15) BroadcastExchange
Input [3]: [i_item_sk#8, i_brand_id#9, i_brand#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_item_sk#5]
Right keys [1]: [i_item_sk#8]
Join condition: None
(17) Project [codegen id : 3]
Output [3]: [ss_ext_sales_price#6, i_brand_id#9, i_brand#10]
Input [5]: [ss_item_sk#5, ss_ext_sales_price#6, i_item_sk#8, i_brand_id#9, i_brand#10]
(18) HashAggregate [codegen id : 3]
Input [3]: [ss_ext_sales_price#6, i_brand_id#9, i_brand#10]
Keys [2]: [i_brand#10, i_brand_id#9]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#6))]
Aggregate Attributes [1]: [sum#13]
Results [3]: [i_brand#10, i_brand_id#9, sum#14]
(19) Exchange
Input [3]: [i_brand#10, i_brand_id#9, sum#14]
Arguments: hashpartitioning(i_brand#10, i_brand_id#9, 5), true, [id=#15]
(20) HashAggregate [codegen id : 4]
Input [3]: [i_brand#10, i_brand_id#9, sum#14]
Keys [2]: [i_brand#10, i_brand_id#9]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#6))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#6))#16]
Results [3]: [i_brand_id#9 AS brand_id#17, i_brand#10 AS brand#18, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#6))#16,17,2) AS ext_price#19]
(21) TakeOrderedAndProject
Input [3]: [brand_id#17, brand#18, ext_price#19]
Arguments: 100, [ext_price#19 DESC NULLS LAST, brand_id#17 ASC NULLS FIRST], [brand_id#17, brand#18, ext_price#19]

@ -0,0 +1,31 @@
TakeOrderedAndProject [brand,brand_id,ext_price]
WholeStageCodegen (4)
HashAggregate [i_brand,i_brand_id,sum] [brand,brand_id,ext_price,sum,sum(UnscaledValue(ss_ext_sales_price))]
InputAdapter
Exchange [i_brand,i_brand_id] #1
WholeStageCodegen (3)
HashAggregate [i_brand,i_brand_id,ss_ext_sales_price] [sum,sum]
Project [i_brand,i_brand_id,ss_ext_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_ext_sales_price,ss_item_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [d_date_sk]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Filter [ss_item_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_ext_sales_price,ss_item_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [i_brand,i_brand_id,i_item_sk]
Filter [i_item_sk,i_manager_id]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_brand_id,i_item_sk,i_manager_id]

@ -0,0 +1,290 @@
== Physical Plan ==
TakeOrderedAndProject (51)
+- * Project (50)
+- * BroadcastHashJoin Inner BuildRight (49)
:- * Project (25)
: +- * BroadcastHashJoin Inner BuildRight (24)
: :- * Project (18)
: : +- * BroadcastHashJoin Inner BuildRight (17)
: : :- * HashAggregate (12)
: : : +- Exchange (11)
: : : +- * HashAggregate (10)
: : : +- * Project (9)
: : : +- * BroadcastHashJoin Inner BuildRight (8)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_sales (1)
: : : +- BroadcastExchange (7)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.date_dim (4)
: : +- BroadcastExchange (16)
: : +- * Filter (15)
: : +- * ColumnarToRow (14)
: : +- Scan parquet default.store (13)
: +- BroadcastExchange (23)
: +- * Project (22)
: +- * Filter (21)
: +- * ColumnarToRow (20)
: +- Scan parquet default.date_dim (19)
+- BroadcastExchange (48)
+- * Project (47)
+- * BroadcastHashJoin Inner BuildRight (46)
:- * Project (40)
: +- * BroadcastHashJoin Inner BuildRight (39)
: :- * HashAggregate (34)
: : +- Exchange (33)
: : +- * HashAggregate (32)
: : +- * Project (31)
: : +- * BroadcastHashJoin Inner BuildRight (30)
: : :- * Filter (28)
: : : +- * ColumnarToRow (27)
: : : +- Scan parquet default.store_sales (26)
: : +- ReusedExchange (29)
: +- BroadcastExchange (38)
: +- * Filter (37)
: +- * ColumnarToRow (36)
: +- Scan parquet default.store (35)
+- BroadcastExchange (45)
+- * Project (44)
+- * Filter (43)
+- * ColumnarToRow (42)
+- Scan parquet default.date_dim (41)
(1) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(2) ColumnarToRow [codegen id : 2]
Input [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
(3) Filter [codegen id : 2]
Input [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
Condition : (isnotnull(ss_sold_date_sk#1) AND isnotnull(ss_store_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_date_sk), IsNotNull(d_week_seq)]
ReadSchema: struct<d_date_sk:int,d_week_seq:int,d_day_name:string>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
Condition : (isnotnull(d_date_sk#4) AND isnotnull(d_week_seq#5))
(7) BroadcastExchange
Input [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7]
(8) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#4]
Join condition: None
(9) Project [codegen id : 2]
Output [4]: [ss_store_sk#2, ss_sales_price#3, d_week_seq#5, d_day_name#6]
Input [6]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3, d_date_sk#4, d_week_seq#5, d_day_name#6]
(10) HashAggregate [codegen id : 2]
Input [4]: [ss_store_sk#2, ss_sales_price#3, d_week_seq#5, d_day_name#6]
Keys [2]: [d_week_seq#5, ss_store_sk#2]
Functions [7]: [partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Tuesday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))]
Aggregate Attributes [7]: [sum#8, sum#9, sum#10, sum#11, sum#12, sum#13, sum#14]
Results [9]: [d_week_seq#5, ss_store_sk#2, sum#15, sum#16, sum#17, sum#18, sum#19, sum#20, sum#21]
(11) Exchange
Input [9]: [d_week_seq#5, ss_store_sk#2, sum#15, sum#16, sum#17, sum#18, sum#19, sum#20, sum#21]
Arguments: hashpartitioning(d_week_seq#5, ss_store_sk#2, 5), true, [id=#22]
(12) HashAggregate [codegen id : 10]
Input [9]: [d_week_seq#5, ss_store_sk#2, sum#15, sum#16, sum#17, sum#18, sum#19, sum#20, sum#21]
Keys [2]: [d_week_seq#5, ss_store_sk#2]
Functions [7]: [sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Tuesday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))]
Aggregate Attributes [7]: [sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END))#23, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END))#24, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Tuesday) THEN ss_sales_price#3 ELSE null END))#25, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END))#26, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END))#27, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END))#28, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))#29]
Results [9]: [d_week_seq#5, ss_store_sk#2, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END))#23,17,2) AS sun_sales#30, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END))#24,17,2) AS mon_sales#31, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Tuesday) THEN ss_sales_price#3 ELSE null END))#25,17,2) AS tue_sales#32, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END))#26,17,2) AS wed_sales#33, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END))#27,17,2) AS thu_sales#34, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END))#28,17,2) AS fri_sales#35, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))#29,17,2) AS sat_sales#36]
(13) Scan parquet default.store
Output [3]: [s_store_sk#37, s_store_id#38, s_store_name#39]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_store_sk), IsNotNull(s_store_id)]
ReadSchema: struct<s_store_sk:int,s_store_id:string,s_store_name:string>
(14) ColumnarToRow [codegen id : 3]
Input [3]: [s_store_sk#37, s_store_id#38, s_store_name#39]
(15) Filter [codegen id : 3]
Input [3]: [s_store_sk#37, s_store_id#38, s_store_name#39]
Condition : (isnotnull(s_store_sk#37) AND isnotnull(s_store_id#38))
(16) BroadcastExchange
Input [3]: [s_store_sk#37, s_store_id#38, s_store_name#39]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#40]
(17) BroadcastHashJoin [codegen id : 10]
Left keys [1]: [ss_store_sk#2]
Right keys [1]: [s_store_sk#37]
Join condition: None
(18) Project [codegen id : 10]
Output [10]: [d_week_seq#5, sun_sales#30, mon_sales#31, tue_sales#32, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_id#38, s_store_name#39]
Input [12]: [d_week_seq#5, ss_store_sk#2, sun_sales#30, mon_sales#31, tue_sales#32, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_sk#37, s_store_id#38, s_store_name#39]
(19) Scan parquet default.date_dim
Output [2]: [d_month_seq#41, d_week_seq#42]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_month_seq), GreaterThanOrEqual(d_month_seq,1185), LessThanOrEqual(d_month_seq,1196), IsNotNull(d_week_seq)]
ReadSchema: struct<d_month_seq:int,d_week_seq:int>
(20) ColumnarToRow [codegen id : 4]
Input [2]: [d_month_seq#41, d_week_seq#42]
(21) Filter [codegen id : 4]
Input [2]: [d_month_seq#41, d_week_seq#42]
Condition : (((isnotnull(d_month_seq#41) AND (d_month_seq#41 >= 1185)) AND (d_month_seq#41 <= 1196)) AND isnotnull(d_week_seq#42))
(22) Project [codegen id : 4]
Output [1]: [d_week_seq#42]
Input [2]: [d_month_seq#41, d_week_seq#42]
(23) BroadcastExchange
Input [1]: [d_week_seq#42]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#43]
(24) BroadcastHashJoin [codegen id : 10]
Left keys [1]: [d_week_seq#5]
Right keys [1]: [d_week_seq#42]
Join condition: None
(25) Project [codegen id : 10]
Output [10]: [s_store_name#39 AS s_store_name1#44, d_week_seq#5 AS d_week_seq1#45, s_store_id#38 AS s_store_id1#46, sun_sales#30 AS sun_sales1#47, mon_sales#31 AS mon_sales1#48, tue_sales#32 AS tue_sales1#49, wed_sales#33 AS wed_sales1#50, thu_sales#34 AS thu_sales1#51, fri_sales#35 AS fri_sales1#52, sat_sales#36 AS sat_sales1#53]
Input [11]: [d_week_seq#5, sun_sales#30, mon_sales#31, tue_sales#32, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_id#38, s_store_name#39, d_week_seq#42]
(26) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(27) ColumnarToRow [codegen id : 6]
Input [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
(28) Filter [codegen id : 6]
Input [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
Condition : (isnotnull(ss_sold_date_sk#1) AND isnotnull(ss_store_sk#2))
(29) ReusedExchange [Reuses operator id: 7]
Output [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
(30) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#4]
Join condition: None
(31) Project [codegen id : 6]
Output [4]: [ss_store_sk#2, ss_sales_price#3, d_week_seq#5, d_day_name#6]
Input [6]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3, d_date_sk#4, d_week_seq#5, d_day_name#6]
(32) HashAggregate [codegen id : 6]
Input [4]: [ss_store_sk#2, ss_sales_price#3, d_week_seq#5, d_day_name#6]
Keys [2]: [d_week_seq#5, ss_store_sk#2]
Functions [6]: [partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))]
Aggregate Attributes [6]: [sum#54, sum#55, sum#56, sum#57, sum#58, sum#59]
Results [8]: [d_week_seq#5, ss_store_sk#2, sum#60, sum#61, sum#62, sum#63, sum#64, sum#65]
(33) Exchange
Input [8]: [d_week_seq#5, ss_store_sk#2, sum#60, sum#61, sum#62, sum#63, sum#64, sum#65]
Arguments: hashpartitioning(d_week_seq#5, ss_store_sk#2, 5), true, [id=#66]
(34) HashAggregate [codegen id : 9]
Input [8]: [d_week_seq#5, ss_store_sk#2, sum#60, sum#61, sum#62, sum#63, sum#64, sum#65]
Keys [2]: [d_week_seq#5, ss_store_sk#2]
Functions [6]: [sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))]
Aggregate Attributes [6]: [sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END))#67, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END))#68, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END))#69, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END))#70, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END))#71, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))#72]
Results [8]: [d_week_seq#5, ss_store_sk#2, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END))#67,17,2) AS sun_sales#30, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END))#68,17,2) AS mon_sales#31, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END))#69,17,2) AS wed_sales#33, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END))#70,17,2) AS thu_sales#34, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END))#71,17,2) AS fri_sales#35, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))#72,17,2) AS sat_sales#36]
(35) Scan parquet default.store
Output [2]: [s_store_sk#37, s_store_id#38]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_store_sk), IsNotNull(s_store_id)]
ReadSchema: struct<s_store_sk:int,s_store_id:string>
(36) ColumnarToRow [codegen id : 7]
Input [2]: [s_store_sk#37, s_store_id#38]
(37) Filter [codegen id : 7]
Input [2]: [s_store_sk#37, s_store_id#38]
Condition : (isnotnull(s_store_sk#37) AND isnotnull(s_store_id#38))
(38) BroadcastExchange
Input [2]: [s_store_sk#37, s_store_id#38]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#73]
(39) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ss_store_sk#2]
Right keys [1]: [s_store_sk#37]
Join condition: None
(40) Project [codegen id : 9]
Output [8]: [d_week_seq#5, sun_sales#30, mon_sales#31, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_id#38]
Input [10]: [d_week_seq#5, ss_store_sk#2, sun_sales#30, mon_sales#31, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_sk#37, s_store_id#38]
(41) Scan parquet default.date_dim
Output [2]: [d_month_seq#74, d_week_seq#75]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_month_seq), GreaterThanOrEqual(d_month_seq,1197), LessThanOrEqual(d_month_seq,1208), IsNotNull(d_week_seq)]
ReadSchema: struct<d_month_seq:int,d_week_seq:int>
(42) ColumnarToRow [codegen id : 8]
Input [2]: [d_month_seq#74, d_week_seq#75]
(43) Filter [codegen id : 8]
Input [2]: [d_month_seq#74, d_week_seq#75]
Condition : (((isnotnull(d_month_seq#74) AND (d_month_seq#74 >= 1197)) AND (d_month_seq#74 <= 1208)) AND isnotnull(d_week_seq#75))
(44) Project [codegen id : 8]
Output [1]: [d_week_seq#75]
Input [2]: [d_month_seq#74, d_week_seq#75]
(45) BroadcastExchange
Input [1]: [d_week_seq#75]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#76]
(46) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [d_week_seq#5]
Right keys [1]: [d_week_seq#75]
Join condition: None
(47) Project [codegen id : 9]
Output [8]: [d_week_seq#5 AS d_week_seq2#77, s_store_id#38 AS s_store_id2#78, sun_sales#30 AS sun_sales2#79, mon_sales#31 AS mon_sales2#80, wed_sales#33 AS wed_sales2#81, thu_sales#34 AS thu_sales2#82, fri_sales#35 AS fri_sales2#83, sat_sales#36 AS sat_sales2#84]
Input [9]: [d_week_seq#5, sun_sales#30, mon_sales#31, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_id#38, d_week_seq#75]
(48) BroadcastExchange
Input [8]: [d_week_seq2#77, s_store_id2#78, sun_sales2#79, mon_sales2#80, wed_sales2#81, thu_sales2#82, fri_sales2#83, sat_sales2#84]
Arguments: HashedRelationBroadcastMode(List(input[1, string, true], (input[0, int, true] - 52)),false), [id=#85]
(49) BroadcastHashJoin [codegen id : 10]
Left keys [2]: [s_store_id1#46, d_week_seq1#45]
Right keys [2]: [s_store_id2#78, (d_week_seq2#77 - 52)]
Join condition: None
(50) Project [codegen id : 10]
Output [10]: [s_store_name1#44, s_store_id1#46, d_week_seq1#45, CheckOverflow((promote_precision(sun_sales1#47) / promote_precision(sun_sales2#79)), DecimalType(37,20), true) AS (sun_sales1 / sun_sales2)#86, CheckOverflow((promote_precision(mon_sales1#48) / promote_precision(mon_sales2#80)), DecimalType(37,20), true) AS (mon_sales1 / mon_sales2)#87, CheckOverflow((promote_precision(tue_sales1#49) / promote_precision(tue_sales1#49)), DecimalType(37,20), true) AS (tue_sales1 / tue_sales1)#88, CheckOverflow((promote_precision(wed_sales1#50) / promote_precision(wed_sales2#81)), DecimalType(37,20), true) AS (wed_sales1 / wed_sales2)#89, CheckOverflow((promote_precision(thu_sales1#51) / promote_precision(thu_sales2#82)), DecimalType(37,20), true) AS (thu_sales1 / thu_sales2)#90, CheckOverflow((promote_precision(fri_sales1#52) / promote_precision(fri_sales2#83)), DecimalType(37,20), true) AS (fri_sales1 / fri_sales2)#91, CheckOverflow((promote_precision(sat_sales1#53) / promote_precision(sat_sales2#84)), DecimalType(37,20), true) AS (sat_sales1 / sat_sales2)#92]
Input [18]: [s_store_name1#44, d_week_seq1#45, s_store_id1#46, sun_sales1#47, mon_sales1#48, tue_sales1#49, wed_sales1#50, thu_sales1#51, fri_sales1#52, sat_sales1#53, d_week_seq2#77, s_store_id2#78, sun_sales2#79, mon_sales2#80, wed_sales2#81, thu_sales2#82, fri_sales2#83, sat_sales2#84]
(51) TakeOrderedAndProject
Input [10]: [s_store_name1#44, s_store_id1#46, d_week_seq1#45, (sun_sales1 / sun_sales2)#86, (mon_sales1 / mon_sales2)#87, (tue_sales1 / tue_sales1)#88, (wed_sales1 / wed_sales2)#89, (thu_sales1 / thu_sales2)#90, (fri_sales1 / fri_sales2)#91, (sat_sales1 / sat_sales2)#92]
Arguments: 100, [s_store_name1#44 ASC NULLS FIRST, s_store_id1#46 ASC NULLS FIRST, d_week_seq1#45 ASC NULLS FIRST], [s_store_name1#44, s_store_id1#46, d_week_seq1#45, (sun_sales1 / sun_sales2)#86, (mon_sales1 / mon_sales2)#87, (tue_sales1 / tue_sales1)#88, (wed_sales1 / wed_sales2)#89, (thu_sales1 / thu_sales2)#90, (fri_sales1 / fri_sales2)#91, (sat_sales1 / sat_sales2)#92]

@ -0,0 +1,76 @@
TakeOrderedAndProject [(fri_sales1 / fri_sales2),(mon_sales1 / mon_sales2),(sat_sales1 / sat_sales2),(sun_sales1 / sun_sales2),(thu_sales1 / thu_sales2),(tue_sales1 / tue_sales1),(wed_sales1 / wed_sales2),d_week_seq1,s_store_id1,s_store_name1]
WholeStageCodegen (10)
Project [d_week_seq1,fri_sales1,fri_sales2,mon_sales1,mon_sales2,s_store_id1,s_store_name1,sat_sales1,sat_sales2,sun_sales1,sun_sales2,thu_sales1,thu_sales2,tue_sales1,wed_sales1,wed_sales2]
BroadcastHashJoin [d_week_seq1,d_week_seq2,s_store_id1,s_store_id2]
Project [d_week_seq,fri_sales,mon_sales,s_store_id,s_store_name,sat_sales,sun_sales,thu_sales,tue_sales,wed_sales]
BroadcastHashJoin [d_week_seq,d_week_seq]
Project [d_week_seq,fri_sales,mon_sales,s_store_id,s_store_name,sat_sales,sun_sales,thu_sales,tue_sales,wed_sales]
BroadcastHashJoin [s_store_sk,ss_store_sk]
HashAggregate [d_week_seq,ss_store_sk,sum,sum,sum,sum,sum,sum,sum] [fri_sales,mon_sales,sat_sales,sum,sum,sum,sum,sum,sum,sum,sum(UnscaledValue(CASE WHEN (d_day_name = Friday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Monday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Saturday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Sunday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Thursday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Tuesday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Wednesday) THEN ss_sales_price ELSE null END)),sun_sales,thu_sales,tue_sales,wed_sales]
InputAdapter
Exchange [d_week_seq,ss_store_sk] #1
WholeStageCodegen (2)
HashAggregate [d_day_name,d_week_seq,ss_sales_price,ss_store_sk] [sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum]
Project [d_day_name,d_week_seq,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Filter [d_date_sk,d_week_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_day_name,d_week_seq]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (3)
Filter [s_store_id,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_id,s_store_name,s_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (4)
Project [d_week_seq]
Filter [d_month_seq,d_week_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_month_seq,d_week_seq]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (9)
Project [d_week_seq,fri_sales,mon_sales,s_store_id,sat_sales,sun_sales,thu_sales,wed_sales]
BroadcastHashJoin [d_week_seq,d_week_seq]
Project [d_week_seq,fri_sales,mon_sales,s_store_id,sat_sales,sun_sales,thu_sales,wed_sales]
BroadcastHashJoin [s_store_sk,ss_store_sk]
HashAggregate [d_week_seq,ss_store_sk,sum,sum,sum,sum,sum,sum] [fri_sales,mon_sales,sat_sales,sum,sum,sum,sum,sum,sum,sum(UnscaledValue(CASE WHEN (d_day_name = Friday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Monday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Saturday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Sunday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Thursday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Wednesday) THEN ss_sales_price ELSE null END)),sun_sales,thu_sales,wed_sales]
InputAdapter
Exchange [d_week_seq,ss_store_sk] #6
WholeStageCodegen (6)
HashAggregate [d_day_name,d_week_seq,ss_sales_price,ss_store_sk] [sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum]
Project [d_day_name,d_week_seq,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
ReusedExchange [d_date_sk,d_day_name,d_week_seq] #2
InputAdapter
BroadcastExchange #7
WholeStageCodegen (7)
Filter [s_store_id,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_id,s_store_sk]
InputAdapter
BroadcastExchange #8
WholeStageCodegen (8)
Project [d_week_seq]
Filter [d_month_seq,d_week_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_month_seq,d_week_seq]

@ -0,0 +1,290 @@
== Physical Plan ==
TakeOrderedAndProject (51)
+- * Project (50)
   +- * BroadcastHashJoin Inner BuildRight (49)
      :- * Project (25)
      :  +- * BroadcastHashJoin Inner BuildRight (24)
      :     :- * Project (18)
      :     :  +- * BroadcastHashJoin Inner BuildRight (17)
      :     :     :- * HashAggregate (12)
      :     :     :  +- Exchange (11)
      :     :     :     +- * HashAggregate (10)
      :     :     :        +- * Project (9)
      :     :     :           +- * BroadcastHashJoin Inner BuildRight (8)
      :     :     :              :- * Filter (3)
      :     :     :              :  +- * ColumnarToRow (2)
      :     :     :              :     +- Scan parquet default.store_sales (1)
      :     :     :              +- BroadcastExchange (7)
      :     :     :                 +- * Filter (6)
      :     :     :                    +- * ColumnarToRow (5)
      :     :     :                       +- Scan parquet default.date_dim (4)
      :     :     +- BroadcastExchange (16)
      :     :        +- * Filter (15)
      :     :           +- * ColumnarToRow (14)
      :     :              +- Scan parquet default.store (13)
      :     +- BroadcastExchange (23)
      :        +- * Project (22)
      :           +- * Filter (21)
      :              +- * ColumnarToRow (20)
      :                 +- Scan parquet default.date_dim (19)
      +- BroadcastExchange (48)
         +- * Project (47)
            +- * BroadcastHashJoin Inner BuildRight (46)
               :- * Project (40)
               :  +- * BroadcastHashJoin Inner BuildRight (39)
               :     :- * HashAggregate (34)
               :     :  +- Exchange (33)
               :     :     +- * HashAggregate (32)
               :     :        +- * Project (31)
               :     :           +- * BroadcastHashJoin Inner BuildRight (30)
               :     :              :- * Filter (28)
               :     :              :  +- * ColumnarToRow (27)
               :     :              :     +- Scan parquet default.store_sales (26)
               :     :              +- ReusedExchange (29)
               :     +- BroadcastExchange (38)
               :        +- * Filter (37)
               :           +- * ColumnarToRow (36)
               :              +- Scan parquet default.store (35)
               +- BroadcastExchange (45)
                  +- * Project (44)
                     +- * Filter (43)
                        +- * ColumnarToRow (42)
                           +- Scan parquet default.date_dim (41)
(1) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(2) ColumnarToRow [codegen id : 2]
Input [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
(3) Filter [codegen id : 2]
Input [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
Condition : (isnotnull(ss_sold_date_sk#1) AND isnotnull(ss_store_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_date_sk), IsNotNull(d_week_seq)]
ReadSchema: struct<d_date_sk:int,d_week_seq:int,d_day_name:string>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
Condition : (isnotnull(d_date_sk#4) AND isnotnull(d_week_seq#5))
(7) BroadcastExchange
Input [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7]
(8) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#4]
Join condition: None
(9) Project [codegen id : 2]
Output [4]: [ss_store_sk#2, ss_sales_price#3, d_week_seq#5, d_day_name#6]
Input [6]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3, d_date_sk#4, d_week_seq#5, d_day_name#6]
(10) HashAggregate [codegen id : 2]
Input [4]: [ss_store_sk#2, ss_sales_price#3, d_week_seq#5, d_day_name#6]
Keys [2]: [d_week_seq#5, ss_store_sk#2]
Functions [7]: [partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Tuesday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))]
Aggregate Attributes [7]: [sum#8, sum#9, sum#10, sum#11, sum#12, sum#13, sum#14]
Results [9]: [d_week_seq#5, ss_store_sk#2, sum#15, sum#16, sum#17, sum#18, sum#19, sum#20, sum#21]
(11) Exchange
Input [9]: [d_week_seq#5, ss_store_sk#2, sum#15, sum#16, sum#17, sum#18, sum#19, sum#20, sum#21]
Arguments: hashpartitioning(d_week_seq#5, ss_store_sk#2, 5), true, [id=#22]
(12) HashAggregate [codegen id : 10]
Input [9]: [d_week_seq#5, ss_store_sk#2, sum#15, sum#16, sum#17, sum#18, sum#19, sum#20, sum#21]
Keys [2]: [d_week_seq#5, ss_store_sk#2]
Functions [7]: [sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Tuesday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))]
Aggregate Attributes [7]: [sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END))#23, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END))#24, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Tuesday) THEN ss_sales_price#3 ELSE null END))#25, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END))#26, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END))#27, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END))#28, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))#29]
Results [9]: [d_week_seq#5, ss_store_sk#2, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END))#23,17,2) AS sun_sales#30, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END))#24,17,2) AS mon_sales#31, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Tuesday) THEN ss_sales_price#3 ELSE null END))#25,17,2) AS tue_sales#32, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END))#26,17,2) AS wed_sales#33, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END))#27,17,2) AS thu_sales#34, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END))#28,17,2) AS fri_sales#35, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))#29,17,2) AS sat_sales#36]
(13) Scan parquet default.store
Output [3]: [s_store_sk#37, s_store_id#38, s_store_name#39]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_store_sk), IsNotNull(s_store_id)]
ReadSchema: struct<s_store_sk:int,s_store_id:string,s_store_name:string>
(14) ColumnarToRow [codegen id : 3]
Input [3]: [s_store_sk#37, s_store_id#38, s_store_name#39]
(15) Filter [codegen id : 3]
Input [3]: [s_store_sk#37, s_store_id#38, s_store_name#39]
Condition : (isnotnull(s_store_sk#37) AND isnotnull(s_store_id#38))
(16) BroadcastExchange
Input [3]: [s_store_sk#37, s_store_id#38, s_store_name#39]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#40]
(17) BroadcastHashJoin [codegen id : 10]
Left keys [1]: [ss_store_sk#2]
Right keys [1]: [s_store_sk#37]
Join condition: None
(18) Project [codegen id : 10]
Output [10]: [d_week_seq#5, sun_sales#30, mon_sales#31, tue_sales#32, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_id#38, s_store_name#39]
Input [12]: [d_week_seq#5, ss_store_sk#2, sun_sales#30, mon_sales#31, tue_sales#32, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_sk#37, s_store_id#38, s_store_name#39]
(19) Scan parquet default.date_dim
Output [2]: [d_month_seq#41, d_week_seq#42]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_month_seq), GreaterThanOrEqual(d_month_seq,1185), LessThanOrEqual(d_month_seq,1196), IsNotNull(d_week_seq)]
ReadSchema: struct<d_month_seq:int,d_week_seq:int>
(20) ColumnarToRow [codegen id : 4]
Input [2]: [d_month_seq#41, d_week_seq#42]
(21) Filter [codegen id : 4]
Input [2]: [d_month_seq#41, d_week_seq#42]
Condition : (((isnotnull(d_month_seq#41) AND (d_month_seq#41 >= 1185)) AND (d_month_seq#41 <= 1196)) AND isnotnull(d_week_seq#42))
(22) Project [codegen id : 4]
Output [1]: [d_week_seq#42]
Input [2]: [d_month_seq#41, d_week_seq#42]
(23) BroadcastExchange
Input [1]: [d_week_seq#42]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#43]
(24) BroadcastHashJoin [codegen id : 10]
Left keys [1]: [d_week_seq#5]
Right keys [1]: [d_week_seq#42]
Join condition: None
(25) Project [codegen id : 10]
Output [10]: [s_store_name#39 AS s_store_name1#44, d_week_seq#5 AS d_week_seq1#45, s_store_id#38 AS s_store_id1#46, sun_sales#30 AS sun_sales1#47, mon_sales#31 AS mon_sales1#48, tue_sales#32 AS tue_sales1#49, wed_sales#33 AS wed_sales1#50, thu_sales#34 AS thu_sales1#51, fri_sales#35 AS fri_sales1#52, sat_sales#36 AS sat_sales1#53]
Input [11]: [d_week_seq#5, sun_sales#30, mon_sales#31, tue_sales#32, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_id#38, s_store_name#39, d_week_seq#42]
(26) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(27) ColumnarToRow [codegen id : 6]
Input [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
(28) Filter [codegen id : 6]
Input [3]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3]
Condition : (isnotnull(ss_sold_date_sk#1) AND isnotnull(ss_store_sk#2))
(29) ReusedExchange [Reuses operator id: 7]
Output [3]: [d_date_sk#4, d_week_seq#5, d_day_name#6]
(30) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#4]
Join condition: None
(31) Project [codegen id : 6]
Output [4]: [ss_store_sk#2, ss_sales_price#3, d_week_seq#5, d_day_name#6]
Input [6]: [ss_sold_date_sk#1, ss_store_sk#2, ss_sales_price#3, d_date_sk#4, d_week_seq#5, d_day_name#6]
(32) HashAggregate [codegen id : 6]
Input [4]: [ss_store_sk#2, ss_sales_price#3, d_week_seq#5, d_day_name#6]
Keys [2]: [d_week_seq#5, ss_store_sk#2]
Functions [6]: [partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END)), partial_sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))]
Aggregate Attributes [6]: [sum#54, sum#55, sum#56, sum#57, sum#58, sum#59]
Results [8]: [d_week_seq#5, ss_store_sk#2, sum#60, sum#61, sum#62, sum#63, sum#64, sum#65]
(33) Exchange
Input [8]: [d_week_seq#5, ss_store_sk#2, sum#60, sum#61, sum#62, sum#63, sum#64, sum#65]
Arguments: hashpartitioning(d_week_seq#5, ss_store_sk#2, 5), true, [id=#66]
(34) HashAggregate [codegen id : 9]
Input [8]: [d_week_seq#5, ss_store_sk#2, sum#60, sum#61, sum#62, sum#63, sum#64, sum#65]
Keys [2]: [d_week_seq#5, ss_store_sk#2]
Functions [6]: [sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END)), sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))]
Aggregate Attributes [6]: [sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END))#67, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END))#68, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END))#69, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END))#70, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END))#71, sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))#72]
Results [8]: [d_week_seq#5, ss_store_sk#2, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Sunday) THEN ss_sales_price#3 ELSE null END))#67,17,2) AS sun_sales#30, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Monday) THEN ss_sales_price#3 ELSE null END))#68,17,2) AS mon_sales#31, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Wednesday) THEN ss_sales_price#3 ELSE null END))#69,17,2) AS wed_sales#33, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Thursday) THEN ss_sales_price#3 ELSE null END))#70,17,2) AS thu_sales#34, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Friday) THEN ss_sales_price#3 ELSE null END))#71,17,2) AS fri_sales#35, MakeDecimal(sum(UnscaledValue(CASE WHEN (d_day_name#6 = Saturday) THEN ss_sales_price#3 ELSE null END))#72,17,2) AS sat_sales#36]
(35) Scan parquet default.store
Output [2]: [s_store_sk#37, s_store_id#38]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_store_sk), IsNotNull(s_store_id)]
ReadSchema: struct<s_store_sk:int,s_store_id:string>
(36) ColumnarToRow [codegen id : 7]
Input [2]: [s_store_sk#37, s_store_id#38]
(37) Filter [codegen id : 7]
Input [2]: [s_store_sk#37, s_store_id#38]
Condition : (isnotnull(s_store_sk#37) AND isnotnull(s_store_id#38))
(38) BroadcastExchange
Input [2]: [s_store_sk#37, s_store_id#38]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#73]
(39) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ss_store_sk#2]
Right keys [1]: [s_store_sk#37]
Join condition: None
(40) Project [codegen id : 9]
Output [8]: [d_week_seq#5, sun_sales#30, mon_sales#31, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_id#38]
Input [10]: [d_week_seq#5, ss_store_sk#2, sun_sales#30, mon_sales#31, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_sk#37, s_store_id#38]
(41) Scan parquet default.date_dim
Output [2]: [d_month_seq#74, d_week_seq#75]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_month_seq), GreaterThanOrEqual(d_month_seq,1197), LessThanOrEqual(d_month_seq,1208), IsNotNull(d_week_seq)]
ReadSchema: struct<d_month_seq:int,d_week_seq:int>
(42) ColumnarToRow [codegen id : 8]
Input [2]: [d_month_seq#74, d_week_seq#75]
(43) Filter [codegen id : 8]
Input [2]: [d_month_seq#74, d_week_seq#75]
Condition : (((isnotnull(d_month_seq#74) AND (d_month_seq#74 >= 1197)) AND (d_month_seq#74 <= 1208)) AND isnotnull(d_week_seq#75))
(44) Project [codegen id : 8]
Output [1]: [d_week_seq#75]
Input [2]: [d_month_seq#74, d_week_seq#75]
(45) BroadcastExchange
Input [1]: [d_week_seq#75]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#76]
(46) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [d_week_seq#5]
Right keys [1]: [d_week_seq#75]
Join condition: None
(47) Project [codegen id : 9]
Output [8]: [d_week_seq#5 AS d_week_seq2#77, s_store_id#38 AS s_store_id2#78, sun_sales#30 AS sun_sales2#79, mon_sales#31 AS mon_sales2#80, wed_sales#33 AS wed_sales2#81, thu_sales#34 AS thu_sales2#82, fri_sales#35 AS fri_sales2#83, sat_sales#36 AS sat_sales2#84]
Input [9]: [d_week_seq#5, sun_sales#30, mon_sales#31, wed_sales#33, thu_sales#34, fri_sales#35, sat_sales#36, s_store_id#38, d_week_seq#75]
(48) BroadcastExchange
Input [8]: [d_week_seq2#77, s_store_id2#78, sun_sales2#79, mon_sales2#80, wed_sales2#81, thu_sales2#82, fri_sales2#83, sat_sales2#84]
Arguments: HashedRelationBroadcastMode(List(input[1, string, true], (input[0, int, true] - 52)),false), [id=#85]
(49) BroadcastHashJoin [codegen id : 10]
Left keys [2]: [s_store_id1#46, d_week_seq1#45]
Right keys [2]: [s_store_id2#78, (d_week_seq2#77 - 52)]
Join condition: None
(50) Project [codegen id : 10]
Output [10]: [s_store_name1#44, s_store_id1#46, d_week_seq1#45, CheckOverflow((promote_precision(sun_sales1#47) / promote_precision(sun_sales2#79)), DecimalType(37,20), true) AS (sun_sales1 / sun_sales2)#86, CheckOverflow((promote_precision(mon_sales1#48) / promote_precision(mon_sales2#80)), DecimalType(37,20), true) AS (mon_sales1 / mon_sales2)#87, CheckOverflow((promote_precision(tue_sales1#49) / promote_precision(tue_sales1#49)), DecimalType(37,20), true) AS (tue_sales1 / tue_sales1)#88, CheckOverflow((promote_precision(wed_sales1#50) / promote_precision(wed_sales2#81)), DecimalType(37,20), true) AS (wed_sales1 / wed_sales2)#89, CheckOverflow((promote_precision(thu_sales1#51) / promote_precision(thu_sales2#82)), DecimalType(37,20), true) AS (thu_sales1 / thu_sales2)#90, CheckOverflow((promote_precision(fri_sales1#52) / promote_precision(fri_sales2#83)), DecimalType(37,20), true) AS (fri_sales1 / fri_sales2)#91, CheckOverflow((promote_precision(sat_sales1#53) / promote_precision(sat_sales2#84)), DecimalType(37,20), true) AS (sat_sales1 / sat_sales2)#92]
Input [18]: [s_store_name1#44, d_week_seq1#45, s_store_id1#46, sun_sales1#47, mon_sales1#48, tue_sales1#49, wed_sales1#50, thu_sales1#51, fri_sales1#52, sat_sales1#53, d_week_seq2#77, s_store_id2#78, sun_sales2#79, mon_sales2#80, wed_sales2#81, thu_sales2#82, fri_sales2#83, sat_sales2#84]
(51) TakeOrderedAndProject
Input [10]: [s_store_name1#44, s_store_id1#46, d_week_seq1#45, (sun_sales1 / sun_sales2)#86, (mon_sales1 / mon_sales2)#87, (tue_sales1 / tue_sales1)#88, (wed_sales1 / wed_sales2)#89, (thu_sales1 / thu_sales2)#90, (fri_sales1 / fri_sales2)#91, (sat_sales1 / sat_sales2)#92]
Arguments: 100, [s_store_name1#44 ASC NULLS FIRST, s_store_id1#46 ASC NULLS FIRST, d_week_seq1#45 ASC NULLS FIRST], [s_store_name1#44, s_store_id1#46, d_week_seq1#45, (sun_sales1 / sun_sales2)#86, (mon_sales1 / mon_sales2)#87, (tue_sales1 / tue_sales1)#88, (wed_sales1 / wed_sales2)#89, (thu_sales1 / thu_sales2)#90, (fri_sales1 / fri_sales2)#91, (sat_sales1 / sat_sales2)#92]

@ -0,0 +1,76 @@
TakeOrderedAndProject [(fri_sales1 / fri_sales2),(mon_sales1 / mon_sales2),(sat_sales1 / sat_sales2),(sun_sales1 / sun_sales2),(thu_sales1 / thu_sales2),(tue_sales1 / tue_sales1),(wed_sales1 / wed_sales2),d_week_seq1,s_store_id1,s_store_name1]
WholeStageCodegen (10)
Project [d_week_seq1,fri_sales1,fri_sales2,mon_sales1,mon_sales2,s_store_id1,s_store_name1,sat_sales1,sat_sales2,sun_sales1,sun_sales2,thu_sales1,thu_sales2,tue_sales1,wed_sales1,wed_sales2]
BroadcastHashJoin [d_week_seq1,d_week_seq2,s_store_id1,s_store_id2]
Project [d_week_seq,fri_sales,mon_sales,s_store_id,s_store_name,sat_sales,sun_sales,thu_sales,tue_sales,wed_sales]
BroadcastHashJoin [d_week_seq,d_week_seq]
Project [d_week_seq,fri_sales,mon_sales,s_store_id,s_store_name,sat_sales,sun_sales,thu_sales,tue_sales,wed_sales]
BroadcastHashJoin [s_store_sk,ss_store_sk]
HashAggregate [d_week_seq,ss_store_sk,sum,sum,sum,sum,sum,sum,sum] [fri_sales,mon_sales,sat_sales,sum,sum,sum,sum,sum,sum,sum,sum(UnscaledValue(CASE WHEN (d_day_name = Friday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Monday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Saturday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Sunday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Thursday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Tuesday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Wednesday) THEN ss_sales_price ELSE null END)),sun_sales,thu_sales,tue_sales,wed_sales]
InputAdapter
Exchange [d_week_seq,ss_store_sk] #1
WholeStageCodegen (2)
HashAggregate [d_day_name,d_week_seq,ss_sales_price,ss_store_sk] [sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum]
Project [d_day_name,d_week_seq,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Filter [d_date_sk,d_week_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_day_name,d_week_seq]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (3)
Filter [s_store_id,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_id,s_store_name,s_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (4)
Project [d_week_seq]
Filter [d_month_seq,d_week_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_month_seq,d_week_seq]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (9)
Project [d_week_seq,fri_sales,mon_sales,s_store_id,sat_sales,sun_sales,thu_sales,wed_sales]
BroadcastHashJoin [d_week_seq,d_week_seq]
Project [d_week_seq,fri_sales,mon_sales,s_store_id,sat_sales,sun_sales,thu_sales,wed_sales]
BroadcastHashJoin [s_store_sk,ss_store_sk]
HashAggregate [d_week_seq,ss_store_sk,sum,sum,sum,sum,sum,sum] [fri_sales,mon_sales,sat_sales,sum,sum,sum,sum,sum,sum,sum(UnscaledValue(CASE WHEN (d_day_name = Friday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Monday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Saturday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Sunday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Thursday) THEN ss_sales_price ELSE null END)),sum(UnscaledValue(CASE WHEN (d_day_name = Wednesday) THEN ss_sales_price ELSE null END)),sun_sales,thu_sales,wed_sales]
InputAdapter
Exchange [d_week_seq,ss_store_sk] #6
WholeStageCodegen (6)
HashAggregate [d_day_name,d_week_seq,ss_sales_price,ss_store_sk] [sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum,sum]
Project [d_day_name,d_week_seq,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
ReusedExchange [d_date_sk,d_day_name,d_week_seq] #2
InputAdapter
BroadcastExchange #7
WholeStageCodegen (7)
Filter [s_store_id,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_id,s_store_sk]
InputAdapter
BroadcastExchange #8
WholeStageCodegen (8)
Project [d_week_seq]
Filter [d_month_seq,d_week_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_month_seq,d_week_seq]

@ -0,0 +1,180 @@
== Physical Plan ==
TakeOrderedAndProject (32)
+- * Project (31)
   +- * Filter (30)
      +- Window (29)
         +- * Sort (28)
            +- Exchange (27)
               +- * HashAggregate (26)
                  +- Exchange (25)
                     +- * HashAggregate (24)
                        +- * Project (23)
                           +- * BroadcastHashJoin Inner BuildRight (22)
                              :- * Project (16)
                              :  +- * BroadcastHashJoin Inner BuildRight (15)
                              :     :- * Project (10)
                              :     :  +- * BroadcastHashJoin Inner BuildLeft (9)
                              :     :     :- BroadcastExchange (5)
                              :     :     :  +- * Project (4)
                              :     :     :     +- * Filter (3)
                              :     :     :        +- * ColumnarToRow (2)
                              :     :     :           +- Scan parquet default.item (1)
                              :     :     +- * Filter (8)
                              :     :        +- * ColumnarToRow (7)
                              :     :           +- Scan parquet default.store_sales (6)
                              :     +- BroadcastExchange (14)
                              :        +- * Filter (13)
                              :           +- * ColumnarToRow (12)
                              :              +- Scan parquet default.store (11)
                              +- BroadcastExchange (21)
                                 +- * Project (20)
                                    +- * Filter (19)
                                       +- * ColumnarToRow (18)
                                          +- Scan parquet default.date_dim (17)
(1) Scan parquet default.item
Output [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manager_id#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [Or(And(And(In(i_category, [Books,Children,Electronics]),In(i_class, [personal,portable,reference,self-help])),In(i_brand, [scholaramalgamalg #6,scholaramalgamalg #7,exportiunivamalg #8,scholaramalgamalg #8])),And(And(In(i_category, [Women,Music,Men]),In(i_class, [accessories,classical,fragrances,pants])),In(i_brand, [amalgimporto #9,edu packscholar #9,exportiimporto #9,importoamalg #9]))), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand:string,i_class:string,i_category:string,i_manager_id:int>
(2) ColumnarToRow [codegen id : 1]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manager_id#5]
(3) Filter [codegen id : 1]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manager_id#5]
Condition : ((((i_category#4 IN (Books,Children,Electronics) AND i_class#3 IN (personal,portable,reference,self-help)) AND i_brand#2 IN (scholaramalgamalg #6,scholaramalgamalg #7,exportiunivamalg #8,scholaramalgamalg #8)) OR ((i_category#4 IN (Women,Music,Men) AND i_class#3 IN (accessories,classical,fragrances,pants)) AND i_brand#2 IN (amalgimporto #9,edu packscholar #9,exportiimporto #9,importoamalg #9))) AND isnotnull(i_item_sk#1))
(4) Project [codegen id : 1]
Output [2]: [i_item_sk#1, i_manager_id#5]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manager_id#5]
(5) BroadcastExchange
Input [2]: [i_item_sk#1, i_manager_id#5]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#10]
(6) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#11, ss_item_sk#12, ss_store_sk#13, ss_sales_price#14]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2452123), LessThanOrEqual(ss_sold_date_sk,2452487), IsNotNull(ss_item_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(7) ColumnarToRow
Input [4]: [ss_sold_date_sk#11, ss_item_sk#12, ss_store_sk#13, ss_sales_price#14]
(8) Filter
Input [4]: [ss_sold_date_sk#11, ss_item_sk#12, ss_store_sk#13, ss_sales_price#14]
Condition : ((((isnotnull(ss_sold_date_sk#11) AND (ss_sold_date_sk#11 >= 2452123)) AND (ss_sold_date_sk#11 <= 2452487)) AND isnotnull(ss_item_sk#12)) AND isnotnull(ss_store_sk#13))
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [i_item_sk#1]
Right keys [1]: [ss_item_sk#12]
Join condition: None
(10) Project [codegen id : 4]
Output [4]: [i_manager_id#5, ss_sold_date_sk#11, ss_store_sk#13, ss_sales_price#14]
Input [6]: [i_item_sk#1, i_manager_id#5, ss_sold_date_sk#11, ss_item_sk#12, ss_store_sk#13, ss_sales_price#14]
(11) Scan parquet default.store
Output [1]: [s_store_sk#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int>
(12) ColumnarToRow [codegen id : 2]
Input [1]: [s_store_sk#15]
(13) Filter [codegen id : 2]
Input [1]: [s_store_sk#15]
Condition : isnotnull(s_store_sk#15)
(14) BroadcastExchange
Input [1]: [s_store_sk#15]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#16]
(15) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#13]
Right keys [1]: [s_store_sk#15]
Join condition: None
(16) Project [codegen id : 4]
Output [3]: [i_manager_id#5, ss_sold_date_sk#11, ss_sales_price#14]
Input [5]: [i_manager_id#5, ss_sold_date_sk#11, ss_store_sk#13, ss_sales_price#14, s_store_sk#15]
(17) Scan parquet default.date_dim
Output [3]: [d_date_sk#17, d_month_seq#18, d_moy#19]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [In(d_month_seq, [1222,1228,1223,1227,1219,1226,1224,1225,1230,1220,1221,1229]), LessThanOrEqual(d_date_sk,2452487), GreaterThanOrEqual(d_date_sk,2452123), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_month_seq:int,d_moy:int>
(18) ColumnarToRow [codegen id : 3]
Input [3]: [d_date_sk#17, d_month_seq#18, d_moy#19]
(19) Filter [codegen id : 3]
Input [3]: [d_date_sk#17, d_month_seq#18, d_moy#19]
Condition : (((d_month_seq#18 INSET (1222,1228,1223,1227,1219,1226,1224,1225,1230,1220,1221,1229) AND (d_date_sk#17 <= 2452487)) AND (d_date_sk#17 >= 2452123)) AND isnotnull(d_date_sk#17))
(20) Project [codegen id : 3]
Output [2]: [d_date_sk#17, d_moy#19]
Input [3]: [d_date_sk#17, d_month_seq#18, d_moy#19]
(21) BroadcastExchange
Input [2]: [d_date_sk#17, d_moy#19]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#20]
(22) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#11]
Right keys [1]: [d_date_sk#17]
Join condition: None
(23) Project [codegen id : 4]
Output [3]: [i_manager_id#5, ss_sales_price#14, d_moy#19]
Input [5]: [i_manager_id#5, ss_sold_date_sk#11, ss_sales_price#14, d_date_sk#17, d_moy#19]
(24) HashAggregate [codegen id : 4]
Input [3]: [i_manager_id#5, ss_sales_price#14, d_moy#19]
Keys [2]: [i_manager_id#5, d_moy#19]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#14))]
Aggregate Attributes [1]: [sum#21]
Results [3]: [i_manager_id#5, d_moy#19, sum#22]
(25) Exchange
Input [3]: [i_manager_id#5, d_moy#19, sum#22]
Arguments: hashpartitioning(i_manager_id#5, d_moy#19, 5), true, [id=#23]
(26) HashAggregate [codegen id : 5]
Input [3]: [i_manager_id#5, d_moy#19, sum#22]
Keys [2]: [i_manager_id#5, d_moy#19]
Functions [1]: [sum(UnscaledValue(ss_sales_price#14))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#14))#24]
Results [3]: [i_manager_id#5, MakeDecimal(sum(UnscaledValue(ss_sales_price#14))#24,17,2) AS sum_sales#25, MakeDecimal(sum(UnscaledValue(ss_sales_price#14))#24,17,2) AS _w0#26]
(27) Exchange
Input [3]: [i_manager_id#5, sum_sales#25, _w0#26]
Arguments: hashpartitioning(i_manager_id#5, 5), true, [id=#27]
(28) Sort [codegen id : 6]
Input [3]: [i_manager_id#5, sum_sales#25, _w0#26]
Arguments: [i_manager_id#5 ASC NULLS FIRST], false, 0
(29) Window
Input [3]: [i_manager_id#5, sum_sales#25, _w0#26]
Arguments: [avg(_w0#26) windowspecdefinition(i_manager_id#5, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS avg_monthly_sales#28], [i_manager_id#5]
(30) Filter [codegen id : 7]
Input [4]: [i_manager_id#5, sum_sales#25, _w0#26, avg_monthly_sales#28]
Condition : (CASE WHEN (avg_monthly_sales#28 > 0.000000) THEN CheckOverflow((promote_precision(abs(CheckOverflow((promote_precision(cast(sum_sales#25 as decimal(22,6))) - promote_precision(cast(avg_monthly_sales#28 as decimal(22,6)))), DecimalType(22,6), true))) / promote_precision(cast(avg_monthly_sales#28 as decimal(22,6)))), DecimalType(38,16), true) ELSE null END > 0.1000000000000000)
(31) Project [codegen id : 7]
Output [3]: [i_manager_id#5, sum_sales#25, avg_monthly_sales#28]
Input [4]: [i_manager_id#5, sum_sales#25, _w0#26, avg_monthly_sales#28]
(32) TakeOrderedAndProject
Input [3]: [i_manager_id#5, sum_sales#25, avg_monthly_sales#28]
Arguments: 100, [i_manager_id#5 ASC NULLS FIRST, avg_monthly_sales#28 ASC NULLS FIRST, sum_sales#25 ASC NULLS FIRST], [i_manager_id#5, sum_sales#25, avg_monthly_sales#28]

@ -0,0 +1,49 @@
TakeOrderedAndProject [avg_monthly_sales,i_manager_id,sum_sales]
WholeStageCodegen (7)
Project [avg_monthly_sales,i_manager_id,sum_sales]
Filter [avg_monthly_sales,sum_sales]
InputAdapter
Window [_w0,i_manager_id]
WholeStageCodegen (6)
Sort [i_manager_id]
InputAdapter
Exchange [i_manager_id] #1
WholeStageCodegen (5)
HashAggregate [d_moy,i_manager_id,sum] [_w0,sum,sum(UnscaledValue(ss_sales_price)),sum_sales]
InputAdapter
Exchange [d_moy,i_manager_id] #2
WholeStageCodegen (4)
HashAggregate [d_moy,i_manager_id,ss_sales_price] [sum,sum]
Project [d_moy,i_manager_id,ss_sales_price]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [i_manager_id,ss_sales_price,ss_sold_date_sk]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [i_manager_id,ss_sales_price,ss_sold_date_sk,ss_store_sk]
BroadcastHashJoin [i_item_sk,ss_item_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Project [i_item_sk,i_manager_id]
Filter [i_brand,i_category,i_class,i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_category,i_class,i_item_sk,i_manager_id]
Filter [ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (2)
Filter [s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Project [d_date_sk,d_moy]
Filter [d_date_sk,d_month_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_month_seq,d_moy]

@ -0,0 +1,180 @@
== Physical Plan ==
TakeOrderedAndProject (32)
+- * Project (31)
+- * Filter (30)
+- Window (29)
+- * Sort (28)
+- Exchange (27)
+- * HashAggregate (26)
+- Exchange (25)
+- * HashAggregate (24)
+- * Project (23)
+- * BroadcastHashJoin Inner BuildRight (22)
:- * Project (17)
: +- * BroadcastHashJoin Inner BuildRight (16)
: :- * Project (10)
: : +- * BroadcastHashJoin Inner BuildRight (9)
: : :- * Project (4)
: : : +- * Filter (3)
: : : +- * ColumnarToRow (2)
: : : +- Scan parquet default.item (1)
: : +- BroadcastExchange (8)
: : +- * Filter (7)
: : +- * ColumnarToRow (6)
: : +- Scan parquet default.store_sales (5)
: +- BroadcastExchange (15)
: +- * Project (14)
: +- * Filter (13)
: +- * ColumnarToRow (12)
: +- Scan parquet default.date_dim (11)
+- BroadcastExchange (21)
+- * Filter (20)
+- * ColumnarToRow (19)
+- Scan parquet default.store (18)
(1) Scan parquet default.item
Output [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manager_id#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [Or(And(And(In(i_category, [Books,Children,Electronics]),In(i_class, [personal,portable,reference,self-help])),In(i_brand, [scholaramalgamalg #6,scholaramalgamalg #7,exportiunivamalg #8,scholaramalgamalg #8])),And(And(In(i_category, [Women,Music,Men]),In(i_class, [accessories,classical,fragrances,pants])),In(i_brand, [amalgimporto #9,edu packscholar #9,exportiimporto #9,importoamalg #9]))), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand:string,i_class:string,i_category:string,i_manager_id:int>
(2) ColumnarToRow [codegen id : 4]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manager_id#5]
(3) Filter [codegen id : 4]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manager_id#5]
Condition : ((((i_category#4 IN (Books,Children,Electronics) AND i_class#3 IN (personal,portable,reference,self-help)) AND i_brand#2 IN (scholaramalgamalg #6,scholaramalgamalg #7,exportiunivamalg #8,scholaramalgamalg #8)) OR ((i_category#4 IN (Women,Music,Men) AND i_class#3 IN (accessories,classical,fragrances,pants)) AND i_brand#2 IN (amalgimporto #9,edu packscholar #9,exportiimporto #9,importoamalg #9))) AND isnotnull(i_item_sk#1))
(4) Project [codegen id : 4]
Output [2]: [i_item_sk#1, i_manager_id#5]
Input [5]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, i_manager_id#5]
(5) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2452123), LessThanOrEqual(ss_sold_date_sk,2452487), IsNotNull(ss_item_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(6) ColumnarToRow [codegen id : 1]
Input [4]: [ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
(7) Filter [codegen id : 1]
Input [4]: [ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
Condition : ((((isnotnull(ss_sold_date_sk#10) AND (ss_sold_date_sk#10 >= 2452123)) AND (ss_sold_date_sk#10 <= 2452487)) AND isnotnull(ss_item_sk#11)) AND isnotnull(ss_store_sk#12))
(8) BroadcastExchange
Input [4]: [ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, false] as bigint)),false), [id=#14]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [i_item_sk#1]
Right keys [1]: [ss_item_sk#11]
Join condition: None
(10) Project [codegen id : 4]
Output [4]: [i_manager_id#5, ss_sold_date_sk#10, ss_store_sk#12, ss_sales_price#13]
Input [6]: [i_item_sk#1, i_manager_id#5, ss_sold_date_sk#10, ss_item_sk#11, ss_store_sk#12, ss_sales_price#13]
(11) Scan parquet default.date_dim
Output [3]: [d_date_sk#15, d_month_seq#16, d_moy#17]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [In(d_month_seq, [1222,1228,1223,1227,1219,1226,1224,1225,1230,1220,1221,1229]), LessThanOrEqual(d_date_sk,2452487), GreaterThanOrEqual(d_date_sk,2452123), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_month_seq:int,d_moy:int>
(12) ColumnarToRow [codegen id : 2]
Input [3]: [d_date_sk#15, d_month_seq#16, d_moy#17]
(13) Filter [codegen id : 2]
Input [3]: [d_date_sk#15, d_month_seq#16, d_moy#17]
Condition : (((d_month_seq#16 INSET (1222,1228,1223,1227,1219,1226,1224,1225,1230,1220,1221,1229) AND (d_date_sk#15 <= 2452487)) AND (d_date_sk#15 >= 2452123)) AND isnotnull(d_date_sk#15))
(14) Project [codegen id : 2]
Output [2]: [d_date_sk#15, d_moy#17]
Input [3]: [d_date_sk#15, d_month_seq#16, d_moy#17]
(15) BroadcastExchange
Input [2]: [d_date_sk#15, d_moy#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#18]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#10]
Right keys [1]: [d_date_sk#15]
Join condition: None
(17) Project [codegen id : 4]
Output [4]: [i_manager_id#5, ss_store_sk#12, ss_sales_price#13, d_moy#17]
Input [6]: [i_manager_id#5, ss_sold_date_sk#10, ss_store_sk#12, ss_sales_price#13, d_date_sk#15, d_moy#17]
(18) Scan parquet default.store
Output [1]: [s_store_sk#19]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int>
(19) ColumnarToRow [codegen id : 3]
Input [1]: [s_store_sk#19]
(20) Filter [codegen id : 3]
Input [1]: [s_store_sk#19]
Condition : isnotnull(s_store_sk#19)
(21) BroadcastExchange
Input [1]: [s_store_sk#19]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#20]
(22) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#12]
Right keys [1]: [s_store_sk#19]
Join condition: None
(23) Project [codegen id : 4]
Output [3]: [i_manager_id#5, ss_sales_price#13, d_moy#17]
Input [5]: [i_manager_id#5, ss_store_sk#12, ss_sales_price#13, d_moy#17, s_store_sk#19]
(24) HashAggregate [codegen id : 4]
Input [3]: [i_manager_id#5, ss_sales_price#13, d_moy#17]
Keys [2]: [i_manager_id#5, d_moy#17]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#13))]
Aggregate Attributes [1]: [sum#21]
Results [3]: [i_manager_id#5, d_moy#17, sum#22]
(25) Exchange
Input [3]: [i_manager_id#5, d_moy#17, sum#22]
Arguments: hashpartitioning(i_manager_id#5, d_moy#17, 5), true, [id=#23]
(26) HashAggregate [codegen id : 5]
Input [3]: [i_manager_id#5, d_moy#17, sum#22]
Keys [2]: [i_manager_id#5, d_moy#17]
Functions [1]: [sum(UnscaledValue(ss_sales_price#13))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#13))#24]
Results [3]: [i_manager_id#5, MakeDecimal(sum(UnscaledValue(ss_sales_price#13))#24,17,2) AS sum_sales#25, MakeDecimal(sum(UnscaledValue(ss_sales_price#13))#24,17,2) AS _w0#26]
(27) Exchange
Input [3]: [i_manager_id#5, sum_sales#25, _w0#26]
Arguments: hashpartitioning(i_manager_id#5, 5), true, [id=#27]
(28) Sort [codegen id : 6]
Input [3]: [i_manager_id#5, sum_sales#25, _w0#26]
Arguments: [i_manager_id#5 ASC NULLS FIRST], false, 0
(29) Window
Input [3]: [i_manager_id#5, sum_sales#25, _w0#26]
Arguments: [avg(_w0#26) windowspecdefinition(i_manager_id#5, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS avg_monthly_sales#28], [i_manager_id#5]
(30) Filter [codegen id : 7]
Input [4]: [i_manager_id#5, sum_sales#25, _w0#26, avg_monthly_sales#28]
Condition : (CASE WHEN (avg_monthly_sales#28 > 0.000000) THEN CheckOverflow((promote_precision(abs(CheckOverflow((promote_precision(cast(sum_sales#25 as decimal(22,6))) - promote_precision(cast(avg_monthly_sales#28 as decimal(22,6)))), DecimalType(22,6), true))) / promote_precision(cast(avg_monthly_sales#28 as decimal(22,6)))), DecimalType(38,16), true) ELSE null END > 0.1000000000000000)
(31) Project [codegen id : 7]
Output [3]: [i_manager_id#5, sum_sales#25, avg_monthly_sales#28]
Input [4]: [i_manager_id#5, sum_sales#25, _w0#26, avg_monthly_sales#28]
(32) TakeOrderedAndProject
Input [3]: [i_manager_id#5, sum_sales#25, avg_monthly_sales#28]
Arguments: 100, [i_manager_id#5 ASC NULLS FIRST, avg_monthly_sales#28 ASC NULLS FIRST, sum_sales#25 ASC NULLS FIRST], [i_manager_id#5, sum_sales#25, avg_monthly_sales#28]

@ -0,0 +1,49 @@
TakeOrderedAndProject [avg_monthly_sales,i_manager_id,sum_sales]
WholeStageCodegen (7)
Project [avg_monthly_sales,i_manager_id,sum_sales]
Filter [avg_monthly_sales,sum_sales]
InputAdapter
Window [_w0,i_manager_id]
WholeStageCodegen (6)
Sort [i_manager_id]
InputAdapter
Exchange [i_manager_id] #1
WholeStageCodegen (5)
HashAggregate [d_moy,i_manager_id,sum] [_w0,sum,sum(UnscaledValue(ss_sales_price)),sum_sales]
InputAdapter
Exchange [d_moy,i_manager_id] #2
WholeStageCodegen (4)
HashAggregate [d_moy,i_manager_id,ss_sales_price] [sum,sum]
Project [d_moy,i_manager_id,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [d_moy,i_manager_id,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [i_manager_id,ss_sales_price,ss_sold_date_sk,ss_store_sk]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [i_item_sk,i_manager_id]
Filter [i_brand,i_category,i_class,i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_category,i_class,i_item_sk,i_manager_id]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Filter [ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (2)
Project [d_date_sk,d_moy]
Filter [d_date_sk,d_month_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_month_seq,d_moy]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Filter [s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_sk]

@ -0,0 +1,245 @@
== Physical Plan ==
TakeOrderedAndProject (42)
+- * Project (41)
+- * BroadcastHashJoin Inner BuildLeft (40)
:- BroadcastExchange (36)
: +- * Project (35)
: +- * BroadcastHashJoin Inner BuildLeft (34)
: :- BroadcastExchange (30)
: : +- * Project (29)
: : +- * BroadcastHashJoin Inner BuildRight (28)
: : :- * Filter (14)
: : : +- * HashAggregate (13)
: : : +- Exchange (12)
: : : +- * HashAggregate (11)
: : : +- * Project (10)
: : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_sales (1)
: : : +- BroadcastExchange (8)
: : : +- * Project (7)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.date_dim (4)
: : +- BroadcastExchange (27)
: : +- * HashAggregate (26)
: : +- Exchange (25)
: : +- * HashAggregate (24)
: : +- * HashAggregate (23)
: : +- Exchange (22)
: : +- * HashAggregate (21)
: : +- * Project (20)
: : +- * BroadcastHashJoin Inner BuildRight (19)
: : :- * Filter (17)
: : : +- * ColumnarToRow (16)
: : : +- Scan parquet default.store_sales (15)
: : +- ReusedExchange (18)
: +- * Filter (33)
: +- * ColumnarToRow (32)
: +- Scan parquet default.store (31)
+- * Filter (39)
+- * ColumnarToRow (38)
+- Scan parquet default.item (37)
(1) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#3, ss_sales_price#4]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451911), LessThanOrEqual(ss_sold_date_sk,2452275), IsNotNull(ss_store_sk), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(2) ColumnarToRow [codegen id : 2]
Input [4]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#3, ss_sales_price#4]
(3) Filter [codegen id : 2]
Input [4]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#3, ss_sales_price#4]
Condition : ((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2451911)) AND (ss_sold_date_sk#1 <= 2452275)) AND isnotnull(ss_store_sk#3)) AND isnotnull(ss_item_sk#2))
(4) Scan parquet default.date_dim
Output [2]: [d_date_sk#5, d_month_seq#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_month_seq), GreaterThanOrEqual(d_month_seq,1212), LessThanOrEqual(d_month_seq,1223), GreaterThanOrEqual(d_date_sk,2451911), LessThanOrEqual(d_date_sk,2452275), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_month_seq:int>
(5) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#5, d_month_seq#6]
(6) Filter [codegen id : 1]
Input [2]: [d_date_sk#5, d_month_seq#6]
Condition : (((((isnotnull(d_month_seq#6) AND (d_month_seq#6 >= 1212)) AND (d_month_seq#6 <= 1223)) AND (d_date_sk#5 >= 2451911)) AND (d_date_sk#5 <= 2452275)) AND isnotnull(d_date_sk#5))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#5]
Input [2]: [d_date_sk#5, d_month_seq#6]
(8) BroadcastExchange
Input [1]: [d_date_sk#5]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#7]
(9) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#5]
Join condition: None
(10) Project [codegen id : 2]
Output [3]: [ss_item_sk#2, ss_store_sk#3, ss_sales_price#4]
Input [5]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#3, ss_sales_price#4, d_date_sk#5]
(11) HashAggregate [codegen id : 2]
Input [3]: [ss_item_sk#2, ss_store_sk#3, ss_sales_price#4]
Keys [2]: [ss_store_sk#3, ss_item_sk#2]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#4))]
Aggregate Attributes [1]: [sum#8]
Results [3]: [ss_store_sk#3, ss_item_sk#2, sum#9]
(12) Exchange
Input [3]: [ss_store_sk#3, ss_item_sk#2, sum#9]
Arguments: hashpartitioning(ss_store_sk#3, ss_item_sk#2, 5), true, [id=#10]
(13) HashAggregate [codegen id : 7]
Input [3]: [ss_store_sk#3, ss_item_sk#2, sum#9]
Keys [2]: [ss_store_sk#3, ss_item_sk#2]
Functions [1]: [sum(UnscaledValue(ss_sales_price#4))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#4))#11]
Results [3]: [ss_store_sk#3, ss_item_sk#2, MakeDecimal(sum(UnscaledValue(ss_sales_price#4))#11,17,2) AS revenue#12]
(14) Filter [codegen id : 7]
Input [3]: [ss_store_sk#3, ss_item_sk#2, revenue#12]
Condition : isnotnull(revenue#12)
(15) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#13, ss_item_sk#14, ss_store_sk#15, ss_sales_price#16]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451911), LessThanOrEqual(ss_sold_date_sk,2452275), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(16) ColumnarToRow [codegen id : 4]
Input [4]: [ss_sold_date_sk#13, ss_item_sk#14, ss_store_sk#15, ss_sales_price#16]
(17) Filter [codegen id : 4]
Input [4]: [ss_sold_date_sk#13, ss_item_sk#14, ss_store_sk#15, ss_sales_price#16]
Condition : (((isnotnull(ss_sold_date_sk#13) AND (ss_sold_date_sk#13 >= 2451911)) AND (ss_sold_date_sk#13 <= 2452275)) AND isnotnull(ss_store_sk#15))
(18) ReusedExchange [Reuses operator id: 8]
Output [1]: [d_date_sk#5]
(19) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#13]
Right keys [1]: [d_date_sk#5]
Join condition: None
(20) Project [codegen id : 4]
Output [3]: [ss_item_sk#14, ss_store_sk#15, ss_sales_price#16]
Input [5]: [ss_sold_date_sk#13, ss_item_sk#14, ss_store_sk#15, ss_sales_price#16, d_date_sk#5]
(21) HashAggregate [codegen id : 4]
Input [3]: [ss_item_sk#14, ss_store_sk#15, ss_sales_price#16]
Keys [2]: [ss_store_sk#15, ss_item_sk#14]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#16))]
Aggregate Attributes [1]: [sum#17]
Results [3]: [ss_store_sk#15, ss_item_sk#14, sum#18]
(22) Exchange
Input [3]: [ss_store_sk#15, ss_item_sk#14, sum#18]
Arguments: hashpartitioning(ss_store_sk#15, ss_item_sk#14, 5), true, [id=#19]
(23) HashAggregate [codegen id : 5]
Input [3]: [ss_store_sk#15, ss_item_sk#14, sum#18]
Keys [2]: [ss_store_sk#15, ss_item_sk#14]
Functions [1]: [sum(UnscaledValue(ss_sales_price#16))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#16))#20]
Results [2]: [ss_store_sk#15, MakeDecimal(sum(UnscaledValue(ss_sales_price#16))#20,17,2) AS revenue#21]
(24) HashAggregate [codegen id : 5]
Input [2]: [ss_store_sk#15, revenue#21]
Keys [1]: [ss_store_sk#15]
Functions [1]: [partial_avg(revenue#21)]
Aggregate Attributes [2]: [sum#22, count#23]
Results [3]: [ss_store_sk#15, sum#24, count#25]
(25) Exchange
Input [3]: [ss_store_sk#15, sum#24, count#25]
Arguments: hashpartitioning(ss_store_sk#15, 5), true, [id=#26]
(26) HashAggregate [codegen id : 6]
Input [3]: [ss_store_sk#15, sum#24, count#25]
Keys [1]: [ss_store_sk#15]
Functions [1]: [avg(revenue#21)]
Aggregate Attributes [1]: [avg(revenue#21)#27]
Results [2]: [ss_store_sk#15, avg(revenue#21)#27 AS ave#28]
(27) BroadcastExchange
Input [2]: [ss_store_sk#15, ave#28]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#29]
(28) BroadcastHashJoin [codegen id : 7]
Left keys [1]: [ss_store_sk#3]
Right keys [1]: [ss_store_sk#15]
Join condition: (cast(revenue#12 as decimal(23,7)) <= CheckOverflow((0.100000 * promote_precision(ave#28)), DecimalType(23,7), true))
(29) Project [codegen id : 7]
Output [3]: [ss_store_sk#3, ss_item_sk#2, revenue#12]
Input [5]: [ss_store_sk#3, ss_item_sk#2, revenue#12, ss_store_sk#15, ave#28]
(30) BroadcastExchange
Input [3]: [ss_store_sk#3, ss_item_sk#2, revenue#12]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#30]
(31) Scan parquet default.store
Output [2]: [s_store_sk#31, s_store_name#32]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_store_name:string>
(32) ColumnarToRow
Input [2]: [s_store_sk#31, s_store_name#32]
(33) Filter
Input [2]: [s_store_sk#31, s_store_name#32]
Condition : isnotnull(s_store_sk#31)
(34) BroadcastHashJoin [codegen id : 8]
Left keys [1]: [ss_store_sk#3]
Right keys [1]: [s_store_sk#31]
Join condition: None
(35) Project [codegen id : 8]
Output [3]: [ss_item_sk#2, revenue#12, s_store_name#32]
Input [5]: [ss_store_sk#3, ss_item_sk#2, revenue#12, s_store_sk#31, s_store_name#32]
(36) BroadcastExchange
Input [3]: [ss_item_sk#2, revenue#12, s_store_name#32]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#33]
(37) Scan parquet default.item
Output [5]: [i_item_sk#34, i_item_desc#35, i_current_price#36, i_wholesale_cost#37, i_brand#38]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_desc:string,i_current_price:decimal(7,2),i_wholesale_cost:decimal(7,2),i_brand:string>
(38) ColumnarToRow
Input [5]: [i_item_sk#34, i_item_desc#35, i_current_price#36, i_wholesale_cost#37, i_brand#38]
(39) Filter
Input [5]: [i_item_sk#34, i_item_desc#35, i_current_price#36, i_wholesale_cost#37, i_brand#38]
Condition : isnotnull(i_item_sk#34)
(40) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ss_item_sk#2]
Right keys [1]: [i_item_sk#34]
Join condition: None
(41) Project [codegen id : 9]
Output [6]: [s_store_name#32, i_item_desc#35, revenue#12, i_current_price#36, i_wholesale_cost#37, i_brand#38]
Input [8]: [ss_item_sk#2, revenue#12, s_store_name#32, i_item_sk#34, i_item_desc#35, i_current_price#36, i_wholesale_cost#37, i_brand#38]
(42) TakeOrderedAndProject
Input [6]: [s_store_name#32, i_item_desc#35, revenue#12, i_current_price#36, i_wholesale_cost#37, i_brand#38]
Arguments: 100, [s_store_name#32 ASC NULLS FIRST, i_item_desc#35 ASC NULLS FIRST], [s_store_name#32, i_item_desc#35, revenue#12, i_current_price#36, i_wholesale_cost#37, i_brand#38]

@ -0,0 +1,63 @@
TakeOrderedAndProject [i_brand,i_current_price,i_item_desc,i_wholesale_cost,revenue,s_store_name]
WholeStageCodegen (9)
Project [i_brand,i_current_price,i_item_desc,i_wholesale_cost,revenue,s_store_name]
BroadcastHashJoin [i_item_sk,ss_item_sk]
InputAdapter
BroadcastExchange #1
WholeStageCodegen (8)
Project [revenue,s_store_name,ss_item_sk]
BroadcastHashJoin [s_store_sk,ss_store_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (7)
Project [revenue,ss_item_sk,ss_store_sk]
BroadcastHashJoin [ave,revenue,ss_store_sk,ss_store_sk]
Filter [revenue]
HashAggregate [ss_item_sk,ss_store_sk,sum] [revenue,sum,sum(UnscaledValue(ss_sales_price))]
InputAdapter
Exchange [ss_item_sk,ss_store_sk] #3
WholeStageCodegen (2)
HashAggregate [ss_item_sk,ss_sales_price,ss_store_sk] [sum,sum]
Project [ss_item_sk,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_month_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_month_seq]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (6)
HashAggregate [count,ss_store_sk,sum] [ave,avg(revenue),count,sum]
InputAdapter
Exchange [ss_store_sk] #6
WholeStageCodegen (5)
HashAggregate [revenue,ss_store_sk] [count,count,sum,sum]
HashAggregate [ss_item_sk,ss_store_sk,sum] [revenue,sum,sum(UnscaledValue(ss_sales_price))]
InputAdapter
Exchange [ss_item_sk,ss_store_sk] #7
WholeStageCodegen (4)
HashAggregate [ss_item_sk,ss_sales_price,ss_store_sk] [sum,sum]
Project [ss_item_sk,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
ReusedExchange [d_date_sk] #4
Filter [s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_name,s_store_sk]
Filter [i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_current_price,i_item_desc,i_item_sk,i_wholesale_cost]

@ -0,0 +1,245 @@
== Physical Plan ==
TakeOrderedAndProject (42)
+- * Project (41)
+- * BroadcastHashJoin Inner BuildRight (40)
:- * Project (26)
: +- * BroadcastHashJoin Inner BuildRight (25)
: :- * Project (20)
: : +- * BroadcastHashJoin Inner BuildRight (19)
: : :- * Filter (3)
: : : +- * ColumnarToRow (2)
: : : +- Scan parquet default.store (1)
: : +- BroadcastExchange (18)
: : +- * Filter (17)
: : +- * HashAggregate (16)
: : +- Exchange (15)
: : +- * HashAggregate (14)
: : +- * Project (13)
: : +- * BroadcastHashJoin Inner BuildRight (12)
: : :- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.store_sales (4)
: : +- BroadcastExchange (11)
: : +- * Project (10)
: : +- * Filter (9)
: : +- * ColumnarToRow (8)
: : +- Scan parquet default.date_dim (7)
: +- BroadcastExchange (24)
: +- * Filter (23)
: +- * ColumnarToRow (22)
: +- Scan parquet default.item (21)
+- BroadcastExchange (39)
+- * HashAggregate (38)
+- Exchange (37)
+- * HashAggregate (36)
+- * HashAggregate (35)
+- Exchange (34)
+- * HashAggregate (33)
+- * Project (32)
+- * BroadcastHashJoin Inner BuildRight (31)
:- * Filter (29)
: +- * ColumnarToRow (28)
: +- Scan parquet default.store_sales (27)
+- ReusedExchange (30)
(1) Scan parquet default.store
Output [2]: [s_store_sk#1, s_store_name#2]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_store_name:string>
(2) ColumnarToRow [codegen id : 9]
Input [2]: [s_store_sk#1, s_store_name#2]
(3) Filter [codegen id : 9]
Input [2]: [s_store_sk#1, s_store_name#2]
Condition : isnotnull(s_store_sk#1)
(4) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#3, ss_item_sk#4, ss_store_sk#5, ss_sales_price#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451911), LessThanOrEqual(ss_sold_date_sk,2452275), IsNotNull(ss_store_sk), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(5) ColumnarToRow [codegen id : 2]
Input [4]: [ss_sold_date_sk#3, ss_item_sk#4, ss_store_sk#5, ss_sales_price#6]
(6) Filter [codegen id : 2]
Input [4]: [ss_sold_date_sk#3, ss_item_sk#4, ss_store_sk#5, ss_sales_price#6]
Condition : ((((isnotnull(ss_sold_date_sk#3) AND (ss_sold_date_sk#3 >= 2451911)) AND (ss_sold_date_sk#3 <= 2452275)) AND isnotnull(ss_store_sk#5)) AND isnotnull(ss_item_sk#4))
(7) Scan parquet default.date_dim
Output [2]: [d_date_sk#7, d_month_seq#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_month_seq), GreaterThanOrEqual(d_month_seq,1212), LessThanOrEqual(d_month_seq,1223), GreaterThanOrEqual(d_date_sk,2451911), LessThanOrEqual(d_date_sk,2452275), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_month_seq:int>
(8) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#7, d_month_seq#8]
(9) Filter [codegen id : 1]
Input [2]: [d_date_sk#7, d_month_seq#8]
Condition : (((((isnotnull(d_month_seq#8) AND (d_month_seq#8 >= 1212)) AND (d_month_seq#8 <= 1223)) AND (d_date_sk#7 >= 2451911)) AND (d_date_sk#7 <= 2452275)) AND isnotnull(d_date_sk#7))
(10) Project [codegen id : 1]
Output [1]: [d_date_sk#7]
Input [2]: [d_date_sk#7, d_month_seq#8]
(11) BroadcastExchange
Input [1]: [d_date_sk#7]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#9]
(12) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [ss_sold_date_sk#3]
Right keys [1]: [d_date_sk#7]
Join condition: None
(13) Project [codegen id : 2]
Output [3]: [ss_item_sk#4, ss_store_sk#5, ss_sales_price#6]
Input [5]: [ss_sold_date_sk#3, ss_item_sk#4, ss_store_sk#5, ss_sales_price#6, d_date_sk#7]
(14) HashAggregate [codegen id : 2]
Input [3]: [ss_item_sk#4, ss_store_sk#5, ss_sales_price#6]
Keys [2]: [ss_store_sk#5, ss_item_sk#4]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#6))]
Aggregate Attributes [1]: [sum#10]
Results [3]: [ss_store_sk#5, ss_item_sk#4, sum#11]
(15) Exchange
Input [3]: [ss_store_sk#5, ss_item_sk#4, sum#11]
Arguments: hashpartitioning(ss_store_sk#5, ss_item_sk#4, 5), true, [id=#12]
(16) HashAggregate [codegen id : 3]
Input [3]: [ss_store_sk#5, ss_item_sk#4, sum#11]
Keys [2]: [ss_store_sk#5, ss_item_sk#4]
Functions [1]: [sum(UnscaledValue(ss_sales_price#6))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#6))#13]
Results [3]: [ss_store_sk#5, ss_item_sk#4, MakeDecimal(sum(UnscaledValue(ss_sales_price#6))#13,17,2) AS revenue#14]
(17) Filter [codegen id : 3]
Input [3]: [ss_store_sk#5, ss_item_sk#4, revenue#14]
Condition : isnotnull(revenue#14)
(18) BroadcastExchange
Input [3]: [ss_store_sk#5, ss_item_sk#4, revenue#14]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#15]
(19) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [s_store_sk#1]
Right keys [1]: [ss_store_sk#5]
Join condition: None
(20) Project [codegen id : 9]
Output [4]: [s_store_name#2, ss_store_sk#5, ss_item_sk#4, revenue#14]
Input [5]: [s_store_sk#1, s_store_name#2, ss_store_sk#5, ss_item_sk#4, revenue#14]
(21) Scan parquet default.item
Output [5]: [i_item_sk#16, i_item_desc#17, i_current_price#18, i_wholesale_cost#19, i_brand#20]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_desc:string,i_current_price:decimal(7,2),i_wholesale_cost:decimal(7,2),i_brand:string>
(22) ColumnarToRow [codegen id : 4]
Input [5]: [i_item_sk#16, i_item_desc#17, i_current_price#18, i_wholesale_cost#19, i_brand#20]
(23) Filter [codegen id : 4]
Input [5]: [i_item_sk#16, i_item_desc#17, i_current_price#18, i_wholesale_cost#19, i_brand#20]
Condition : isnotnull(i_item_sk#16)
(24) BroadcastExchange
Input [5]: [i_item_sk#16, i_item_desc#17, i_current_price#18, i_wholesale_cost#19, i_brand#20]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#21]
(25) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ss_item_sk#4]
Right keys [1]: [i_item_sk#16]
Join condition: None
(26) Project [codegen id : 9]
Output [7]: [s_store_name#2, ss_store_sk#5, revenue#14, i_item_desc#17, i_current_price#18, i_wholesale_cost#19, i_brand#20]
Input [9]: [s_store_name#2, ss_store_sk#5, ss_item_sk#4, revenue#14, i_item_sk#16, i_item_desc#17, i_current_price#18, i_wholesale_cost#19, i_brand#20]
(27) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#22, ss_item_sk#23, ss_store_sk#24, ss_sales_price#25]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451911), LessThanOrEqual(ss_sold_date_sk,2452275), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(28) ColumnarToRow [codegen id : 6]
Input [4]: [ss_sold_date_sk#22, ss_item_sk#23, ss_store_sk#24, ss_sales_price#25]
(29) Filter [codegen id : 6]
Input [4]: [ss_sold_date_sk#22, ss_item_sk#23, ss_store_sk#24, ss_sales_price#25]
Condition : (((isnotnull(ss_sold_date_sk#22) AND (ss_sold_date_sk#22 >= 2451911)) AND (ss_sold_date_sk#22 <= 2452275)) AND isnotnull(ss_store_sk#24))
(30) ReusedExchange [Reuses operator id: 11]
Output [1]: [d_date_sk#7]
(31) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_sold_date_sk#22]
Right keys [1]: [d_date_sk#7]
Join condition: None
(32) Project [codegen id : 6]
Output [3]: [ss_item_sk#23, ss_store_sk#24, ss_sales_price#25]
Input [5]: [ss_sold_date_sk#22, ss_item_sk#23, ss_store_sk#24, ss_sales_price#25, d_date_sk#7]
(33) HashAggregate [codegen id : 6]
Input [3]: [ss_item_sk#23, ss_store_sk#24, ss_sales_price#25]
Keys [2]: [ss_store_sk#24, ss_item_sk#23]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#25))]
Aggregate Attributes [1]: [sum#26]
Results [3]: [ss_store_sk#24, ss_item_sk#23, sum#27]
(34) Exchange
Input [3]: [ss_store_sk#24, ss_item_sk#23, sum#27]
Arguments: hashpartitioning(ss_store_sk#24, ss_item_sk#23, 5), true, [id=#28]
(35) HashAggregate [codegen id : 7]
Input [3]: [ss_store_sk#24, ss_item_sk#23, sum#27]
Keys [2]: [ss_store_sk#24, ss_item_sk#23]
Functions [1]: [sum(UnscaledValue(ss_sales_price#25))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#25))#29]
Results [2]: [ss_store_sk#24, MakeDecimal(sum(UnscaledValue(ss_sales_price#25))#29,17,2) AS revenue#30]
(36) HashAggregate [codegen id : 7]
Input [2]: [ss_store_sk#24, revenue#30]
Keys [1]: [ss_store_sk#24]
Functions [1]: [partial_avg(revenue#30)]
Aggregate Attributes [2]: [sum#31, count#32]
Results [3]: [ss_store_sk#24, sum#33, count#34]
(37) Exchange
Input [3]: [ss_store_sk#24, sum#33, count#34]
Arguments: hashpartitioning(ss_store_sk#24, 5), true, [id=#35]
(38) HashAggregate [codegen id : 8]
Input [3]: [ss_store_sk#24, sum#33, count#34]
Keys [1]: [ss_store_sk#24]
Functions [1]: [avg(revenue#30)]
Aggregate Attributes [1]: [avg(revenue#30)#36]
Results [2]: [ss_store_sk#24, avg(revenue#30)#36 AS ave#37]
(39) BroadcastExchange
Input [2]: [ss_store_sk#24, ave#37]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#38]
(40) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ss_store_sk#5]
Right keys [1]: [ss_store_sk#24]
Join condition: (cast(revenue#14 as decimal(23,7)) <= CheckOverflow((0.100000 * promote_precision(ave#37)), DecimalType(23,7), true))
(41) Project [codegen id : 9]
Output [6]: [s_store_name#2, i_item_desc#17, revenue#14, i_current_price#18, i_wholesale_cost#19, i_brand#20]
Input [9]: [s_store_name#2, ss_store_sk#5, revenue#14, i_item_desc#17, i_current_price#18, i_wholesale_cost#19, i_brand#20, ss_store_sk#24, ave#37]
(42) TakeOrderedAndProject
Input [6]: [s_store_name#2, i_item_desc#17, revenue#14, i_current_price#18, i_wholesale_cost#19, i_brand#20]
Arguments: 100, [s_store_name#2 ASC NULLS FIRST, i_item_desc#17 ASC NULLS FIRST], [s_store_name#2, i_item_desc#17, revenue#14, i_current_price#18, i_wholesale_cost#19, i_brand#20]

@ -0,0 +1,63 @@
TakeOrderedAndProject [i_brand,i_current_price,i_item_desc,i_wholesale_cost,revenue,s_store_name]
WholeStageCodegen (9)
Project [i_brand,i_current_price,i_item_desc,i_wholesale_cost,revenue,s_store_name]
BroadcastHashJoin [ave,revenue,ss_store_sk,ss_store_sk]
Project [i_brand,i_current_price,i_item_desc,i_wholesale_cost,revenue,s_store_name,ss_store_sk]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [revenue,s_store_name,ss_item_sk,ss_store_sk]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Filter [s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_store_name,s_store_sk]
InputAdapter
BroadcastExchange #1
WholeStageCodegen (3)
Filter [revenue]
HashAggregate [ss_item_sk,ss_store_sk,sum] [revenue,sum,sum(UnscaledValue(ss_sales_price))]
InputAdapter
Exchange [ss_item_sk,ss_store_sk] #2
WholeStageCodegen (2)
HashAggregate [ss_item_sk,ss_sales_price,ss_store_sk] [sum,sum]
Project [ss_item_sk,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_month_seq]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_month_seq]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (4)
Filter [i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_current_price,i_item_desc,i_item_sk,i_wholesale_cost]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (8)
HashAggregate [count,ss_store_sk,sum] [ave,avg(revenue),count,sum]
InputAdapter
Exchange [ss_store_sk] #6
WholeStageCodegen (7)
HashAggregate [revenue,ss_store_sk] [count,count,sum,sum]
HashAggregate [ss_item_sk,ss_store_sk,sum] [revenue,sum,sum(UnscaledValue(ss_sales_price))]
InputAdapter
Exchange [ss_item_sk,ss_store_sk] #7
WholeStageCodegen (6)
HashAggregate [ss_item_sk,ss_sales_price,ss_store_sk] [sum,sum]
Project [ss_item_sk,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
ReusedExchange [d_date_sk] #3

@ -0,0 +1,289 @@
== Physical Plan ==
TakeOrderedAndProject (52)
+- * Project (51)
+- * SortMergeJoin Inner (50)
:- * Sort (44)
: +- Exchange (43)
: +- * Project (42)
: +- * SortMergeJoin Inner (41)
: :- * Sort (35)
: : +- Exchange (34)
: : +- * HashAggregate (33)
: : +- Exchange (32)
: : +- * HashAggregate (31)
: : +- * Project (30)
: : +- * BroadcastHashJoin Inner BuildLeft (29)
: : :- BroadcastExchange (25)
: : : +- * Project (24)
: : : +- * BroadcastHashJoin Inner BuildRight (23)
: : : :- * Project (17)
: : : : +- * BroadcastHashJoin Inner BuildRight (16)
: : : : :- * Project (10)
: : : : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : : : :- * Filter (3)
: : : : : : +- * ColumnarToRow (2)
: : : : : : +- Scan parquet default.store_sales (1)
: : : : : +- BroadcastExchange (8)
: : : : : +- * Project (7)
: : : : : +- * Filter (6)
: : : : : +- * ColumnarToRow (5)
: : : : : +- Scan parquet default.date_dim (4)
: : : : +- BroadcastExchange (15)
: : : : +- * Project (14)
: : : : +- * Filter (13)
: : : : +- * ColumnarToRow (12)
: : : : +- Scan parquet default.store (11)
: : : +- BroadcastExchange (22)
: : : +- * Project (21)
: : : +- * Filter (20)
: : : +- * ColumnarToRow (19)
: : : +- Scan parquet default.household_demographics (18)
: : +- * Filter (28)
: : +- * ColumnarToRow (27)
: : +- Scan parquet default.customer_address (26)
: +- * Sort (40)
: +- Exchange (39)
: +- * Filter (38)
: +- * ColumnarToRow (37)
: +- Scan parquet default.customer (36)
+- * Sort (49)
+- Exchange (48)
+- * Filter (47)
+- * ColumnarToRow (46)
+- Scan parquet default.customer_address (45)
(1) Scan parquet default.store_sales
Output [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [In(ss_sold_date_sk, [2451790,2451180,2452216,2451454,2452184,2451485,2451850,2451514,2452062,2451270,2452123,2451758,2451971,2451546,2451942,2451393,2451667,2451453,2452215,2451819,2451331,2451577,2451911,2452245,2451301,2451545,2451605,2451943,2451851,2451181,2452154,2451820,2452001,2451362,2451392,2451240,2452032,2451637,2451484,2452124,2451300,2451727,2452093,2451759,2451698,2451332,2451606,2451666,2451912,2452185,2451211,2451361,2452031,2451212,2451880,2451789,2451423,2451576,2451728,2452246,2452155,2452092,2451881,2451970,2451697,2452063,2451271,2451636,2451515,2451424,2451239,2452002]), IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_addr_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_addr_sk:int,ss_store_sk:int,ss_ticket_number:int,ss_ext_sales_price:decimal(7,2),ss_ext_list_price:decimal(7,2),ss_ext_tax:decimal(7,2)>
(2) ColumnarToRow [codegen id : 4]
Input [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
(3) Filter [codegen id : 4]
Input [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Condition : (((((ss_sold_date_sk#1 INSET (2451790,2451180,2452216,2451454,2452184,2451485,2451850,2451514,2452062,2451270,2452123,2451758,2451971,2451546,2451942,2451393,2451667,2451453,2452215,2451819,2451331,2451577,2451911,2452245,2451301,2451545,2451605,2451943,2451851,2451181,2452154,2451820,2452001,2451362,2451392,2451240,2452032,2451637,2451484,2452124,2451300,2451727,2452093,2451759,2451698,2451332,2451606,2451666,2451912,2452185,2451211,2451361,2452031,2451212,2451880,2451789,2451423,2451576,2451728,2452246,2452155,2452092,2451881,2451970,2451697,2452063,2451271,2451636,2451515,2451424,2451239,2452002) AND isnotnull(ss_sold_date_sk#1)) AND isnotnull(ss_store_sk#5)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_addr_sk#4)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#10, d_year#11, d_dom#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_dom), GreaterThanOrEqual(d_dom,1), LessThanOrEqual(d_dom,2), In(d_year, [1999,2000,2001]), In(d_date_sk, [2451790,2451180,2452216,2451454,2452184,2451485,2451850,2451514,2452062,2451270,2452123,2451758,2451971,2451546,2451942,2451393,2451667,2451453,2452215,2451819,2451331,2451577,2451911,2452245,2451301,2451545,2451605,2451943,2451851,2451181,2452154,2451820,2452001,2451362,2451392,2451240,2452032,2451637,2451484,2452124,2451300,2451727,2452093,2451759,2451698,2451332,2451606,2451666,2451912,2452185,2451211,2451361,2452031,2451212,2451880,2451789,2451423,2451576,2451728,2452246,2452155,2452092,2451881,2451970,2451697,2452063,2451271,2451636,2451515,2451424,2451239,2452002]), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dom:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#10, d_year#11, d_dom#12]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#10, d_year#11, d_dom#12]
Condition : (((((isnotnull(d_dom#12) AND (d_dom#12 >= 1)) AND (d_dom#12 <= 2)) AND d_year#11 IN (1999,2000,2001)) AND d_date_sk#10 INSET (2451790,2451180,2452216,2451454,2452184,2451485,2451850,2451514,2452062,2451270,2452123,2451758,2451971,2451546,2451942,2451393,2451667,2451453,2452215,2451819,2451331,2451577,2451911,2452245,2451301,2451545,2451605,2451943,2451851,2451181,2452154,2451820,2452001,2451362,2451392,2451240,2452032,2451637,2451484,2452124,2451300,2451727,2452093,2451759,2451698,2451332,2451606,2451666,2451912,2452185,2451211,2451361,2452031,2451212,2451880,2451789,2451423,2451576,2451728,2452246,2452155,2452092,2451881,2451970,2451697,2452063,2451271,2451636,2451515,2451424,2451239,2452002)) AND isnotnull(d_date_sk#10))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#10]
Input [3]: [d_date_sk#10, d_year#11, d_dom#12]
(8) BroadcastExchange
Input [1]: [d_date_sk#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#13]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#10]
Join condition: None
(10) Project [codegen id : 4]
Output [8]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Input [10]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, d_date_sk#10]
(11) Scan parquet default.store
Output [2]: [s_store_sk#14, s_city#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [In(s_city, [Midway,Fairview]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_city:string>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [s_store_sk#14, s_city#15]
(13) Filter [codegen id : 2]
Input [2]: [s_store_sk#14, s_city#15]
Condition : (s_city#15 IN (Midway,Fairview) AND isnotnull(s_store_sk#14))
(14) Project [codegen id : 2]
Output [1]: [s_store_sk#14]
Input [2]: [s_store_sk#14, s_city#15]
(15) BroadcastExchange
Input [1]: [s_store_sk#14]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#16]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#5]
Right keys [1]: [s_store_sk#14]
Join condition: None
(17) Project [codegen id : 4]
Output [7]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Input [9]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, s_store_sk#14]
(18) Scan parquet default.household_demographics
Output [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/household_demographics]
PushedFilters: [Or(EqualTo(hd_dep_count,5),EqualTo(hd_vehicle_count,3)), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_dep_count:int,hd_vehicle_count:int>
(19) ColumnarToRow [codegen id : 3]
Input [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
(20) Filter [codegen id : 3]
Input [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
Condition : (((hd_dep_count#18 = 5) OR (hd_vehicle_count#19 = 3)) AND isnotnull(hd_demo_sk#17))
(21) Project [codegen id : 3]
Output [1]: [hd_demo_sk#17]
Input [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
(22) BroadcastExchange
Input [1]: [hd_demo_sk#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#20]
(23) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#17]
Join condition: None
(24) Project [codegen id : 4]
Output [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Input [8]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, hd_demo_sk#17]
(25) BroadcastExchange
Input [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, true] as bigint)),false), [id=#21]
(26) Scan parquet default.customer_address
Output [2]: [ca_address_sk#22, ca_city#23]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer_address]
PushedFilters: [IsNotNull(ca_address_sk), IsNotNull(ca_city)]
ReadSchema: struct<ca_address_sk:int,ca_city:string>
(27) ColumnarToRow
Input [2]: [ca_address_sk#22, ca_city#23]
(28) Filter
Input [2]: [ca_address_sk#22, ca_city#23]
Condition : (isnotnull(ca_address_sk#22) AND isnotnull(ca_city#23))
(29) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_addr_sk#4]
Right keys [1]: [ca_address_sk#22]
Join condition: None
(30) Project [codegen id : 5]
Output [7]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, ca_city#23]
Input [8]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, ca_address_sk#22, ca_city#23]
(31) HashAggregate [codegen id : 5]
Input [7]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, ca_city#23]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#23]
Functions [3]: [partial_sum(UnscaledValue(ss_ext_sales_price#7)), partial_sum(UnscaledValue(ss_ext_list_price#8)), partial_sum(UnscaledValue(ss_ext_tax#9))]
Aggregate Attributes [3]: [sum#24, sum#25, sum#26]
Results [7]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#23, sum#27, sum#28, sum#29]
(32) Exchange
Input [7]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#23, sum#27, sum#28, sum#29]
Arguments: hashpartitioning(ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#23, 5), true, [id=#30]
(33) HashAggregate [codegen id : 6]
Input [7]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#23, sum#27, sum#28, sum#29]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#23]
Functions [3]: [sum(UnscaledValue(ss_ext_sales_price#7)), sum(UnscaledValue(ss_ext_list_price#8)), sum(UnscaledValue(ss_ext_tax#9))]
Aggregate Attributes [3]: [sum(UnscaledValue(ss_ext_sales_price#7))#31, sum(UnscaledValue(ss_ext_list_price#8))#32, sum(UnscaledValue(ss_ext_tax#9))#33]
Results [6]: [ss_ticket_number#6, ss_customer_sk#2, ca_city#23 AS bought_city#34, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#7))#31,17,2) AS extended_price#35, MakeDecimal(sum(UnscaledValue(ss_ext_list_price#8))#32,17,2) AS list_price#36, MakeDecimal(sum(UnscaledValue(ss_ext_tax#9))#33,17,2) AS extended_tax#37]
(34) Exchange
Input [6]: [ss_ticket_number#6, ss_customer_sk#2, bought_city#34, extended_price#35, list_price#36, extended_tax#37]
Arguments: hashpartitioning(ss_customer_sk#2, 5), true, [id=#38]
(35) Sort [codegen id : 7]
Input [6]: [ss_ticket_number#6, ss_customer_sk#2, bought_city#34, extended_price#35, list_price#36, extended_tax#37]
Arguments: [ss_customer_sk#2 ASC NULLS FIRST], false, 0
(36) Scan parquet default.customer
Output [4]: [c_customer_sk#39, c_current_addr_sk#40, c_first_name#41, c_last_name#42]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_current_addr_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_addr_sk:int,c_first_name:string,c_last_name:string>
(37) ColumnarToRow [codegen id : 8]
Input [4]: [c_customer_sk#39, c_current_addr_sk#40, c_first_name#41, c_last_name#42]
(38) Filter [codegen id : 8]
Input [4]: [c_customer_sk#39, c_current_addr_sk#40, c_first_name#41, c_last_name#42]
Condition : (isnotnull(c_customer_sk#39) AND isnotnull(c_current_addr_sk#40))
(39) Exchange
Input [4]: [c_customer_sk#39, c_current_addr_sk#40, c_first_name#41, c_last_name#42]
Arguments: hashpartitioning(c_customer_sk#39, 5), true, [id=#43]
(40) Sort [codegen id : 9]
Input [4]: [c_customer_sk#39, c_current_addr_sk#40, c_first_name#41, c_last_name#42]
Arguments: [c_customer_sk#39 ASC NULLS FIRST], false, 0
(41) SortMergeJoin [codegen id : 10]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#39]
Join condition: None
(42) Project [codegen id : 10]
Output [8]: [ss_ticket_number#6, bought_city#34, extended_price#35, list_price#36, extended_tax#37, c_current_addr_sk#40, c_first_name#41, c_last_name#42]
Input [10]: [ss_ticket_number#6, ss_customer_sk#2, bought_city#34, extended_price#35, list_price#36, extended_tax#37, c_customer_sk#39, c_current_addr_sk#40, c_first_name#41, c_last_name#42]
(43) Exchange
Input [8]: [ss_ticket_number#6, bought_city#34, extended_price#35, list_price#36, extended_tax#37, c_current_addr_sk#40, c_first_name#41, c_last_name#42]
Arguments: hashpartitioning(c_current_addr_sk#40, 5), true, [id=#44]
(44) Sort [codegen id : 11]
Input [8]: [ss_ticket_number#6, bought_city#34, extended_price#35, list_price#36, extended_tax#37, c_current_addr_sk#40, c_first_name#41, c_last_name#42]
Arguments: [c_current_addr_sk#40 ASC NULLS FIRST], false, 0
(45) Scan parquet default.customer_address
Output [2]: [ca_address_sk#22, ca_city#23]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer_address]
PushedFilters: [IsNotNull(ca_address_sk), IsNotNull(ca_city)]
ReadSchema: struct<ca_address_sk:int,ca_city:string>
(46) ColumnarToRow [codegen id : 12]
Input [2]: [ca_address_sk#22, ca_city#23]
(47) Filter [codegen id : 12]
Input [2]: [ca_address_sk#22, ca_city#23]
Condition : (isnotnull(ca_address_sk#22) AND isnotnull(ca_city#23))
(48) Exchange
Input [2]: [ca_address_sk#22, ca_city#23]
Arguments: hashpartitioning(ca_address_sk#22, 5), true, [id=#45]
(49) Sort [codegen id : 13]
Input [2]: [ca_address_sk#22, ca_city#23]
Arguments: [ca_address_sk#22 ASC NULLS FIRST], false, 0
(50) SortMergeJoin [codegen id : 14]
Left keys [1]: [c_current_addr_sk#40]
Right keys [1]: [ca_address_sk#22]
Join condition: NOT (ca_city#23 = bought_city#34)
(51) Project [codegen id : 14]
Output [8]: [c_last_name#42, c_first_name#41, ca_city#23, bought_city#34, ss_ticket_number#6, extended_price#35, extended_tax#37, list_price#36]
Input [10]: [ss_ticket_number#6, bought_city#34, extended_price#35, list_price#36, extended_tax#37, c_current_addr_sk#40, c_first_name#41, c_last_name#42, ca_address_sk#22, ca_city#23]
(52) TakeOrderedAndProject
Input [8]: [c_last_name#42, c_first_name#41, ca_city#23, bought_city#34, ss_ticket_number#6, extended_price#35, extended_tax#37, list_price#36]
Arguments: 100, [c_last_name#42 ASC NULLS FIRST, ss_ticket_number#6 ASC NULLS FIRST], [c_last_name#42, c_first_name#41, ca_city#23, bought_city#34, ss_ticket_number#6, extended_price#35, extended_tax#37, list_price#36]

@ -0,0 +1,86 @@
TakeOrderedAndProject [bought_city,c_first_name,c_last_name,ca_city,extended_price,extended_tax,list_price,ss_ticket_number]
WholeStageCodegen (14)
Project [bought_city,c_first_name,c_last_name,ca_city,extended_price,extended_tax,list_price,ss_ticket_number]
SortMergeJoin [bought_city,c_current_addr_sk,ca_address_sk,ca_city]
InputAdapter
WholeStageCodegen (11)
Sort [c_current_addr_sk]
InputAdapter
Exchange [c_current_addr_sk] #1
WholeStageCodegen (10)
Project [bought_city,c_current_addr_sk,c_first_name,c_last_name,extended_price,extended_tax,list_price,ss_ticket_number]
SortMergeJoin [c_customer_sk,ss_customer_sk]
InputAdapter
WholeStageCodegen (7)
Sort [ss_customer_sk]
InputAdapter
Exchange [ss_customer_sk] #2
WholeStageCodegen (6)
HashAggregate [ca_city,ss_addr_sk,ss_customer_sk,ss_ticket_number,sum,sum,sum] [bought_city,extended_price,extended_tax,list_price,sum,sum,sum,sum(UnscaledValue(ss_ext_list_price)),sum(UnscaledValue(ss_ext_sales_price)),sum(UnscaledValue(ss_ext_tax))]
InputAdapter
Exchange [ca_city,ss_addr_sk,ss_customer_sk,ss_ticket_number] #3
WholeStageCodegen (5)
HashAggregate [ca_city,ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_ticket_number] [sum,sum,sum,sum,sum,sum]
Project [ca_city,ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_ticket_number]
BroadcastHashJoin [ca_address_sk,ss_addr_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (4)
Project [ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_ticket_number]
BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
Project [ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_hdemo_sk,ss_ticket_number]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_hdemo_sk,ss_store_sk,ss_ticket_number]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_addr_sk,ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_dom,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_dom,d_year]
InputAdapter
BroadcastExchange #6
WholeStageCodegen (2)
Project [s_store_sk]
Filter [s_city,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_city,s_store_sk]
InputAdapter
BroadcastExchange #7
WholeStageCodegen (3)
Project [hd_demo_sk]
Filter [hd_demo_sk,hd_dep_count,hd_vehicle_count]
ColumnarToRow
InputAdapter
Scan parquet default.household_demographics [hd_demo_sk,hd_dep_count,hd_vehicle_count]
Filter [ca_address_sk,ca_city]
ColumnarToRow
InputAdapter
Scan parquet default.customer_address [ca_address_sk,ca_city]
InputAdapter
WholeStageCodegen (9)
Sort [c_customer_sk]
InputAdapter
Exchange [c_customer_sk] #8
WholeStageCodegen (8)
Filter [c_current_addr_sk,c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_current_addr_sk,c_customer_sk,c_first_name,c_last_name]
InputAdapter
WholeStageCodegen (13)
Sort [ca_address_sk]
InputAdapter
Exchange [ca_address_sk] #9
WholeStageCodegen (12)
Filter [ca_address_sk,ca_city]
ColumnarToRow
InputAdapter
Scan parquet default.customer_address [ca_address_sk,ca_city]

@ -0,0 +1,241 @@
== Physical Plan ==
TakeOrderedAndProject (43)
+- * Project (42)
+- * BroadcastHashJoin Inner BuildRight (41)
:- * Project (39)
: +- * BroadcastHashJoin Inner BuildRight (38)
: :- * HashAggregate (33)
: : +- Exchange (32)
: : +- * HashAggregate (31)
: : +- * Project (30)
: : +- * BroadcastHashJoin Inner BuildRight (29)
: : :- * Project (24)
: : : +- * BroadcastHashJoin Inner BuildRight (23)
: : : :- * Project (17)
: : : : +- * BroadcastHashJoin Inner BuildRight (16)
: : : : :- * Project (10)
: : : : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : : : :- * Filter (3)
: : : : : : +- * ColumnarToRow (2)
: : : : : : +- Scan parquet default.store_sales (1)
: : : : : +- BroadcastExchange (8)
: : : : : +- * Project (7)
: : : : : +- * Filter (6)
: : : : : +- * ColumnarToRow (5)
: : : : : +- Scan parquet default.date_dim (4)
: : : : +- BroadcastExchange (15)
: : : : +- * Project (14)
: : : : +- * Filter (13)
: : : : +- * ColumnarToRow (12)
: : : : +- Scan parquet default.store (11)
: : : +- BroadcastExchange (22)
: : : +- * Project (21)
: : : +- * Filter (20)
: : : +- * ColumnarToRow (19)
: : : +- Scan parquet default.household_demographics (18)
: : +- BroadcastExchange (28)
: : +- * Filter (27)
: : +- * ColumnarToRow (26)
: : +- Scan parquet default.customer_address (25)
: +- BroadcastExchange (37)
: +- * Filter (36)
: +- * ColumnarToRow (35)
: +- Scan parquet default.customer (34)
+- ReusedExchange (40)
(1) Scan parquet default.store_sales
Output [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [In(ss_sold_date_sk, [2451790,2451180,2452216,2451454,2452184,2451485,2451850,2451514,2452062,2451270,2452123,2451758,2451971,2451546,2451942,2451393,2451667,2451453,2452215,2451819,2451331,2451577,2451911,2452245,2451301,2451545,2451605,2451943,2451851,2451181,2452154,2451820,2452001,2451362,2451392,2451240,2452032,2451637,2451484,2452124,2451300,2451727,2452093,2451759,2451698,2451332,2451606,2451666,2451912,2452185,2451211,2451361,2452031,2451212,2451880,2451789,2451423,2451576,2451728,2452246,2452155,2452092,2451881,2451970,2451697,2452063,2451271,2451636,2451515,2451424,2451239,2452002]), IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_addr_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_addr_sk:int,ss_store_sk:int,ss_ticket_number:int,ss_ext_sales_price:decimal(7,2),ss_ext_list_price:decimal(7,2),ss_ext_tax:decimal(7,2)>
(2) ColumnarToRow [codegen id : 5]
Input [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
(3) Filter [codegen id : 5]
Input [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Condition : (((((ss_sold_date_sk#1 INSET (2451790,2451180,2452216,2451454,2452184,2451485,2451850,2451514,2452062,2451270,2452123,2451758,2451971,2451546,2451942,2451393,2451667,2451453,2452215,2451819,2451331,2451577,2451911,2452245,2451301,2451545,2451605,2451943,2451851,2451181,2452154,2451820,2452001,2451362,2451392,2451240,2452032,2451637,2451484,2452124,2451300,2451727,2452093,2451759,2451698,2451332,2451606,2451666,2451912,2452185,2451211,2451361,2452031,2451212,2451880,2451789,2451423,2451576,2451728,2452246,2452155,2452092,2451881,2451970,2451697,2452063,2451271,2451636,2451515,2451424,2451239,2452002) AND isnotnull(ss_sold_date_sk#1)) AND isnotnull(ss_store_sk#5)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_addr_sk#4)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#10, d_year#11, d_dom#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_dom), GreaterThanOrEqual(d_dom,1), LessThanOrEqual(d_dom,2), In(d_year, [1999,2000,2001]), In(d_date_sk, [2451790,2451180,2452216,2451454,2452184,2451485,2451850,2451514,2452062,2451270,2452123,2451758,2451971,2451546,2451942,2451393,2451667,2451453,2452215,2451819,2451331,2451577,2451911,2452245,2451301,2451545,2451605,2451943,2451851,2451181,2452154,2451820,2452001,2451362,2451392,2451240,2452032,2451637,2451484,2452124,2451300,2451727,2452093,2451759,2451698,2451332,2451606,2451666,2451912,2452185,2451211,2451361,2452031,2451212,2451880,2451789,2451423,2451576,2451728,2452246,2452155,2452092,2451881,2451970,2451697,2452063,2451271,2451636,2451515,2451424,2451239,2452002]), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dom:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#10, d_year#11, d_dom#12]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#10, d_year#11, d_dom#12]
Condition : (((((isnotnull(d_dom#12) AND (d_dom#12 >= 1)) AND (d_dom#12 <= 2)) AND d_year#11 IN (1999,2000,2001)) AND d_date_sk#10 INSET (2451790,2451180,2452216,2451454,2452184,2451485,2451850,2451514,2452062,2451270,2452123,2451758,2451971,2451546,2451942,2451393,2451667,2451453,2452215,2451819,2451331,2451577,2451911,2452245,2451301,2451545,2451605,2451943,2451851,2451181,2452154,2451820,2452001,2451362,2451392,2451240,2452032,2451637,2451484,2452124,2451300,2451727,2452093,2451759,2451698,2451332,2451606,2451666,2451912,2452185,2451211,2451361,2452031,2451212,2451880,2451789,2451423,2451576,2451728,2452246,2452155,2452092,2451881,2451970,2451697,2452063,2451271,2451636,2451515,2451424,2451239,2452002)) AND isnotnull(d_date_sk#10))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#10]
Input [3]: [d_date_sk#10, d_year#11, d_dom#12]
(8) BroadcastExchange
Input [1]: [d_date_sk#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#13]
(9) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#10]
Join condition: None
(10) Project [codegen id : 5]
Output [8]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Input [10]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, d_date_sk#10]
(11) Scan parquet default.store
Output [2]: [s_store_sk#14, s_city#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [In(s_city, [Midway,Fairview]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_city:string>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [s_store_sk#14, s_city#15]
(13) Filter [codegen id : 2]
Input [2]: [s_store_sk#14, s_city#15]
Condition : (s_city#15 IN (Midway,Fairview) AND isnotnull(s_store_sk#14))
(14) Project [codegen id : 2]
Output [1]: [s_store_sk#14]
Input [2]: [s_store_sk#14, s_city#15]
(15) BroadcastExchange
Input [1]: [s_store_sk#14]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#16]
(16) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_store_sk#5]
Right keys [1]: [s_store_sk#14]
Join condition: None
(17) Project [codegen id : 5]
Output [7]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Input [9]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, s_store_sk#14]
(18) Scan parquet default.household_demographics
Output [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/household_demographics]
PushedFilters: [Or(EqualTo(hd_dep_count,5),EqualTo(hd_vehicle_count,3)), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_dep_count:int,hd_vehicle_count:int>
(19) ColumnarToRow [codegen id : 3]
Input [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
(20) Filter [codegen id : 3]
Input [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
Condition : (((hd_dep_count#18 = 5) OR (hd_vehicle_count#19 = 3)) AND isnotnull(hd_demo_sk#17))
(21) Project [codegen id : 3]
Output [1]: [hd_demo_sk#17]
Input [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
(22) BroadcastExchange
Input [1]: [hd_demo_sk#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#20]
(23) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#17]
Join condition: None
(24) Project [codegen id : 5]
Output [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9]
Input [8]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, hd_demo_sk#17]
(25) Scan parquet default.customer_address
Output [2]: [ca_address_sk#21, ca_city#22]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer_address]
PushedFilters: [IsNotNull(ca_address_sk), IsNotNull(ca_city)]
ReadSchema: struct<ca_address_sk:int,ca_city:string>
(26) ColumnarToRow [codegen id : 4]
Input [2]: [ca_address_sk#21, ca_city#22]
(27) Filter [codegen id : 4]
Input [2]: [ca_address_sk#21, ca_city#22]
Condition : (isnotnull(ca_address_sk#21) AND isnotnull(ca_city#22))
(28) BroadcastExchange
Input [2]: [ca_address_sk#21, ca_city#22]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#23]
(29) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_addr_sk#4]
Right keys [1]: [ca_address_sk#21]
Join condition: None
(30) Project [codegen id : 5]
Output [7]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, ca_city#22]
Input [8]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, ca_address_sk#21, ca_city#22]
(31) HashAggregate [codegen id : 5]
Input [7]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_ext_sales_price#7, ss_ext_list_price#8, ss_ext_tax#9, ca_city#22]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22]
Functions [3]: [partial_sum(UnscaledValue(ss_ext_sales_price#7)), partial_sum(UnscaledValue(ss_ext_list_price#8)), partial_sum(UnscaledValue(ss_ext_tax#9))]
Aggregate Attributes [3]: [sum#24, sum#25, sum#26]
Results [7]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22, sum#27, sum#28, sum#29]
(32) Exchange
Input [7]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22, sum#27, sum#28, sum#29]
Arguments: hashpartitioning(ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22, 5), true, [id=#30]
(33) HashAggregate [codegen id : 8]
Input [7]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22, sum#27, sum#28, sum#29]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, ca_city#22]
Functions [3]: [sum(UnscaledValue(ss_ext_sales_price#7)), sum(UnscaledValue(ss_ext_list_price#8)), sum(UnscaledValue(ss_ext_tax#9))]
Aggregate Attributes [3]: [sum(UnscaledValue(ss_ext_sales_price#7))#31, sum(UnscaledValue(ss_ext_list_price#8))#32, sum(UnscaledValue(ss_ext_tax#9))#33]
Results [6]: [ss_ticket_number#6, ss_customer_sk#2, ca_city#22 AS bought_city#34, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#7))#31,17,2) AS extended_price#35, MakeDecimal(sum(UnscaledValue(ss_ext_list_price#8))#32,17,2) AS list_price#36, MakeDecimal(sum(UnscaledValue(ss_ext_tax#9))#33,17,2) AS extended_tax#37]
(34) Scan parquet default.customer
Output [4]: [c_customer_sk#38, c_current_addr_sk#39, c_first_name#40, c_last_name#41]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_current_addr_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_addr_sk:int,c_first_name:string,c_last_name:string>
(35) ColumnarToRow [codegen id : 6]
Input [4]: [c_customer_sk#38, c_current_addr_sk#39, c_first_name#40, c_last_name#41]
(36) Filter [codegen id : 6]
Input [4]: [c_customer_sk#38, c_current_addr_sk#39, c_first_name#40, c_last_name#41]
Condition : (isnotnull(c_customer_sk#38) AND isnotnull(c_current_addr_sk#39))
(37) BroadcastExchange
Input [4]: [c_customer_sk#38, c_current_addr_sk#39, c_first_name#40, c_last_name#41]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#42]
(38) BroadcastHashJoin [codegen id : 8]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#38]
Join condition: None
(39) Project [codegen id : 8]
Output [8]: [ss_ticket_number#6, bought_city#34, extended_price#35, list_price#36, extended_tax#37, c_current_addr_sk#39, c_first_name#40, c_last_name#41]
Input [10]: [ss_ticket_number#6, ss_customer_sk#2, bought_city#34, extended_price#35, list_price#36, extended_tax#37, c_customer_sk#38, c_current_addr_sk#39, c_first_name#40, c_last_name#41]
(40) ReusedExchange [Reuses operator id: 28]
Output [2]: [ca_address_sk#21, ca_city#22]
(41) BroadcastHashJoin [codegen id : 8]
Left keys [1]: [c_current_addr_sk#39]
Right keys [1]: [ca_address_sk#21]
Join condition: NOT (ca_city#22 = bought_city#34)
(42) Project [codegen id : 8]
Output [8]: [c_last_name#41, c_first_name#40, ca_city#22, bought_city#34, ss_ticket_number#6, extended_price#35, extended_tax#37, list_price#36]
Input [10]: [ss_ticket_number#6, bought_city#34, extended_price#35, list_price#36, extended_tax#37, c_current_addr_sk#39, c_first_name#40, c_last_name#41, ca_address_sk#21, ca_city#22]
(43) TakeOrderedAndProject
Input [8]: [c_last_name#41, c_first_name#40, ca_city#22, bought_city#34, ss_ticket_number#6, extended_price#35, extended_tax#37, list_price#36]
Arguments: 100, [c_last_name#41 ASC NULLS FIRST, ss_ticket_number#6 ASC NULLS FIRST], [c_last_name#41, c_first_name#40, ca_city#22, bought_city#34, ss_ticket_number#6, extended_price#35, extended_tax#37, list_price#36]

@ -0,0 +1,63 @@
TakeOrderedAndProject [bought_city,c_first_name,c_last_name,ca_city,extended_price,extended_tax,list_price,ss_ticket_number]
WholeStageCodegen (8)
Project [bought_city,c_first_name,c_last_name,ca_city,extended_price,extended_tax,list_price,ss_ticket_number]
BroadcastHashJoin [bought_city,c_current_addr_sk,ca_address_sk,ca_city]
Project [bought_city,c_current_addr_sk,c_first_name,c_last_name,extended_price,extended_tax,list_price,ss_ticket_number]
BroadcastHashJoin [c_customer_sk,ss_customer_sk]
HashAggregate [ca_city,ss_addr_sk,ss_customer_sk,ss_ticket_number,sum,sum,sum] [bought_city,extended_price,extended_tax,list_price,sum,sum,sum,sum(UnscaledValue(ss_ext_list_price)),sum(UnscaledValue(ss_ext_sales_price)),sum(UnscaledValue(ss_ext_tax))]
InputAdapter
Exchange [ca_city,ss_addr_sk,ss_customer_sk,ss_ticket_number] #1
WholeStageCodegen (5)
HashAggregate [ca_city,ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_ticket_number] [sum,sum,sum,sum,sum,sum]
Project [ca_city,ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_ticket_number]
BroadcastHashJoin [ca_address_sk,ss_addr_sk]
Project [ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_ticket_number]
BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
Project [ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_hdemo_sk,ss_ticket_number]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_hdemo_sk,ss_store_sk,ss_ticket_number]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_addr_sk,ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_addr_sk,ss_customer_sk,ss_ext_list_price,ss_ext_sales_price,ss_ext_tax,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_dom,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_dom,d_year]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [s_store_sk]
Filter [s_city,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_city,s_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (3)
Project [hd_demo_sk]
Filter [hd_demo_sk,hd_dep_count,hd_vehicle_count]
ColumnarToRow
InputAdapter
Scan parquet default.household_demographics [hd_demo_sk,hd_dep_count,hd_vehicle_count]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (4)
Filter [ca_address_sk,ca_city]
ColumnarToRow
InputAdapter
Scan parquet default.customer_address [ca_address_sk,ca_city]
InputAdapter
BroadcastExchange #6
WholeStageCodegen (6)
Filter [c_current_addr_sk,c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_current_addr_sk,c_customer_sk,c_first_name,c_last_name]
InputAdapter
ReusedExchange [ca_address_sk,ca_city] #5

@ -0,0 +1,193 @@
== Physical Plan ==
TakeOrderedAndProject (34)
+- * HashAggregate (33)
+- Exchange (32)
+- * HashAggregate (31)
+- * Project (30)
+- * BroadcastHashJoin Inner BuildRight (29)
:- * Project (24)
: +- * BroadcastHashJoin Inner BuildRight (23)
: :- * Project (17)
: : +- * BroadcastHashJoin Inner BuildRight (16)
: : :- * Project (10)
: : : +- * BroadcastHashJoin Inner BuildLeft (9)
: : : :- BroadcastExchange (5)
: : : : +- * Project (4)
: : : : +- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.date_dim (1)
: : : +- * Filter (8)
: : : +- * ColumnarToRow (7)
: : : +- Scan parquet default.store_sales (6)
: : +- BroadcastExchange (15)
: : +- * Project (14)
: : +- * Filter (13)
: : +- * ColumnarToRow (12)
: : +- Scan parquet default.promotion (11)
: +- BroadcastExchange (22)
: +- * Project (21)
: +- * Filter (20)
: +- * ColumnarToRow (19)
: +- Scan parquet default.customer_demographics (18)
+- BroadcastExchange (28)
+- * Filter (27)
+- * ColumnarToRow (26)
+- Scan parquet default.item (25)
(1) Scan parquet default.date_dim
Output [2]: [d_date_sk#1, d_year#2]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,1998), GreaterThanOrEqual(d_date_sk,2450815), LessThanOrEqual(d_date_sk,2451179), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(2) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#1, d_year#2]
(3) Filter [codegen id : 1]
Input [2]: [d_date_sk#1, d_year#2]
Condition : ((((isnotnull(d_year#2) AND (d_year#2 = 1998)) AND (d_date_sk#1 >= 2450815)) AND (d_date_sk#1 <= 2451179)) AND isnotnull(d_date_sk#1))
(4) Project [codegen id : 1]
Output [1]: [d_date_sk#1]
Input [2]: [d_date_sk#1, d_year#2]
(5) BroadcastExchange
Input [1]: [d_date_sk#1]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#3]
(6) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_promo_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2450815), LessThanOrEqual(ss_sold_date_sk,2451179), IsNotNull(ss_cdemo_sk), IsNotNull(ss_item_sk), IsNotNull(ss_promo_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_cdemo_sk:int,ss_promo_sk:int,ss_quantity:int,ss_list_price:decimal(7,2),ss_sales_price:decimal(7,2),ss_coupon_amt:decimal(7,2)>
(7) ColumnarToRow
Input [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_promo_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
(8) Filter
Input [8]: [ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_promo_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Condition : (((((isnotnull(ss_sold_date_sk#4) AND (ss_sold_date_sk#4 >= 2450815)) AND (ss_sold_date_sk#4 <= 2451179)) AND isnotnull(ss_cdemo_sk#6)) AND isnotnull(ss_item_sk#5)) AND isnotnull(ss_promo_sk#7))
(9) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [d_date_sk#1]
Right keys [1]: [ss_sold_date_sk#4]
Join condition: None
(10) Project [codegen id : 5]
Output [7]: [ss_item_sk#5, ss_cdemo_sk#6, ss_promo_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [9]: [d_date_sk#1, ss_sold_date_sk#4, ss_item_sk#5, ss_cdemo_sk#6, ss_promo_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
(11) Scan parquet default.promotion
Output [3]: [p_promo_sk#12, p_channel_email#13, p_channel_event#14]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/promotion]
PushedFilters: [Or(EqualTo(p_channel_email,N),EqualTo(p_channel_event,N)), IsNotNull(p_promo_sk)]
ReadSchema: struct<p_promo_sk:int,p_channel_email:string,p_channel_event:string>
(12) ColumnarToRow [codegen id : 2]
Input [3]: [p_promo_sk#12, p_channel_email#13, p_channel_event#14]
(13) Filter [codegen id : 2]
Input [3]: [p_promo_sk#12, p_channel_email#13, p_channel_event#14]
Condition : (((p_channel_email#13 = N) OR (p_channel_event#14 = N)) AND isnotnull(p_promo_sk#12))
(14) Project [codegen id : 2]
Output [1]: [p_promo_sk#12]
Input [3]: [p_promo_sk#12, p_channel_email#13, p_channel_event#14]
(15) BroadcastExchange
Input [1]: [p_promo_sk#12]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#15]
(16) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_promo_sk#7]
Right keys [1]: [p_promo_sk#12]
Join condition: None
(17) Project [codegen id : 5]
Output [6]: [ss_item_sk#5, ss_cdemo_sk#6, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [8]: [ss_item_sk#5, ss_cdemo_sk#6, ss_promo_sk#7, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, p_promo_sk#12]
(18) Scan parquet default.customer_demographics
Output [4]: [cd_demo_sk#16, cd_gender#17, cd_marital_status#18, cd_education_status#19]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer_demographics]
PushedFilters: [IsNotNull(cd_education_status), IsNotNull(cd_gender), IsNotNull(cd_marital_status), EqualTo(cd_gender,F), EqualTo(cd_marital_status,W), EqualTo(cd_education_status,Primary), IsNotNull(cd_demo_sk)]
ReadSchema: struct<cd_demo_sk:int,cd_gender:string,cd_marital_status:string,cd_education_status:string>
(19) ColumnarToRow [codegen id : 3]
Input [4]: [cd_demo_sk#16, cd_gender#17, cd_marital_status#18, cd_education_status#19]
(20) Filter [codegen id : 3]
Input [4]: [cd_demo_sk#16, cd_gender#17, cd_marital_status#18, cd_education_status#19]
Condition : ((((((isnotnull(cd_education_status#19) AND isnotnull(cd_gender#17)) AND isnotnull(cd_marital_status#18)) AND (cd_gender#17 = F)) AND (cd_marital_status#18 = W)) AND (cd_education_status#19 = Primary)) AND isnotnull(cd_demo_sk#16))
(21) Project [codegen id : 3]
Output [1]: [cd_demo_sk#16]
Input [4]: [cd_demo_sk#16, cd_gender#17, cd_marital_status#18, cd_education_status#19]
(22) BroadcastExchange
Input [1]: [cd_demo_sk#16]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#20]
(23) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_cdemo_sk#6]
Right keys [1]: [cd_demo_sk#16]
Join condition: None
(24) Project [codegen id : 5]
Output [5]: [ss_item_sk#5, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11]
Input [7]: [ss_item_sk#5, ss_cdemo_sk#6, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, cd_demo_sk#16]
(25) Scan parquet default.item
Output [2]: [i_item_sk#21, i_item_id#22]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_id:string>
(26) ColumnarToRow [codegen id : 4]
Input [2]: [i_item_sk#21, i_item_id#22]
(27) Filter [codegen id : 4]
Input [2]: [i_item_sk#21, i_item_id#22]
Condition : isnotnull(i_item_sk#21)
(28) BroadcastExchange
Input [2]: [i_item_sk#21, i_item_id#22]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#23]
(29) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_item_sk#5]
Right keys [1]: [i_item_sk#21]
Join condition: None
(30) Project [codegen id : 5]
Output [5]: [ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, i_item_id#22]
Input [7]: [ss_item_sk#5, ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, i_item_sk#21, i_item_id#22]
(31) HashAggregate [codegen id : 5]
Input [5]: [ss_quantity#8, ss_list_price#9, ss_sales_price#10, ss_coupon_amt#11, i_item_id#22]
Keys [1]: [i_item_id#22]
Functions [4]: [partial_avg(cast(ss_quantity#8 as bigint)), partial_avg(UnscaledValue(ss_list_price#9)), partial_avg(UnscaledValue(ss_coupon_amt#11)), partial_avg(UnscaledValue(ss_sales_price#10))]
Aggregate Attributes [8]: [sum#24, count#25, sum#26, count#27, sum#28, count#29, sum#30, count#31]
Results [9]: [i_item_id#22, sum#32, count#33, sum#34, count#35, sum#36, count#37, sum#38, count#39]
(32) Exchange
Input [9]: [i_item_id#22, sum#32, count#33, sum#34, count#35, sum#36, count#37, sum#38, count#39]
Arguments: hashpartitioning(i_item_id#22, 5), true, [id=#40]
(33) HashAggregate [codegen id : 6]
Input [9]: [i_item_id#22, sum#32, count#33, sum#34, count#35, sum#36, count#37, sum#38, count#39]
Keys [1]: [i_item_id#22]
Functions [4]: [avg(cast(ss_quantity#8 as bigint)), avg(UnscaledValue(ss_list_price#9)), avg(UnscaledValue(ss_coupon_amt#11)), avg(UnscaledValue(ss_sales_price#10))]
Aggregate Attributes [4]: [avg(cast(ss_quantity#8 as bigint))#41, avg(UnscaledValue(ss_list_price#9))#42, avg(UnscaledValue(ss_coupon_amt#11))#43, avg(UnscaledValue(ss_sales_price#10))#44]
Results [5]: [i_item_id#22, avg(cast(ss_quantity#8 as bigint))#41 AS agg1#45, cast((avg(UnscaledValue(ss_list_price#9))#42 / 100.0) as decimal(11,6)) AS agg2#46, cast((avg(UnscaledValue(ss_coupon_amt#11))#43 / 100.0) as decimal(11,6)) AS agg3#47, cast((avg(UnscaledValue(ss_sales_price#10))#44 / 100.0) as decimal(11,6)) AS agg4#48]
(34) TakeOrderedAndProject
Input [5]: [i_item_id#22, agg1#45, agg2#46, agg3#47, agg4#48]
Arguments: 100, [i_item_id#22 ASC NULLS FIRST], [i_item_id#22, agg1#45, agg2#46, agg3#47, agg4#48]

@ -0,0 +1,50 @@
TakeOrderedAndProject [agg1,agg2,agg3,agg4,i_item_id]
WholeStageCodegen (6)
HashAggregate [count,count,count,count,i_item_id,sum,sum,sum,sum] [agg1,agg2,agg3,agg4,avg(UnscaledValue(ss_coupon_amt)),avg(UnscaledValue(ss_list_price)),avg(UnscaledValue(ss_sales_price)),avg(cast(ss_quantity as bigint)),count,count,count,count,sum,sum,sum,sum]
InputAdapter
Exchange [i_item_id] #1
WholeStageCodegen (5)
HashAggregate [i_item_id,ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price] [count,count,count,count,count,count,count,count,sum,sum,sum,sum,sum,sum,sum,sum]
Project [i_item_id,ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [cd_demo_sk,ss_cdemo_sk]
Project [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [p_promo_sk,ss_promo_sk]
Project [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_promo_sk,ss_quantity,ss_sales_price]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_year]
Filter [ss_cdemo_sk,ss_item_sk,ss_promo_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_promo_sk,ss_quantity,ss_sales_price,ss_sold_date_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [p_promo_sk]
Filter [p_channel_email,p_channel_event,p_promo_sk]
ColumnarToRow
InputAdapter
Scan parquet default.promotion [p_channel_email,p_channel_event,p_promo_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (3)
Project [cd_demo_sk]
Filter [cd_demo_sk,cd_education_status,cd_gender,cd_marital_status]
ColumnarToRow
InputAdapter
Scan parquet default.customer_demographics [cd_demo_sk,cd_education_status,cd_gender,cd_marital_status]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (4)
Filter [i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_item_id,i_item_sk]

@ -0,0 +1,193 @@
== Physical Plan ==
TakeOrderedAndProject (34)
+- * HashAggregate (33)
+- Exchange (32)
+- * HashAggregate (31)
+- * Project (30)
+- * BroadcastHashJoin Inner BuildRight (29)
:- * Project (23)
: +- * BroadcastHashJoin Inner BuildRight (22)
: :- * Project (17)
: : +- * BroadcastHashJoin Inner BuildRight (16)
: : :- * Project (10)
: : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_sales (1)
: : : +- BroadcastExchange (8)
: : : +- * Project (7)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.customer_demographics (4)
: : +- BroadcastExchange (15)
: : +- * Project (14)
: : +- * Filter (13)
: : +- * ColumnarToRow (12)
: : +- Scan parquet default.date_dim (11)
: +- BroadcastExchange (21)
: +- * Filter (20)
: +- * ColumnarToRow (19)
: +- Scan parquet default.item (18)
+- BroadcastExchange (28)
+- * Project (27)
+- * Filter (26)
+- * ColumnarToRow (25)
+- Scan parquet default.promotion (24)
(1) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2450815), LessThanOrEqual(ss_sold_date_sk,2451179), IsNotNull(ss_cdemo_sk), IsNotNull(ss_item_sk), IsNotNull(ss_promo_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_cdemo_sk:int,ss_promo_sk:int,ss_quantity:int,ss_list_price:decimal(7,2),ss_sales_price:decimal(7,2),ss_coupon_amt:decimal(7,2)>
(2) ColumnarToRow [codegen id : 5]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
(3) Filter [codegen id : 5]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Condition : (((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2450815)) AND (ss_sold_date_sk#1 <= 2451179)) AND isnotnull(ss_cdemo_sk#3)) AND isnotnull(ss_item_sk#2)) AND isnotnull(ss_promo_sk#4))
(4) Scan parquet default.customer_demographics
Output [4]: [cd_demo_sk#9, cd_gender#10, cd_marital_status#11, cd_education_status#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer_demographics]
PushedFilters: [IsNotNull(cd_marital_status), IsNotNull(cd_education_status), IsNotNull(cd_gender), EqualTo(cd_gender,F), EqualTo(cd_marital_status,W), EqualTo(cd_education_status,Primary), IsNotNull(cd_demo_sk)]
ReadSchema: struct<cd_demo_sk:int,cd_gender:string,cd_marital_status:string,cd_education_status:string>
(5) ColumnarToRow [codegen id : 1]
Input [4]: [cd_demo_sk#9, cd_gender#10, cd_marital_status#11, cd_education_status#12]
(6) Filter [codegen id : 1]
Input [4]: [cd_demo_sk#9, cd_gender#10, cd_marital_status#11, cd_education_status#12]
Condition : ((((((isnotnull(cd_marital_status#11) AND isnotnull(cd_education_status#12)) AND isnotnull(cd_gender#10)) AND (cd_gender#10 = F)) AND (cd_marital_status#11 = W)) AND (cd_education_status#12 = Primary)) AND isnotnull(cd_demo_sk#9))
(7) Project [codegen id : 1]
Output [1]: [cd_demo_sk#9]
Input [4]: [cd_demo_sk#9, cd_gender#10, cd_marital_status#11, cd_education_status#12]
(8) BroadcastExchange
Input [1]: [cd_demo_sk#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#13]
(9) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_cdemo_sk#3]
Right keys [1]: [cd_demo_sk#9]
Join condition: None
(10) Project [codegen id : 5]
Output [7]: [ss_sold_date_sk#1, ss_item_sk#2, ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [9]: [ss_sold_date_sk#1, ss_item_sk#2, ss_cdemo_sk#3, ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, cd_demo_sk#9]
(11) Scan parquet default.date_dim
Output [2]: [d_date_sk#14, d_year#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,1998), LessThanOrEqual(d_date_sk,2451179), GreaterThanOrEqual(d_date_sk,2450815), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [d_date_sk#14, d_year#15]
(13) Filter [codegen id : 2]
Input [2]: [d_date_sk#14, d_year#15]
Condition : ((((isnotnull(d_year#15) AND (d_year#15 = 1998)) AND (d_date_sk#14 <= 2451179)) AND (d_date_sk#14 >= 2450815)) AND isnotnull(d_date_sk#14))
(14) Project [codegen id : 2]
Output [1]: [d_date_sk#14]
Input [2]: [d_date_sk#14, d_year#15]
(15) BroadcastExchange
Input [1]: [d_date_sk#14]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#16]
(16) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#14]
Join condition: None
(17) Project [codegen id : 5]
Output [6]: [ss_item_sk#2, ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8]
Input [8]: [ss_sold_date_sk#1, ss_item_sk#2, ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, d_date_sk#14]
(18) Scan parquet default.item
Output [2]: [i_item_sk#17, i_item_id#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_id:string>
(19) ColumnarToRow [codegen id : 3]
Input [2]: [i_item_sk#17, i_item_id#18]
(20) Filter [codegen id : 3]
Input [2]: [i_item_sk#17, i_item_id#18]
Condition : isnotnull(i_item_sk#17)
(21) BroadcastExchange
Input [2]: [i_item_sk#17, i_item_id#18]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#19]
(22) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_item_sk#2]
Right keys [1]: [i_item_sk#17]
Join condition: None
(23) Project [codegen id : 5]
Output [6]: [ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, i_item_id#18]
Input [8]: [ss_item_sk#2, ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, i_item_sk#17, i_item_id#18]
(24) Scan parquet default.promotion
Output [3]: [p_promo_sk#20, p_channel_email#21, p_channel_event#22]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/promotion]
PushedFilters: [Or(EqualTo(p_channel_email,N),EqualTo(p_channel_event,N)), IsNotNull(p_promo_sk)]
ReadSchema: struct<p_promo_sk:int,p_channel_email:string,p_channel_event:string>
(25) ColumnarToRow [codegen id : 4]
Input [3]: [p_promo_sk#20, p_channel_email#21, p_channel_event#22]
(26) Filter [codegen id : 4]
Input [3]: [p_promo_sk#20, p_channel_email#21, p_channel_event#22]
Condition : (((p_channel_email#21 = N) OR (p_channel_event#22 = N)) AND isnotnull(p_promo_sk#20))
(27) Project [codegen id : 4]
Output [1]: [p_promo_sk#20]
Input [3]: [p_promo_sk#20, p_channel_email#21, p_channel_event#22]
(28) BroadcastExchange
Input [1]: [p_promo_sk#20]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#23]
(29) BroadcastHashJoin [codegen id : 5]
Left keys [1]: [ss_promo_sk#4]
Right keys [1]: [p_promo_sk#20]
Join condition: None
(30) Project [codegen id : 5]
Output [5]: [ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, i_item_id#18]
Input [7]: [ss_promo_sk#4, ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, i_item_id#18, p_promo_sk#20]
(31) HashAggregate [codegen id : 5]
Input [5]: [ss_quantity#5, ss_list_price#6, ss_sales_price#7, ss_coupon_amt#8, i_item_id#18]
Keys [1]: [i_item_id#18]
Functions [4]: [partial_avg(cast(ss_quantity#5 as bigint)), partial_avg(UnscaledValue(ss_list_price#6)), partial_avg(UnscaledValue(ss_coupon_amt#8)), partial_avg(UnscaledValue(ss_sales_price#7))]
Aggregate Attributes [8]: [sum#24, count#25, sum#26, count#27, sum#28, count#29, sum#30, count#31]
Results [9]: [i_item_id#18, sum#32, count#33, sum#34, count#35, sum#36, count#37, sum#38, count#39]
(32) Exchange
Input [9]: [i_item_id#18, sum#32, count#33, sum#34, count#35, sum#36, count#37, sum#38, count#39]
Arguments: hashpartitioning(i_item_id#18, 5), true, [id=#40]
(33) HashAggregate [codegen id : 6]
Input [9]: [i_item_id#18, sum#32, count#33, sum#34, count#35, sum#36, count#37, sum#38, count#39]
Keys [1]: [i_item_id#18]
Functions [4]: [avg(cast(ss_quantity#5 as bigint)), avg(UnscaledValue(ss_list_price#6)), avg(UnscaledValue(ss_coupon_amt#8)), avg(UnscaledValue(ss_sales_price#7))]
Aggregate Attributes [4]: [avg(cast(ss_quantity#5 as bigint))#41, avg(UnscaledValue(ss_list_price#6))#42, avg(UnscaledValue(ss_coupon_amt#8))#43, avg(UnscaledValue(ss_sales_price#7))#44]
Results [5]: [i_item_id#18, avg(cast(ss_quantity#5 as bigint))#41 AS agg1#45, cast((avg(UnscaledValue(ss_list_price#6))#42 / 100.0) as decimal(11,6)) AS agg2#46, cast((avg(UnscaledValue(ss_coupon_amt#8))#43 / 100.0) as decimal(11,6)) AS agg3#47, cast((avg(UnscaledValue(ss_sales_price#7))#44 / 100.0) as decimal(11,6)) AS agg4#48]
(34) TakeOrderedAndProject
Input [5]: [i_item_id#18, agg1#45, agg2#46, agg3#47, agg4#48]
Arguments: 100, [i_item_id#18 ASC NULLS FIRST], [i_item_id#18, agg1#45, agg2#46, agg3#47, agg4#48]

@ -0,0 +1,50 @@
TakeOrderedAndProject [agg1,agg2,agg3,agg4,i_item_id]
WholeStageCodegen (6)
HashAggregate [count,count,count,count,i_item_id,sum,sum,sum,sum] [agg1,agg2,agg3,agg4,avg(UnscaledValue(ss_coupon_amt)),avg(UnscaledValue(ss_list_price)),avg(UnscaledValue(ss_sales_price)),avg(cast(ss_quantity as bigint)),count,count,count,count,sum,sum,sum,sum]
InputAdapter
Exchange [i_item_id] #1
WholeStageCodegen (5)
HashAggregate [i_item_id,ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price] [count,count,count,count,count,count,count,count,sum,sum,sum,sum,sum,sum,sum,sum]
Project [i_item_id,ss_coupon_amt,ss_list_price,ss_quantity,ss_sales_price]
BroadcastHashJoin [p_promo_sk,ss_promo_sk]
Project [i_item_id,ss_coupon_amt,ss_list_price,ss_promo_sk,ss_quantity,ss_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_promo_sk,ss_quantity,ss_sales_price]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [ss_coupon_amt,ss_item_sk,ss_list_price,ss_promo_sk,ss_quantity,ss_sales_price,ss_sold_date_sk]
BroadcastHashJoin [cd_demo_sk,ss_cdemo_sk]
Filter [ss_cdemo_sk,ss_item_sk,ss_promo_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_cdemo_sk,ss_coupon_amt,ss_item_sk,ss_list_price,ss_promo_sk,ss_quantity,ss_sales_price,ss_sold_date_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [cd_demo_sk]
Filter [cd_demo_sk,cd_education_status,cd_gender,cd_marital_status]
ColumnarToRow
InputAdapter
Scan parquet default.customer_demographics [cd_demo_sk,cd_education_status,cd_gender,cd_marital_status]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [d_date_sk]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_year]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (3)
Filter [i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_item_id,i_item_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (4)
Project [p_promo_sk]
Filter [p_channel_email,p_channel_event,p_promo_sk]
ColumnarToRow
InputAdapter
Scan parquet default.promotion [p_channel_email,p_channel_event,p_promo_sk]

@ -0,0 +1,203 @@
== Physical Plan ==
* Sort (36)
+- Exchange (35)
   +- * Project (34)
      +- * BroadcastHashJoin Inner BuildLeft (33)
         :- BroadcastExchange (29)
         :  +- * Filter (28)
         :     +- * HashAggregate (27)
         :        +- Exchange (26)
         :           +- * HashAggregate (25)
         :              +- * Project (24)
         :                 +- * BroadcastHashJoin Inner BuildRight (23)
         :                    :- * Project (17)
         :                    :  +- * BroadcastHashJoin Inner BuildRight (16)
         :                    :     :- * Project (10)
         :                    :     :  +- * BroadcastHashJoin Inner BuildRight (9)
         :                    :     :     :- * Filter (3)
         :                    :     :     :  +- * ColumnarToRow (2)
         :                    :     :     :     +- Scan parquet default.store_sales (1)
         :                    :     :     +- BroadcastExchange (8)
         :                    :     :        +- * Project (7)
         :                    :     :           +- * Filter (6)
         :                    :     :              +- * ColumnarToRow (5)
         :                    :     :                 +- Scan parquet default.date_dim (4)
         :                    :     +- BroadcastExchange (15)
         :                    :        +- * Project (14)
         :                    :           +- * Filter (13)
         :                    :              +- * ColumnarToRow (12)
         :                    :                 +- Scan parquet default.store (11)
         :                    +- BroadcastExchange (22)
         :                       +- * Project (21)
         :                          +- * Filter (20)
         :                             +- * ColumnarToRow (19)
         :                                +- Scan parquet default.household_demographics (18)
         +- * Filter (32)
            +- * ColumnarToRow (31)
               +- Scan parquet default.customer (30)
(1) Scan parquet default.store_sales
Output [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [In(ss_sold_date_sk, [2451790,2451119,2451180,2451454,2450874,2450906,2450967,2451485,2451850,2451514,2451270,2451758,2451028,2451546,2450997,2450996,2451393,2451667,2451453,2451819,2450905,2451331,2451577,2451089,2451301,2451545,2451605,2451851,2451181,2451149,2451820,2451362,2451392,2451240,2450935,2451637,2451484,2451058,2451300,2451727,2451759,2450815,2451698,2451150,2451332,2451606,2451666,2451211,2450846,2450875,2450966,2450936,2451361,2451212,2451880,2451059,2451789,2451423,2451576,2450816,2451088,2451728,2451027,2451120,2451881,2451697,2450847,2451271,2451636,2451515,2451424,2451239]), IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_store_sk:int,ss_ticket_number:int>
(2) ColumnarToRow [codegen id : 4]
Input [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
(3) Filter [codegen id : 4]
Input [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Condition : ((((ss_sold_date_sk#1 INSET (2451790,2451119,2451180,2451454,2450874,2450906,2450967,2451485,2451850,2451514,2451270,2451758,2451028,2451546,2450997,2450996,2451393,2451667,2451453,2451819,2450905,2451331,2451577,2451089,2451301,2451545,2451605,2451851,2451181,2451149,2451820,2451362,2451392,2451240,2450935,2451637,2451484,2451058,2451300,2451727,2451759,2450815,2451698,2451150,2451332,2451606,2451666,2451211,2450846,2450875,2450966,2450936,2451361,2451212,2451880,2451059,2451789,2451423,2451576,2450816,2451088,2451728,2451027,2451120,2451881,2451697,2450847,2451271,2451636,2451515,2451424,2451239) AND isnotnull(ss_sold_date_sk#1)) AND isnotnull(ss_store_sk#4)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#6, d_year#7, d_dom#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_dom), GreaterThanOrEqual(d_dom,1), LessThanOrEqual(d_dom,2), In(d_year, [1998,1999,2000]), In(d_date_sk, [2451790,2451119,2451180,2451454,2450874,2450906,2450967,2451485,2451850,2451514,2451270,2451758,2451028,2451546,2450997,2450996,2451393,2451667,2451453,2451819,2450905,2451331,2451577,2451089,2451301,2451545,2451605,2451851,2451181,2451149,2451820,2451362,2451392,2451240,2450935,2451637,2451484,2451058,2451300,2451727,2451759,2450815,2451698,2451150,2451332,2451606,2451666,2451211,2450846,2450875,2450966,2450936,2451361,2451212,2451880,2451059,2451789,2451423,2451576,2450816,2451088,2451728,2451027,2451120,2451881,2451697,2450847,2451271,2451636,2451515,2451424,2451239]), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dom:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
Condition : (((((isnotnull(d_dom#8) AND (d_dom#8 >= 1)) AND (d_dom#8 <= 2)) AND d_year#7 IN (1998,1999,2000)) AND d_date_sk#6 INSET (2451790,2451119,2451180,2451454,2450874,2450906,2450967,2451485,2451850,2451514,2451270,2451758,2451028,2451546,2450997,2450996,2451393,2451667,2451453,2451819,2450905,2451331,2451577,2451089,2451301,2451545,2451605,2451851,2451181,2451149,2451820,2451362,2451392,2451240,2450935,2451637,2451484,2451058,2451300,2451727,2451759,2450815,2451698,2451150,2451332,2451606,2451666,2451211,2450846,2450875,2450966,2450936,2451361,2451212,2451880,2451059,2451789,2451423,2451576,2450816,2451088,2451728,2451027,2451120,2451881,2451697,2450847,2451271,2451636,2451515,2451424,2451239)) AND isnotnull(d_date_sk#6))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#6]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
(8) BroadcastExchange
Input [1]: [d_date_sk#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#9]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#6]
Join condition: None
(10) Project [codegen id : 4]
Output [4]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Input [6]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5, d_date_sk#6]
(11) Scan parquet default.store
Output [2]: [s_store_sk#10, s_county#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [In(s_county, [Fairfield County,Ziebach County,Bronx County,Barrow County]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_county:string>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [s_store_sk#10, s_county#11]
(13) Filter [codegen id : 2]
Input [2]: [s_store_sk#10, s_county#11]
Condition : (s_county#11 IN (Fairfield County,Ziebach County,Bronx County,Barrow County) AND isnotnull(s_store_sk#10))
(14) Project [codegen id : 2]
Output [1]: [s_store_sk#10]
Input [2]: [s_store_sk#10, s_county#11]
(15) BroadcastExchange
Input [1]: [s_store_sk#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#4]
Right keys [1]: [s_store_sk#10]
Join condition: None
(17) Project [codegen id : 4]
Output [3]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_ticket_number#5]
Input [5]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5, s_store_sk#10]
(18) Scan parquet default.household_demographics
Output [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/household_demographics]
PushedFilters: [IsNotNull(hd_vehicle_count), Or(EqualTo(hd_buy_potential,>10000),EqualTo(hd_buy_potential,Unknown)), GreaterThan(hd_vehicle_count,0), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_buy_potential:string,hd_dep_count:int,hd_vehicle_count:int>
(19) ColumnarToRow [codegen id : 3]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
(20) Filter [codegen id : 3]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
Condition : ((((isnotnull(hd_vehicle_count#16) AND ((hd_buy_potential#14 = >10000) OR (hd_buy_potential#14 = Unknown))) AND (hd_vehicle_count#16 > 0)) AND (CASE WHEN (hd_vehicle_count#16 > 0) THEN (cast(hd_dep_count#15 as double) / cast(hd_vehicle_count#16 as double)) ELSE null END > 1.0)) AND isnotnull(hd_demo_sk#13))
(21) Project [codegen id : 3]
Output [1]: [hd_demo_sk#13]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
(22) BroadcastExchange
Input [1]: [hd_demo_sk#13]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#17]
(23) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#13]
Join condition: None
(24) Project [codegen id : 4]
Output [2]: [ss_customer_sk#2, ss_ticket_number#5]
Input [4]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_ticket_number#5, hd_demo_sk#13]
(25) HashAggregate [codegen id : 4]
Input [2]: [ss_customer_sk#2, ss_ticket_number#5]
Keys [2]: [ss_ticket_number#5, ss_customer_sk#2]
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#18]
Results [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
(26) Exchange
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
Arguments: hashpartitioning(ss_ticket_number#5, ss_customer_sk#2, 5), true, [id=#20]
(27) HashAggregate [codegen id : 5]
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
Keys [2]: [ss_ticket_number#5, ss_customer_sk#2]
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#21]
Results [3]: [ss_ticket_number#5, ss_customer_sk#2, count(1)#21 AS cnt#22]
(28) Filter [codegen id : 5]
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22]
Condition : ((cnt#22 >= 1) AND (cnt#22 <= 5))
(29) BroadcastExchange
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, true] as bigint)),false), [id=#23]
(30) Scan parquet default.customer
Output [5]: [c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_salutation:string,c_first_name:string,c_last_name:string,c_preferred_cust_flag:string>
(31) ColumnarToRow
Input [5]: [c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
(32) Filter
Input [5]: [c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
Condition : isnotnull(c_customer_sk#24)
(33) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#24]
Join condition: None
(34) Project [codegen id : 6]
Output [6]: [c_last_name#27, c_first_name#26, c_salutation#25, c_preferred_cust_flag#28, ss_ticket_number#5, cnt#22]
Input [8]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22, c_customer_sk#24, c_salutation#25, c_first_name#26, c_last_name#27, c_preferred_cust_flag#28]
(35) Exchange
Input [6]: [c_last_name#27, c_first_name#26, c_salutation#25, c_preferred_cust_flag#28, ss_ticket_number#5, cnt#22]
Arguments: rangepartitioning(cnt#22 DESC NULLS LAST, 5), true, [id=#29]
(36) Sort [codegen id : 7]
Input [6]: [c_last_name#27, c_first_name#26, c_salutation#25, c_preferred_cust_flag#28, ss_ticket_number#5, cnt#22]
Arguments: [cnt#22 DESC NULLS LAST], true, 0

@ -0,0 +1,54 @@
WholeStageCodegen (7)
Sort [cnt]
InputAdapter
Exchange [cnt] #1
WholeStageCodegen (6)
Project [c_first_name,c_last_name,c_preferred_cust_flag,c_salutation,cnt,ss_ticket_number]
BroadcastHashJoin [c_customer_sk,ss_customer_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (5)
Filter [cnt]
HashAggregate [count,ss_customer_sk,ss_ticket_number] [cnt,count,count(1)]
InputAdapter
Exchange [ss_customer_sk,ss_ticket_number] #3
WholeStageCodegen (4)
HashAggregate [ss_customer_sk,ss_ticket_number] [count,count]
Project [ss_customer_sk,ss_ticket_number]
BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
Project [ss_customer_sk,ss_hdemo_sk,ss_ticket_number]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_customer_sk,ss_hdemo_sk,ss_store_sk,ss_ticket_number]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_dom,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_dom,d_year]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (2)
Project [s_store_sk]
Filter [s_county,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_county,s_store_sk]
InputAdapter
BroadcastExchange #6
WholeStageCodegen (3)
Project [hd_demo_sk]
Filter [hd_buy_potential,hd_demo_sk,hd_dep_count,hd_vehicle_count]
ColumnarToRow
InputAdapter
Scan parquet default.household_demographics [hd_buy_potential,hd_demo_sk,hd_dep_count,hd_vehicle_count]
Filter [c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_customer_sk,c_first_name,c_last_name,c_preferred_cust_flag,c_salutation]

@ -0,0 +1,203 @@
== Physical Plan ==
* Sort (36)
+- Exchange (35)
   +- * Project (34)
      +- * BroadcastHashJoin Inner BuildRight (33)
         :- * Filter (28)
         :  +- * HashAggregate (27)
         :     +- Exchange (26)
         :        +- * HashAggregate (25)
         :           +- * Project (24)
         :              +- * BroadcastHashJoin Inner BuildRight (23)
         :                 :- * Project (17)
         :                 :  +- * BroadcastHashJoin Inner BuildRight (16)
         :                 :     :- * Project (10)
         :                 :     :  +- * BroadcastHashJoin Inner BuildRight (9)
         :                 :     :     :- * Filter (3)
         :                 :     :     :  +- * ColumnarToRow (2)
         :                 :     :     :     +- Scan parquet default.store_sales (1)
         :                 :     :     +- BroadcastExchange (8)
         :                 :     :        +- * Project (7)
         :                 :     :           +- * Filter (6)
         :                 :     :              +- * ColumnarToRow (5)
         :                 :     :                 +- Scan parquet default.date_dim (4)
         :                 :     +- BroadcastExchange (15)
         :                 :        +- * Project (14)
         :                 :           +- * Filter (13)
         :                 :              +- * ColumnarToRow (12)
         :                 :                 +- Scan parquet default.store (11)
         :                 +- BroadcastExchange (22)
         :                    +- * Project (21)
         :                       +- * Filter (20)
         :                          +- * ColumnarToRow (19)
         :                             +- Scan parquet default.household_demographics (18)
         +- BroadcastExchange (32)
            +- * Filter (31)
               +- * ColumnarToRow (30)
                  +- Scan parquet default.customer (29)
(1) Scan parquet default.store_sales
Output [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [In(ss_sold_date_sk, [2451790,2451119,2451180,2451454,2450874,2450906,2450967,2451485,2451850,2451514,2451270,2451758,2451028,2451546,2450997,2450996,2451393,2451667,2451453,2451819,2450905,2451331,2451577,2451089,2451301,2451545,2451605,2451851,2451181,2451149,2451820,2451362,2451392,2451240,2450935,2451637,2451484,2451058,2451300,2451727,2451759,2450815,2451698,2451150,2451332,2451606,2451666,2451211,2450846,2450875,2450966,2450936,2451361,2451212,2451880,2451059,2451789,2451423,2451576,2450816,2451088,2451728,2451027,2451120,2451881,2451697,2450847,2451271,2451636,2451515,2451424,2451239]), IsNotNull(ss_sold_date_sk), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_store_sk:int,ss_ticket_number:int>
(2) ColumnarToRow [codegen id : 4]
Input [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
(3) Filter [codegen id : 4]
Input [5]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Condition : ((((ss_sold_date_sk#1 INSET (2451790,2451119,2451180,2451454,2450874,2450906,2450967,2451485,2451850,2451514,2451270,2451758,2451028,2451546,2450997,2450996,2451393,2451667,2451453,2451819,2450905,2451331,2451577,2451089,2451301,2451545,2451605,2451851,2451181,2451149,2451820,2451362,2451392,2451240,2450935,2451637,2451484,2451058,2451300,2451727,2451759,2450815,2451698,2451150,2451332,2451606,2451666,2451211,2450846,2450875,2450966,2450936,2451361,2451212,2451880,2451059,2451789,2451423,2451576,2450816,2451088,2451728,2451027,2451120,2451881,2451697,2450847,2451271,2451636,2451515,2451424,2451239) AND isnotnull(ss_sold_date_sk#1)) AND isnotnull(ss_store_sk#4)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#6, d_year#7, d_dom#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_dom), GreaterThanOrEqual(d_dom,1), LessThanOrEqual(d_dom,2), In(d_year, [1998,1999,2000]), In(d_date_sk, [2451790,2451119,2451180,2451454,2450874,2450906,2450967,2451485,2451850,2451514,2451270,2451758,2451028,2451546,2450997,2450996,2451393,2451667,2451453,2451819,2450905,2451331,2451577,2451089,2451301,2451545,2451605,2451851,2451181,2451149,2451820,2451362,2451392,2451240,2450935,2451637,2451484,2451058,2451300,2451727,2451759,2450815,2451698,2451150,2451332,2451606,2451666,2451211,2450846,2450875,2450966,2450936,2451361,2451212,2451880,2451059,2451789,2451423,2451576,2450816,2451088,2451728,2451027,2451120,2451881,2451697,2450847,2451271,2451636,2451515,2451424,2451239]), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dom:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
Condition : (((((isnotnull(d_dom#8) AND (d_dom#8 >= 1)) AND (d_dom#8 <= 2)) AND d_year#7 IN (1998,1999,2000)) AND d_date_sk#6 INSET (2451790,2451119,2451180,2451454,2450874,2450906,2450967,2451485,2451850,2451514,2451270,2451758,2451028,2451546,2450997,2450996,2451393,2451667,2451453,2451819,2450905,2451331,2451577,2451089,2451301,2451545,2451605,2451851,2451181,2451149,2451820,2451362,2451392,2451240,2450935,2451637,2451484,2451058,2451300,2451727,2451759,2450815,2451698,2451150,2451332,2451606,2451666,2451211,2450846,2450875,2450966,2450936,2451361,2451212,2451880,2451059,2451789,2451423,2451576,2450816,2451088,2451728,2451027,2451120,2451881,2451697,2450847,2451271,2451636,2451515,2451424,2451239)) AND isnotnull(d_date_sk#6))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#6]
Input [3]: [d_date_sk#6, d_year#7, d_dom#8]
(8) BroadcastExchange
Input [1]: [d_date_sk#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#9]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#6]
Join condition: None
(10) Project [codegen id : 4]
Output [4]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5]
Input [6]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5, d_date_sk#6]
(11) Scan parquet default.store
Output [2]: [s_store_sk#10, s_county#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [In(s_county, [Fairfield County,Ziebach County,Bronx County,Barrow County]), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_county:string>
(12) ColumnarToRow [codegen id : 2]
Input [2]: [s_store_sk#10, s_county#11]
(13) Filter [codegen id : 2]
Input [2]: [s_store_sk#10, s_county#11]
Condition : (s_county#11 IN (Fairfield County,Ziebach County,Bronx County,Barrow County) AND isnotnull(s_store_sk#10))
(14) Project [codegen id : 2]
Output [1]: [s_store_sk#10]
Input [2]: [s_store_sk#10, s_county#11]
(15) BroadcastExchange
Input [1]: [s_store_sk#10]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#4]
Right keys [1]: [s_store_sk#10]
Join condition: None
(17) Project [codegen id : 4]
Output [3]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_ticket_number#5]
Input [5]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_store_sk#4, ss_ticket_number#5, s_store_sk#10]
(18) Scan parquet default.household_demographics
Output [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/household_demographics]
PushedFilters: [IsNotNull(hd_vehicle_count), Or(EqualTo(hd_buy_potential,>10000),EqualTo(hd_buy_potential,Unknown)), GreaterThan(hd_vehicle_count,0), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_buy_potential:string,hd_dep_count:int,hd_vehicle_count:int>
(19) ColumnarToRow [codegen id : 3]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
(20) Filter [codegen id : 3]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
Condition : ((((isnotnull(hd_vehicle_count#16) AND ((hd_buy_potential#14 = >10000) OR (hd_buy_potential#14 = Unknown))) AND (hd_vehicle_count#16 > 0)) AND (CASE WHEN (hd_vehicle_count#16 > 0) THEN (cast(hd_dep_count#15 as double) / cast(hd_vehicle_count#16 as double)) ELSE null END > 1.0)) AND isnotnull(hd_demo_sk#13))
(21) Project [codegen id : 3]
Output [1]: [hd_demo_sk#13]
Input [4]: [hd_demo_sk#13, hd_buy_potential#14, hd_dep_count#15, hd_vehicle_count#16]
(22) BroadcastExchange
Input [1]: [hd_demo_sk#13]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#17]
(23) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#13]
Join condition: None
(24) Project [codegen id : 4]
Output [2]: [ss_customer_sk#2, ss_ticket_number#5]
Input [4]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_ticket_number#5, hd_demo_sk#13]
(25) HashAggregate [codegen id : 4]
Input [2]: [ss_customer_sk#2, ss_ticket_number#5]
Keys [2]: [ss_ticket_number#5, ss_customer_sk#2]
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#18]
Results [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
(26) Exchange
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
Arguments: hashpartitioning(ss_ticket_number#5, ss_customer_sk#2, 5), true, [id=#20]
(27) HashAggregate [codegen id : 6]
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, count#19]
Keys [2]: [ss_ticket_number#5, ss_customer_sk#2]
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#21]
Results [3]: [ss_ticket_number#5, ss_customer_sk#2, count(1)#21 AS cnt#22]
(28) Filter [codegen id : 6]
Input [3]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22]
Condition : ((cnt#22 >= 1) AND (cnt#22 <= 5))
(29) Scan parquet default.customer
Output [5]: [c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_salutation:string,c_first_name:string,c_last_name:string,c_preferred_cust_flag:string>
(30) ColumnarToRow [codegen id : 5]
Input [5]: [c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
(31) Filter [codegen id : 5]
Input [5]: [c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
Condition : isnotnull(c_customer_sk#23)
(32) BroadcastExchange
Input [5]: [c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#28]
(33) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#23]
Join condition: None
(34) Project [codegen id : 6]
Output [6]: [c_last_name#26, c_first_name#25, c_salutation#24, c_preferred_cust_flag#27, ss_ticket_number#5, cnt#22]
Input [8]: [ss_ticket_number#5, ss_customer_sk#2, cnt#22, c_customer_sk#23, c_salutation#24, c_first_name#25, c_last_name#26, c_preferred_cust_flag#27]
(35) Exchange
Input [6]: [c_last_name#26, c_first_name#25, c_salutation#24, c_preferred_cust_flag#27, ss_ticket_number#5, cnt#22]
Arguments: rangepartitioning(cnt#22 DESC NULLS LAST, 5), true, [id=#29]
(36) Sort [codegen id : 7]
Input [6]: [c_last_name#26, c_first_name#25, c_salutation#24, c_preferred_cust_flag#27, ss_ticket_number#5, cnt#22]
Arguments: [cnt#22 DESC NULLS LAST], true, 0

@ -0,0 +1,54 @@
WholeStageCodegen (7)
Sort [cnt]
InputAdapter
Exchange [cnt] #1
WholeStageCodegen (6)
Project [c_first_name,c_last_name,c_preferred_cust_flag,c_salutation,cnt,ss_ticket_number]
BroadcastHashJoin [c_customer_sk,ss_customer_sk]
Filter [cnt]
HashAggregate [count,ss_customer_sk,ss_ticket_number] [cnt,count,count(1)]
InputAdapter
Exchange [ss_customer_sk,ss_ticket_number] #2
WholeStageCodegen (4)
HashAggregate [ss_customer_sk,ss_ticket_number] [count,count]
Project [ss_customer_sk,ss_ticket_number]
BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
Project [ss_customer_sk,ss_hdemo_sk,ss_ticket_number]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_customer_sk,ss_hdemo_sk,ss_store_sk,ss_ticket_number]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_dom,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_dom,d_year]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (2)
Project [s_store_sk]
Filter [s_county,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_county,s_store_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Project [hd_demo_sk]
Filter [hd_buy_potential,hd_demo_sk,hd_dep_count,hd_vehicle_count]
ColumnarToRow
InputAdapter
Scan parquet default.household_demographics [hd_buy_potential,hd_demo_sk,hd_dep_count,hd_vehicle_count]
InputAdapter
BroadcastExchange #6
WholeStageCodegen (5)
Filter [c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_customer_sk,c_first_name,c_last_name,c_preferred_cust_flag,c_salutation]

@ -0,0 +1,208 @@
== Physical Plan ==
TakeOrderedAndProject (37)
+- * Project (36)
+- * SortMergeJoin Inner (35)
:- * Sort (29)
: +- Exchange (28)
: +- * HashAggregate (27)
: +- Exchange (26)
: +- * HashAggregate (25)
: +- * Project (24)
: +- * BroadcastHashJoin Inner BuildRight (23)
: :- * Project (17)
: : +- * BroadcastHashJoin Inner BuildRight (16)
: : :- * Project (10)
: : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_sales (1)
: : : +- BroadcastExchange (8)
: : : +- * Project (7)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.date_dim (4)
: : +- BroadcastExchange (15)
: : +- * Project (14)
: : +- * Filter (13)
: : +- * ColumnarToRow (12)
: : +- Scan parquet default.household_demographics (11)
: +- BroadcastExchange (22)
: +- * Project (21)
: +- * Filter (20)
: +- * ColumnarToRow (19)
: +- Scan parquet default.store (18)
+- * Sort (34)
+- Exchange (33)
+- * Filter (32)
+- * ColumnarToRow (31)
+- Scan parquet default.customer (30)
(1) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2450819), LessThanOrEqual(ss_sold_date_sk,2451904), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_addr_sk:int,ss_store_sk:int,ss_ticket_number:int,ss_coupon_amt:decimal(7,2),ss_net_profit:decimal(7,2)>
(2) ColumnarToRow [codegen id : 4]
Input [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
(3) Filter [codegen id : 4]
Input [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Condition : (((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2450819)) AND (ss_sold_date_sk#1 <= 2451904)) AND isnotnull(ss_store_sk#5)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#9, d_year#10, d_dow#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_dow), EqualTo(d_dow,1), In(d_year, [1998,1999,2000]), GreaterThanOrEqual(d_date_sk,2450819), LessThanOrEqual(d_date_sk,2451904), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dow:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
Condition : (((((isnotnull(d_dow#11) AND (d_dow#11 = 1)) AND d_year#10 IN (1998,1999,2000)) AND (d_date_sk#9 >= 2450819)) AND (d_date_sk#9 <= 2451904)) AND isnotnull(d_date_sk#9))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#9]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
(8) BroadcastExchange
Input [1]: [d_date_sk#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#9]
Join condition: None
(10) Project [codegen id : 4]
Output [7]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Input [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, d_date_sk#9]
(11) Scan parquet default.household_demographics
Output [3]: [hd_demo_sk#13, hd_dep_count#14, hd_vehicle_count#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/household_demographics]
PushedFilters: [Or(EqualTo(hd_dep_count,8),GreaterThan(hd_vehicle_count,0)), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_dep_count:int,hd_vehicle_count:int>
(12) ColumnarToRow [codegen id : 2]
Input [3]: [hd_demo_sk#13, hd_dep_count#14, hd_vehicle_count#15]
(13) Filter [codegen id : 2]
Input [3]: [hd_demo_sk#13, hd_dep_count#14, hd_vehicle_count#15]
Condition : (((hd_dep_count#14 = 8) OR (hd_vehicle_count#15 > 0)) AND isnotnull(hd_demo_sk#13))
(14) Project [codegen id : 2]
Output [1]: [hd_demo_sk#13]
Input [3]: [hd_demo_sk#13, hd_dep_count#14, hd_vehicle_count#15]
(15) BroadcastExchange
Input [1]: [hd_demo_sk#13]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#16]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#13]
Join condition: None
(17) Project [codegen id : 4]
Output [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Input [8]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, hd_demo_sk#13]
(18) Scan parquet default.store
Output [3]: [s_store_sk#17, s_number_employees#18, s_city#19]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_number_employees), GreaterThanOrEqual(s_number_employees,200), LessThanOrEqual(s_number_employees,295), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_number_employees:int,s_city:string>
(19) ColumnarToRow [codegen id : 3]
Input [3]: [s_store_sk#17, s_number_employees#18, s_city#19]
(20) Filter [codegen id : 3]
Input [3]: [s_store_sk#17, s_number_employees#18, s_city#19]
Condition : (((isnotnull(s_number_employees#18) AND (s_number_employees#18 >= 200)) AND (s_number_employees#18 <= 295)) AND isnotnull(s_store_sk#17))
(21) Project [codegen id : 3]
Output [2]: [s_store_sk#17, s_city#19]
Input [3]: [s_store_sk#17, s_number_employees#18, s_city#19]
(22) BroadcastExchange
Input [2]: [s_store_sk#17, s_city#19]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#20]
(23) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#5]
Right keys [1]: [s_store_sk#17]
Join condition: None
(24) Project [codegen id : 4]
Output [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_city#19]
Input [8]: [ss_customer_sk#2, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_store_sk#17, s_city#19]
(25) HashAggregate [codegen id : 4]
Input [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_city#19]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#19]
Functions [2]: [partial_sum(UnscaledValue(ss_coupon_amt#7)), partial_sum(UnscaledValue(ss_net_profit#8))]
Aggregate Attributes [2]: [sum#21, sum#22]
Results [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#19, sum#23, sum#24]
(26) Exchange
Input [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#19, sum#23, sum#24]
Arguments: hashpartitioning(ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#19, 5), true, [id=#25]
(27) HashAggregate [codegen id : 5]
Input [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#19, sum#23, sum#24]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#19]
Functions [2]: [sum(UnscaledValue(ss_coupon_amt#7)), sum(UnscaledValue(ss_net_profit#8))]
Aggregate Attributes [2]: [sum(UnscaledValue(ss_coupon_amt#7))#26, sum(UnscaledValue(ss_net_profit#8))#27]
Results [5]: [ss_ticket_number#6, ss_customer_sk#2, s_city#19, MakeDecimal(sum(UnscaledValue(ss_coupon_amt#7))#26,17,2) AS amt#28, MakeDecimal(sum(UnscaledValue(ss_net_profit#8))#27,17,2) AS profit#29]
(28) Exchange
Input [5]: [ss_ticket_number#6, ss_customer_sk#2, s_city#19, amt#28, profit#29]
Arguments: hashpartitioning(ss_customer_sk#2, 5), true, [id=#30]
(29) Sort [codegen id : 6]
Input [5]: [ss_ticket_number#6, ss_customer_sk#2, s_city#19, amt#28, profit#29]
Arguments: [ss_customer_sk#2 ASC NULLS FIRST], false, 0
(30) Scan parquet default.customer
Output [3]: [c_customer_sk#31, c_first_name#32, c_last_name#33]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_first_name:string,c_last_name:string>
(31) ColumnarToRow [codegen id : 7]
Input [3]: [c_customer_sk#31, c_first_name#32, c_last_name#33]
(32) Filter [codegen id : 7]
Input [3]: [c_customer_sk#31, c_first_name#32, c_last_name#33]
Condition : isnotnull(c_customer_sk#31)
(33) Exchange
Input [3]: [c_customer_sk#31, c_first_name#32, c_last_name#33]
Arguments: hashpartitioning(c_customer_sk#31, 5), true, [id=#34]
(34) Sort [codegen id : 8]
Input [3]: [c_customer_sk#31, c_first_name#32, c_last_name#33]
Arguments: [c_customer_sk#31 ASC NULLS FIRST], false, 0
(35) SortMergeJoin [codegen id : 9]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#31]
Join condition: None
(36) Project [codegen id : 9]
Output [7]: [c_last_name#33, c_first_name#32, substr(s_city#19, 1, 30) AS substr(s_city, 1, 30)#35, ss_ticket_number#6, amt#28, profit#29, s_city#19]
Input [8]: [ss_ticket_number#6, ss_customer_sk#2, s_city#19, amt#28, profit#29, c_customer_sk#31, c_first_name#32, c_last_name#33]
(37) TakeOrderedAndProject
Input [7]: [c_last_name#33, c_first_name#32, substr(s_city, 1, 30)#35, ss_ticket_number#6, amt#28, profit#29, s_city#19]
Arguments: 100, [c_last_name#33 ASC NULLS FIRST, c_first_name#32 ASC NULLS FIRST, substr(s_city#19, 1, 30) ASC NULLS FIRST, profit#29 ASC NULLS FIRST], [c_last_name#33, c_first_name#32, substr(s_city, 1, 30)#35, ss_ticket_number#6, amt#28, profit#29]

@ -0,0 +1,59 @@
TakeOrderedAndProject [amt,c_first_name,c_last_name,profit,s_city,ss_ticket_number,substr(s_city, 1, 30)]
WholeStageCodegen (9)
Project [amt,c_first_name,c_last_name,profit,s_city,ss_ticket_number]
SortMergeJoin [c_customer_sk,ss_customer_sk]
InputAdapter
WholeStageCodegen (6)
Sort [ss_customer_sk]
InputAdapter
Exchange [ss_customer_sk] #1
WholeStageCodegen (5)
HashAggregate [s_city,ss_addr_sk,ss_customer_sk,ss_ticket_number,sum,sum] [amt,profit,sum,sum,sum(UnscaledValue(ss_coupon_amt)),sum(UnscaledValue(ss_net_profit))]
InputAdapter
Exchange [s_city,ss_addr_sk,ss_customer_sk,ss_ticket_number] #2
WholeStageCodegen (4)
HashAggregate [s_city,ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number] [sum,sum,sum,sum]
Project [s_city,ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_store_sk,ss_ticket_number]
BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
Project [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_store_sk,ss_ticket_number]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_dow,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_dow,d_year]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (2)
Project [hd_demo_sk]
Filter [hd_demo_sk,hd_dep_count,hd_vehicle_count]
ColumnarToRow
InputAdapter
Scan parquet default.household_demographics [hd_demo_sk,hd_dep_count,hd_vehicle_count]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Project [s_city,s_store_sk]
Filter [s_number_employees,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_city,s_number_employees,s_store_sk]
InputAdapter
WholeStageCodegen (8)
Sort [c_customer_sk]
InputAdapter
Exchange [c_customer_sk] #6
WholeStageCodegen (7)
Filter [c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_customer_sk,c_first_name,c_last_name]

@ -0,0 +1,193 @@
== Physical Plan ==
TakeOrderedAndProject (34)
+- * Project (33)
+- * BroadcastHashJoin Inner BuildRight (32)
:- * HashAggregate (27)
: +- Exchange (26)
: +- * HashAggregate (25)
: +- * Project (24)
: +- * BroadcastHashJoin Inner BuildRight (23)
: :- * Project (17)
: : +- * BroadcastHashJoin Inner BuildRight (16)
: : :- * Project (10)
: : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_sales (1)
: : : +- BroadcastExchange (8)
: : : +- * Project (7)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.date_dim (4)
: : +- BroadcastExchange (15)
: : +- * Project (14)
: : +- * Filter (13)
: : +- * ColumnarToRow (12)
: : +- Scan parquet default.store (11)
: +- BroadcastExchange (22)
: +- * Project (21)
: +- * Filter (20)
: +- * ColumnarToRow (19)
: +- Scan parquet default.household_demographics (18)
+- BroadcastExchange (31)
+- * Filter (30)
+- * ColumnarToRow (29)
+- Scan parquet default.customer (28)
(1) Scan parquet default.store_sales
Output [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2450819), LessThanOrEqual(ss_sold_date_sk,2451904), IsNotNull(ss_store_sk), IsNotNull(ss_hdemo_sk), IsNotNull(ss_customer_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_hdemo_sk:int,ss_addr_sk:int,ss_store_sk:int,ss_ticket_number:int,ss_coupon_amt:decimal(7,2),ss_net_profit:decimal(7,2)>
(2) ColumnarToRow [codegen id : 4]
Input [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
(3) Filter [codegen id : 4]
Input [8]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Condition : (((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2450819)) AND (ss_sold_date_sk#1 <= 2451904)) AND isnotnull(ss_store_sk#5)) AND isnotnull(ss_hdemo_sk#3)) AND isnotnull(ss_customer_sk#2))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#9, d_year#10, d_dow#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_dow), EqualTo(d_dow,1), In(d_year, [1998,1999,2000]), GreaterThanOrEqual(d_date_sk,2450819), LessThanOrEqual(d_date_sk,2451904), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_dow:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
Condition : (((((isnotnull(d_dow#11) AND (d_dow#11 = 1)) AND d_year#10 IN (1998,1999,2000)) AND (d_date_sk#9 >= 2450819)) AND (d_date_sk#9 <= 2451904)) AND isnotnull(d_date_sk#9))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#9]
Input [3]: [d_date_sk#9, d_year#10, d_dow#11]
(8) BroadcastExchange
Input [1]: [d_date_sk#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#9]
Join condition: None
(10) Project [codegen id : 4]
Output [7]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8]
Input [9]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, d_date_sk#9]
(11) Scan parquet default.store
Output [3]: [s_store_sk#13, s_number_employees#14, s_city#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_number_employees), GreaterThanOrEqual(s_number_employees,200), LessThanOrEqual(s_number_employees,295), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_number_employees:int,s_city:string>
(12) ColumnarToRow [codegen id : 2]
Input [3]: [s_store_sk#13, s_number_employees#14, s_city#15]
(13) Filter [codegen id : 2]
Input [3]: [s_store_sk#13, s_number_employees#14, s_city#15]
Condition : (((isnotnull(s_number_employees#14) AND (s_number_employees#14 >= 200)) AND (s_number_employees#14 <= 295)) AND isnotnull(s_store_sk#13))
(14) Project [codegen id : 2]
Output [2]: [s_store_sk#13, s_city#15]
Input [3]: [s_store_sk#13, s_number_employees#14, s_city#15]
(15) BroadcastExchange
Input [2]: [s_store_sk#13, s_city#15]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#16]
(16) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#5]
Right keys [1]: [s_store_sk#13]
Join condition: None
(17) Project [codegen id : 4]
Output [7]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_city#15]
Input [9]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_store_sk#5, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_store_sk#13, s_city#15]
(18) Scan parquet default.household_demographics
Output [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/household_demographics]
PushedFilters: [Or(EqualTo(hd_dep_count,8),GreaterThan(hd_vehicle_count,0)), IsNotNull(hd_demo_sk)]
ReadSchema: struct<hd_demo_sk:int,hd_dep_count:int,hd_vehicle_count:int>
(19) ColumnarToRow [codegen id : 3]
Input [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
(20) Filter [codegen id : 3]
Input [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
Condition : (((hd_dep_count#18 = 8) OR (hd_vehicle_count#19 > 0)) AND isnotnull(hd_demo_sk#17))
(21) Project [codegen id : 3]
Output [1]: [hd_demo_sk#17]
Input [3]: [hd_demo_sk#17, hd_dep_count#18, hd_vehicle_count#19]
(22) BroadcastExchange
Input [1]: [hd_demo_sk#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#20]
(23) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_hdemo_sk#3]
Right keys [1]: [hd_demo_sk#17]
Join condition: None
(24) Project [codegen id : 4]
Output [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_city#15]
Input [8]: [ss_customer_sk#2, ss_hdemo_sk#3, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_city#15, hd_demo_sk#17]
(25) HashAggregate [codegen id : 4]
Input [6]: [ss_customer_sk#2, ss_addr_sk#4, ss_ticket_number#6, ss_coupon_amt#7, ss_net_profit#8, s_city#15]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#15]
Functions [2]: [partial_sum(UnscaledValue(ss_coupon_amt#7)), partial_sum(UnscaledValue(ss_net_profit#8))]
Aggregate Attributes [2]: [sum#21, sum#22]
Results [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#15, sum#23, sum#24]
(26) Exchange
Input [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#15, sum#23, sum#24]
Arguments: hashpartitioning(ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#15, 5), true, [id=#25]
(27) HashAggregate [codegen id : 6]
Input [6]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#15, sum#23, sum#24]
Keys [4]: [ss_ticket_number#6, ss_customer_sk#2, ss_addr_sk#4, s_city#15]
Functions [2]: [sum(UnscaledValue(ss_coupon_amt#7)), sum(UnscaledValue(ss_net_profit#8))]
Aggregate Attributes [2]: [sum(UnscaledValue(ss_coupon_amt#7))#26, sum(UnscaledValue(ss_net_profit#8))#27]
Results [5]: [ss_ticket_number#6, ss_customer_sk#2, s_city#15, MakeDecimal(sum(UnscaledValue(ss_coupon_amt#7))#26,17,2) AS amt#28, MakeDecimal(sum(UnscaledValue(ss_net_profit#8))#27,17,2) AS profit#29]
(28) Scan parquet default.customer
Output [3]: [c_customer_sk#30, c_first_name#31, c_last_name#32]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_first_name:string,c_last_name:string>
(29) ColumnarToRow [codegen id : 5]
Input [3]: [c_customer_sk#30, c_first_name#31, c_last_name#32]
(30) Filter [codegen id : 5]
Input [3]: [c_customer_sk#30, c_first_name#31, c_last_name#32]
Condition : isnotnull(c_customer_sk#30)
(31) BroadcastExchange
Input [3]: [c_customer_sk#30, c_first_name#31, c_last_name#32]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#33]
(32) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#30]
Join condition: None
(33) Project [codegen id : 6]
Output [7]: [c_last_name#32, c_first_name#31, substr(s_city#15, 1, 30) AS substr(s_city, 1, 30)#34, ss_ticket_number#6, amt#28, profit#29, s_city#15]
Input [8]: [ss_ticket_number#6, ss_customer_sk#2, s_city#15, amt#28, profit#29, c_customer_sk#30, c_first_name#31, c_last_name#32]
(34) TakeOrderedAndProject
Input [7]: [c_last_name#32, c_first_name#31, substr(s_city, 1, 30)#34, ss_ticket_number#6, amt#28, profit#29, s_city#15]
Arguments: 100, [c_last_name#32 ASC NULLS FIRST, c_first_name#31 ASC NULLS FIRST, substr(s_city#15, 1, 30) ASC NULLS FIRST, profit#29 ASC NULLS FIRST], [c_last_name#32, c_first_name#31, substr(s_city, 1, 30)#34, ss_ticket_number#6, amt#28, profit#29]

@ -0,0 +1,50 @@
TakeOrderedAndProject [amt,c_first_name,c_last_name,profit,s_city,ss_ticket_number,substr(s_city, 1, 30)]
WholeStageCodegen (6)
Project [amt,c_first_name,c_last_name,profit,s_city,ss_ticket_number]
BroadcastHashJoin [c_customer_sk,ss_customer_sk]
HashAggregate [s_city,ss_addr_sk,ss_customer_sk,ss_ticket_number,sum,sum] [amt,profit,sum,sum,sum(UnscaledValue(ss_coupon_amt)),sum(UnscaledValue(ss_net_profit))]
InputAdapter
Exchange [s_city,ss_addr_sk,ss_customer_sk,ss_ticket_number] #1
WholeStageCodegen (4)
HashAggregate [s_city,ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number] [sum,sum,sum,sum]
Project [s_city,ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_net_profit,ss_ticket_number]
BroadcastHashJoin [hd_demo_sk,ss_hdemo_sk]
Project [s_city,ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_ticket_number]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_store_sk,ss_ticket_number]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_customer_sk,ss_hdemo_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_addr_sk,ss_coupon_amt,ss_customer_sk,ss_hdemo_sk,ss_net_profit,ss_sold_date_sk,ss_store_sk,ss_ticket_number]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_dow,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_dow,d_year]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (2)
Project [s_city,s_store_sk]
Filter [s_number_employees,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_city,s_number_employees,s_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (3)
Project [hd_demo_sk]
Filter [hd_demo_sk,hd_dep_count,hd_vehicle_count]
ColumnarToRow
InputAdapter
Scan parquet default.household_demographics [hd_demo_sk,hd_dep_count,hd_vehicle_count]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (5)
Filter [c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_customer_sk,c_first_name,c_last_name]

@ -0,0 +1,175 @@
== Physical Plan ==
TakeOrderedAndProject (31)
+- * Project (30)
+- * Filter (29)
+- Window (28)
+- * Sort (27)
+- Exchange (26)
+- * HashAggregate (25)
+- Exchange (24)
+- * HashAggregate (23)
+- * Project (22)
+- * BroadcastHashJoin Inner BuildRight (21)
:- * Project (16)
: +- * BroadcastHashJoin Inner BuildRight (15)
: :- * Project (10)
: : +- * BroadcastHashJoin Inner BuildRight (9)
: : :- * Filter (3)
: : : +- * ColumnarToRow (2)
: : : +- Scan parquet default.store_sales (1)
: : +- BroadcastExchange (8)
: : +- * Project (7)
: : +- * Filter (6)
: : +- * ColumnarToRow (5)
: : +- Scan parquet default.date_dim (4)
: +- BroadcastExchange (14)
: +- * Filter (13)
: +- * ColumnarToRow (12)
: +- Scan parquet default.store (11)
+- BroadcastExchange (20)
+- * Filter (19)
+- * ColumnarToRow (18)
+- Scan parquet default.item (17)
(1) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#3, ss_sales_price#4]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451545), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_item_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(2) ColumnarToRow [codegen id : 4]
Input [4]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#3, ss_sales_price#4]
(3) Filter [codegen id : 4]
Input [4]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#3, ss_sales_price#4]
Condition : ((((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2451545)) AND (ss_sold_date_sk#1 <= 2451910)) AND isnotnull(ss_item_sk#2)) AND isnotnull(ss_store_sk#3))
(4) Scan parquet default.date_dim
Output [3]: [d_date_sk#5, d_year#6, d_moy#7]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2000), LessThanOrEqual(d_date_sk,2451910), GreaterThanOrEqual(d_date_sk,2451545), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(5) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#5, d_year#6, d_moy#7]
(6) Filter [codegen id : 1]
Input [3]: [d_date_sk#5, d_year#6, d_moy#7]
Condition : ((((isnotnull(d_year#6) AND (d_year#6 = 2000)) AND (d_date_sk#5 <= 2451910)) AND (d_date_sk#5 >= 2451545)) AND isnotnull(d_date_sk#5))
(7) Project [codegen id : 1]
Output [2]: [d_date_sk#5, d_moy#7]
Input [3]: [d_date_sk#5, d_year#6, d_moy#7]
(8) BroadcastExchange
Input [2]: [d_date_sk#5, d_moy#7]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#8]
(9) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#5]
Join condition: None
(10) Project [codegen id : 4]
Output [4]: [ss_item_sk#2, ss_store_sk#3, ss_sales_price#4, d_moy#7]
Input [6]: [ss_sold_date_sk#1, ss_item_sk#2, ss_store_sk#3, ss_sales_price#4, d_date_sk#5, d_moy#7]
(11) Scan parquet default.store
Output [3]: [s_store_sk#9, s_store_name#10, s_company_name#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_store_name:string,s_company_name:string>
(12) ColumnarToRow [codegen id : 2]
Input [3]: [s_store_sk#9, s_store_name#10, s_company_name#11]
(13) Filter [codegen id : 2]
Input [3]: [s_store_sk#9, s_store_name#10, s_company_name#11]
Condition : isnotnull(s_store_sk#9)
(14) BroadcastExchange
Input [3]: [s_store_sk#9, s_store_name#10, s_company_name#11]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#12]
(15) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#3]
Right keys [1]: [s_store_sk#9]
Join condition: None
(16) Project [codegen id : 4]
Output [5]: [ss_item_sk#2, ss_sales_price#4, d_moy#7, s_store_name#10, s_company_name#11]
Input [7]: [ss_item_sk#2, ss_store_sk#3, ss_sales_price#4, d_moy#7, s_store_sk#9, s_store_name#10, s_company_name#11]
(17) Scan parquet default.item
Output [4]: [i_item_sk#13, i_brand#14, i_class#15, i_category#16]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [Or(And(In(i_category, [Home,Books,Electronics]),In(i_class, [wallpaper,parenting,musical])),And(In(i_category, [Shoes,Jewelry,Men]),In(i_class, [womens,birdal,pants]))), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand:string,i_class:string,i_category:string>
(18) ColumnarToRow [codegen id : 3]
Input [4]: [i_item_sk#13, i_brand#14, i_class#15, i_category#16]
(19) Filter [codegen id : 3]
Input [4]: [i_item_sk#13, i_brand#14, i_class#15, i_category#16]
Condition : (((i_category#16 IN (Home,Books,Electronics) AND i_class#15 IN (wallpaper,parenting,musical)) OR (i_category#16 IN (Shoes,Jewelry,Men) AND i_class#15 IN (womens,birdal,pants))) AND isnotnull(i_item_sk#13))
(20) BroadcastExchange
Input [4]: [i_item_sk#13, i_brand#14, i_class#15, i_category#16]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#17]
(21) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_item_sk#2]
Right keys [1]: [i_item_sk#13]
Join condition: None
(22) Project [codegen id : 4]
Output [7]: [i_brand#14, i_class#15, i_category#16, ss_sales_price#4, d_moy#7, s_store_name#10, s_company_name#11]
Input [9]: [ss_item_sk#2, ss_sales_price#4, d_moy#7, s_store_name#10, s_company_name#11, i_item_sk#13, i_brand#14, i_class#15, i_category#16]
(23) HashAggregate [codegen id : 4]
Input [7]: [i_brand#14, i_class#15, i_category#16, ss_sales_price#4, d_moy#7, s_store_name#10, s_company_name#11]
Keys [6]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#4))]
Aggregate Attributes [1]: [sum#18]
Results [7]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum#19]
(24) Exchange
Input [7]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum#19]
Arguments: hashpartitioning(i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, 5), true, [id=#20]
(25) HashAggregate [codegen id : 5]
Input [7]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum#19]
Keys [6]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7]
Functions [1]: [sum(UnscaledValue(ss_sales_price#4))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#4))#21]
Results [8]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, MakeDecimal(sum(UnscaledValue(ss_sales_price#4))#21,17,2) AS sum_sales#22, MakeDecimal(sum(UnscaledValue(ss_sales_price#4))#21,17,2) AS _w0#23]
(26) Exchange
Input [8]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum_sales#22, _w0#23]
Arguments: hashpartitioning(i_category#16, i_brand#14, s_store_name#10, s_company_name#11, 5), true, [id=#24]
(27) Sort [codegen id : 6]
Input [8]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum_sales#22, _w0#23]
Arguments: [i_category#16 ASC NULLS FIRST, i_brand#14 ASC NULLS FIRST, s_store_name#10 ASC NULLS FIRST, s_company_name#11 ASC NULLS FIRST], false, 0
(28) Window
Input [8]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum_sales#22, _w0#23]
Arguments: [avg(_w0#23) windowspecdefinition(i_category#16, i_brand#14, s_store_name#10, s_company_name#11, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS avg_monthly_sales#25], [i_category#16, i_brand#14, s_store_name#10, s_company_name#11]
(29) Filter [codegen id : 7]
Input [9]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum_sales#22, _w0#23, avg_monthly_sales#25]
Condition : (CASE WHEN NOT (avg_monthly_sales#25 = 0.000000) THEN CheckOverflow((promote_precision(abs(CheckOverflow((promote_precision(cast(sum_sales#22 as decimal(22,6))) - promote_precision(cast(avg_monthly_sales#25 as decimal(22,6)))), DecimalType(22,6), true))) / promote_precision(cast(avg_monthly_sales#25 as decimal(22,6)))), DecimalType(38,16), true) ELSE null END > 0.1000000000000000)
(30) Project [codegen id : 7]
Output [8]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum_sales#22, avg_monthly_sales#25]
Input [9]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum_sales#22, _w0#23, avg_monthly_sales#25]
(31) TakeOrderedAndProject
Input [8]: [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum_sales#22, avg_monthly_sales#25]
Arguments: 100, [CheckOverflow((promote_precision(cast(sum_sales#22 as decimal(22,6))) - promote_precision(cast(avg_monthly_sales#25 as decimal(22,6)))), DecimalType(22,6), true) ASC NULLS FIRST, s_store_name#10 ASC NULLS FIRST], [i_category#16, i_class#15, i_brand#14, s_store_name#10, s_company_name#11, d_moy#7, sum_sales#22, avg_monthly_sales#25]

@ -0,0 +1,48 @@
TakeOrderedAndProject [avg_monthly_sales,d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,sum_sales]
WholeStageCodegen (7)
Project [avg_monthly_sales,d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,sum_sales]
Filter [avg_monthly_sales,sum_sales]
InputAdapter
Window [_w0,i_brand,i_category,s_company_name,s_store_name]
WholeStageCodegen (6)
Sort [i_brand,i_category,s_company_name,s_store_name]
InputAdapter
Exchange [i_brand,i_category,s_company_name,s_store_name] #1
WholeStageCodegen (5)
HashAggregate [d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,sum] [_w0,sum,sum(UnscaledValue(ss_sales_price)),sum_sales]
InputAdapter
Exchange [d_moy,i_brand,i_category,i_class,s_company_name,s_store_name] #2
WholeStageCodegen (4)
HashAggregate [d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,ss_sales_price] [sum,sum]
Project [d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,ss_sales_price]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Project [d_moy,s_company_name,s_store_name,ss_item_sk,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [d_moy,ss_item_sk,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Project [d_date_sk,d_moy]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (2)
Filter [s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_company_name,s_store_name,s_store_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Filter [i_category,i_class,i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_category,i_class,i_item_sk]

@ -0,0 +1,175 @@
== Physical Plan ==
TakeOrderedAndProject (31)
+- * Project (30)
+- * Filter (29)
+- Window (28)
+- * Sort (27)
+- Exchange (26)
+- * HashAggregate (25)
+- Exchange (24)
+- * HashAggregate (23)
+- * Project (22)
+- * BroadcastHashJoin Inner BuildRight (21)
:- * Project (16)
: +- * BroadcastHashJoin Inner BuildRight (15)
: :- * Project (9)
: : +- * BroadcastHashJoin Inner BuildRight (8)
: : :- * Filter (3)
: : : +- * ColumnarToRow (2)
: : : +- Scan parquet default.item (1)
: : +- BroadcastExchange (7)
: : +- * Filter (6)
: : +- * ColumnarToRow (5)
: : +- Scan parquet default.store_sales (4)
: +- BroadcastExchange (14)
: +- * Project (13)
: +- * Filter (12)
: +- * ColumnarToRow (11)
: +- Scan parquet default.date_dim (10)
+- BroadcastExchange (20)
+- * Filter (19)
+- * ColumnarToRow (18)
+- Scan parquet default.store (17)
(1) Scan parquet default.item
Output [4]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [Or(And(In(i_category, [Home,Books,Electronics]),In(i_class, [wallpaper,parenting,musical])),And(In(i_category, [Shoes,Jewelry,Men]),In(i_class, [womens,birdal,pants]))), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_brand:string,i_class:string,i_category:string>
(2) ColumnarToRow [codegen id : 4]
Input [4]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4]
(3) Filter [codegen id : 4]
Input [4]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4]
Condition : (((i_category#4 IN (Home,Books,Electronics) AND i_class#3 IN (wallpaper,parenting,musical)) OR (i_category#4 IN (Shoes,Jewelry,Men) AND i_class#3 IN (womens,birdal,pants))) AND isnotnull(i_item_sk#1))
(4) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#5, ss_item_sk#6, ss_store_sk#7, ss_sales_price#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451545), LessThanOrEqual(ss_sold_date_sk,2451910), IsNotNull(ss_item_sk), IsNotNull(ss_store_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_store_sk:int,ss_sales_price:decimal(7,2)>
(5) ColumnarToRow [codegen id : 1]
Input [4]: [ss_sold_date_sk#5, ss_item_sk#6, ss_store_sk#7, ss_sales_price#8]
(6) Filter [codegen id : 1]
Input [4]: [ss_sold_date_sk#5, ss_item_sk#6, ss_store_sk#7, ss_sales_price#8]
Condition : ((((isnotnull(ss_sold_date_sk#5) AND (ss_sold_date_sk#5 >= 2451545)) AND (ss_sold_date_sk#5 <= 2451910)) AND isnotnull(ss_item_sk#6)) AND isnotnull(ss_store_sk#7))
(7) BroadcastExchange
Input [4]: [ss_sold_date_sk#5, ss_item_sk#6, ss_store_sk#7, ss_sales_price#8]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, false] as bigint)),false), [id=#9]
(8) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [i_item_sk#1]
Right keys [1]: [ss_item_sk#6]
Join condition: None
(9) Project [codegen id : 4]
Output [6]: [i_brand#2, i_class#3, i_category#4, ss_sold_date_sk#5, ss_store_sk#7, ss_sales_price#8]
Input [8]: [i_item_sk#1, i_brand#2, i_class#3, i_category#4, ss_sold_date_sk#5, ss_item_sk#6, ss_store_sk#7, ss_sales_price#8]
(10) Scan parquet default.date_dim
Output [3]: [d_date_sk#10, d_year#11, d_moy#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2000), GreaterThanOrEqual(d_date_sk,2451545), LessThanOrEqual(d_date_sk,2451910), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(11) ColumnarToRow [codegen id : 2]
Input [3]: [d_date_sk#10, d_year#11, d_moy#12]
(12) Filter [codegen id : 2]
Input [3]: [d_date_sk#10, d_year#11, d_moy#12]
Condition : ((((isnotnull(d_year#11) AND (d_year#11 = 2000)) AND (d_date_sk#10 >= 2451545)) AND (d_date_sk#10 <= 2451910)) AND isnotnull(d_date_sk#10))
(13) Project [codegen id : 2]
Output [2]: [d_date_sk#10, d_moy#12]
Input [3]: [d_date_sk#10, d_year#11, d_moy#12]
(14) BroadcastExchange
Input [2]: [d_date_sk#10, d_moy#12]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#13]
(15) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#5]
Right keys [1]: [d_date_sk#10]
Join condition: None
(16) Project [codegen id : 4]
Output [6]: [i_brand#2, i_class#3, i_category#4, ss_store_sk#7, ss_sales_price#8, d_moy#12]
Input [8]: [i_brand#2, i_class#3, i_category#4, ss_sold_date_sk#5, ss_store_sk#7, ss_sales_price#8, d_date_sk#10, d_moy#12]
(17) Scan parquet default.store
Output [3]: [s_store_sk#14, s_store_name#15, s_company_name#16]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_store_name:string,s_company_name:string>
(18) ColumnarToRow [codegen id : 3]
Input [3]: [s_store_sk#14, s_store_name#15, s_company_name#16]
(19) Filter [codegen id : 3]
Input [3]: [s_store_sk#14, s_store_name#15, s_company_name#16]
Condition : isnotnull(s_store_sk#14)
(20) BroadcastExchange
Input [3]: [s_store_sk#14, s_store_name#15, s_company_name#16]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#17]
(21) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_store_sk#7]
Right keys [1]: [s_store_sk#14]
Join condition: None
(22) Project [codegen id : 4]
Output [7]: [i_brand#2, i_class#3, i_category#4, ss_sales_price#8, d_moy#12, s_store_name#15, s_company_name#16]
Input [9]: [i_brand#2, i_class#3, i_category#4, ss_store_sk#7, ss_sales_price#8, d_moy#12, s_store_sk#14, s_store_name#15, s_company_name#16]
(23) HashAggregate [codegen id : 4]
Input [7]: [i_brand#2, i_class#3, i_category#4, ss_sales_price#8, d_moy#12, s_store_name#15, s_company_name#16]
Keys [6]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12]
Functions [1]: [partial_sum(UnscaledValue(ss_sales_price#8))]
Aggregate Attributes [1]: [sum#18]
Results [7]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum#19]
(24) Exchange
Input [7]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum#19]
Arguments: hashpartitioning(i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, 5), true, [id=#20]
(25) HashAggregate [codegen id : 5]
Input [7]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum#19]
Keys [6]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12]
Functions [1]: [sum(UnscaledValue(ss_sales_price#8))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_sales_price#8))#21]
Results [8]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, MakeDecimal(sum(UnscaledValue(ss_sales_price#8))#21,17,2) AS sum_sales#22, MakeDecimal(sum(UnscaledValue(ss_sales_price#8))#21,17,2) AS _w0#23]
(26) Exchange
Input [8]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum_sales#22, _w0#23]
Arguments: hashpartitioning(i_category#4, i_brand#2, s_store_name#15, s_company_name#16, 5), true, [id=#24]
(27) Sort [codegen id : 6]
Input [8]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum_sales#22, _w0#23]
Arguments: [i_category#4 ASC NULLS FIRST, i_brand#2 ASC NULLS FIRST, s_store_name#15 ASC NULLS FIRST, s_company_name#16 ASC NULLS FIRST], false, 0
(28) Window
Input [8]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum_sales#22, _w0#23]
Arguments: [avg(_w0#23) windowspecdefinition(i_category#4, i_brand#2, s_store_name#15, s_company_name#16, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS avg_monthly_sales#25], [i_category#4, i_brand#2, s_store_name#15, s_company_name#16]
(29) Filter [codegen id : 7]
Input [9]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum_sales#22, _w0#23, avg_monthly_sales#25]
Condition : (CASE WHEN NOT (avg_monthly_sales#25 = 0.000000) THEN CheckOverflow((promote_precision(abs(CheckOverflow((promote_precision(cast(sum_sales#22 as decimal(22,6))) - promote_precision(cast(avg_monthly_sales#25 as decimal(22,6)))), DecimalType(22,6), true))) / promote_precision(cast(avg_monthly_sales#25 as decimal(22,6)))), DecimalType(38,16), true) ELSE null END > 0.1000000000000000)
(30) Project [codegen id : 7]
Output [8]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum_sales#22, avg_monthly_sales#25]
Input [9]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum_sales#22, _w0#23, avg_monthly_sales#25]
(31) TakeOrderedAndProject
Input [8]: [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum_sales#22, avg_monthly_sales#25]
Arguments: 100, [CheckOverflow((promote_precision(cast(sum_sales#22 as decimal(22,6))) - promote_precision(cast(avg_monthly_sales#25 as decimal(22,6)))), DecimalType(22,6), true) ASC NULLS FIRST, s_store_name#15 ASC NULLS FIRST], [i_category#4, i_class#3, i_brand#2, s_store_name#15, s_company_name#16, d_moy#12, sum_sales#22, avg_monthly_sales#25]

@ -0,0 +1,48 @@
TakeOrderedAndProject [avg_monthly_sales,d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,sum_sales]
WholeStageCodegen (7)
Project [avg_monthly_sales,d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,sum_sales]
Filter [avg_monthly_sales,sum_sales]
InputAdapter
Window [_w0,i_brand,i_category,s_company_name,s_store_name]
WholeStageCodegen (6)
Sort [i_brand,i_category,s_company_name,s_store_name]
InputAdapter
Exchange [i_brand,i_category,s_company_name,s_store_name] #1
WholeStageCodegen (5)
HashAggregate [d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,sum] [_w0,sum,sum(UnscaledValue(ss_sales_price)),sum_sales]
InputAdapter
Exchange [d_moy,i_brand,i_category,i_class,s_company_name,s_store_name] #2
WholeStageCodegen (4)
HashAggregate [d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,ss_sales_price] [sum,sum]
Project [d_moy,i_brand,i_category,i_class,s_company_name,s_store_name,ss_sales_price]
BroadcastHashJoin [s_store_sk,ss_store_sk]
Project [d_moy,i_brand,i_category,i_class,ss_sales_price,ss_store_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [i_brand,i_category,i_class,ss_sales_price,ss_sold_date_sk,ss_store_sk]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Filter [i_category,i_class,i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_brand,i_category,i_class,i_item_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Filter [ss_item_sk,ss_sold_date_sk,ss_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_item_sk,ss_sales_price,ss_sold_date_sk,ss_store_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (2)
Project [d_date_sk,d_moy]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Filter [s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_company_name,s_store_name,s_store_sk]

@ -0,0 +1,162 @@
== Physical Plan ==
* Project (29)
+- * Sort (28)
+- Exchange (27)
+- * Project (26)
+- Window (25)
+- * Sort (24)
+- Exchange (23)
+- * HashAggregate (22)
+- Exchange (21)
+- * HashAggregate (20)
+- * Project (19)
+- * SortMergeJoin Inner (18)
:- * Sort (12)
: +- Exchange (11)
: +- * Project (10)
: +- * BroadcastHashJoin Inner BuildRight (9)
: :- * Filter (3)
: : +- * ColumnarToRow (2)
: : +- Scan parquet default.store_sales (1)
: +- BroadcastExchange (8)
: +- * Project (7)
: +- * Filter (6)
: +- * ColumnarToRow (5)
: +- Scan parquet default.date_dim (4)
+- * Sort (17)
+- Exchange (16)
+- * Filter (15)
+- * ColumnarToRow (14)
+- Scan parquet default.item (13)
(1) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#1, ss_item_sk#2, ss_ext_sales_price#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451911), LessThanOrEqual(ss_sold_date_sk,2451941), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
(2) ColumnarToRow [codegen id : 2]
Input [3]: [ss_sold_date_sk#1, ss_item_sk#2, ss_ext_sales_price#3]
(3) Filter [codegen id : 2]
Input [3]: [ss_sold_date_sk#1, ss_item_sk#2, ss_ext_sales_price#3]
Condition : (((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2451911)) AND (ss_sold_date_sk#1 <= 2451941)) AND isnotnull(ss_item_sk#2))
(4) Scan parquet default.date_dim
Output [2]: [d_date_sk#4, d_date#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_date), GreaterThanOrEqual(d_date,2001-01-01), LessThanOrEqual(d_date,2001-01-31), GreaterThanOrEqual(d_date_sk,2451911), LessThanOrEqual(d_date_sk,2451941), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_date:date>
(5) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#4, d_date#5]
(6) Filter [codegen id : 1]
Input [2]: [d_date_sk#4, d_date#5]
Condition : (((((isnotnull(d_date#5) AND (d_date#5 >= 11323)) AND (d_date#5 <= 11353)) AND (d_date_sk#4 >= 2451911)) AND (d_date_sk#4 <= 2451941)) AND isnotnull(d_date_sk#4))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#4]
Input [2]: [d_date_sk#4, d_date#5]
(8) BroadcastExchange
Input [1]: [d_date_sk#4]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#6]
(9) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#4]
Join condition: None
(10) Project [codegen id : 2]
Output [2]: [ss_item_sk#2, ss_ext_sales_price#3]
Input [4]: [ss_sold_date_sk#1, ss_item_sk#2, ss_ext_sales_price#3, d_date_sk#4]
(11) Exchange
Input [2]: [ss_item_sk#2, ss_ext_sales_price#3]
Arguments: hashpartitioning(ss_item_sk#2, 5), true, [id=#7]
(12) Sort [codegen id : 3]
Input [2]: [ss_item_sk#2, ss_ext_sales_price#3]
Arguments: [ss_item_sk#2 ASC NULLS FIRST], false, 0
(13) Scan parquet default.item
Output [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/item]
PushedFilters: [In(i_category, [Jewelry,Sports,Books]), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_id:string,i_item_desc:string,i_current_price:decimal(7,2),i_class:string,i_category:string>
(14) ColumnarToRow [codegen id : 4]
Input [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
(15) Filter [codegen id : 4]
Input [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Condition : (i_category#13 IN (Jewelry,Sports,Books) AND isnotnull(i_item_sk#8))
(16) Exchange
Input [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Arguments: hashpartitioning(i_item_sk#8, 5), true, [id=#14]
(17) Sort [codegen id : 5]
Input [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Arguments: [i_item_sk#8 ASC NULLS FIRST], false, 0
(18) SortMergeJoin [codegen id : 6]
Left keys [1]: [ss_item_sk#2]
Right keys [1]: [i_item_sk#8]
Join condition: None
(19) Project [codegen id : 6]
Output [6]: [ss_ext_sales_price#3, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Input [8]: [ss_item_sk#2, ss_ext_sales_price#3, i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
(20) HashAggregate [codegen id : 6]
Input [6]: [ss_ext_sales_price#3, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Keys [5]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#3))]
Aggregate Attributes [1]: [sum#15]
Results [6]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11, sum#16]
(21) Exchange
Input [6]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11, sum#16]
Arguments: hashpartitioning(i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11, 5), true, [id=#17]
(22) HashAggregate [codegen id : 7]
Input [6]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11, sum#16]
Keys [5]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#3))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#3))#18]
Results [8]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2) AS itemrevenue#19, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2) AS _w0#20, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2) AS _w1#21, i_item_id#9]
(23) Exchange
Input [8]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, _w0#20, _w1#21, i_item_id#9]
Arguments: hashpartitioning(i_class#12, 5), true, [id=#22]
(24) Sort [codegen id : 8]
Input [8]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, _w0#20, _w1#21, i_item_id#9]
Arguments: [i_class#12 ASC NULLS FIRST], false, 0
(25) Window
Input [8]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, _w0#20, _w1#21, i_item_id#9]
Arguments: [sum(_w1#21) windowspecdefinition(i_class#12, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS _we0#23], [i_class#12]
(26) Project [codegen id : 9]
Output [7]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, CheckOverflow((promote_precision(cast(CheckOverflow((promote_precision(_w0#20) * 100.00), DecimalType(21,2), true) as decimal(27,2))) / promote_precision(_we0#23)), DecimalType(38,17), true) AS revenueratio#24, i_item_id#9]
Input [9]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, _w0#20, _w1#21, i_item_id#9, _we0#23]
(27) Exchange
Input [7]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, revenueratio#24, i_item_id#9]
Arguments: rangepartitioning(i_category#13 ASC NULLS FIRST, i_class#12 ASC NULLS FIRST, i_item_id#9 ASC NULLS FIRST, i_item_desc#10 ASC NULLS FIRST, revenueratio#24 ASC NULLS FIRST, 5), true, [id=#25]
(28) Sort [codegen id : 10]
Input [7]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, revenueratio#24, i_item_id#9]
Arguments: [i_category#13 ASC NULLS FIRST, i_class#12 ASC NULLS FIRST, i_item_id#9 ASC NULLS FIRST, i_item_desc#10 ASC NULLS FIRST, revenueratio#24 ASC NULLS FIRST], true, 0
(29) Project [codegen id : 10]
Output [6]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, revenueratio#24]
Input [7]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, revenueratio#24, i_item_id#9]

@ -0,0 +1,51 @@
WholeStageCodegen (10)
Project [i_category,i_class,i_current_price,i_item_desc,itemrevenue,revenueratio]
Sort [i_category,i_class,i_item_desc,i_item_id,revenueratio]
InputAdapter
Exchange [i_category,i_class,i_item_desc,i_item_id,revenueratio] #1
WholeStageCodegen (9)
Project [_w0,_we0,i_category,i_class,i_current_price,i_item_desc,i_item_id,itemrevenue]
InputAdapter
Window [_w1,i_class]
WholeStageCodegen (8)
Sort [i_class]
InputAdapter
Exchange [i_class] #2
WholeStageCodegen (7)
HashAggregate [i_category,i_class,i_current_price,i_item_desc,i_item_id,sum] [_w0,_w1,itemrevenue,sum,sum(UnscaledValue(ss_ext_sales_price))]
InputAdapter
Exchange [i_category,i_class,i_current_price,i_item_desc,i_item_id] #3
WholeStageCodegen (6)
HashAggregate [i_category,i_class,i_current_price,i_item_desc,i_item_id,ss_ext_sales_price] [sum,sum]
Project [i_category,i_class,i_current_price,i_item_desc,i_item_id,ss_ext_sales_price]
SortMergeJoin [i_item_sk,ss_item_sk]
InputAdapter
WholeStageCodegen (3)
Sort [ss_item_sk]
InputAdapter
Exchange [ss_item_sk] #4
WholeStageCodegen (2)
Project [ss_ext_sales_price,ss_item_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_item_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_ext_sales_price,ss_item_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date,d_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date,d_date_sk]
InputAdapter
WholeStageCodegen (5)
Sort [i_item_sk]
InputAdapter
Exchange [i_item_sk] #6
WholeStageCodegen (4)
Filter [i_category,i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_category,i_class,i_current_price,i_item_desc,i_item_id,i_item_sk]

@ -0,0 +1,147 @@
== Physical Plan ==
* Project (26)
+- * Sort (25)
+- Exchange (24)
+- * Project (23)
+- Window (22)
+- * Sort (21)
+- Exchange (20)
+- * HashAggregate (19)
+- Exchange (18)
+- * HashAggregate (17)
+- * Project (16)
+- * BroadcastHashJoin Inner BuildRight (15)
:- * Project (9)
: +- * BroadcastHashJoin Inner BuildRight (8)
: :- * Filter (3)
: : +- * ColumnarToRow (2)
: : +- Scan parquet default.store_sales (1)
: +- BroadcastExchange (7)
: +- * Filter (6)
: +- * ColumnarToRow (5)
: +- Scan parquet default.item (4)
+- BroadcastExchange (14)
+- * Project (13)
+- * Filter (12)
+- * ColumnarToRow (11)
+- Scan parquet default.date_dim (10)
(1) Scan parquet default.store_sales
Output [3]: [ss_sold_date_sk#1, ss_item_sk#2, ss_ext_sales_price#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk), GreaterThanOrEqual(ss_sold_date_sk,2451911), LessThanOrEqual(ss_sold_date_sk,2451941), IsNotNull(ss_item_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_item_sk:int,ss_ext_sales_price:decimal(7,2)>
(2) ColumnarToRow [codegen id : 3]
Input [3]: [ss_sold_date_sk#1, ss_item_sk#2, ss_ext_sales_price#3]
(3) Filter [codegen id : 3]
Input [3]: [ss_sold_date_sk#1, ss_item_sk#2, ss_ext_sales_price#3]
Condition : (((isnotnull(ss_sold_date_sk#1) AND (ss_sold_date_sk#1 >= 2451911)) AND (ss_sold_date_sk#1 <= 2451941)) AND isnotnull(ss_item_sk#2))
(4) Scan parquet default.item
Output [6]: [i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/item]
PushedFilters: [In(i_category, [Jewelry,Sports,Books]), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_id:string,i_item_desc:string,i_current_price:decimal(7,2),i_class:string,i_category:string>
(5) ColumnarToRow [codegen id : 1]
Input [6]: [i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
(6) Filter [codegen id : 1]
Input [6]: [i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Condition : (i_category#9 IN (Jewelry,Sports,Books) AND isnotnull(i_item_sk#4))
(7) BroadcastExchange
Input [6]: [i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#10]
(8) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_item_sk#2]
Right keys [1]: [i_item_sk#4]
Join condition: None
(9) Project [codegen id : 3]
Output [7]: [ss_sold_date_sk#1, ss_ext_sales_price#3, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Input [9]: [ss_sold_date_sk#1, ss_item_sk#2, ss_ext_sales_price#3, i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
(10) Scan parquet default.date_dim
Output [2]: [d_date_sk#11, d_date#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_date), GreaterThanOrEqual(d_date,2001-01-01), LessThanOrEqual(d_date,2001-01-31), GreaterThanOrEqual(d_date_sk,2451911), LessThanOrEqual(d_date_sk,2451941), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_date:date>
(11) ColumnarToRow [codegen id : 2]
Input [2]: [d_date_sk#11, d_date#12]
(12) Filter [codegen id : 2]
Input [2]: [d_date_sk#11, d_date#12]
Condition : (((((isnotnull(d_date#12) AND (d_date#12 >= 11323)) AND (d_date#12 <= 11353)) AND (d_date_sk#11 >= 2451911)) AND (d_date_sk#11 <= 2451941)) AND isnotnull(d_date_sk#11))
(13) Project [codegen id : 2]
Output [1]: [d_date_sk#11]
Input [2]: [d_date_sk#11, d_date#12]
(14) BroadcastExchange
Input [1]: [d_date_sk#11]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#13]
(15) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#11]
Join condition: None
(16) Project [codegen id : 3]
Output [6]: [ss_ext_sales_price#3, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Input [8]: [ss_sold_date_sk#1, ss_ext_sales_price#3, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9, d_date_sk#11]
(17) HashAggregate [codegen id : 3]
Input [6]: [ss_ext_sales_price#3, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Keys [5]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7]
Functions [1]: [partial_sum(UnscaledValue(ss_ext_sales_price#3))]
Aggregate Attributes [1]: [sum#14]
Results [6]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7, sum#15]
(18) Exchange
Input [6]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7, sum#15]
Arguments: hashpartitioning(i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7, 5), true, [id=#16]
(19) HashAggregate [codegen id : 4]
Input [6]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7, sum#15]
Keys [5]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7]
Functions [1]: [sum(UnscaledValue(ss_ext_sales_price#3))]
Aggregate Attributes [1]: [sum(UnscaledValue(ss_ext_sales_price#3))#17]
Results [8]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#17,17,2) AS itemrevenue#18, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#17,17,2) AS _w0#19, MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#17,17,2) AS _w1#20, i_item_id#5]
(20) Exchange
Input [8]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, _w0#19, _w1#20, i_item_id#5]
Arguments: hashpartitioning(i_class#8, 5), true, [id=#21]
(21) Sort [codegen id : 5]
Input [8]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, _w0#19, _w1#20, i_item_id#5]
Arguments: [i_class#8 ASC NULLS FIRST], false, 0
(22) Window
Input [8]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, _w0#19, _w1#20, i_item_id#5]
Arguments: [sum(_w1#20) windowspecdefinition(i_class#8, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS _we0#22], [i_class#8]
(23) Project [codegen id : 6]
Output [7]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, CheckOverflow((promote_precision(cast(CheckOverflow((promote_precision(_w0#19) * 100.00), DecimalType(21,2), true) as decimal(27,2))) / promote_precision(_we0#22)), DecimalType(38,17), true) AS revenueratio#23, i_item_id#5]
Input [9]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, _w0#19, _w1#20, i_item_id#5, _we0#22]
(24) Exchange
Input [7]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, revenueratio#23, i_item_id#5]
Arguments: rangepartitioning(i_category#9 ASC NULLS FIRST, i_class#8 ASC NULLS FIRST, i_item_id#5 ASC NULLS FIRST, i_item_desc#6 ASC NULLS FIRST, revenueratio#23 ASC NULLS FIRST, 5), true, [id=#24]
(25) Sort [codegen id : 7]
Input [7]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, revenueratio#23, i_item_id#5]
Arguments: [i_category#9 ASC NULLS FIRST, i_class#8 ASC NULLS FIRST, i_item_id#5 ASC NULLS FIRST, i_item_desc#6 ASC NULLS FIRST, revenueratio#23 ASC NULLS FIRST], true, 0
(26) Project [codegen id : 7]
Output [6]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, revenueratio#23]
Input [7]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, revenueratio#23, i_item_id#5]

@ -0,0 +1,42 @@
WholeStageCodegen (7)
Project [i_category,i_class,i_current_price,i_item_desc,itemrevenue,revenueratio]
Sort [i_category,i_class,i_item_desc,i_item_id,revenueratio]
InputAdapter
Exchange [i_category,i_class,i_item_desc,i_item_id,revenueratio] #1
WholeStageCodegen (6)
Project [_w0,_we0,i_category,i_class,i_current_price,i_item_desc,i_item_id,itemrevenue]
InputAdapter
Window [_w1,i_class]
WholeStageCodegen (5)
Sort [i_class]
InputAdapter
Exchange [i_class] #2
WholeStageCodegen (4)
HashAggregate [i_category,i_class,i_current_price,i_item_desc,i_item_id,sum] [_w0,_w1,itemrevenue,sum,sum(UnscaledValue(ss_ext_sales_price))]
InputAdapter
Exchange [i_category,i_class,i_current_price,i_item_desc,i_item_id] #3
WholeStageCodegen (3)
HashAggregate [i_category,i_class,i_current_price,i_item_desc,i_item_id,ss_ext_sales_price] [sum,sum]
Project [i_category,i_class,i_current_price,i_item_desc,i_item_id,ss_ext_sales_price]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Project [i_category,i_class,i_current_price,i_item_desc,i_item_id,ss_ext_sales_price,ss_sold_date_sk]
BroadcastHashJoin [i_item_sk,ss_item_sk]
Filter [ss_item_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_ext_sales_price,ss_item_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (1)
Filter [i_category,i_item_sk]
ColumnarToRow
InputAdapter
Scan parquet default.item [i_category,i_class,i_current_price,i_item_desc,i_item_id,i_item_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (2)
Project [d_date_sk]
Filter [d_date,d_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date,d_date_sk]

@ -0,0 +1,56 @@
== Physical Plan ==
* HashAggregate (8)
+- Exchange (7)
+- * HashAggregate (6)
+- * HashAggregate (5)
+- Exchange (4)
+- * HashAggregate (3)
+- * ColumnarToRow (2)
+- Scan parquet default.store_sales (1)
(1) Scan parquet default.store_sales
Output [9]: [ss_sold_date_sk#1, ss_sold_time_sk#2, ss_item_sk#3, ss_customer_sk#4, ss_cdemo_sk#5, ss_hdemo_sk#6, ss_addr_sk#7, ss_store_sk#8, ss_promo_sk#9]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilityWithStatsSuite/store_sales]
ReadSchema: struct<ss_sold_date_sk:int,ss_sold_time_sk:int,ss_item_sk:int,ss_customer_sk:int,ss_cdemo_sk:int,ss_hdemo_sk:int,ss_addr_sk:int,ss_store_sk:int,ss_promo_sk:int>
(2) ColumnarToRow [codegen id : 1]
Input [9]: [ss_sold_date_sk#1, ss_sold_time_sk#2, ss_item_sk#3, ss_customer_sk#4, ss_cdemo_sk#5, ss_hdemo_sk#6, ss_addr_sk#7, ss_store_sk#8, ss_promo_sk#9]
(3) HashAggregate [codegen id : 1]
Input [9]: [ss_sold_date_sk#1, ss_sold_time_sk#2, ss_item_sk#3, ss_customer_sk#4, ss_cdemo_sk#5, ss_hdemo_sk#6, ss_addr_sk#7, ss_store_sk#8, ss_promo_sk#9]
Keys [1]: [ss_sold_date_sk#1]
Functions [11]: [partial_count(1), partial_count(ss_sold_date_sk#1), partial_max(ss_sold_date_sk#1), partial_max(ss_sold_time_sk#2), partial_max(ss_item_sk#3), partial_max(ss_customer_sk#4), partial_max(ss_cdemo_sk#5), partial_max(ss_hdemo_sk#6), partial_max(ss_addr_sk#7), partial_max(ss_store_sk#8), partial_max(ss_promo_sk#9)]
Aggregate Attributes [11]: [count(1)#10, count(ss_sold_date_sk#1)#11, max(ss_sold_date_sk#1)#12, max(ss_sold_time_sk#2)#13, max(ss_item_sk#3)#14, max(ss_customer_sk#4)#15, max(ss_cdemo_sk#5)#16, max(ss_hdemo_sk#6)#17, max(ss_addr_sk#7)#18, max(ss_store_sk#8)#19, max(ss_promo_sk#9)#20]
Results [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
(4) Exchange
Input [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
Arguments: hashpartitioning(ss_sold_date_sk#1, 5), true, [id=#32]
(5) HashAggregate [codegen id : 2]
Input [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
Keys [1]: [ss_sold_date_sk#1]
Functions [11]: [merge_count(1), merge_count(ss_sold_date_sk#1), merge_max(ss_sold_date_sk#1), merge_max(ss_sold_time_sk#2), merge_max(ss_item_sk#3), merge_max(ss_customer_sk#4), merge_max(ss_cdemo_sk#5), merge_max(ss_hdemo_sk#6), merge_max(ss_addr_sk#7), merge_max(ss_store_sk#8), merge_max(ss_promo_sk#9)]
Aggregate Attributes [11]: [count(1)#10, count(ss_sold_date_sk#1)#11, max(ss_sold_date_sk#1)#12, max(ss_sold_time_sk#2)#13, max(ss_item_sk#3)#14, max(ss_customer_sk#4)#15, max(ss_cdemo_sk#5)#16, max(ss_hdemo_sk#6)#17, max(ss_addr_sk#7)#18, max(ss_store_sk#8)#19, max(ss_promo_sk#9)#20]
Results [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
(6) HashAggregate [codegen id : 2]
Input [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
Keys: []
Functions [12]: [merge_count(1), merge_count(ss_sold_date_sk#1), merge_max(ss_sold_date_sk#1), merge_max(ss_sold_time_sk#2), merge_max(ss_item_sk#3), merge_max(ss_customer_sk#4), merge_max(ss_cdemo_sk#5), merge_max(ss_hdemo_sk#6), merge_max(ss_addr_sk#7), merge_max(ss_store_sk#8), merge_max(ss_promo_sk#9), partial_count(distinct ss_sold_date_sk#1)]
Aggregate Attributes [12]: [count(1)#10, count(ss_sold_date_sk#1)#11, max(ss_sold_date_sk#1)#12, max(ss_sold_time_sk#2)#13, max(ss_item_sk#3)#14, max(ss_customer_sk#4)#15, max(ss_cdemo_sk#5)#16, max(ss_hdemo_sk#6)#17, max(ss_addr_sk#7)#18, max(ss_store_sk#8)#19, max(ss_promo_sk#9)#20, count(ss_sold_date_sk#1)#33]
Results [12]: [count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31, count#34]
(7) Exchange
Input [12]: [count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31, count#34]
Arguments: SinglePartition, true, [id=#35]
(8) HashAggregate [codegen id : 3]
Input [12]: [count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31, count#34]
Keys: []
Functions [12]: [count(1), count(ss_sold_date_sk#1), max(ss_sold_date_sk#1), max(ss_sold_time_sk#2), max(ss_item_sk#3), max(ss_customer_sk#4), max(ss_cdemo_sk#5), max(ss_hdemo_sk#6), max(ss_addr_sk#7), max(ss_store_sk#8), max(ss_promo_sk#9), count(distinct ss_sold_date_sk#1)]
Aggregate Attributes [12]: [count(1)#10, count(ss_sold_date_sk#1)#11, max(ss_sold_date_sk#1)#12, max(ss_sold_time_sk#2)#13, max(ss_item_sk#3)#14, max(ss_customer_sk#4)#15, max(ss_cdemo_sk#5)#16, max(ss_hdemo_sk#6)#17, max(ss_addr_sk#7)#18, max(ss_store_sk#8)#19, max(ss_promo_sk#9)#20, count(ss_sold_date_sk#1)#33]
Results [12]: [count(1)#10 AS total#36, count(ss_sold_date_sk#1)#11 AS not_null_total#37, count(ss_sold_date_sk#1)#33 AS unique_days#38, max(ss_sold_date_sk#1)#12 AS max_ss_sold_date_sk#39, max(ss_sold_time_sk#2)#13 AS max_ss_sold_time_sk#40, max(ss_item_sk#3)#14 AS max_ss_item_sk#41, max(ss_customer_sk#4)#15 AS max_ss_customer_sk#42, max(ss_cdemo_sk#5)#16 AS max_ss_cdemo_sk#43, max(ss_hdemo_sk#6)#17 AS max_ss_hdemo_sk#44, max(ss_addr_sk#7)#18 AS max_ss_addr_sk#45, max(ss_store_sk#8)#19 AS max_ss_store_sk#46, max(ss_promo_sk#9)#20 AS max_ss_promo_sk#47]

@ -0,0 +1,14 @@
WholeStageCodegen (3)
HashAggregate [count,count,count,max,max,max,max,max,max,max,max,max] [count,count,count,count(1),count(ss_sold_date_sk),count(ss_sold_date_sk),max,max,max,max,max,max,max,max,max,max(ss_addr_sk),max(ss_cdemo_sk),max(ss_customer_sk),max(ss_hdemo_sk),max(ss_item_sk),max(ss_promo_sk),max(ss_sold_date_sk),max(ss_sold_time_sk),max(ss_store_sk),max_ss_addr_sk,max_ss_cdemo_sk,max_ss_customer_sk,max_ss_hdemo_sk,max_ss_item_sk,max_ss_promo_sk,max_ss_sold_date_sk,max_ss_sold_time_sk,max_ss_store_sk,not_null_total,total,unique_days]
InputAdapter
Exchange #1
WholeStageCodegen (2)
HashAggregate [ss_sold_date_sk] [count,count,count,count,count,count,count(1),count(ss_sold_date_sk),count(ss_sold_date_sk),max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max(ss_addr_sk),max(ss_cdemo_sk),max(ss_customer_sk),max(ss_hdemo_sk),max(ss_item_sk),max(ss_promo_sk),max(ss_sold_date_sk),max(ss_sold_time_sk),max(ss_store_sk)]
HashAggregate [ss_sold_date_sk] [count,count,count,count,count(1),count(ss_sold_date_sk),max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max(ss_addr_sk),max(ss_cdemo_sk),max(ss_customer_sk),max(ss_hdemo_sk),max(ss_item_sk),max(ss_promo_sk),max(ss_sold_date_sk),max(ss_sold_time_sk),max(ss_store_sk)]
InputAdapter
Exchange [ss_sold_date_sk] #2
WholeStageCodegen (1)
HashAggregate [ss_addr_sk,ss_cdemo_sk,ss_customer_sk,ss_hdemo_sk,ss_item_sk,ss_promo_sk,ss_sold_date_sk,ss_sold_time_sk,ss_store_sk] [count,count,count,count,count(1),count(ss_sold_date_sk),max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max(ss_addr_sk),max(ss_cdemo_sk),max(ss_customer_sk),max(ss_hdemo_sk),max(ss_item_sk),max(ss_promo_sk),max(ss_sold_date_sk),max(ss_sold_time_sk),max(ss_store_sk)]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_addr_sk,ss_cdemo_sk,ss_customer_sk,ss_hdemo_sk,ss_item_sk,ss_promo_sk,ss_sold_date_sk,ss_sold_time_sk,ss_store_sk]

@ -0,0 +1,56 @@
== Physical Plan ==
* HashAggregate (8)
+- Exchange (7)
+- * HashAggregate (6)
+- * HashAggregate (5)
+- Exchange (4)
+- * HashAggregate (3)
+- * ColumnarToRow (2)
+- Scan parquet default.store_sales (1)
(1) Scan parquet default.store_sales
Output [9]: [ss_sold_date_sk#1, ss_sold_time_sk#2, ss_item_sk#3, ss_customer_sk#4, ss_cdemo_sk#5, ss_hdemo_sk#6, ss_addr_sk#7, ss_store_sk#8, ss_promo_sk#9]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSModifiedPlanStabilitySuite/store_sales]
ReadSchema: struct<ss_sold_date_sk:int,ss_sold_time_sk:int,ss_item_sk:int,ss_customer_sk:int,ss_cdemo_sk:int,ss_hdemo_sk:int,ss_addr_sk:int,ss_store_sk:int,ss_promo_sk:int>
(2) ColumnarToRow [codegen id : 1]
Input [9]: [ss_sold_date_sk#1, ss_sold_time_sk#2, ss_item_sk#3, ss_customer_sk#4, ss_cdemo_sk#5, ss_hdemo_sk#6, ss_addr_sk#7, ss_store_sk#8, ss_promo_sk#9]
(3) HashAggregate [codegen id : 1]
Input [9]: [ss_sold_date_sk#1, ss_sold_time_sk#2, ss_item_sk#3, ss_customer_sk#4, ss_cdemo_sk#5, ss_hdemo_sk#6, ss_addr_sk#7, ss_store_sk#8, ss_promo_sk#9]
Keys [1]: [ss_sold_date_sk#1]
Functions [11]: [partial_count(1), partial_count(ss_sold_date_sk#1), partial_max(ss_sold_date_sk#1), partial_max(ss_sold_time_sk#2), partial_max(ss_item_sk#3), partial_max(ss_customer_sk#4), partial_max(ss_cdemo_sk#5), partial_max(ss_hdemo_sk#6), partial_max(ss_addr_sk#7), partial_max(ss_store_sk#8), partial_max(ss_promo_sk#9)]
Aggregate Attributes [11]: [count(1)#10, count(ss_sold_date_sk#1)#11, max(ss_sold_date_sk#1)#12, max(ss_sold_time_sk#2)#13, max(ss_item_sk#3)#14, max(ss_customer_sk#4)#15, max(ss_cdemo_sk#5)#16, max(ss_hdemo_sk#6)#17, max(ss_addr_sk#7)#18, max(ss_store_sk#8)#19, max(ss_promo_sk#9)#20]
Results [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
(4) Exchange
Input [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
Arguments: hashpartitioning(ss_sold_date_sk#1, 5), true, [id=#32]
(5) HashAggregate [codegen id : 2]
Input [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
Keys [1]: [ss_sold_date_sk#1]
Functions [11]: [merge_count(1), merge_count(ss_sold_date_sk#1), merge_max(ss_sold_date_sk#1), merge_max(ss_sold_time_sk#2), merge_max(ss_item_sk#3), merge_max(ss_customer_sk#4), merge_max(ss_cdemo_sk#5), merge_max(ss_hdemo_sk#6), merge_max(ss_addr_sk#7), merge_max(ss_store_sk#8), merge_max(ss_promo_sk#9)]
Aggregate Attributes [11]: [count(1)#10, count(ss_sold_date_sk#1)#11, max(ss_sold_date_sk#1)#12, max(ss_sold_time_sk#2)#13, max(ss_item_sk#3)#14, max(ss_customer_sk#4)#15, max(ss_cdemo_sk#5)#16, max(ss_hdemo_sk#6)#17, max(ss_addr_sk#7)#18, max(ss_store_sk#8)#19, max(ss_promo_sk#9)#20]
Results [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
(6) HashAggregate [codegen id : 2]
Input [12]: [ss_sold_date_sk#1, count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31]
Keys: []
Functions [12]: [merge_count(1), merge_count(ss_sold_date_sk#1), merge_max(ss_sold_date_sk#1), merge_max(ss_sold_time_sk#2), merge_max(ss_item_sk#3), merge_max(ss_customer_sk#4), merge_max(ss_cdemo_sk#5), merge_max(ss_hdemo_sk#6), merge_max(ss_addr_sk#7), merge_max(ss_store_sk#8), merge_max(ss_promo_sk#9), partial_count(distinct ss_sold_date_sk#1)]
Aggregate Attributes [12]: [count(1)#10, count(ss_sold_date_sk#1)#11, max(ss_sold_date_sk#1)#12, max(ss_sold_time_sk#2)#13, max(ss_item_sk#3)#14, max(ss_customer_sk#4)#15, max(ss_cdemo_sk#5)#16, max(ss_hdemo_sk#6)#17, max(ss_addr_sk#7)#18, max(ss_store_sk#8)#19, max(ss_promo_sk#9)#20, count(ss_sold_date_sk#1)#33]
Results [12]: [count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31, count#34]
(7) Exchange
Input [12]: [count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31, count#34]
Arguments: SinglePartition, true, [id=#35]
(8) HashAggregate [codegen id : 3]
Input [12]: [count#21, count#22, max#23, max#24, max#25, max#26, max#27, max#28, max#29, max#30, max#31, count#34]
Keys: []
Functions [12]: [count(1), count(ss_sold_date_sk#1), max(ss_sold_date_sk#1), max(ss_sold_time_sk#2), max(ss_item_sk#3), max(ss_customer_sk#4), max(ss_cdemo_sk#5), max(ss_hdemo_sk#6), max(ss_addr_sk#7), max(ss_store_sk#8), max(ss_promo_sk#9), count(distinct ss_sold_date_sk#1)]
Aggregate Attributes [12]: [count(1)#10, count(ss_sold_date_sk#1)#11, max(ss_sold_date_sk#1)#12, max(ss_sold_time_sk#2)#13, max(ss_item_sk#3)#14, max(ss_customer_sk#4)#15, max(ss_cdemo_sk#5)#16, max(ss_hdemo_sk#6)#17, max(ss_addr_sk#7)#18, max(ss_store_sk#8)#19, max(ss_promo_sk#9)#20, count(ss_sold_date_sk#1)#33]
Results [12]: [count(1)#10 AS total#36, count(ss_sold_date_sk#1)#11 AS not_null_total#37, count(ss_sold_date_sk#1)#33 AS unique_days#38, max(ss_sold_date_sk#1)#12 AS max_ss_sold_date_sk#39, max(ss_sold_time_sk#2)#13 AS max_ss_sold_time_sk#40, max(ss_item_sk#3)#14 AS max_ss_item_sk#41, max(ss_customer_sk#4)#15 AS max_ss_customer_sk#42, max(ss_cdemo_sk#5)#16 AS max_ss_cdemo_sk#43, max(ss_hdemo_sk#6)#17 AS max_ss_hdemo_sk#44, max(ss_addr_sk#7)#18 AS max_ss_addr_sk#45, max(ss_store_sk#8)#19 AS max_ss_store_sk#46, max(ss_promo_sk#9)#20 AS max_ss_promo_sk#47]

@ -0,0 +1,14 @@
WholeStageCodegen (3)
HashAggregate [count,count,count,max,max,max,max,max,max,max,max,max] [count,count,count,count(1),count(ss_sold_date_sk),count(ss_sold_date_sk),max,max,max,max,max,max,max,max,max,max(ss_addr_sk),max(ss_cdemo_sk),max(ss_customer_sk),max(ss_hdemo_sk),max(ss_item_sk),max(ss_promo_sk),max(ss_sold_date_sk),max(ss_sold_time_sk),max(ss_store_sk),max_ss_addr_sk,max_ss_cdemo_sk,max_ss_customer_sk,max_ss_hdemo_sk,max_ss_item_sk,max_ss_promo_sk,max_ss_sold_date_sk,max_ss_sold_time_sk,max_ss_store_sk,not_null_total,total,unique_days]
InputAdapter
Exchange #1
WholeStageCodegen (2)
HashAggregate [ss_sold_date_sk] [count,count,count,count,count,count,count(1),count(ss_sold_date_sk),count(ss_sold_date_sk),max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max(ss_addr_sk),max(ss_cdemo_sk),max(ss_customer_sk),max(ss_hdemo_sk),max(ss_item_sk),max(ss_promo_sk),max(ss_sold_date_sk),max(ss_sold_time_sk),max(ss_store_sk)]
HashAggregate [ss_sold_date_sk] [count,count,count,count,count(1),count(ss_sold_date_sk),max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max(ss_addr_sk),max(ss_cdemo_sk),max(ss_customer_sk),max(ss_hdemo_sk),max(ss_item_sk),max(ss_promo_sk),max(ss_sold_date_sk),max(ss_sold_time_sk),max(ss_store_sk)]
InputAdapter
Exchange [ss_sold_date_sk] #2
WholeStageCodegen (1)
HashAggregate [ss_addr_sk,ss_cdemo_sk,ss_customer_sk,ss_hdemo_sk,ss_item_sk,ss_promo_sk,ss_sold_date_sk,ss_sold_time_sk,ss_store_sk] [count,count,count,count,count(1),count(ss_sold_date_sk),max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max,max(ss_addr_sk),max(ss_cdemo_sk),max(ss_customer_sk),max(ss_hdemo_sk),max(ss_item_sk),max(ss_promo_sk),max(ss_sold_date_sk),max(ss_sold_time_sk),max(ss_store_sk)]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_addr_sk,ss_cdemo_sk,ss_customer_sk,ss_hdemo_sk,ss_item_sk,ss_promo_sk,ss_sold_date_sk,ss_sold_time_sk,ss_store_sk]

@ -0,0 +1,270 @@
== Physical Plan ==
TakeOrderedAndProject (47)
+- * Project (46)
+- * SortMergeJoin Inner (45)
:- * Sort (39)
: +- Exchange (38)
: +- * Project (37)
: +- * BroadcastHashJoin Inner BuildRight (36)
: :- * Project (30)
: : +- * BroadcastHashJoin Inner BuildRight (29)
: : :- * Filter (14)
: : : +- * HashAggregate (13)
: : : +- Exchange (12)
: : : +- * HashAggregate (11)
: : : +- * Project (10)
: : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_returns (1)
: : : +- BroadcastExchange (8)
: : : +- * Project (7)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.date_dim (4)
: : +- BroadcastExchange (28)
: : +- * Filter (27)
: : +- * HashAggregate (26)
: : +- Exchange (25)
: : +- * HashAggregate (24)
: : +- * HashAggregate (23)
: : +- Exchange (22)
: : +- * HashAggregate (21)
: : +- * Project (20)
: : +- * BroadcastHashJoin Inner BuildRight (19)
: : :- * Filter (17)
: : : +- * ColumnarToRow (16)
: : : +- Scan parquet default.store_returns (15)
: : +- ReusedExchange (18)
: +- BroadcastExchange (35)
: +- * Project (34)
: +- * Filter (33)
: +- * ColumnarToRow (32)
: +- Scan parquet default.store (31)
+- * Sort (44)
+- Exchange (43)
+- * Filter (42)
+- * ColumnarToRow (41)
+- Scan parquet default.customer (40)
(1) Scan parquet default.store_returns
Output [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/store_returns]
PushedFilters: [IsNotNull(sr_returned_date_sk), IsNotNull(sr_store_sk), IsNotNull(sr_customer_sk)]
ReadSchema: struct<sr_returned_date_sk:bigint,sr_customer_sk:bigint,sr_store_sk:bigint,sr_return_amt:decimal(7,2)>
(2) ColumnarToRow [codegen id : 2]
Input [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
(3) Filter [codegen id : 2]
Input [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Condition : ((isnotnull(sr_returned_date_sk#1) AND isnotnull(sr_store_sk#3)) AND isnotnull(sr_customer_sk#2))
(4) Scan parquet default.date_dim
Output [2]: [d_date_sk#5, d_year#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2000), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(5) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#5, d_year#6]
(6) Filter [codegen id : 1]
Input [2]: [d_date_sk#5, d_year#6]
Condition : ((isnotnull(d_year#6) AND (d_year#6 = 2000)) AND isnotnull(d_date_sk#5))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#5]
Input [2]: [d_date_sk#5, d_year#6]
(8) BroadcastExchange
Input [1]: [d_date_sk#5]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#7]
(9) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sr_returned_date_sk#1]
Right keys [1]: [cast(d_date_sk#5 as bigint)]
Join condition: None
(10) Project [codegen id : 2]
Output [3]: [sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Input [5]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4, d_date_sk#5]
(11) HashAggregate [codegen id : 2]
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Keys [2]: [sr_customer_sk#2, sr_store_sk#3]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#4))]
Aggregate Attributes [1]: [sum#8]
Results [3]: [sr_customer_sk#2, sr_store_sk#3, sum#9]
(12) Exchange
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sum#9]
Arguments: hashpartitioning(sr_customer_sk#2, sr_store_sk#3, 5), true, [id=#10]
(13) HashAggregate [codegen id : 8]
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sum#9]
Keys [2]: [sr_customer_sk#2, sr_store_sk#3]
Functions [1]: [sum(UnscaledValue(sr_return_amt#4))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#4))#11]
Results [3]: [sr_customer_sk#2 AS ctr_customer_sk#12, sr_store_sk#3 AS ctr_store_sk#13, MakeDecimal(sum(UnscaledValue(sr_return_amt#4))#11,17,2) AS ctr_total_return#14]
(14) Filter [codegen id : 8]
Input [3]: [ctr_customer_sk#12, ctr_store_sk#13, ctr_total_return#14]
Condition : isnotnull(ctr_total_return#14)
(15) Scan parquet default.store_returns
Output [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/store_returns]
PushedFilters: [IsNotNull(sr_returned_date_sk), IsNotNull(sr_store_sk)]
ReadSchema: struct<sr_returned_date_sk:bigint,sr_customer_sk:bigint,sr_store_sk:bigint,sr_return_amt:decimal(7,2)>
(16) ColumnarToRow [codegen id : 4]
Input [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
(17) Filter [codegen id : 4]
Input [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Condition : (isnotnull(sr_returned_date_sk#1) AND isnotnull(sr_store_sk#3))
(18) ReusedExchange [Reuses operator id: 8]
Output [1]: [d_date_sk#5]
(19) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [sr_returned_date_sk#1]
Right keys [1]: [cast(d_date_sk#5 as bigint)]
Join condition: None
(20) Project [codegen id : 4]
Output [3]: [sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Input [5]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4, d_date_sk#5]
(21) HashAggregate [codegen id : 4]
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Keys [2]: [sr_customer_sk#2, sr_store_sk#3]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#4))]
Aggregate Attributes [1]: [sum#15]
Results [3]: [sr_customer_sk#2, sr_store_sk#3, sum#16]
(22) Exchange
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sum#16]
Arguments: hashpartitioning(sr_customer_sk#2, sr_store_sk#3, 5), true, [id=#17]
(23) HashAggregate [codegen id : 5]
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sum#16]
Keys [2]: [sr_customer_sk#2, sr_store_sk#3]
Functions [1]: [sum(UnscaledValue(sr_return_amt#4))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#4))#18]
Results [2]: [sr_store_sk#3 AS ctr_store_sk#13, MakeDecimal(sum(UnscaledValue(sr_return_amt#4))#18,17,2) AS ctr_total_return#14]
(24) HashAggregate [codegen id : 5]
Input [2]: [ctr_store_sk#13, ctr_total_return#14]
Keys [1]: [ctr_store_sk#13]
Functions [1]: [partial_avg(ctr_total_return#14)]
Aggregate Attributes [2]: [sum#19, count#20]
Results [3]: [ctr_store_sk#13, sum#21, count#22]
(25) Exchange
Input [3]: [ctr_store_sk#13, sum#21, count#22]
Arguments: hashpartitioning(ctr_store_sk#13, 5), true, [id=#23]
(26) HashAggregate [codegen id : 6]
Input [3]: [ctr_store_sk#13, sum#21, count#22]
Keys [1]: [ctr_store_sk#13]
Functions [1]: [avg(ctr_total_return#14)]
Aggregate Attributes [1]: [avg(ctr_total_return#14)#24]
Results [2]: [CheckOverflow((promote_precision(avg(ctr_total_return#14)#24) * 1.200000), DecimalType(24,7), true) AS (CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25, ctr_store_sk#13 AS ctr_store_sk#13#26]
(27) Filter [codegen id : 6]
Input [2]: [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25, ctr_store_sk#13#26]
Condition : isnotnull((CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25)
(28) BroadcastExchange
Input [2]: [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25, ctr_store_sk#13#26]
Arguments: HashedRelationBroadcastMode(List(input[1, bigint, true]),false), [id=#27]
(29) BroadcastHashJoin [codegen id : 8]
Left keys [1]: [ctr_store_sk#13]
Right keys [1]: [ctr_store_sk#13#26]
Join condition: (cast(ctr_total_return#14 as decimal(24,7)) > (CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25)
(30) Project [codegen id : 8]
Output [2]: [ctr_customer_sk#12, ctr_store_sk#13]
Input [5]: [ctr_customer_sk#12, ctr_store_sk#13, ctr_total_return#14, (CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25, ctr_store_sk#13#26]
(31) Scan parquet default.store
Output [2]: [s_store_sk#28, s_state#29]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/store]
PushedFilters: [IsNotNull(s_state), EqualTo(s_state,TN), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_state:string>
(32) ColumnarToRow [codegen id : 7]
Input [2]: [s_store_sk#28, s_state#29]
(33) Filter [codegen id : 7]
Input [2]: [s_store_sk#28, s_state#29]
Condition : ((isnotnull(s_state#29) AND (s_state#29 = TN)) AND isnotnull(s_store_sk#28))
(34) Project [codegen id : 7]
Output [1]: [s_store_sk#28]
Input [2]: [s_store_sk#28, s_state#29]
(35) BroadcastExchange
Input [1]: [s_store_sk#28]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#30]
(36) BroadcastHashJoin [codegen id : 8]
Left keys [1]: [ctr_store_sk#13]
Right keys [1]: [cast(s_store_sk#28 as bigint)]
Join condition: None
(37) Project [codegen id : 8]
Output [1]: [ctr_customer_sk#12]
Input [3]: [ctr_customer_sk#12, ctr_store_sk#13, s_store_sk#28]
(38) Exchange
Input [1]: [ctr_customer_sk#12]
Arguments: hashpartitioning(ctr_customer_sk#12, 5), true, [id=#31]
(39) Sort [codegen id : 9]
Input [1]: [ctr_customer_sk#12]
Arguments: [ctr_customer_sk#12 ASC NULLS FIRST], false, 0
(40) Scan parquet default.customer
Output [2]: [c_customer_sk#32, c_customer_id#33]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_customer_id:string>
(41) ColumnarToRow [codegen id : 10]
Input [2]: [c_customer_sk#32, c_customer_id#33]
(42) Filter [codegen id : 10]
Input [2]: [c_customer_sk#32, c_customer_id#33]
Condition : isnotnull(c_customer_sk#32)
(43) Exchange
Input [2]: [c_customer_sk#32, c_customer_id#33]
Arguments: hashpartitioning(cast(c_customer_sk#32 as bigint), 5), true, [id=#34]
(44) Sort [codegen id : 11]
Input [2]: [c_customer_sk#32, c_customer_id#33]
Arguments: [cast(c_customer_sk#32 as bigint) ASC NULLS FIRST], false, 0
(45) SortMergeJoin [codegen id : 12]
Left keys [1]: [ctr_customer_sk#12]
Right keys [1]: [cast(c_customer_sk#32 as bigint)]
Join condition: None
(46) Project [codegen id : 12]
Output [1]: [c_customer_id#33]
Input [3]: [ctr_customer_sk#12, c_customer_sk#32, c_customer_id#33]
(47) TakeOrderedAndProject
Input [1]: [c_customer_id#33]
Arguments: 100, [c_customer_id#33 ASC NULLS FIRST], [c_customer_id#33]

@ -0,0 +1,74 @@
TakeOrderedAndProject [c_customer_id]
WholeStageCodegen (12)
Project [c_customer_id]
SortMergeJoin [c_customer_sk,ctr_customer_sk]
InputAdapter
WholeStageCodegen (9)
Sort [ctr_customer_sk]
InputAdapter
Exchange [ctr_customer_sk] #1
WholeStageCodegen (8)
Project [ctr_customer_sk]
BroadcastHashJoin [ctr_store_sk,s_store_sk]
Project [ctr_customer_sk,ctr_store_sk]
BroadcastHashJoin [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6))),ctr_store_sk,ctr_store_skL,ctr_total_return]
Filter [ctr_total_return]
HashAggregate [sr_customer_sk,sr_store_sk,sum] [ctr_customer_sk,ctr_store_sk,ctr_total_return,sum,sum(UnscaledValue(sr_return_amt))]
InputAdapter
Exchange [sr_customer_sk,sr_store_sk] #2
WholeStageCodegen (2)
HashAggregate [sr_customer_sk,sr_return_amt,sr_store_sk] [sum,sum]
Project [sr_customer_sk,sr_return_amt,sr_store_sk]
BroadcastHashJoin [d_date_sk,sr_returned_date_sk]
Filter [sr_customer_sk,sr_returned_date_sk,sr_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_returns [sr_customer_sk,sr_return_amt,sr_returned_date_sk,sr_store_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_year]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (6)
Filter [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))]
HashAggregate [count,ctr_store_sk,sum] [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6))),avg(ctr_total_return),count,ctr_store_skL,sum]
InputAdapter
Exchange [ctr_store_sk] #5
WholeStageCodegen (5)
HashAggregate [ctr_store_sk,ctr_total_return] [count,count,sum,sum]
HashAggregate [sr_customer_sk,sr_store_sk,sum] [ctr_store_sk,ctr_total_return,sum,sum(UnscaledValue(sr_return_amt))]
InputAdapter
Exchange [sr_customer_sk,sr_store_sk] #6
WholeStageCodegen (4)
HashAggregate [sr_customer_sk,sr_return_amt,sr_store_sk] [sum,sum]
Project [sr_customer_sk,sr_return_amt,sr_store_sk]
BroadcastHashJoin [d_date_sk,sr_returned_date_sk]
Filter [sr_returned_date_sk,sr_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_returns [sr_customer_sk,sr_return_amt,sr_returned_date_sk,sr_store_sk]
InputAdapter
ReusedExchange [d_date_sk] #3
InputAdapter
BroadcastExchange #7
WholeStageCodegen (7)
Project [s_store_sk]
Filter [s_state,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_state,s_store_sk]
InputAdapter
WholeStageCodegen (11)
Sort [c_customer_sk]
InputAdapter
Exchange [c_customer_sk] #8
WholeStageCodegen (10)
Filter [c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_customer_id,c_customer_sk]

@ -0,0 +1,255 @@
== Physical Plan ==
TakeOrderedAndProject (44)
+- * Project (43)
+- * BroadcastHashJoin Inner BuildRight (42)
:- * Project (37)
: +- * BroadcastHashJoin Inner BuildRight (36)
: :- * Project (30)
: : +- * BroadcastHashJoin Inner BuildRight (29)
: : :- * Filter (14)
: : : +- * HashAggregate (13)
: : : +- Exchange (12)
: : : +- * HashAggregate (11)
: : : +- * Project (10)
: : : +- * BroadcastHashJoin Inner BuildRight (9)
: : : :- * Filter (3)
: : : : +- * ColumnarToRow (2)
: : : : +- Scan parquet default.store_returns (1)
: : : +- BroadcastExchange (8)
: : : +- * Project (7)
: : : +- * Filter (6)
: : : +- * ColumnarToRow (5)
: : : +- Scan parquet default.date_dim (4)
: : +- BroadcastExchange (28)
: : +- * Filter (27)
: : +- * HashAggregate (26)
: : +- Exchange (25)
: : +- * HashAggregate (24)
: : +- * HashAggregate (23)
: : +- Exchange (22)
: : +- * HashAggregate (21)
: : +- * Project (20)
: : +- * BroadcastHashJoin Inner BuildRight (19)
: : :- * Filter (17)
: : : +- * ColumnarToRow (16)
: : : +- Scan parquet default.store_returns (15)
: : +- ReusedExchange (18)
: +- BroadcastExchange (35)
: +- * Project (34)
: +- * Filter (33)
: +- * ColumnarToRow (32)
: +- Scan parquet default.store (31)
+- BroadcastExchange (41)
+- * Filter (40)
+- * ColumnarToRow (39)
+- Scan parquet default.customer (38)
(1) Scan parquet default.store_returns
Output [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/store_returns]
PushedFilters: [IsNotNull(sr_returned_date_sk), IsNotNull(sr_store_sk), IsNotNull(sr_customer_sk)]
ReadSchema: struct<sr_returned_date_sk:bigint,sr_customer_sk:bigint,sr_store_sk:bigint,sr_return_amt:decimal(7,2)>
(2) ColumnarToRow [codegen id : 2]
Input [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
(3) Filter [codegen id : 2]
Input [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Condition : ((isnotnull(sr_returned_date_sk#1) AND isnotnull(sr_store_sk#3)) AND isnotnull(sr_customer_sk#2))
(4) Scan parquet default.date_dim
Output [2]: [d_date_sk#5, d_year#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2000), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(5) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#5, d_year#6]
(6) Filter [codegen id : 1]
Input [2]: [d_date_sk#5, d_year#6]
Condition : ((isnotnull(d_year#6) AND (d_year#6 = 2000)) AND isnotnull(d_date_sk#5))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#5]
Input [2]: [d_date_sk#5, d_year#6]
(8) BroadcastExchange
Input [1]: [d_date_sk#5]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#7]
(9) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [sr_returned_date_sk#1]
Right keys [1]: [cast(d_date_sk#5 as bigint)]
Join condition: None
(10) Project [codegen id : 2]
Output [3]: [sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Input [5]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4, d_date_sk#5]
(11) HashAggregate [codegen id : 2]
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Keys [2]: [sr_customer_sk#2, sr_store_sk#3]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#4))]
Aggregate Attributes [1]: [sum#8]
Results [3]: [sr_customer_sk#2, sr_store_sk#3, sum#9]
(12) Exchange
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sum#9]
Arguments: hashpartitioning(sr_customer_sk#2, sr_store_sk#3, 5), true, [id=#10]
(13) HashAggregate [codegen id : 9]
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sum#9]
Keys [2]: [sr_customer_sk#2, sr_store_sk#3]
Functions [1]: [sum(UnscaledValue(sr_return_amt#4))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#4))#11]
Results [3]: [sr_customer_sk#2 AS ctr_customer_sk#12, sr_store_sk#3 AS ctr_store_sk#13, MakeDecimal(sum(UnscaledValue(sr_return_amt#4))#11,17,2) AS ctr_total_return#14]
(14) Filter [codegen id : 9]
Input [3]: [ctr_customer_sk#12, ctr_store_sk#13, ctr_total_return#14]
Condition : isnotnull(ctr_total_return#14)
(15) Scan parquet default.store_returns
Output [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/store_returns]
PushedFilters: [IsNotNull(sr_returned_date_sk), IsNotNull(sr_store_sk)]
ReadSchema: struct<sr_returned_date_sk:bigint,sr_customer_sk:bigint,sr_store_sk:bigint,sr_return_amt:decimal(7,2)>
(16) ColumnarToRow [codegen id : 4]
Input [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
(17) Filter [codegen id : 4]
Input [4]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Condition : (isnotnull(sr_returned_date_sk#1) AND isnotnull(sr_store_sk#3))
(18) ReusedExchange [Reuses operator id: 8]
Output [1]: [d_date_sk#5]
(19) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [sr_returned_date_sk#1]
Right keys [1]: [cast(d_date_sk#5 as bigint)]
Join condition: None
(20) Project [codegen id : 4]
Output [3]: [sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Input [5]: [sr_returned_date_sk#1, sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4, d_date_sk#5]
(21) HashAggregate [codegen id : 4]
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sr_return_amt#4]
Keys [2]: [sr_customer_sk#2, sr_store_sk#3]
Functions [1]: [partial_sum(UnscaledValue(sr_return_amt#4))]
Aggregate Attributes [1]: [sum#15]
Results [3]: [sr_customer_sk#2, sr_store_sk#3, sum#16]
(22) Exchange
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sum#16]
Arguments: hashpartitioning(sr_customer_sk#2, sr_store_sk#3, 5), true, [id=#17]
(23) HashAggregate [codegen id : 5]
Input [3]: [sr_customer_sk#2, sr_store_sk#3, sum#16]
Keys [2]: [sr_customer_sk#2, sr_store_sk#3]
Functions [1]: [sum(UnscaledValue(sr_return_amt#4))]
Aggregate Attributes [1]: [sum(UnscaledValue(sr_return_amt#4))#18]
Results [2]: [sr_store_sk#3 AS ctr_store_sk#13, MakeDecimal(sum(UnscaledValue(sr_return_amt#4))#18,17,2) AS ctr_total_return#14]
(24) HashAggregate [codegen id : 5]
Input [2]: [ctr_store_sk#13, ctr_total_return#14]
Keys [1]: [ctr_store_sk#13]
Functions [1]: [partial_avg(ctr_total_return#14)]
Aggregate Attributes [2]: [sum#19, count#20]
Results [3]: [ctr_store_sk#13, sum#21, count#22]
(25) Exchange
Input [3]: [ctr_store_sk#13, sum#21, count#22]
Arguments: hashpartitioning(ctr_store_sk#13, 5), true, [id=#23]
(26) HashAggregate [codegen id : 6]
Input [3]: [ctr_store_sk#13, sum#21, count#22]
Keys [1]: [ctr_store_sk#13]
Functions [1]: [avg(ctr_total_return#14)]
Aggregate Attributes [1]: [avg(ctr_total_return#14)#24]
Results [2]: [CheckOverflow((promote_precision(avg(ctr_total_return#14)#24) * 1.200000), DecimalType(24,7), true) AS (CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25, ctr_store_sk#13 AS ctr_store_sk#13#26]
(27) Filter [codegen id : 6]
Input [2]: [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25, ctr_store_sk#13#26]
Condition : isnotnull((CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25)
(28) BroadcastExchange
Input [2]: [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25, ctr_store_sk#13#26]
Arguments: HashedRelationBroadcastMode(List(input[1, bigint, true]),false), [id=#27]
(29) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ctr_store_sk#13]
Right keys [1]: [ctr_store_sk#13#26]
Join condition: (cast(ctr_total_return#14 as decimal(24,7)) > (CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25)
(30) Project [codegen id : 9]
Output [2]: [ctr_customer_sk#12, ctr_store_sk#13]
Input [5]: [ctr_customer_sk#12, ctr_store_sk#13, ctr_total_return#14, (CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))#25, ctr_store_sk#13#26]
(31) Scan parquet default.store
Output [2]: [s_store_sk#28, s_state#29]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/store]
PushedFilters: [IsNotNull(s_state), EqualTo(s_state,TN), IsNotNull(s_store_sk)]
ReadSchema: struct<s_store_sk:int,s_state:string>
(32) ColumnarToRow [codegen id : 7]
Input [2]: [s_store_sk#28, s_state#29]
(33) Filter [codegen id : 7]
Input [2]: [s_store_sk#28, s_state#29]
Condition : ((isnotnull(s_state#29) AND (s_state#29 = TN)) AND isnotnull(s_store_sk#28))
(34) Project [codegen id : 7]
Output [1]: [s_store_sk#28]
Input [2]: [s_store_sk#28, s_state#29]
(35) BroadcastExchange
Input [1]: [s_store_sk#28]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#30]
(36) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ctr_store_sk#13]
Right keys [1]: [cast(s_store_sk#28 as bigint)]
Join condition: None
(37) Project [codegen id : 9]
Output [1]: [ctr_customer_sk#12]
Input [3]: [ctr_customer_sk#12, ctr_store_sk#13, s_store_sk#28]
(38) Scan parquet default.customer
Output [2]: [c_customer_sk#31, c_customer_id#32]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk)]
ReadSchema: struct<c_customer_sk:int,c_customer_id:string>
(39) ColumnarToRow [codegen id : 8]
Input [2]: [c_customer_sk#31, c_customer_id#32]
(40) Filter [codegen id : 8]
Input [2]: [c_customer_sk#31, c_customer_id#32]
Condition : isnotnull(c_customer_sk#31)
(41) BroadcastExchange
Input [2]: [c_customer_sk#31, c_customer_id#32]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#33]
(42) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [ctr_customer_sk#12]
Right keys [1]: [cast(c_customer_sk#31 as bigint)]
Join condition: None
(43) Project [codegen id : 9]
Output [1]: [c_customer_id#32]
Input [3]: [ctr_customer_sk#12, c_customer_sk#31, c_customer_id#32]
(44) TakeOrderedAndProject
Input [1]: [c_customer_id#32]
Arguments: 100, [c_customer_id#32 ASC NULLS FIRST], [c_customer_id#32]

@ -0,0 +1,65 @@
TakeOrderedAndProject [c_customer_id]
WholeStageCodegen (9)
Project [c_customer_id]
BroadcastHashJoin [c_customer_sk,ctr_customer_sk]
Project [ctr_customer_sk]
BroadcastHashJoin [ctr_store_sk,s_store_sk]
Project [ctr_customer_sk,ctr_store_sk]
BroadcastHashJoin [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6))),ctr_store_sk,ctr_store_skL,ctr_total_return]
Filter [ctr_total_return]
HashAggregate [sr_customer_sk,sr_store_sk,sum] [ctr_customer_sk,ctr_store_sk,ctr_total_return,sum,sum(UnscaledValue(sr_return_amt))]
InputAdapter
Exchange [sr_customer_sk,sr_store_sk] #1
WholeStageCodegen (2)
HashAggregate [sr_customer_sk,sr_return_amt,sr_store_sk] [sum,sum]
Project [sr_customer_sk,sr_return_amt,sr_store_sk]
BroadcastHashJoin [d_date_sk,sr_returned_date_sk]
Filter [sr_customer_sk,sr_returned_date_sk,sr_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_returns [sr_customer_sk,sr_return_amt,sr_returned_date_sk,sr_store_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_year]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (6)
Filter [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6)))]
HashAggregate [count,ctr_store_sk,sum] [(CAST(avg(ctr_total_return) AS DECIMAL(21,6)) * CAST(1.2 AS DECIMAL(21,6))),avg(ctr_total_return),count,ctr_store_skL,sum]
InputAdapter
Exchange [ctr_store_sk] #4
WholeStageCodegen (5)
HashAggregate [ctr_store_sk,ctr_total_return] [count,count,sum,sum]
HashAggregate [sr_customer_sk,sr_store_sk,sum] [ctr_store_sk,ctr_total_return,sum,sum(UnscaledValue(sr_return_amt))]
InputAdapter
Exchange [sr_customer_sk,sr_store_sk] #5
WholeStageCodegen (4)
HashAggregate [sr_customer_sk,sr_return_amt,sr_store_sk] [sum,sum]
Project [sr_customer_sk,sr_return_amt,sr_store_sk]
BroadcastHashJoin [d_date_sk,sr_returned_date_sk]
Filter [sr_returned_date_sk,sr_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_returns [sr_customer_sk,sr_return_amt,sr_returned_date_sk,sr_store_sk]
InputAdapter
ReusedExchange [d_date_sk] #2
InputAdapter
BroadcastExchange #6
WholeStageCodegen (7)
Project [s_store_sk]
Filter [s_state,s_store_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store [s_state,s_store_sk]
InputAdapter
BroadcastExchange #7
WholeStageCodegen (8)
Filter [c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_customer_id,c_customer_sk]

@ -0,0 +1,319 @@
== Physical Plan ==
TakeOrderedAndProject (58)
+- * HashAggregate (57)
+- Exchange (56)
+- * HashAggregate (55)
+- * Project (54)
+- * SortMergeJoin Inner (53)
:- * Sort (47)
: +- Exchange (46)
: +- * Project (45)
: +- * BroadcastHashJoin Inner BuildRight (44)
: :- * Project (38)
: : +- * Filter (37)
: : +- SortMergeJoin ExistenceJoin(exists#1) (36)
: : :- SortMergeJoin ExistenceJoin(exists#2) (27)
: : : :- SortMergeJoin LeftSemi (18)
: : : : :- * Sort (5)
: : : : : +- Exchange (4)
: : : : : +- * Filter (3)
: : : : : +- * ColumnarToRow (2)
: : : : : +- Scan parquet default.customer (1)
: : : : +- * Sort (17)
: : : : +- Exchange (16)
: : : : +- * Project (15)
: : : : +- * BroadcastHashJoin Inner BuildRight (14)
: : : : :- * Filter (8)
: : : : : +- * ColumnarToRow (7)
: : : : : +- Scan parquet default.store_sales (6)
: : : : +- BroadcastExchange (13)
: : : : +- * Project (12)
: : : : +- * Filter (11)
: : : : +- * ColumnarToRow (10)
: : : : +- Scan parquet default.date_dim (9)
: : : +- * Sort (26)
: : : +- Exchange (25)
: : : +- * Project (24)
: : : +- * BroadcastHashJoin Inner BuildRight (23)
: : : :- * Filter (21)
: : : : +- * ColumnarToRow (20)
: : : : +- Scan parquet default.web_sales (19)
: : : +- ReusedExchange (22)
: : +- * Sort (35)
: : +- Exchange (34)
: : +- * Project (33)
: : +- * BroadcastHashJoin Inner BuildRight (32)
: : :- * Filter (30)
: : : +- * ColumnarToRow (29)
: : : +- Scan parquet default.catalog_sales (28)
: : +- ReusedExchange (31)
: +- BroadcastExchange (43)
: +- * Project (42)
: +- * Filter (41)
: +- * ColumnarToRow (40)
: +- Scan parquet default.customer_address (39)
+- * Sort (52)
+- Exchange (51)
+- * Filter (50)
+- * ColumnarToRow (49)
+- Scan parquet default.customer_demographics (48)
(1) Scan parquet default.customer
Output [3]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_current_addr_sk), IsNotNull(c_current_cdemo_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_cdemo_sk:int,c_current_addr_sk:int>
(2) ColumnarToRow [codegen id : 1]
Input [3]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5]
(3) Filter [codegen id : 1]
Input [3]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5]
Condition : (isnotnull(c_current_addr_sk#5) AND isnotnull(c_current_cdemo_sk#4))
(4) Exchange
Input [3]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5]
Arguments: hashpartitioning(c_customer_sk#3, 5), true, [id=#6]
(5) Sort [codegen id : 2]
Input [3]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5]
Arguments: [c_customer_sk#3 ASC NULLS FIRST], false, 0
(6) Scan parquet default.store_sales
Output [2]: [ss_sold_date_sk#7, ss_customer_sk#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int>
(7) ColumnarToRow [codegen id : 4]
Input [2]: [ss_sold_date_sk#7, ss_customer_sk#8]
(8) Filter [codegen id : 4]
Input [2]: [ss_sold_date_sk#7, ss_customer_sk#8]
Condition : isnotnull(ss_sold_date_sk#7)
(9) Scan parquet default.date_dim
Output [3]: [d_date_sk#9, d_year#10, d_moy#11]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_year), IsNotNull(d_moy), EqualTo(d_year,2002), GreaterThanOrEqual(d_moy,1), LessThanOrEqual(d_moy,4), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(10) ColumnarToRow [codegen id : 3]
Input [3]: [d_date_sk#9, d_year#10, d_moy#11]
(11) Filter [codegen id : 3]
Input [3]: [d_date_sk#9, d_year#10, d_moy#11]
Condition : (((((isnotnull(d_year#10) AND isnotnull(d_moy#11)) AND (d_year#10 = 2002)) AND (d_moy#11 >= 1)) AND (d_moy#11 <= 4)) AND isnotnull(d_date_sk#9))
(12) Project [codegen id : 3]
Output [1]: [d_date_sk#9]
Input [3]: [d_date_sk#9, d_year#10, d_moy#11]
(13) BroadcastExchange
Input [1]: [d_date_sk#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(14) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ss_sold_date_sk#7]
Right keys [1]: [d_date_sk#9]
Join condition: None
(15) Project [codegen id : 4]
Output [1]: [ss_customer_sk#8]
Input [3]: [ss_sold_date_sk#7, ss_customer_sk#8, d_date_sk#9]
(16) Exchange
Input [1]: [ss_customer_sk#8]
Arguments: hashpartitioning(ss_customer_sk#8, 5), true, [id=#13]
(17) Sort [codegen id : 5]
Input [1]: [ss_customer_sk#8]
Arguments: [ss_customer_sk#8 ASC NULLS FIRST], false, 0
(18) SortMergeJoin
Left keys [1]: [c_customer_sk#3]
Right keys [1]: [ss_customer_sk#8]
Join condition: None
(19) Scan parquet default.web_sales
Output [2]: [ws_sold_date_sk#14, ws_bill_customer_sk#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/web_sales]
PushedFilters: [IsNotNull(ws_sold_date_sk)]
ReadSchema: struct<ws_sold_date_sk:int,ws_bill_customer_sk:int>
(20) ColumnarToRow [codegen id : 7]
Input [2]: [ws_sold_date_sk#14, ws_bill_customer_sk#15]
(21) Filter [codegen id : 7]
Input [2]: [ws_sold_date_sk#14, ws_bill_customer_sk#15]
Condition : isnotnull(ws_sold_date_sk#14)
(22) ReusedExchange [Reuses operator id: 13]
Output [1]: [d_date_sk#9]
(23) BroadcastHashJoin [codegen id : 7]
Left keys [1]: [ws_sold_date_sk#14]
Right keys [1]: [d_date_sk#9]
Join condition: None
(24) Project [codegen id : 7]
Output [1]: [ws_bill_customer_sk#15]
Input [3]: [ws_sold_date_sk#14, ws_bill_customer_sk#15, d_date_sk#9]
(25) Exchange
Input [1]: [ws_bill_customer_sk#15]
Arguments: hashpartitioning(ws_bill_customer_sk#15, 5), true, [id=#16]
(26) Sort [codegen id : 8]
Input [1]: [ws_bill_customer_sk#15]
Arguments: [ws_bill_customer_sk#15 ASC NULLS FIRST], false, 0
(27) SortMergeJoin
Left keys [1]: [c_customer_sk#3]
Right keys [1]: [ws_bill_customer_sk#15]
Join condition: None
(28) Scan parquet default.catalog_sales
Output [2]: [cs_sold_date_sk#17, cs_ship_customer_sk#18]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/catalog_sales]
PushedFilters: [IsNotNull(cs_sold_date_sk)]
ReadSchema: struct<cs_sold_date_sk:int,cs_ship_customer_sk:int>
(29) ColumnarToRow [codegen id : 10]
Input [2]: [cs_sold_date_sk#17, cs_ship_customer_sk#18]
(30) Filter [codegen id : 10]
Input [2]: [cs_sold_date_sk#17, cs_ship_customer_sk#18]
Condition : isnotnull(cs_sold_date_sk#17)
(31) ReusedExchange [Reuses operator id: 13]
Output [1]: [d_date_sk#9]
(32) BroadcastHashJoin [codegen id : 10]
Left keys [1]: [cs_sold_date_sk#17]
Right keys [1]: [d_date_sk#9]
Join condition: None
(33) Project [codegen id : 10]
Output [1]: [cs_ship_customer_sk#18]
Input [3]: [cs_sold_date_sk#17, cs_ship_customer_sk#18, d_date_sk#9]
(34) Exchange
Input [1]: [cs_ship_customer_sk#18]
Arguments: hashpartitioning(cs_ship_customer_sk#18, 5), true, [id=#19]
(35) Sort [codegen id : 11]
Input [1]: [cs_ship_customer_sk#18]
Arguments: [cs_ship_customer_sk#18 ASC NULLS FIRST], false, 0
(36) SortMergeJoin
Left keys [1]: [c_customer_sk#3]
Right keys [1]: [cs_ship_customer_sk#18]
Join condition: None
(37) Filter [codegen id : 13]
Input [5]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5, exists#2, exists#1]
Condition : (exists#2 OR exists#1)
(38) Project [codegen id : 13]
Output [2]: [c_current_cdemo_sk#4, c_current_addr_sk#5]
Input [5]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5, exists#2, exists#1]
(39) Scan parquet default.customer_address
Output [2]: [ca_address_sk#20, ca_county#21]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/customer_address]
PushedFilters: [In(ca_county, [Rush County,Toole County,Jefferson County,Dona Ana County,La Porte County]), IsNotNull(ca_address_sk)]
ReadSchema: struct<ca_address_sk:int,ca_county:string>
(40) ColumnarToRow [codegen id : 12]
Input [2]: [ca_address_sk#20, ca_county#21]
(41) Filter [codegen id : 12]
Input [2]: [ca_address_sk#20, ca_county#21]
Condition : (ca_county#21 IN (Rush County,Toole County,Jefferson County,Dona Ana County,La Porte County) AND isnotnull(ca_address_sk#20))
(42) Project [codegen id : 12]
Output [1]: [ca_address_sk#20]
Input [2]: [ca_address_sk#20, ca_county#21]
(43) BroadcastExchange
Input [1]: [ca_address_sk#20]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#22]
(44) BroadcastHashJoin [codegen id : 13]
Left keys [1]: [c_current_addr_sk#5]
Right keys [1]: [ca_address_sk#20]
Join condition: None
(45) Project [codegen id : 13]
Output [1]: [c_current_cdemo_sk#4]
Input [3]: [c_current_cdemo_sk#4, c_current_addr_sk#5, ca_address_sk#20]
(46) Exchange
Input [1]: [c_current_cdemo_sk#4]
Arguments: hashpartitioning(c_current_cdemo_sk#4, 5), true, [id=#23]
(47) Sort [codegen id : 14]
Input [1]: [c_current_cdemo_sk#4]
Arguments: [c_current_cdemo_sk#4 ASC NULLS FIRST], false, 0
(48) Scan parquet default.customer_demographics
Output [9]: [cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/customer_demographics]
PushedFilters: [IsNotNull(cd_demo_sk)]
ReadSchema: struct<cd_demo_sk:int,cd_gender:string,cd_marital_status:string,cd_education_status:string,cd_purchase_estimate:int,cd_credit_rating:string,cd_dep_count:int,cd_dep_employed_count:int,cd_dep_college_count:int>
(49) ColumnarToRow [codegen id : 15]
Input [9]: [cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
(50) Filter [codegen id : 15]
Input [9]: [cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Condition : isnotnull(cd_demo_sk#24)
(51) Exchange
Input [9]: [cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Arguments: hashpartitioning(cd_demo_sk#24, 5), true, [id=#33]
(52) Sort [codegen id : 16]
Input [9]: [cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Arguments: [cd_demo_sk#24 ASC NULLS FIRST], false, 0
(53) SortMergeJoin [codegen id : 17]
Left keys [1]: [c_current_cdemo_sk#4]
Right keys [1]: [cd_demo_sk#24]
Join condition: None
(54) Project [codegen id : 17]
Output [8]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Input [10]: [c_current_cdemo_sk#4, cd_demo_sk#24, cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
(55) HashAggregate [codegen id : 17]
Input [8]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Keys [8]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#34]
Results [9]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32, count#35]
(56) Exchange
Input [9]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32, count#35]
Arguments: hashpartitioning(cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32, 5), true, [id=#36]
(57) HashAggregate [codegen id : 18]
Input [9]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32, count#35]
Keys [8]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cd_purchase_estimate#28, cd_credit_rating#29, cd_dep_count#30, cd_dep_employed_count#31, cd_dep_college_count#32]
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#37]
Results [14]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, count(1)#37 AS cnt1#38, cd_purchase_estimate#28, count(1)#37 AS cnt2#39, cd_credit_rating#29, count(1)#37 AS cnt3#40, cd_dep_count#30, count(1)#37 AS cnt4#41, cd_dep_employed_count#31, count(1)#37 AS cnt5#42, cd_dep_college_count#32, count(1)#37 AS cnt6#43]
(58) TakeOrderedAndProject
Input [14]: [cd_gender#25, cd_marital_status#26, cd_education_status#27, cnt1#38, cd_purchase_estimate#28, cnt2#39, cd_credit_rating#29, cnt3#40, cd_dep_count#30, cnt4#41, cd_dep_employed_count#31, cnt5#42, cd_dep_college_count#32, cnt6#43]
Arguments: 100, [cd_gender#25 ASC NULLS FIRST, cd_marital_status#26 ASC NULLS FIRST, cd_education_status#27 ASC NULLS FIRST, cd_purchase_estimate#28 ASC NULLS FIRST, cd_credit_rating#29 ASC NULLS FIRST, cd_dep_count#30 ASC NULLS FIRST, cd_dep_employed_count#31 ASC NULLS FIRST, cd_dep_college_count#32 ASC NULLS FIRST], [cd_gender#25, cd_marital_status#26, cd_education_status#27, cnt1#38, cd_purchase_estimate#28, cnt2#39, cd_credit_rating#29, cnt3#40, cd_dep_count#30, cnt4#41, cd_dep_employed_count#31, cnt5#42, cd_dep_college_count#32, cnt6#43]

@ -0,0 +1,95 @@
TakeOrderedAndProject [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate,cnt1,cnt2,cnt3,cnt4,cnt5,cnt6]
WholeStageCodegen (18)
HashAggregate [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate,count] [cnt1,cnt2,cnt3,cnt4,cnt5,cnt6,count,count(1)]
InputAdapter
Exchange [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate] #1
WholeStageCodegen (17)
HashAggregate [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate] [count,count]
Project [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate]
SortMergeJoin [c_current_cdemo_sk,cd_demo_sk]
InputAdapter
WholeStageCodegen (14)
Sort [c_current_cdemo_sk]
InputAdapter
Exchange [c_current_cdemo_sk] #2
WholeStageCodegen (13)
Project [c_current_cdemo_sk]
BroadcastHashJoin [c_current_addr_sk,ca_address_sk]
Project [c_current_addr_sk,c_current_cdemo_sk]
Filter [exists,exists]
InputAdapter
SortMergeJoin [c_customer_sk,cs_ship_customer_sk]
SortMergeJoin [c_customer_sk,ws_bill_customer_sk]
SortMergeJoin [c_customer_sk,ss_customer_sk]
WholeStageCodegen (2)
Sort [c_customer_sk]
InputAdapter
Exchange [c_customer_sk] #3
WholeStageCodegen (1)
Filter [c_current_addr_sk,c_current_cdemo_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_current_addr_sk,c_current_cdemo_sk,c_customer_sk]
WholeStageCodegen (5)
Sort [ss_customer_sk]
InputAdapter
Exchange [ss_customer_sk] #4
WholeStageCodegen (4)
Project [ss_customer_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #5
WholeStageCodegen (3)
Project [d_date_sk]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
WholeStageCodegen (8)
Sort [ws_bill_customer_sk]
InputAdapter
Exchange [ws_bill_customer_sk] #6
WholeStageCodegen (7)
Project [ws_bill_customer_sk]
BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
Filter [ws_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.web_sales [ws_bill_customer_sk,ws_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #5
WholeStageCodegen (11)
Sort [cs_ship_customer_sk]
InputAdapter
Exchange [cs_ship_customer_sk] #7
WholeStageCodegen (10)
Project [cs_ship_customer_sk]
BroadcastHashJoin [cs_sold_date_sk,d_date_sk]
Filter [cs_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.catalog_sales [cs_ship_customer_sk,cs_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #5
InputAdapter
BroadcastExchange #8
WholeStageCodegen (12)
Project [ca_address_sk]
Filter [ca_address_sk,ca_county]
ColumnarToRow
InputAdapter
Scan parquet default.customer_address [ca_address_sk,ca_county]
InputAdapter
WholeStageCodegen (16)
Sort [cd_demo_sk]
InputAdapter
Exchange [cd_demo_sk] #9
WholeStageCodegen (15)
Filter [cd_demo_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer_demographics [cd_credit_rating,cd_demo_sk,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate]

@ -0,0 +1,279 @@
== Physical Plan ==
TakeOrderedAndProject (50)
+- * HashAggregate (49)
+- Exchange (48)
+- * HashAggregate (47)
+- * Project (46)
+- * BroadcastHashJoin Inner BuildRight (45)
:- * Project (40)
: +- * BroadcastHashJoin Inner BuildRight (39)
: :- * Project (33)
: : +- * Filter (32)
: : +- * BroadcastHashJoin ExistenceJoin(exists#1) BuildRight (31)
: : :- * BroadcastHashJoin ExistenceJoin(exists#2) BuildRight (23)
: : : :- * BroadcastHashJoin LeftSemi BuildRight (15)
: : : : :- * Filter (3)
: : : : : +- * ColumnarToRow (2)
: : : : : +- Scan parquet default.customer (1)
: : : : +- BroadcastExchange (14)
: : : : +- * Project (13)
: : : : +- * BroadcastHashJoin Inner BuildRight (12)
: : : : :- * Filter (6)
: : : : : +- * ColumnarToRow (5)
: : : : : +- Scan parquet default.store_sales (4)
: : : : +- BroadcastExchange (11)
: : : : +- * Project (10)
: : : : +- * Filter (9)
: : : : +- * ColumnarToRow (8)
: : : : +- Scan parquet default.date_dim (7)
: : : +- BroadcastExchange (22)
: : : +- * Project (21)
: : : +- * BroadcastHashJoin Inner BuildRight (20)
: : : :- * Filter (18)
: : : : +- * ColumnarToRow (17)
: : : : +- Scan parquet default.web_sales (16)
: : : +- ReusedExchange (19)
: : +- BroadcastExchange (30)
: : +- * Project (29)
: : +- * BroadcastHashJoin Inner BuildRight (28)
: : :- * Filter (26)
: : : +- * ColumnarToRow (25)
: : : +- Scan parquet default.catalog_sales (24)
: : +- ReusedExchange (27)
: +- BroadcastExchange (38)
: +- * Project (37)
: +- * Filter (36)
: +- * ColumnarToRow (35)
: +- Scan parquet default.customer_address (34)
+- BroadcastExchange (44)
+- * Filter (43)
+- * ColumnarToRow (42)
+- Scan parquet default.customer_demographics (41)
(1) Scan parquet default.customer
Output [3]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_current_addr_sk), IsNotNull(c_current_cdemo_sk)]
ReadSchema: struct<c_customer_sk:int,c_current_cdemo_sk:int,c_current_addr_sk:int>
(2) ColumnarToRow [codegen id : 9]
Input [3]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5]
(3) Filter [codegen id : 9]
Input [3]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5]
Condition : (isnotnull(c_current_addr_sk#5) AND isnotnull(c_current_cdemo_sk#4))
(4) Scan parquet default.store_sales
Output [2]: [ss_sold_date_sk#6, ss_customer_sk#7]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_sold_date_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int>
(5) ColumnarToRow [codegen id : 2]
Input [2]: [ss_sold_date_sk#6, ss_customer_sk#7]
(6) Filter [codegen id : 2]
Input [2]: [ss_sold_date_sk#6, ss_customer_sk#7]
Condition : isnotnull(ss_sold_date_sk#6)
(7) Scan parquet default.date_dim
Output [3]: [d_date_sk#8, d_year#9, d_moy#10]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_moy), IsNotNull(d_year), EqualTo(d_year,2002), GreaterThanOrEqual(d_moy,1), LessThanOrEqual(d_moy,4), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int,d_moy:int>
(8) ColumnarToRow [codegen id : 1]
Input [3]: [d_date_sk#8, d_year#9, d_moy#10]
(9) Filter [codegen id : 1]
Input [3]: [d_date_sk#8, d_year#9, d_moy#10]
Condition : (((((isnotnull(d_moy#10) AND isnotnull(d_year#9)) AND (d_year#9 = 2002)) AND (d_moy#10 >= 1)) AND (d_moy#10 <= 4)) AND isnotnull(d_date_sk#8))
(10) Project [codegen id : 1]
Output [1]: [d_date_sk#8]
Input [3]: [d_date_sk#8, d_year#9, d_moy#10]
(11) BroadcastExchange
Input [1]: [d_date_sk#8]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#11]
(12) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [ss_sold_date_sk#6]
Right keys [1]: [d_date_sk#8]
Join condition: None
(13) Project [codegen id : 2]
Output [1]: [ss_customer_sk#7]
Input [3]: [ss_sold_date_sk#6, ss_customer_sk#7, d_date_sk#8]
(14) BroadcastExchange
Input [1]: [ss_customer_sk#7]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#12]
(15) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [c_customer_sk#3]
Right keys [1]: [ss_customer_sk#7]
Join condition: None
(16) Scan parquet default.web_sales
Output [2]: [ws_sold_date_sk#13, ws_bill_customer_sk#14]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/web_sales]
PushedFilters: [IsNotNull(ws_sold_date_sk)]
ReadSchema: struct<ws_sold_date_sk:int,ws_bill_customer_sk:int>
(17) ColumnarToRow [codegen id : 4]
Input [2]: [ws_sold_date_sk#13, ws_bill_customer_sk#14]
(18) Filter [codegen id : 4]
Input [2]: [ws_sold_date_sk#13, ws_bill_customer_sk#14]
Condition : isnotnull(ws_sold_date_sk#13)
(19) ReusedExchange [Reuses operator id: 11]
Output [1]: [d_date_sk#8]
(20) BroadcastHashJoin [codegen id : 4]
Left keys [1]: [ws_sold_date_sk#13]
Right keys [1]: [d_date_sk#8]
Join condition: None
(21) Project [codegen id : 4]
Output [1]: [ws_bill_customer_sk#14]
Input [3]: [ws_sold_date_sk#13, ws_bill_customer_sk#14, d_date_sk#8]
(22) BroadcastExchange
Input [1]: [ws_bill_customer_sk#14]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#15]
(23) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [c_customer_sk#3]
Right keys [1]: [ws_bill_customer_sk#14]
Join condition: None
(24) Scan parquet default.catalog_sales
Output [2]: [cs_sold_date_sk#16, cs_ship_customer_sk#17]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/catalog_sales]
PushedFilters: [IsNotNull(cs_sold_date_sk)]
ReadSchema: struct<cs_sold_date_sk:int,cs_ship_customer_sk:int>
(25) ColumnarToRow [codegen id : 6]
Input [2]: [cs_sold_date_sk#16, cs_ship_customer_sk#17]
(26) Filter [codegen id : 6]
Input [2]: [cs_sold_date_sk#16, cs_ship_customer_sk#17]
Condition : isnotnull(cs_sold_date_sk#16)
(27) ReusedExchange [Reuses operator id: 11]
Output [1]: [d_date_sk#8]
(28) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [cs_sold_date_sk#16]
Right keys [1]: [d_date_sk#8]
Join condition: None
(29) Project [codegen id : 6]
Output [1]: [cs_ship_customer_sk#17]
Input [3]: [cs_sold_date_sk#16, cs_ship_customer_sk#17, d_date_sk#8]
(30) BroadcastExchange
Input [1]: [cs_ship_customer_sk#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#18]
(31) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [c_customer_sk#3]
Right keys [1]: [cs_ship_customer_sk#17]
Join condition: None
(32) Filter [codegen id : 9]
Input [5]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5, exists#2, exists#1]
Condition : (exists#2 OR exists#1)
(33) Project [codegen id : 9]
Output [2]: [c_current_cdemo_sk#4, c_current_addr_sk#5]
Input [5]: [c_customer_sk#3, c_current_cdemo_sk#4, c_current_addr_sk#5, exists#2, exists#1]
(34) Scan parquet default.customer_address
Output [2]: [ca_address_sk#19, ca_county#20]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/customer_address]
PushedFilters: [In(ca_county, [Rush County,Toole County,Jefferson County,Dona Ana County,La Porte County]), IsNotNull(ca_address_sk)]
ReadSchema: struct<ca_address_sk:int,ca_county:string>
(35) ColumnarToRow [codegen id : 7]
Input [2]: [ca_address_sk#19, ca_county#20]
(36) Filter [codegen id : 7]
Input [2]: [ca_address_sk#19, ca_county#20]
Condition : (ca_county#20 IN (Rush County,Toole County,Jefferson County,Dona Ana County,La Porte County) AND isnotnull(ca_address_sk#19))
(37) Project [codegen id : 7]
Output [1]: [ca_address_sk#19]
Input [2]: [ca_address_sk#19, ca_county#20]
(38) BroadcastExchange
Input [1]: [ca_address_sk#19]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#21]
(39) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [c_current_addr_sk#5]
Right keys [1]: [ca_address_sk#19]
Join condition: None
(40) Project [codegen id : 9]
Output [1]: [c_current_cdemo_sk#4]
Input [3]: [c_current_cdemo_sk#4, c_current_addr_sk#5, ca_address_sk#19]
(41) Scan parquet default.customer_demographics
Output [9]: [cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/customer_demographics]
PushedFilters: [IsNotNull(cd_demo_sk)]
ReadSchema: struct<cd_demo_sk:int,cd_gender:string,cd_marital_status:string,cd_education_status:string,cd_purchase_estimate:int,cd_credit_rating:string,cd_dep_count:int,cd_dep_employed_count:int,cd_dep_college_count:int>
(42) ColumnarToRow [codegen id : 8]
Input [9]: [cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
(43) Filter [codegen id : 8]
Input [9]: [cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Condition : isnotnull(cd_demo_sk#22)
(44) BroadcastExchange
Input [9]: [cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#31]
(45) BroadcastHashJoin [codegen id : 9]
Left keys [1]: [c_current_cdemo_sk#4]
Right keys [1]: [cd_demo_sk#22]
Join condition: None
(46) Project [codegen id : 9]
Output [8]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Input [10]: [c_current_cdemo_sk#4, cd_demo_sk#22, cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
(47) HashAggregate [codegen id : 9]
Input [8]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Keys [8]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Functions [1]: [partial_count(1)]
Aggregate Attributes [1]: [count#32]
Results [9]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30, count#33]
(48) Exchange
Input [9]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30, count#33]
Arguments: hashpartitioning(cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30, 5), true, [id=#34]
(49) HashAggregate [codegen id : 10]
Input [9]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30, count#33]
Keys [8]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cd_purchase_estimate#26, cd_credit_rating#27, cd_dep_count#28, cd_dep_employed_count#29, cd_dep_college_count#30]
Functions [1]: [count(1)]
Aggregate Attributes [1]: [count(1)#35]
Results [14]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, count(1)#35 AS cnt1#36, cd_purchase_estimate#26, count(1)#35 AS cnt2#37, cd_credit_rating#27, count(1)#35 AS cnt3#38, cd_dep_count#28, count(1)#35 AS cnt4#39, cd_dep_employed_count#29, count(1)#35 AS cnt5#40, cd_dep_college_count#30, count(1)#35 AS cnt6#41]
(50) TakeOrderedAndProject
Input [14]: [cd_gender#23, cd_marital_status#24, cd_education_status#25, cnt1#36, cd_purchase_estimate#26, cnt2#37, cd_credit_rating#27, cnt3#38, cd_dep_count#28, cnt4#39, cd_dep_employed_count#29, cnt5#40, cd_dep_college_count#30, cnt6#41]
Arguments: 100, [cd_gender#23 ASC NULLS FIRST, cd_marital_status#24 ASC NULLS FIRST, cd_education_status#25 ASC NULLS FIRST, cd_purchase_estimate#26 ASC NULLS FIRST, cd_credit_rating#27 ASC NULLS FIRST, cd_dep_count#28 ASC NULLS FIRST, cd_dep_employed_count#29 ASC NULLS FIRST, cd_dep_college_count#30 ASC NULLS FIRST], [cd_gender#23, cd_marital_status#24, cd_education_status#25, cnt1#36, cd_purchase_estimate#26, cnt2#37, cd_credit_rating#27, cnt3#38, cd_dep_count#28, cnt4#39, cd_dep_employed_count#29, cnt5#40, cd_dep_college_count#30, cnt6#41]

@ -0,0 +1,74 @@
TakeOrderedAndProject [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate,cnt1,cnt2,cnt3,cnt4,cnt5,cnt6]
WholeStageCodegen (10)
HashAggregate [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate,count] [cnt1,cnt2,cnt3,cnt4,cnt5,cnt6,count,count(1)]
InputAdapter
Exchange [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate] #1
WholeStageCodegen (9)
HashAggregate [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate] [count,count]
Project [cd_credit_rating,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate]
BroadcastHashJoin [c_current_cdemo_sk,cd_demo_sk]
Project [c_current_cdemo_sk]
BroadcastHashJoin [c_current_addr_sk,ca_address_sk]
Project [c_current_addr_sk,c_current_cdemo_sk]
Filter [exists,exists]
BroadcastHashJoin [c_customer_sk,cs_ship_customer_sk]
BroadcastHashJoin [c_customer_sk,ws_bill_customer_sk]
BroadcastHashJoin [c_customer_sk,ss_customer_sk]
Filter [c_current_addr_sk,c_current_cdemo_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_current_addr_sk,c_current_cdemo_sk,c_customer_sk]
InputAdapter
BroadcastExchange #2
WholeStageCodegen (2)
Project [ss_customer_sk]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_sold_date_sk]
InputAdapter
BroadcastExchange #3
WholeStageCodegen (1)
Project [d_date_sk]
Filter [d_date_sk,d_moy,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_moy,d_year]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (4)
Project [ws_bill_customer_sk]
BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
Filter [ws_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.web_sales [ws_bill_customer_sk,ws_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #3
InputAdapter
BroadcastExchange #5
WholeStageCodegen (6)
Project [cs_ship_customer_sk]
BroadcastHashJoin [cs_sold_date_sk,d_date_sk]
Filter [cs_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.catalog_sales [cs_ship_customer_sk,cs_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk] #3
InputAdapter
BroadcastExchange #6
WholeStageCodegen (7)
Project [ca_address_sk]
Filter [ca_address_sk,ca_county]
ColumnarToRow
InputAdapter
Scan parquet default.customer_address [ca_address_sk,ca_county]
InputAdapter
BroadcastExchange #7
WholeStageCodegen (8)
Filter [cd_demo_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer_demographics [cd_credit_rating,cd_demo_sk,cd_dep_college_count,cd_dep_count,cd_dep_employed_count,cd_education_status,cd_gender,cd_marital_status,cd_purchase_estimate]

@ -0,0 +1,482 @@
== Physical Plan ==
TakeOrderedAndProject (87)
+- * Project (86)
+- * SortMergeJoin Inner (85)
:- * Project (67)
: +- * SortMergeJoin Inner (66)
: :- * Project (46)
: : +- * SortMergeJoin Inner (45)
: : :- * Sort (24)
: : : +- Exchange (23)
: : : +- * Filter (22)
: : : +- * HashAggregate (21)
: : : +- Exchange (20)
: : : +- * HashAggregate (19)
: : : +- * Project (18)
: : : +- * SortMergeJoin Inner (17)
: : : :- * Sort (11)
: : : : +- Exchange (10)
: : : : +- * Project (9)
: : : : +- * BroadcastHashJoin Inner BuildRight (8)
: : : : :- * Filter (3)
: : : : : +- * ColumnarToRow (2)
: : : : : +- Scan parquet default.store_sales (1)
: : : : +- BroadcastExchange (7)
: : : : +- * Filter (6)
: : : : +- * ColumnarToRow (5)
: : : : +- Scan parquet default.date_dim (4)
: : : +- * Sort (16)
: : : +- Exchange (15)
: : : +- * Filter (14)
: : : +- * ColumnarToRow (13)
: : : +- Scan parquet default.customer (12)
: : +- * Sort (44)
: : +- Exchange (43)
: : +- * HashAggregate (42)
: : +- Exchange (41)
: : +- * HashAggregate (40)
: : +- * Project (39)
: : +- * SortMergeJoin Inner (38)
: : :- * Sort (35)
: : : +- Exchange (34)
: : : +- * Project (33)
: : : +- * BroadcastHashJoin Inner BuildRight (32)
: : : :- * Filter (27)
: : : : +- * ColumnarToRow (26)
: : : : +- Scan parquet default.store_sales (25)
: : : +- BroadcastExchange (31)
: : : +- * Filter (30)
: : : +- * ColumnarToRow (29)
: : : +- Scan parquet default.date_dim (28)
: : +- * Sort (37)
: : +- ReusedExchange (36)
: +- * Sort (65)
: +- Exchange (64)
: +- * Project (63)
: +- * Filter (62)
: +- * HashAggregate (61)
: +- Exchange (60)
: +- * HashAggregate (59)
: +- * Project (58)
: +- * SortMergeJoin Inner (57)
: :- * Sort (54)
: : +- Exchange (53)
: : +- * Project (52)
: : +- * BroadcastHashJoin Inner BuildRight (51)
: : :- * Filter (49)
: : : +- * ColumnarToRow (48)
: : : +- Scan parquet default.web_sales (47)
: : +- ReusedExchange (50)
: +- * Sort (56)
: +- ReusedExchange (55)
+- * Sort (84)
+- Exchange (83)
+- * HashAggregate (82)
+- Exchange (81)
+- * HashAggregate (80)
+- * Project (79)
+- * SortMergeJoin Inner (78)
:- * Sort (75)
: +- Exchange (74)
: +- * Project (73)
: +- * BroadcastHashJoin Inner BuildRight (72)
: :- * Filter (70)
: : +- * ColumnarToRow (69)
: : +- Scan parquet default.web_sales (68)
: +- ReusedExchange (71)
+- * Sort (77)
+- ReusedExchange (76)
(1) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_customer_sk), IsNotNull(ss_sold_date_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_ext_discount_amt:decimal(7,2),ss_ext_list_price:decimal(7,2)>
(2) ColumnarToRow [codegen id : 2]
Input [4]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4]
(3) Filter [codegen id : 2]
Input [4]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4]
Condition : (isnotnull(ss_customer_sk#2) AND isnotnull(ss_sold_date_sk#1))
(4) Scan parquet default.date_dim
Output [2]: [d_date_sk#5, d_year#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2001), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(5) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#5, d_year#6]
(6) Filter [codegen id : 1]
Input [2]: [d_date_sk#5, d_year#6]
Condition : ((isnotnull(d_year#6) AND (d_year#6 = 2001)) AND isnotnull(d_date_sk#5))
(7) BroadcastExchange
Input [2]: [d_date_sk#5, d_year#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#7]
(8) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#5]
Join condition: None
(9) Project [codegen id : 2]
Output [4]: [ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Input [6]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_date_sk#5, d_year#6]
(10) Exchange
Input [4]: [ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Arguments: hashpartitioning(ss_customer_sk#2, 5), true, [id=#8]
(11) Sort [codegen id : 3]
Input [4]: [ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Arguments: [ss_customer_sk#2 ASC NULLS FIRST], false, 0
(12) Scan parquet default.customer
Output [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_customer_id)]
ReadSchema: struct<c_customer_sk:int,c_customer_id:string,c_first_name:string,c_last_name:string,c_preferred_cust_flag:string,c_birth_country:string,c_login:string,c_email_address:string>
(13) ColumnarToRow [codegen id : 4]
Input [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
(14) Filter [codegen id : 4]
Input [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Condition : (isnotnull(c_customer_sk#9) AND isnotnull(c_customer_id#10))
(15) Exchange
Input [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Arguments: hashpartitioning(c_customer_sk#9, 5), true, [id=#17]
(16) Sort [codegen id : 5]
Input [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Arguments: [c_customer_sk#9 ASC NULLS FIRST], false, 0
(17) SortMergeJoin [codegen id : 6]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#9]
Join condition: None
(18) Project [codegen id : 6]
Output [10]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Input [12]: [ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6, c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
(19) HashAggregate [codegen id : 6]
Input [10]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Keys [8]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Functions [1]: [partial_sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#4 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#3 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum#18]
Results [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, sum#19]
(20) Exchange
Input [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, sum#19]
Arguments: hashpartitioning(c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, 5), true, [id=#20]
(21) HashAggregate [codegen id : 7]
Input [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, sum#19]
Keys [8]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Functions [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#4 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#3 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#4 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#3 as decimal(8,2)))), DecimalType(8,2), true)))#21]
Results [2]: [c_customer_id#10 AS customer_id#22, MakeDecimal(sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#4 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#3 as decimal(8,2)))), DecimalType(8,2), true)))#21,18,2) AS year_total#23]
(22) Filter [codegen id : 7]
Input [2]: [customer_id#22, year_total#23]
Condition : (isnotnull(year_total#23) AND (year_total#23 > 0.00))
(23) Exchange
Input [2]: [customer_id#22, year_total#23]
Arguments: hashpartitioning(customer_id#22, 5), true, [id=#24]
(24) Sort [codegen id : 8]
Input [2]: [customer_id#22, year_total#23]
Arguments: [customer_id#22 ASC NULLS FIRST], false, 0
(25) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/store_sales]
PushedFilters: [IsNotNull(ss_customer_sk), IsNotNull(ss_sold_date_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_ext_discount_amt:decimal(7,2),ss_ext_list_price:decimal(7,2)>
(26) ColumnarToRow [codegen id : 10]
Input [4]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4]
(27) Filter [codegen id : 10]
Input [4]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4]
Condition : (isnotnull(ss_customer_sk#2) AND isnotnull(ss_sold_date_sk#1))
(28) Scan parquet default.date_dim
Output [2]: [d_date_sk#5, d_year#6]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2002), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(29) ColumnarToRow [codegen id : 9]
Input [2]: [d_date_sk#5, d_year#6]
(30) Filter [codegen id : 9]
Input [2]: [d_date_sk#5, d_year#6]
Condition : ((isnotnull(d_year#6) AND (d_year#6 = 2002)) AND isnotnull(d_date_sk#5))
(31) BroadcastExchange
Input [2]: [d_date_sk#5, d_year#6]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#25]
(32) BroadcastHashJoin [codegen id : 10]
Left keys [1]: [ss_sold_date_sk#1]
Right keys [1]: [d_date_sk#5]
Join condition: None
(33) Project [codegen id : 10]
Output [4]: [ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Input [6]: [ss_sold_date_sk#1, ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_date_sk#5, d_year#6]
(34) Exchange
Input [4]: [ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Arguments: hashpartitioning(ss_customer_sk#2, 5), true, [id=#26]
(35) Sort [codegen id : 11]
Input [4]: [ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Arguments: [ss_customer_sk#2 ASC NULLS FIRST], false, 0
(36) ReusedExchange [Reuses operator id: 15]
Output [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
(37) Sort [codegen id : 13]
Input [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Arguments: [c_customer_sk#9 ASC NULLS FIRST], false, 0
(38) SortMergeJoin [codegen id : 14]
Left keys [1]: [ss_customer_sk#2]
Right keys [1]: [c_customer_sk#9]
Join condition: None
(39) Project [codegen id : 14]
Output [10]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Input [12]: [ss_customer_sk#2, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6, c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
(40) HashAggregate [codegen id : 14]
Input [10]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, ss_ext_discount_amt#3, ss_ext_list_price#4, d_year#6]
Keys [8]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Functions [1]: [partial_sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#4 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#3 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum#27]
Results [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, sum#28]
(41) Exchange
Input [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, sum#28]
Arguments: hashpartitioning(c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, 5), true, [id=#29]
(42) HashAggregate [codegen id : 15]
Input [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, sum#28]
Keys [8]: [c_customer_id#10, c_first_name#11, c_last_name#12, d_year#6, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Functions [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#4 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#3 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#4 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#3 as decimal(8,2)))), DecimalType(8,2), true)))#30]
Results [3]: [c_customer_id#10 AS customer_id#31, c_preferred_cust_flag#13 AS customer_preferred_cust_flag#32, MakeDecimal(sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#4 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#3 as decimal(8,2)))), DecimalType(8,2), true)))#30,18,2) AS year_total#33]
(43) Exchange
Input [3]: [customer_id#31, customer_preferred_cust_flag#32, year_total#33]
Arguments: hashpartitioning(customer_id#31, 5), true, [id=#34]
(44) Sort [codegen id : 16]
Input [3]: [customer_id#31, customer_preferred_cust_flag#32, year_total#33]
Arguments: [customer_id#31 ASC NULLS FIRST], false, 0
(45) SortMergeJoin [codegen id : 17]
Left keys [1]: [customer_id#22]
Right keys [1]: [customer_id#31]
Join condition: None
(46) Project [codegen id : 17]
Output [4]: [customer_id#22, year_total#23, customer_preferred_cust_flag#32, year_total#33]
Input [5]: [customer_id#22, year_total#23, customer_id#31, customer_preferred_cust_flag#32, year_total#33]
(47) Scan parquet default.web_sales
Output [4]: [ws_sold_date_sk#35, ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/web_sales]
PushedFilters: [IsNotNull(ws_bill_customer_sk), IsNotNull(ws_sold_date_sk)]
ReadSchema: struct<ws_sold_date_sk:int,ws_bill_customer_sk:int,ws_ext_discount_amt:decimal(7,2),ws_ext_list_price:decimal(7,2)>
(48) ColumnarToRow [codegen id : 19]
Input [4]: [ws_sold_date_sk#35, ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38]
(49) Filter [codegen id : 19]
Input [4]: [ws_sold_date_sk#35, ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38]
Condition : (isnotnull(ws_bill_customer_sk#36) AND isnotnull(ws_sold_date_sk#35))
(50) ReusedExchange [Reuses operator id: 7]
Output [2]: [d_date_sk#5, d_year#6]
(51) BroadcastHashJoin [codegen id : 19]
Left keys [1]: [ws_sold_date_sk#35]
Right keys [1]: [d_date_sk#5]
Join condition: None
(52) Project [codegen id : 19]
Output [4]: [ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Input [6]: [ws_sold_date_sk#35, ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_date_sk#5, d_year#6]
(53) Exchange
Input [4]: [ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Arguments: hashpartitioning(ws_bill_customer_sk#36, 5), true, [id=#39]
(54) Sort [codegen id : 20]
Input [4]: [ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Arguments: [ws_bill_customer_sk#36 ASC NULLS FIRST], false, 0
(55) ReusedExchange [Reuses operator id: 15]
Output [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
(56) Sort [codegen id : 22]
Input [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Arguments: [c_customer_sk#9 ASC NULLS FIRST], false, 0
(57) SortMergeJoin [codegen id : 23]
Left keys [1]: [ws_bill_customer_sk#36]
Right keys [1]: [c_customer_sk#9]
Join condition: None
(58) Project [codegen id : 23]
Output [10]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Input [12]: [ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6, c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
(59) HashAggregate [codegen id : 23]
Input [10]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Keys [8]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6]
Functions [1]: [partial_sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#38 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#37 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum#40]
Results [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6, sum#41]
(60) Exchange
Input [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6, sum#41]
Arguments: hashpartitioning(c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6, 5), true, [id=#42]
(61) HashAggregate [codegen id : 24]
Input [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6, sum#41]
Keys [8]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6]
Functions [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#38 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#37 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#38 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#37 as decimal(8,2)))), DecimalType(8,2), true)))#43]
Results [2]: [c_customer_id#10 AS customer_id#44, MakeDecimal(sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#38 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#37 as decimal(8,2)))), DecimalType(8,2), true)))#43,18,2) AS year_total#45]
(62) Filter [codegen id : 24]
Input [2]: [customer_id#44, year_total#45]
Condition : (isnotnull(year_total#45) AND (year_total#45 > 0.00))
(63) Project [codegen id : 24]
Output [2]: [customer_id#44 AS customer_id#46, year_total#45 AS year_total#47]
Input [2]: [customer_id#44, year_total#45]
(64) Exchange
Input [2]: [customer_id#46, year_total#47]
Arguments: hashpartitioning(customer_id#46, 5), true, [id=#48]
(65) Sort [codegen id : 25]
Input [2]: [customer_id#46, year_total#47]
Arguments: [customer_id#46 ASC NULLS FIRST], false, 0
(66) SortMergeJoin [codegen id : 26]
Left keys [1]: [customer_id#22]
Right keys [1]: [customer_id#46]
Join condition: None
(67) Project [codegen id : 26]
Output [5]: [customer_id#22, year_total#23, customer_preferred_cust_flag#32, year_total#33, year_total#47]
Input [6]: [customer_id#22, year_total#23, customer_preferred_cust_flag#32, year_total#33, customer_id#46, year_total#47]
(68) Scan parquet default.web_sales
Output [4]: [ws_sold_date_sk#35, ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/web_sales]
PushedFilters: [IsNotNull(ws_bill_customer_sk), IsNotNull(ws_sold_date_sk)]
ReadSchema: struct<ws_sold_date_sk:int,ws_bill_customer_sk:int,ws_ext_discount_amt:decimal(7,2),ws_ext_list_price:decimal(7,2)>
(69) ColumnarToRow [codegen id : 28]
Input [4]: [ws_sold_date_sk#35, ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38]
(70) Filter [codegen id : 28]
Input [4]: [ws_sold_date_sk#35, ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38]
Condition : (isnotnull(ws_bill_customer_sk#36) AND isnotnull(ws_sold_date_sk#35))
(71) ReusedExchange [Reuses operator id: 31]
Output [2]: [d_date_sk#5, d_year#6]
(72) BroadcastHashJoin [codegen id : 28]
Left keys [1]: [ws_sold_date_sk#35]
Right keys [1]: [d_date_sk#5]
Join condition: None
(73) Project [codegen id : 28]
Output [4]: [ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Input [6]: [ws_sold_date_sk#35, ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_date_sk#5, d_year#6]
(74) Exchange
Input [4]: [ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Arguments: hashpartitioning(ws_bill_customer_sk#36, 5), true, [id=#49]
(75) Sort [codegen id : 29]
Input [4]: [ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Arguments: [ws_bill_customer_sk#36 ASC NULLS FIRST], false, 0
(76) ReusedExchange [Reuses operator id: 15]
Output [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
(77) Sort [codegen id : 31]
Input [8]: [c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
Arguments: [c_customer_sk#9 ASC NULLS FIRST], false, 0
(78) SortMergeJoin [codegen id : 32]
Left keys [1]: [ws_bill_customer_sk#36]
Right keys [1]: [c_customer_sk#9]
Join condition: None
(79) Project [codegen id : 32]
Output [10]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Input [12]: [ws_bill_customer_sk#36, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6, c_customer_sk#9, c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16]
(80) HashAggregate [codegen id : 32]
Input [10]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, ws_ext_discount_amt#37, ws_ext_list_price#38, d_year#6]
Keys [8]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6]
Functions [1]: [partial_sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#38 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#37 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum#50]
Results [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6, sum#51]
(81) Exchange
Input [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6, sum#51]
Arguments: hashpartitioning(c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6, 5), true, [id=#52]
(82) HashAggregate [codegen id : 33]
Input [9]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6, sum#51]
Keys [8]: [c_customer_id#10, c_first_name#11, c_last_name#12, c_preferred_cust_flag#13, c_birth_country#14, c_login#15, c_email_address#16, d_year#6]
Functions [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#38 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#37 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#38 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#37 as decimal(8,2)))), DecimalType(8,2), true)))#53]
Results [2]: [c_customer_id#10 AS customer_id#54, MakeDecimal(sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#38 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#37 as decimal(8,2)))), DecimalType(8,2), true)))#53,18,2) AS year_total#55]
(83) Exchange
Input [2]: [customer_id#54, year_total#55]
Arguments: hashpartitioning(customer_id#54, 5), true, [id=#56]
(84) Sort [codegen id : 34]
Input [2]: [customer_id#54, year_total#55]
Arguments: [customer_id#54 ASC NULLS FIRST], false, 0
(85) SortMergeJoin [codegen id : 35]
Left keys [1]: [customer_id#22]
Right keys [1]: [customer_id#54]
Join condition: (CASE WHEN (year_total#47 > 0.00) THEN CheckOverflow((promote_precision(year_total#55) / promote_precision(year_total#47)), DecimalType(38,20), true) ELSE null END > CASE WHEN (year_total#23 > 0.00) THEN CheckOverflow((promote_precision(year_total#33) / promote_precision(year_total#23)), DecimalType(38,20), true) ELSE null END)
(86) Project [codegen id : 35]
Output [1]: [customer_preferred_cust_flag#32]
Input [7]: [customer_id#22, year_total#23, customer_preferred_cust_flag#32, year_total#33, year_total#47, customer_id#54, year_total#55]
(87) TakeOrderedAndProject
Input [1]: [customer_preferred_cust_flag#32]
Arguments: 100, [customer_preferred_cust_flag#32 ASC NULLS FIRST], [customer_preferred_cust_flag#32]

@@ -0,0 +1,158 @@
TakeOrderedAndProject [customer_preferred_cust_flag]
WholeStageCodegen (35)
Project [customer_preferred_cust_flag]
SortMergeJoin [customer_id,customer_id,year_total,year_total,year_total,year_total]
InputAdapter
WholeStageCodegen (26)
Project [customer_id,customer_preferred_cust_flag,year_total,year_total,year_total]
SortMergeJoin [customer_id,customer_id]
InputAdapter
WholeStageCodegen (17)
Project [customer_id,customer_preferred_cust_flag,year_total,year_total]
SortMergeJoin [customer_id,customer_id]
InputAdapter
WholeStageCodegen (8)
Sort [customer_id]
InputAdapter
Exchange [customer_id] #1
WholeStageCodegen (7)
Filter [year_total]
HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,sum] [customer_id,sum,sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt as decimal(8,2)))), DecimalType(8,2), true))),year_total]
InputAdapter
Exchange [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year] #2
WholeStageCodegen (6)
HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ss_ext_discount_amt,ss_ext_list_price] [sum,sum]
Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ss_ext_discount_amt,ss_ext_list_price]
SortMergeJoin [c_customer_sk,ss_customer_sk]
InputAdapter
WholeStageCodegen (3)
Sort [ss_customer_sk]
InputAdapter
Exchange [ss_customer_sk] #3
WholeStageCodegen (2)
Project [d_year,ss_customer_sk,ss_ext_discount_amt,ss_ext_list_price]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_customer_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_ext_discount_amt,ss_ext_list_price,ss_sold_date_sk]
InputAdapter
BroadcastExchange #4
WholeStageCodegen (1)
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_year]
InputAdapter
WholeStageCodegen (5)
Sort [c_customer_sk]
InputAdapter
Exchange [c_customer_sk] #5
WholeStageCodegen (4)
Filter [c_customer_id,c_customer_sk]
ColumnarToRow
InputAdapter
Scan parquet default.customer [c_birth_country,c_customer_id,c_customer_sk,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag]
InputAdapter
WholeStageCodegen (16)
Sort [customer_id]
InputAdapter
Exchange [customer_id] #6
WholeStageCodegen (15)
HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,sum] [customer_id,customer_preferred_cust_flag,sum,sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt as decimal(8,2)))), DecimalType(8,2), true))),year_total]
InputAdapter
Exchange [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year] #7
WholeStageCodegen (14)
HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ss_ext_discount_amt,ss_ext_list_price] [sum,sum]
Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ss_ext_discount_amt,ss_ext_list_price]
SortMergeJoin [c_customer_sk,ss_customer_sk]
InputAdapter
WholeStageCodegen (11)
Sort [ss_customer_sk]
InputAdapter
Exchange [ss_customer_sk] #8
WholeStageCodegen (10)
Project [d_year,ss_customer_sk,ss_ext_discount_amt,ss_ext_list_price]
BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
Filter [ss_customer_sk,ss_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.store_sales [ss_customer_sk,ss_ext_discount_amt,ss_ext_list_price,ss_sold_date_sk]
InputAdapter
BroadcastExchange #9
WholeStageCodegen (9)
Filter [d_date_sk,d_year]
ColumnarToRow
InputAdapter
Scan parquet default.date_dim [d_date_sk,d_year]
InputAdapter
WholeStageCodegen (13)
Sort [c_customer_sk]
InputAdapter
ReusedExchange [c_birth_country,c_customer_id,c_customer_sk,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag] #5
InputAdapter
WholeStageCodegen (25)
Sort [customer_id]
InputAdapter
Exchange [customer_id] #10
WholeStageCodegen (24)
Project [customer_id,year_total]
Filter [year_total]
HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,sum] [customer_id,sum,sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt as decimal(8,2)))), DecimalType(8,2), true))),year_total]
InputAdapter
Exchange [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year] #11
WholeStageCodegen (23)
HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ws_ext_discount_amt,ws_ext_list_price] [sum,sum]
Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ws_ext_discount_amt,ws_ext_list_price]
SortMergeJoin [c_customer_sk,ws_bill_customer_sk]
InputAdapter
WholeStageCodegen (20)
Sort [ws_bill_customer_sk]
InputAdapter
Exchange [ws_bill_customer_sk] #12
WholeStageCodegen (19)
Project [d_year,ws_bill_customer_sk,ws_ext_discount_amt,ws_ext_list_price]
BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
Filter [ws_bill_customer_sk,ws_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.web_sales [ws_bill_customer_sk,ws_ext_discount_amt,ws_ext_list_price,ws_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk,d_year] #4
InputAdapter
WholeStageCodegen (22)
Sort [c_customer_sk]
InputAdapter
ReusedExchange [c_birth_country,c_customer_id,c_customer_sk,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag] #5
InputAdapter
WholeStageCodegen (34)
Sort [customer_id]
InputAdapter
Exchange [customer_id] #13
WholeStageCodegen (33)
HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,sum] [customer_id,sum,sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt as decimal(8,2)))), DecimalType(8,2), true))),year_total]
InputAdapter
Exchange [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year] #14
WholeStageCodegen (32)
HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ws_ext_discount_amt,ws_ext_list_price] [sum,sum]
Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ws_ext_discount_amt,ws_ext_list_price]
SortMergeJoin [c_customer_sk,ws_bill_customer_sk]
InputAdapter
WholeStageCodegen (29)
Sort [ws_bill_customer_sk]
InputAdapter
Exchange [ws_bill_customer_sk] #15
WholeStageCodegen (28)
Project [d_year,ws_bill_customer_sk,ws_ext_discount_amt,ws_ext_list_price]
BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
Filter [ws_bill_customer_sk,ws_sold_date_sk]
ColumnarToRow
InputAdapter
Scan parquet default.web_sales [ws_bill_customer_sk,ws_ext_discount_amt,ws_ext_list_price,ws_sold_date_sk]
InputAdapter
ReusedExchange [d_date_sk,d_year] #9
InputAdapter
WholeStageCodegen (31)
Sort [c_customer_sk]
InputAdapter
ReusedExchange [c_birth_country,c_customer_id,c_customer_sk,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag] #5

@@ -0,0 +1,415 @@
== Physical Plan ==
TakeOrderedAndProject (73)
+- * Project (72)
+- * BroadcastHashJoin Inner BuildRight (71)
:- * Project (57)
: +- * BroadcastHashJoin Inner BuildRight (56)
: :- * Project (37)
: : +- * BroadcastHashJoin Inner BuildRight (36)
: : :- * Filter (19)
: : : +- * HashAggregate (18)
: : : +- Exchange (17)
: : : +- * HashAggregate (16)
: : : +- * Project (15)
: : : +- * BroadcastHashJoin Inner BuildRight (14)
: : : :- * Project (9)
: : : : +- * BroadcastHashJoin Inner BuildRight (8)
: : : : :- * Filter (3)
: : : : : +- * ColumnarToRow (2)
: : : : : +- Scan parquet default.customer (1)
: : : : +- BroadcastExchange (7)
: : : : +- * Filter (6)
: : : : +- * ColumnarToRow (5)
: : : : +- Scan parquet default.store_sales (4)
: : : +- BroadcastExchange (13)
: : : +- * Filter (12)
: : : +- * ColumnarToRow (11)
: : : +- Scan parquet default.date_dim (10)
: : +- BroadcastExchange (35)
: : +- * HashAggregate (34)
: : +- Exchange (33)
: : +- * HashAggregate (32)
: : +- * Project (31)
: : +- * BroadcastHashJoin Inner BuildRight (30)
: : :- * Project (25)
: : : +- * BroadcastHashJoin Inner BuildRight (24)
: : : :- * Filter (22)
: : : : +- * ColumnarToRow (21)
: : : : +- Scan parquet default.customer (20)
: : : +- ReusedExchange (23)
: : +- BroadcastExchange (29)
: : +- * Filter (28)
: : +- * ColumnarToRow (27)
: : +- Scan parquet default.date_dim (26)
: +- BroadcastExchange (55)
: +- * Project (54)
: +- * Filter (53)
: +- * HashAggregate (52)
: +- Exchange (51)
: +- * HashAggregate (50)
: +- * Project (49)
: +- * BroadcastHashJoin Inner BuildRight (48)
: :- * Project (46)
: : +- * BroadcastHashJoin Inner BuildRight (45)
: : :- * Filter (40)
: : : +- * ColumnarToRow (39)
: : : +- Scan parquet default.customer (38)
: : +- BroadcastExchange (44)
: : +- * Filter (43)
: : +- * ColumnarToRow (42)
: : +- Scan parquet default.web_sales (41)
: +- ReusedExchange (47)
+- BroadcastExchange (70)
+- * HashAggregate (69)
+- Exchange (68)
+- * HashAggregate (67)
+- * Project (66)
+- * BroadcastHashJoin Inner BuildRight (65)
:- * Project (63)
: +- * BroadcastHashJoin Inner BuildRight (62)
: :- * Filter (60)
: : +- * ColumnarToRow (59)
: : +- Scan parquet default.customer (58)
: +- ReusedExchange (61)
+- ReusedExchange (64)
(1) Scan parquet default.customer
Output [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_customer_id)]
ReadSchema: struct<c_customer_sk:int,c_customer_id:string,c_first_name:string,c_last_name:string,c_preferred_cust_flag:string,c_birth_country:string,c_login:string,c_email_address:string>
(2) ColumnarToRow [codegen id : 3]
Input [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
(3) Filter [codegen id : 3]
Input [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Condition : (isnotnull(c_customer_sk#1) AND isnotnull(c_customer_id#2))
(4) Scan parquet default.store_sales
Output [4]: [ss_sold_date_sk#9, ss_customer_sk#10, ss_ext_discount_amt#11, ss_ext_list_price#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/store_sales]
PushedFilters: [IsNotNull(ss_customer_sk), IsNotNull(ss_sold_date_sk)]
ReadSchema: struct<ss_sold_date_sk:int,ss_customer_sk:int,ss_ext_discount_amt:decimal(7,2),ss_ext_list_price:decimal(7,2)>
(5) ColumnarToRow [codegen id : 1]
Input [4]: [ss_sold_date_sk#9, ss_customer_sk#10, ss_ext_discount_amt#11, ss_ext_list_price#12]
(6) Filter [codegen id : 1]
Input [4]: [ss_sold_date_sk#9, ss_customer_sk#10, ss_ext_discount_amt#11, ss_ext_list_price#12]
Condition : (isnotnull(ss_customer_sk#10) AND isnotnull(ss_sold_date_sk#9))
(7) BroadcastExchange
Input [4]: [ss_sold_date_sk#9, ss_customer_sk#10, ss_ext_discount_amt#11, ss_ext_list_price#12]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, false] as bigint)),false), [id=#13]
(8) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [c_customer_sk#1]
Right keys [1]: [ss_customer_sk#10]
Join condition: None
(9) Project [codegen id : 3]
Output [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_sold_date_sk#9, ss_ext_discount_amt#11, ss_ext_list_price#12]
Input [12]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_sold_date_sk#9, ss_customer_sk#10, ss_ext_discount_amt#11, ss_ext_list_price#12]
(10) Scan parquet default.date_dim
Output [2]: [d_date_sk#14, d_year#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2001), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(11) ColumnarToRow [codegen id : 2]
Input [2]: [d_date_sk#14, d_year#15]
(12) Filter [codegen id : 2]
Input [2]: [d_date_sk#14, d_year#15]
Condition : ((isnotnull(d_year#15) AND (d_year#15 = 2001)) AND isnotnull(d_date_sk#14))
(13) BroadcastExchange
Input [2]: [d_date_sk#14, d_year#15]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#16]
(14) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ss_sold_date_sk#9]
Right keys [1]: [d_date_sk#14]
Join condition: None
(15) Project [codegen id : 3]
Output [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_ext_discount_amt#11, ss_ext_list_price#12, d_year#15]
Input [12]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_sold_date_sk#9, ss_ext_discount_amt#11, ss_ext_list_price#12, d_date_sk#14, d_year#15]
(16) HashAggregate [codegen id : 3]
Input [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_ext_discount_amt#11, ss_ext_list_price#12, d_year#15]
Keys [8]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Functions [1]: [partial_sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#12 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#11 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum#17]
Results [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, sum#18]
(17) Exchange
Input [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, sum#18]
Arguments: hashpartitioning(c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, 5), true, [id=#19]
(18) HashAggregate [codegen id : 16]
Input [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, sum#18]
Keys [8]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Functions [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#12 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#11 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#12 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#11 as decimal(8,2)))), DecimalType(8,2), true)))#20]
Results [2]: [c_customer_id#2 AS customer_id#21, MakeDecimal(sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#12 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#11 as decimal(8,2)))), DecimalType(8,2), true)))#20,18,2) AS year_total#22]
(19) Filter [codegen id : 16]
Input [2]: [customer_id#21, year_total#22]
Condition : (isnotnull(year_total#22) AND (year_total#22 > 0.00))
(20) Scan parquet default.customer
Output [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_customer_id)]
ReadSchema: struct<c_customer_sk:int,c_customer_id:string,c_first_name:string,c_last_name:string,c_preferred_cust_flag:string,c_birth_country:string,c_login:string,c_email_address:string>
(21) ColumnarToRow [codegen id : 6]
Input [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
(22) Filter [codegen id : 6]
Input [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Condition : (isnotnull(c_customer_sk#1) AND isnotnull(c_customer_id#2))
(23) ReusedExchange [Reuses operator id: 7]
Output [4]: [ss_sold_date_sk#9, ss_customer_sk#10, ss_ext_discount_amt#11, ss_ext_list_price#12]
(24) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [c_customer_sk#1]
Right keys [1]: [ss_customer_sk#10]
Join condition: None
(25) Project [codegen id : 6]
Output [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_sold_date_sk#9, ss_ext_discount_amt#11, ss_ext_list_price#12]
Input [12]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_sold_date_sk#9, ss_customer_sk#10, ss_ext_discount_amt#11, ss_ext_list_price#12]
(26) Scan parquet default.date_dim
Output [2]: [d_date_sk#14, d_year#15]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_year), EqualTo(d_year,2002), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_year:int>
(27) ColumnarToRow [codegen id : 5]
Input [2]: [d_date_sk#14, d_year#15]
(28) Filter [codegen id : 5]
Input [2]: [d_date_sk#14, d_year#15]
Condition : ((isnotnull(d_year#15) AND (d_year#15 = 2002)) AND isnotnull(d_date_sk#14))
(29) BroadcastExchange
Input [2]: [d_date_sk#14, d_year#15]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#23]
(30) BroadcastHashJoin [codegen id : 6]
Left keys [1]: [ss_sold_date_sk#9]
Right keys [1]: [d_date_sk#14]
Join condition: None
(31) Project [codegen id : 6]
Output [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_ext_discount_amt#11, ss_ext_list_price#12, d_year#15]
Input [12]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_sold_date_sk#9, ss_ext_discount_amt#11, ss_ext_list_price#12, d_date_sk#14, d_year#15]
(32) HashAggregate [codegen id : 6]
Input [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ss_ext_discount_amt#11, ss_ext_list_price#12, d_year#15]
Keys [8]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Functions [1]: [partial_sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#12 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#11 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum#24]
Results [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, sum#25]
(33) Exchange
Input [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, sum#25]
Arguments: hashpartitioning(c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, 5), true, [id=#26]
(34) HashAggregate [codegen id : 7]
Input [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, sum#25]
Keys [8]: [c_customer_id#2, c_first_name#3, c_last_name#4, d_year#15, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Functions [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#12 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#11 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#12 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#11 as decimal(8,2)))), DecimalType(8,2), true)))#27]
Results [3]: [c_customer_id#2 AS customer_id#28, c_preferred_cust_flag#5 AS customer_preferred_cust_flag#29, MakeDecimal(sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price#12 as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt#11 as decimal(8,2)))), DecimalType(8,2), true)))#27,18,2) AS year_total#30]
(35) BroadcastExchange
Input [3]: [customer_id#28, customer_preferred_cust_flag#29, year_total#30]
Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#31]
(36) BroadcastHashJoin [codegen id : 16]
Left keys [1]: [customer_id#21]
Right keys [1]: [customer_id#28]
Join condition: None
(37) Project [codegen id : 16]
Output [4]: [customer_id#21, year_total#22, customer_preferred_cust_flag#29, year_total#30]
Input [5]: [customer_id#21, year_total#22, customer_id#28, customer_preferred_cust_flag#29, year_total#30]
(38) Scan parquet default.customer
Output [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_customer_id)]
ReadSchema: struct<c_customer_sk:int,c_customer_id:string,c_first_name:string,c_last_name:string,c_preferred_cust_flag:string,c_birth_country:string,c_login:string,c_email_address:string>
(39) ColumnarToRow [codegen id : 10]
Input [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
(40) Filter [codegen id : 10]
Input [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Condition : (isnotnull(c_customer_sk#1) AND isnotnull(c_customer_id#2))
(41) Scan parquet default.web_sales
Output [4]: [ws_sold_date_sk#32, ws_bill_customer_sk#33, ws_ext_discount_amt#34, ws_ext_list_price#35]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/web_sales]
PushedFilters: [IsNotNull(ws_bill_customer_sk), IsNotNull(ws_sold_date_sk)]
ReadSchema: struct<ws_sold_date_sk:int,ws_bill_customer_sk:int,ws_ext_discount_amt:decimal(7,2),ws_ext_list_price:decimal(7,2)>
(42) ColumnarToRow [codegen id : 8]
Input [4]: [ws_sold_date_sk#32, ws_bill_customer_sk#33, ws_ext_discount_amt#34, ws_ext_list_price#35]
(43) Filter [codegen id : 8]
Input [4]: [ws_sold_date_sk#32, ws_bill_customer_sk#33, ws_ext_discount_amt#34, ws_ext_list_price#35]
Condition : (isnotnull(ws_bill_customer_sk#33) AND isnotnull(ws_sold_date_sk#32))
(44) BroadcastExchange
Input [4]: [ws_sold_date_sk#32, ws_bill_customer_sk#33, ws_ext_discount_amt#34, ws_ext_list_price#35]
Arguments: HashedRelationBroadcastMode(List(cast(input[1, int, false] as bigint)),false), [id=#36]
(45) BroadcastHashJoin [codegen id : 10]
Left keys [1]: [c_customer_sk#1]
Right keys [1]: [ws_bill_customer_sk#33]
Join condition: None
(46) Project [codegen id : 10]
Output [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_sold_date_sk#32, ws_ext_discount_amt#34, ws_ext_list_price#35]
Input [12]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_sold_date_sk#32, ws_bill_customer_sk#33, ws_ext_discount_amt#34, ws_ext_list_price#35]
(47) ReusedExchange [Reuses operator id: 13]
Output [2]: [d_date_sk#14, d_year#15]
(48) BroadcastHashJoin [codegen id : 10]
Left keys [1]: [ws_sold_date_sk#32]
Right keys [1]: [d_date_sk#14]
Join condition: None
(49) Project [codegen id : 10]
Output [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_ext_discount_amt#34, ws_ext_list_price#35, d_year#15]
Input [12]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_sold_date_sk#32, ws_ext_discount_amt#34, ws_ext_list_price#35, d_date_sk#14, d_year#15]
(50) HashAggregate [codegen id : 10]
Input [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_ext_discount_amt#34, ws_ext_list_price#35, d_year#15]
Keys [8]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15]
Functions [1]: [partial_sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#35 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#34 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum#37]
Results [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15, sum#38]
(51) Exchange
Input [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15, sum#38]
Arguments: hashpartitioning(c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15, 5), true, [id=#39]
(52) HashAggregate [codegen id : 11]
Input [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15, sum#38]
Keys [8]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15]
Functions [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#35 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#34 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#35 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#34 as decimal(8,2)))), DecimalType(8,2), true)))#40]
Results [2]: [c_customer_id#2 AS customer_id#41, MakeDecimal(sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#35 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#34 as decimal(8,2)))), DecimalType(8,2), true)))#40,18,2) AS year_total#42]
(53) Filter [codegen id : 11]
Input [2]: [customer_id#41, year_total#42]
Condition : (isnotnull(year_total#42) AND (year_total#42 > 0.00))
(54) Project [codegen id : 11]
Output [2]: [customer_id#41 AS customer_id#43, year_total#42 AS year_total#44]
Input [2]: [customer_id#41, year_total#42]
(55) BroadcastExchange
Input [2]: [customer_id#43, year_total#44]
Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#45]
(56) BroadcastHashJoin [codegen id : 16]
Left keys [1]: [customer_id#21]
Right keys [1]: [customer_id#43]
Join condition: None
(57) Project [codegen id : 16]
Output [5]: [customer_id#21, year_total#22, customer_preferred_cust_flag#29, year_total#30, year_total#44]
Input [6]: [customer_id#21, year_total#22, customer_preferred_cust_flag#29, year_total#30, customer_id#43, year_total#44]
(58) Scan parquet default.customer
Output [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/customer]
PushedFilters: [IsNotNull(c_customer_sk), IsNotNull(c_customer_id)]
ReadSchema: struct<c_customer_sk:int,c_customer_id:string,c_first_name:string,c_last_name:string,c_preferred_cust_flag:string,c_birth_country:string,c_login:string,c_email_address:string>
(59) ColumnarToRow [codegen id : 14]
Input [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
(60) Filter [codegen id : 14]
Input [8]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8]
Condition : (isnotnull(c_customer_sk#1) AND isnotnull(c_customer_id#2))
(61) ReusedExchange [Reuses operator id: 44]
Output [4]: [ws_sold_date_sk#32, ws_bill_customer_sk#33, ws_ext_discount_amt#34, ws_ext_list_price#35]
(62) BroadcastHashJoin [codegen id : 14]
Left keys [1]: [c_customer_sk#1]
Right keys [1]: [ws_bill_customer_sk#33]
Join condition: None
(63) Project [codegen id : 14]
Output [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_sold_date_sk#32, ws_ext_discount_amt#34, ws_ext_list_price#35]
Input [12]: [c_customer_sk#1, c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_sold_date_sk#32, ws_bill_customer_sk#33, ws_ext_discount_amt#34, ws_ext_list_price#35]
(64) ReusedExchange [Reuses operator id: 29]
Output [2]: [d_date_sk#14, d_year#15]
(65) BroadcastHashJoin [codegen id : 14]
Left keys [1]: [ws_sold_date_sk#32]
Right keys [1]: [d_date_sk#14]
Join condition: None
(66) Project [codegen id : 14]
Output [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_ext_discount_amt#34, ws_ext_list_price#35, d_year#15]
Input [12]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_sold_date_sk#32, ws_ext_discount_amt#34, ws_ext_list_price#35, d_date_sk#14, d_year#15]
(67) HashAggregate [codegen id : 14]
Input [10]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, ws_ext_discount_amt#34, ws_ext_list_price#35, d_year#15]
Keys [8]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15]
Functions [1]: [partial_sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#35 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#34 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum#46]
Results [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15, sum#47]
(68) Exchange
Input [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15, sum#47]
Arguments: hashpartitioning(c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15, 5), true, [id=#48]
(69) HashAggregate [codegen id : 15]
Input [9]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15, sum#47]
Keys [8]: [c_customer_id#2, c_first_name#3, c_last_name#4, c_preferred_cust_flag#5, c_birth_country#6, c_login#7, c_email_address#8, d_year#15]
Functions [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#35 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#34 as decimal(8,2)))), DecimalType(8,2), true)))]
Aggregate Attributes [1]: [sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#35 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#34 as decimal(8,2)))), DecimalType(8,2), true)))#49]
Results [2]: [c_customer_id#2 AS customer_id#50, MakeDecimal(sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price#35 as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt#34 as decimal(8,2)))), DecimalType(8,2), true)))#49,18,2) AS year_total#51]
(70) BroadcastExchange
Input [2]: [customer_id#50, year_total#51]
Arguments: HashedRelationBroadcastMode(List(input[0, string, true]),false), [id=#52]
(71) BroadcastHashJoin [codegen id : 16]
Left keys [1]: [customer_id#21]
Right keys [1]: [customer_id#50]
Join condition: (CASE WHEN (year_total#44 > 0.00) THEN CheckOverflow((promote_precision(year_total#51) / promote_precision(year_total#44)), DecimalType(38,20), true) ELSE null END > CASE WHEN (year_total#22 > 0.00) THEN CheckOverflow((promote_precision(year_total#30) / promote_precision(year_total#22)), DecimalType(38,20), true) ELSE null END)
(72) Project [codegen id : 16]
Output [1]: [customer_preferred_cust_flag#29]
Input [7]: [customer_id#21, year_total#22, customer_preferred_cust_flag#29, year_total#30, year_total#44, customer_id#50, year_total#51]
(73) TakeOrderedAndProject
Input [1]: [customer_preferred_cust_flag#29]
Arguments: 100, [customer_preferred_cust_flag#29 ASC NULLS FIRST], [customer_preferred_cust_flag#29]

@ -0,0 +1,108 @@
TakeOrderedAndProject [customer_preferred_cust_flag]
  WholeStageCodegen (16)
    Project [customer_preferred_cust_flag]
      BroadcastHashJoin [customer_id,customer_id,year_total,year_total,year_total,year_total]
        Project [customer_id,customer_preferred_cust_flag,year_total,year_total,year_total]
          BroadcastHashJoin [customer_id,customer_id]
            Project [customer_id,customer_preferred_cust_flag,year_total,year_total]
              BroadcastHashJoin [customer_id,customer_id]
                Filter [year_total]
                  HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,sum] [customer_id,sum,sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt as decimal(8,2)))), DecimalType(8,2), true))),year_total]
                    InputAdapter
                      Exchange [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year] #1
                        WholeStageCodegen (3)
                          HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ss_ext_discount_amt,ss_ext_list_price] [sum,sum]
                            Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ss_ext_discount_amt,ss_ext_list_price]
                              BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
                                Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,ss_ext_discount_amt,ss_ext_list_price,ss_sold_date_sk]
                                  BroadcastHashJoin [c_customer_sk,ss_customer_sk]
                                    Filter [c_customer_id,c_customer_sk]
                                      ColumnarToRow
                                        InputAdapter
                                          Scan parquet default.customer [c_birth_country,c_customer_id,c_customer_sk,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag]
                                    InputAdapter
                                      BroadcastExchange #2
                                        WholeStageCodegen (1)
                                          Filter [ss_customer_sk,ss_sold_date_sk]
                                            ColumnarToRow
                                              InputAdapter
                                                Scan parquet default.store_sales [ss_customer_sk,ss_ext_discount_amt,ss_ext_list_price,ss_sold_date_sk]
                                InputAdapter
                                  BroadcastExchange #3
                                    WholeStageCodegen (2)
                                      Filter [d_date_sk,d_year]
                                        ColumnarToRow
                                          InputAdapter
                                            Scan parquet default.date_dim [d_date_sk,d_year]
                InputAdapter
                  BroadcastExchange #4
                    WholeStageCodegen (7)
                      HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,sum] [customer_id,customer_preferred_cust_flag,sum,sum(UnscaledValue(CheckOverflow((promote_precision(cast(ss_ext_list_price as decimal(8,2))) - promote_precision(cast(ss_ext_discount_amt as decimal(8,2)))), DecimalType(8,2), true))),year_total]
                        InputAdapter
                          Exchange [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year] #5
                            WholeStageCodegen (6)
                              HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ss_ext_discount_amt,ss_ext_list_price] [sum,sum]
                                Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ss_ext_discount_amt,ss_ext_list_price]
                                  BroadcastHashJoin [d_date_sk,ss_sold_date_sk]
                                    Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,ss_ext_discount_amt,ss_ext_list_price,ss_sold_date_sk]
                                      BroadcastHashJoin [c_customer_sk,ss_customer_sk]
                                        Filter [c_customer_id,c_customer_sk]
                                          ColumnarToRow
                                            InputAdapter
                                              Scan parquet default.customer [c_birth_country,c_customer_id,c_customer_sk,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag]
                                        InputAdapter
                                          ReusedExchange [ss_customer_sk,ss_ext_discount_amt,ss_ext_list_price,ss_sold_date_sk] #2
                                    InputAdapter
                                      BroadcastExchange #6
                                        WholeStageCodegen (5)
                                          Filter [d_date_sk,d_year]
                                            ColumnarToRow
                                              InputAdapter
                                                Scan parquet default.date_dim [d_date_sk,d_year]
            InputAdapter
              BroadcastExchange #7
                WholeStageCodegen (11)
                  Project [customer_id,year_total]
                    Filter [year_total]
                      HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,sum] [customer_id,sum,sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt as decimal(8,2)))), DecimalType(8,2), true))),year_total]
                        InputAdapter
                          Exchange [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year] #8
                            WholeStageCodegen (10)
                              HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ws_ext_discount_amt,ws_ext_list_price] [sum,sum]
                                Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ws_ext_discount_amt,ws_ext_list_price]
                                  BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
                                    Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,ws_ext_discount_amt,ws_ext_list_price,ws_sold_date_sk]
                                      BroadcastHashJoin [c_customer_sk,ws_bill_customer_sk]
                                        Filter [c_customer_id,c_customer_sk]
                                          ColumnarToRow
                                            InputAdapter
                                              Scan parquet default.customer [c_birth_country,c_customer_id,c_customer_sk,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag]
                                        InputAdapter
                                          BroadcastExchange #9
                                            WholeStageCodegen (8)
                                              Filter [ws_bill_customer_sk,ws_sold_date_sk]
                                                ColumnarToRow
                                                  InputAdapter
                                                    Scan parquet default.web_sales [ws_bill_customer_sk,ws_ext_discount_amt,ws_ext_list_price,ws_sold_date_sk]
                                    InputAdapter
                                      ReusedExchange [d_date_sk,d_year] #3
        InputAdapter
          BroadcastExchange #10
            WholeStageCodegen (15)
              HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,sum] [customer_id,sum,sum(UnscaledValue(CheckOverflow((promote_precision(cast(ws_ext_list_price as decimal(8,2))) - promote_precision(cast(ws_ext_discount_amt as decimal(8,2)))), DecimalType(8,2), true))),year_total]
                InputAdapter
                  Exchange [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year] #11
                    WholeStageCodegen (14)
                      HashAggregate [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ws_ext_discount_amt,ws_ext_list_price] [sum,sum]
                        Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,d_year,ws_ext_discount_amt,ws_ext_list_price]
                          BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
                            Project [c_birth_country,c_customer_id,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag,ws_ext_discount_amt,ws_ext_list_price,ws_sold_date_sk]
                              BroadcastHashJoin [c_customer_sk,ws_bill_customer_sk]
                                Filter [c_customer_id,c_customer_sk]
                                  ColumnarToRow
                                    InputAdapter
                                      Scan parquet default.customer [c_birth_country,c_customer_id,c_customer_sk,c_email_address,c_first_name,c_last_name,c_login,c_preferred_cust_flag]
                                InputAdapter
                                  ReusedExchange [ws_bill_customer_sk,ws_ext_discount_amt,ws_ext_list_price,ws_sold_date_sk] #9
                            InputAdapter
                              ReusedExchange [d_date_sk,d_year] #6

@ -0,0 +1,152 @@
== Physical Plan ==
TakeOrderedAndProject (27)
+- * Project (26)
   +- Window (25)
      +- * Sort (24)
         +- Exchange (23)
            +- * HashAggregate (22)
               +- Exchange (21)
                  +- * HashAggregate (20)
                     +- * Project (19)
                        +- * SortMergeJoin Inner (18)
                           :- * Sort (12)
                           :  +- Exchange (11)
                           :     +- * Project (10)
                           :        +- * BroadcastHashJoin Inner BuildRight (9)
                           :           :- * Filter (3)
                           :           :  +- * ColumnarToRow (2)
                           :           :     +- Scan parquet default.web_sales (1)
                           :           +- BroadcastExchange (8)
                           :              +- * Project (7)
                           :                 +- * Filter (6)
                           :                    +- * ColumnarToRow (5)
                           :                       +- Scan parquet default.date_dim (4)
                           +- * Sort (17)
                              +- Exchange (16)
                                 +- * Filter (15)
                                    +- * ColumnarToRow (14)
                                       +- Scan parquet default.item (13)
(1) Scan parquet default.web_sales
Output [3]: [ws_sold_date_sk#1, ws_item_sk#2, ws_ext_sales_price#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/web_sales]
PushedFilters: [IsNotNull(ws_item_sk), IsNotNull(ws_sold_date_sk)]
ReadSchema: struct<ws_sold_date_sk:int,ws_item_sk:int,ws_ext_sales_price:decimal(7,2)>
(2) ColumnarToRow [codegen id : 2]
Input [3]: [ws_sold_date_sk#1, ws_item_sk#2, ws_ext_sales_price#3]
(3) Filter [codegen id : 2]
Input [3]: [ws_sold_date_sk#1, ws_item_sk#2, ws_ext_sales_price#3]
Condition : (isnotnull(ws_item_sk#2) AND isnotnull(ws_sold_date_sk#1))
(4) Scan parquet default.date_dim
Output [2]: [d_date_sk#4, d_date#5]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/date_dim]
PushedFilters: [IsNotNull(d_date), GreaterThanOrEqual(d_date,1999-02-22), LessThanOrEqual(d_date,1999-03-24), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_date:date>
(5) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#4, d_date#5]
(6) Filter [codegen id : 1]
Input [2]: [d_date_sk#4, d_date#5]
Condition : (((isnotnull(d_date#5) AND (d_date#5 >= 10644)) AND (d_date#5 <= 10674)) AND isnotnull(d_date_sk#4))
(7) Project [codegen id : 1]
Output [1]: [d_date_sk#4]
Input [2]: [d_date_sk#4, d_date#5]
(8) BroadcastExchange
Input [1]: [d_date_sk#4]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#6]
(9) BroadcastHashJoin [codegen id : 2]
Left keys [1]: [ws_sold_date_sk#1]
Right keys [1]: [d_date_sk#4]
Join condition: None
(10) Project [codegen id : 2]
Output [2]: [ws_item_sk#2, ws_ext_sales_price#3]
Input [4]: [ws_sold_date_sk#1, ws_item_sk#2, ws_ext_sales_price#3, d_date_sk#4]
(11) Exchange
Input [2]: [ws_item_sk#2, ws_ext_sales_price#3]
Arguments: hashpartitioning(ws_item_sk#2, 5), true, [id=#7]
(12) Sort [codegen id : 3]
Input [2]: [ws_item_sk#2, ws_ext_sales_price#3]
Arguments: [ws_item_sk#2 ASC NULLS FIRST], false, 0
(13) Scan parquet default.item
Output [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilityWithStatsSuite/item]
PushedFilters: [In(i_category, [Sports,Books,Home]), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_id:string,i_item_desc:string,i_current_price:decimal(7,2),i_class:string,i_category:string>
(14) ColumnarToRow [codegen id : 4]
Input [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
(15) Filter [codegen id : 4]
Input [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Condition : (i_category#13 IN (Sports,Books,Home) AND isnotnull(i_item_sk#8))
(16) Exchange
Input [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Arguments: hashpartitioning(i_item_sk#8, 5), true, [id=#14]
(17) Sort [codegen id : 5]
Input [6]: [i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Arguments: [i_item_sk#8 ASC NULLS FIRST], false, 0
(18) SortMergeJoin [codegen id : 6]
Left keys [1]: [ws_item_sk#2]
Right keys [1]: [i_item_sk#8]
Join condition: None
(19) Project [codegen id : 6]
Output [6]: [ws_ext_sales_price#3, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Input [8]: [ws_item_sk#2, ws_ext_sales_price#3, i_item_sk#8, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
(20) HashAggregate [codegen id : 6]
Input [6]: [ws_ext_sales_price#3, i_item_id#9, i_item_desc#10, i_current_price#11, i_class#12, i_category#13]
Keys [5]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11]
Functions [1]: [partial_sum(UnscaledValue(ws_ext_sales_price#3))]
Aggregate Attributes [1]: [sum#15]
Results [6]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11, sum#16]
(21) Exchange
Input [6]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11, sum#16]
Arguments: hashpartitioning(i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11, 5), true, [id=#17]
(22) HashAggregate [codegen id : 7]
Input [6]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11, sum#16]
Keys [5]: [i_item_id#9, i_item_desc#10, i_category#13, i_class#12, i_current_price#11]
Functions [1]: [sum(UnscaledValue(ws_ext_sales_price#3))]
Aggregate Attributes [1]: [sum(UnscaledValue(ws_ext_sales_price#3))#18]
Results [8]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#3))#18,17,2) AS itemrevenue#19, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#3))#18,17,2) AS _w0#20, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#3))#18,17,2) AS _w1#21, i_item_id#9]
(23) Exchange
Input [8]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, _w0#20, _w1#21, i_item_id#9]
Arguments: hashpartitioning(i_class#12, 5), true, [id=#22]
(24) Sort [codegen id : 8]
Input [8]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, _w0#20, _w1#21, i_item_id#9]
Arguments: [i_class#12 ASC NULLS FIRST], false, 0
(25) Window
Input [8]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, _w0#20, _w1#21, i_item_id#9]
Arguments: [sum(_w1#21) windowspecdefinition(i_class#12, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS _we0#23], [i_class#12]
(26) Project [codegen id : 9]
Output [7]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, CheckOverflow((promote_precision(cast(CheckOverflow((promote_precision(_w0#20) * 100.00), DecimalType(21,2), true) as decimal(27,2))) / promote_precision(_we0#23)), DecimalType(38,17), true) AS revenueratio#24, i_item_id#9]
Input [9]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, _w0#20, _w1#21, i_item_id#9, _we0#23]
(27) TakeOrderedAndProject
Input [7]: [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, revenueratio#24, i_item_id#9]
Arguments: 100, [i_category#13 ASC NULLS FIRST, i_class#12 ASC NULLS FIRST, i_item_id#9 ASC NULLS FIRST, i_item_desc#10 ASC NULLS FIRST, revenueratio#24 ASC NULLS FIRST], [i_item_desc#10, i_category#13, i_class#12, i_current_price#11, itemrevenue#19, revenueratio#24]

@ -0,0 +1,47 @@
TakeOrderedAndProject [i_category,i_class,i_current_price,i_item_desc,i_item_id,itemrevenue,revenueratio]
  WholeStageCodegen (9)
    Project [_w0,_we0,i_category,i_class,i_current_price,i_item_desc,i_item_id,itemrevenue]
      InputAdapter
        Window [_w1,i_class]
          WholeStageCodegen (8)
            Sort [i_class]
              InputAdapter
                Exchange [i_class] #1
                  WholeStageCodegen (7)
                    HashAggregate [i_category,i_class,i_current_price,i_item_desc,i_item_id,sum] [_w0,_w1,itemrevenue,sum,sum(UnscaledValue(ws_ext_sales_price))]
                      InputAdapter
                        Exchange [i_category,i_class,i_current_price,i_item_desc,i_item_id] #2
                          WholeStageCodegen (6)
                            HashAggregate [i_category,i_class,i_current_price,i_item_desc,i_item_id,ws_ext_sales_price] [sum,sum]
                              Project [i_category,i_class,i_current_price,i_item_desc,i_item_id,ws_ext_sales_price]
                                SortMergeJoin [i_item_sk,ws_item_sk]
                                  InputAdapter
                                    WholeStageCodegen (3)
                                      Sort [ws_item_sk]
                                        InputAdapter
                                          Exchange [ws_item_sk] #3
                                            WholeStageCodegen (2)
                                              Project [ws_ext_sales_price,ws_item_sk]
                                                BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
                                                  Filter [ws_item_sk,ws_sold_date_sk]
                                                    ColumnarToRow
                                                      InputAdapter
                                                        Scan parquet default.web_sales [ws_ext_sales_price,ws_item_sk,ws_sold_date_sk]
                                                  InputAdapter
                                                    BroadcastExchange #4
                                                      WholeStageCodegen (1)
                                                        Project [d_date_sk]
                                                          Filter [d_date,d_date_sk]
                                                            ColumnarToRow
                                                              InputAdapter
                                                                Scan parquet default.date_dim [d_date,d_date_sk]
                                  InputAdapter
                                    WholeStageCodegen (5)
                                      Sort [i_item_sk]
                                        InputAdapter
                                          Exchange [i_item_sk] #5
                                            WholeStageCodegen (4)
                                              Filter [i_category,i_item_sk]
                                                ColumnarToRow
                                                  InputAdapter
                                                    Scan parquet default.item [i_category,i_class,i_current_price,i_item_desc,i_item_id,i_item_sk]

@ -0,0 +1,137 @@
== Physical Plan ==
TakeOrderedAndProject (24)
+- * Project (23)
   +- Window (22)
      +- * Sort (21)
         +- Exchange (20)
            +- * HashAggregate (19)
               +- Exchange (18)
                  +- * HashAggregate (17)
                     +- * Project (16)
                        +- * BroadcastHashJoin Inner BuildRight (15)
                           :- * Project (9)
                           :  +- * BroadcastHashJoin Inner BuildRight (8)
                           :     :- * Filter (3)
                           :     :  +- * ColumnarToRow (2)
                           :     :     +- Scan parquet default.web_sales (1)
                           :     +- BroadcastExchange (7)
                           :        +- * Filter (6)
                           :           +- * ColumnarToRow (5)
                           :              +- Scan parquet default.item (4)
                           +- BroadcastExchange (14)
                              +- * Project (13)
                                 +- * Filter (12)
                                    +- * ColumnarToRow (11)
                                       +- Scan parquet default.date_dim (10)
(1) Scan parquet default.web_sales
Output [3]: [ws_sold_date_sk#1, ws_item_sk#2, ws_ext_sales_price#3]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/web_sales]
PushedFilters: [IsNotNull(ws_item_sk), IsNotNull(ws_sold_date_sk)]
ReadSchema: struct<ws_sold_date_sk:int,ws_item_sk:int,ws_ext_sales_price:decimal(7,2)>
(2) ColumnarToRow [codegen id : 3]
Input [3]: [ws_sold_date_sk#1, ws_item_sk#2, ws_ext_sales_price#3]
(3) Filter [codegen id : 3]
Input [3]: [ws_sold_date_sk#1, ws_item_sk#2, ws_ext_sales_price#3]
Condition : (isnotnull(ws_item_sk#2) AND isnotnull(ws_sold_date_sk#1))
(4) Scan parquet default.item
Output [6]: [i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/item]
PushedFilters: [In(i_category, [Sports,Books,Home]), IsNotNull(i_item_sk)]
ReadSchema: struct<i_item_sk:int,i_item_id:string,i_item_desc:string,i_current_price:decimal(7,2),i_class:string,i_category:string>
(5) ColumnarToRow [codegen id : 1]
Input [6]: [i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
(6) Filter [codegen id : 1]
Input [6]: [i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Condition : (i_category#9 IN (Sports,Books,Home) AND isnotnull(i_item_sk#4))
(7) BroadcastExchange
Input [6]: [i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)),false), [id=#10]
(8) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ws_item_sk#2]
Right keys [1]: [i_item_sk#4]
Join condition: None
(9) Project [codegen id : 3]
Output [7]: [ws_sold_date_sk#1, ws_ext_sales_price#3, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Input [9]: [ws_sold_date_sk#1, ws_item_sk#2, ws_ext_sales_price#3, i_item_sk#4, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
(10) Scan parquet default.date_dim
Output [2]: [d_date_sk#11, d_date#12]
Batched: true
Location: InMemoryFileIndex [file:/Users/yi.wu/IdeaProjects/spark/sql/core/spark-warehouse/org.apache.spark.sql.TPCDSV1_4_PlanStabilitySuite/date_dim]
PushedFilters: [IsNotNull(d_date), GreaterThanOrEqual(d_date,1999-02-22), LessThanOrEqual(d_date,1999-03-24), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_date:date>
(11) ColumnarToRow [codegen id : 2]
Input [2]: [d_date_sk#11, d_date#12]
(12) Filter [codegen id : 2]
Input [2]: [d_date_sk#11, d_date#12]
Condition : (((isnotnull(d_date#12) AND (d_date#12 >= 10644)) AND (d_date#12 <= 10674)) AND isnotnull(d_date_sk#11))
(13) Project [codegen id : 2]
Output [1]: [d_date_sk#11]
Input [2]: [d_date_sk#11, d_date#12]
(14) BroadcastExchange
Input [1]: [d_date_sk#11]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [id=#13]
(15) BroadcastHashJoin [codegen id : 3]
Left keys [1]: [ws_sold_date_sk#1]
Right keys [1]: [d_date_sk#11]
Join condition: None
(16) Project [codegen id : 3]
Output [6]: [ws_ext_sales_price#3, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Input [8]: [ws_sold_date_sk#1, ws_ext_sales_price#3, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9, d_date_sk#11]
(17) HashAggregate [codegen id : 3]
Input [6]: [ws_ext_sales_price#3, i_item_id#5, i_item_desc#6, i_current_price#7, i_class#8, i_category#9]
Keys [5]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7]
Functions [1]: [partial_sum(UnscaledValue(ws_ext_sales_price#3))]
Aggregate Attributes [1]: [sum#14]
Results [6]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7, sum#15]
(18) Exchange
Input [6]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7, sum#15]
Arguments: hashpartitioning(i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7, 5), true, [id=#16]
(19) HashAggregate [codegen id : 4]
Input [6]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7, sum#15]
Keys [5]: [i_item_id#5, i_item_desc#6, i_category#9, i_class#8, i_current_price#7]
Functions [1]: [sum(UnscaledValue(ws_ext_sales_price#3))]
Aggregate Attributes [1]: [sum(UnscaledValue(ws_ext_sales_price#3))#17]
Results [8]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#3))#17,17,2) AS itemrevenue#18, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#3))#17,17,2) AS _w0#19, MakeDecimal(sum(UnscaledValue(ws_ext_sales_price#3))#17,17,2) AS _w1#20, i_item_id#5]
(20) Exchange
Input [8]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, _w0#19, _w1#20, i_item_id#5]
Arguments: hashpartitioning(i_class#8, 5), true, [id=#21]
(21) Sort [codegen id : 5]
Input [8]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, _w0#19, _w1#20, i_item_id#5]
Arguments: [i_class#8 ASC NULLS FIRST], false, 0
(22) Window
Input [8]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, _w0#19, _w1#20, i_item_id#5]
Arguments: [sum(_w1#20) windowspecdefinition(i_class#8, specifiedwindowframe(RowFrame, unboundedpreceding$(), unboundedfollowing$())) AS _we0#22], [i_class#8]
(23) Project [codegen id : 6]
Output [7]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, CheckOverflow((promote_precision(cast(CheckOverflow((promote_precision(_w0#19) * 100.00), DecimalType(21,2), true) as decimal(27,2))) / promote_precision(_we0#22)), DecimalType(38,17), true) AS revenueratio#23, i_item_id#5]
Input [9]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, _w0#19, _w1#20, i_item_id#5, _we0#22]
(24) TakeOrderedAndProject
Input [7]: [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, revenueratio#23, i_item_id#5]
Arguments: 100, [i_category#9 ASC NULLS FIRST, i_class#8 ASC NULLS FIRST, i_item_id#5 ASC NULLS FIRST, i_item_desc#6 ASC NULLS FIRST, revenueratio#23 ASC NULLS FIRST], [i_item_desc#6, i_category#9, i_class#8, i_current_price#7, itemrevenue#18, revenueratio#23]

@ -0,0 +1,38 @@
TakeOrderedAndProject [i_category,i_class,i_current_price,i_item_desc,i_item_id,itemrevenue,revenueratio]
  WholeStageCodegen (6)
    Project [_w0,_we0,i_category,i_class,i_current_price,i_item_desc,i_item_id,itemrevenue]
      InputAdapter
        Window [_w1,i_class]
          WholeStageCodegen (5)
            Sort [i_class]
              InputAdapter
                Exchange [i_class] #1
                  WholeStageCodegen (4)
                    HashAggregate [i_category,i_class,i_current_price,i_item_desc,i_item_id,sum] [_w0,_w1,itemrevenue,sum,sum(UnscaledValue(ws_ext_sales_price))]
                      InputAdapter
                        Exchange [i_category,i_class,i_current_price,i_item_desc,i_item_id] #2
                          WholeStageCodegen (3)
                            HashAggregate [i_category,i_class,i_current_price,i_item_desc,i_item_id,ws_ext_sales_price] [sum,sum]
                              Project [i_category,i_class,i_current_price,i_item_desc,i_item_id,ws_ext_sales_price]
                                BroadcastHashJoin [d_date_sk,ws_sold_date_sk]
                                  Project [i_category,i_class,i_current_price,i_item_desc,i_item_id,ws_ext_sales_price,ws_sold_date_sk]
                                    BroadcastHashJoin [i_item_sk,ws_item_sk]
                                      Filter [ws_item_sk,ws_sold_date_sk]
                                        ColumnarToRow
                                          InputAdapter
                                            Scan parquet default.web_sales [ws_ext_sales_price,ws_item_sk,ws_sold_date_sk]
                                      InputAdapter
                                        BroadcastExchange #3
                                          WholeStageCodegen (1)
                                            Filter [i_category,i_item_sk]
                                              ColumnarToRow
                                                InputAdapter
                                                  Scan parquet default.item [i_category,i_class,i_current_price,i_item_desc,i_item_id,i_item_sk]
                                  InputAdapter
                                    BroadcastExchange #4
                                      WholeStageCodegen (2)
                                        Project [d_date_sk]
                                          Filter [d_date,d_date_sk]
                                            ColumnarToRow
                                              InputAdapter
                                                Scan parquet default.date_dim [d_date,d_date_sk]
