== Physical Plan ==
* HashAggregate (45)
+- Exchange (44)
   +- * HashAggregate (43)
      +- * HashAggregate (42)
         +- * HashAggregate (41)
            +- * Project (40)
               +- * BroadcastHashJoin Inner BuildRight (39)
                  :- * Project (33)
                  :  +- * BroadcastHashJoin Inner BuildRight (32)
                  :     :- * Project (26)
                  :     :  +- * BroadcastHashJoin Inner BuildRight (25)
                  :     :     :- * SortMergeJoin LeftAnti (19)
                  :     :     :  :- * Project (13)
                  :     :     :  :  +- * SortMergeJoin LeftSemi (12)
                  :     :     :  :     :- * Sort (6)
                  :     :     :  :     :  +- Exchange (5)
                  :     :     :  :     :     +- * Project (4)
                  :     :     :  :     :        +- * Filter (3)
                  :     :     :  :     :           +- * ColumnarToRow (2)
                  :     :     :  :     :              +- Scan parquet spark_catalog.default.web_sales (1)
                  :     :     :  :     +- * Sort (11)
                  :     :     :  :        +- Exchange (10)
                  :     :     :  :           +- * Project (9)
                  :     :     :  :              +- * ColumnarToRow (8)
                  :     :     :  :                 +- Scan parquet spark_catalog.default.web_sales (7)
                  :     :     :  +- * Sort (18)
                  :     :     :     +- Exchange (17)
                  :     :     :        +- * Project (16)
                  :     :     :           +- * ColumnarToRow (15)
                  :     :     :              +- Scan parquet spark_catalog.default.web_returns (14)
                  :     :     +- BroadcastExchange (24)
                  :     :        +- * Project (23)
                  :     :           +- * Filter (22)
                  :     :              +- * ColumnarToRow (21)
                  :     :                 +- Scan parquet spark_catalog.default.customer_address (20)
                  :     +- BroadcastExchange (31)
                  :        +- * Project (30)
                  :           +- * Filter (29)
                  :              +- * ColumnarToRow (28)
                  :                 +- Scan parquet spark_catalog.default.web_site (27)
                  +- BroadcastExchange (38)
                     +- * Project (37)
                        +- * Filter (36)
                           +- * ColumnarToRow (35)
                              +- Scan parquet spark_catalog.default.date_dim (34)


(1) Scan parquet spark_catalog.default.web_sales
Output [8]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_warehouse_sk#4, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7, ws_sold_date_sk#8]
Batched: true
Location [not included in comparison]/{warehouse_dir}/web_sales]
PushedFilters: [IsNotNull(ws_ship_date_sk), IsNotNull(ws_ship_addr_sk), IsNotNull(ws_web_site_sk)]
ReadSchema: struct<ws_ship_date_sk:int,ws_ship_addr_sk:int,ws_web_site_sk:int,ws_warehouse_sk:int,ws_order_number:int,ws_ext_ship_cost:decimal(7,2),ws_net_profit:decimal(7,2)>

(2) ColumnarToRow [codegen id : 1]
Input [8]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_warehouse_sk#4, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7, ws_sold_date_sk#8]

(3) Filter [codegen id : 1]
Input [8]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_warehouse_sk#4, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7, ws_sold_date_sk#8]
Condition : (((((isnotnull(ws_ship_date_sk#1) AND isnotnull(ws_ship_addr_sk#2)) AND isnotnull(ws_web_site_sk#3)) AND might_contain(Subquery scalar-subquery#9, [id=#1], xxhash64(ws_ship_addr_sk#2, 42))) AND might_contain(Subquery scalar-subquery#10, [id=#2], xxhash64(ws_web_site_sk#3, 42))) AND might_contain(Subquery scalar-subquery#11, [id=#3], xxhash64(ws_ship_date_sk#1, 42)))

(4) Project [codegen id : 1]
Output [7]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_warehouse_sk#4, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7]
Input [8]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_warehouse_sk#4, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7, ws_sold_date_sk#8]

(5) Exchange
Input [7]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_warehouse_sk#4, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7]
Arguments: hashpartitioning(ws_order_number#5, 5), ENSURE_REQUIREMENTS, [plan_id=4]

(6) Sort [codegen id : 2]
Input [7]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_warehouse_sk#4, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7]
Arguments: [ws_order_number#5 ASC NULLS FIRST], false, 0

(7) Scan parquet spark_catalog.default.web_sales
Output [3]: [ws_warehouse_sk#12, ws_order_number#13, ws_sold_date_sk#14]
Batched: true
Location [not included in comparison]/{warehouse_dir}/web_sales]
ReadSchema: struct<ws_warehouse_sk:int,ws_order_number:int>

(8) ColumnarToRow [codegen id : 3]
Input [3]: [ws_warehouse_sk#12, ws_order_number#13, ws_sold_date_sk#14]

(9) Project [codegen id : 3]
Output [2]: [ws_warehouse_sk#12, ws_order_number#13]
Input [3]: [ws_warehouse_sk#12, ws_order_number#13, ws_sold_date_sk#14]

(10) Exchange
Input [2]: [ws_warehouse_sk#12, ws_order_number#13]
Arguments: hashpartitioning(ws_order_number#13, 5), ENSURE_REQUIREMENTS, [plan_id=5]

(11) Sort [codegen id : 4]
Input [2]: [ws_warehouse_sk#12, ws_order_number#13]
Arguments: [ws_order_number#13 ASC NULLS FIRST], false, 0

(12) SortMergeJoin [codegen id : 5]
Left keys [1]: [ws_order_number#5]
Right keys [1]: [ws_order_number#13]
Join type: LeftSemi
Join condition: NOT (ws_warehouse_sk#4 = ws_warehouse_sk#12)

(13) Project [codegen id : 5]
Output [6]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7]
Input [7]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_warehouse_sk#4, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7]

(14) Scan parquet spark_catalog.default.web_returns
Output [2]: [wr_order_number#15, wr_returned_date_sk#16]
Batched: true
Location [not included in comparison]/{warehouse_dir}/web_returns]
ReadSchema: struct<wr_order_number:int>

(15) ColumnarToRow [codegen id : 6]
Input [2]: [wr_order_number#15, wr_returned_date_sk#16]

(16) Project [codegen id : 6]
Output [1]: [wr_order_number#15]
Input [2]: [wr_order_number#15, wr_returned_date_sk#16]

(17) Exchange
Input [1]: [wr_order_number#15]
Arguments: hashpartitioning(wr_order_number#15, 5), ENSURE_REQUIREMENTS, [plan_id=6]

(18) Sort [codegen id : 7]
Input [1]: [wr_order_number#15]
Arguments: [wr_order_number#15 ASC NULLS FIRST], false, 0

(19) SortMergeJoin [codegen id : 11]
Left keys [1]: [ws_order_number#5]
Right keys [1]: [wr_order_number#15]
Join type: LeftAnti
Join condition: None

(20) Scan parquet spark_catalog.default.customer_address
Output [2]: [ca_address_sk#17, ca_state#18]
Batched: true
Location [not included in comparison]/{warehouse_dir}/customer_address]
PushedFilters: [IsNotNull(ca_state), EqualTo(ca_state,IL), IsNotNull(ca_address_sk)]
ReadSchema: struct<ca_address_sk:int,ca_state:string>

(21) ColumnarToRow [codegen id : 8]
Input [2]: [ca_address_sk#17, ca_state#18]

(22) Filter [codegen id : 8]
Input [2]: [ca_address_sk#17, ca_state#18]
Condition : ((isnotnull(ca_state#18) AND (ca_state#18 = IL)) AND isnotnull(ca_address_sk#17))

(23) Project [codegen id : 8]
Output [1]: [ca_address_sk#17]
Input [2]: [ca_address_sk#17, ca_state#18]

(24) BroadcastExchange
Input [1]: [ca_address_sk#17]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=7]

(25) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ws_ship_addr_sk#2]
Right keys [1]: [ca_address_sk#17]
Join type: Inner
Join condition: None

(26) Project [codegen id : 11]
Output [5]: [ws_ship_date_sk#1, ws_web_site_sk#3, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7]
Input [7]: [ws_ship_date_sk#1, ws_ship_addr_sk#2, ws_web_site_sk#3, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7, ca_address_sk#17]

(27) Scan parquet spark_catalog.default.web_site
Output [2]: [web_site_sk#19, web_company_name#20]
Batched: true
Location [not included in comparison]/{warehouse_dir}/web_site]
PushedFilters: [IsNotNull(web_company_name), EqualTo(web_company_name,pri                                               ), IsNotNull(web_site_sk)]
ReadSchema: struct<web_site_sk:int,web_company_name:string>

(28) ColumnarToRow [codegen id : 9]
Input [2]: [web_site_sk#19, web_company_name#20]

(29) Filter [codegen id : 9]
Input [2]: [web_site_sk#19, web_company_name#20]
Condition : ((isnotnull(web_company_name#20) AND (web_company_name#20 = pri                                               )) AND isnotnull(web_site_sk#19))

(30) Project [codegen id : 9]
Output [1]: [web_site_sk#19]
Input [2]: [web_site_sk#19, web_company_name#20]

(31) BroadcastExchange
Input [1]: [web_site_sk#19]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=8]

(32) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ws_web_site_sk#3]
Right keys [1]: [web_site_sk#19]
Join type: Inner
Join condition: None

(33) Project [codegen id : 11]
Output [4]: [ws_ship_date_sk#1, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7]
Input [6]: [ws_ship_date_sk#1, ws_web_site_sk#3, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7, web_site_sk#19]

(34) Scan parquet spark_catalog.default.date_dim
Output [2]: [d_date_sk#21, d_date#22]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_date), GreaterThanOrEqual(d_date,1999-02-01), LessThanOrEqual(d_date,1999-04-02), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_date:date>

(35) ColumnarToRow [codegen id : 10]
Input [2]: [d_date_sk#21, d_date#22]

(36) Filter [codegen id : 10]
Input [2]: [d_date_sk#21, d_date#22]
Condition : (((isnotnull(d_date#22) AND (d_date#22 >= 1999-02-01)) AND (d_date#22 <= 1999-04-02)) AND isnotnull(d_date_sk#21))

(37) Project [codegen id : 10]
Output [1]: [d_date_sk#21]
Input [2]: [d_date_sk#21, d_date#22]

(38) BroadcastExchange
Input [1]: [d_date_sk#21]
Arguments: HashedRelationBroadcastMode(List(cast(input[0, int, true] as bigint)),false), [plan_id=9]

(39) BroadcastHashJoin [codegen id : 11]
Left keys [1]: [ws_ship_date_sk#1]
Right keys [1]: [d_date_sk#21]
Join type: Inner
Join condition: None

(40) Project [codegen id : 11]
Output [3]: [ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7]
Input [5]: [ws_ship_date_sk#1, ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7, d_date_sk#21]

(41) HashAggregate [codegen id : 11]
Input [3]: [ws_order_number#5, ws_ext_ship_cost#6, ws_net_profit#7]
Keys [1]: [ws_order_number#5]
Functions [2]: [partial_sum(UnscaledValue(ws_ext_ship_cost#6)), partial_sum(UnscaledValue(ws_net_profit#7))]
Aggregate Attributes [2]: [sum(UnscaledValue(ws_ext_ship_cost#6))#23, sum(UnscaledValue(ws_net_profit#7))#24]
Results [3]: [ws_order_number#5, sum#25, sum#26]

(42) HashAggregate [codegen id : 11]
Input [3]: [ws_order_number#5, sum#25, sum#26]
Keys [1]: [ws_order_number#5]
Functions [2]: [merge_sum(UnscaledValue(ws_ext_ship_cost#6)), merge_sum(UnscaledValue(ws_net_profit#7))]
Aggregate Attributes [2]: [sum(UnscaledValue(ws_ext_ship_cost#6))#23, sum(UnscaledValue(ws_net_profit#7))#24]
Results [3]: [ws_order_number#5, sum#25, sum#26]

(43) HashAggregate [codegen id : 11]
Input [3]: [ws_order_number#5, sum#25, sum#26]
Keys: []
Functions [3]: [merge_sum(UnscaledValue(ws_ext_ship_cost#6)), merge_sum(UnscaledValue(ws_net_profit#7)), partial_count(distinct ws_order_number#5)]
Aggregate Attributes [3]: [sum(UnscaledValue(ws_ext_ship_cost#6))#23, sum(UnscaledValue(ws_net_profit#7))#24, count(ws_order_number#5)#27]
Results [3]: [sum#25, sum#26, count#28]

(44) Exchange
Input [3]: [sum#25, sum#26, count#28]
Arguments: SinglePartition, ENSURE_REQUIREMENTS, [plan_id=10]

(45) HashAggregate [codegen id : 12]
Input [3]: [sum#25, sum#26, count#28]
Keys: []
Functions [3]: [sum(UnscaledValue(ws_ext_ship_cost#6)), sum(UnscaledValue(ws_net_profit#7)), count(distinct ws_order_number#5)]
Aggregate Attributes [3]: [sum(UnscaledValue(ws_ext_ship_cost#6))#23, sum(UnscaledValue(ws_net_profit#7))#24, count(ws_order_number#5)#27]
Results [3]: [count(ws_order_number#5)#27 AS order count #29, MakeDecimal(sum(UnscaledValue(ws_ext_ship_cost#6))#23,17,2) AS total shipping cost #30, MakeDecimal(sum(UnscaledValue(ws_net_profit#7))#24,17,2) AS total net profit #31]

===== Subqueries =====

Subquery:1 Hosting operator id = 3 Hosting Expression = Subquery scalar-subquery#9, [id=#1]
ObjectHashAggregate (52)
+- Exchange (51)
   +- ObjectHashAggregate (50)
      +- * Project (49)
         +- * Filter (48)
            +- * ColumnarToRow (47)
               +- Scan parquet spark_catalog.default.customer_address (46)


(46) Scan parquet spark_catalog.default.customer_address
Output [2]: [ca_address_sk#17, ca_state#18]
Batched: true
Location [not included in comparison]/{warehouse_dir}/customer_address]
PushedFilters: [IsNotNull(ca_state), EqualTo(ca_state,IL), IsNotNull(ca_address_sk)]
ReadSchema: struct<ca_address_sk:int,ca_state:string>

(47) ColumnarToRow [codegen id : 1]
Input [2]: [ca_address_sk#17, ca_state#18]

(48) Filter [codegen id : 1]
Input [2]: [ca_address_sk#17, ca_state#18]
Condition : ((isnotnull(ca_state#18) AND (ca_state#18 = IL)) AND isnotnull(ca_address_sk#17))

(49) Project [codegen id : 1]
Output [1]: [ca_address_sk#17]
Input [2]: [ca_address_sk#17, ca_state#18]

(50) ObjectHashAggregate
Input [1]: [ca_address_sk#17]
Keys: []
Functions [1]: [partial_bloom_filter_agg(xxhash64(ca_address_sk#17, 42), 17961, 333176, 0, 0)]
Aggregate Attributes [1]: [buf#32]
Results [1]: [buf#33]

(51) Exchange
Input [1]: [buf#33]
Arguments: SinglePartition, ENSURE_REQUIREMENTS, [plan_id=11]

(52) ObjectHashAggregate
Input [1]: [buf#33]
Keys: []
Functions [1]: [bloom_filter_agg(xxhash64(ca_address_sk#17, 42), 17961, 333176, 0, 0)]
Aggregate Attributes [1]: [bloom_filter_agg(xxhash64(ca_address_sk#17, 42), 17961, 333176, 0, 0)#34]
Results [1]: [bloom_filter_agg(xxhash64(ca_address_sk#17, 42), 17961, 333176, 0, 0)#34 AS bloomFilter#35]

Subquery:2 Hosting operator id = 3 Hosting Expression = Subquery scalar-subquery#10, [id=#2]
ObjectHashAggregate (59)
+- Exchange (58)
   +- ObjectHashAggregate (57)
      +- * Project (56)
         +- * Filter (55)
            +- * ColumnarToRow (54)
               +- Scan parquet spark_catalog.default.web_site (53)


(53) Scan parquet spark_catalog.default.web_site
Output [2]: [web_site_sk#19, web_company_name#20]
Batched: true
Location [not included in comparison]/{warehouse_dir}/web_site]
PushedFilters: [IsNotNull(web_company_name), EqualTo(web_company_name,pri                                               ), IsNotNull(web_site_sk)]
ReadSchema: struct<web_site_sk:int,web_company_name:string>

(54) ColumnarToRow [codegen id : 1]
Input [2]: [web_site_sk#19, web_company_name#20]

(55) Filter [codegen id : 1]
Input [2]: [web_site_sk#19, web_company_name#20]
Condition : ((isnotnull(web_company_name#20) AND (web_company_name#20 = pri                                               )) AND isnotnull(web_site_sk#19))

(56) Project [codegen id : 1]
Output [1]: [web_site_sk#19]
Input [2]: [web_site_sk#19, web_company_name#20]

(57) ObjectHashAggregate
Input [1]: [web_site_sk#19]
Keys: []
Functions [1]: [partial_bloom_filter_agg(xxhash64(web_site_sk#19, 42), 4, 144, 0, 0)]
Aggregate Attributes [1]: [buf#36]
Results [1]: [buf#37]

(58) Exchange
Input [1]: [buf#37]
Arguments: SinglePartition, ENSURE_REQUIREMENTS, [plan_id=12]

(59) ObjectHashAggregate
Input [1]: [buf#37]
Keys: []
Functions [1]: [bloom_filter_agg(xxhash64(web_site_sk#19, 42), 4, 144, 0, 0)]
Aggregate Attributes [1]: [bloom_filter_agg(xxhash64(web_site_sk#19, 42), 4, 144, 0, 0)#38]
Results [1]: [bloom_filter_agg(xxhash64(web_site_sk#19, 42), 4, 144, 0, 0)#38 AS bloomFilter#39]

Subquery:3 Hosting operator id = 3 Hosting Expression = Subquery scalar-subquery#11, [id=#3]
ObjectHashAggregate (66)
+- Exchange (65)
   +- ObjectHashAggregate (64)
      +- * Project (63)
         +- * Filter (62)
            +- * ColumnarToRow (61)
               +- Scan parquet spark_catalog.default.date_dim (60)


(60) Scan parquet spark_catalog.default.date_dim
Output [2]: [d_date_sk#21, d_date#22]
Batched: true
Location [not included in comparison]/{warehouse_dir}/date_dim]
PushedFilters: [IsNotNull(d_date), GreaterThanOrEqual(d_date,1999-02-01), LessThanOrEqual(d_date,1999-04-02), IsNotNull(d_date_sk)]
ReadSchema: struct<d_date_sk:int,d_date:date>

(61) ColumnarToRow [codegen id : 1]
Input [2]: [d_date_sk#21, d_date#22]

(62) Filter [codegen id : 1]
Input [2]: [d_date_sk#21, d_date#22]
Condition : (((isnotnull(d_date#22) AND (d_date#22 >= 1999-02-01)) AND (d_date#22 <= 1999-04-02)) AND isnotnull(d_date_sk#21))

(63) Project [codegen id : 1]
Output [1]: [d_date_sk#21]
Input [2]: [d_date_sk#21, d_date#22]

(64) ObjectHashAggregate
Input [1]: [d_date_sk#21]
Keys: []
Functions [1]: [partial_bloom_filter_agg(xxhash64(d_date_sk#21, 42), 73049, 1141755, 0, 0)]
Aggregate Attributes [1]: [buf#40]
Results [1]: [buf#41]

(65) Exchange
Input [1]: [buf#41]
Arguments: SinglePartition, ENSURE_REQUIREMENTS, [plan_id=13]

(66) ObjectHashAggregate
Input [1]: [buf#41]
Keys: []
Functions [1]: [bloom_filter_agg(xxhash64(d_date_sk#21, 42), 73049, 1141755, 0, 0)]
Aggregate Attributes [1]: [bloom_filter_agg(xxhash64(d_date_sk#21, 42), 73049, 1141755, 0, 0)#42]
Results [1]: [bloom_filter_agg(xxhash64(d_date_sk#21, 42), 73049, 1141755, 0, 0)#42 AS bloomFilter#43]


