spark-instrumented-optimizer/sql/core
angerszhu 643cd876e4 [SPARK-32352][SQL] Partially push down support data filter if it mixed in partition filters
### What changes were proposed in this pull request?
We support partially push partition filters since SPARK-28169. We can also support partially push down data filters if it mixed in partition filters and data filters. For example:
```
spark.sql(
  s"""
     |CREATE TABLE t(i INT, p STRING)
     |USING parquet
     |PARTITIONED BY (p)""".stripMargin)

spark.range(0, 1000).selectExpr("id as col").createOrReplaceTempView("temp")
for (part <- Seq(1, 2, 3, 4)) {
  sql(s"""
         |INSERT OVERWRITE TABLE t PARTITION (p='$part')
         |SELECT col FROM temp""".stripMargin)
}

spark.sql("SELECT * FROM t WHERE  WHERE (p = '1' AND i = 1) OR (p = '2' and i = 2)").explain()
```

We can also push down ```i = 1 or i = 2 ```

### Why are the changes needed?
Extract more data filter to FileSourceScanExec

### Does this PR introduce _any_ user-facing change?
NO

### How was this patch tested?
Added UT

Closes #29406 from AngersZhuuuu/SPARK-32352.

Authored-by: angerszhu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2020-08-12 12:18:33 +00:00
..
benchmarks [SPARK-30648][SQL] Support filters pushdown in JSON datasource 2020-07-17 00:01:13 +09:00
src [SPARK-32352][SQL] Partially push down support data filter if it mixed in partition filters 2020-08-12 12:18:33 +00:00
v1.2/src [SPARK-25557][SQL] Nested column predicate pushdown for ORC 2020-08-07 08:07:41 -07:00
v2.3/src [SPARK-25557][SQL] Nested column predicate pushdown for ORC 2020-08-07 08:07:41 -07:00
pom.xml [SPARK-31336][SQL] Support Oracle Kerberos login in JDBC connector 2020-06-30 10:30:22 -07:00