spark-instrumented-optimizer

History

Marco Gaido 88e0c7bbd5 [SPARK-24341][SQL] Support only IN subqueries with the same number of items per row

## What changes were proposed in this pull request?

Using struct types in subqueries with the `IN` clause can generate invalid plans in `RewritePredicateSubquery`. Indeed, we are not handling clearly the cases when the outer value is a struct or the output of the inner subquery is a struct.

The PR aims to make Spark's behavior the same as the one of the other RDBMS - namely Oracle and Postgres behavior were checked. So we consider valid only queries having the same number of fields in the outer value and in the subquery. This means that:

- `(a, b) IN (select c, d from ...)` is a valid query;
- `(a, b) IN (select (c, d) from ...)` throws an AnalysisException, as in the subquery we have only one field of type struct while in the outer value we have 2 fields;
- `a IN (select (c, d) from ...)` - where `a` is a struct - is a valid query.

## How was this patch tested?

Added UT

Closes #21403 from mgaido91/SPARK-24313.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>

2018-08-07 15:43:41 +08:00
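The field-count rule described in the commit message above can be illustrated with a small standalone sketch. This is not Spark code and does not reflect how `RewritePredicateSubquery` or the analyzer is implemented; the names `Struct`, `AnalysisError`, and `check_in_subquery` are hypothetical and exist only to model the three cases listed in the message, under the assumption that a struct always counts as a single field.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple, Union


@dataclass(frozen=True)
class Struct:
    """Hypothetical stand-in for a struct-typed value: one field, however many members."""
    members: Tuple[str, ...]


# A field is either a plain column name or a single struct-typed value.
Field = Union[str, Struct]


class AnalysisError(Exception):
    """Hypothetical stand-in for the AnalysisException mentioned in the commit message."""


def check_in_subquery(outer: Sequence[Field], subquery_output: Sequence[Field]) -> None:
    """Accept the IN predicate only when both sides have the same number of fields.

    A Struct counts as exactly one field, so (a, b) does not match a
    subquery that returns a single struct (c, d).
    """
    if len(outer) != len(subquery_output):
        raise AnalysisError(
            f"outer value has {len(outer)} field(s) but the subquery "
            f"returns {len(subquery_output)}"
        )


# (a, b) IN (select c, d from ...): two fields on each side, accepted.
check_in_subquery(["a", "b"], ["c", "d"])

# a IN (select (c, d) from ...), where a is a struct: one field on each side, accepted.
check_in_subquery([Struct(("x", "y"))], [Struct(("c", "d"))])

# (a, b) IN (select (c, d) from ...): two fields versus one struct field, rejected.
try:
    check_in_subquery(["a", "b"], [Struct(("c", "d"))])
except AnalysisError as err:
    rejected_reason = str(err)
```

The sketch compares field counts only; checking that the corresponding fields have compatible types is a separate concern that it does not model.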
..
benchmarks	[SPARK-24549][SQL] Support Decimal type push down to the parquet data sources	2018-07-16 15:44:51 +08:00
src	[SPARK-24341][SQL] Support only IN subqueries with the same number of items per row	2018-08-07 15:43:41 +08:00
pom.xml	[SPARK-25019][BUILD] Fix orc dependency to use the same exclusion rules	2018-08-06 12:00:39 -07:00