spark-instrumented-optimizer

shivusondur 167fa0402d [SPARK-28390][SQL][PYTHON][TESTS] Convert and port 'pgSQL/select_having.sql' into UDF test base

## What changes were proposed in this pull request?

Changed the test according to the steps described in SPARK-27921.

<details>
<summary>Differences compared to select_having.sql</summary>
<p>

```diff
diff --git a/sql/core/src/test/resources/sql-tests/results/pgSQL/select_having.sql.out b/sql/core/src/test/resources/sql-tests/results/udf/pgSQL/udf-select_having.sql.out
index 02536eb..f731d11 100644
--- a/sql/core/src/test/resources/sql-tests/results/pgSQL/select_having.sql.out
+++ b/sql/core/src/test/resources/sql-tests/results/udf/pgSQL/udf-select_having.sql.out
@@ -91,54 +91,54 @@ struct<>
 -- !query 11
-SELECT b, c FROM test_having
-	GROUP BY b, c HAVING count(*) = 1 ORDER BY b, c
+SELECT udf(b), udf(c) FROM test_having
+	GROUP BY b, c HAVING udf(count(*)) = 1 ORDER BY udf(b), udf(c)
 -- !query 11 schema
-struct<b:int,c:string>
+struct<CAST(udf(cast(b as string)) AS INT):int,CAST(udf(cast(c as string)) AS STRING):string>
 -- !query 11 output
 1 XXXX
 3 bbbb
 -- !query 12
-SELECT b, c FROM test_having
-	GROUP BY b, c HAVING b = 3 ORDER BY b, c
+SELECT udf(b), udf(c) FROM test_having
+	GROUP BY b, c HAVING udf(b) = 3 ORDER BY udf(b), udf(c)
 -- !query 12 schema
-struct<b:int,c:string>
+struct<CAST(udf(cast(b as string)) AS INT):int,CAST(udf(cast(c as string)) AS STRING):string>
 -- !query 12 output
 3 BBBB
 3 bbbb
 -- !query 13
-SELECT c, max(a) FROM test_having
-	GROUP BY c HAVING count(*) > 2 OR min(a) = max(a)
+SELECT udf(c), max(udf(a)) FROM test_having
+	GROUP BY c HAVING udf(count(*)) > 2 OR udf(min(a)) = udf(max(a))
 	ORDER BY c
 -- !query 13 schema
-struct<c:string,max(a):int>
+struct<CAST(udf(cast(c as string)) AS STRING):string,max(CAST(udf(cast(a as string)) AS INT)):int>
 -- !query 13 output
 XXXX 0
 bbbb 5
 -- !query 14
-SELECT min(a), max(a) FROM test_having HAVING min(a) = max(a)
+SELECT udf(udf(min(udf(a)))), udf(udf(max(udf(a)))) FROM test_having HAVING udf(udf(min(udf(a)))) = udf(udf(max(udf(a))))
 -- !query 14 schema
-struct<min(a):int,max(a):int>
+struct<CAST(udf(cast(cast(udf(cast(min(cast(udf(cast(a as string)) as int)) as string)) as int) as string)) AS INT):int,CAST(udf(cast(cast(udf(cast(max(cast(udf(cast(a as string)) as int)) as string)) as int) as string)) AS INT):int>
 -- !query 14 output
 -- !query 15
-SELECT min(a), max(a) FROM test_having HAVING min(a) < max(a)
+SELECT udf(min(udf(a))), udf(udf(max(a))) FROM test_having HAVING udf(min(a)) < udf(max(udf(a)))
 -- !query 15 schema
-struct<min(a):int,max(a):int>
+struct<CAST(udf(cast(min(cast(udf(cast(a as string)) as int)) as string)) AS INT):int,CAST(udf(cast(cast(udf(cast(max(a) as string)) as int) as string)) AS INT):int>
 -- !query 15 output
 0 9
 -- !query 16
-SELECT a FROM test_having HAVING min(a) < max(a)
+SELECT udf(a) FROM test_having HAVING udf(min(a)) < udf(max(a))
 -- !query 16 schema
 struct<>
 -- !query 16 output
@@ -147,16 +147,16 @@ grouping expressions sequence is empty, and 'default.test_having.`a`' is not an
 -- !query 17
-SELECT 1 AS one FROM test_having HAVING a > 1
+SELECT 1 AS one FROM test_having HAVING udf(a) > 1
 -- !query 17 schema
 struct<>
 -- !query 17 output
 org.apache.spark.sql.AnalysisException
-cannot resolve '`a`' given input columns: [one]; line 1 pos 40
+cannot resolve '`a`' given input columns: [one]; line 1 pos 44
 -- !query 18
-SELECT 1 AS one FROM test_having HAVING 1 > 2
+SELECT 1 AS one FROM test_having HAVING udf(udf(1) > udf(2))
 -- !query 18 schema
 struct<one:int>
 -- !query 18 output
@@ -164,7 +164,7 @@ struct<one:int>
 -- !query 19
-SELECT 1 AS one FROM test_having HAVING 1 < 2
+SELECT 1 AS one FROM test_having HAVING udf(udf(1) < udf(2))
 -- !query 19 schema
 struct<one:int>
 -- !query 19 output
@@ -172,7 +172,7 @@ struct<one:int>
 -- !query 20
-SELECT 1 AS one FROM test_having WHERE 1/a = 1 HAVING 1 < 2
+SELECT 1 AS one FROM test_having WHERE 1/udf(a) = 1 HAVING 1 < 2
 -- !query 20 schema
 struct<one:int>
 -- !query 20 output
```

</p>
</details>

## How was this patch tested?

Tested by regenerating the golden file with:

```bash
sudo SPARK_GENERATE_GOLDEN_FILES=1 build/sbt "sql/test-only *SQLQueryTestSuite -- -z udf/pgSQL/udf-select_having.sql"
```

Closes #25161 from shivusondur/jira28390.

Authored-by: shivusondur <shivusondur@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

2019-07-24 14:43:39 +09:00
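The schema lines in the diff follow a regular pattern: each `udf(x)` call shows up as a cast of `x` to string, the UDF call, and a cast back to the original type. The sketch below reproduces that naming in plain Python so the long column names are easier to read. The helper `udf_cast` and its `nested_in_udf` flag are hypothetical names for illustration; the casing rule (upper-case `CAST ... AS` outside a UDF argument, lower-case inside one) is inferred only from the golden-file lines in this commit, not from Spark's implementation.

```python
def udf_cast(expr: str, sql_type: str, nested_in_udf: bool = False) -> str:
    """Render udf(expr) the way the schema lines in this diff spell it.

    Assumption: a call that sits inside another udf(...) argument is printed
    with lower-case casts; otherwise it is printed as CAST(... AS TYPE).
    """
    call = f"udf(cast({expr} as string))"
    if nested_in_udf:
        return f"cast({call} as {sql_type.lower()})"
    return f"CAST({call} AS {sql_type.upper()})"


# Query 11: SELECT udf(b) ...
q11_b = udf_cast("b", "int")

# Query 13: SELECT max(udf(a)) ... (the udf call is not inside another udf)
q13_max = f"max({udf_cast('a', 'int')})"

# Query 14: SELECT udf(udf(min(udf(a)))) ...
inner_a = udf_cast("a", "int", nested_in_udf=True)
inner_min = udf_cast(f"min({inner_a})", "int", nested_in_udf=True)
q14_min = udf_cast(inner_min, "int")

print(q11_b)
print(q13_max)
print(q14_min)
```

The three printed names can be compared against the `+struct<...>` lines for queries 11, 13, and 14 in the diff.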
benchmarks	[SPARK-27707][SQL] Prune unnecessary nested fields from Generate	2019-07-18 23:32:07 -07:00
src	[SPARK-28390][SQL][PYTHON][TESTS] Convert and port 'pgSQL/select_having.sql' into UDF test base	2019-07-24 14:43:39 +09:00
v1.2.1/src	[SPARK-28108][SQL][test-hadoop3.2] Simplify OrcFilters	2019-06-24 12:23:52 +08:00
v2.3.5/src	[SPARK-28108][SQL][test-hadoop3.2] Simplify OrcFilters	2019-06-24 12:23:52 +08:00
pom.xml	[SPARK-27521][SQL] Move data source v2 to catalyst module	2019-06-05 09:55:55 -07:00