spark-instrumented-optimizer

History

Liang-Chi Hsieh e6a0385289 [SPARK-28422][SQL][PYTHON] GROUPED_AGG pandas_udf should work without group by clause ## What changes were proposed in this pull request? A GROUPED_AGG pandas python udf can't work, if without group by clause, like `select udf(id) from table`. This doesn't match with aggregate function like sum, count..., and also dataset API like `df.agg(udf(df['id']))`. When we parse a udf (or an aggregate function) like that from SQL syntax, it is known as a function in a project. `GlobalAggregates` rule in analysis makes such project as aggregate, by looking for aggregate expressions. At the moment, we should also look for GROUPED_AGG pandas python udf. ## How was this patch tested? Added tests. Closes #25352 from viirya/SPARK-28422. Authored-by: Liang-Chi Hsieh <viirya@gmail.com> Signed-off-by: HyukjinKwon <gurwls223@apache.org>	2019-08-14 00:32:33 +09:00
..
main	[SPARK-28383][SQL] SHOW CREATE TABLE is not supported on a temporary view	2019-08-12 21:01:19 -07:00
test	[SPARK-28422][SQL][PYTHON] GROUPED_AGG pandas_udf should work without group by clause	2019-08-14 00:32:33 +09:00

Liang-Chi Hsieh e6a0385289 [SPARK-28422][SQL][PYTHON] GROUPED_AGG pandas_udf should work without group by clause

## What changes were proposed in this pull request?

A GROUPED_AGG pandas python udf can't work, if without group by clause, like `select udf(id) from table`.

This doesn't match with aggregate function like sum, count..., and also dataset API like `df.agg(udf(df['id']))`.

When we parse a udf (or an aggregate function) like that from SQL syntax, it is known as a function in a project. `GlobalAggregates` rule in analysis makes such project as aggregate, by looking for aggregate expressions. At the moment, we should also look for GROUPED_AGG pandas python udf.

## How was this patch tested?

Added tests.

Closes #25352 from viirya/SPARK-28422.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

2019-08-14 00:32:33 +09:00

main

[SPARK-28383][SQL] SHOW CREATE TABLE is not supported on a temporary view

2019-08-12 21:01:19 -07:00

test

[SPARK-28422][SQL][PYTHON] GROUPED_AGG pandas_udf should work without group by clause

2019-08-14 00:32:33 +09:00