c1bb3316bd
## What changes were proposed in this pull request?

Add a `count_if` function, which returns the number of records satisfying a given condition.

Spark has no aggregate function like this, so we currently have to write
- `COUNT(CASE WHEN some_condition THEN 1 END)` or
- `SUM(CASE WHEN some_condition THEN 1 END)`,

which is painful.

This kind of function is already supported in Presto, BigQuery, and even Excel.
- Presto: [`count_if`](https://prestodb.github.io/docs/current/functions/aggregate.html#count_if)
- BigQuery: [`countif`](https://cloud.google.com/bigquery/docs/reference/standard-sql/aggregate_functions?hl=en#countif)
- Excel: [`COUNTIF`](https://support.office.com/en-us/article/countif-function-e0de10c6-f885-4e71-abb4-1f464816df34?omkt=en-US&ui=en-US&rs=en-US&ad=US) (it differs slightly from the two above)

## How was this patch tested?

This patch is tested by unit tests.

Closes #24335 from cryeo/SPARK-27425.

Authored-by: Chaerim Yeo <yeochaerim@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
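For illustration only, the two workaround patterns quoted above can be compared with a minimal sketch. It runs against an in-memory SQLite database from Python's standard library rather than Spark, and the table `t`, column `x`, and helper `run` are hypothetical names introduced here. It shows that both forms agree when some rows match, but that the `SUM` form yields NULL instead of 0 when none do, which is one reason a dedicated conditional count is easier to use correctly.

```python
import sqlite3

# Hypothetical sample data: three integers and one NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,), (None,)])


def run(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]


# Both workarounds count the rows where the condition is true.
count_form = run("SELECT COUNT(CASE WHEN x > 1 THEN 1 END) FROM t")
sum_form = run("SELECT SUM(CASE WHEN x > 1 THEN 1 END) FROM t")

# When no row matches, COUNT gives 0 but SUM gives NULL (None in Python).
count_none = run("SELECT COUNT(CASE WHEN x > 100 THEN 1 END) FROM t")
sum_none = run("SELECT SUM(CASE WHEN x > 100 THEN 1 END) FROM t")

print(count_form, sum_form, count_none, sum_none)
```

The row with a NULL `x` is counted by neither form, because the comparison evaluates to NULL and the `CASE` expression falls through to NULL.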