### What changes were proposed in this pull request?
As discussed in
https://github.com/apache/spark/pull/30145#discussion_r514728642https://github.com/apache/spark/pull/30145#discussion_r514734648
We need to rewrite current Grouping Analytics grammar to support as flexible as Postgres SQL to support subsequent development.
In postgres sql, it support
```
select a, b, c, count(1) from t group by cube (a, b, c);
select a, b, c, count(1) from t group by cube(a, b, c);
select a, b, c, count(1) from t group by cube (a, b, c, (a, b), (a, b, c));
select a, b, c, count(1) from t group by rollup(a, b, c);
select a, b, c, count(1) from t group by rollup (a, b, c);
select a, b, c, count(1) from t group by rollup (a, b, c, (a, b), (a, b, c));
```
In this pr, we have done three things as below, and we will split it to different pr:
- Refactor CUBE/ROLLUP (regarding them as ANTLR tokens in a parser)
- Refactor GROUPING SETS (the logical node -> a new expr)
- Support new syntax for CUBE/ROLLUP (e.g., GROUP BY CUBE ((a, b), (a, c)))
### Why are the changes needed?
Rewrite current Grouping Analytics grammar to support as flexible as Postgres SQL to support subsequent development.
### Does this PR introduce _any_ user-facing change?
User can write Grouping Analytics grammar as flexible as Postgres SQL to support subsequent development.
### How was this patch tested?
Added UT
Closes#30212 from AngersZhuuuu/refact-grouping-analytics.
Lead-authored-by: angerszhu <angers.zhu@gmail.com>
Co-authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Co-authored-by: AngersZhuuuu <angers.zhu@gmail.com>
Co-authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
This PR intends to fix typos in the sub-modules:
* `bin`
* `core`
* `docs`
* `external`
* `mllib`
* `repl`
* `pom.xml`
Split per srowen https://github.com/apache/spark/pull/30323#issuecomment-728981618
NOTE: The misspellings have been reported at 706a726f87 (commitcomment-44064356)
### Why are the changes needed?
Misspelled words make it harder to read / understand content.
### Does this PR introduce _any_ user-facing change?
There are various fixes to documentation, etc...
### How was this patch tested?
No testing was performed
Closes#30530 from jsoref/spelling-bin-core-docs-external-mllib-repl.
Authored-by: Josh Soref <jsoref@users.noreply.github.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
### What changes were proposed in this pull request?
Fix typo for docs, log messages and comments
### Why are the changes needed?
typo fix to increase readability
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
manual test has been performed to test the updated
Closes#29443 from brandonJY/spell-fix-doc.
Authored-by: Brandon Jiang <Brandon.jiang.a@outlook.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
### What changes were proposed in this pull request?
Remove the unneeded embedded inline HTML markup by using the basic markdown syntax.
Please see #28414
### Why are the changes needed?
Make the doc cleaner and easily editable by MD editors.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manually build and check
Closes#28451 from huaxingao/html_cleanup.
Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
This PR intends to clean up the SQL documents in `doc/sql-ref*`.
Main changes are as follows;
- Fixes wrong syntaxes and capitalize sub-titles
- Adds some DDL queries in `Examples` so that users can run examples there
- Makes query output in `Examples` follows the `Dataset.showString` (right-aligned) format
- Adds/Removes spaces, Indents, or blank lines to follow the format below;
```
---
license...
---
### Description
Writes what's the syntax is.
### Syntax
{% highlight sql %}
SELECT...
WHERE... // 4 indents after the second line
...
{% endhighlight %}
### Parameters
<dl>
<dt><code><em>Param Name</em></code></dt>
<dd>
Param Description
</dd>
...
</dl>
### Examples
{% highlight sql %}
-- It is better that users are able to execute example queries here.
-- So, we prepare test data in the first section if possible.
CREATE TABLE t (key STRING, value DOUBLE);
INSERT INTO t VALUES
('a', 1.0), ('a', 2.0), ('b', 3.0), ('c', 4.0);
-- query output has 2 indents and it follows the `Dataset.showString`
-- format (right-aligned).
SELECT * FROM t;
+---+-----+
|key|value|
+---+-----+
| a| 1.0|
| a| 2.0|
| b| 3.0|
| c| 4.0|
+---+-----+
-- Query statements after the second line have 4 indents.
SELECT key, SUM(value)
FROM t
GROUP BY key;
+---+----------+
|key|sum(value)|
+---+----------+
| c| 4.0|
| b| 3.0|
| a| 3.0|
+---+----------+
...
{% endhighlight %}
### Related Statements
* [XXX](xxx.html)
* ...
```
### Why are the changes needed?
The most changes of this PR are pretty minor, but I think the consistent formats/rules to write documents are important for long-term maintenance in our community
### Does this PR introduce any user-facing change?
Yes.
### How was this patch tested?
Manually checked.
Closes#28151 from maropu/MakeRightAligned.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
This PR intends to improve the SQL document of `GROUP BY`; it added the description about FILTER clauses of aggregate functions.
### Why are the changes needed?
To improve the SQL documents
### Does this PR introduce any user-facing change?
Yes.
<img src="https://user-images.githubusercontent.com/692303/78558612-e2234a80-784d-11ea-9353-b3feac4d57a7.png" width="500">
### How was this patch tested?
Manually checked.
Closes#28134 from maropu/SPARK-31358.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
### What changes were proposed in this pull request?
A few improvements to the sql ref SELECT doc:
1. correct the syntax of SELECT query
2. correct the default of null sort order
3. correct the GROUP BY syntax
4. several minor fixes
### Why are the changes needed?
refine document
### Does this PR introduce any user-facing change?
N/A
### How was this patch tested?
N/A
Closes#27866 from cloud-fan/doc.
Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
- Sets up links between related sections.
- Add "Related sections" for each section.
- Change to the left hand side menu to reflect the current status of the doc.
- Other minor cleanups.
### Why are the changes needed?
Currently Spark lacks documentation on the supported SQL constructs causing
confusion among users who sometimes have to look at the code to understand the
usage. This is aimed at addressing this issue.
### Does this PR introduce any user-facing change?
Yes.
### How was this patch tested?
Tested using jykyll build --serve
Closes#27371 from dilipbiswal/select_finalization.
Authored-by: Dilip Biswal <dkbiswal@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
Fix a few super nit problems
### Why are the changes needed?
To make doc look better
### Does this PR introduce any user-facing change?
Yes
### How was this patch tested?
Tested using jykyll build --serve
Closes#27332 from huaxingao/spark-30575-followup.
Authored-by: Huaxin Gao <huaxing@us.ibm.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
## What changes were proposed in this pull request?
This is a initial PR that creates the table of content for SQL reference guide. The left side bar will displays additional menu items corresponding to supported SQL constructs. One this PR is merged, we will fill in the content incrementally. Additionally this PR contains a minor change to make the left sidebar scrollable. Currently it is not possible to scroll in the left hand side window.
## How was this patch tested?
Used jekyll build and serve to verify.
Closes#25459 from dilipbiswal/ref-doc.
Authored-by: Dilip Biswal <dbiswal@us.ibm.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>