---
layout: global
title: GROUP BY Clause
displayTitle: GROUP BY Clause
license: |
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  The ASF licenses this file to You under the Apache License, Version 2.0
  (the "License"); you may not use this file except in compliance with
  the License. You may obtain a copy of the License at

     http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License.
---

[SPARK-31383][SQL][DOC] Clean up the SQL documents in docs/sql-ref*
### What changes were proposed in this pull request?
This PR intends to clean up the SQL documents in `doc/sql-ref*`.
Main changes are as follows;
- Fixes wrong syntaxes and capitalize sub-titles
- Adds some DDL queries in `Examples` so that users can run examples there
- Makes query output in `Examples` follows the `Dataset.showString` (right-aligned) format
- Adds/Removes spaces, Indents, or blank lines to follow the format below;
```
---
license...
---
### Description
Writes what's the syntax is.
### Syntax
{% highlight sql %}
SELECT...
WHERE... // 4 indents after the second line
...
{% endhighlight %}
### Parameters
<dl>
<dt><code><em>Param Name</em></code></dt>
<dd>
Param Description
</dd>
...
</dl>
### Examples
{% highlight sql %}
-- It is better that users are able to execute example queries here.
-- So, we prepare test data in the first section if possible.
CREATE TABLE t (key STRING, value DOUBLE);
INSERT INTO t VALUES
('a', 1.0), ('a', 2.0), ('b', 3.0), ('c', 4.0);
-- query output has 2 indents and it follows the `Dataset.showString`
-- format (right-aligned).
SELECT * FROM t;
+---+-----+
|key|value|
+---+-----+
| a| 1.0|
| a| 2.0|
| b| 3.0|
| c| 4.0|
+---+-----+
-- Query statements after the second line have 4 indents.
SELECT key, SUM(value)
FROM t
GROUP BY key;
+---+----------+
|key|sum(value)|
+---+----------+
| c| 4.0|
| b| 3.0|
| a| 3.0|
+---+----------+
...
{% endhighlight %}
### Related Statements
* [XXX](xxx.html)
* ...
```
### Why are the changes needed?
The most changes of this PR are pretty minor, but I think the consistent formats/rules to write documents are important for long-term maintenance in our community
### Does this PR introduce any user-facing change?
Yes.
### How was this patch tested?
Manually checked.
Closes #28151 from maropu/MakeRightAligned.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Sean Owen <srowen@gmail.com>
2020-04-13 00:40:36 -04:00

### Description

The `GROUP BY` clause groups rows based on a set of specified grouping expressions and computes aggregations on
each group of rows using one or more specified aggregate functions. Spark also supports advanced aggregations that
compute multiple aggregations over the same input record set via the `GROUPING SETS`, `CUBE`, and `ROLLUP` clauses.
When a `FILTER` clause is attached to an aggregate function, only the matching rows are passed to that function.

### Syntax

```sql
GROUP BY group_expression [ , group_expression [ , ... ] ]
    [ { WITH ROLLUP | WITH CUBE | GROUPING SETS (grouping_set [ , ...]) } ]

GROUP BY GROUPING SETS (grouping_set [ , ...])
```

Aggregate functions are defined as follows:

```sql
aggregate_name ( [ DISTINCT ] expression [ , ... ] ) [ FILTER ( WHERE boolean_expression ) ]
```

### Parameters

* **GROUPING SETS**

    Groups the rows for each subset of the expressions specified in the grouping sets. For example,
    `GROUP BY GROUPING SETS (warehouse, product)` is semantically equivalent
    to the union of the results of `GROUP BY warehouse` and `GROUP BY product`. This clause
    is shorthand for a `UNION ALL` in which each leg of the `UNION ALL`
    operator aggregates over a subset of the columns specified in the `GROUPING SETS` clause.

* **grouping_set**

    A grouping set is specified by zero or more comma-separated expressions in parentheses.

    **Syntax:** `( [ expression [ , ... ] ] )`

* **group_expression**

    Specifies the criteria by which rows are grouped together. Rows are grouped based on the
    result values of the grouping expressions. A grouping expression may be a column alias, a column position,
    or an expression.

* **ROLLUP**

    Specifies multiple levels of aggregation in a single statement. This clause computes aggregations
    based on multiple grouping sets. `ROLLUP` is shorthand for `GROUPING SETS`. For example,
    `GROUP BY warehouse, product WITH ROLLUP` is equivalent to `GROUP BY GROUPING SETS
    ((warehouse, product), (warehouse), ())`.
    A `ROLLUP` specification with N elements results in N+1 grouping sets.

* **CUBE**

    The `CUBE` clause performs aggregations over every combination of the grouping columns specified in the
    `GROUP BY` clause. `CUBE` is shorthand for `GROUPING SETS`. For example,
    `GROUP BY warehouse, product WITH CUBE` is equivalent to `GROUP BY GROUPING SETS
    ((warehouse, product), (warehouse), (product), ())`.
    A `CUBE` specification with N elements results in 2^N grouping sets.

* **aggregate_name**

    Specifies an aggregate function name (MIN, MAX, COUNT, SUM, AVG, etc.).

* **DISTINCT**

    Removes duplicates in input rows before they are passed to aggregate functions.

* **FILTER**

    Filters the input rows: only rows for which the `boolean_expression` in the `WHERE` clause evaluates
    to true are passed to the aggregate function; other rows are discarded.

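The grouping-set counts given for `ROLLUP` and `CUBE` can be made concrete with a small sketch. It assumes a hypothetical table `t` with grouping columns `a`, `b`, `c` and a value column `v` (these names are illustrative only and are not used elsewhere on this page), and writes out the `GROUPING SETS` form that each shorthand is expected to expand to for N = 3.

```sql
-- Hypothetical table, used only for illustration.
CREATE TABLE t (a STRING, b STRING, c STRING, v INT);

-- ROLLUP over 3 columns: 3 + 1 = 4 grouping sets.
SELECT a, b, c, sum(v) AS sum FROM t
    GROUP BY a, b, c WITH ROLLUP;
-- The same aggregation with the grouping sets written out.
SELECT a, b, c, sum(v) AS sum FROM t
    GROUP BY GROUPING SETS ((a, b, c), (a, b), (a), ());

-- CUBE over 3 columns: 2^3 = 8 grouping sets.
SELECT a, b, c, sum(v) AS sum FROM t
    GROUP BY a, b, c WITH CUBE;
-- The same aggregation with the grouping sets written out.
SELECT a, b, c, sum(v) AS sum FROM t
    GROUP BY GROUPING SETS (
        (a, b, c), (a, b), (a, c), (b, c), (a), (b), (c), ());
```

`ROLLUP` drops columns from the right one at a time, so column order matters there; `CUBE` enumerates every subset, so it does not.
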
### Examples

```sql
CREATE TABLE dealer (id INT, city STRING, car_model STRING, quantity INT);
INSERT INTO dealer VALUES
    (100, 'Fremont', 'Honda Civic', 10),
    (100, 'Fremont', 'Honda Accord', 15),
    (100, 'Fremont', 'Honda CRV', 7),
    (200, 'Dublin', 'Honda Civic', 20),
    (200, 'Dublin', 'Honda Accord', 10),
    (200, 'Dublin', 'Honda CRV', 3),
    (300, 'San Jose', 'Honda Civic', 5),
    (300, 'San Jose', 'Honda Accord', 8);

-- Sum of quantity per dealership. Group by `id`.
SELECT id, sum(quantity) FROM dealer GROUP BY id ORDER BY id;
+---+-------------+
| id|sum(quantity)|
+---+-------------+
|100|           32|
|200|           33|
|300|           13|
+---+-------------+

-- Use column position in the GROUP BY clause.
SELECT id, sum(quantity) FROM dealer GROUP BY 1 ORDER BY 1;
+---+-------------+
| id|sum(quantity)|
+---+-------------+
|100|           32|
|200|           33|
|300|           13|
+---+-------------+
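
-- Group by a column alias that names an expression. This is an added sketch,
-- not one of the original examples; it assumes the default configuration, in
-- which GROUP BY may reference aliases from the SELECT list
-- (spark.sql.groupByAliases).
SELECT upper(city) AS city_upper, sum(quantity) AS sum FROM dealer
    GROUP BY city_upper ORDER BY city_upper;
+----------+---+
|city_upper|sum|
+----------+---+
|    DUBLIN| 33|
|   FREMONT| 32|
|  SAN JOSE| 13|
+----------+---+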

-- Multiple aggregations.
-- 1. Sum of quantity per dealership.
-- 2. Max quantity per dealership.
SELECT id, sum(quantity) AS sum, max(quantity) AS max FROM dealer GROUP BY id ORDER BY id;
+---+---+---+
| id|sum|max|
+---+---+---+
|100| 32| 15|
|200| 33| 20|
|300| 13|  8|
+---+---+---+

-- Count the number of distinct dealer cities per car_model.
SELECT car_model, count(DISTINCT city) AS count FROM dealer GROUP BY car_model;
+------------+-----+
|   car_model|count|
+------------+-----+
| Honda Civic|    3|
|   Honda CRV|    2|
|Honda Accord|    3|
+------------+-----+

-- Sum of only 'Honda Civic' and 'Honda CRV' quantities per dealership.
SELECT id, sum(quantity) FILTER (
            WHERE car_model IN ('Honda Civic', 'Honda CRV')
        ) AS `sum(quantity)` FROM dealer
    GROUP BY id ORDER BY id;
+---+-------------+
| id|sum(quantity)|
+---+-------------+
|100|           17|
|200|           23|
|300|            5|
+---+-------------+

-- Aggregations using multiple sets of grouping columns in a single statement.
-- The following performs aggregations based on four sets of grouping columns.
-- 1. city, car_model
-- 2. city
-- 3. car_model
-- 4. Empty grouping set. Returns the total quantity across all cities and car models.
SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), ())
    ORDER BY city;
+---------+------------+---+
|     city|   car_model|sum|
+---------+------------+---+
|     null|        null| 78|
|     null|Honda Accord| 33|
|     null|   Honda CRV| 10|
|     null| Honda Civic| 35|
|   Dublin|        null| 33|
|   Dublin|Honda Accord| 10|
|   Dublin|   Honda CRV|  3|
|   Dublin| Honda Civic| 20|
|  Fremont|        null| 32|
|  Fremont|Honda Accord| 15|
|  Fremont|   Honda CRV|  7|
|  Fremont| Honda Civic| 10|
| San Jose|        null| 13|
| San Jose|Honda Accord|  8|
| San Jose| Honda Civic|  5|
+---------+------------+---+
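
-- An added sketch (not one of the original examples) of the `UNION ALL` form
-- that the `GROUPING SETS` query above is shorthand for. Each leg aggregates
-- over one grouping set and fills the columns it does not group by with NULL.
-- It is expected to return the same 15 rows as the query above.
SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY city, car_model
UNION ALL
SELECT city, CAST(NULL AS STRING) AS car_model, sum(quantity) AS sum FROM dealer
    GROUP BY city
UNION ALL
SELECT CAST(NULL AS STRING) AS city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY car_model
UNION ALL
SELECT CAST(NULL AS STRING) AS city, CAST(NULL AS STRING) AS car_model, sum(quantity) AS sum
    FROM dealer
ORDER BY city, car_model;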

-- Alternate syntax for `GROUPING SETS` in which both `GROUP BY` and `GROUPING SETS`
-- specifications are present.
SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY city, car_model GROUPING SETS ((city, car_model), (city), (car_model), ())
    ORDER BY city, car_model;
+---------+------------+---+
|     city|   car_model|sum|
+---------+------------+---+
|     null|        null| 78|
|     null|Honda Accord| 33|
|     null|   Honda CRV| 10|
|     null| Honda Civic| 35|
|   Dublin|        null| 33|
|   Dublin|Honda Accord| 10|
|   Dublin|   Honda CRV|  3|
|   Dublin| Honda Civic| 20|
|  Fremont|        null| 32|
|  Fremont|Honda Accord| 15|
|  Fremont|   Honda CRV|  7|
|  Fremont| Honda Civic| 10|
| San Jose|        null| 13|
| San Jose|Honda Accord|  8|
| San Jose| Honda Civic|  5|
+---------+------------+---+

-- Group by processing with `ROLLUP` clause.
-- Equivalent to GROUP BY GROUPING SETS ((city, car_model), (city), ())
SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY city, car_model WITH ROLLUP
    ORDER BY city, car_model;
+---------+------------+---+
|     city|   car_model|sum|
+---------+------------+---+
|     null|        null| 78|
|   Dublin|        null| 33|
|   Dublin|Honda Accord| 10|
|   Dublin|   Honda CRV|  3|
|   Dublin| Honda Civic| 20|
|  Fremont|        null| 32|
|  Fremont|Honda Accord| 15|
|  Fremont|   Honda CRV|  7|
|  Fremont| Honda Civic| 10|
| San Jose|        null| 13|
| San Jose|Honda Accord|  8|
| San Jose| Honda Civic|  5|
+---------+------------+---+

-- Group by processing with `CUBE` clause.
-- Equivalent to GROUP BY GROUPING SETS ((city, car_model), (city), (car_model), ())
SELECT city, car_model, sum(quantity) AS sum FROM dealer
    GROUP BY city, car_model WITH CUBE
    ORDER BY city, car_model;
+---------+------------+---+
|     city|   car_model|sum|
+---------+------------+---+
|     null|        null| 78|
|     null|Honda Accord| 33|
|     null|   Honda CRV| 10|
|     null| Honda Civic| 35|
|   Dublin|        null| 33|
|   Dublin|Honda Accord| 10|
|   Dublin|   Honda CRV|  3|
|   Dublin| Honda Civic| 20|
|  Fremont|        null| 32|
|  Fremont|Honda Accord| 15|
|  Fremont|   Honda CRV|  7|
|  Fremont| Honda Civic| 10|
| San Jose|        null| 13|
| San Jose|Honda Accord|  8|
| San Jose| Honda Civic|  5|
+---------+------------+---+
```

### Related Statements

* [SELECT Main](sql-ref-syntax-qry-select.html)
* [WHERE Clause](sql-ref-syntax-qry-select-where.html)
* [HAVING Clause](sql-ref-syntax-qry-select-having.html)
* [ORDER BY Clause](sql-ref-syntax-qry-select-orderby.html)
* [SORT BY Clause](sql-ref-syntax-qry-select-sortby.html)
* [CLUSTER BY Clause](sql-ref-syntax-qry-select-clusterby.html)
* [DISTRIBUTE BY Clause](sql-ref-syntax-qry-select-distribute-by.html)
* [LIMIT Clause](sql-ref-syntax-qry-select-limit.html)