2018-10-18 14:59:06 -04:00
|
|
|
- text: Getting Started
|
|
|
|
url: sql-getting-started.html
|
|
|
|
subitems:
|
|
|
|
- text: "Starting Point: SparkSession"
|
|
|
|
url: sql-getting-started.html#starting-point-sparksession
|
|
|
|
- text: Creating DataFrames
|
|
|
|
url: sql-getting-started.html#creating-dataframes
|
|
|
|
- text: Untyped Dataset Operations (DataFrame operations)
|
|
|
|
url: sql-getting-started.html#untyped-dataset-operations-aka-dataframe-operations
|
|
|
|
- text: Running SQL Queries Programmatically
|
|
|
|
url: sql-getting-started.html#running-sql-queries-programmatically
|
|
|
|
- text: Global Temporary View
|
|
|
|
url: sql-getting-started.html#global-temporary-view
|
|
|
|
- text: Creating Datasets
|
|
|
|
url: sql-getting-started.html#creating-datasets
|
|
|
|
- text: Interoperating with RDDs
|
|
|
|
url: sql-getting-started.html#interoperating-with-rdds
|
2019-12-27 00:22:26 -05:00
|
|
|
- text: Scalar Functions
|
|
|
|
url: sql-getting-started.html#scalar-functions
|
2018-10-18 14:59:06 -04:00
|
|
|
- text: Aggregations
|
|
|
|
url: sql-getting-started.html#aggregations
|
|
|
|
- text: Data Sources
|
|
|
|
url: sql-data-sources.html
|
|
|
|
subitems:
|
|
|
|
- text: "Generic Load/Save Functions"
|
|
|
|
url: sql-data-sources-load-save-functions.html
|
2020-02-05 04:16:38 -05:00
|
|
|
- text: "Generic File Source Options"
|
|
|
|
url: sql-data-sources-generic-options.html
|
2018-10-18 14:59:06 -04:00
|
|
|
- text: Parquet Files
|
|
|
|
url: sql-data-sources-parquet.html
|
|
|
|
- text: ORC Files
|
|
|
|
url: sql-data-sources-orc.html
|
|
|
|
- text: JSON Files
|
|
|
|
url: sql-data-sources-json.html
|
2021-04-04 22:17:42 -04:00
|
|
|
- text: CSV Files
|
|
|
|
url: sql-data-sources-csv.html
|
2021-04-07 10:11:43 -04:00
|
|
|
- text: Text Files
|
|
|
|
url: sql-data-sources-text.html
|
2018-10-18 14:59:06 -04:00
|
|
|
- text: Hive Tables
|
|
|
|
url: sql-data-sources-hive-tables.html
|
|
|
|
- text: JDBC To Other Databases
|
|
|
|
url: sql-data-sources-jdbc.html
|
|
|
|
- text: Avro Files
|
|
|
|
url: sql-data-sources-avro.html
|
2019-12-27 00:22:26 -05:00
|
|
|
- text: Whole Binary Files
|
|
|
|
url: sql-data-sources-binaryFile.html
|
2018-10-18 14:59:06 -04:00
|
|
|
- text: Troubleshooting
|
|
|
|
url: sql-data-sources-troubleshooting.html
|
2018-10-23 00:19:31 -04:00
|
|
|
- text: Performance Tuning
|
|
|
|
url: sql-performance-tuning.html
|
2018-10-18 14:59:06 -04:00
|
|
|
subitems:
|
|
|
|
- text: Caching Data In Memory
|
2018-10-23 00:19:31 -04:00
|
|
|
url: sql-performance-tuning.html#caching-data-in-memory
|
2018-10-18 14:59:06 -04:00
|
|
|
- text: Other Configuration Options
|
2018-10-23 00:19:31 -04:00
|
|
|
url: sql-performance-tuning.html#other-configuration-options
|
2019-12-27 00:22:26 -05:00
|
|
|
- text: Join Strategy Hints for SQL Queries
|
|
|
|
url: sql-performance-tuning.html#join-strategy-hints-for-sql-queries
|
2020-11-29 14:24:58 -05:00
|
|
|
- text: Coalesce Hints for SQL Queries
|
|
|
|
url: sql-performance-tuning.html#coalesce-hints-for-sql-queries
|
|
|
|
- text: Adaptive Query Execution
|
|
|
|
url: sql-performance-tuning.html#adaptive-query-execution
|
2018-10-18 14:59:06 -04:00
|
|
|
- text: Distributed SQL Engine
|
|
|
|
url: sql-distributed-sql-engine.html
|
|
|
|
subitems:
|
|
|
|
- text: "Running the Thrift JDBC/ODBC server"
|
|
|
|
url: sql-distributed-sql-engine.html#running-the-thrift-jdbcodbc-server
|
|
|
|
- text: Running the Spark SQL CLI
|
|
|
|
url: sql-distributed-sql-engine.html#running-the-spark-sql-cli
|
|
|
|
- text: PySpark Usage Guide for Pandas with Apache Arrow
|
|
|
|
url: sql-pyspark-pandas-with-arrow.html
|
|
|
|
- text: Migration Guide
|
[SPARK-29052][DOCS][ML][PYTHON][CORE][R][SQL][SS] Create a Migration Guide tap in Spark documentation
### What changes were proposed in this pull request?
Currently, there is no migration section for PySpark, SparkCore and Structured Streaming.
It is difficult for users to know what to do when they upgrade.
This PR proposes to create a "Migration Guide" tab in the Spark documentation.
![Screen Shot 2019-09-11 at 7 02 05 PM](https://user-images.githubusercontent.com/6477701/64688126-ad712f80-d4c6-11e9-8672-9a2c56c05bf8.png)
![Screen Shot 2019-09-11 at 7 27 15 PM](https://user-images.githubusercontent.com/6477701/64689915-389ff480-d4ca-11e9-8c54-7f46095d0d23.png)
This page will contain migration guides for Spark SQL, PySpark, SparkR, MLlib, Structured Streaming and Core. Basically it is a refactoring.
Some new information has been added; I will leave inline comments on it for easier review.
1. **MLlib**
Merge [ml-guide.html#migration-guide](https://spark.apache.org/docs/latest/ml-guide.html#migration-guide) and [ml-migration-guides.html](https://spark.apache.org/docs/latest/ml-migration-guides.html)
```
'docs/ml-guide.md'
↓ Merge new/old migration guides
'docs/ml-migration-guide.md'
```
2. **PySpark**
Extract PySpark specific items from https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html
```
'docs/sql-migration-guide-upgrade.md'
↓ Extract PySpark specific items
'docs/pyspark-migration-guide.md'
```
3. **SparkR**
Move [sparkr.html#migration-guide](https://spark.apache.org/docs/latest/sparkr.html#migration-guide) into a separate file, and extract from [sql-migration-guide-upgrade.html](https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html)
```
'docs/sparkr.md' 'docs/sql-migration-guide-upgrade.md'
Move migration guide section ↘ ↙ Extract SparkR specific items
'docs/sparkr-migration-guide.md'
```
4. **Core**
Newly created at `'docs/core-migration-guide.md'`. I skimmed resolved JIRAs at 3.0.0 and found some items to note.
5. **Structured Streaming**
Newly created at `'docs/ss-migration-guide.md'`. I skimmed resolved JIRAs at 3.0.0 and found some items to note.
6. **SQL**
Merged [sql-migration-guide-upgrade.html](https://spark.apache.org/docs/latest/sql-migration-guide-upgrade.html) and [sql-migration-guide-hive-compatibility.html](https://spark.apache.org/docs/latest/sql-migration-guide-hive-compatibility.html)
```
'docs/sql-migration-guide-hive-compatibility.md' 'docs/sql-migration-guide-upgrade.md'
Move Hive compatibility section ↘ ↙ Left over after filtering PySpark and SparkR items
'docs/sql-migration-guide.md'
```
### Why are the changes needed?
So that users in production can migrate to higher versions effectively and detect behaviour or breaking changes before upgrading or migrating.
### Does this PR introduce any user-facing change?
Yes, this changes Spark's documentation at https://spark.apache.org/docs/latest/index.html.
### How was this patch tested?
Manually built the docs. This can be verified as follows:
```bash
cd docs
SKIP_API=1 jekyll build
open _site/index.html
```
Closes #25757 from HyukjinKwon/migration-doc.
Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-15 14:17:30 -04:00
|
|
|
url: sql-migration-old.html
|
2019-08-19 02:17:50 -04:00
|
|
|
- text: SQL Reference
|
|
|
|
url: sql-ref.html
|
2018-10-18 14:59:06 -04:00
|
|
|
subitems:
|
2020-02-13 13:53:55 -05:00
|
|
|
- text: ANSI Compliance
|
|
|
|
url: sql-ref-ansi-compliance.html
|
|
|
|
subitems:
|
|
|
|
- text: Arithmetic Operations
|
|
|
|
url: sql-ref-ansi-compliance.html#arithmetic-operations
|
|
|
|
- text: Type Conversion
|
|
|
|
url: sql-ref-ansi-compliance.html#type-conversion
|
|
|
|
- text: SQL Keywords
|
|
|
|
url: sql-ref-ansi-compliance.html#sql-keywords
|
2020-05-22 19:43:16 -04:00
|
|
|
- text: Data Types
|
|
|
|
url: sql-ref-datatypes.html
|
|
|
|
- text: Datetime Pattern
|
|
|
|
url: sql-ref-datetime-pattern.html
|
|
|
|
- text: Functions
|
|
|
|
url: sql-ref-functions.html
|
|
|
|
subitems:
|
|
|
|
- text: Built-in Functions
|
|
|
|
url: sql-ref-functions-builtin.html
|
|
|
|
- text: Scalar UDFs (User-Defined Functions)
|
|
|
|
url: sql-ref-functions-udf-scalar.html
|
|
|
|
- text: UDAFs (User-Defined Aggregate Functions)
|
|
|
|
url: sql-ref-functions-udf-aggregate.html
|
|
|
|
- text: Integration with Hive UDFs/UDAFs/UDTFs
|
|
|
|
url: sql-ref-functions-udf-hive.html
|
|
|
|
- text: Identifiers
|
|
|
|
url: sql-ref-identifier.html
|
|
|
|
- text: Literals
|
|
|
|
url: sql-ref-literals.html
|
|
|
|
- text: Null Semantics
|
|
|
|
url: sql-ref-null-semantics.html
|
2019-08-19 02:17:50 -04:00
|
|
|
- text: SQL Syntax
|
|
|
|
url: sql-ref-syntax.html
|
|
|
|
subitems:
|
|
|
|
- text: Data Definition Statements
|
|
|
|
url: sql-ref-syntax-ddl.html
|
|
|
|
subitems:
|
|
|
|
- text: ALTER DATABASE
|
|
|
|
url: sql-ref-syntax-ddl-alter-database.html
|
|
|
|
- text: ALTER TABLE
|
|
|
|
url: sql-ref-syntax-ddl-alter-table.html
|
|
|
|
- text: ALTER VIEW
|
|
|
|
url: sql-ref-syntax-ddl-alter-view.html
|
|
|
|
- text: CREATE DATABASE
|
|
|
|
url: sql-ref-syntax-ddl-create-database.html
|
|
|
|
- text: CREATE FUNCTION
|
|
|
|
url: sql-ref-syntax-ddl-create-function.html
|
|
|
|
- text: CREATE TABLE
|
|
|
|
url: sql-ref-syntax-ddl-create-table.html
|
|
|
|
- text: CREATE VIEW
|
|
|
|
url: sql-ref-syntax-ddl-create-view.html
|
|
|
|
- text: DROP DATABASE
|
|
|
|
url: sql-ref-syntax-ddl-drop-database.html
|
|
|
|
- text: DROP FUNCTION
|
|
|
|
url: sql-ref-syntax-ddl-drop-function.html
|
|
|
|
- text: DROP TABLE
|
|
|
|
url: sql-ref-syntax-ddl-drop-table.html
|
|
|
|
- text: DROP VIEW
|
|
|
|
url: sql-ref-syntax-ddl-drop-view.html
|
|
|
|
- text: TRUNCATE TABLE
|
|
|
|
url: sql-ref-syntax-ddl-truncate-table.html
|
|
|
|
- text: REPAIR TABLE
|
|
|
|
url: sql-ref-syntax-ddl-repair-table.html
|
2020-03-31 19:42:15 -04:00
|
|
|
- text: USE DATABASE
|
2020-07-04 22:01:07 -04:00
|
|
|
url: sql-ref-syntax-ddl-usedb.html
|
2019-08-19 02:17:50 -04:00
|
|
|
- text: Data Manipulation Statements
|
|
|
|
url: sql-ref-syntax-dml.html
|
|
|
|
subitems:
|
|
|
|
- text: INSERT
|
|
|
|
url: sql-ref-syntax-dml-insert.html
|
|
|
|
- text: LOAD
|
|
|
|
url: sql-ref-syntax-dml-load.html
|
|
|
|
- text: Data Retrieval (Queries)
|
|
|
|
url: sql-ref-syntax-qry.html
|
|
|
|
subitems:
|
|
|
|
- text: SELECT
|
|
|
|
url: sql-ref-syntax-qry-select.html
|
|
|
|
subitems:
|
2020-01-29 09:41:40 -05:00
|
|
|
- text: WHERE Clause
|
|
|
|
url: sql-ref-syntax-qry-select-where.html
|
2019-08-19 02:17:50 -04:00
|
|
|
- text: GROUP BY Clause
|
|
|
|
url: sql-ref-syntax-qry-select-groupby.html
|
|
|
|
- text: HAVING Clause
|
|
|
|
url: sql-ref-syntax-qry-select-having.html
|
2020-01-29 09:41:40 -05:00
|
|
|
- text: ORDER BY Clause
|
|
|
|
url: sql-ref-syntax-qry-select-orderby.html
|
|
|
|
- text: SORT BY Clause
|
|
|
|
url: sql-ref-syntax-qry-select-sortby.html
|
|
|
|
- text: CLUSTER BY Clause
|
|
|
|
url: sql-ref-syntax-qry-select-clusterby.html
|
|
|
|
- text: DISTRIBUTE BY Clause
|
|
|
|
url: sql-ref-syntax-qry-select-distribute-by.html
|
2019-08-19 02:17:50 -04:00
|
|
|
- text: LIMIT Clause
|
|
|
|
url: sql-ref-syntax-qry-select-limit.html
|
2020-05-10 13:57:25 -04:00
|
|
|
- text: Common Table Expression
|
|
|
|
url: sql-ref-syntax-qry-select-cte.html
|
2020-05-30 15:51:45 -04:00
|
|
|
- text: Hints
|
|
|
|
url: sql-ref-syntax-qry-select-hints.html
|
2020-05-10 13:57:25 -04:00
|
|
|
- text: Inline Table
|
|
|
|
url: sql-ref-syntax-qry-select-inline-table.html
|
2020-10-27 22:21:35 -04:00
|
|
|
- text: File
|
|
|
|
url: sql-ref-syntax-qry-select-file.html
|
2020-04-12 14:57:54 -04:00
|
|
|
- text: JOIN
|
|
|
|
url: sql-ref-syntax-qry-select-join.html
|
2020-05-10 13:57:25 -04:00
|
|
|
- text: LIKE Predicate
|
|
|
|
url: sql-ref-syntax-qry-select-like.html
|
2020-04-08 11:51:04 -04:00
|
|
|
- text: Set Operators
|
|
|
|
url: sql-ref-syntax-qry-select-setops.html
|
2020-04-09 20:39:34 -04:00
|
|
|
- text: TABLESAMPLE
|
2020-05-30 15:51:45 -04:00
|
|
|
url: sql-ref-syntax-qry-select-sampling.html
|
2020-04-13 00:39:27 -04:00
|
|
|
- text: Table-valued Function
|
|
|
|
url: sql-ref-syntax-qry-select-tvf.html
|
2020-04-17 20:31:52 -04:00
|
|
|
- text: Window Function
|
2020-05-30 15:51:45 -04:00
|
|
|
url: sql-ref-syntax-qry-select-window.html
|
2020-07-27 20:41:53 -04:00
|
|
|
- text: CASE Clause
|
|
|
|
url: sql-ref-syntax-qry-select-case.html
|
|
|
|
- text: LATERAL VIEW Clause
|
|
|
|
url: sql-ref-syntax-qry-select-lateral-view.html
|
|
|
|
- text: PIVOT Clause
|
|
|
|
url: sql-ref-syntax-qry-select-pivot.html
|
2021-04-20 06:30:26 -04:00
|
|
|
- text: TRANSFORM Clause
|
|
|
|
url: sql-ref-syntax-qry-select-transform.html
|
2019-08-19 02:17:50 -04:00
|
|
|
- text: EXPLAIN
|
|
|
|
url: sql-ref-syntax-qry-explain.html
|
2020-01-29 09:41:40 -05:00
|
|
|
- text: Auxiliary Statements
|
2019-08-19 02:17:50 -04:00
|
|
|
url: sql-ref-syntax-aux.html
|
|
|
|
subitems:
|
2020-02-16 10:53:12 -05:00
|
|
|
- text: ANALYZE
|
2019-08-19 02:17:50 -04:00
|
|
|
url: sql-ref-syntax-aux-analyze.html
|
|
|
|
subitems:
|
|
|
|
- text: ANALYZE TABLE
|
|
|
|
url: sql-ref-syntax-aux-analyze-table.html
|
2021-02-28 19:06:47 -05:00
|
|
|
- text: ANALYZE TABLES
|
|
|
|
url: sql-ref-syntax-aux-analyze-tables.html
|
2020-02-16 10:53:12 -05:00
|
|
|
- text: CACHE
|
2019-08-19 02:17:50 -04:00
|
|
|
url: sql-ref-syntax-aux-cache.html
|
|
|
|
subitems:
|
|
|
|
- text: CACHE TABLE
|
|
|
|
url: sql-ref-syntax-aux-cache-cache-table.html
|
|
|
|
- text: UNCACHE TABLE
|
|
|
|
url: sql-ref-syntax-aux-cache-uncache-table.html
|
|
|
|
- text: CLEAR CACHE
|
|
|
|
url: sql-ref-syntax-aux-cache-clear-cache.html
|
2019-09-13 02:00:42 -04:00
|
|
|
- text: REFRESH TABLE
|
2020-07-04 22:01:07 -04:00
|
|
|
url: sql-ref-syntax-aux-cache-refresh-table.html
|
2020-07-22 15:05:50 -04:00
|
|
|
- text: REFRESH FUNCTION
|
|
|
|
url: sql-ref-syntax-aux-cache-refresh-function.html
|
2019-12-31 10:36:41 -05:00
|
|
|
- text: REFRESH
|
2020-03-29 12:19:24 -04:00
|
|
|
url: sql-ref-syntax-aux-cache-refresh.html
|
2020-02-16 10:53:12 -05:00
|
|
|
- text: DESCRIBE
|
2019-08-19 02:17:50 -04:00
|
|
|
url: sql-ref-syntax-aux-describe.html
|
|
|
|
subitems:
|
|
|
|
- text: DESCRIBE DATABASE
|
|
|
|
url: sql-ref-syntax-aux-describe-database.html
|
|
|
|
- text: DESCRIBE TABLE
|
|
|
|
url: sql-ref-syntax-aux-describe-table.html
|
|
|
|
- text: DESCRIBE FUNCTION
|
|
|
|
url: sql-ref-syntax-aux-describe-function.html
|
|
|
|
- text: DESCRIBE QUERY
|
|
|
|
url: sql-ref-syntax-aux-describe-query.html
|
2020-02-16 10:53:12 -05:00
|
|
|
- text: SHOW
|
2019-08-19 02:17:50 -04:00
|
|
|
url: sql-ref-syntax-aux-show.html
|
|
|
|
subitems:
|
|
|
|
- text: SHOW COLUMNS
|
|
|
|
url: sql-ref-syntax-aux-show-columns.html
|
2020-05-10 13:57:25 -04:00
|
|
|
- text: SHOW CREATE TABLE
|
|
|
|
url: sql-ref-syntax-aux-show-create-table.html
|
2019-08-19 02:17:50 -04:00
|
|
|
- text: SHOW DATABASES
|
|
|
|
url: sql-ref-syntax-aux-show-databases.html
|
|
|
|
- text: SHOW FUNCTIONS
|
|
|
|
url: sql-ref-syntax-aux-show-functions.html
|
2020-05-10 13:57:25 -04:00
|
|
|
- text: SHOW PARTITIONS
|
|
|
|
url: sql-ref-syntax-aux-show-partitions.html
|
2019-08-19 02:17:50 -04:00
|
|
|
- text: SHOW TABLE
|
|
|
|
url: sql-ref-syntax-aux-show-table.html
|
|
|
|
- text: SHOW TABLES
|
|
|
|
url: sql-ref-syntax-aux-show-tables.html
|
|
|
|
- text: SHOW TBLPROPERTIES
|
|
|
|
url: sql-ref-syntax-aux-show-tblproperties.html
|
2020-04-07 12:25:01 -04:00
|
|
|
- text: SHOW VIEWS
|
|
|
|
url: sql-ref-syntax-aux-show-views.html
|
2020-02-16 10:53:12 -05:00
|
|
|
- text: CONFIGURATION MANAGEMENT
|
2019-08-19 02:17:50 -04:00
|
|
|
url: sql-ref-syntax-aux-conf-mgmt.html
|
|
|
|
subitems:
|
|
|
|
- text: SET
|
|
|
|
url: sql-ref-syntax-aux-conf-mgmt-set.html
|
|
|
|
- text: RESET
|
|
|
|
url: sql-ref-syntax-aux-conf-mgmt-reset.html
|
2020-07-16 09:01:53 -04:00
|
|
|
- text: SET TIME ZONE
|
|
|
|
url: sql-ref-syntax-aux-conf-mgmt-set-timezone.html
|
2020-02-16 10:53:12 -05:00
|
|
|
- text: RESOURCE MANAGEMENT
|
2019-08-19 02:17:50 -04:00
|
|
|
url: sql-ref-syntax-aux-resource-mgmt.html
|
|
|
|
subitems:
|
|
|
|
- text: ADD FILE
|
|
|
|
url: sql-ref-syntax-aux-resource-mgmt-add-file.html
|
|
|
|
- text: ADD JAR
|
|
|
|
url: sql-ref-syntax-aux-resource-mgmt-add-jar.html
|
2021-03-09 07:28:35 -05:00
|
|
|
- text: ADD ARCHIVE
|
|
|
|
url: sql-ref-syntax-aux-resource-mgmt-add-archive.html
|
2019-10-07 14:39:03 -04:00
|
|
|
- text: LIST FILE
|
|
|
|
url: sql-ref-syntax-aux-resource-mgmt-list-file.html
|
|
|
|
- text: LIST JAR
|
|
|
|
url: sql-ref-syntax-aux-resource-mgmt-list-jar.html
|
2021-03-09 07:28:35 -05:00
|
|
|
- text: LIST ARCHIVE
|
|
|
|
url: sql-ref-syntax-aux-resource-mgmt-list-archive.html
|