spark-instrumented-optimizer/docs/_data/menu-sql.yaml

- text: Getting Started
url: sql-getting-started.html
subitems:
- text: "Starting Point: SparkSession"
url: sql-getting-started.html#starting-point-sparksession
- text: Creating DataFrames
url: sql-getting-started.html#creating-dataframes
- text: Untyped Dataset Operations (DataFrame operations)
url: sql-getting-started.html#untyped-dataset-operations-aka-dataframe-operations
- text: Running SQL Queries Programmatically
url: sql-getting-started.html#running-sql-queries-programmatically
- text: Global Temporary View
url: sql-getting-started.html#global-temporary-view
- text: Creating Datasets
url: sql-getting-started.html#creating-datasets
- text: Interoperating with RDDs
url: sql-getting-started.html#interoperating-with-rdds
- text: Aggregations
url: sql-getting-started.html#aggregations
- text: Data Sources
url: sql-data-sources.html
subitems:
- text: "Generic Load/Save Functions"
url: sql-data-sources-load-save-functions.html
- text: Parquet Files
url: sql-data-sources-parquet.html
- text: ORC Files
url: sql-data-sources-orc.html
- text: JSON Files
url: sql-data-sources-json.html
- text: Hive Tables
url: sql-data-sources-hive-tables.html
- text: JDBC To Other Databases
url: sql-data-sources-jdbc.html
- text: Avro Files
url: sql-data-sources-avro.html
- text: Troubleshooting
url: sql-data-sources-troubleshooting.html
- text: Performance Tuning
url: sql-performance-tuning.html
subitems:
- text: Caching Data In Memory
url: sql-performance-tuning.html#caching-data-in-memory
- text: Other Configuration Options
url: sql-performance-tuning.html#other-configuration-options
- text: Broadcast Hint for SQL Queries
url: sql-performance-tuning.html#broadcast-hint-for-sql-queries
- text: Distributed SQL Engine
url: sql-distributed-sql-engine.html
subitems:
- text: "Running the Thrift JDBC/ODBC server"
url: sql-distributed-sql-engine.html#running-the-thrift-jdbcodbc-server
- text: Running the Spark SQL CLI
url: sql-distributed-sql-engine.html#running-the-spark-sql-cli
- text: PySpark Usage Guide for Pandas with Apache Arrow
url: sql-pyspark-pandas-with-arrow.html
subitems:
- text: Apache Arrow in Spark
url: sql-pyspark-pandas-with-arrow.html#apache-arrow-in-spark
- text: "Enabling for Conversion to/from Pandas"
url: sql-pyspark-pandas-with-arrow.html#enabling-for-conversion-tofrom-pandas
- text: "Pandas UDFs (a.k.a. Vectorized UDFs)"
url: sql-pyspark-pandas-with-arrow.html#pandas-udfs-aka-vectorized-udfs
- text: Usage Notes
url: sql-pyspark-pandas-with-arrow.html#usage-notes
- text: Migration Guide
url: sql-migration-old.html
- text: SQL Reference
url: sql-ref.html
subitems:
- text: Data Types
url: sql-ref-datatypes.html
- text: Null Semantics
url: sql-ref-null-semantics.html
- text: NaN Semantics
url: sql-ref-nan-semantics.html
- text: SQL Syntax
url: sql-ref-syntax.html
subitems:
- text: Data Definition Statements
url: sql-ref-syntax-ddl.html
subitems:
- text: ALTER DATABASE
url: sql-ref-syntax-ddl-alter-database.html
- text: ALTER TABLE
url: sql-ref-syntax-ddl-alter-table.html
- text: ALTER VIEW
url: sql-ref-syntax-ddl-alter-view.html
- text: CREATE DATABASE
url: sql-ref-syntax-ddl-create-database.html
- text: CREATE FUNCTION
url: sql-ref-syntax-ddl-create-function.html
- text: CREATE TABLE
url: sql-ref-syntax-ddl-create-table.html
- text: CREATE VIEW
url: sql-ref-syntax-ddl-create-view.html
- text: DROP DATABASE
url: sql-ref-syntax-ddl-drop-database.html
- text: DROP FUNCTION
url: sql-ref-syntax-ddl-drop-function.html
- text: DROP TABLE
url: sql-ref-syntax-ddl-drop-table.html
- text: DROP VIEW
url: sql-ref-syntax-ddl-drop-view.html
- text: TRUNCATE TABLE
url: sql-ref-syntax-ddl-truncate-table.html
- text: REPAIR TABLE
url: sql-ref-syntax-ddl-repair-table.html
- text: Data Manipulation Statements
url: sql-ref-syntax-dml.html
subitems:
- text: INSERT
url: sql-ref-syntax-dml-insert.html
- text: LOAD
url: sql-ref-syntax-dml-load.html
- text: Data Retrieval (Queries)
url: sql-ref-syntax-qry.html
subitems:
- text: SELECT
url: sql-ref-syntax-qry-select.html
subitems:
- text: DISTINCT Clause
url: sql-ref-syntax-qry-select-distinct.html
- text: Joins
url: sql-ref-syntax-qry-select-join.html
- text: ORDER BY Clause
url: sql-ref-syntax-qry-select-orderby.html
- text: GROUP BY Clause
url: sql-ref-syntax-qry-select-groupby.html
- text: HAVING Clause
url: sql-ref-syntax-qry-select-having.html
- text: LIMIT Clause
url: sql-ref-syntax-qry-select-limit.html
- text: Set operations
url: sql-ref-syntax-qry-select-setops.html
- text: USE database
url: sql-ref-syntax-qry-select-usedb.html
- text: Common Table Expression (CTE)
url: sql-ref-syntax-qry-select-cte.html
- text: Subqueries
url: sql-ref-syntax-qry-select-subqueries.html
- text: Query hints
url: sql-ref-syntax-qry-select-hints.html
- text: SAMPLING
url: sql-ref-syntax-qry-sampling.html
- text: WINDOWING ANALYTIC FUNCTIONS
url: sql-ref-syntax-qry-window.html
- text: AGGREGATION (CUBE/ROLLUP/GROUPING)
url: sql-ref-syntax-qry-aggregation.html
- text: EXPLAIN
url: sql-ref-syntax-qry-explain.html
- text: Auxiliary Statements
url: sql-ref-syntax-aux.html
subitems:
- text: Analyze statement
url: sql-ref-syntax-aux-analyze.html
subitems:
- text: ANALYZE TABLE
url: sql-ref-syntax-aux-analyze-table.html
- text: Caching statements
url: sql-ref-syntax-aux-cache.html
subitems:
- text: CACHE TABLE
url: sql-ref-syntax-aux-cache-cache-table.html
- text: UNCACHE TABLE
url: sql-ref-syntax-aux-cache-uncache-table.html
- text: CLEAR CACHE
url: sql-ref-syntax-aux-cache-clear-cache.html
- text: REFRESH TABLE
url: sql-ref-syntax-aux-refresh-table.html
- text: Describe Commands
url: sql-ref-syntax-aux-describe.html
subitems:
- text: DESCRIBE DATABASE
url: sql-ref-syntax-aux-describe-database.html
- text: DESCRIBE TABLE
url: sql-ref-syntax-aux-describe-table.html
- text: DESCRIBE FUNCTION
url: sql-ref-syntax-aux-describe-function.html
- text: DESCRIBE QUERY
url: sql-ref-syntax-aux-describe-query.html
- text: Show commands
url: sql-ref-syntax-aux-show.html
subitems:
- text: SHOW COLUMNS
url: sql-ref-syntax-aux-show-columns.html
- text: SHOW DATABASES
url: sql-ref-syntax-aux-show-databases.html
- text: SHOW FUNCTIONS
url: sql-ref-syntax-aux-show-functions.html
- text: SHOW TABLE
url: sql-ref-syntax-aux-show-table.html
- text: SHOW TABLES
url: sql-ref-syntax-aux-show-tables.html
- text: SHOW TBLPROPERTIES
url: sql-ref-syntax-aux-show-tblproperties.html
- text: SHOW PARTITIONS
url: sql-ref-syntax-aux-show-partitions.html
- text: SHOW CREATE TABLE
url: sql-ref-syntax-aux-show-create-table.html
- text: Configuration Management Commands
url: sql-ref-syntax-aux-conf-mgmt.html
subitems:
- text: SET
url: sql-ref-syntax-aux-conf-mgmt-set.html
- text: RESET
url: sql-ref-syntax-aux-conf-mgmt-reset.html
- text: Resource Management Commands
url: sql-ref-syntax-aux-resource-mgmt.html
subitems:
- text: ADD FILE
url: sql-ref-syntax-aux-resource-mgmt-add-file.html
- text: ADD JAR
url: sql-ref-syntax-aux-resource-mgmt-add-jar.html
- text: LIST FILE
url: sql-ref-syntax-aux-resource-mgmt-list-file.html
- text: LIST JAR
url: sql-ref-syntax-aux-resource-mgmt-list-jar.html
- text: Functions
url: sql-ref-functions.html
subitems:
- text: Built-in Functions
url: sql-ref-functions-builtin.html
subitems:
- text: Scalar functions
url: sql-ref-functions-builtin-scalar.html
- text: Aggregate functions
url: sql-ref-functions-builtin-aggregate.html
- text: User-defined Functions
url: sql-ref-functions-udf.html
subitems:
- text: Scalar functions
url: sql-ref-functions-udf-scalar.html
- text: Aggregate functions
url: sql-ref-functions-udf-aggregate.html
- text: Arithmetic operations
url: sql-ref-arithmetic-ops.html