## What changes were proposed in this pull request?
This pr targeted to define reserved/non-reserved keywords for Spark SQL based on the ANSI SQL standards and the other database-like systems (e.g., PostgreSQL). We assume that they basically follow the ANSI SQL-2011 standard, but it is slightly different between each other. Therefore, this pr documented all the keywords in `docs/sql-reserved-and-non-reserved-key-words.md`.
NOTE: This pr only added a small set of keywords as reserved ones and these keywords are reserved in all the ANSI SQL standards (SQL-92, SQL-99, SQL-2003, SQL-2008, SQL-2011, and SQL-2016) and PostgreSQL. This is because there is room to discuss which keyword should be reserved or not, .e.g., interval units (day, hour, minute, second, ...) are reserved in the ANSI SQL standards though, they are not reserved in PostgreSQL. Therefore, we need more researches about the other database-like systems (e.g., Oracle Databases, DB2, SQL server) in follow-up activities.
References:
- The reserved/non-reserved SQL keywords in the ANSI SQL standards: https://developer.mimer.com/wp-content/uploads/2018/05/Standard-SQL-Reserved-Words-Summary.pdf
- SQL Key Words in PostgreSQL: https://www.postgresql.org/docs/current/sql-keywords-appendix.html
## How was this patch tested?
Added tests in `TableIdentifierParserSuite`.
Closes#23259 from maropu/SPARK-26215-WIP.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
## What changes were proposed in this pull request?
1. Split the main page of sql-programming-guide into 7 parts:
- Getting Started
- Data Sources
- Performance Turing
- Distributed SQL Engine
- PySpark Usage Guide for Pandas with Apache Arrow
- Migration Guide
- Reference
2. Add left menu for sql-programming-guide, keep first level index for each part in the menu.
![image](https://user-images.githubusercontent.com/4833765/47016859-6332e180-d183-11e8-92e8-ce62518a83c4.png)
## How was this patch tested?
Local test with jekyll build/serve.
Closes#22746 from xuanyuanking/SPARK-24499.
Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
## What changes were proposed in this pull request?
Add docs and examples for ```ml.stat.Correlation``` and ```ml.stat.ChiSquareTest```.
## How was this patch tested?
Generate docs and run examples manually, successfully.
Author: Yanbo Liang <ybliang8@gmail.com>
Closes#17994 from yanboliang/spark-20505.
## What changes were proposed in this pull request?
Add a new section for fpm
Add Example for FPGrowth in scala and Java
updated: Rewrite transform to be more compact.
## How was this patch tested?
local doc generation.
Author: Yuhao Yang <yuhao.yang@intel.com>
Closes#17130 from hhbyyh/fpmdoc.
## What changes were proposed in this pull request?
Made DataFrame-based API primary
* Spark doc menu bar and other places now link to ml-guide.html, not mllib-guide.html
* mllib-guide.html keeps RDD-specific list of features, with a link at the top redirecting people to ml-guide.html
* ml-guide.html includes a "maintenance mode" announcement about the RDD-based API
* **Reviewers: please check this carefully**
* (minor) Titles for DF API no longer include "- spark.ml" suffix. Titles for RDD API have "- RDD-based API" suffix
* Moved migration guide to ml-guide from mllib-guide
* Also moved past guides from mllib-migration-guides to ml-migration-guides, with a redirect link on mllib-migration-guides
* **Reviewers**: I did not change any of the content of the migration guides.
Reorganized DataFrame-based guide:
* ml-guide.html mimics the old mllib-guide.html page in terms of content: overview, migration guide, etc.
* Moved Pipeline description into ml-pipeline.html and moved tuning into ml-tuning.html
* **Reviewers**: I did not change the content of these guides, except some intro text.
* Sidebar remains the same, but with pipeline and tuning sections added
Other:
* ml-classification-regression.html: Moved text about linear methods to new section in page
## How was this patch tested?
Generated docs locally
Author: Joseph K. Bradley <joseph@databricks.com>
Closes#14213 from jkbradley/ml-guide-2.0.
This documents the implementation of ALS in `spark.ml` with example code in scala, java and python.
Author: BenFradet <benjamin.fradet@gmail.com>
Closes#10411 from BenFradet/SPARK-12247.
Replaces a number of occurences of `MLlib` in the documentation that were meant to refer to the `spark.mllib` package instead. It should clarify for new users the difference between `spark.mllib` (the package) and MLlib (the umbrella project for ML in spark).
It also removes some files that I forgot to delete with #10207
Author: Timothy Hunter <timhunter@databricks.com>
Closes#10234 from thunterdb/12212.
This PR adds a sidebar menu when browsing the user guide of MLlib. It uses a YAML file to describe the structure of the documentation. It should be trivial to adapt this to the other projects.
![screen shot 2015-11-18 at 4 46 12 pm](https://cloud.githubusercontent.com/assets/7594753/11259591/a55173f4-8e17-11e5-9340-0aed79d66262.png)
Author: Timothy Hunter <timhunter@databricks.com>
Closes#9826 from thunterdb/spark-11835.