ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Yuanjian Li	987f386588	[SPARK-24499][SQL][DOC] Split the page of sql-programming-guide.html to multiple separate pages ## What changes were proposed in this pull request? 1. Split the main page of sql-programming-guide into 7 parts: - Getting Started - Data Sources - Performance Turing - Distributed SQL Engine - PySpark Usage Guide for Pandas with Apache Arrow - Migration Guide - Reference 2. Add left menu for sql-programming-guide, keep first level index for each part in the menu. ![image](https://user-images.githubusercontent.com/4833765/47016859-6332e180-d183-11e8-92e8-ce62518a83c4.png) ## How was this patch tested? Local test with jekyll build/serve. Closes #22746 from xuanyuanking/SPARK-24499. Authored-by: Yuanjian Li <xyliyuanjian@gmail.com> Signed-off-by: gatorsmile <gatorsmile@gmail.com>	2018-10-18 11:59:06 -07:00
Joseph K. Bradley	d58fe28836	[SPARK-23154][ML][DOC] Document backwards compatibility guarantees for ML persistence ## What changes were proposed in this pull request? Added documentation about what MLlib guarantees in terms of loading ML models and Pipelines from old Spark versions. Discussed & confirmed on linked JIRA. Author: Joseph K. Bradley <joseph@databricks.com> Closes #20592 from jkbradley/SPARK-23154-backwards-compat-doc.	2018-02-13 11:18:45 -08:00
Zhe Sun	0bac3e4cde	[SPARK-19797][DOC] ML pipeline document correction ## What changes were proposed in this pull request? Description about pipeline in this paragraph is incorrect https://spark.apache.org/docs/latest/ml-pipeline.html#how-it-works > If the Pipeline had more stages, it would call the LogisticRegressionModel’s transform() method on the DataFrame before passing the DataFrame to the next stage. Reason: Transformer could also be a stage. But only another Estimator will invoke an transform call and pass the data to next stage. The description in the document misleads ML pipeline users. ## How was this patch tested? This is a tiny modification of docs/ml-pipelines.md. I jekyll build the modification and check the compiled document. Author: Zhe Sun <ymwdalex@gmail.com> Closes #17137 from ymwdalex/SPARK-19797-ML-pipeline-document-correction.	2017-03-03 11:55:57 +01:00
Zheng RuiFeng	04ee8cf633	[SPARK-19410][DOC] Fix brokens links in ml-pipeline and ml-tuning ## What changes were proposed in this pull request? Fix brokens links in ml-pipeline and ml-tuning `<div data-lang="scala">` -> `<div data-lang="scala" markdown="1">` ## How was this patch tested? manual tests Author: Zheng RuiFeng <ruifengz@foxmail.com> Closes #16754 from zhengruifeng/doc_api_fix.	2017-02-01 13:27:20 +00:00
Zheng RuiFeng	cdaf4ce9fe	[SPARK-18480][DOCS] Fix wrong links for ML guide docs ## What changes were proposed in this pull request? 1, There are two `[Graph.partitionBy]` in `graphx-programming-guide.md`, the first one had no effert. 2, `DataFrame`, `Transformer`, `Pipeline` and `Parameter` in `ml-pipeline.md` were linked to `ml-guide.html` by mistake. 3, `PythonMLLibAPI` in `mllib-linear-methods.md` was not accessable, because class `PythonMLLibAPI` is private. 4, Other link updates. ## How was this patch tested? manual tests Author: Zheng RuiFeng <ruifengz@foxmail.com> Closes #15912 from zhengruifeng/md_fix.	2016-11-17 13:40:16 +00:00
Zheng RuiFeng	a75e3fe923	[SPARK-18446][ML][DOCS] Add links to API docs for ML algos ## What changes were proposed in this pull request? Add links to API docs for ML algos ## How was this patch tested? Manual checking for the API links Author: Zheng RuiFeng <ruifengz@foxmail.com> Closes #15890 from zhengruifeng/algo_link.	2016-11-16 10:53:23 +00:00
Joseph K. Bradley	5ffd5d3838	[SPARK-14817][ML][MLLIB][DOC] Made DataFrame-based API primary in MLlib guide ## What changes were proposed in this pull request? Made DataFrame-based API primary * Spark doc menu bar and other places now link to ml-guide.html, not mllib-guide.html * mllib-guide.html keeps RDD-specific list of features, with a link at the top redirecting people to ml-guide.html * ml-guide.html includes a "maintenance mode" announcement about the RDD-based API * Reviewers: please check this carefully * (minor) Titles for DF API no longer include "- spark.ml" suffix. Titles for RDD API have "- RDD-based API" suffix * Moved migration guide to ml-guide from mllib-guide * Also moved past guides from mllib-migration-guides to ml-migration-guides, with a redirect link on mllib-migration-guides * Reviewers: I did not change any of the content of the migration guides. Reorganized DataFrame-based guide: * ml-guide.html mimics the old mllib-guide.html page in terms of content: overview, migration guide, etc. * Moved Pipeline description into ml-pipeline.html and moved tuning into ml-tuning.html * Reviewers: I did not change the content of these guides, except some intro text. * Sidebar remains the same, but with pipeline and tuning sections added Other: * ml-classification-regression.html: Moved text about linear methods to new section in page ## How was this patch tested? Generated docs locally Author: Joseph K. Bradley <joseph@databricks.com> Closes #14213 from jkbradley/ml-guide-2.0.	2016-07-15 13:38:23 -07:00

7 commits