spark-instrumented-optimizer

Liang-Chi Hsieh c0297dedd8 [MINOR][PYSPARK][SQL][DOC] Fix rowsBetween doc in Window		2019-06-14 09:56:37 +09:00

## What changes were proposed in this pull request?

I suspect that the doc of `rowsBetween` methods in Scala and PySpark looks wrong. Because:

```scala
scala> val df = Seq((1, "a"), (2, "a"), (3, "a"), (4, "a"), (5, "a"), (6, "a")).toDF("id", "category")
df: org.apache.spark.sql.DataFrame = [id: int, category: string]

scala> val byCategoryOrderedById = Window.partitionBy('category).orderBy('id).rowsBetween(-1, 2)
byCategoryOrderedById: org.apache.spark.sql.expressions.WindowSpec = org.apache.spark.sql.expressions.WindowSpec@7f04de97

scala> df.withColumn("sum", sum('id) over byCategoryOrderedById).show()
+---+--------+---+
| id|category|sum|
+---+--------+---+
|  1|       a|  6|   # sum from index 0 to (0 + 2): 1 + 2 + 3 = 6
|  2|       a| 10|   # sum from index (1 - 1) to (1 + 2): 1 + 2 + 3 + 4 = 10
|  3|       a| 14|
|  4|       a| 18|
|  5|       a| 15|
|  6|       a| 11|
+---+--------+---+
```

So the frame (-1, 2) for row with index 5, as described in the doc, should range from index 4 to index 7.

## How was this patch tested?

N/A, just doc change.

Closes #24864 from viirya/window-spec-doc.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
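The frame arithmetic in the commit message can be checked without a Spark session. The following is a minimal pure-Python sketch, not Spark's implementation: `frame_sums` is a hypothetical helper that treats a single ordered partition as a list, applies a `rowsBetween(start, end)`-style frame relative to each row's position, and clips the frame at the partition boundaries. It does not model edge cases such as empty frames or unbounded boundaries.

```python
def frame_sums(values, start, end):
    """Sum each row's frame [i + start, i + end], clipped to the partition.

    Hypothetical helper for illustration only; it mimics the row-offset
    semantics shown in the commit message, not Spark's actual code path.
    """
    sums = []
    for i in range(len(values)):
        lo = max(0, i + start)              # frame cannot start before the first row
        hi = min(len(values) - 1, i + end)  # frame cannot end after the last row
        sums.append(sum(values[lo:hi + 1]))
    return sums


# Same data as the Scala session: ids 1..6 in one "category" partition.
ids = [1, 2, 3, 4, 5, 6]
result = frame_sums(ids, -1, 2)
print(result)  # [6, 10, 14, 18, 15, 11]
```

The printed list matches the `sum` column in the Scala output: each frame covers one row before the current row through two rows after it, so the offsets are always relative to the current row's index.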
benchmarks	[SPARK-27701][SQL] Extend NestedColumnAliasing to general nested field cases including GetArrayStructField	2019-06-11 20:12:53 -07:00
src	[MINOR][PYSPARK][SQL][DOC] Fix rowsBetween doc in Window	2019-06-14 09:56:37 +09:00
v1.2.1/src	[SPARK-27699][SQL] Partially push down disjunctive predicated in Parquet/ORC	2019-05-17 19:25:24 +08:00
v2.3.5/src	[SPARK-27737][SQL] Upgrade to Hive 2.3.5 for Hive Metastore Client and Hadoop-3.2 profile	2019-05-22 10:24:17 +09:00
pom.xml	[SPARK-27521][SQL] Move data source v2 to catalyst module	2019-06-05 09:55:55 -07:00