spark-instrumented-optimizer

Zhenhua Wang 365a29bdbf [SPARK-22100][SQL] Make percentile_approx support date/timestamp type and change the output type to be the same as input type	2017-09-25 09:28:42 -07:00

## What changes were proposed in this pull request?

The `percentile_approx` function previously accepted numeric type input and output double type results.

But since all numeric types, date and timestamp types are represented as numerics internally, `percentile_approx` can support them easily.

After this PR, it supports date type, timestamp type and numeric types as input types. The result type is also changed to be the same as the input type, which is more reasonable for percentiles.

This change is also required when we generate equi-height histograms for these types.

## How was this patch tested?

Added a new test and modified some existing tests.

Author: Zhenhua Wang <wangzhenhua@huawei.com>

Closes #19321 from wzhfy/approx_percentile_support_types.
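The commit message above rests on one idea: dates are stored as numbers internally, so a percentile can be computed on the numeric encoding and decoded back to the input type. The following is a minimal pure-Python sketch of that idea, not Spark's implementation. It uses an exact nearest-rank percentile rather than the approximate algorithm behind `percentile_approx`, and the names `to_days`, `from_days`, and `percentile_same_type` are hypothetical helpers introduced only for illustration. The days-since-epoch encoding is assumed here to mirror how a date column is held internally.

```python
import math
from datetime import date, timedelta

# Assumption: dates are encoded as days since 1970-01-01.
EPOCH = date(1970, 1, 1)


def to_days(d):
    """Encode a date as an integer number of days since the epoch."""
    return (d - EPOCH).days


def from_days(n):
    """Decode an integer number of days since the epoch back to a date."""
    return EPOCH + timedelta(days=n)


def percentile_same_type(values, p):
    """Exact nearest-rank percentile whose result has the input's type.

    Hypothetical helper: dates are encoded to integers, the percentile is
    picked on the encoded values, and the result is decoded back to a date.
    Plain numbers pass through unchanged.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")

    is_date = isinstance(values[0], date)
    encoded = sorted(to_days(v) if is_date else v for v in values)

    # Nearest-rank: the smallest rank covering at least a fraction p of the data.
    rank = max(1, math.ceil(p * len(encoded)))
    picked = encoded[rank - 1]

    return from_days(picked) if is_date else picked


if __name__ == "__main__":
    dates = [date(2017, 9, d) for d in (25, 1, 9, 20, 5)]
    print(percentile_same_type(dates, 0.5))
    print(percentile_same_type([40, 10, 30, 20], 0.5))
```

In this sketch a list of dates yields a date and a list of integers yields an integer, which is the "output type same as input type" behaviour the commit describes. The previous behaviour would correspond to always returning a double.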
__init__.py	[SPARK-16772][PYTHON][DOCS] Fix API doc references to UDFRegistration + Update "important classes"	2016-08-06 05:02:59 +01:00
catalog.py	[SPARK-18777][PYTHON][SQL] Return UDF from udf.register	2017-05-06 22:28:42 -07:00
column.py	[SPARK-19165][PYTHON][SQL] PySpark APIs using columns as arguments should validate input types for column	2017-08-24 20:29:03 +09:00
conf.py	[SPARK-15464][ML][MLLIB][SQL][TESTS] Replace SQLContext and SparkContext with SparkSession using builder pattern in python test code	2016-05-23 18:14:48 -07:00
context.py	[SPARK-20586][SQL] Add deterministic to ScalaUDF	2017-07-25 17:19:44 -07:00
dataframe.py	[SPARK-22100][SQL] Make percentile_approx support date/timestamp type and change the output type to be the same as input type	2017-09-25 09:28:42 -07:00
functions.py	[SPARK-21190][PYSPARK] Python Vectorized UDFs	2017-09-22 16:17:50 +08:00
group.py	[MINOR][PYSPARK][DOC] Fix wrongly formatted examples in PySpark documentation	2016-07-06 10:45:51 -07:00
readwriter.py	[SPARK-21839][SQL] Support SQL config for ORC compression	2017-08-31 08:16:58 +09:00
session.py	[SPARK-19507][SPARK-21296][PYTHON] Avoid per-record type dispatch in schema verification and improve exception message	2017-07-04 20:45:58 +08:00
streaming.py	[SPARK-21756][SQL] Add JSON option to allow unquoted control characters	2017-08-25 10:18:03 -07:00
tests.py	[SPARK-21766][PYSPARK][SQL] DataFrame toPandas() raises ValueError with nullable int columns	2017-09-22 22:39:47 +09:00
types.py	[SPARK-21190][PYSPARK] Python Vectorized UDFs	2017-09-22 16:17:50 +08:00
utils.py	[MINOR][DOCS] Remove consecutive duplicated words/typo in Spark Repo	2017-01-04 15:07:29 +00:00
window.py	[SPARK-18690][PYTHON][SQL] Backward compatibility of unbounded frames	2016-12-02 17:39:28 -08:00