spark-instrumented-optimizer

History

Maxim Gekk 027ed2d11b [SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter ## What changes were proposed in this pull request? The hashSeed method allocates 64 bytes instead of 8. Other bytes are always zeros (thanks to default behavior of ByteBuffer). And they could be excluded from hash calculation because they don't differentiate inputs. ## How was this patch tested? By running the existing tests - XORShiftRandomSuite Closes #20793 from MaxGekk/hash-buff-size. Lead-authored-by: Maxim Gekk <maxim.gekk@databricks.com> Co-authored-by: Maxim Gekk <max.gekk@gmail.com> Signed-off-by: Sean Owen <sean.owen@databricks.com>		2019-03-23 11:26:09 -05:00
..
avro	[SPARK-26856][PYSPARK] Python support for from_avro and to_avro APIs	2019-03-11 10:15:07 +09:00
tests	[SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter	2019-03-23 11:26:09 -05:00
__init__.py	[SPARK-22369][PYTHON][DOCS] Exposes catalog API documentation in PySpark	2017-11-02 15:22:52 +01:00
catalog.py	[SPARK-24665][PYSPARK][FOLLOWUP] Use SQLConf in PySpark to manage all sql configs	2018-08-17 10:18:08 +08:00
column.py	[SPARK-23847][PYTHON][SQL] Add asc_nulls_first, asc_nulls_last to PySpark	2018-04-08 12:09:06 +08:00
conf.py	[SPARK-23698][PYTHON] Resolve undefined names in Python 3	2018-08-22 10:06:59 -07:00
context.py	[SPARK-26640][CORE][ML][SQL][STREAMING][PYSPARK] Code cleanup from lgtm.com analysis	2019-01-17 19:40:39 -06:00
dataframe.py	[SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter	2019-03-23 11:26:09 -05:00
functions.py	[SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter	2019-03-23 11:26:09 -05:00
group.py	[SPARK-24722][SQL] pivot() with Column type argument	2018-08-04 14:17:32 +08:00
readwriter.py	[SPARK-26016][DOCS] Clarify that text DataSource read/write, and RDD methods that read text, always use UTF-8	2019-03-05 08:03:39 +09:00
session.py	[SPARK-27163][PYTHON] Cleanup and consolidate Pandas UDF functionality	2019-03-21 17:44:51 +09:00
streaming.py	[SPARK-26016][DOCS] Clarify that text DataSource read/write, and RDD methods that read text, always use UTF-8	2019-03-05 08:03:39 +09:00
types.py	[SPARK-23836][PYTHON] Add support for StructType return in Scalar Pandas UDF	2019-03-07 08:52:24 -08:00
udf.py	[SPARK-23836][PYTHON] Add support for StructType return in Scalar Pandas UDF	2019-03-07 08:52:24 -08:00
utils.py	[SPARK-24721][SQL] Exclude Python UDFs filters in FileSourceStrategy	2018-08-28 10:57:13 +08:00
window.py	[SPARK-26860][PYSPARK][SPARKR] Fix for RangeBetween and RowsBetween docs to be in sync with spark documentation	2019-03-11 08:53:09 -05:00