spark-instrumented-optimizer

Maxim Gekk 027ed2d11b [SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter	2019-03-23 11:26:09 -05:00

## What changes were proposed in this pull request?

The hashSeed method allocates 64 bytes instead of 8. Other bytes are always zeros (thanks to default behavior of ByteBuffer). And they could be excluded from hash calculation because they don't differentiate inputs.

## How was this patch tested?

By running the existing tests - XORShiftRandomSuite

Closes #20793 from MaxGekk/hash-buff-size.

Lead-authored-by: Maxim Gekk <maxim.gekk@databricks.com>
Co-authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
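
The buffer layout described in the commit message can be illustrated with a small Python sketch. It only mimics how a 64-bit seed would sit in a zero-initialized, big-endian buffer of 64 bytes versus 8 bytes; it does not reproduce Spark's hashSeed or its hash function. The names `seed_buffer`, `LONG_BITS`, and `LONG_BYTES` are hypothetical and chosen for illustration.

```python
import struct

LONG_BYTES = 8   # size of a 64-bit seed in bytes
LONG_BITS = 64   # size of a 64-bit seed in bits (the oversized buffer length)


def seed_buffer(seed: int, size: int) -> bytes:
    """Hypothetical helper: write a signed 64-bit seed big-endian at the
    start of a zero-filled buffer of `size` bytes."""
    return struct.pack(">q", seed).ljust(size, b"\x00")


padded = seed_buffer(42, LONG_BITS)    # 64-byte buffer, as before the change
compact = seed_buffer(42, LONG_BYTES)  # 8-byte buffer, as after the change

# The first 8 bytes carry the seed; the remaining 56 bytes are always zero,
# so they cannot help distinguish one seed from another.
assert padded[:LONG_BYTES] == compact
assert padded[LONG_BYTES:] == bytes(LONG_BITS - LONG_BYTES)
```

Because the trailing bytes are identical for every seed, two padded buffers differ exactly when their 8-byte prefixes differ, which is the reasoning the commit message gives for dropping them.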
__init__.py	[SPARK-26033][PYTHON][TESTS] Break large ml/tests.py file into smaller files	2018-11-18 16:02:15 +08:00
test_algorithms.py	[SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter	2019-03-23 11:26:09 -05:00
test_base.py	[SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before	2018-11-19 09:22:32 +08:00
test_evaluation.py	[SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before	2018-11-19 09:22:32 +08:00
test_feature.py	[SPARK-26616][MLLIB] Expose document frequency in IDFModel	2019-01-22 07:41:54 -06:00
test_image.py	[SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before	2018-11-19 09:22:32 +08:00
test_linalg.py	[SPARK-26033][SPARK-26034][PYTHON][FOLLOW-UP] Small cleanup and deduplication in ml/mllib tests	2018-12-03 14:03:10 -08:00
test_param.py	[SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before	2018-11-19 09:22:32 +08:00
test_persistence.py	[SPARK-16838][PYTHON] Add PMML export for ML KMeans in PySpark	2019-01-22 09:34:59 -06:00
test_pipeline.py	[SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before	2018-11-19 09:22:32 +08:00
test_stat.py	[SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before	2018-11-19 09:22:32 +08:00
test_training_summary.py	[SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before	2018-11-19 09:22:32 +08:00
test_tuning.py	[SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before	2018-11-19 09:22:32 +08:00
test_wrapper.py	[SPARK-22798][PYTHON][ML] Add multiple column support to PySpark StringIndexer	2019-02-20 08:52:46 -06:00