spark-instrumented-optimizer/python/pyspark/ml/tests
Maxim Gekk 027ed2d11b [SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter
## What changes were proposed in this pull request?

The hashSeed method allocates a 64-byte buffer although the seed occupies only 8 bytes. The remaining 56 bytes are always zero (a ByteBuffer is zero-filled on allocation), so they can be excluded from the hash calculation because they do not differentiate inputs.
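
The buffer layout described above can be illustrated with a small Python sketch. It is not the Spark implementation: `seed_buffer`, `OLD_SIZE`, and `NEW_SIZE` are hypothetical names, and the sketch assumes the seed is written as a big-endian 8-byte long at the start of a zero-filled buffer, which is the default behavior of a Java ByteBuffer.

```python
import struct

OLD_SIZE = 64  # bytes allocated before this change (per the description above)
NEW_SIZE = 8   # bytes actually needed to hold a 64-bit seed


def seed_buffer(seed: int, size: int) -> bytes:
    """Return a zero-filled buffer of `size` bytes with `seed` stored
    as a big-endian signed 64-bit integer in the first 8 bytes.

    Hypothetical helper that only mimics the assumed buffer layout.
    """
    return struct.pack(">q", seed).ljust(size, b"\x00")


old = seed_buffer(42, OLD_SIZE)
new = seed_buffer(42, NEW_SIZE)

# The first 8 bytes are identical; the other 56 bytes of the old
# buffer are zeros for every seed, so they carry no information.
print(old[:NEW_SIZE] == new)
print(old[NEW_SIZE:] == bytes(OLD_SIZE - NEW_SIZE))
```

Because the hashed input becomes shorter, the hash of a given seed will generally differ from the value computed over the padded buffer, which is one plausible reason a test file with seed-dependent expectations would need updating alongside such a change.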

## How was this patch tested?

By running the existing tests in XORShiftRandomSuite.

Closes #20793 from MaxGekk/hash-buff-size.

Lead-authored-by: Maxim Gekk <maxim.gekk@databricks.com>
Co-authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-03-23 11:26:09 -05:00
..
__init__.py [SPARK-26033][PYTHON][TESTS] Break large ml/tests.py file into smaller files 2018-11-18 16:02:15 +08:00
test_algorithms.py [SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter 2019-03-23 11:26:09 -05:00
test_base.py [SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before 2018-11-19 09:22:32 +08:00
test_evaluation.py [SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before 2018-11-19 09:22:32 +08:00
test_feature.py [SPARK-26616][MLLIB] Expose document frequency in IDFModel 2019-01-22 07:41:54 -06:00
test_image.py [SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before 2018-11-19 09:22:32 +08:00
test_linalg.py [SPARK-26033][SPARK-26034][PYTHON][FOLLOW-UP] Small cleanup and deduplication in ml/mllib tests 2018-12-03 14:03:10 -08:00
test_param.py [SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before 2018-11-19 09:22:32 +08:00
test_persistence.py [SPARK-16838][PYTHON] Add PMML export for ML KMeans in PySpark 2019-01-22 09:34:59 -06:00
test_pipeline.py [SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before 2018-11-19 09:22:32 +08:00
test_stat.py [SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before 2018-11-19 09:22:32 +08:00
test_training_summary.py [SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before 2018-11-19 09:22:32 +08:00
test_tuning.py [SPARK-26105][PYTHON] Clean unittest2 imports up that were added for Python 2.6 before 2018-11-19 09:22:32 +08:00
test_wrapper.py [SPARK-22798][PYTHON][ML] Add multiple column support to PySpark StringIndexer 2019-02-20 08:52:46 -06:00