spark-instrumented-optimizer/python/pyspark/sql/tests
Maxim Gekk 027ed2d11b [SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter
## What changes were proposed in this pull request?

The `hashSeed` method allocates 64 bytes instead of 8. The extra bytes are always zero (the default behavior of `ByteBuffer`), so they can be excluded from the hash calculation because they don't differentiate inputs.
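The effect of the oversized buffer can be illustrated with a minimal Python sketch. It is not the Spark code: `seed_buffer` is a hypothetical helper that imitates, as an assumption, what a big-endian `ByteBuffer` of a given capacity holds after a single `putLong`. It only shows that the 56 trailing bytes are identical zeros for every seed, so they carry no information about the input.

```python
import struct


def seed_buffer(seed: int, capacity: int) -> bytes:
    """Hypothetical stand-in for allocating a zero-filled buffer of
    `capacity` bytes and writing one signed 64-bit big-endian value
    at the start (assumed to mirror ByteBuffer's default behavior)."""
    encoded = struct.pack(">q", seed)  # 8 bytes, big-endian signed long
    return encoded + bytes(capacity - len(encoded))  # zero padding


# Buffer sizes described in the commit message: 64 bytes before, 8 after.
old_a = seed_buffer(42, 64)
old_b = seed_buffer(-7, 64)
new_a = seed_buffer(42, 8)

# The first 8 bytes are the only part that depends on the seed.
assert old_a[:8] == new_a
# The trailing 56 bytes are all zeros, whatever the seed is.
assert old_a[8:] == old_b[8:] == bytes(56)
```

Because a hash function generally mixes in every byte it is given, hashing the 8-byte buffer is expected to yield different values than hashing the padded 64-byte one, even though both encode the same seed. Expected values in tests that depend on seeded output may therefore need updating after such a change.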

## How was this patch tested?

By running the existing tests in `XORShiftRandomSuite`.

Closes #20793 from MaxGekk/hash-buff-size.

Lead-authored-by: Maxim Gekk <maxim.gekk@databricks.com>
Co-authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-03-23 11:26:09 -05:00
__init__.py [SPARK-26032][PYTHON] Break large sql/tests.py files into smaller files 2018-11-14 14:51:11 +08:00
test_appsubmit.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_arrow.py [SPARK-26887][SQL][PYTHON][NS] Create datetime.date directly instead of creating datetime64 as intermediate data. 2019-02-18 11:48:10 +08:00
test_catalog.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_column.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_conf.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_context.py [SPARK-26676][PYTHON] Make HiveContextSQLTests.test_unbounded_frames test compatible with Python 2 and PyPy 2019-01-21 14:27:17 -08:00
test_dataframe.py [SPARK-23647][PYTHON][SQL] Adds more types for hint in pyspark 2018-12-01 10:37:03 +08:00
test_datasources.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_functions.py [SPARK-23643][CORE][SQL][ML] Shrinking the buffer in hashSeed up to size of the seed parameter 2019-03-23 11:26:09 -05:00
test_group.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_pandas_udf.py [SPARK-25811][PYSPARK] Raise a proper error when unsafe cast is detected by PyArrow 2019-01-22 14:54:41 +08:00
test_pandas_udf_grouped_agg.py [SPARK-26364][PYTHON][TESTING] Clean up imports in test_pandas_udf* 2018-12-14 10:45:24 +08:00
test_pandas_udf_grouped_map.py [SPARK-23836][PYTHON] Add support for StructType return in Scalar Pandas UDF 2019-03-07 08:52:24 -08:00
test_pandas_udf_scalar.py [SPARK-23836][PYTHON] Add support for StructType return in Scalar Pandas UDF 2019-03-07 08:52:24 -08:00
test_pandas_udf_window.py [SPARK-24561][SQL][PYTHON] User-defined window aggregation functions with Pandas UDF (bounded window) 2018-12-18 09:15:21 +08:00
test_readwriter.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_serde.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00
test_session.py [SPARK-27101][PYTHON] Drop the created database after the test in test_session 2019-03-09 09:12:33 +09:00
test_streaming.py [SPARK-26945][PYTHON][SS][TESTS] Fix flaky test_*_await_termination in PySpark SS tests 2019-02-23 14:57:04 +08:00
test_types.py [SPARK-26645][PYTHON] Support decimals with negative scale when parsing datatype 2019-01-20 17:43:50 +08:00
test_udf.py [SPARK-27041][PYSPARK] Use imap() for python 2.x to resolve oom issue 2019-03-12 10:23:26 -05:00
test_utils.py [SPARK-26036][PYTHON] Break large tests.py files into smaller files 2018-11-15 12:30:52 +08:00