spark-instrumented-optimizer


zhengruifeng 0c38765b29 [SPARK-32974][ML] FeatureHasher transform optimization	2020-09-27 09:35:05 +08:00

### What changes were proposed in this pull request?
Pre-compute the output indices of numerical columns, instead of computing them on each row.

### Why are the changes needed?
For a numerical column, the output index is a hash of its `col_name`, so it can be computed once up front instead of on each row.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Existing test suites.

Closes #29850 from zhengruifeng/hash_opt.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: zhengruifeng <ruifengz@foxmail.com>
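
The idea behind this commit can be illustrated with a small standalone sketch. This is not the Spark code: it is plain Python, the helper names (`bucket`, `transform_naive`, `transform_precomputed`) are hypothetical, and CRC32 stands in for whatever hash function Spark actually uses. It only shows why the index of a numerical column, which depends on the column name alone, can be hoisted out of the per-row loop.

```python
import zlib


def bucket(term: str, num_features: int) -> int:
    """Map a term to an output index. CRC32 is a stand-in hash (assumption)."""
    return zlib.crc32(term.encode("utf-8")) % num_features


def transform_naive(rows, numeric_cols, num_features):
    """Hash every column name again for every row."""
    out = []
    for row in rows:
        vec = {}
        for col in numeric_cols:
            idx = bucket(col, num_features)  # recomputed on each row
            vec[idx] = vec.get(idx, 0.0) + row[col]
        out.append(vec)
    return out


def transform_precomputed(rows, numeric_cols, num_features):
    """Hash each column name once, then reuse the index for all rows."""
    col_index = [(col, bucket(col, num_features)) for col in numeric_cols]
    out = []
    for row in rows:
        vec = {}
        for col, idx in col_index:  # no hashing inside the row loop
            vec[idx] = vec.get(idx, 0.0) + row[col]
        out.append(vec)
    return out


rows = [{"a": 1.0, "b": 2.0}, {"a": 3.0, "b": 4.0}]
same = transform_naive(rows, ["a", "b"], 1024) == transform_precomputed(rows, ["a", "b"], 1024)
```

Both functions produce the same sparse vectors; the second performs one hash per column rather than one per column per row, which is the saving the commit message describes.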
benchmarks	[SPARK-29297][TESTS] Compare `core`/`mllib` module benchmarks in JDK8/11	2019-09-29 21:43:58 -07:00
src	[SPARK-32974][ML] FeatureHasher transform optimization	2020-09-27 09:35:05 +08:00
pom.xml	[SPARK-30950][BUILD] Setting version to 3.1.0-SNAPSHOT	2020-02-25 19:44:31 -08:00