### What changes were proposed in this pull request?
Move hash map lookup operation out of `InvokeLike.invoke` since it doesn't depend on the input.
### Why are the changes needed?
We shouldn't need to look up the hash map for every input row evaluated by `InvokeLike.invoke`, since the lookup doesn't depend on the input. This could improve performance a bit.
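A minimal, self-contained sketch of the idea (illustrative names, not the actual `InvokeLike` internals): hoist a per-call lookup that does not depend on the input into a lazily cached field.
```scala
// Before: the map lookup ran on every call to eval().
// After: resolve it once and reuse it for every input row.
class Evaluator(methods: Map[String, Int => Int], name: String) {
  private lazy val resolved: Int => Int = methods(name) // looked up only once

  def eval(input: Int): Int = resolved(input)
}
```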
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing tests.
Closes#32532 from sunchao/SPARK-35384-follow-up.
Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Add a common function `getWorkspaceFilePath` (which prefixes paths with the Spark home) to `SparkFunSuite`, and apply the function in the places the logic was extracted from.
### Why are the changes needed?
Spark SQL has test suites that read resources when running tests. The way of getting resource paths is common across different suites, so we can extract it into a single function to ease code maintenance.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass existing tests.
Closes#32315 from Ngone51/extract-common-file-path.
Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
make these threads easier to identify in thread dumps
### Why are the changes needed?
make these threads easier to identify in thread dumps
### Does this PR introduce _any_ user-facing change?
yes. Driver thread dumps will show the timers with pretty names
### How was this patch tested?
verified locally
Closes#32549 from yaooqinn/SPARK-35404.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
Refine comment in `CacheManager`.
### Why are the changes needed?
Avoid misleading developer.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Not needed.
Closes#32543 from ulysses-you/SPARK-35332-FOLLOWUP.
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Kent Yao <yao@apache.org>
### What changes were proposed in this pull request?
Add toc tag on monitoring.md
### Why are the changes needed?
fix doc
### Does this PR introduce _any_ user-facing change?
yes, the table of content of the monitoring page will be shown on the official doc site.
### How was this patch tested?
pass GA doc build
Closes#32545 from yaooqinn/minor.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
### What changes were proposed in this pull request?
Generally, we would expect that x = y => hash(x) = hash(y). However, +0.0 and -0.0 hash to different values for floating-point types.
```
scala> spark.sql("select hash(cast('0.0' as double)), hash(cast('-0.0' as double))").show
+-------------------------+--------------------------+
|hash(CAST(0.0 AS DOUBLE))|hash(CAST(-0.0 AS DOUBLE))|
+-------------------------+--------------------------+
| -1670924195| -853646085|
+-------------------------+--------------------------+
scala> spark.sql("select cast('0.0' as double) == cast('-0.0' as double)").show
+--------------------------------------------+
|(CAST(0.0 AS DOUBLE) = CAST(-0.0 AS DOUBLE))|
+--------------------------------------------+
| true|
+--------------------------------------------+
```
Here is an extract from IEEE 754:
> The two zeros are distinguishable arithmetically only by either division-by-zero (producing appropriately signed infinities) or else by the CopySign function recommended by IEEE 754/854. Infinities, SNaNs, NaNs and Subnormal numbers necessitate four more special cases
From this, I deduce that the hash function must produce the same result for 0 and -0.
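A minimal sketch of the normalization idea, assuming the fix maps -0.0 to 0.0 before hashing (the actual hashing code in Spark may implement this differently):
```scala
// Normalize negative zero so that equal values hash equally.
def normalizeDouble(d: Double): Double = if (d == -0.0d) 0.0d else d
def normalizeFloat(f: Float): Float = if (f == -0.0f) 0.0f else f

// normalizeDouble(-0.0d) == normalizeDouble(0.0d), so both hash to the same value.
```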
### Why are the changes needed?
It is a correctness issue
### Does this PR introduce _any_ user-facing change?
This change only affects the hash function applied to the -0.0 value in float and double types.
### How was this patch tested?
Unit testing and manual testing
Closes#32496 from planga82/feature/spark35207_hashnegativezero.
Authored-by: Pablo Langa <soypab@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
This PR is a follow-up of https://github.com/apache/spark/pull/32436, which broke the JavaScript linter. There was a logical conflict: the linter was added after the last successful test run in that PR.
```
added 118 packages in 1.482s
/__w/spark/spark/core/src/main/resources/org/apache/spark/ui/static/executorspage.js
34:41 error 'type' is defined but never used. Allowed unused args must match /^_ignored_.*/u no-unused-vars
34:47 error 'row' is defined but never used. Allowed unused args must match /^_ignored_.*/u no-unused-vars
35:1 error Expected indentation of 2 spaces but found 4 indent
36:1 error Expected indentation of 4 spaces but found 7 indent
37:1 error Expected indentation of 2 spaces but found 4 indent
38:1 error Expected indentation of 4 spaces but found 7 indent
39:1 error Expected indentation of 2 spaces but found 4 indent
556:1 error Expected indentation of 14 spaces but found 16 indent
557:1 error Expected indentation of 14 spaces but found 16 indent
```
### Why are the changes needed?
To recover the build
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
Manually tested:
```bash
./dev/lint-js
lint-js checks passed.
```
Closes#32541 from HyukjinKwon/SPARK-34764-followup.
Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>
### What changes were proposed in this pull request?
In this PR I'm adding Structured Streaming Web UI state information documentation.
### Why are the changes needed?
Missing documentation.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
```
cd docs/
SKIP_API=1 bundle exec jekyll build
```
Manual webpage check.
Closes#32433 from gaborgsomogyi/SPARK-35311.
Authored-by: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
### What changes were proposed in this pull request?
Adds the executor loss reason to the Spark web UI and, in doing so, also fixes the Kubernetes integration to pass the executor loss reason into core.
UI change:
![image](https://user-images.githubusercontent.com/59893/117045762-b975ba80-acc4-11eb-9679-8edab3cfadc2.png)
### Why are the changes needed?
Debugging Spark jobs is *hard*; making it clearer why executors have exited could help.
### Does this PR introduce _any_ user-facing change?
Yes, a new column on the executor page.
### How was this patch tested?
K8s unit test updated to validate that executor loss reasons are passed through regardless of executor alive state; manual testing to validate the UI.
Closes#32436 from holdenk/SPARK-34764-propegate-reason-for-exec-loss.
Lead-authored-by: Holden Karau <hkarau@apple.com>
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
Signed-off-by: Holden Karau <hkarau@apple.com>
### What changes were proposed in this pull request?
This patch replaces `sys.error` usages with explicit exception types.
### Why are the changes needed?
Motivated by the previous comment https://github.com/apache/spark/pull/32519#discussion_r630787080, it sounds better to replace `sys.error` usages with explicit exception types.
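A hedged sketch of the pattern (the function, message, and exception type here are illustrative, not the exact code touched by this PR):
```scala
// Before: sys.error(s"Invalid port: $s") throws a bare RuntimeException.
// After: an explicit exception type makes the failure mode clear to callers.
def parsePort(s: String): Int =
  try s.toInt catch {
    case _: NumberFormatException =>
      throw new IllegalArgumentException(s"Invalid port: $s")
  }
```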
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing tests.
Closes#32535 from viirya/replace-sys-err.
Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Currently pip packaging test is being skipped:
```
========================================================================
Running PySpark packaging tests
========================================================================
Constructing virtual env for testing
Missing virtualenv & conda, skipping pip installability tests
Cleaning up temporary directory - /tmp/tmp.iILYWISPXW
```
See https://github.com/apache/spark/runs/2568923639?check_suite_focus=true
The GitHub Actions image has its default Conda installed at `/usr/share/miniconda`, but it seems the image we're using for PySpark does not have it (which is legitimate).
This PR proposes to install Conda to use in pip packaging tests in GitHub Actions.
### Why are the changes needed?
To recover the test coverage.
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
It was tested in my fork: https://github.com/HyukjinKwon/spark/runs/2575126882?check_suite_focus=true
```
========================================================================
Running PySpark packaging tests
========================================================================
Constructing virtual env for testing
Using conda virtual environments
Testing pip installation with python 3.6
Using /tmp/tmp.qPjTenqfGn for virtualenv
Collecting package metadata (current_repodata.json): ...working... done
Solving environment: ...working... failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... done
## Package Plan ##
environment location: /tmp/tmp.qPjTenqfGn/3.6
added / updated specs:
- numpy
- pandas
- pip
- python=3.6
- setuptools
...
Successfully ran pip sanity check
```
Closes#32537 from HyukjinKwon/SPARK-35393.
Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
Currently, in DSv2, we are still using the deprecated `buildForBatch` and `buildForStreaming`.
This PR implements the `build`, `toBatch`, `toStreaming` interfaces to replace the deprecated ones.
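A hedged sketch of the non-deprecated write path (simplified; real implementations also handle schema, options, and capabilities, and `MyWriteBuilder` is an illustrative name):
```scala
import org.apache.spark.sql.connector.write.{BatchWrite, Write, WriteBuilder}
import org.apache.spark.sql.connector.write.streaming.StreamingWrite

class MyWriteBuilder extends WriteBuilder {
  override def build(): Write = new Write {
    override def toBatch(): BatchWrite = ???         // plug in the batch writer
    override def toStreaming(): StreamingWrite = ??? // plug in the streaming writer
  }
}
```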
### Why are the changes needed?
Code refactor
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing UT.
Closes#32497 from linhongliu-db/dsv2-writer.
Lead-authored-by: Linhong Liu <linhong.liu@databricks.com>
Co-authored-by: Linhong Liu <67896261+linhongliu-db@users.noreply.github.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
This PR groups exception messages in `sql/core/src/main/scala/org/apache/spark/sql/streaming`.
### Why are the changes needed?
It will largely help with the standardization of error messages and their maintenance.
### Does this PR introduce _any_ user-facing change?
No. Error messages remain unchanged.
### How was this patch tested?
No new tests - pass all original tests to make sure it doesn't break any existing behavior.
Closes#32464 from beliefer/SPARK-35062.
Lead-authored-by: gengjiaan <gengjiaan@360.cn>
Co-authored-by: Jiaan Geng <beliefer@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
Add a new config that makes the set of configs disabled during cached plan compilation configurable.
### Why are the changes needed?
The configs disabled during cached plan compilation exist to avoid performance regressions, but not every query becomes slower than before when AQE or bucketed scan is enabled. It's useful to add a new config so that users can decide which configs should be disabled during cached plan compilation.
### Does this PR introduce _any_ user-facing change?
Yes, a new config.
### How was this patch tested?
Add test.
Closes#32482 from ulysses-you/SPARK-35332.
Authored-by: ulysses-you <ulyssesyou18@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
Add New SQL functions:
* TRY_ADD
* TRY_DIVIDE
These expressions are identical to the following expressions under ANSI mode, except that they return null if an error occurs:
* ADD
* DIVIDE
Note: it is easy to add other expressions like `TRY_SUBTRACT`/`TRY_MULTIPLY` but let's control the number of these new expressions and just add `TRY_ADD` and `TRY_DIVIDE` for now.
### Why are the changes needed?
1. Users can manage to finish queries without interruptions in ANSI mode.
2. Users can get NULLs instead of unreasonable results if overflow occurs when ANSI mode is off.
For example, the behavior of the following SQL operations is unreasonable:
```
2147483647 + 2 => -2147483647
```
With the new safe version SQL functions:
```
TRY_ADD(2147483647, 2) => null
```
Note: **We should only add new expressions to important operators, instead of adding new safe expressions for all the expressions that can throw errors.**
### Does this PR introduce _any_ user-facing change?
Yes, new SQL functions: TRY_ADD/TRY_DIVIDE
### How was this patch tested?
Unit test
Closes#32292 from gengliangwang/try_add.
Authored-by: Gengliang Wang <ltnwgl@gmail.com>
Signed-off-by: Gengliang Wang <ltnwgl@gmail.com>
### What changes were proposed in this pull request?
`./build/mvn` now downloads the .sha512 checksum of Maven artifacts it downloads, and checks the checksum after download.
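A hedged sketch of the check the script performs (the file name and Maven version are assumed for illustration): compute the SHA-512 of the downloaded artifact and compare it against the contents of the downloaded `.sha512` file.
```scala
import java.nio.file.{Files, Paths}
import java.security.MessageDigest

val artifact = Paths.get("build/apache-maven-3.6.3-bin.tar.gz") // assumed path
val expected = new String(Files.readAllBytes(Paths.get(artifact.toString + ".sha512")))
  .trim.split("\\s+")(0)
val actual = MessageDigest.getInstance("SHA-512")
  .digest(Files.readAllBytes(artifact))
  .map("%02x".format(_)).mkString
require(actual.equalsIgnoreCase(expected), s"Checksum mismatch for $artifact")
```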
### Why are the changes needed?
This ensures the integrity of the Maven artifact during a user's build, which may come from several non-ASF mirrors.
### Does this PR introduce _any_ user-facing change?
Should not affect anything about Spark per se, just the build.
### How was this patch tested?
Manual testing wherein I forced Maven/Scala download, verified checksums are downloaded and checked, and verified it fails on error with a corrupted checksum.
Closes#32505 from srowen/SPARK-35373.
Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
This PR removes the check of `summary.logLikelihood` in ml/clustering.py - this GMM test is quite flaky. It fails easily, e.g., if:
- the number of partitions changes;
- the way of computing the sum of weights changes;
- the underlying BLAS implementation changes.
It also uses a more permissive precision on the `Word2Vec` test case.
### Why are the changes needed?
To recover the build and tests.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing test cases.
Closes#32533 from zhengruifeng/SPARK_35392_disable_flaky_gmm_test.
Lead-authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
We have supported DPP in AQE when the join is a broadcast hash join before applying the AQE rules in [SPARK-34168](https://issues.apache.org/jira/browse/SPARK-34168), which has some limitations: it only applies DPP when the small table side is executed first, so that the big table side can reuse the broadcast exchange from the small table side. This PR addresses the above limitations and applies DPP whenever the broadcast exchange can be reused.
### Why are the changes needed?
Resolve the limitations when DPP and AQE are both enabled.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Adding new ut
Closes#31756 from JkSelf/supportDPP2.
Authored-by: jiake <ke.a.jia@intel.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
In Spark, we have an extension in the MERGE syntax: INSERT/UPDATE *. This is not from the ANSI standard or any other mainstream database, so we need to define the behavior on our own.
The behavior today is very weird: assume the source table has `n1` columns and the target table has `n2` columns. We generate the assignments by taking the first `min(n1, n2)` columns from the source and target tables and pairing them by ordinal.
This PR proposes a more reasonable behavior: take all the columns from target table as keys, and find the corresponding columns from source table by name as values.
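A hedged illustration of the new matching behavior for `UPDATE SET *` / `INSERT *` (the table and column names are made up):
```scala
spark.sql("""
  MERGE INTO target t
  USING source s
  ON t.id = s.id
  WHEN MATCHED THEN UPDATE SET *
  WHEN NOT MATCHED THEN INSERT *
""")
// Before: assignments paired the first min(n1, n2) columns of source and target by ordinal.
// After: every target column is matched to the source column with the same name.
```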
### Why are the changes needed?
Fix MERGE INSERT/UPDATE * to be more user-friendly and to make schema evolution easy.
### Does this PR introduce _any_ user-facing change?
Yes, but MERGE is only supported by very few data sources.
### How was this patch tested?
new tests
Closes#32192 from cloud-fan/merge.
Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
In https://github.com/yaooqinn/itachi/issues/8, we had a discussion about the current extension injection for the Spark session. We've agreed that the current way is not that convenient for both third-party developers and end users.
It's much simpler if third-party developers can provide a resource file that contains default extensions for Spark to load ahead of time.
### Why are the changes needed?
Better user experience.
### Does this PR introduce _any_ user-facing change?
no, dev change
### How was this patch tested?
new tests
Closes#32515 from yaooqinn/SPARK-35380.
Authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
### What changes were proposed in this pull request?
This PR aims to unify two K8s version variables in two `pom.xml`s into one. `kubernetes-client.version` is correct because the artifact ID is `kubernetes-client`.
```
kubernetes.client.version (kubernetes/core module)
kubernetes-client.version (kubernetes/integration-test module)
```
### Why are the changes needed?
Having two variables for the same value is confusing and inconvenient when we upgrade K8s versions.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Pass the CIs. (The compilation test passes are enough.)
Closes#32531 from dongjoon-hyun/SPARK-35394.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR fixes the same issue as #32424.
```py
from pyspark.sql.functions import flatten, struct, transform
df = spark.sql("SELECT array(1, 2, 3) as numbers, array('a', 'b', 'c') as letters")
df.select(flatten(
transform(
"numbers",
lambda number: transform(
"letters",
lambda letter: struct(number.alias("n"), letter.alias("l"))
)
)
).alias("zipped")).show(truncate=False)
```
**Before:**
```
+------------------------------------------------------------------------+
|zipped |
+------------------------------------------------------------------------+
|[{a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}, {a, a}, {b, b}, {c, c}]|
+------------------------------------------------------------------------+
```
**After:**
```
+------------------------------------------------------------------------+
|zipped |
+------------------------------------------------------------------------+
|[{1, a}, {1, b}, {1, c}, {2, a}, {2, b}, {2, c}, {3, a}, {3, b}, {3, c}]|
+------------------------------------------------------------------------+
```
### Why are the changes needed?
To produce the correct results.
### Does this PR introduce _any_ user-facing change?
Yes, it fixes the results to be correct as mentioned above.
### How was this patch tested?
Added a unit test as well as manually.
Closes#32523 from ueshin/issues/SPARK-35382/nested_higher_order_functions.
Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
Change `map` in `InvokeLike.invoke` to a while loop to improve performance, following Spark [style guide](https://github.com/databricks/scala-style-guide#traversal-and-zipwithindex).
### Why are the changes needed?
`InvokeLike.invoke`, which is used in non-codegen path for `Invoke` and `StaticInvoke`, currently uses `map` to evaluate arguments:
```scala
val args = arguments.map(e => e.eval(input).asInstanceOf[Object])
if (needNullCheck && args.exists(_ == null)) {
// return null if one of arguments is null
null
} else {
...
```
which is pretty expensive if the method itself is trivial. We can change it to a plain while loop.
<img width="871" alt="Screen Shot 2021-05-12 at 12 19 59 AM" src="https://user-images.githubusercontent.com/506679/118055719-7f985a00-b33d-11eb-943b-cf85eab35f44.png">
Benchmark results show this can improve as much as 3x from `V2FunctionBenchmark`:
Before
```
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 2.40GHz
scalar function (long + long) -> long, result_nullable = false codegen = false: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
--------------------------------------------------------------------------------------------------------------------------------------------------------------
native_long_add 36506 36656 251 13.7 73.0 1.0X
java_long_add_default 47151 47540 370 10.6 94.3 0.8X
java_long_add_magic 178691 182457 1327 2.8 357.4 0.2X
java_long_add_static_magic 177151 178258 1151 2.8 354.3 0.2X
```
After
```
OpenJDK 64-Bit Server VM 1.8.0_292-b10 on Linux 5.4.0-1046-azure
Intel(R) Xeon(R) CPU E5-2673 v3 2.40GHz
scalar function (long + long) -> long, result_nullable = false codegen = false: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
--------------------------------------------------------------------------------------------------------------------------------------------------------------
native_long_add 29897 30342 568 16.7 59.8 1.0X
java_long_add_default 40628 41075 664 12.3 81.3 0.7X
java_long_add_magic 54553 54755 182 9.2 109.1 0.5X
java_long_add_static_magic 55410 55532 127 9.0 110.8 0.5X
```
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing tests.
Closes#32527 from sunchao/SPARK-35384.
Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
### What changes were proposed in this pull request?
This PR allows the PR source branch to include slashes.
### Why are the changes needed?
There are PRs whose source branches include slashes, like `issues/SPARK-35119/gha` here or #32523.
Before the fix, the PR build fails in `Sync the current branch with the latest in Apache Spark` phase.
For example, at #32523, the source branch is `issues/SPARK-35382/nested_higher_order_functions`:
```
...
fatal: couldn't find remote ref nested_higher_order_functions
Error: Process completed with exit code 128.
```
(https://github.com/ueshin/apache-spark/runs/2569356241)
### Does this PR introduce _any_ user-facing change?
No, this is a dev-only change.
### How was this patch tested?
This PR source branch includes slashes and #32525 doesn't.
Closes#32524 from ueshin/issues/SPARK-35119/gha.
Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
This PR proposes to skip the "q6", "q34", "q64", "q74", "q75", "q78" queries in the TPCDS-related tests because the TPCDS v2.7 queries have almost the same ones; the only differences in these queries are ORDER BY columns.
### Why are the changes needed?
To improve test performance.
### Does this PR introduce _any_ user-facing change?
No, dev only.
### How was this patch tested?
Existing tests.
Closes#32520 from maropu/SkipDupQueries.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>
### What changes were proposed in this pull request?
This proposes to document the available metrics for ExecutorAllocationManager in the Spark monitoring documentation.
### Why are the changes needed?
The ExecutorAllocationManager is instrumented with metrics using the Spark metrics system.
The relevant work is in SPARK-7007 and SPARK-33763.
ExecutorAllocationManager metrics are currently undocumented.
### Does this PR introduce _any_ user-facing change?
This PR adds documentation only.
### How was this patch tested?
na
Closes#32500 from LucaCanali/followupMetricsDocSPARK33763.
Authored-by: Luca Canali <luca.canali@cern.ch>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
Currently Spark does not allow setting spark.driver.memory, spark.executor.cores, or spark.executor.memory to 0, but it allows driver cores to be 0. This PR adds a check for the driver cores value as well. Thanks to Oleg Lypkan for finding this.
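A hedged sketch of the kind of validation added (the exact check inside Spark may differ in message and location):
```scala
import org.apache.spark.SparkConf

def validateDriverCores(conf: SparkConf): Unit = {
  conf.getOption("spark.driver.cores").foreach { v =>
    require(v.toInt > 0, "spark.driver.cores must be a positive number")
  }
}
```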
### Why are the changes needed?
To make the configuration check consistent.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Manual testing
Closes#32504 from shahidki31/shahid/drivercore.
Lead-authored-by: shahid <shahidki31@gmail.com>
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
Co-authored-by: Shahid <shahidki31@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
Switch to plain `while` loop following Spark [style guide](https://github.com/databricks/scala-style-guide#traversal-and-zipwithindex).
### Why are the changes needed?
A `while` loop may yield better performance compared to `foreach`.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
N/A
Closes#32522 from sunchao/SPARK-35361-follow-up.
Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
This PR aims to improve S3A magic committer support by inferring all missing configs from a single minimum configuration, `spark.hadoop.fs.s3a.bucket.<bucket>.committer.magic.enabled=true`.
Given that AWS S3 has provided [strong read-after-write consistency](https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/) since December 2020, we can ignore DynamoDB-related configurations. As a result, the minimum set of configurations is the following:
```
spark.hadoop.fs.s3a.committer.magic.enabled=true
spark.hadoop.fs.s3a.bucket.<bucket>.committer.magic.enabled=true
spark.hadoop.fs.s3a.committer.name=magic
spark.hadoop.mapreduce.outputcommitter.factory.scheme.s3a=org.apache.hadoop.fs.s3a.commit.S3ACommitterFactory
spark.sql.parquet.output.committer.class=org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter
spark.sql.sources.commitProtocolClass=org.apache.spark.internal.io.cloud.PathOutputCommitProtocol
```
### Why are the changes needed?
To use the S3A magic committer in Apache Spark, users need to set up a set of configurations, and if something is missing, it ends up with error messages like the following.
```
Exception in thread "main" org.apache.hadoop.fs.s3a.commit.PathCommitException:
`s3a://my-spark-bucket`: Filesystem does not have support for 'magic' committer enabled in configuration option fs.s3a.committer.magic.enabled
at org.apache.hadoop.fs.s3a.commit.CommitUtils.verifyIsMagicCommitFS(CommitUtils.java:74)
at org.apache.hadoop.fs.s3a.commit.CommitUtils.getS3AFileSystem(CommitUtils.java:109)
```
### Does this PR introduce _any_ user-facing change?
Yes, after this improvement, all Spark users can use the S3A magic committer with a single configuration.
```
spark.hadoop.fs.s3a.bucket.<bucket>.committer.magic.enabled=true
```
This PR infers the missing configurations, so there is no side effect for existing users who already have all the configurations set.
### How was this patch tested?
Pass the CIs with the newly added test cases.
Closes#32518 from dongjoon-hyun/SPARK-35383.
Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
After merging https://github.com/apache/spark/pull/32439, there is a flaky error from the GitHub Actions job "Java 11 build with Maven":
```
Error: ## Exception when compiling 473 sources to /home/runner/work/spark/spark/sql/catalyst/target/scala-2.12/classes
java.lang.StackOverflowError
scala.reflect.internal.Trees.itransform(Trees.scala:1376)
scala.reflect.internal.Trees.itransform$(Trees.scala:1374)
scala.reflect.internal.SymbolTable.itransform(SymbolTable.scala:28)
scala.reflect.internal.SymbolTable.itransform(SymbolTable.scala:28)
scala.reflect.api.Trees$Transformer.transform(Trees.scala:2563)
scala.tools.nsc.transform.TypingTransformers$TypingTransformer.transform(TypingTransformers.scala:51)
```
We can resolve it by increasing the JVM thread stack size to 256 MB. The container for GitHub Actions jobs has 7 GB of memory, so this should be fine.
### Why are the changes needed?
Fix flaky test failure in Java 11 build test
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Github action test
Closes#32521 from gengliangwang/increaseStackSize.
Authored-by: Gengliang Wang <ltnwgl@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
### What changes were proposed in this pull request?
A simple follow-up of #32474 to throw an exception instead of calling sys.error.
### Why are the changes needed?
Throwing an explicit exception only fails the query, which is preferable to calling sys.error.
### Does this PR introduce _any_ user-facing change?
Yes, if `Invoke` or `StaticInvoke` cannot find the method, we now throw an exception instead of the original `sys.error`.
### How was this patch tested?
Existing tests.
Closes#32519 from viirya/SPARK-35347-followup.
Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
### What changes were proposed in this pull request?
This PR proposes to bump up the janino version from 3.0.16 to v3.1.4.
The major changes of this upgrade are as follows:
- Fixed issue #131: Janino 3.1.2 is 10x slower than 3.0.11: The Compiler's IClassLoader was initialized way too eagerly, thus lots of classes were loaded from the class path, which is very slow.
- Improved the encoding of stack map frames according to JVMS11 4.7.4: Previously, only "full_frame"s were generated.
- Fixed issue #107: Janino requires "org.codehaus.commons.compiler.io", but commons-compiler does not export this package
- Fixed the promotion of the array access index expression (see JLS7 15.13 Array Access Expressions).
For all the changes, please see the change log: http://janino-compiler.github.io/janino/changelog.html
NOTE1: I've checked that there is no obvious performance regression. For all the data, see a link: https://docs.google.com/spreadsheets/d/1srxT9CioGQg1fLKM3Uo8z1sTzgCsMj4pg6JzpdcG6VU/edit?usp=sharing
NOTE2: We upgraded janino to 3.1.2 (#27860) once before, but the commit was reverted in #29495 because of a correctness issue. Recently, #32374 checked whether Spark could land on v3.1.3, but a new bug was found there. These known issues have been fixed in v3.1.4 by the following PRs:
- janino-compiler/janino#145
- janino-compiler/janino#146
### Why are the changes needed?
janino v3.0.X is no longer maintained.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
GA passed.
Closes#32455 from maropu/janino_v3.1.4.
Authored-by: Takeshi Yamamuro <yamamuro@apache.org>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
Overload the methods `PageRank.runWithOptions` and `PageRank.runWithOptionsWithPreviousPageRank` (so as not to break any user-facing signature) with a `normalized` parameter that describes whether or not to normalize the rank sum.
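A hedged usage sketch (the existing `runWithOptions` argument list is assumed, with the new flag appended; check the actual overload before relying on this):
```scala
import org.apache.spark.graphx.lib.PageRank

// `graph` is an existing Graph[VD, ED]; numIter/resetProb/srcId values are illustrative.
val normalizedRanks = PageRank.runWithOptions(graph, 6, 0.15, None, normalized = true)
val rawRanks        = PageRank.runWithOptions(graph, 6, 0.15, None, normalized = false)
```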
### Why are the changes needed?
https://issues.apache.org/jira/browse/SPARK-35357
When dealing with a non-negligible proportion of sinks in a graph, algorithms based on incremental updates of ranks can get a **precision gain for free** if they are allowed to manipulate non-normalized ranks.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
By adding a unit test that verifies that (even when dealing with a graph containing a sink) we end up with the same result for both these scenarios:
a)
- Run **6 iterations** of pagerank in a row using `PageRank.runWithOptions` with **normalization enabled**
b)
- Run **2 iterations** using `PageRank.runWithOptions` with **normalization disabled**
- Resume from the `preRankGraph1` and run **2 more iterations** using `PageRank.runWithOptionsWithPreviousPageRank` with **normalization disabled**
- Finally resume from the `preRankGraph2` and run **2 more iterations** using `PageRank.runWithOptionsWithPreviousPageRank` with **normalization enabled**
Closes#32485 from bonnal-enzo/make-pagerank-normalization-optional.
Authored-by: Enzo Bonnal <enzobonnal@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
### What changes were proposed in this pull request?
Clean up code according to the discussion at https://github.com/apache/spark/pull/25854#discussion_r629451135.
### Why are the changes needed?
Clean code
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing UT.
Closes#32499 from AngersZhuuuu/SPARK-29145-fix.
Authored-by: Angerszhuuuu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
### What changes were proposed in this pull request?
Columnar execution support for the ANSI interval types YearMonthIntervalType and DayTimeIntervalType.
### Why are the changes needed?
Support caching tables with ANSI interval types.
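A hedged usage sketch of what this enables (the view and table names are illustrative):
```scala
// Cache a relation whose column has an ANSI interval type (YearMonthIntervalType here).
spark.sql("SELECT INTERVAL '1-2' YEAR TO MONTH AS ym").createOrReplaceTempView("t")
spark.sql("CACHE TABLE cached_t AS SELECT * FROM t")
spark.table("cached_t").show()
```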
### Does this PR introduce _any_ user-facing change?
### How was this patch tested?
run ./dev/lint-java
run ./dev/scalastyle
run test: CachedTableSuite
run test: ColumnTypeSuite
Closes#32452 from Peng-Lei/SPARK-35243.
Lead-authored-by: PengLei <18066542445@189.cn>
Co-authored-by: Lei Peng <peng.8lei@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
This PR fixes the same issue as https://github.com/apache/spark/pull/32424
```r
df <- sql("SELECT array(1, 2, 3) as numbers, array('a', 'b', 'c') as letters")
collect(select(
df,
array_transform("numbers", function(number) {
array_transform("letters", function(latter) {
struct(alias(number, "n"), alias(latter, "l"))
})
})
))
```
**Before:**
```
... a, a, b, b, c, c, a, a, b, b, c, c, a, a, b, b, c, c
```
**After:**
```
... 1, a, 1, b, 1, c, 2, a, 2, b, 2, c, 3, a, 3, b, 3, c
```
### Why are the changes needed?
To produce the correct results.
### Does this PR introduce _any_ user-facing change?
Yes, it fixes the results to be correct as mentioned above.
### How was this patch tested?
Manually tested as above, and unit test was added.
Closes#32517 from HyukjinKwon/SPARK-35381.
Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
SPARK-35175 (#32274) added a linter for JS so let's add it to GA.
### Why are the changes needed?
To keep the JS code clean.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
GA
Closes#32512 from sarutak/ga-lintjs.
Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
In `ApplyFunctionExpression`, move `zipWithIndex` out of the loop for each input row.
### Why are the changes needed?
When the `ScalarFunction` is trivial, `zipWithIndex` could incur significant costs, as shown below:
<img width="899" alt="Screen Shot 2021-05-11 at 10 03 42 AM" src="https://user-images.githubusercontent.com/506679/117866421-fb19de80-b24b-11eb-8c94-d5e8c8b1eda9.png">
By moving it out of the loop, I'm sometimes seeing a 2x speedup in `V2FunctionBenchmark`. For instance:
Before:
```
scalar function (long + long) -> long, result_nullable = false codegen = false: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
native_long_add 32437 32896 434 15.4 64.9 1.0X
java_long_add_default 85675 97045 NaN 5.8 171.3 0.4X
```
After:
```
scalar function (long + long) -> long, result_nullable = false codegen = false: Best Time(ms) Avg Time(ms) Stdev(ms) Rate(M/s) Per Row(ns) Relative
native_long_add 30182 30387 279 16.6 60.4 1.0X
java_long_add_default 42862 43009 209 11.7 85.7 0.7X
```
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Existing tests
Closes#32507 from sunchao/SPARK-35361.
Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
Since a few hours ago, the Python linter has been failing in GA.
The latest Jinja2 3.0.0 release seems to cause this failure.
https://pypi.org/project/Jinja2/
```
Run ./dev/lint-python
starting python compilation test...
python compilation succeeded.
starting pycodestyle test...
pycodestyle checks passed.
starting flake8 test...
flake8 checks passed.
starting mypy test...
mypy checks passed.
starting sphinx-build tests...
sphinx-build checks failed:
Running Sphinx v3.0.4
making output directory... done
[autosummary] generating autosummary for: development/contributing.rst, development/debugging.rst, development/index.rst, development/setting_ide.rst, development/testing.rst, getting_started/index.rst, getting_started/install.rst, getting_started/quickstart.ipynb, index.rst, migration_guide/index.rst, ..., reference/pyspark.ml.rst, reference/pyspark.mllib.rst, reference/pyspark.resource.rst, reference/pyspark.rst, reference/pyspark.sql.rst, reference/pyspark.ss.rst, reference/pyspark.streaming.rst, user_guide/arrow_pandas.rst, user_guide/index.rst, user_guide/python_packaging.rst
Exception occurred:
File "/__w/spark/spark/python/docs/source/_templates/autosummary/class_with_docs.rst", line 26, in top-level template code
{% if '__init__' in methods %}
jinja2.exceptions.UndefinedError: 'methods' is undefined
The full traceback has been saved in /tmp/sphinx-err-ypgyi75y.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
make: *** [Makefile:20: html] Error 2
re-running make html to print full warning list:
Running Sphinx v3.0.4
making output directory... done
[autosummary] generating autosummary for: development/contributing.rst, development/debugging.rst, development/index.rst, development/setting_ide.rst, development/testing.rst, getting_started/index.rst, getting_started/install.rst, getting_started/quickstart.ipynb, index.rst, migration_guide/index.rst, ..., reference/pyspark.ml.rst, reference/pyspark.mllib.rst, reference/pyspark.resource.rst, reference/pyspark.rst, reference/pyspark.sql.rst, reference/pyspark.ss.rst, reference/pyspark.streaming.rst, user_guide/arrow_pandas.rst, user_guide/index.rst, user_guide/python_packaging.rst
Exception occurred:
File "/__w/spark/spark/python/docs/source/_templates/autosummary/class_with_docs.rst", line 26, in top-level template code
{% if '__init__' in methods %}
jinja2.exceptions.UndefinedError: 'methods' is undefined
The full traceback has been saved in /tmp/sphinx-err-fvtmvvwv.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
make: *** [Makefile:20: html] Error 2
Error: Process completed with exit code 2.
```
### Why are the changes needed?
To recover GA build.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
GA.
Closes#32509 from sarutak/fix-python-lint-error.
Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
### What changes were proposed in this pull request?
This PR increases the stack size for Scala compilation in Maven build to fix the error:
```
java.lang.StackOverflowError
scala.reflect.internal.Trees$UnderConstructionTransformer.transform(Trees.scala:1741)
scala.reflect.internal.Trees$UnderConstructionTransformer.transform$(Trees.scala:1740)
scala.tools.nsc.transform.ExplicitOuter$OuterPathTransformer.transform(ExplicitOuter.scala:289)
scala.tools.nsc.transform.ExplicitOuter$ExplicitOuterTransformer.transform(ExplicitOuter.scala:477)
scala.tools.nsc.transform.ExplicitOuter$ExplicitOuterTransformer.transform(ExplicitOuter.scala:330)
scala.reflect.api.Trees$Transformer.$anonfun$transformStats$1(Trees.scala:2597)
scala.reflect.api.Trees$Transformer.transformStats(Trees.scala:2595)
scala.reflect.internal.Trees.itransform(Trees.scala:1404)
scala.reflect.internal.Trees.itransform$(Trees.scala:1374)
scala.reflect.internal.SymbolTable.itransform(SymbolTable.scala:28)
scala.reflect.internal.SymbolTable.itransform(SymbolTable.scala:28)
scala.reflect.api.Trees$Transformer.transform(Trees.scala:2563)
scala.tools.nsc.transform.TypingTransformers$TypingTransformer.transform(TypingTransformers.scala:51)
scala.tools.nsc.transform.ExplicitOuter$OuterPathTransformer.scala$reflect$internal$Trees$UnderConstructionTransformer$$super$transform(ExplicitOuter.scala:212)
scala.reflect.internal.Trees$UnderConstructionTransformer.transform(Trees.scala:1745)
scala.reflect.internal.Trees$UnderConstructionTransformer.transform$(Trees.scala:1740)
scala.tools.nsc.transform.ExplicitOuter$OuterPathTransformer.transform(ExplicitOuter.scala:289)
scala.tools.nsc.transform.ExplicitOuter$ExplicitOuterTransformer.transform(ExplicitOuter.scala:477)
scala.tools.nsc.transform.ExplicitOuter$ExplicitOuterTransformer.transform(ExplicitOuter.scala:330)
scala.reflect.internal.Trees.itransform(Trees.scala:1383)
```
See https://github.com/apache/spark/runs/2554067779
### Why are the changes needed?
To recover JDK 11 compilation
### Does this PR introduce _any_ user-facing change?
No, dev-only.
### How was this patch tested?
CI in this PR will test it out.
Closes#32502 from HyukjinKwon/SPARK-35372.
Authored-by: Hyukjin Kwon <gurwls223@apache.org>
Signed-off-by: Kousuke Saruta <sarutak@oss.nttdata.com>
### What changes were proposed in this pull request?
This PR proposes to introduce three new configurations to limit the maximum number of jobs/stages/executors shown on the timeline view.
### Why are the changes needed?
If the number of items on the timeline view grows beyond about 1000, rendering can be significantly slow.
https://issues.apache.org/jira/browse/SPARK-35229
The maximum number of tasks on the timeline is already limited by `spark.ui.timeline.tasks.maximum`, so I propose to mitigate this issue in the same manner.
### Does this PR introduce _any_ user-facing change?
Yes. the maximum number of items shown on the timeline view is limited.
I propose a default value of 500 for jobs and stages, and 250 for executors.
Since an executor has at most 2 items (added and removed), 250 is chosen.
### How was this patch tested?
I manually confirm this change works with the following procedures.
```
# launch a cluster
$ bin/spark-shell --conf spark.ui.retainedDeadExecutors=300 --master "local-cluster[4, 1, 1024]"
// Confirm the maximum number of jobs
(1 to 1000).foreach { _ => sc.parallelize(List(1)).collect }
// Confirm the maximum number of stages
var df = sc.parallelize(1 to 2)
(1 to 1000).foreach { i => df = df.repartition(i % 5 + 1) }
df.collect
// Confirm the maximum number of executors
(1 to 300).foreach { _ => try sc.parallelize(List(1)).foreach { _ => System.exit(0) } catch { case e => }}
```
Screenshots here.
![jobs_limited](https://user-images.githubusercontent.com/4736016/116386937-3e8c4a00-a855-11eb-8f4c-151cf7ddd3b8.png)
![stages_limited](https://user-images.githubusercontent.com/4736016/116386990-49df7580-a855-11eb-9f71-8e129e3336ab.png)
![executors_limited](https://user-images.githubusercontent.com/4736016/116387009-4f3cc000-a855-11eb-8697-a2eb4c9c99e6.png)
Closes#32381 from sarutak/mitigate-timeline-issue.
Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Gengliang Wang <ltnwgl@gmail.com>
### What changes were proposed in this pull request?
Added the following TreePattern enums:
- BOOL_AGG
- COUNT_IF
- CURRENT_LIKE
- RUNTIME_REPLACEABLE
Added tree traversal pruning to the following rules:
- ReplaceExpressions
- RewriteNonCorrelatedExists
- ComputeCurrentTime
- GetCurrentDatabaseAndCatalog
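As a hedged illustration of what the pruning buys (simplified from ReplaceExpressions; the exact method and pattern names in Spark may differ slightly):
```scala
import org.apache.spark.sql.catalyst.expressions.RuntimeReplaceable
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.Rule
import org.apache.spark.sql.catalyst.trees.TreePattern.RUNTIME_REPLACEABLE

// Subtrees whose pattern bits lack RUNTIME_REPLACEABLE are skipped entirely
// instead of being visited node by node.
object ReplaceExpressionsSketch extends Rule[LogicalPlan] {
  def apply(plan: LogicalPlan): LogicalPlan =
    plan.transformAllExpressionsWithPruning(_.containsPattern(RUNTIME_REPLACEABLE)) {
      case e: RuntimeReplaceable => e.child
    }
}
```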
### Why are the changes needed?
Reduce the number of tree traversals and hence improve the query compilation latency.
Performance improvement (org.apache.spark.sql.TPCDSQuerySuite):
Rule name | Total Time (baseline) | Total Time (experiment) | experiment/baseline
--- | --- | --- | ---
ReplaceExpressions | 27546369 | 19753804 | 0.72
RewriteNonCorrelatedExists | 17304883 | 2086194 | 0.12
ComputeCurrentTime | 35751301 | 19984477 | 0.56
GetCurrentDatabaseAndCatalog | 37230787 | 18874013 | 0.51
### How was this patch tested?
Existing tests.
Closes#32461 from sigmod/finish_analysis.
Authored-by: Yingyi Bu <yingyi.bu@databricks.com>
Signed-off-by: Gengliang Wang <ltnwgl@gmail.com>
### What changes were proposed in this pull request?
This is a pre-requisite of https://github.com/apache/spark/pull/32476, in discussion of https://github.com/apache/spark/pull/32476#issuecomment-836469779 . This is to refactor sort merge join code-gen to depend on streamed/buffered terminology, which makes the code-gen agnostic to different join types and can be extended to support other join types than inner join.
### Why are the changes needed?
Pre-requisite of https://github.com/apache/spark/pull/32476.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing unit test in `InnerJoinSuite.scala` for inner join code-gen.
Closes#32495 from c21/smj-refactor.
Authored-by: Cheng Su <chengsu@fb.com>
Signed-off-by: Takeshi Yamamuro <yamamuro@apache.org>