- [x] Upgrade Py4J to 0.9.1
- [x] SPARK-12657: Revert SPARK-12617
- [x] SPARK-12658: Revert SPARK-12511
- Still keep the change that reads the checkpoint only once. This is a manual change and worth a careful look. bfd4b5c040
- [x] Verify there is no leak any more after reverting our workarounds
Author: Shixiong Zhu <shixiong@databricks.com>
Closes #10692 from zsxwing/py4j-0.9.1.
This PR is based on the work of roji to support running Spark scripts from symlinks. Thanks for the great work, roji. Would you mind taking a look at this PR? Thanks a lot.
For releases like HDP and others, the Spark executables are normally exposed as symlinks placed on `PATH`, but Spark's current scripts do not recursively resolve the real path behind a symlink, so Spark fails to execute when invoked through one. This PR tries to solve the issue by finding the absolute path from the symlink.
The reason for not using `readlink -f` as an earlier PR (https://github.com/apache/spark/pull/2386) did is that `-f` is not supported on Mac, so here the path is resolved manually in a loop.
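The manual resolution can be sketched as a small POSIX-shell helper (a hypothetical standalone version for illustration; the actual script operates on `$0` inline):

```shell
#!/bin/sh
# Hypothetical helper sketching the manual symlink resolution: follow a
# chain of links without `readlink -f`, which macOS's readlink lacks.
resolve_link() {
  script="$1"
  while [ -h "$script" ]; do                        # -h: the path is a symlink
    target="$(readlink "$script")"
    case "$target" in
      /*) script="$target" ;;                       # absolute link target
      *)  script="$(dirname "$script")/$target" ;;  # relative to the link's directory
    esac
  done
  echo "$script"
}
```

Resolving relative targets against the link's own directory is what makes chained symlinks (link to link to script) work correctly.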
I've tested on Mac and Linux (CentOS), and it looks fine.
This PR does not fix the scripts under the `sbin` folder; I'm not sure whether they need to be fixed as well.
Please help review; any comment is greatly appreciated.
Author: jerryshao <sshao@hortonworks.com>
Author: Shay Rojansky <roji@roji.org>
Closes #8669 from jerryshao/SPARK-2960.
In sbin/spark-config.sh, parameter expansion is used to extract the source root as follows.
this="${BASH_SOURCE-$0}"
I think the parameter expansion operator should be `:-` instead of `-`.
If we use `-` and BASH_SOURCE="" (i.e. it is set to the empty string, not unset),
"" (the empty string) is assigned to $this.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #2930 from sarutak/SPARK-4076 and squashes the following commits:
32a0370 [Kousuke Saruta] Fixed wrong parameter expansion
https://issues.apache.org/jira/browse/SPARK-3696
We check whether SPARK_CONF_DIR is already defined before assigning it.
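The check can be sketched like this (SPARK_HOME and the /opt/spark fallback are assumptions for illustration, not the script's actual defaults):

```shell
#!/bin/sh
# Sketch: assign SPARK_CONF_DIR only when the user has not already set it,
# so a user-defined conf dir is never overridden.
SPARK_HOME="${SPARK_HOME:-/opt/spark}"   # assumed install root, for illustration
if [ -z "${SPARK_CONF_DIR}" ]; then
  export SPARK_CONF_DIR="$SPARK_HOME/conf"
fi
```

An equivalent one-liner using the `:-` expansion discussed above would be `export SPARK_CONF_DIR="${SPARK_CONF_DIR:-"$SPARK_HOME/conf"}"`.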
Author: WangTaoTheTonic <barneystinson@aliyun.com>
Closes #2541 from WangTaoTheTonic/confdir and squashes the following commits:
c3f31e0 [WangTaoTheTonic] Do not override the user-defined conf_dir
...
Tested! TBH, it isn't a great idea to have a directory with spaces in its name: Emacs doesn't like it, then Hadoop doesn't like it, and so on...
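A minimal illustration of why the paranoid quoting matters (the path is a made-up example, not from the PR):

```shell
#!/bin/sh
# Double-quote every expansion so a path containing a space stays one word.
base="$(mktemp -d)"
SPARK_HOME="$base/my spark"      # install dir with a space, for illustration
mkdir -p "$SPARK_HOME/conf"      # quoted: treated as a single path

# Unquoted, the same expansion is split on the space into two words,
# which is exactly how commands end up receiving a broken path.
set -- $SPARK_HOME
echo "unquoted expansion produced $# words"
```

Quoting `"$SPARK_HOME"` everywhere keeps the expansion a single word regardless of spaces, which is what the "paranoid quoting" commit applies throughout the scripts.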
Author: Prashant Sharma <prashant.s@imaginea.com>
Closes #2229 from ScrapCodes/SPARK-3337/quoting-shell-scripts and squashes the following commits:
d4ad660 [Prashant Sharma] SPARK-3337 Paranoid quoting in shell to allow install dirs with spaces within.
Author: Josh Rosen <joshrosen@apache.org>
Closes #1626 from JoshRosen/SPARK-2305 and squashes the following commits:
03fb283 [Josh Rosen] Update Py4J to version 0.8.2.1.
This reopens https://github.com/apache/incubator-spark/pull/640 against the new repo.
Author: Sandy Ryza <sandy@cloudera.com>
Closes #30 from sryza/sandy-spark-1004 and squashes the following commits:
89889d4 [Sandy Ryza] Move unzipping py4j to the generate-resources phase so that it gets included in the jar the first time
5165a02 [Sandy Ryza] Fix docs
fd0df79 [Sandy Ryza] PySpark on YARN