Commit graph

4 commits

Author SHA1 Message Date
HyukjinKwon 742fcff350 [SPARK-32839][WINDOWS] Make Spark scripts working with the spaces in paths on Windows
### What changes were proposed in this pull request?

If you install Spark under the path that has whitespaces, it does not work on Windows, for example as below:

```
>>> SparkSession.builder.getOrCreate()
Presence of build for multiple Scala versions detected (C:\...\assembly\target\scala-2.13 and C:\...\assembly\target\scala-2.12).
Remove one of them or, set SPARK_SCALA_VERSION=2.13 in spark-env.cmd.
Visit https://spark.apache.org/docs/latest/configuration.html#environment-variables for more details about setting environment variables in spark-env.cmd.
Either clean one of them or, set SPARK_SCALA_VERSION in spark-env.cmd.
```

This PR fixes the whitespace handling to support any paths on Windows.

### Why are the changes needed?

To support Spark working with whitespaces in paths on Windows.

### Does this PR introduce _any_ user-facing change?

Yes, users will be able to install and run Spark under the paths with whitespaces.

### How was this patch tested?

Manually tested.

Closes #29706 from HyukjinKwon/window-space-path.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-09-14 13:15:14 +09:00
Dongjoon Hyun 4f79b9fffd [SPARK-32447][CORE] Use python3 by default in pyspark and find-spark-home scripts
### What changes were proposed in this pull request?

This PR aims to use `python3` instead of `python` inside `bin/pyspark`, `bin/find-spark-home` and `bin/find-spark-home.cmd` script.
```
$ git diff master --stat
 bin/find-spark-home     | 4 ++--
 bin/find-spark-home.cmd | 4 ++--
 bin/pyspark             | 4 ++--
```

### Why are the changes needed?

According to [PEP 394](https://www.python.org/dev/peps/pep-0394/), we have four different cases for `python` while `python3` will be there always.
```
- Distributors may choose to set the behavior of the python command as follows:
      python2,
      python3,
      not provide python command,
      allow python to be configurable by an end user or a system administrator.
```

Moreover, these scripts already depend on `find_spark_home.py` which is using `#!/usr/bin/env python3`.
```
FIND_SPARK_HOME_PYTHON_SCRIPT="$(cd "$(dirname "$0")"; pwd)/find_spark_home.py"
```

### Does this PR introduce _any_ user-facing change?

No. Apache Spark 3.1 already drops Python 2.7 via SPARK-32138 .

### How was this patch tested?

Pass the Jenkins or GitHub Action.

Closes #29246 from dongjoon-hyun/SPARK-FIND-SPARK-HOME.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
2020-07-26 15:55:48 -07:00
hyukjinkwon a1877f45c3 [SPARK-22597][SQL] Add spark-sql cmd script for Windows users
## What changes were proposed in this pull request?

This PR proposes to add cmd scripts so that Windows users can also run `spark-sql` script.

## How was this patch tested?

Manually tested on Windows.

**Before**

```cmd
C:\...\spark>.\bin\spark-sql
'.\bin\spark-sql' is not recognized as an internal or external command,
operable program or batch file.

C:\...\spark>.\bin\spark-sql.cmd
'.\bin\spark-sql.cmd' is not recognized as an internal or external command,
operable program or batch file.
```

**After**

```cmd
C:\...\spark>.\bin\spark-sql
...
spark-sql> SELECT 'Hello World !!';
...
Hello World !!
```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #19808 from HyukjinKwon/spark-sql-cmd.
2017-11-24 19:55:26 +01:00
Jakub Nowacki b4edafa99b [SPARK-22495] Fix setup of SPARK_HOME variable on Windows
## What changes were proposed in this pull request?

Fixing the way how `SPARK_HOME` is resolved on Windows. While the previous version was working with the built release download, the set of directories changed slightly for the PySpark `pip` or `conda` install. This has been reflected in Linux files in `bin` but not for Windows `cmd` files.

First fix improves the way how the `jars` directory is found, as this was stoping Windows version of `pip/conda` install from working; JARs were not found by on Session/Context setup.

Second fix is adding `find-spark-home.cmd` script, which uses `find_spark_home.py` script, as the Linux version, to resolve `SPARK_HOME`. It is based on `find-spark-home` bash script, though, some operations are done in different order due to the `cmd` script language limitations. If environment variable is set, the Python script `find_spark_home.py` will not be run. The process can fail if Python is not installed, but it will mostly use this way if PySpark is installed via `pip/conda`, thus, there is some Python in the system.

## How was this patch tested?

Tested on local installation.

Author: Jakub Nowacki <j.s.nowacki@gmail.com>

Closes #19370 from jsnowacki/fix_spark_cmds.
2017-11-23 12:47:38 +09:00