spark-instrumented-optimizer/bin/spark-class2.cmd
wuyi 925f620570 [SPARK-28302][CORE] Make sure to generate unique output file for SparkLauncher on Windows
## What changes were proposed in this pull request?

When using SparkLauncher to submit applications **concurrently** from multiple threads on **Windows**, some apps fail with "The process cannot access the file because it is being used by another process" and remain in the LOST state at the end. The issue can be reproduced with this [demo](https://issues.apache.org/jira/secure/attachment/12973920/Main.scala).

After digging into the code, I found that the Windows cmd `%RANDOM%` variable returns the same number when it is read again almost immediately (e.g. < 500ms) after the previous call. As a result, SparkLauncher gets the same output file (spark-class-launcher-output-%RANDOM%.txt) for several apps. A later app then hits the issue when it tries to write to a file that another app has already opened for writing.
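
For illustration, the repeated value can be observed with a small batch script like the one below. This is a sketch, not part of the patch: it starts two fresh `cmd` processes back to back, and each one expands `%RANDOM%` on its own. On an affected system the two printed numbers are often identical, though the result depends on timing.

```batch
@echo off
rem Hypothetical reproduction, not part of this patch.
rem %%RANDOM%% is escaped so that each child cmd.exe expands it, not this script.
cmd /c echo %%RANDOM%%
cmd /c echo %%RANDOM%%
```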

We should make sure to generate a unique output file for each SparkLauncher invocation on Windows to avoid this issue.
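
The approach can be sketched as a retry loop: pick a file name, and pick again if a file with that name already exists. The variable name `OUT` and the quoting around the path are assumptions made for this illustration. Checking for the file and then creating it is not atomic, so this narrows the collision window rather than closing it completely.

```batch
@echo off
rem Illustrative sketch; OUT is a hypothetical variable name.
:gen
set "OUT=%TEMP%\spark-class-launcher-output-%RANDOM%.txt"
rem If a file with this name already exists, draw a new name.
if exist "%OUT%" goto :gen
echo Output file: %OUT%
```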

## How was this patch tested?

Tested manually on Windows.

Closes #25076 from Ngone51/SPARK-28302.

Authored-by: wuyi <ngone_5451@163.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-07-09 15:49:31 +09:00


@echo off
rem
rem Licensed to the Apache Software Foundation (ASF) under one or more
rem contributor license agreements. See the NOTICE file distributed with
rem this work for additional information regarding copyright ownership.
rem The ASF licenses this file to You under the Apache License, Version 2.0
rem (the "License"); you may not use this file except in compliance with
rem the License. You may obtain a copy of the License at
rem
rem http://www.apache.org/licenses/LICENSE-2.0
rem
rem Unless required by applicable law or agreed to in writing, software
rem distributed under the License is distributed on an "AS IS" BASIS,
rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
rem See the License for the specific language governing permissions and
rem limitations under the License.
rem
rem Figure out where the Spark framework is installed
call "%~dp0find-spark-home.cmd"
call "%SPARK_HOME%\bin\load-spark-env.cmd"
rem Test that an argument was given
if "x%1"=="x" (
  echo Usage: spark-class ^<class^> [^<args^>]
  exit /b 1
)
rem Find Spark jars.
if exist "%SPARK_HOME%\jars" (
  set SPARK_JARS_DIR="%SPARK_HOME%\jars"
) else (
  set SPARK_JARS_DIR="%SPARK_HOME%\assembly\target\scala-%SPARK_SCALA_VERSION%\jars"
)
if not exist "%SPARK_JARS_DIR%"\ (
  echo Failed to find Spark jars directory.
  echo You need to build Spark before running this program.
  exit /b 1
)
set LAUNCH_CLASSPATH=%SPARK_JARS_DIR%\*
rem Add the launcher build dir to the classpath if requested.
if not "x%SPARK_PREPEND_CLASSES%"=="x" (
  set LAUNCH_CLASSPATH="%SPARK_HOME%\launcher\target\scala-%SPARK_SCALA_VERSION%\classes;%LAUNCH_CLASSPATH%"
)
rem Figure out where java is.
set RUNNER=java
if not "x%JAVA_HOME%"=="x" (
  set RUNNER=%JAVA_HOME%\bin\java
) else (
  where /q "%RUNNER%"
  if ERRORLEVEL 1 (
    echo Java not found and JAVA_HOME environment variable is not set.
    echo Install Java and set JAVA_HOME to point to the Java installation directory.
    exit /b 1
  )
)
rem The launcher library prints the command to be executed in a single line suitable for being
rem executed by the batch interpreter. So read all the output of the launcher into a variable.
:gen
set LAUNCHER_OUTPUT=%temp%\spark-class-launcher-output-%RANDOM%.txt
rem SPARK-28302: %RANDOM% returns the same number when it is read again immediately after the
rem previous call, so make sure the file name is unique to keep concurrent processes from
rem writing to the same file.
if exist %LAUNCHER_OUTPUT% goto :gen
"%RUNNER%" -Xmx128m -cp "%LAUNCH_CLASSPATH%" org.apache.spark.launcher.Main %* > %LAUNCHER_OUTPUT%
for /f "tokens=*" %%i in (%LAUNCHER_OUTPUT%) do (
  set SPARK_CMD=%%i
)
del %LAUNCHER_OUTPUT%
%SPARK_CMD%