…build; fix some things that will be warnings or errors in 2.12; restore Scala 2.12 profile infrastructure
## What changes were proposed in this pull request?
This change adds back the infrastructure for a Scala 2.12 build, but does not enable it in the release or Python test scripts.
In order to make that meaningful, it also resolves compile errors that the code hits in 2.12 only, in a way that still works with 2.11.
It also updates dependencies to the earliest minor release of dependencies whose current version does not yet support Scala 2.12. This is in a sense covered by other JIRAs under the main umbrella, but implemented here. The versions below still work with 2.11, and are the _latest_ maintenance release in the _earliest_ viable minor release.
- Scalatest 2.x -> 3.0.3
- Chill 0.8.0 -> 0.8.4
- Clapper 1.0.x -> 1.1.2
- json4s 3.2.x -> 3.4.2
- Jackson 2.6.x -> 2.7.9 (required by json4s)
This change does _not_ fully enable a Scala 2.12 build:
- It will also require dropping support for Kafka before 0.10. Easy enough, just didn't do it yet here
- It will require recreating `SparkILoop` and `Main` for REPL 2.12, which is SPARK-14650. Possible to do here too.
What it does do is make changes that resolve much of the remaining gap without affecting the current 2.11 build.
## How was this patch tested?
Existing tests and build. Manually tested with `./dev/change-scala-version.sh 2.12` to verify it compiles, modulo the exceptions above.
Author: Sean Owen <sowen@cloudera.com>
Closes#18645 from srowen/SPARK-14280.
History Server Launch uses SparkClassCommandBuilder for launching the server. It is observed that SPARK_CLASSPATH has been removed and deprecated. For spark-submit this takes a different route and spark.driver.extraClasspath takes care of specifying additional jars in the classpath that were previously specified in the SPARK_CLASSPATH. Right now the only way specify the additional jars for launching daemons such as history server is using SPARK_DIST_CLASSPATH (https://spark.apache.org/docs/latest/hadoop-provided.html) but this I presume is a distribution classpath. It would be nice to have a similar config like spark.driver.extraClasspath for launching daemons similar to history server.
Added new environment variable SPARK_DAEMON_CLASSPATH to set classpath for launching daemons. Tested and verified for History Server and Standalone Mode.
## How was this patch tested?
Initially, history server start script would fail for the reason being that it could not find the required jars for launching the server in the java classpath. Same was true for running Master and Worker in standalone mode. By adding the environment variable SPARK_DAEMON_CLASSPATH to the java classpath, both the daemons(History Server, Standalone daemons) are starting up and running.
Author: pgandhi <pgandhi@yahoo-inc.com>
Author: pgandhi999 <parthkgandhi9@gmail.com>
Closes#19047 from pgandhi999/master.
This is a follow up to cba826d0; that commit set the app handle state
to "LOST" when the child process exited, but that can be ambiguous. This
change sets the state to "FAILED" if the exit code was non-zero and
the handle state wasn't a failure state, or "LOST" if the exit status
was zero.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#19012 from vanzin/SPARK-17742.
## What changes were proposed in this pull request?
Fix build warnings and Java lint errors. This just helps a bit in evaluating (new) warnings in another PR I have open.
## How was this patch tested?
Existing tests
Author: Sean Owen <sowen@cloudera.com>
Closes#19051 from srowen/JavaWarnings.
There're two code in Launcher and SparkSubmit will will explicitly list all the Spark submodules, newly added kvstore module is missing in this two parts, so submitting a minor PR to fix this.
Author: jerryshao <sshao@hortonworks.com>
Closes#19014 from jerryshao/missing-kvstore.
Currently the launcher handle does not monitor the child spark-submit
process it launches; this means that if the child exits with an error,
the handle's state will never change, and an application will not know
that the application has failed.
This change adds code to monitor the child process, and changes the
handle state appropriately when the child process exits.
Tested with added unit tests.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#18877 from vanzin/SPARK-17742.
## What changes were proposed in this pull request?
Resources in Core - SparkSubmitArguments.scala, Spark-launcher - AbstractCommandBuilder.java, resource-managers- YARN - Client.scala are released
## How was this patch tested?
No new test cases added, Unit test have been passed
Author: vinodkc <vinod.kc.in@gmail.com>
Closes#18880 from vinodkc/br_fixresouceleak.
The code was failing to account for some cases when setting up log
redirection. For example, if a user redirected only stdout to a file,
the launcher code would leave stderr without redirection, which could
lead to child processes getting stuck because stderr wasn't being
read.
So detect cases where only one of the streams is redirected, and
redirect the other stream to the log as appropriate.
For the old "launch()" API, redirection of the unconfigured stream
only happens if the user has explicitly requested for log redirection.
Log redirection is on by default with "startApplication()".
Most of the change is actually adding new unit tests to make sure the
different cases work as expected. As part of that, I moved some tests
that were in the core/ module to the launcher/ module instead, since
they don't depend on spark-submit.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#18696 from vanzin/SPARK-21490.
Handle the case where the server closes the socket before the full message
has been written by the client.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#18727 from vanzin/SPARK-21522.
## What changes were proposed in this pull request?
- Remove Scala 2.10 build profiles and support
- Replace some 2.10 support in scripts with commented placeholders for 2.12 later
- Remove deprecated API calls from 2.10 support
- Remove usages of deprecated context bounds where possible
- Remove Scala 2.10 workarounds like ScalaReflectionLock
- Other minor Scala warning fixes
## How was this patch tested?
Existing tests
Author: Sean Owen <sowen@cloudera.com>
Closes#17150 from srowen/SPARK-19810.
Blindly deserializing classes using Java serialization opens the code up to
issues in other libraries, since just deserializing data from a stream may
end up execution code (think readObject()).
Since the launcher protocol is pretty self-contained, there's just a handful
of classes it legitimately needs to deserialize, and they're in just two
packages, so add a filter that throws errors if classes from any other
package show up in the stream.
This also maintains backwards compatibility (the updated launcher code can
still communicate with the backend code in older Spark releases).
Tested with new and existing unit tests.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#18166 from vanzin/SPARK-20922.
Restore code that was removed as part of SPARK-17979, but instead of
using the deprecated env variable name to propagate the class path, use
a new one.
Verified by running "./bin/spark-class o.a.s.executor.CoarseGrainedExecutorBackend"
manually.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#18037 from vanzin/SPARK-20814.
## What changes were proposed in this pull request?
Remove redundant check for libdir in CommandBuilderUtils
## How was this patch tested?
Existing tests
Author: Sean Owen <sowen@cloudera.com>
Closes#18032 from srowen/SPARK-20806.
## What changes were proposed in this pull request?
Convert tests to use Java 8 lambdas, and modest related fixes to surrounding code.
## How was this patch tested?
Jenkins tests
Author: Sean Owen <sowen@cloudera.com>
Closes#16964 from srowen/SPARK-19534.
- Move external/java8-tests tests into core, streaming, sql and remove
- Remove MaxPermGen and related options
- Fix some reflection / TODOs around Java 8+ methods
- Update doc references to 1.7/1.8 differences
- Remove Java 7/8 related build profiles
- Update some plugins for better Java 8 compatibility
- Fix a few Java-related warnings
For the future:
- Update Java 8 examples to fully use Java 8
- Update Java tests to use lambdas for simplicity
- Update Java internal implementations to use lambdas
## How was this patch tested?
Existing tests
Author: Sean Owen <sowen@cloudera.com>
Closes#16871 from srowen/SPARK-19493.
## What changes were proposed in this pull request?
remove ununsed imports and outdated comments, and fix some minor code style issue.
## How was this patch tested?
existing ut
Author: uncleGen <hustyugm@gmail.com>
Closes#16591 from uncleGen/SPARK-19227.
Remove spark-tag's compile-scope dependency (and, indirectly, spark-core's compile-scope transitive-dependency) on scalatest by splitting test-oriented tags into spark-tags' test JAR.
Alternative to #16303.
Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes#16311 from ryan-williams/tt.
## What changes were proposed in this pull request?
Currently, some tests are being failed and hanging on Windows due to this problem. For the reason in SPARK-18718, some tests using `local-cluster` mode were disabled on Windows due to the length limitation by paths given to classpaths.
The limitation seems roughly 32K (see the [blog in MS](https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/) and [another reference](https://support.thoughtworks.com/hc/en-us/articles/213248526-Getting-around-maximum-command-line-length-is-32767-characters-on-Windows)) but in `local-cluster` mode, executors were being launched as processes with the command such as [here](https://gist.github.com/HyukjinKwon/5bc81061c250d4af5a180869b59d42ea) in (only) tests.
This length is roughly 40K due to the classpaths given to `java` command. However, it seems duplicates are almost half of them. So, if we deduplicate the paths, it seems reduced to roughly 20K with the command, [here](https://gist.github.com/HyukjinKwon/dad0c8db897e5e094684a2dc6a417790).
Maybe, we should consider as some more paths are added in the future but it seems better than disabling all the tests for now with minimised changes.
Therefore, this PR proposes to deduplicate the paths in classpaths in case of launching executors as processes in `local-cluster` mode.
## How was this patch tested?
Existing tests in `ShuffleSuite` and `BroadcastJoinSuite` manually via AppVeyor
Author: hyukjinkwon <gurwls223@gmail.com>
Closes#16266 from HyukjinKwon/disable-local-cluster-tests.
These directories are added to the classpath of applications when testing or
using SPARK_PREPEND_CLASSES, otherwise updated classes are not seen. Also,
add the mesos directory which was missing.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#16202 from vanzin/SPARK-18662.
## What changes were proposed in this pull request?
This patch bumps master branch version to 2.2.0-SNAPSHOT.
## How was this patch tested?
N/A
Author: Reynold Xin <rxin@databricks.com>
Closes#16126 from rxin/SPARK-18695.
## What changes were proposed in this pull request?
This PR aims to provide a pip installable PySpark package. This does a bunch of work to copy the jars over and package them with the Python code (to prevent challenges from trying to use different versions of the Python code with different versions of the JAR). It does not currently publish to PyPI but that is the natural follow up (SPARK-18129).
Done:
- pip installable on conda [manual tested]
- setup.py installed on a non-pip managed system (RHEL) with YARN [manual tested]
- Automated testing of this (virtualenv)
- packaging and signing with release-build*
Possible follow up work:
- release-build update to publish to PyPI (SPARK-18128)
- figure out who owns the pyspark package name on prod PyPI (is it someone with in the project or should we ask PyPI or should we choose a different name to publish with like ApachePySpark?)
- Windows support and or testing ( SPARK-18136 )
- investigate details of wheel caching and see if we can avoid cleaning the wheel cache during our test
- consider how we want to number our dev/snapshot versions
Explicitly out of scope:
- Using pip installed PySpark to start a standalone cluster
- Using pip installed PySpark for non-Python Spark programs
*I've done some work to test release-build locally but as a non-committer I've just done local testing.
## How was this patch tested?
Automated testing with virtualenv, manual testing with conda, a system wide install, and YARN integration.
release-build changes tested locally as a non-committer (no testing of upload artifacts to Apache staging websites)
Author: Holden Karau <holden@us.ibm.com>
Author: Juliet Hougland <juliet@cloudera.com>
Author: Juliet Hougland <not@myemail.com>
Closes#15659 from holdenk/SPARK-1267-pip-install-pyspark.
## What changes were proposed in this pull request?
Allow user to set sparkr shell command through --conf spark.r.shell.command
## How was this patch tested?
Unit test is added and also verify it manually through
```
bin/sparkr --master yarn-client --conf spark.r.shell.command=/usr/local/bin/R
```
Author: Jeff Zhang <zjffdu@apache.org>
Closes#14744 from zjffdu/SPARK-17178.
Before this PR, user have to export environment variable to specify the python of driver & executor which is not so convenient for users. This PR is trying to allow user to specify python through configuration "--pyspark-driver-python" & "--pyspark-executor-python"
Manually test in local & yarn mode for pyspark-shell and pyspark batch mode.
Author: Jeff Zhang <zjffdu@apache.org>
Closes#13146 from zjffdu/SPARK-13081.
## What changes were proposed in this pull request?
Adds a few public methods to `SparkLauncher` to allow configuring some extra features of the `ProcessBuilder`, including the working directory, output and error stream redirection.
## How was this patch tested?
Unit testing + simple Spark driver programs
Author: Andrew Duffy <root@aduffy.org>
Closes#14201 from andreweduffy/feature/launcher.
https://issues.apache.org/jira/browse/SPARK-16535
## What changes were proposed in this pull request?
When I scan through the pom.xml of sub projects, I found this warning as below and attached screenshot
```
Definition of groupId is redundant, because it's inherited from the parent
```
![screen shot 2016-07-13 at 3 13 11 pm](https://cloud.githubusercontent.com/assets/3925641/16823121/744f893e-4916-11e6-8a52-042f83b9db4e.png)
I've tried to remove some of the lines with groupId definition, and the build on my local machine is still ok.
```
<groupId>org.apache.spark</groupId>
```
As I just find now `<maven.version>3.3.9</maven.version>` is being used in Spark 2.x, and Maven-3 supports versionless parent elements: Maven 3 will remove the need to specify the parent version in sub modules. THIS is great (in Maven 3.1).
ref: http://stackoverflow.com/questions/3157240/maven-3-worth-it/3166762#3166762
## How was this patch tested?
I've tested by re-building the project, and build succeeded.
Author: Xin Ren <iamshrek@126.com>
Closes#14189 from keypointt/SPARK-16535.
## What changes were proposed in this pull request?
After SPARK-16476 (committed earlier today as #14128), we can finally bump the version number.
## How was this patch tested?
N/A
Author: Reynold Xin <rxin@databricks.com>
Closes#14130 from rxin/SPARK-16477.
## What changes were proposed in this pull request?
revived #13464
Fix Java Lint errors introduced by #13286 and #13280
Before:
```
Using `mvn` from path: /Users/pichu/Project/spark/build/apache-maven-3.3.9/bin/mvn
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Checkstyle checks failed at following occurrences:
[ERROR] src/main/java/org/apache/spark/launcher/LauncherServer.java:[340,5] (whitespace) FileTabCharacter: Line contains a tab character.
[ERROR] src/main/java/org/apache/spark/launcher/LauncherServer.java:[341,5] (whitespace) FileTabCharacter: Line contains a tab character.
[ERROR] src/main/java/org/apache/spark/launcher/LauncherServer.java:[342,5] (whitespace) FileTabCharacter: Line contains a tab character.
[ERROR] src/main/java/org/apache/spark/launcher/LauncherServer.java:[343,5] (whitespace) FileTabCharacter: Line contains a tab character.
[ERROR] src/main/java/org/apache/spark/sql/streaming/OutputMode.java:[41,28] (naming) MethodName: Method name 'Append' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
[ERROR] src/main/java/org/apache/spark/sql/streaming/OutputMode.java:[52,28] (naming) MethodName: Method name 'Complete' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
[ERROR] src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java:[61,8] (imports) UnusedImports: Unused import - org.apache.parquet.schema.PrimitiveType.
[ERROR] src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java:[62,8] (imports) UnusedImports: Unused import - org.apache.parquet.schema.Type.
```
## How was this patch tested?
ran `dev/lint-java` locally
Author: Sandeep Singh <sandeep@techaddict.me>
Closes#13559 from techaddict/minor-3.
## What changes were proposed in this pull request?
This situation can happen when the LauncherConnection gets an exception while reading through the socket and terminating silently without notifying making the client/listener think that the job is still in previous state.
The fix force sends a notification to client that the job finished with unknown status and let client handle it accordingly.
## How was this patch tested?
Added a unit test.
Author: Subroto Sanyal <ssanyal@datameer.com>
Closes#13497 from subrotosanyal/SPARK-15652-handle-spark-submit-jvm-crash.
## What changes were proposed in this pull request?
--kill and --status were not considered while handling in OptionParser and due to that it was failing. Now handling the --kill and --status options as part of OptionParser.handle.
## How was this patch tested?
Added a test org.apache.spark.launcher.SparkSubmitCommandBuilderSuite.testCliKillAndStatus() and also I have verified these manually by running --kill and --status commands.
Author: Devaraj K <devaraj@apache.org>
Closes#13407 from devaraj-kavali/SPARK-15665.
## What changes were proposed in this pull request?
More than 100 chars in line.
## How was this patch tested?
Author: Sandeep Singh <sandeep@techaddict.me>
Closes#13249 from techaddict/fix-1.
(Please fill in changes proposed in this fix)
In 2.0, ./bin/spark-submit doesn't print out usage, but it raises an exception.
In this PR, an exception handling is added in the Main.java when the exception is thrown. In the handling code, if there is no additional argument, it prints out usage.
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
Manually tested.
./bin/spark-submit
Usage: spark-submit [options] <app jar | python file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]
Options:
--master MASTER_URL spark://host:port, mesos://host:port, yarn, or local.
--deploy-mode DEPLOY_MODE Whether to launch the driver program locally ("client") or
on one of the worker machines inside the cluster ("cluster")
(Default: client).
--class CLASS_NAME Your application's main class (for Java / Scala apps).
--name NAME A name of your application.
--jars JARS Comma-separated list of local jars to include on the driver
and executor classpaths.
--packages Comma-separated list of maven coordinates of jars to include
on the driver and executor classpaths. Will search the local
maven repo, then maven central and any additional remote
repositories given by --repositories. The format for the
coordinates should be groupId:artifactId:version.
Author: wm624@hotmail.com <wm624@hotmail.com>
Closes#13163 from wangmiao1981/submit.
## What changes were proposed in this pull request?
(See https://github.com/apache/spark/pull/12416 where most of this was already reviewed and committed; this is just the module structure and move part. This change does not move the annotations into test scope, which was the apparently problem last time.)
Rename `spark-test-tags` -> `spark-tags`; move common annotations like `Since` to `spark-tags`
## How was this patch tested?
Jenkins tests.
Author: Sean Owen <sowen@cloudera.com>
Closes#13074 from srowen/SPARK-15290.
Without this, the code would build an invalid spark-submit command line,
and a more cryptic error would be presented to the user. Also, expose
a constant that allows users to set a dummy resource in cases where
they don't need an actual resource file; for backwards compatibility,
that uses the same "spark-internal" resource that Spark itself uses.
Tested via unit tests, run-example, spark-shell, and running the
thrift server with mixed spark and hive command line arguments.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#12909 from vanzin/SPARK-11249.
## What changes were proposed in this pull request?
Look for MaxPermSize arguments anywhere in an arg, to account for quoted args. See JIRA for discussion.
## How was this patch tested?
Jenkins tests
Author: Sean Owen <sowen@cloudera.com>
Closes#12985 from srowen/SPARK-15067.
There's actually a race here: the state of the handler was changed before
the connection was set, so the test code could be notified of the state
change, wake up, and still see the connection as null, triggering the assert.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#12785 from vanzin/SPARK-14391.
## What changes were proposed in this pull request?
This PR adds `since` tag into the matrix and vector classes in spark-mllib-local.
## How was this patch tested?
Scala-style checks passed.
Author: Pravin Gadakh <prgadakh@in.ibm.com>
Closes#12416 from pravingadakh/SPARK-14613.
## What changes were proposed in this pull request?
Currently Spark clients are started with the same memory setting for Xms and Xms leading to reserving unnecessary higher amounts of memory.
This behavior is changed and the clients can now specify an initial heap size using the extraJavaOptions in the config for driver,executor and am individually.
Note, that only -Xms can be provided through this config option, if the client wants to set the max size(-Xmx), this has to be done via the *.memory configuration knobs which are currently supported.
## How was this patch tested?
Monitored executor and yarn logs in debug mode to verify the commands through which they are being launched in client and cluster mode. The driver memory was verified locally using jps -v. Setting up -Xmx parameter in the javaExtraOptions raises exception with the info provided.
Author: Dhruve Ashar <dhruveashar@gmail.com>
Closes#12115 from dhruve/impr/SPARK-12384.
The current package name uses a dash, which is a little weird but seemed
to work. That is, until a new test tried to mock a class that references
one of those shaded types, and then things started failing.
Most changes are just noise to fix the logging configs.
For reference, SPARK-8815 also raised this issue, although at the time it
did not cause any issues in Spark, so it was not addressed.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#11941 from vanzin/SPARK-14134.
Most of the time tests should still pass really quickly; it's just
when machines are overloaded that the tests may take a little time,
but that's still preferable over just failing the test.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#12210 from vanzin/SPARK-14391.
This change modifies the "assembly/" module to just copy needed
dependencies to its build directory, and modifies the packaging
script to pick those up (and remove duplicate jars packages in the
examples module).
I also made some minor adjustments to dependencies to remove some
test jars from the final packaging, and remove jars that conflict with each
other when packaged separately (e.g. servlet api).
Also note that this change restores guava in applications' classpaths, even
though it's still shaded inside Spark. This is now needed for the Hadoop
libraries that are packaged with Spark, which now are not processed by
the shade plugin.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#11796 from vanzin/SPARK-13579.
## What changes were proposed in this pull request?
This PR contains the following 5 types of maintenance fix over 59 files (+94 lines, -93 lines).
- Fix typos(exception/log strings, testcase name, comments) in 44 lines.
- Fix lint-java errors (MaxLineLength) in 6 lines. (New codes after SPARK-14011)
- Use diamond operators in 40 lines. (New codes after SPARK-13702)
- Fix redundant semicolon in 5 lines.
- Rename class `InferSchemaSuite` to `CSVInferSchemaSuite` in CSVInferSchemaSuite.scala.
## How was this patch tested?
Manual and pass the Jenkins tests.
Author: Dongjoon Hyun <dongjoon@apache.org>
Closes#12139 from dongjoon-hyun/SPARK-14355.
Move the logic to find Spark jars to CommandBuilderUtils and make it
available for YARN code, so that it's possible to easily launch Spark
on YARN from a build directory.
Tested by running SparkPi from the build directory on YARN.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes#11970 from vanzin/SPARK-13955.
## What changes were proposed in this pull request?
[Spark Coding Style Guide](https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide) has 100-character limit on lines, but it's disabled for Java since 11/09/15. This PR enables **LineLength** checkstyle again. To help that, this also introduces **RedundantImport** and **RedundantModifier**, too. The following is the diff on `checkstyle.xml`.
```xml
- <!-- TODO: 11/09/15 disabled - the lengths are currently > 100 in many places -->
- <!--
<module name="LineLength">
<property name="max" value="100"/>
<property name="ignorePattern" value="^package.*|^import.*|a href|href|http://|https://|ftp://"/>
</module>
- -->
<module name="NoLineWrap"/>
<module name="EmptyBlock">
<property name="option" value="TEXT"/>
-167,5 +164,7
</module>
<module name="CommentsIndentation"/>
<module name="UnusedImports"/>
+ <module name="RedundantImport"/>
+ <module name="RedundantModifier"/>
```
## How was this patch tested?
Currently, `lint-java` is disabled in Jenkins. It needs a manual test.
After passing the Jenkins tests, `dev/lint-java` should passes locally.
Author: Dongjoon Hyun <dongjoon@apache.org>
Closes#11831 from dongjoon-hyun/SPARK-14011.
## What changes were proposed in this pull request?
I'm seeing several PR builder builds fail after https://github.com/apache/spark/pull/11725/files. Example:
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-master-test-maven-hadoop-2.4/lastFailedBuild/console
```
testCommunication(org.apache.spark.launcher.LauncherServerSuite) Time elapsed: 0.023 sec <<< FAILURE!
java.lang.AssertionError: expected:<app-id> but was:<null>
at org.apache.spark.launcher.LauncherServerSuite.testCommunication(LauncherServerSuite.java:93)
```
However, other builds pass this same test, including the test when run locally and on the Jenkins PR builder. The failure itself concerns a change to how the test waits on a condition, and the wait can time out; therefore I think this is due to fast/slow machine differences.
This is an attempt at a hot fix; it's a little hard to verify since locally and on the PR builder, it passes anyway. The change itself should be harmless anyway.
Why didn't this happen before, if the new logic was supposed to be equivalent to the old? I think this is the sequence:
- First attempt to acquire semaphore for 10ms actually silently times out
- The changed being waited for happens just after that, a bit too late
- Assertion passes since condition became true just in time
- `release()` fires from the listener
- Next `tryAcquire` however immediately succeeds because the first `tryAcquire` didn't acquire anything, but its subsequent condition is not yet true; this would explain why the second one always fails
Versus the original using `notifyAll()`, there's a small difference: `wait()`-ing after `notifyAll()` just results in another wait; it doesn't make it return immediately. So this was a tiny latent issue that was masked by the semantics. Now the test asserts that the event actually happened (semaphore was acquired). (The timeout is still here to prevent the test from hanging forever, and to detect really slow response.) The timeout is increased to a second to allow plenty of time anyway.
## How was this patch tested?
Jenkins tests
Author: Sean Owen <sowen@cloudera.com>
Closes#11763 from srowen/SPARK-13823.3.
## What changes were proposed in this pull request?
Follow up to https://github.com/apache/spark/pull/11657
- Also update `String.getBytes("UTF-8")` to use `StandardCharsets.UTF_8`
- And fix one last new Coverity warning that turned up (use of unguarded `wait()` replaced by simpler/more robust `java.util.concurrent` classes in tests)
- And while we're here cleaning up Coverity warnings, just fix about 15 more build warnings
## How was this patch tested?
Jenkins tests
Author: Sean Owen <sowen@cloudera.com>
Closes#11725 from srowen/SPARK-13823.2.