Commit graph

116 commits

Author SHA1 Message Date
Paul Mackles f5fe63f7b8 [SPARK-22287][MESOS] SPARK_DAEMON_MEMORY not honored by MesosClusterD…
…ispatcher

## What changes were proposed in this pull request?

Allow JVM max heap size to be controlled for MesosClusterDispatcher via SPARK_DAEMON_MEMORY environment variable.

## How was this patch tested?

Tested on local Mesos cluster

Author: Paul Mackles <pmackles@adobe.com>

Closes #19515 from pmackles/SPARK-22287.
2017-11-09 16:42:33 -08:00
Andrew Ash 5433be44ca [SPARK-21991][LAUNCHER][FOLLOWUP] Fix java lint
## What changes were proposed in this pull request?

Fix java lint

## How was this patch tested?

Run `./dev/lint-java`

Author: Andrew Ash <andrew@andrewash.com>

Closes #19574 from ash211/aash/fix-java-lint.
2017-10-25 14:41:02 -07:00
Andrea zito 6ea8a56ca2 [SPARK-21991][LAUNCHER] Fix race condition in LauncherServer#acceptConnections
## What changes were proposed in this pull request?
This patch changes the order in which _acceptConnections_ starts the client thread and schedules the client timeout action ensuring that the latter has been scheduled before the former get a chance to cancel it.

## How was this patch tested?
Due to the non-deterministic nature of the patch I wasn't able to add a new test for this issue.

Author: Andrea zito <andrea.zito@u-hopper.com>

Closes #19217 from nivox/SPARK-21991.
2017-10-25 10:10:24 -07:00
Pavel Sakun dadd13f365 [SPARK] Misleading error message for missing --proxy-user value
Fix misleading error message when argument is expected.

## What changes were proposed in this pull request?

Change message to be accurate.

## How was this patch tested?

Messaging change, was tested manually.

Author: Pavel Sakun <pavel.sakun@gmail.com>

Closes #19457 from pavel-sakun/patch-1.
2017-10-09 23:00:04 +01:00
Armin b6ef1f57bc [SPARK-21970][CORE] Fix Redundant Throws Declarations in Java Codebase
## What changes were proposed in this pull request?

1. Removing all redundant throws declarations from Java codebase.
2. Removing dead code made visible by this from `ShuffleExternalSorter#closeAndGetSpills`

## How was this patch tested?

Build still passes.

Author: Armin <me@obrown.io>

Closes #19182 from original-brownbear/SPARK-21970.
2017-09-13 14:04:26 +01:00
caoxuewen ca00cc70d6 [SPARK-21963][CORE][TEST] Create temp file should be delete after use
## What changes were proposed in this pull request?

After you create a temporary table, you need to delete it, otherwise it will leave a file similar to the file name ‘SPARK194465907929586320484966temp’.

## How was this patch tested?

N / A

Author: caoxuewen <cao.xuewen@zte.com.cn>

Closes #19174 from heary-cao/DeleteTempFile.
2017-09-13 13:01:30 +01:00
Sean Owen 12ab7f7e89 [SPARK-14280][BUILD][WIP] Update change-version.sh and pom.xml to add Scala 2.12 profiles and enable 2.12 compilation
…build; fix some things that will be warnings or errors in 2.12; restore Scala 2.12 profile infrastructure

## What changes were proposed in this pull request?

This change adds back the infrastructure for a Scala 2.12 build, but does not enable it in the release or Python test scripts.

In order to make that meaningful, it also resolves compile errors that the code hits in 2.12 only, in a way that still works with 2.11.

It also updates dependencies to the earliest minor release of dependencies whose current version does not yet support Scala 2.12. This is in a sense covered by other JIRAs under the main umbrella, but implemented here. The versions below still work with 2.11, and are the _latest_ maintenance release in the _earliest_ viable minor release.

- Scalatest 2.x -> 3.0.3
- Chill 0.8.0 -> 0.8.4
- Clapper 1.0.x -> 1.1.2
- json4s 3.2.x -> 3.4.2
- Jackson 2.6.x -> 2.7.9 (required by json4s)

This change does _not_ fully enable a Scala 2.12 build:

- It will also require dropping support for Kafka before 0.10. Easy enough, just didn't do it yet here
- It will require recreating `SparkILoop` and `Main` for REPL 2.12, which is SPARK-14650. Possible to do here too.

What it does do is make changes that resolve much of the remaining gap without affecting the current 2.11 build.

## How was this patch tested?

Existing tests and build. Manually tested with `./dev/change-scala-version.sh 2.12` to verify it compiles, modulo the exceptions above.

Author: Sean Owen <sowen@cloudera.com>

Closes #18645 from srowen/SPARK-14280.
2017-09-01 19:21:21 +01:00
pgandhi 24e6c187fb [SPARK-21798] No config to replace deprecated SPARK_CLASSPATH config for launching daemons like History Server
History Server Launch uses SparkClassCommandBuilder for launching the server. It is observed that SPARK_CLASSPATH has been removed and deprecated. For spark-submit this takes a different route and spark.driver.extraClasspath takes care of specifying additional jars in the classpath that were previously specified in the SPARK_CLASSPATH. Right now the only way specify the additional jars for launching daemons such as history server is using SPARK_DIST_CLASSPATH (https://spark.apache.org/docs/latest/hadoop-provided.html) but this I presume is a distribution classpath. It would be nice to have a similar config like spark.driver.extraClasspath for launching daemons similar to history server.

Added new environment variable SPARK_DAEMON_CLASSPATH to set classpath for launching daemons. Tested and verified for History Server and Standalone Mode.

## How was this patch tested?
Initially, history server start script would fail for the reason being that it could not find the required jars for launching the server in the java classpath. Same was true for running Master and Worker in standalone mode. By adding the environment variable SPARK_DAEMON_CLASSPATH to the java classpath, both the daemons(History Server, Standalone daemons) are starting up and running.

Author: pgandhi <pgandhi@yahoo-inc.com>
Author: pgandhi999 <parthkgandhi9@gmail.com>

Closes #19047 from pgandhi999/master.
2017-08-28 08:51:22 -05:00
Marcelo Vanzin 628bdeabda [SPARK-17742][CORE] Fail launcher app handle if child process exits with error.
This is a follow up to cba826d0; that commit set the app handle state
to "LOST" when the child process exited, but that can be ambiguous. This
change sets the state to "FAILED" if the exit code was non-zero and
the handle state wasn't a failure state, or "LOST" if the exit status
was zero.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #19012 from vanzin/SPARK-17742.
2017-08-25 10:04:21 -07:00
Sean Owen de7af295c2 [MINOR][BUILD] Fix build warnings and Java lint errors
## What changes were proposed in this pull request?

Fix build warnings and Java lint errors. This just helps a bit in evaluating (new) warnings in another PR I have open.

## How was this patch tested?

Existing tests

Author: Sean Owen <sowen@cloudera.com>

Closes #19051 from srowen/JavaWarnings.
2017-08-25 16:07:13 +01:00
jerryshao 3ed1ae1005 [SPARK-20641][CORE] Add missing kvstore module in Laucher and SparkSubmit code
There're two code in Launcher and SparkSubmit will will explicitly list all the Spark submodules, newly added kvstore module is missing in this two parts, so submitting a minor PR to fix this.

Author: jerryshao <sshao@hortonworks.com>

Closes #19014 from jerryshao/missing-kvstore.
2017-08-22 10:14:45 -07:00
Marcelo Vanzin cba826d001 [SPARK-17742][CORE] Handle child process exit in SparkLauncher.
Currently the launcher handle does not monitor the child spark-submit
process it launches; this means that if the child exits with an error,
the handle's state will never change, and an application will not know
that the application has failed.

This change adds code to monitor the child process, and changes the
handle state appropriately when the child process exits.

Tested with added unit tests.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #18877 from vanzin/SPARK-17742.
2017-08-15 11:26:29 -07:00
vinodkc 83fe3b5e10 [SPARK-21665][CORE] Need to close resources after use
## What changes were proposed in this pull request?
Resources in Core - SparkSubmitArguments.scala, Spark-launcher - AbstractCommandBuilder.java, resource-managers- YARN - Client.scala are released

## How was this patch tested?
No new test cases added, Unit test have been passed

Author: vinodkc <vinod.kc.in@gmail.com>

Closes #18880 from vinodkc/br_fixresouceleak.
2017-08-09 15:18:38 +02:00
Marcelo Vanzin 9456176da0 [SPARK-21490][CORE] Make sure SparkLauncher redirects needed streams.
The code was failing to account for some cases when setting up log
redirection. For example, if a user redirected only stdout to a file,
the launcher code would leave stderr without redirection, which could
lead to child processes getting stuck because stderr wasn't being
read.

So detect cases where only one of the streams is redirected, and
redirect the other stream to the log as appropriate.

For the old "launch()" API, redirection of the unconfigured stream
only happens if the user has explicitly requested for log redirection.
Log redirection is on by default with "startApplication()".

Most of the change is actually adding new unit tests to make sure the
different cases work as expected. As part of that, I moved some tests
that were in the core/ module to the launcher/ module instead, since
they don't depend on spark-submit.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #18696 from vanzin/SPARK-21490.
2017-08-02 12:05:55 -07:00
Marcelo Vanzin b133501800 [SPARK-21522][CORE] Fix flakiness in LauncherServerSuite.
Handle the case where the server closes the socket before the full message
has been written by the client.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #18727 from vanzin/SPARK-21522.
2017-08-01 10:06:03 -07:00
Sean Owen d3f4a21196 [SPARK-15526][ML][FOLLOWUP] Make JPMML provided scope to avoid including unshaded JARs, and repromote to compile in MLlib
Following the comment at https://issues.apache.org/jira/browse/SPARK-15526?focusedCommentId=16086106&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16086106 -- this change actually needed a little more work to be complete.

This also marks JPMML as `provided` to make sure its JARs aren't included in the `jars` output, but then scopes to `compile` in `mllib`. This is how Guava is handled.

Checked result in `assembly/target/scala-2.11/jars` to verify there are no JPMML jars. Maven and SBT builds still work.

Author: Sean Owen <sowen@cloudera.com>

Closes #18637 from srowen/SPARK-15526.2.
2017-07-18 09:53:51 -07:00
Sean Owen 425c4ada4c [SPARK-19810][BUILD][CORE] Remove support for Scala 2.10
## What changes were proposed in this pull request?

- Remove Scala 2.10 build profiles and support
- Replace some 2.10 support in scripts with commented placeholders for 2.12 later
- Remove deprecated API calls from 2.10 support
- Remove usages of deprecated context bounds where possible
- Remove Scala 2.10 workarounds like ScalaReflectionLock
- Other minor Scala warning fixes

## How was this patch tested?

Existing tests

Author: Sean Owen <sowen@cloudera.com>

Closes #17150 from srowen/SPARK-19810.
2017-07-13 17:06:24 +08:00
Marcelo Vanzin 8efc6e9865 [SPARK-20922][CORE] Add whitelist of classes that can be deserialized by the launcher.
Blindly deserializing classes using Java serialization opens the code up to
issues in other libraries, since just deserializing data from a stream may
end up execution code (think readObject()).

Since the launcher protocol is pretty self-contained, there's just a handful
of classes it legitimately needs to deserialize, and they're in just two
packages, so add a filter that throws errors if classes from any other
package show up in the stream.

This also maintains backwards compatibility (the updated launcher code can
still communicate with the backend code in older Spark releases).

Tested with new and existing unit tests.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #18166 from vanzin/SPARK-20922.
2017-06-01 14:44:34 -07:00
Marcelo Vanzin df64fa79d6 [SPARK-20814][MESOS] Restore support for spark.executor.extraClassPath.
Restore code that was removed as part of SPARK-17979, but instead of
using the deprecated env variable name to propagate the class path, use
a new one.

Verified by running "./bin/spark-class o.a.s.executor.CoarseGrainedExecutorBackend"
manually.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #18037 from vanzin/SPARK-20814.
2017-05-22 12:34:15 -07:00
Sean Owen bbd8d7def1 [SPARK-20806][DEPLOY] Launcher: redundant check for Spark lib dir
## What changes were proposed in this pull request?

Remove redundant check for libdir in CommandBuilderUtils

## How was this patch tested?

Existing tests

Author: Sean Owen <sowen@cloudera.com>

Closes #18032 from srowen/SPARK-20806.
2017-05-20 15:27:13 +01:00
Josh Rosen f44c8a843c [SPARK-20453] Bump master branch version to 2.3.0-SNAPSHOT
This patch bumps the master branch version to `2.3.0-SNAPSHOT`.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #17753 from JoshRosen/SPARK-20453.
2017-04-24 21:48:04 -07:00
Yong Tang 8f0490e22b [SPARK-17979][SPARK-14453] Remove deprecated SPARK_YARN_USER_ENV and SPARK_JAVA_OPTS
This fix removes deprecated support for config `SPARK_YARN_USER_ENV`, as is mentioned in SPARK-17979.
This fix also removes deprecated support for the following:
```
SPARK_YARN_USER_ENV
SPARK_JAVA_OPTS
SPARK_CLASSPATH
SPARK_WORKER_INSTANCES
```

Related JIRA:
[SPARK-14453]: https://issues.apache.org/jira/browse/SPARK-14453
[SPARK-12344]: https://issues.apache.org/jira/browse/SPARK-12344
[SPARK-15781]: https://issues.apache.org/jira/browse/SPARK-15781

Existing tests should pass.

Author: Yong Tang <yong.tang.github@outlook.com>

Closes #17212 from yongtang/SPARK-17979.
2017-03-10 13:34:01 -08:00
Sean Owen 1487c9af20
[SPARK-19534][TESTS] Convert Java tests to use lambdas, Java 8 features
## What changes were proposed in this pull request?

Convert tests to use Java 8 lambdas, and modest related fixes to surrounding code.

## How was this patch tested?

Jenkins tests

Author: Sean Owen <sowen@cloudera.com>

Closes #16964 from srowen/SPARK-19534.
2017-02-19 09:42:50 -08:00
Sean Owen 0e2405490f
[SPARK-19550][BUILD][CORE][WIP] Remove Java 7 support
- Move external/java8-tests tests into core, streaming, sql and remove
- Remove MaxPermGen and related options
- Fix some reflection / TODOs around Java 8+ methods
- Update doc references to 1.7/1.8 differences
- Remove Java 7/8 related build profiles
- Update some plugins for better Java 8 compatibility
- Fix a few Java-related warnings

For the future:

- Update Java 8 examples to fully use Java 8
- Update Java tests to use lambdas for simplicity
- Update Java internal implementations to use lambdas

## How was this patch tested?

Existing tests

Author: Sean Owen <sowen@cloudera.com>

Closes #16871 from srowen/SPARK-19493.
2017-02-16 12:32:45 +00:00
uncleGen eefdf9f9dd
[SPARK-19227][SPARK-19251] remove unused imports and outdated comments
## What changes were proposed in this pull request?
remove ununsed imports and outdated comments, and fix some minor code style issue.

## How was this patch tested?
existing ut

Author: uncleGen <hustyugm@gmail.com>

Closes #16591 from uncleGen/SPARK-19227.
2017-01-18 09:44:32 +00:00
Ryan Williams afd9bc1d8a [SPARK-17807][CORE] split test-tags into test-JAR
Remove spark-tag's compile-scope dependency (and, indirectly, spark-core's compile-scope transitive-dependency) on scalatest by splitting test-oriented tags into spark-tags' test JAR.

Alternative to #16303.

Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #16311 from ryan-williams/tt.
2016-12-21 16:37:20 -08:00
hyukjinkwon c6b8eb71a9
[SPARK-18842][TESTS][LAUNCHER] De-duplicate paths in classpaths in commands for local-cluster mode to work around the path length limitation on Windows
## What changes were proposed in this pull request?

Currently, some tests are being failed and hanging on Windows due to this problem. For the reason in SPARK-18718, some tests using `local-cluster` mode were disabled on Windows due to the length limitation by paths given to classpaths.

The limitation seems roughly 32K (see the [blog in MS](https://blogs.msdn.microsoft.com/oldnewthing/20031210-00/?p=41553/) and [another reference](https://support.thoughtworks.com/hc/en-us/articles/213248526-Getting-around-maximum-command-line-length-is-32767-characters-on-Windows)) but in `local-cluster` mode, executors were being launched as processes with the command such as [here](https://gist.github.com/HyukjinKwon/5bc81061c250d4af5a180869b59d42ea) in (only) tests.

This length is roughly 40K due to the classpaths given to `java` command. However, it seems duplicates are almost half of them. So, if we deduplicate the paths, it seems reduced to roughly 20K with the command, [here](https://gist.github.com/HyukjinKwon/dad0c8db897e5e094684a2dc6a417790).

Maybe, we should consider as some more paths are added in the future but it seems better than disabling all the tests for now with minimised changes.

Therefore, this PR proposes to deduplicate the paths in classpaths in case of launching executors as processes in `local-cluster` mode.

## How was this patch tested?

Existing tests in `ShuffleSuite` and `BroadcastJoinSuite` manually via AppVeyor

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #16266 from HyukjinKwon/disable-local-cluster-tests.
2016-12-14 19:24:24 +00:00
Marcelo Vanzin b44d1b8fcf [SPARK-18662][HOTFIX] Add new resource-managers directories to SparkLauncher.
These directories are added to the classpath of applications when testing or
using SPARK_PREPEND_CLASSES, otherwise updated classes are not seen. Also,
add the mesos directory which was missing.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #16202 from vanzin/SPARK-18662.
2016-12-08 09:48:33 -08:00
Reynold Xin c7c7265950 [SPARK-18695] Bump master branch version to 2.2.0-SNAPSHOT
## What changes were proposed in this pull request?
This patch bumps master branch version to 2.2.0-SNAPSHOT.

## How was this patch tested?
N/A

Author: Reynold Xin <rxin@databricks.com>

Closes #16126 from rxin/SPARK-18695.
2016-12-02 21:09:37 -08:00
Holden Karau a36a76ac43 [SPARK-1267][SPARK-18129] Allow PySpark to be pip installed
## What changes were proposed in this pull request?

This PR aims to provide a pip installable PySpark package. This does a bunch of work to copy the jars over and package them with the Python code (to prevent challenges from trying to use different versions of the Python code with different versions of the JAR). It does not currently publish to PyPI but that is the natural follow up (SPARK-18129).

Done:
- pip installable on conda [manual tested]
- setup.py installed on a non-pip managed system (RHEL) with YARN [manual tested]
- Automated testing of this (virtualenv)
- packaging and signing with release-build*

Possible follow up work:
- release-build update to publish to PyPI (SPARK-18128)
- figure out who owns the pyspark package name on prod PyPI (is it someone with in the project or should we ask PyPI or should we choose a different name to publish with like ApachePySpark?)
- Windows support and or testing ( SPARK-18136 )
- investigate details of wheel caching and see if we can avoid cleaning the wheel cache during our test
- consider how we want to number our dev/snapshot versions

Explicitly out of scope:
- Using pip installed PySpark to start a standalone cluster
- Using pip installed PySpark for non-Python Spark programs

*I've done some work to test release-build locally but as a non-committer I've just done local testing.
## How was this patch tested?

Automated testing with virtualenv, manual testing with conda, a system wide install, and YARN integration.

release-build changes tested locally as a non-committer (no testing of upload artifacts to Apache staging websites)

Author: Holden Karau <holden@us.ibm.com>
Author: Juliet Hougland <juliet@cloudera.com>
Author: Juliet Hougland <not@myemail.com>

Closes #15659 from holdenk/SPARK-1267-pip-install-pyspark.
2016-11-16 14:22:15 -08:00
Jeff Zhang fa6347938f [SPARK-17178][SPARKR][SPARKSUBMIT] Allow to set sparkr shell command through --conf
## What changes were proposed in this pull request?

Allow user to set sparkr shell command through --conf spark.r.shell.command

## How was this patch tested?

Unit test is added and also verify it manually through
```
bin/sparkr --master yarn-client --conf spark.r.shell.command=/usr/local/bin/R
```

Author: Jeff Zhang <zjffdu@apache.org>

Closes #14744 from zjffdu/SPARK-17178.
2016-08-31 00:20:41 -07:00
Jeff Zhang 7a9e25c383 [SPARK-13081][PYSPARK][SPARK_SUBMIT] Allow set pythonExec of driver and executor through conf…
Before this PR, user have to export environment variable to specify the python of driver & executor which is not so convenient for users. This PR is trying to allow user to specify python through configuration "--pyspark-driver-python" & "--pyspark-executor-python"

Manually test in local & yarn mode for pyspark-shell and pyspark batch mode.

Author: Jeff Zhang <zjffdu@apache.org>

Closes #13146 from zjffdu/SPARK-13081.
2016-08-11 20:08:39 -07:00
Andrew Duffy 004e29cba5 [SPARK-14702] Make environment of SparkLauncher launched process more configurable
## What changes were proposed in this pull request?

Adds a few public methods to `SparkLauncher` to allow configuring some extra features of the `ProcessBuilder`, including the working directory, output and error stream redirection.

## How was this patch tested?

Unit testing + simple Spark driver programs

Author: Andrew Duffy <root@aduffy.org>

Closes #14201 from andreweduffy/feature/launcher.
2016-07-19 17:08:38 -07:00
Xin Ren 21a6dd2aef [SPARK-16535][BUILD] In pom.xml, remove groupId which is redundant definition and inherited from the parent
https://issues.apache.org/jira/browse/SPARK-16535

## What changes were proposed in this pull request?

When I scan through the pom.xml of sub projects, I found this warning as below and attached screenshot
```
Definition of groupId is redundant, because it's inherited from the parent
```
![screen shot 2016-07-13 at 3 13 11 pm](https://cloud.githubusercontent.com/assets/3925641/16823121/744f893e-4916-11e6-8a52-042f83b9db4e.png)

I've tried to remove some of the lines with groupId definition, and the build on my local machine is still ok.
```
<groupId>org.apache.spark</groupId>
```
As I just find now `<maven.version>3.3.9</maven.version>` is being used in Spark 2.x, and Maven-3 supports versionless parent elements: Maven 3 will remove the need to specify the parent version in sub modules. THIS is great (in Maven 3.1).

ref: http://stackoverflow.com/questions/3157240/maven-3-worth-it/3166762#3166762

## How was this patch tested?

I've tested by re-building the project, and build succeeded.

Author: Xin Ren <iamshrek@126.com>

Closes #14189 from keypointt/SPARK-16535.
2016-07-19 11:59:46 +01:00
Reynold Xin ffcb6e055a [SPARK-16477] Bump master version to 2.1.0-SNAPSHOT
## What changes were proposed in this pull request?
After SPARK-16476 (committed earlier today as #14128), we can finally bump the version number.

## How was this patch tested?
N/A

Author: Reynold Xin <rxin@databricks.com>

Closes #14130 from rxin/SPARK-16477.
2016-07-11 09:42:56 -07:00
Sandeep Singh f958c1c3e2 [MINOR] Fix Java Lint errors introduced by #13286 and #13280
## What changes were proposed in this pull request?

revived #13464

Fix Java Lint errors introduced by #13286 and #13280
Before:
```
Using `mvn` from path: /Users/pichu/Project/spark/build/apache-maven-3.3.9/bin/mvn
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Checkstyle checks failed at following occurrences:
[ERROR] src/main/java/org/apache/spark/launcher/LauncherServer.java:[340,5] (whitespace) FileTabCharacter: Line contains a tab character.
[ERROR] src/main/java/org/apache/spark/launcher/LauncherServer.java:[341,5] (whitespace) FileTabCharacter: Line contains a tab character.
[ERROR] src/main/java/org/apache/spark/launcher/LauncherServer.java:[342,5] (whitespace) FileTabCharacter: Line contains a tab character.
[ERROR] src/main/java/org/apache/spark/launcher/LauncherServer.java:[343,5] (whitespace) FileTabCharacter: Line contains a tab character.
[ERROR] src/main/java/org/apache/spark/sql/streaming/OutputMode.java:[41,28] (naming) MethodName: Method name 'Append' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
[ERROR] src/main/java/org/apache/spark/sql/streaming/OutputMode.java:[52,28] (naming) MethodName: Method name 'Complete' must match pattern '^[a-z][a-z0-9][a-zA-Z0-9_]*$'.
[ERROR] src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java:[61,8] (imports) UnusedImports: Unused import - org.apache.parquet.schema.PrimitiveType.
[ERROR] src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java:[62,8] (imports) UnusedImports: Unused import - org.apache.parquet.schema.Type.
```

## How was this patch tested?
ran `dev/lint-java` locally

Author: Sandeep Singh <sandeep@techaddict.me>

Closes #13559 from techaddict/minor-3.
2016-06-08 14:51:00 +01:00
Subroto Sanyal c409e23abd [SPARK-15652][LAUNCHER] Added a new State (LOST) for the listeners of SparkLauncher
## What changes were proposed in this pull request?
This situation can happen when the LauncherConnection gets an exception while reading through the socket and terminating silently without notifying making the client/listener think that the job is still in previous state.
The fix force sends a notification to client that the job finished with unknown status and let client handle it accordingly.

## How was this patch tested?
Added a unit test.

Author: Subroto Sanyal <ssanyal@datameer.com>

Closes #13497 from subrotosanyal/SPARK-15652-handle-spark-submit-jvm-crash.
2016-06-06 16:05:40 -07:00
Devaraj K efd3b11a47 [SPARK-15665][CORE] spark-submit --kill and --status are not working
## What changes were proposed in this pull request?
--kill and --status were not considered while handling in OptionParser and due to that it was failing. Now handling the --kill and --status options as part of OptionParser.handle.

## How was this patch tested?
Added a test org.apache.spark.launcher.SparkSubmitCommandBuilderSuite.testCliKillAndStatus() and also I have verified these manually by running --kill and --status commands.

Author: Devaraj K <devaraj@apache.org>

Closes #13407 from devaraj-kavali/SPARK-15665.
2016-06-03 13:03:28 -07:00
Sandeep Singh 3eff65f82d [MINOR] More than 100 chars in line in SparkSubmitCommandBuilderSuite
## What changes were proposed in this pull request?
More than 100 chars in line.

## How was this patch tested?

Author: Sandeep Singh <sandeep@techaddict.me>

Closes #13249 from techaddict/fix-1.
2016-05-22 09:19:28 -05:00
wm624@hotmail.com fe2fcb4803 [SPARK-15360][SPARK-SUBMIT] Should print spark-submit usage when no arguments is specified
(Please fill in changes proposed in this fix)
In 2.0, ./bin/spark-submit doesn't print out usage, but it raises an exception.
In this PR, an exception handling is added in the Main.java when the exception is thrown. In the handling code, if there is no additional argument, it prints out usage.

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
Manually tested.
./bin/spark-submit
Usage: spark-submit [options] <app jar | python file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]

Options:
  --master MASTER_URL         spark://host:port, mesos://host:port, yarn, or local.
  --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
                              on one of the worker machines inside the cluster ("cluster")
                              (Default: client).
  --class CLASS_NAME          Your application's main class (for Java / Scala apps).
  --name NAME                 A name of your application.
  --jars JARS                 Comma-separated list of local jars to include on the driver
                              and executor classpaths.
  --packages                  Comma-separated list of maven coordinates of jars to include
                              on the driver and executor classpaths. Will search the local
                              maven repo, then maven central and any additional remote
                              repositories given by --repositories. The format for the
                              coordinates should be groupId:artifactId:version.

Author: wm624@hotmail.com <wm624@hotmail.com>

Closes #13163 from wangmiao1981/submit.
2016-05-20 10:28:24 -07:00
Sean Owen 122302cbf5 [SPARK-15290][BUILD] Move annotations, like @Since / @DeveloperApi, into spark-tags
## What changes were proposed in this pull request?

(See https://github.com/apache/spark/pull/12416 where most of this was already reviewed and committed; this is just the module structure and move part. This change does not move the annotations into test scope, which was the apparently problem last time.)

Rename `spark-test-tags` -> `spark-tags`; move common annotations like `Since` to `spark-tags`

## How was this patch tested?

Jenkins tests.

Author: Sean Owen <sowen@cloudera.com>

Closes #13074 from srowen/SPARK-15290.
2016-05-17 09:55:53 +01:00
Marcelo Vanzin 0b9cae4242 [SPARK-11249][LAUNCHER] Throw error if app resource is not provided.
Without this, the code would build an invalid spark-submit command line,
and a more cryptic error would be presented to the user. Also, expose
a constant that allows users to set a dummy resource in cases where
they don't need an actual resource file; for backwards compatibility,
that uses the same "spark-internal" resource that Spark itself uses.

Tested via unit tests, run-example, spark-shell, and running the
thrift server with mixed spark and hive command line arguments.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #12909 from vanzin/SPARK-11249.
2016-05-10 10:35:54 -07:00
Sean Owen 6747171eb1 [SPARK-15067][YARN] YARN executors are launched with fixed perm gen size
## What changes were proposed in this pull request?

Look for MaxPermSize arguments anywhere in an arg, to account for quoted args. See JIRA for discussion.

## How was this patch tested?

Jenkins tests

Author: Sean Owen <sowen@cloudera.com>

Closes #12985 from srowen/SPARK-15067.
2016-05-09 11:10:36 -07:00
Marcelo Vanzin 73c20bf325 [SPARK-14391][LAUNCHER] Fix launcher communication test, take 2.
There's actually a race here: the state of the handler was changed before
the connection was set, so the test code could be notified of the state
change, wake up, and still see the connection as null, triggering the assert.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #12785 from vanzin/SPARK-14391.
2016-04-29 23:13:50 -07:00
Reynold Xin 24d07e45d4 [HOTFIX] Disable flaky LauncherServerSuite.testCommunication 2016-04-29 00:20:52 -07:00
Yin Huai 9c7c42bc6a Revert "[SPARK-14613][ML] Add @Since into the matrix and vector classes in spark-mllib-local"
This reverts commit dae538a4d7.
2016-04-28 19:57:41 -07:00
Pravin Gadakh dae538a4d7 [SPARK-14613][ML] Add @Since into the matrix and vector classes in spark-mllib-local
## What changes were proposed in this pull request?

This PR adds `since` tag into the matrix and vector classes in spark-mllib-local.

## How was this patch tested?

Scala-style checks passed.

Author: Pravin Gadakh <prgadakh@in.ibm.com>

Closes #12416 from pravingadakh/SPARK-14613.
2016-04-28 15:59:18 -07:00
Dhruve Ashar 033d808152 [SPARK-12384] Enables spark-clients to set the min(-Xms) and max(*.memory config) j…
## What changes were proposed in this pull request?

Currently Spark clients are started with the same memory setting for Xms and Xms leading to reserving unnecessary higher amounts of memory.
This behavior is changed and the clients can now specify an initial heap size using the extraJavaOptions in the config for driver,executor and am individually.
 Note, that only -Xms can be provided through this config option, if the client wants to set the max size(-Xmx), this has to be done via the *.memory configuration knobs which are currently supported.

## How was this patch tested?

Monitored executor and yarn logs in debug mode to verify the commands through which they are being launched in client and cluster mode. The driver memory was verified locally using jps -v. Setting up -Xmx parameter in the javaExtraOptions raises exception with the info provided.

Author: Dhruve Ashar <dhruveashar@gmail.com>

Closes #12115 from dhruve/impr/SPARK-12384.
2016-04-07 10:39:21 -05:00
Marcelo Vanzin 21d5ca128b [SPARK-14134][CORE] Change the package name used for shading classes.
The current package name uses a dash, which is a little weird but seemed
to work. That is, until a new test tried to mock a class that references
one of those shaded types, and then things started failing.

Most changes are just noise to fix the logging configs.

For reference, SPARK-8815 also raised this issue, although at the time it
did not cause any issues in Spark, so it was not addressed.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #11941 from vanzin/SPARK-14134.
2016-04-06 19:33:51 -07:00
Marcelo Vanzin de4792605a [SPARK-14391][LAUNCHER] Increase test timeouts.
Most of the time tests should still pass really quickly; it's just
when machines are overloaded that the tests may take a little time,
but that's still preferable over just failing the test.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #12210 from vanzin/SPARK-14391.
2016-04-06 15:45:03 -07:00