## What changes were proposed in this pull request?
Add stripXSS and stripXSSMap to Spark Core's UIUtils, and call these functions wherever getParameter is called on an HttpServletRequest.
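As a rough illustration of the idea (not the actual UIUtils code), the sketch below sanitizes request parameters in Python. The names `strip_xss` and `strip_xss_map` and the exact set of stripped characters are assumptions made for this example.

```python
import html
import re

# Assumption: newlines (raw or percent-encoded) and single quotes are dropped
# before HTML-escaping. The real rules in UIUtils may differ.
_UNSAFE = re.compile(r"\r\n|\n|\r|%0D%0A|%0A|%0D|'|%27", re.IGNORECASE)


def strip_xss(value):
    """Return a sanitized copy of one request parameter (None stays None)."""
    if value is None:
        return None
    return html.escape(_UNSAFE.sub("", value), quote=True)


def strip_xss_map(params):
    """Sanitize every value in a mapping of parameter name -> list of values."""
    return {name: [strip_xss(v) for v in values] for name, values in params.items()}
```

A caller would wrap every raw parameter read, for example `strip_xss(raw_value)`, before echoing it into a page.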
## How was this patch tested?
Unit tests, IBM Security AppScan Standard no longer showing vulnerabilities, manual verification of WebUI pages.
Author: NICHOLAS T. MARION <nmarion@us.ibm.com>
Closes #17686 from n-marion/xss-fix.
## What changes were proposed in this pull request?
After SPARK-10997, the client-mode Netty RpcEnv no longer needs to start a server, so the port configurations are no longer used. This PR proposes removing these two configurations: "spark.executor.port" and "spark.am.port".
## How was this patch tested?
Existing UTs.
Author: jerryshao <sshao@hortonworks.com>
Closes #17866 from jerryshao/SPARK-20605.
Signed-off-by: liuxian <liu.xian3@zte.com.cn>
## What changes were proposed in this pull request?
When the input parameter is null, a runtime exception may occur.
## How was this patch tested?
Existing unit tests
Author: liuxian <liu.xian3@zte.com.cn>
Closes #17796 from 10110346/wip_lx_0428.
## What changes were proposed in this pull request?
Add a test case for scenarios where executor.cores is set as a (non)divisor of spark.cores.max.
This tests the change in #17786.
## How was this patch tested?
Ran the existing test suite with the new tests
dbtsai
Author: Davis Shepherd <dshepherd@netflix.com>
Closes #17788 from dgshep/add_mesos_test.
## What changes were proposed in this pull request?
Set maxCores to be a multiple of the smallest executor that can be launched. This ensures that we correctly detect the condition where no more executors will be launched when spark.cores.max is not a multiple of spark.executor.cores.
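The rounding described above can be sketched as follows. This is a minimal illustration in Python, not the scheduler backend code; the function name `effective_max_cores` and the default of one core per executor are assumptions.

```python
def effective_max_cores(cores_max, executor_cores=None):
    """Round cores_max down to a multiple of the smallest launchable executor.

    Assumption: when executor_cores is unset, the smallest executor uses 1 core.
    """
    min_cores_per_executor = executor_cores if executor_cores else 1
    return cores_max - (cores_max % min_cores_per_executor)
```

With `spark.cores.max=10` and `spark.executor.cores=4`, the effective cap becomes 8, so the scheduler can tell that no further executor fits once two have been launched.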
## How was this patch tested?
This was manually tested with other sample frameworks measuring their incoming offers to determine if starvation would occur.
dbtsai mgummelt
Author: Davis Shepherd <dshepherd@netflix.com>
Closes #17786 from dgshep/fix_mesos_max_cores.
## What changes were proposed in this pull request?
The 'Submitted Time' field in the Running Drivers and Completed Drivers tables of the master web UI **needs to be formatted**.
Before the fix, e.g.:
Completed Drivers

| Submission ID | **Submitted Time** | Worker | State | Cores | Memory | Main Class |
|---|---|---|---|---|---|---|
| driver-20170419145755-0005 | **Wed Apr 19 14:57:55 CST 2017** | worker-20170419145250-zdh120-40412 | FAILED | 1 | 1024.0 MB | cn.zte.HdfsTest |

Please see the attachment: https://issues.apache.org/jira/secure/attachment/12863977/before_fix.png
After the fix, e.g.:
Completed Drivers

| Submission ID | **Submitted Time** | Worker | State | Cores | Memory | Main Class |
|---|---|---|---|---|---|---|
| driver-20170419145755-0006 | **2017/04/19 16:01:25** | worker-20170419145250-zdh120-40412 | FAILED | 1 | 1024.0 MB | cn.zte.HdfsTest |

Please see the attachment: https://issues.apache.org/jira/secure/attachment/12863976/after_fix.png
The 'Submitted Time' field in the Running Applications and Completed Applications tables of the master web UI **has already been formatted**, **which is correct.**
e.g.
Running Applications

| Application ID | Name | Cores | Memory per Executor | **Submitted Time** | User | State | Duration |
|---|---|---|---|---|---|---|---|
| app-20170419160910-0000 (kill) | SparkSQL::10.43.183.120 | 1 | 5.0 GB | **2017/04/19 16:09:10** | root | RUNNING | 53 s |

**The formatted time is easier to read and consistent with the applications table, so I think it's worth fixing.**
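For reference, the target format shown in the tables above corresponds to the pattern year/month/day hour:minute:second. A minimal Python sketch of that formatting (the helper name `format_date` is an assumption for this example, not the web UI code):

```python
from datetime import datetime


def format_date(timestamp):
    """Format a datetime the way the applications table displays it."""
    return timestamp.strftime("%Y/%m/%d %H:%M:%S")


# Example value taken from the "after fix" row above.
print(format_date(datetime(2017, 4, 19, 16, 1, 25)))
```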
## How was this patch tested?
Please review http://spark.apache.org/contributing.html before opening a pull request.
Author: 郭小龙 10207633 <guo.xiaolong1@zte.com.cn>
Author: guoxiaolong <guo.xiaolong1@zte.com.cn>
Author: guoxiaolongzte <guo.xiaolong1@zte.com.cn>
Closes #17682 from guoxiaolongzte/SPARK-20385.
## What changes were proposed in this pull request?
Allow passing arbitrary parameters to Docker when launching Spark executors on Mesos with the Docker containerizer. tnachen
## How was this patch tested?
Manually built and tested with passed in parameter
Author: Ji Yan <jiyan@Jis-MacBook-Air.local>
Closes #17109 from yanji84/ji/allow_set_docker_user.
## What changes were proposed in this pull request?
This PR proposes running Spark unidoc to test the Javadoc 8 build, since the Javadoc 8 build is easily broken again.
There are several problems with it:
- It adds a little extra time to the test run. In my case, it took about 1.5 minutes more (`Elapsed :[94.8746569157]`). How it was tested is described in "How was this patch tested?".
- > One problem that I noticed was that Unidoc appeared to be processing test sources: if we can find a way to exclude those from being processed in the first place then that might significantly speed things up.
(see joshrosen's [comment](https://issues.apache.org/jira/browse/SPARK-18692?focusedCommentId=15947627&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15947627))
To complete this automated build, this PR also proposes fixing existing Javadoc breaks, including ones introduced by test code as described above.
These fixes are similar to instances that were fixed previously. Please refer to https://github.com/apache/spark/pull/15999 and https://github.com/apache/spark/pull/16013
Note that this only fixes **errors**, not **warnings**. Please see my observation in https://github.com/apache/spark/pull/17389#issuecomment-288438704 about spurious errors caused by warnings.
## How was this patch tested?
Manually via `jekyll build` for building tests. Also, tested via running `./dev/run-tests`.
This was tested via manually adding `time.time()` as below:
```diff
     profiles_and_goals = build_profiles + sbt_goals
     print("[info] Building Spark unidoc (w/Hive 1.2.1) using SBT with these arguments: ",
           " ".join(profiles_and_goals))
+    import time
+    st = time.time()
     exec_sbt(profiles_and_goals)
+    print("Elapsed :[%s]" % str(time.time() - st))
```
produces
```
...
========================================================================
Building Unidoc API Documentation
========================================================================
...
[info] Main Java API documentation successful.
...
Elapsed :[94.8746569157]
...
```
Author: hyukjinkwon <gurwls223@gmail.com>
Closes #17477 from HyukjinKwon/SPARK-18692.
## What changes were proposed in this pull request?
Add the spark.mesos.task.labels configuration option to attach Mesos key:value labels to the executor.
The format is "k1:v1,k2:v2": a colon separates each key from its value, and commas separate multiple labels.
Discussion of labels with mgummelt at #17404
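To make the label format concrete, here is a small Python sketch of parsing such a string. It is an illustration only: the function name `parse_task_labels` is hypothetical, and treating any entry that is not exactly one non-empty key and one non-empty value as "incorrect" is an assumption.

```python
def parse_task_labels(spec):
    """Parse "k1:v1,k2:v2" into a dict, skipping malformed entries."""
    labels = {}
    for pair in spec.split(","):
        parts = pair.split(":")
        # Assumption: an entry is valid only as exactly "key:value",
        # with both sides non-empty; anything else is ignored.
        if len(parts) == 2 and parts[0] and parts[1]:
            labels[parts[0]] = parts[1]
    return labels
```

For example, `parse_task_labels("k1:v1,bad,k2:v2")` keeps `k1` and `k2` and drops the malformed `bad` entry.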
## How was this patch tested?
Added unit tests to verify that labels are added correctly and that incorrect labels are ignored, and added a test for the name of the executor.
Tested with: `./build/sbt -Pmesos mesos/test`
Author: Kalvin Chau <kalvin.chau@viasat.com>
Closes #17413 from kalvinnchau/mesos-labels.
## What changes were proposed in this pull request?
Adding configurable mesos executor names and labels using `spark.mesos.task.name` and `spark.mesos.task.labels`.
Labels were defined as `k1:v1,k2:v2`.
mgummelt
## How was this patch tested?
Added unit tests to verify that labels are added correctly and that incorrect labels are ignored, and added a test for the name of the executor.
Tested with: `./build/sbt -Pmesos mesos/test`
Author: Kalvin Chau <kalvin.chau@viasat.com>
Closes #17404 from kalvinnchau/mesos-config.
This commit adds a killTaskAttempt method to SparkContext, to allow users to
kill tasks so that they can be re-scheduled elsewhere.
This also refactors the task kill path to allow specifying a reason for the task kill. The reason is propagated opaquely through events, and will show up in the UI automatically as `(N killed: $reason)` and `TaskKilled: $reason`. Without this change, there is no way to provide the user feedback through the UI.
Currently used reasons are "stage cancelled", "another attempt succeeded", and "killed via SparkContext.killTask". The user can also specify a custom reason through `SparkContext.killTask`.
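As a rough picture of how per-reason summaries such as `(N killed: $reason)` could be derived from a list of kill reasons, consider the Python sketch below. The function name and the way multiple reasons are joined are assumptions for illustration; the actual UI code may group and order them differently.

```python
from collections import Counter


def summarize_kill_reasons(reasons):
    """Build "(N killed: reason)" fragments, one per distinct reason."""
    counts = Counter(reasons)
    # Assumption: fragments are sorted by reason text and comma-separated.
    return ", ".join(
        f"({count} killed: {reason})" for reason, count in sorted(counts.items())
    )
```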
cc rxin
In the stage overview UI the reasons are summarized:
![1](https://cloud.githubusercontent.com/assets/14922/23929209/a83b2862-08e1-11e7-8b3e-ae1967bbe2e5.png)
Within the stage UI you can see individual task kill reasons:
![2](https://cloud.githubusercontent.com/assets/14922/23929200/9a798692-08e1-11e7-8697-72b27ad8a287.png)
Existing tests, tried killing some stages in the UI and verified the messages are as expected.
Author: Eric Liang <ekl@databricks.com>
Author: Eric Liang <ekl@google.com>
Closes #17166 from ericl/kill-reason.
## What changes were proposed in this pull request?
Fixup typo in comment.
## How was this patch tested?
Not needed.
Author: Ye Yin <eyniy@qq.com>
Closes #17396 from hustcat/fix.
## What changes were proposed in this pull request?
Increase default refuse_seconds timeout, and make it configurable. See JIRA for details on how this reduces the risk of starvation.
## How was this patch tested?
Unit tests, manual testing, and the Mesos/Spark integration test suite
cc susanxhuynh skonto jmlvanre
Author: Michael Gummelt <mgummelt@mesosphere.io>
Closes #17031 from mgummelt/SPARK-19702-suppress-revive.
## What changes were proposed in this pull request?
See JIRA
## How was this patch tested?
Unit tests, Mesos/Spark integration tests
cc skonto susanxhuynh
Author: Michael Gummelt <mgummelt@mesosphere.io>
Closes #17045 from mgummelt/SPARK-19373-registered-resources.
## What changes were proposed in this pull request?
Adding the default UncaughtExceptionHandler to the MesosClusterDispatcher.
## How was this patch tested?
I verified it manually: when any of the dispatcher threads hits an uncaught exception, the default UncaughtExceptionHandler handles it.
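The dispatcher change itself is on the JVM, but the same pattern can be illustrated with a Python analog, `threading.excepthook` (available since Python 3.8), which plays a role similar to a default uncaught exception handler. This is a sketch of the general technique, not the dispatcher code; `default_handler`, `run_with_handler`, and `failing_task` are hypothetical names, and the handler here only records the failure.

```python
import threading

uncaught = []  # (thread name, exception type name) pairs seen by the handler


def default_handler(args):
    """Process-wide hook invoked for exceptions that escape a thread's run()."""
    uncaught.append((args.thread.name, args.exc_type.__name__))


def failing_task():
    raise ValueError("boom")


def run_with_handler(target, name):
    """Run target in a thread with default_handler installed, then restore."""
    previous = threading.excepthook
    threading.excepthook = default_handler
    try:
        worker = threading.Thread(target=target, name=name)
        worker.start()
        worker.join()
    finally:
        threading.excepthook = previous
```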
Author: Devaraj K <devaraj@apache.org>
Closes #13072 from devaraj-kavali/SPARK-15288.
## What changes were proposed in this pull request?
`askSync` is already added in `RpcEndpointRef` (see SPARK-19347 and https://github.com/apache/spark/pull/16690#issuecomment-276850068) and `askWithRetry` is marked as deprecated.
As mentioned in SPARK-18113 (https://github.com/apache/spark/pull/16503#event-927953218):
> askWithRetry is basically an unneeded API, and a leftover from the akka days that doesn't make sense anymore. It's prone to cause deadlocks (exactly because it's blocking), it imposes restrictions on the caller (e.g. idempotency) and other things that people generally don't pay that much attention to when using it.
Since `askWithRetry` is only used inside Spark and not in user logic, it might make sense to replace all usages with `askSync`.
## How was this patch tested?
This PR doesn't change code logic; the existing unit tests cover it.
Author: jinxing <jinxing@meituan.com>
Closes #16790 from jinxing64/SPARK-19450.
## What changes were proposed in this pull request?
Handle the Spark exception thrown for an invalid job configuration by marking that job as failed and continuing to launch the other drivers, instead of letting the exception propagate.
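The control flow described above can be sketched as a loop that isolates each launch failure. This Python sketch is illustrative only; `InvalidJobConfig`, `launch_drivers`, and `demo_launch` are hypothetical names, not the dispatcher's API.

```python
class InvalidJobConfig(Exception):
    """Stand-in for the exception raised on an invalid job configuration."""


def demo_launch(driver):
    # Hypothetical launcher: drivers whose id starts with "bad" are misconfigured.
    if driver.startswith("bad"):
        raise InvalidJobConfig(f"invalid configuration for {driver}")


def launch_drivers(queued, launch):
    """Launch each queued driver; record failures instead of aborting the loop."""
    launched, failed = [], []
    for driver in queued:
        try:
            launch(driver)
            launched.append(driver)
        except InvalidJobConfig as err:
            # Mark this driver as failed and keep going with the rest.
            failed.append((driver, str(err)))
    return launched, failed
```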
## How was this patch tested?
I verified manually that misconfigured jobs now move to the Finished Drivers section in the UI and that the other jobs continue to launch.
Author: Devaraj K <devaraj@apache.org>
Closes #13077 from devaraj-kavali/SPARK-10748.
## What changes were proposed in this pull request?
After switching to Apache Parquet 1.8.2, `ParquetAvroCompatibilitySuite` fails in the **Maven** test. This is because `org.apache.parquet.avro.AvroParquetWriter`, which the test code uses, depends on a class that is new in `avro 1.8.0`, `LogicalType`. This PR aims to fix the test dependency of the `sql/core` module to use avro 1.8.0.
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/2530/consoleFull
```
ParquetAvroCompatibilitySuite:
*** RUN ABORTED ***
java.lang.NoClassDefFoundError: org/apache/avro/LogicalType
at org.apache.parquet.avro.AvroParquetWriter.writeSupport(AvroParquetWriter.java:144)
```
## How was this patch tested?
Pass the existing test with **Maven**.
```
$ build/mvn -Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver test
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:07 h
[INFO] Finished at: 2017-02-04T05:41:43+00:00
[INFO] Final Memory: 77M/987M
[INFO] ------------------------------------------------------------------------
```
Author: Dongjoon Hyun <dongjoon@apache.org>
Closes #16795 from dongjoon-hyun/SPARK-19409-2.
This change introduces a new auth mechanism to the transport library,
to be used when users enable strong encryption. This auth mechanism
has better security than the currently used DIGEST-MD5.
The new protocol uses symmetric key encryption to mutually authenticate
the endpoints, and is very loosely based on ISO/IEC 9798.
The new protocol falls back to SASL when it thinks the remote end is old.
Because SASL does not support asking the server for multiple auth protocols,
which would mean we could re-use the existing SASL code by just adding a
new SASL provider, the protocol is implemented outside of the SASL API
to avoid the boilerplate of adding a new provider.
Details of the auth protocol are discussed in the included README.md
file.
This change partly undoes the changes added in SPARK-13331; AES encryption
is now decoupled from SASL authentication. The encryption code itself,
though, has been re-used as part of this change.
## How was this patch tested?
- Unit tests
- Tested Spark 2.2 against Spark 1.6 shuffle service with SASL enabled
- Tested Spark 2.2 against Spark 2.2 shuffle service with SASL fallback disabled
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes #16521 from vanzin/SPARK-19139.
In the existing code, there are three layers of serialization
involved in sending a task from the scheduler to an executor:
- A Task object is serialized
- The Task object is copied to a byte buffer that also
contains serialized information about any additional JARs,
files, and Properties needed for the task to execute. This
byte buffer is stored as the member variable serializedTask
in the TaskDescription class.
- The TaskDescription is serialized (in addition to the serialized
task + JARs, the TaskDescription class contains the task ID and
other metadata) and sent in a LaunchTask message.
While it *is* necessary to have two layers of serialization, so that
the JAR, file, and Property info can be deserialized prior to
deserializing the Task object, the third layer of deserialization is
unnecessary. This commit eliminates a layer of serialization by moving
the JARs, files, and Properties into the TaskDescription class.
This commit also serializes the Properties manually (by traversing the map),
as is done with the JARs and files, which reduces the final serialized size.
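To illustrate what "serializing the Properties manually (by traversing the map)" can look like, here is a minimal Python sketch that writes a count followed by length-prefixed UTF-8 keys and values. The wire layout and the function names are assumptions for this example and are not the TaskDescription encoding.

```python
import struct


def encode_properties(props):
    """Encode a str->str map as: count, then (len, bytes) for each key and value."""
    out = bytearray(struct.pack(">I", len(props)))
    for key, value in props.items():
        for text in (key, value):
            raw = text.encode("utf-8")
            out += struct.pack(">I", len(raw)) + raw
    return bytes(out)


def decode_properties(data):
    """Inverse of encode_properties."""
    (count,) = struct.unpack_from(">I", data, 0)
    offset = 4
    props = {}
    for _ in range(count):
        pair = []
        for _ in range(2):
            (size,) = struct.unpack_from(">I", data, offset)
            offset += 4
            pair.append(data[offset:offset + size].decode("utf-8"))
            offset += size
        props[pair[0]] = pair[1]
    return props
```

Walking the map this way writes only the entries themselves, avoiding the class metadata that a generic object serializer would include.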
Unit tests
This is a simpler alternative to the approach proposed in #15505.
shivaram and I did some benchmarking of this and #15505 on 20 m2.4xlarge EC2 machines (160 cores). We ran ~30 trials of code [1] (a very simple job with 10K tasks per stage) and measured the average time per stage:
Before this change: 2490ms
With this change: 2345 ms (so ~6% improvement over the baseline)
With witgo's approach in #15505: 2046 ms (~18% improvement over baseline)
The reason that #15505 has a more significant improvement is that it also moves the serialization from the TaskSchedulerImpl thread to the CoarseGrainedSchedulerBackend thread. I added that functionality on top of this change, and got almost the same improvement [1] as #15505 (average of 2103ms). I think we should decouple these two changes, both so we have some record of the improvement from each individual change, and because this change is more about simplifying the code base (the improvement is negligible) while the other is about performance improvement. The plan, currently, is to merge this PR and then merge the remaining part of #15505 that moves serialization.
[1] The reason the improvement wasn't quite as good as with #15505 when we ran the benchmarks is almost certainly because, at the point when we ran the benchmarks, I hadn't updated the code to manually serialize the Properties (instead the code was using Java's default serialization for the Properties object, whereas #15505 manually serialized the Properties). This PR has since been updated to manually serialize the Properties, just like the other maps.
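The percentages quoted in the benchmark results above follow from the relative reduction in average stage time against the 2490 ms baseline. A small Python check (the helper name `improvement_pct` is just for this example):

```python
def improvement_pct(baseline_ms, new_ms):
    """Relative reduction in time, as a percentage of the baseline."""
    return 100.0 * (baseline_ms - new_ms) / baseline_ms


# Inputs are the averages reported above.
print(round(improvement_pct(2490, 2345)))  # this change
print(round(improvement_pct(2490, 2046)))  # approach in #15505
```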
Author: Kay Ousterhout <kayousterhout@gmail.com>
Closes #16053 from kayousterhout/SPARK-17931.
## What changes were proposed in this pull request?
Killed applications are no longer added for retry.
## How was this patch tested?
I verified manually in a Mesos cluster that, with these changes, killed applications move to the Finished Drivers section and are not retried.
Author: Devaraj K <devaraj@apache.org>
Closes #13323 from devaraj-kavali/SPARK-15555.