Commit graph

7433 commits

Author SHA1 Message Date
07ARB 4f9d3dc6ba [SPARK-30384][WEBUI] Needs to improve the Column name and Add tooltips for the Fair Scheduler Pool Table
### What changes were proposed in this pull request?
Improve the column name and add tooltips for the Fair Scheduler Pool Table.

### Why are the changes needed?
Correct the SchedulingMode column name to 'Scheduling Mode', and add tooltips for Minimum Share, Pool Weight and Scheduling Mode (meaningful tooltips are required for the end user to understand these columns).

### Does this PR introduce any user-facing change?
YES
![Screenshot 2020-01-03 at 10 10 47 AM](https://user-images.githubusercontent.com/8948111/71707687-7aee9800-2e11-11ea-93cc-52df0b9114dd.png)

### How was this patch tested?
Manual Testing.

Closes #27047 from 07ARB/SPARK-30384.

Lead-authored-by: 07ARB <ankitrajboudh@gmail.com>
Co-authored-by: Ankitraj <8948111+07ARB@users.noreply.github.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
2020-01-02 21:13:15 -08:00
Wang Shuo 10cae04108 [SPARK-30285][CORE] Fix deadlock between LiveListenerBus#stop and AsyncEventQueue#removeListenerOnError
### What changes were proposed in this pull request?

There is a deadlock between `LiveListenerBus#stop` and `AsyncEventQueue#removeListenerOnError`.

We can reproduce as follows:

1. Post some events to `LiveListenerBus`
2. Call `LiveListenerBus#stop` and hold the synchronized lock of `bus` (5e92301723/core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala (L229)), waiting until all the events are processed by listeners, then remove all the queues
3. The event queue drains events by posting them to its listeners. If a listener is interrupted, it calls `AsyncEventQueue#removeListenerOnError`, which in turn calls `bus.removeListener` (7b1b60c758/core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala (L207)) and tries to acquire the synchronized lock of `bus`, resulting in deadlock

This PR removes the `synchronized` from `LiveListenerBus.stop` because the underlying data structures are themselves thread-safe.
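A minimal, self-contained Scala sketch of the lock-ordering cycle described above (illustrative only, not Spark code); running it hangs, which is exactly the cycle the PR breaks by dropping the outer `synchronized`:

```scala
// Illustrative deadlock sketch: "stop" holds the bus lock while waiting for the
// dispatcher thread, and the dispatcher needs the same lock to remove a listener.
object ListenerBusDeadlockSketch {
  private val busLock = new Object
  @volatile private var draining = true

  def main(args: Array[String]): Unit = {
    val dispatcher = new Thread(() => {
      // Simulates AsyncEventQueue#removeListenerOnError: it needs the bus lock.
      busLock.synchronized { draining = false }
    })
    busLock.synchronized {              // simulates the old synchronized LiveListenerBus#stop
      dispatcher.start()
      while (draining) Thread.sleep(10) // waits for the dispatcher, which waits for busLock
    }
    println("reached only once stop() no longer holds the lock while waiting")
  }
}
```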

### Why are the changes needed?
To fix deadlock.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
New UT.

Closes #26924 from wangshuo128/event-queue-race-condition.

Authored-by: Wang Shuo <wangshuo128@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2020-01-02 16:40:22 -08:00
Liang-Chi Hsieh 9eff1186ae [SPARK-30379][CORE] Avoid OOM when using collection accumulator
### What changes were proposed in this pull request?

This patch proposes to only convert first few elements of collection accumulators in `LiveEntityHelpers.newAccumulatorInfos`.

### Why are the changes needed?

One Spark job on our cluster uses collection accumulator to collect something and has encountered an exception like:

```
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:448)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at java.lang.StringBuilder.append(StringBuilder.java:131)
    at java.util.AbstractCollection.toString(AbstractCollection.java:462)
    at java.util.Collections$UnmodifiableCollection.toString(Collections.java:1035)
    at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2$$anonfun$apply$3.apply(LiveEntity.scala:596)
    at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2$$anonfun$apply$3.apply(LiveEntity.scala:596)
    at scala.Option.map(Option.scala:146)
    at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2.apply(LiveEntity.scala:596)
    at org.apache.spark.status.LiveEntityHelpers$$anonfun$newAccumulatorInfos$2.apply(LiveEntity.scala:591)
```

`LiveEntityHelpers.newAccumulatorInfos` converts `AccumulableInfo`s to `v1.AccumulableInfo` by calling `toString` on the accumulator's value. For a collection accumulator, the string representation can take much more memory (for example, a collection accumulator of long values) and cause OOM (in this job, the driver memory is 6g).

The results of `newAccumulatorInfos` are used in the API and UI. For such usage, it also does not make sense to render a complete collection accumulator as a very long string.
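A hedged Scala sketch of the approach (the method name and element limit are illustrative, not the actual `LiveEntityHelpers` code):

```scala
import java.util.{List => JList}
import scala.collection.JavaConverters._

// Render only the first few elements of a (java.util.List-valued) collection
// accumulator instead of stringifying the whole collection.
def truncatedAccumulatorValue(value: Any, maxElements: Int = 5): String = value match {
  case list: JList[_] if list.size > maxElements =>
    list.asScala.take(maxElements)
      .mkString("[", ", ", s", ... ${list.size - maxElements} more items]")
  case other => other.toString
}
```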

### Does this PR introduce any user-facing change?

Yes. Collection accumulator now only shows first few elements in api and ui.

### How was this patch tested?

Unit test.

Manual test. Launched a Spark shell, ran:
```scala
val accum = sc.collectionAccumulator[Long]("Collection Accumulator Example")
sc.range(0, 10000, 1, 1).foreach(x => accum.add(x))
accum.value
```

<img width="2533" alt="Screen Shot 2019-12-30 at 2 03 43 PM" src="https://user-images.githubusercontent.com/68855/71602488-6eb2c400-2b0d-11ea-8725-dba36478198f.png">

Closes #27038 from viirya/partial-collect-accu.

Lead-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-12-31 12:45:23 +09:00
Ajith f0fbbf014e [SPARK-30361][REST] Monitoring URL does not redact information about environment
UI and event logs redact sensitive information. But the monitoring URL, https://spark.apache.org/docs/latest/monitoring.html#rest-api, specifically /applications/[app-id]/environment, does not, which is a security issue.

### What changes were proposed in this pull request?
The REST API response is redacted before it is sent.

### Why are the changes needed?
If no redaction is done for the REST API call, it can leak sensitive security information.
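A hedged sketch of the kind of key-based redaction involved (the pattern and helper name are illustrative; Spark's redaction is driven by the `spark.redaction.regex` configuration):

```scala
import scala.util.matching.Regex

// Illustrative default pattern; the real pattern comes from spark.redaction.regex.
val redactionPattern: Regex = "(?i)secret|password|token".r
val REDACTION_REPLACEMENT = "*********(redacted)"

// Redact the value of any property whose key matches the sensitive pattern.
def redact(props: Seq[(String, String)]): Seq[(String, String)] = props.map {
  case (key, _) if redactionPattern.findFirstIn(key).isDefined => (key, REDACTION_REPLACEMENT)
  case kv => kv
}
```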

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Tested manually

Closes #27018 from ajithme/redactrest.

Authored-by: Ajith <ajith2489@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2019-12-30 09:10:31 -06:00
07ARB 5af77410bb [SPARK-30383][WEBUI] Remove meaningless tooltips from all pages
### What changes were proposed in this pull request?
Remove meaningless tooltips from all pages.

### Why are the changes needed?
If we can't come up with meaningful tooltips, then the tooltips are not required.

### Does this PR introduce any user-facing change?
Yes

![67598045-351ab100-f78a-11e9-88cf-573e09d7c50e](https://user-images.githubusercontent.com/8948111/71558018-81c58580-2a74-11ea-9f38-dcaebd3f0bbf.png)

Tooltips like the ones highlighted in the above image were removed.
### How was this patch tested?
Manual test.

Closes #27043 from 07ARB/SPARK-30383.

Authored-by: 07ARB <ankitrajboudh@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-12-30 18:28:02 +09:00
Jungtaek Lim (HeartSaVioR) 8092d634ea [SPARK-30348][SPARK-27510][CORE][TEST] Fix flaky test failure on "MasterSuite.: Master should avoid ..."
### What changes were proposed in this pull request?

This patch fixes the flaky test failure on MasterSuite, "SPARK-27510: Master should avoid dead loop while launching executor failed in Worker".

The culprit of the test failure was, ironically, that the test ran too fast; the default interval of `eventually` is "15 ms", but it took only "8 ms" from submitting the driver to removing the app from the master.

```
19/12/23 15:45:06.533 dispatcher-event-loop-6 INFO Master: Registering worker localhost:9999 with 10 cores, 3.6 GiB RAM
19/12/23 15:45:06.534 dispatcher-event-loop-6 INFO Master: Driver submitted org.apache.spark.FakeClass
19/12/23 15:45:06.535 dispatcher-event-loop-6 INFO Master: Launching driver driver-20191223154506-0000 on worker 10001
19/12/23 15:45:06.536 dispatcher-event-loop-9 INFO Master: Registering app name
19/12/23 15:45:06.537 dispatcher-event-loop-9 INFO Master: Registered app name with ID app-20191223154506-0000
19/12/23 15:45:06.537 dispatcher-event-loop-9 INFO Master: Launching executor app-20191223154506-0000/0 on worker 10001
19/12/23 15:45:06.537 dispatcher-event-loop-10 INFO Master: Removing executor app-20191223154506-0000/0 because it is FAILED
...
19/12/23 15:45:06.542 dispatcher-event-loop-19 ERROR Master: Application name with ID app-20191223154506-0000 failed 10 times; removing it
```

Given the interval is already tiny, instead of lowering it further, the patch also accounts for the above case when verifying the status.

### Why are the changes needed?

We observed intermittent test failure in Jenkins build which should be fixed.
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/115664/testReport/

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Modified UT.

Closes #27004 from HeartSaVioR/SPARK-30348.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-12-30 14:37:17 +08:00
yi.wu b5c35d68e4 [SPARK-27348][CORE] HeartbeatReceiver should remove lost executors from CoarseGrainedSchedulerBackend
### What changes were proposed in this pull request?

Remove a lost executor from `CoarseGrainedSchedulerBackend` when `HeartbeatReceiver` recognizes it as lost.

### Why are the changes needed?

Currently, an application may hang if we don't remove a lost executor from `CoarseGrainedSchedulerBackend`. This can happen as follows:

1) In `expireDeadHosts()`, `HeartbeatReceiver` calls `scheduler.executorLost()`;

2) Before `HeartbeatReceiver` calls `sc.killAndReplaceExecutor()` (which would mark the lost executor as "pendingToRemove") in a separate thread, `CoarseGrainedSchedulerBackend` may begin to launch tasks on that executor without realizing it has been lost.

3) If that lost executor doesn't shut down gracefully, `CoarseGrainedSchedulerBackend` may never receive a disconnect event. As a result, tasks launched on that lost executor become orphans, while the driver thinks those tasks are still running and waits forever.

Removing the lost executor from `CoarseGrainedSchedulerBackend` lets `TaskSetManager` mark those tasks as failed, which avoids the application hang. Furthermore, it cleans up the records in `executorDataMap`, which might otherwise never be removed in such a case.
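A hedged Scala sketch of the flow described above (trait and method names are illustrative, not the actual Spark APIs):

```scala
// When HeartbeatReceiver deems an executor lost, it should not only tell the
// task scheduler but also remove the executor from the scheduler backend, so
// that tasks already assigned to it are failed instead of waiting forever.
trait TaskSchedulerLike { def executorLost(executorId: String, reason: String): Unit }
trait SchedulerBackendLike { def removeExecutor(executorId: String, reason: String): Unit }

def onExecutorHeartbeatTimeout(
    executorId: String,
    scheduler: TaskSchedulerLike,
    backend: SchedulerBackendLike): Unit = {
  val reason = s"Executor heartbeat timed out: $executorId"
  scheduler.executorLost(executorId, reason)   // existing behavior in expireDeadHosts()
  backend.removeExecutor(executorId, reason)   // the additional removal this PR introduces
}
```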

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Updated existing tests.

Close #24350.

Closes #26980 from Ngone51/SPARK-27348.

Lead-authored-by: yi.wu <yi.wu@databricks.com>
Co-authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-12-30 12:29:24 +08:00
Ajith 4257a94447 [SPARK-30360][UI] Avoid Redact classpath entries in History Server UI
Currently the Spark History Server displays the classpath entries in the Environment tab with the values redacted, because the event log file has the entry values redacted while writing. But when the same entries are viewed from a running application's UI, they are not redacted. Redaction of classpath entries is not needed and can be avoided.

### What changes were proposed in this pull request?
Event logs will no longer redact the classpath entries.

### Why are the changes needed?
Redaction of classpath entries is not needed.

### Does this PR introduce any user-facing change?
NO

### How was this patch tested?
Tested manually to verify on UI

Closes #27016 from ajithme/redactui.

Authored-by: Ajith <ajith2489@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2019-12-28 12:59:30 -06:00
yi.wu c35427f6b1 [SPARK-30355][CORE] Unify isExecutorActive between CoarseGrainedSchedulerBackend and DriverEndpoint
### What changes were proposed in this pull request?

Unify `DriverEndpoint.executorIsAlive()` and `CoarseGrainedSchedulerBackend.isExecutorActive()`.

### Why are the changes needed?

`DriverEndpoint` has the method `executorIsAlive()` to check whether an executor is alive/active, while `CoarseGrainedSchedulerBackend` has the method `isExecutorActive()` to do the same work. But `isExecutorActive()` seems to forget to consider `executorsPendingLossReason`. Unifying these two methods makes the behavior consistent between `DriverEndpoint` and `CoarseGrainedSchedulerBackend` and makes the code easier to maintain.
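A hedged sketch of the unified check (field names mirror the description above, not the exact `CoarseGrainedSchedulerBackend` internals):

```scala
import scala.collection.mutable

class BackendSketch {
  private val executorDataMap = mutable.HashMap[String, AnyRef]()
  private val executorsPendingToRemove = mutable.HashSet[String]()
  private val executorsPendingLossReason = mutable.HashSet[String]()

  // An executor counts as active only if it is registered and neither pending
  // removal nor pending a loss reason, so both callers agree on liveness.
  def isExecutorActive(executorId: String): Boolean = synchronized {
    executorDataMap.contains(executorId) &&
      !executorsPendingToRemove.contains(executorId) &&
      !executorsPendingLossReason.contains(executorId)
  }
}
```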

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass Jenkins.

Closes #27012 from Ngone51/unify-is-executor-alive.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-12-27 14:41:45 +08:00
Fu Chen 3584d84943 [MINOR][CORE] Quiet request executor remove message
### What changes were proposed in this pull request?

Quiet the request-to-remove log message from the class `ExecutorAllocationManager`, which is too verbose. Otherwise, this class generates too many messages like
`INFO spark.ExecutorAllocationManager: Request to remove executorIds: 890`
when dynamic resource allocation (DRA) is enabled.

### Why are the changes needed?

Log level improvement.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Closes #26925 from cfmcgrady/quiet-request-executor-remove-message.

Authored-by: Fu Chen <cfmcgrady@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2019-12-26 09:59:41 -06:00
yi.wu 35506dced7 [SPARK-25855][CORE][FOLLOW-UP] Format config name to follow the other boolean conf naming convention
### What changes were proposed in this pull request?

Change config name from `spark.eventLog.allowErasureCoding` to `spark.eventLog.allowErasureCoding.enabled`.

### Why are the changes needed?

To follow the other boolean conf naming convention.

### Does this PR introduce any user-facing change?

No, it's newly added in Spark 3.0.

### How was this patch tested?

Tested manually and pass Jenkins.

Closes #26998 from Ngone51/SPARK-25855-FOLLOWUP.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-12-25 19:24:58 +08:00
Liang-Chi Hsieh 0042ad575a [SPARK-30290][CORE] Count for merged block when fetching continuous blocks in batch
### What changes were proposed in this pull request?

We added shuffle block fetch optimization in SPARK-9853. In ShuffleBlockFetcherIterator, we merge single blocks into batch blocks. During merging, we should count merged blocks for `maxBlocksInFlightPerAddress`, not original single blocks.

### Why are the changes needed?

If `maxBlocksInFlightPerAddress` is specified, e.g. set to 1, it should mean one batch block, not one original single block. Otherwise, it conflicts with batch shuffle fetch.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Unit test.

Closes #26930 from viirya/respect-max-blocks-inflight.

Lead-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-12-25 18:57:02 +08:00
Kazuaki Ishizaki f31d9a629b [MINOR][DOC][SQL][CORE] Fix typo in document and comments
### What changes were proposed in this pull request?

Fixed typo in `docs` directory and in other directories

1. Find typo in `docs` and apply fixes to files in all directories
2. Fix `the the` -> `the`

### Why are the changes needed?

Better readability of documents

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

No test needed

Closes #26976 from kiszk/typo_20191221.

Authored-by: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-21 14:08:58 -08:00
Sean Owen 7dff3b125d [SPARK-30272][SQL][CORE] Remove usage of Guava that breaks in 27; replace with workalikes
### What changes were proposed in this pull request?

Remove usages of Guava that no longer work in Guava 27, and replace with workalikes. I'll comment on key types of changes below.

### Why are the changes needed?

Hadoop 3.2.1 uses Guava 27, so this helps us avoid problems running on Hadoop 3.2.1+ and generally lowers our exposure to Guava.

### Does this PR introduce any user-facing change?

Should not be, but see notes below on hash codes and toString.

### How was this patch tested?

Existing tests will verify whether these changes break anything for Guava 14.
I manually built with an updated version and it compiles with Guava 27; tests running manually locally now.

Closes #26911 from srowen/SPARK-30272.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2019-12-20 08:55:04 -06:00
Kousuke Saruta 0945633844 [SPARK-29997][WEBUI][FOLLOWUP] Refactor code for job description of empty jobs
### What changes were proposed in this pull request?

Refactor the code introduced by #26637.
The dummy StageInfo and its side-effects are no longer needed at all.
This change also enables users to set a job description for empty jobs.

### Why are the changes needed?

The previous approach introduced a dummy StageInfo, which causes side-effects.

### Does this PR introduce any user-facing change?

Yes. Description set by user will be shown in the AllJobsPage.

![](https://user-images.githubusercontent.com/4736016/70788638-acf17900-1dd4-11ea-95f9-6d6739b24083.png)

### How was this patch tested?

Manual test and newly added unit test.

Closes #26703 from sarutak/fix-ui-for-empty-job2.

Lead-authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Co-authored-by: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-18 10:27:31 -08:00
Liang-Chi Hsieh b2baaa2fcc [SPARK-30274][CORE] Avoid BytesToBytesMap lookup hang forever when holding keys reaching max capacity
### What changes were proposed in this pull request?

We should not append keys to BytesToBytesMap up to its max capacity.

### Why are the changes needed?

BytesToBytesMap.append allows appending keys until the number of keys reaches MAX_CAPACITY. But once the pointer array in the map holds MAX_CAPACITY keys, the next call to lookup will hang forever.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Manually test by:
```java
// Manual reproduction: once MAX_CAPACITY keys have been appended, the next lookup hangs.
@Test
  public void testCapacity() {
    TestMemoryManager memoryManager2 =
            new TestMemoryManager(
                    new SparkConf()
                            .set(package$.MODULE$.MEMORY_OFFHEAP_ENABLED(), true)
                            .set(package$.MODULE$.MEMORY_OFFHEAP_SIZE(), 25600 * 1024 * 1024L)
                            .set(package$.MODULE$.SHUFFLE_SPILL_COMPRESS(), false)
                            .set(package$.MODULE$.SHUFFLE_COMPRESS(), false));
    TaskMemoryManager taskMemoryManager2 = new TaskMemoryManager(memoryManager2, 0);
    final long pageSizeBytes = 8000000 + 8; // 8 bytes for end-of-page marker
    final BytesToBytesMap map = new BytesToBytesMap(taskMemoryManager2, 1024, pageSizeBytes);

    try {
      for (long i = 0; i < BytesToBytesMap.MAX_CAPACITY + 1; i++) {
        final long[] value = new long[]{i};
        boolean succeed = map.lookup(value, Platform.LONG_ARRAY_OFFSET, 8).append(
                value,
                Platform.LONG_ARRAY_OFFSET,
                8,
                value,
                Platform.LONG_ARRAY_OFFSET,
                8);
      }
      map.free();
    } finally {
      map.free();
    }
  }
```

Once the map has been appended to 536870912 keys (MAX_CAPACITY), the next lookup hangs.

Closes #26914 from viirya/fix-bytemap2.

Authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-17 11:37:05 -08:00
“attilapiros” cdc8fc6233 [SPARK-30235][CORE] Switching off host local disk reading of shuffle blocks in case of useOldFetchProtocol
### What changes were proposed in this pull request?

When `spark.shuffle.useOldFetchProtocol` is enabled, switch off direct disk reading of host-local shuffle blocks and fall back to remote block fetching (thereby avoiding the `GetLocalDirsForExecutors` block transfer message, which is introduced in Spark 3.0.0).

### Why are the changes needed?

In `[SPARK-27651][Core] Avoid the network when shuffle blocks are fetched from the same host` a new block transfer message, `GetLocalDirsForExecutors`, is introduced. This new message could be sent to the external shuffle service, and as it is not supported by previous versions of the external shuffle service, it should be avoided when `spark.shuffle.useOldFetchProtocol` is true.

In the migration guide I changed the exception type, as `org.apache.spark.network.shuffle.protocol.BlockTransferMessage.Decoder#fromByteBuffer`
throws an IllegalArgumentException with the given text and uses the message type, which is just a simple number (byte). I have checked and this is true for version 2.4.4 too.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?

This specific case (the extra boolean that switches off the host-local disk reading feature) is not covered by a dedicated test, but existing tests were run.

Closes #26869 from attilapiros/SPARK-30235.

Authored-by: “attilapiros” <piros.attila.zsolt@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-12-17 10:32:15 -08:00
Xingbo Jiang 1c714befd8 [SPARK-25100][TEST][FOLLOWUP] Refactor test cases in FileSuite and KryoSerializerSuite
### What changes were proposed in this pull request?

Refactor test cases added by https://github.com/apache/spark/pull/26714, to improve code compactness.

### How was this patch tested?

Tested locally.

Closes #26916 from jiangxb1987/SPARK-25100.

Authored-by: Xingbo Jiang <xingbo.jiang@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-16 21:11:15 -08:00
shahid dd217e10fc [SPARK-25392][CORE][WEBUI] Prevent error page when accessing pools page from history server
### What changes were proposed in this pull request?

### Why are the changes needed?

Currently, from the history server we are not able to access the pool info, as we don't write pool information to the event log other than the pool name. Spark already hides the pool table when the page is accessed from the history server. But the pool column in the stage table redirects to the pools table, and accessing that pools page throws an error page. To prevent the error page, we need to hide the pool column in the stage table as well.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?
Manual test

Before change:
![Screenshot 2019-11-21 at 6 49 40 AM](https://user-images.githubusercontent.com/23054875/69293868-219b2280-0c30-11ea-9b9a-17140d024d3a.png)
![Screenshot 2019-11-21 at 6 48 51 AM](https://user-images.githubusercontent.com/23054875/69293834-147e3380-0c30-11ea-9dec-d5f67665486d.png)

After change:
![Screenshot 2019-11-21 at 7 29 01 AM](https://user-images.githubusercontent.com/23054875/69293991-9cfcd400-0c30-11ea-98a0-7a6268a4e5ab.png)

Closes #26616 from shahidki31/poolHistory.

Authored-by: shahid <shahidki31@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-12-16 15:02:34 -08:00
turbofei 5954311739 [SPARK-29043][CORE] Improve the concurrent performance of History Server
Even if we set spark.history.fs.numReplayThreads to a large number, such as 30,
the history server still replays logs slowly.
We found that, if there is a straggler in a batch of replay tasks, all the other threads will wait for this
straggler.

In this PR, we record the logs which are currently being replayed ("processing"),
so that the replay tasks can execute asynchronously.

This accelerates log replay for the history server.

No.

UT.

Closes #25797 from turboFei/SPARK-29043.

Authored-by: turbofei <fwang12@ebay.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-12-16 14:45:27 -08:00
Marcelo Vanzin a9fbd31030 [SPARK-30240][CORE] Support HTTP redirects directly to a proxy server
### What changes were proposed in this pull request?

The PR adds a new config option to configure an address for the
proxy server, and a new handler that intercepts redirects and replaces
the URL with one pointing at the proxy server. This is needed on top
of the "proxy base path" support because redirects use full URLs, not
just absolute paths from the server's root.

### Why are the changes needed?

Spark's web UI has support for generating links to paths with a
prefix, to support a proxy server, but those do not apply when
the UI is responding with redirects. In that case, Spark is sending
its own URL back to the client, and if it's behind a dumb proxy
server that doesn't do rewriting (like when using stunnel for HTTPS
support) then the client will see the wrong URL and may fail.

### Does this PR introduce any user-facing change?

Yes. It's a new UI option.

### How was this patch tested?

Tested with added unit test, with Spark behind stunnel, and in a
more complicated app using a different HTTPS proxy.

Closes #26873 from vanzin/SPARK-30240.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-14 17:39:06 -08:00
xiaodeshan fb2f5a4906 [SPARK-25100][CORE] Register TaskCommitMessage to KryoSerializer
## What changes were proposed in this pull request?

Fix the bug where, when invoking saveAsNewAPIHadoopDataset to store data, the job fails because the class TaskCommitMessage hasn't been registered, if the serializer is KryoSerializer and spark.kryo.registrationRequired is true.
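A hedged sketch of the configuration that triggers the failure; `MyRecord` is a hypothetical application class used only for illustration (internal classes such as TaskCommitMessage must be registered by Spark itself, which is what this PR does):

```scala
import org.apache.spark.SparkConf

case class MyRecord(id: Long)  // hypothetical application class

// With registrationRequired=true, Kryo refuses to serialize unregistered classes,
// so every class that is serialized must be registered explicitly.
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrationRequired", "true")
  .registerKryoClasses(Array(classOf[MyRecord]))
```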

## How was this patch tested?

UT

Closes #26714 from deshanxiao/SPARK-25100.

Authored-by: xiaodeshan <xiaodeshan@xiaomi.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-14 17:15:30 -08:00
Sean Owen 46e950bea8 [SPARK-30263][CORE] Don't log potentially sensitive value of non-Spark properties ignored in spark-submit
### What changes were proposed in this pull request?

The value of non-Spark config properties ignored in spark-submit is no longer logged.

### Why are the changes needed?

The value isn't really needed in the logs, and could contain potentially sensitive info. While we can redact the values selectively too, I figured it's more robust to just not log them at all here, as the values aren't important in this log statement.

### Does this PR introduce any user-facing change?

Other than the change to logging above, no.

### How was this patch tested?

Existing tests

Closes #26893 from srowen/SPARK-30263.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-14 13:13:54 -08:00
Kousuke Saruta 61ebc81186 [SPARK-30167][REPL] Log4j configuration for REPL can't override the root logger properly
### What changes were proposed in this pull request?

In the current implementation of `SparkShellLoggingFilter`, if the log level of the root logger and the log level of a message are different, whether a message should be logged is decided based on log4j's configuration, but whether the message should be output to the REPL's console is not taken into account.
So, if the log level of the root logger is `DEBUG`, the log level of the REPL's logger is `WARN` and the log level of a message is `INFO`, the message will be output to the REPL's console even though `INFO < WARN`.
https://github.com/apache/spark/pull/26798/files#diff-bfd5810d8aa78ad90150e806d830bb78L237

The ideal behavior should be as follows, and this change implements it.

1. If the log level of a message is greater than or equal to the log level of the root logger, the message should be logged but whether the message is output to the REPL's console should be decided based on whether the log level of the message is greater than or equal to the log level of the REPL's logger.

2. If a log level or custom appenders are explicitly defined for a category, whether a log message via the logger corresponding to the category is logged and output to the REPL's console should be decided based on the log level of the category.
We can confirm whether a log level or appenders are explicitly set to a logger for a category by `Logger#getLevel` and `Logger#getAllAppenders.hasMoreElements`.

### Why are the changes needed?

This is a bug that breaks compatibility.

#9816 enabled the REPL's log4j configuration to override the root logger, but #23675 seems to have broken that feature.
You can see one example when you modify the default log4j configuration as follows.
```
# Change the log level for rootCategory to DEBUG
log4j.rootCategory=DEBUG, console

...
# The log level for repl.Main remains WARN
log4j.logger.org.apache.spark.repl.Main=WARN
```
If you launch the REPL with this configuration, INFO level logs appear even though the log level for the REPL is WARN.
```
・・・

19/12/08 23:31:38 INFO Utils: Successfully started service 'sparkDriver' on port 33083.
19/12/08 23:31:38 INFO SparkEnv: Registering MapOutputTracker
19/12/08 23:31:38 INFO SparkEnv: Registering BlockManagerMaster
19/12/08 23:31:38 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
19/12/08 23:31:38 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
19/12/08 23:31:38 INFO SparkEnv: Registering BlockManagerMasterHeartbeat

・・・
```
Before #23675 was applied, those INFO level logs were not shown with the same log4j.properties.

### Does this PR introduce any user-facing change?

Yes. The logging behavior for REPL is fixed.

### How was this patch tested?

Manual test and newly added unit test.

Closes #26798 from sarutak/fix-spark-shell-loglevel.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-12-13 14:30:11 -08:00
sharan.gk ec26dde36b [SPARK-29455][WEBUI] Improve tooltip information for Stages
### What changes were proposed in this pull request?
Adding tooltips to the Stages tab for better usability.

### Why are the changes needed?
There are a few common points of confusion in the UI that could be clarified with tooltips. We
should add tooltips to explain.

### Does this PR introduce any user-facing change?
Yes
![image](https://user-images.githubusercontent.com/29914590/70693889-5a389400-1ce4-11ea-91bb-ee1e997a5c35.png)

### How was this patch tested?
Manual

Closes #26859 from sharangk/tooltip1.

Authored-by: sharan.gk <sharan.gk@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-13 11:35:00 -08:00
07ARB ce61ee8941 [SPARK-30126][CORE] support space in file path and name for addFile and addJar function
### What changes were proposed in this pull request?
sparkContext.addFile and sparkContext.addJar fail when the file path contains spaces.

### Why are the changes needed?
When uploading a file to the Spark context via the addFile and addJar functions, an exception is thrown when the file path contains a space character. Escaping the space with %20 or + doesn't change the result.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Add test case.

Closes #26773 from 07ARB/SPARK-30126.

Authored-by: 07ARB <ankitrajboudh@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-12-12 20:30:47 +08:00
Liang-Chi Hsieh b4aeaf906f [SPARK-30198][CORE] BytesToBytesMap does not grow internal long array as expected
### What changes were proposed in this pull request?

This patch changes the condition used to check whether BytesToBytesMap should grow its internal array. Specifically, it changes the comparison to use the capacity of the array instead of its size.

### Why are the changes needed?

One Spark job on our cluster hangs forever at BytesToBytesMap.safeLookup. After inspection, the long array size is 536870912.

Currently in BytesToBytesMap.append, we only grow the internal array if the size of the array is less than MAX_CAPACITY, which is 536870912. So in the above case, the array cannot be grown, and safeLookup can never find an empty slot.

But this is wrong because we use two array entries per key, so the array size is twice the capacity. We should compare the current capacity of the array, instead of its size.
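An illustrative sketch of the corrected check (pseudo-fields, not the real `BytesToBytesMap` code):

```scala
// Two longArray slots are stored per key, so the key capacity is size / 2.
// Compare that capacity against MAX_CAPACITY rather than the raw array size.
val MAX_CAPACITY: Long = 1L << 29  // 536870912 keys

def canGrowArray(longArraySize: Long, numKeys: Int, growthThreshold: Int): Boolean =
  numKeys >= growthThreshold && (longArraySize / 2) < MAX_CAPACITY
```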

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

This issue only happens when loading a large number of values into BytesToBytesMap, so it is hard to cover with a unit test. This was tested manually with an internal Spark job.

Closes #26828 from viirya/fix-bytemap.

Lead-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-11 14:58:21 -08:00
Sean Owen 33f53cb2d5 [SPARK-30195][SQL][CORE][ML] Change some function, import definitions to work with stricter compiler in Scala 2.13
### What changes were proposed in this pull request?

See https://issues.apache.org/jira/browse/SPARK-30195 for the background; I won't repeat it here. This is sort of a grab-bag of related issues.

### Why are the changes needed?

To cross-compile with Scala 2.13 later.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing tests for 2.12. I've been manually checking that this actually resolves the compile problems in 2.13 separately.

Closes #26826 from srowen/SPARK-30195.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-11 12:33:58 -08:00
Pavithra Ramachandran d46c03c3d3 [SPARK-29460][WEBUI] Add tooltip for Jobs page
### What changes were proposed in this pull request?
Adding tooltips for the Jobs tab columns - Job Id (Job Group), Description, Submitted, Duration, Stages, Tasks.

Before:
![Screenshot from 2019-11-04 11-31-02](https://user-images.githubusercontent.com/51401130/68102467-e8a54300-fef8-11e9-9f9e-48dd1b393ac8.png)

After:
![Screenshot from 2019-11-04 11-30-53](https://user-images.githubusercontent.com/51401130/68102478-f3f86e80-fef8-11e9-921a-357678229cb4.png)

### Why are the changes needed?
The Jobs tab does not have any tooltips for its columns, while some pages do provide tooltips; adding them resolves the inconsistency and improves the user experience.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Manual

Closes #26384 from PavithraRamachandran/jobTab_tooltip.

Authored-by: Pavithra Ramachandran <pavi.rams@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
2019-12-11 09:39:39 -06:00
root1 d7843dde0f [SPARK-29152][CORE] Executor Plugin shutdown when dynamic allocation is enabled
### What changes were proposed in this pull request?
Added a `shutdownHook` for the shutdown method of the executor plugin. This ensures that the shutdown method will always be called.

### Why are the changes needed?
Whenever executors are not going down gracefully, i.e. getting killed due to idle time or getting killed forcefully, the shutdown method of the executor plugin is not getting called. The shutdown method can be used to release any resources that the plugin acquired during its initialisation. So it's important to make sure that every time an executor goes down, the shutdown method of the plugin gets called.
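A hedged sketch of the idea (a plain JVM `Runtime` hook is shown for illustration; Spark has its own `ShutdownHookManager`, and no hook can run on a hard SIGKILL):

```scala
trait ExecutorPluginLike { def shutdown(): Unit }  // stand-in for the real plugin API

// Register a JVM shutdown hook so plugin.shutdown() also runs when the executor
// JVM exits without going through the normal graceful stop path.
def installPluginShutdownHook(plugins: Seq[ExecutorPluginLike]): Unit = {
  Runtime.getRuntime.addShutdownHook(new Thread(() => plugins.foreach { plugin =>
    try plugin.shutdown()
    catch { case e: Exception => System.err.println(s"Plugin shutdown failed: $e") }
  }))
}
```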

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?

Tested Manually.

Closes #26810 from iRakson/Executor_Plugin.

Authored-by: root1 <raksonrakesh@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-12-10 14:23:51 -08:00
Yuchen Huo ad238a2238 [SPARK-29976][CORE] Trigger speculation for stages with too few tasks
### What changes were proposed in this pull request?
This PR adds an optional Spark conf for speculation to allow speculative runs for stages that have only a few tasks.
```
spark.speculation.task.duration.threshold
```

If provided, tasks are speculatively run if the TaskSet contains fewer tasks than the number of slots on a single executor and a task is taking longer than the threshold.
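A hedged usage sketch (the threshold value is illustrative):

```scala
import org.apache.spark.SparkConf

// Speculation must be enabled for the new per-task duration threshold to matter.
val conf = new SparkConf()
  .set("spark.speculation", "true")
  .set("spark.speculation.task.duration.threshold", "60s")
```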

### Why are the changes needed?
This change helps avoid scenarios where a single executor could hang forever due to a disk issue and the single task in a TaskSet was unfortunately assigned to that executor, causing the whole job to hang forever.

### Does this PR introduce any user-facing change?
Yes. If the new config `spark.speculation.task.duration.threshold` is provided, the TaskSet contains fewer tasks than the number of slots on a single executor, and a task is taking longer than the threshold, then speculative tasks are submitted for the running tasks in the TaskSet.

### How was this patch tested?
Unit tests are added to TaskSetManagerSuite.

Closes #26614 from yuchenhuo/SPARK-29976.

Authored-by: Yuchen Huo <yuchen.huo@databricks.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>
2019-12-10 14:43:26 -06:00
Luca Canali 729f43f499 [SPARK-27189][CORE] Add Executor metrics and memory usage instrumentation to the metrics system
## What changes were proposed in this pull request?

This PR proposes to add instrumentation of memory usage via the Spark Dropwizard/Codahale metrics system. Memory usage metrics are available via the Executor metrics, recently implemented as detailed in https://issues.apache.org/jira/browse/SPARK-23206.
Additional notes: This takes advantage of the metrics poller introduced in #23767.

## Why are the changes needed?
Executor metrics provide many useful insights on memory usage, in particular on the usage of storage memory and executor memory. This is useful for troubleshooting. Having the information in the metrics system makes it possible to add those metrics to Spark performance dashboards and study memory usage as a function of time, as in the example graph https://issues.apache.org/jira/secure/attachment/12962810/Example_dashboard_Spark_Memory_Metrics.PNG

## Does this PR introduce any user-facing change?
Adds an `ExecutorMetrics` source to publish executor metrics via the Dropwizard metrics system. Details of the available metrics are in docs/monitoring.md.
Adds configuration parameter `spark.metrics.executormetrics.source.enabled`

## How was this patch tested?

Tested on YARN cluster and with an existing setup for a Spark dashboard based on InfluxDB and Grafana.

Closes #24132 from LucaCanali/memoryMetricsSource.

Authored-by: Luca Canali <luca.canali@cern.ch>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
2019-12-09 08:55:30 -06:00
Sean Owen ebd83a544e [SPARK-30009][CORE][SQL][FOLLOWUP] Remove OrderingUtil and Utils.nanSafeCompare{Doubles,Floats} and use java.lang.{Double,Float}.compare directly
### What changes were proposed in this pull request?

Follow up on https://github.com/apache/spark/pull/26654#discussion_r353826162
Instead of OrderingUtil or Utils.nanSafeCompare{Doubles,Floats}, just use java.lang.{Double,Float}.compare directly. All work identically w.r.t. NaN when used to `compare`.
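A quick Scala check of the NaN behavior relied on here:

```scala
// java.lang.Double.compare orders NaN above every other value and equal to itself
// (the NaN behavior this change relies on); note it also distinguishes -0.0 from 0.0.
assert(java.lang.Double.compare(Double.NaN, Double.PositiveInfinity) > 0)
assert(java.lang.Double.compare(Double.NaN, Double.NaN) == 0)
assert(java.lang.Double.compare(-0.0, 0.0) < 0)
```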

### Why are the changes needed?

Simplification of the previous change, which existed to support Scala 2.13 migration.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing tests

Closes #26761 from srowen/SPARK-30009.2.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-12-05 11:27:25 +08:00
Luca Canali 60f20e5ea2 [SPARK-30060][CORE] Rename metrics enable/disable configs
### What changes were proposed in this pull request?
This proposes to introduce a naming convention for the Spark metrics configuration parameters used to enable/disable metrics source reporting via the Dropwizard metrics library: `spark.metrics.sourceNameCamelCase.enabled`, and to update 2 parameters to use this naming convention.

### Why are the changes needed?
Currently Spark has a few parameters to enable/disable metrics reporting. Their naming pattern is not uniform and this can create confusion.  Currently we have:
`spark.metrics.static.sources.enabled`
`spark.app.status.metrics.enabled`
`spark.sql.streaming.metricsEnabled`

### Does this PR introduce any user-facing change?
Update parameters for enabling/disabling metrics reporting new in Spark 3.0: `spark.metrics.static.sources.enabled` -> `spark.metrics.staticSources.enabled`, `spark.app.status.metrics.enabled`  -> `spark.metrics.appStatusSource.enabled`.
Note: `spark.sql.streaming.metricsEnabled` is left unchanged as it is already in use in Spark 2.x.

### How was this patch tested?
Manually tested

Closes #26692 from LucaCanali/uniformNamingMetricsEnableParameters.

Authored-by: Luca Canali <luca.canali@cern.ch>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-03 14:31:06 -08:00
Sean Owen 4193d2f4cc [SPARK-30012][CORE][SQL] Change classes extending scala collection classes to work with 2.13
### What changes were proposed in this pull request?

Move some classes extending Scala collections into parallel source trees, to support 2.13; other minor collection-related modifications.

Modify some classes extending Scala collections to work with 2.13 as well as 2.12. In many cases, this means introducing parallel source trees, as the type hierarchy changed in ways that one class can't support both.

### Why are the changes needed?

To support building for Scala 2.13 in the future.

### Does this PR introduce any user-facing change?

There should be no behavior change.

### How was this patch tested?

Existing tests. Note that the 2.13 changes are not tested by the PR builder, of course. They compile in 2.13 but can't even be tested locally. Later, once the project can be compiled for 2.13, thus tested, it's possible the 2.13 implementations will need updates.

Closes #26728 from srowen/SPARK-30012.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-12-03 08:59:43 -08:00
Yuanjian Li 169415ffac [SPARK-30025][CORE] Continuous shuffle block fetching should be disabled by default when the old fetch protocol is used
### What changes were proposed in this pull request?
Disable continuous shuffle block fetching when the old fetch protocol is in use.

### Why are the changes needed?
The new feature of continuous shuffle block fetching depends on the latest version of the shuffle fetch protocol. We should keep this constraint in `BlockStoreShuffleReader.fetchContinuousBlocksInBatch`.

### Does this PR introduce any user-facing change?
Users will not get the exception related to continuous shuffle block fetching when an old version of the external shuffle service is used.

### How was this patch tested?
Existing UT.

Closes #26663 from xuanyuanking/SPARK-30025.

Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-12-02 15:59:12 +08:00
shahid b182ed83f6 [SPARK-29724][SPARK-29726][WEBUI][SQL] Support JDBC/ODBC tab for HistoryServer WebUI
### What changes were proposed in this pull request?

Support the JDBC/ODBC tab in the History Server WebUI. Currently we cannot access the JDBC/ODBC tab for Thrift Server applications from the History Server. In this PR, I am making 2 main changes:
1. Refactor existing thrift server listener to support kvstore
2. Add history server plugin for thrift server listener and tab.

### Why are the changes needed?
Users can access the Thrift Server tab from the History Server for both running and finished applications.

### Does this PR introduce any user-facing change?
Support for the JDBC/ODBC tab in the WebUI from the History Server.

### How was this patch tested?
Add UT and Manual tests
1. Start Thriftserver and Historyserver
```
sbin/stop-thriftserver.sh
sbin/stop-historyserver.sh
sbin/start-thriftserver.sh
sbin/start-historyserver.sh
```
2. Launch beeline
`bin/beeline -u jdbc:hive2://localhost:10000`

3. Run queries

Go to the JDBC/ODBC page of the WebUI from History server

![image](https://user-images.githubusercontent.com/23054875/68365501-cf013700-0156-11ea-84b4-fda8008c92c4.png)

Closes #26378 from shahidki31/ThriftKVStore.

Authored-by: shahid <shahidki31@gmail.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
2019-11-29 19:44:31 -08:00
wuyi d075b3344e [SPARK-28366][CORE][FOLLOW-UP] Improve the conf IO_WARNING_LARGEFILETHRESHOLD
### What changes were proposed in this pull request?

Improve conf `IO_WARNING_LARGEFILETHRESHOLD` (a.k.a `spark.io.warning.largeFileThreshold`):

* reword documentation

* change type from `long` to `bytes`

### Why are the changes needed?

Improvements according to https://github.com/apache/spark/pull/25134#discussion_r350570804 & https://github.com/apache/spark/pull/25134#discussion_r350570917.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass Jenkins.

Closes #26691 from Ngone51/SPARK-28366-followup.

Authored-by: wuyi <ngone_5451@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-11-27 20:34:22 +08:00
Kousuke Saruta 08e2a39df2 [SPARK-29997][WEBUI] Show job name for empty jobs in WebUI
### What changes were proposed in this pull request?

In the current implementation, the job name for empty jobs is not shown, so I've made a change to show it.

### Why are the changes needed?

To make debug easier.

### Does this PR introduce any user-facing change?

Yes. Before applying my change, the `Job Page` shows the following as the result of submitting a job which contains no partitions.

![fix-ui-for-empty-job-before](https://user-images.githubusercontent.com/4736016/69410847-33bfb280-0d4f-11ea-9878-d67638cbe4cb.png)

After applying my change, the page shows a display like the following screenshot.

![fix-ui-for-empty-job-after](https://user-images.githubusercontent.com/4736016/69411021-86996a00-0d4f-11ea-8dea-bb8456159d18.png)

### How was this patch tested?

Manual test.

Closes #26637 from sarutak/fix-ui-for-empty-job.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-26 19:38:46 -08:00
“attilapiros” fd2bf55aba [SPARK-27651][CORE] Avoid the network when shuffle blocks are fetched from the same host
## What changes were proposed in this pull request?

Before this PR, `ShuffleBlockFetcherIterator` partitioned the block fetches into two distinct sets: local reads and remote fetches. With this PR (when the feature is enabled by "spark.shuffle.readHostLocalDisk.enabled") a new category is introduced: host-local reads. These are shuffle block fetches where, although the block manager is different, it runs on the same host as the requester.

Moreover, to get the local directories of the other executors/block managers, a new RPC message is introduced, `GetLocalDirs`, which is sent to the block manager master and answered with `BlockManagerLocalDirs`. In `BlockManagerMasterEndpoint`, to answer this request, `localDirs` is extracted from the `BlockManagerInfo` and stored separately in a hash map called `executorIdLocalDirs`, because the previously used `blockManagerInfo` only contains data for the alive block managers (see `org.apache.spark.storage.BlockManagerMasterEndpoint#removeBlockManager`).

Now `executorIdLocalDirs` knows all the local dirs from the application start (like the external shuffle service does), so in case of an RDD recalculation both host-local shuffle blocks and disk-persisted RDD blocks on the same host can be served by reading the files behind the blocks directly.

## How was this patch tested?

### Unit tests

`ExternalShuffleServiceSuite`:
- "SPARK-27651: host local disk reading avoids external shuffle service on the same node"

`ShuffleBlockFetcherIteratorSuite`:
- "successful 3 local reads + 4 host local reads + 2 remote reads"

And with extending existing suites where shuffle metrics was tested.

### Manual tests

Running Spark on YARN in a 4 nodes cluster with 6 executors and having 12 shuffle blocks.

```
$ grep host-local experiment.log
19/07/30 03:57:12 INFO storage.ShuffleBlockFetcherIterator: Getting 12 (1496.8 MB) non-empty blocks including 2 (299.4 MB) local blocks and 2 (299.4 MB) host-local blocks and 8 (1197.4 MB) remote blocks
19/07/30 03:57:12 DEBUG storage.ShuffleBlockFetcherIterator: Start fetching host-local blocks: shuffle_0_2_1, shuffle_0_6_1
19/07/30 03:57:12 DEBUG storage.ShuffleBlockFetcherIterator: Got host-local blocks in 38 ms
19/07/30 03:57:12 INFO storage.ShuffleBlockFetcherIterator: Getting 12 (1496.8 MB) non-empty blocks including 2 (299.4 MB) local blocks and 2 (299.4 MB) host-local blocks and 8 (1197.4 MB) remote blocks
19/07/30 03:57:12 DEBUG storage.ShuffleBlockFetcherIterator: Start fetching host-local blocks: shuffle_0_0_0, shuffle_0_8_0
19/07/30 03:57:12 DEBUG storage.ShuffleBlockFetcherIterator: Got host-local blocks in 35 ms
```

Closes #25299 from attilapiros/SPARK-27651.

Authored-by: “attilapiros” <piros.attila.zsolt@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-11-26 11:02:25 -08:00
Sean Owen 29018025ba [SPARK-30009][CORE][SQL] Support different floating-point Ordering for Scala 2.12 / 2.13
### What changes were proposed in this pull request?

Make separate source trees for Scala 2.12/2.13 in order to accommodate mutually-incompatible support for Ordering of double, float.

Note: This isn't the last change that will need a split source tree for 2.13. But this particular change could go several ways:

- (Split source tree)
- Inline the Scala 2.12 implementation
- Reflection

For this change alone any are possible, and splitting the source tree is a bit overkill. But if it will be necessary for other JIRAs (see umbrella SPARK-25075), then it might be the easiest way to implement this.

### Why are the changes needed?

Scala 2.13 split Ordering.Double into Ordering.Double.TotalOrdering and Ordering.Double.IeeeOrdering. Neither can be used in a single build that supports 2.12 and 2.13.

TotalOrdering works like java.lang.Double.compare. IeeeOrdering works like Scala 2.12 Ordering.Double. They differ in how NaN is handled - compares always above other values? or always compares as 'false'? In theory they have different uses: TotalOrdering is important if floating-point values are sorted. IeeeOrdering behaves like 2.12 and JVM comparison operators.

I chose TotalOrdering as I think we care more about stable sorting, and because elsewhere we rely on java.lang comparisons. It is also possible to support with two methods.

### Does this PR introduce any user-facing change?

Pending tests, will see if it obviously affects any sort order. We need to see if it changes NaN sort order.

### How was this patch tested?

Existing tests so far.

Closes #26654 from srowen/SPARK-30009.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-26 08:25:53 -08:00
wuyi 7b1b60c758 [SPARK-28574][CORE][FOLLOW-UP] Several minor improvements for event queue capacity config
### What changes were proposed in this pull request?

* Replace the hard-coded conf prefix `spark.scheduler.listenerbus.eventqueue` with a constant variable (`LISTENER_BUS_EVENT_QUEUE_PREFIX`) defined in `config/package.scala`.

* Update documentation for `spark.scheduler.listenerbus.eventqueue.capacity` in both `config/package.scala` and `docs/configuration.md`.

### Why are the changes needed?

* Better code maintainability

* Better user guidance of the conf

### Does this PR introduce any user-facing change?

No behavior changes but user will see the updated document.

### How was this patch tested?

Pass Jenkins.

Closes #26676 from Ngone51/SPARK-28574-followup.

Authored-by: wuyi <ngone_5451@163.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-26 08:20:26 -08:00
wuyi 780555bf60 [MINOR][CORE] Make EventLogger codec be consistent between EventLogFileWriter and SparkContext
### What changes were proposed in this pull request?

Use the same function (`codecName(conf: SparkConf)`) between `EventLogFileWriter` and `SparkContext` to get the consistent codec name for EventLogger.

### Why are the changes needed?

#24921 added a new conf for EventLogger's compression codec. We should reflect this change in `SparkContext` as well. Though I didn't find any place where `SparkContext.eventLogCodec` really takes effect, I think it'd be better for it to hold the right value.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass Jenkins.

Closes #26665 from Ngone51/consistent-eventLogCodec.

Authored-by: wuyi <ngone_5451@163.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-11-26 12:54:34 +08:00
Dongjoon Hyun 38240a74dc [SPARK-30030][INFRA] Use RegexChecker instead of TokenChecker to check org.apache.commons.lang.
### What changes were proposed in this pull request?

This PR replaces `TokenChecker` with `RegexChecker` in `scalastyle` and fixes the missed instances.

### Why are the changes needed?

This will remove the old `commons-lang2` dependency from the `core` module.

**BEFORE**
```
$ dev/scalastyle
Scalastyle checks failed at following occurrences:
[error] /Users/dongjoon/PRS/SPARK-SerializationUtils/core/src/test/scala/org/apache/spark/util/PropertiesCloneBenchmark.scala:23:7: Use Commons Lang 3 classes (package org.apache.commons.lang3.*) instead
[error]     of Commons Lang 2 (package org.apache.commons.lang.*)
[error] Total time: 23 s, completed Nov 25, 2019 11:47:44 AM
```

**AFTER**
```
$ dev/scalastyle
Scalastyle checks passed.
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the GitHub Action linter.

Closes #26666 from dongjoon-hyun/SPARK-29081-2.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-25 12:03:15 -08:00
shahid bec2068ae8 [SPARK-26260][CORE] For disk store tasks summary table should show only successful tasks summary
…sks metrics for disk store

### What changes were proposed in this pull request?

After https://github.com/apache/spark/pull/23088, the task summary table in the stage page shows successful task metrics for the InMemory store. In this PR, the same is added for the disk store as well.

### Why are the changes needed?

Now both the InMemory and disk stores are consistent in showing the task summary table in the UI when there are non-successful tasks.

### Does this PR introduce any user-facing change?

no
### How was this patch tested?

Added UT. Manually verified

Test steps:
1. add the config in spark-defaults.conf -> **spark.history.store.path /tmp/store**
2. sbin/start-historyserver
3. bin/spark-shell
4. `sc.parallelize(1 to 1000, 2).map(x => throw new Exception("fail")).count`

![Screenshot 2019-11-14 at 3 51 39 AM](https://user-images.githubusercontent.com/23054875/68809546-268d2e80-0692-11ea-8b2c-bee767478135.png)

Closes #26508 from shahidki31/task.

Authored-by: shahid <shahidki31@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-11-25 10:04:25 -08:00
Thomas Graves 2d5de25a99 [SPARK-29415][CORE] Stage Level Sched: Add base ResourceProfile and Request classes
### What changes were proposed in this pull request?

This PR adds the base classes needed for stage level scheduling: a ResourceProfile and the executor and task resource request classes. These are made private for now until we get all the parts implemented, at which point this will become a public interface. I am adding them first because all the other subtasks for this feature require these classes. If people have better ideas on breaking this feature up, please let me know.

See https://issues.apache.org/jira/browse/SPARK-29415 for more detailed design.

### Why are the changes needed?

New API for stage level scheduling. It's easier to add these first because the other JIRAs for this feature will all use them.

### Does this PR introduce any user-facing change?

Yes, it adds an API to create a ResourceProfile with executor/task resources; see the SPIP JIRA https://issues.apache.org/jira/browse/SPARK-27495

Example of the API:
val rp = new ResourceProfile()
rp.require(new ExecutorResourceRequest("cores", 2))
rp.require(new ExecutorResourceRequest("gpu", 1, Some("/opt/gpuScripts/getGpus")))
rp.require(new TaskResourceRequest("gpu", 1))

### How was this patch tested?

Tested using Unit tests added with this PR.

Closes #26284 from tgravescs/SPARK-29415.

Authored-by: Thomas Graves <tgraves@nvidia.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>
2019-11-25 09:36:39 -06:00
wuyi 456cfe6e46 [SPARK-29939][CORE] Add spark.shuffle.mapStatus.compression.codec conf
### What changes were proposed in this pull request?

Add a new conf named `spark.shuffle.mapStatus.compression.codec` for user to decide which codec should be used(default by `zstd`) for `MapStatus` compression.

### Why are the changes needed?

We already have this functionality for `broadcast`/`rdd`/`shuffle`/`shuffleSpill`,
so it might be better to have the same functionality for `MapStatus` as well.

### Does this PR introduce any user-facing change?

Yes, users can now use `spark.shuffle.mapStatus.compression.codec` to decide which codec should be used during `MapStatus` compression.
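A hedged usage sketch (the codec value is just an example):

```scala
import org.apache.spark.SparkConf

// Pick the codec used to compress the serialized MapStatus (default is zstd).
val conf = new SparkConf()
  .set("spark.shuffle.mapStatus.compression.codec", "lz4")
```
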
### How was this patch tested?

N/A

Closes #26611 from Ngone51/SPARK-29939.

Authored-by: wuyi <ngone_5451@163.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-24 21:21:19 -08:00
Prakhar Jain 6898be9f02 [SPARK-29681][WEBUI] Support column sorting in Environment tab
### What changes were proposed in this pull request?
Add extra classnames to table headers in EnvironmentPage tables in Spark UI.

### Why are the changes needed?
SparkUI uses sorttable.js to provide the sort functionality in different tables. This library tries to guess the datatype of each column during the initialization phase (numeric/alphanumeric etc.) and uses it to sort the columns whenever the user clicks on a column. This way it guesses an incorrect data type for the Environment tab.

sorttable.js has a way to hint the datatype of table columns explicitly, by passing a custom HTML class attribute.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Manually tested sorting in tables in Environment tab in Spark UI.

![Annotation 2019-11-22 154058](https://user-images.githubusercontent.com/2551496/69417432-a8d6bc00-0d3e-11ea-865b-f8017976c6f4.png)
![Annotation 2019-11-22 153600](https://user-images.githubusercontent.com/2551496/69417433-a8d6bc00-0d3e-11ea-9a75-8e1f4d66107e.png)
![Annotation 2019-11-22 153841](https://user-images.githubusercontent.com/2551496/69417435-a96f5280-0d3e-11ea-85f6-9f61b015e161.png)

Closes #26638 from prakharjain09/SPARK-29681-SPARK-UI-SORT.

Authored-by: Prakhar Jain <prakjai@microsoft.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-23 18:09:02 -08:00
Kousuke Saruta 6cd6d5f57e [SPARK-29970][WEBUI] Preserve open/close state of Timelineview
### What changes were proposed in this pull request?

Fix a bug related to Timelineview that does not preserve open/close state properly.

### Why are the changes needed?

Preserving the open/close state was originally intended, but it doesn't work.

### Does this PR introduce any user-facing change?

Yes. The open/close state of the Timelineview is now preserved, so if you open the Timelineview on the Stage page, go to another page, and then go back to the Stage page, the Timelineview stays open.

### How was this patch tested?

Manual test.

Closes #26607 from sarutak/fix-timeline-view-state.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-23 16:16:24 -08:00
gengjiaan 85c004d5b0 [SPARK-29885][PYTHON][CORE] Improve the exception message when reading the daemon port
### What changes were proposed in this pull request?
In a production environment, my PySpark application threw an exception whose message is shown below:
```
19/10/28 16:15:03 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: No port number in pyspark.daemon's stdout
	at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:204)
	at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:122)
	at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:95)
	at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
	at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:108)
	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
	at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
	at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
	at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
	at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
	at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
	at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
	at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:121)
	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
```
At first, I thought many ports on the physical node were occupied by a large number of processes.
But I found the total number of ports in use was only 671.
```
[yarnr1115 ~]$ netstat -a | wc -l
671
```
I checked the code of PythonWorkerFactory at line 204 and found:
```scala
daemon = pb.start()
val in = new DataInputStream(daemon.getInputStream)
try {
  daemonPort = in.readInt()
} catch {
  case _: EOFException =>
    throw new SparkException(s"No port number in $daemonModule's stdout")
}
```
I added some code here:
```scala
// Note: the s interpolator is required for ${...} to be interpolated.
logError(s"Meet EOFException, daemon is alive: ${daemon.isAlive()}")
logError(s"Exit value: ${daemon.exitValue()}")
```
Then I reproduced the exception, and its message is shown below:
```
19/10/28 16:15:03 ERROR PythonWorkerFactory: Meet EOFException, daemon is alive: false
19/10/28 16:15:03 ERROR PythonWorkerFactory: Exit value: 139
19/10/28 16:15:03 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: No port number in pyspark.daemon's stdout
 at org.apache.spark.api.python.PythonWorkerFactory.startDaemon(PythonWorkerFactory.scala:206)
 at org.apache.spark.api.python.PythonWorkerFactory.createThroughDaemon(PythonWorkerFactory.scala:122)
 at org.apache.spark.api.python.PythonWorkerFactory.create(PythonWorkerFactory.scala:95)
 at org.apache.spark.SparkEnv.createPythonWorker(SparkEnv.scala:117)
 at org.apache.spark.api.python.BasePythonRunner.compute(PythonRunner.scala:108)
 at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:337)
 at org.apache.spark.rdd.RDD$$anonfun$7.apply(RDD.scala:335)
 at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1182)
 at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1156)
 at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:1091)
 at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1156)
 at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:882)
 at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:335)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:286)
 at org.apache.spark.api.python.PythonRDD.compute(PythonRDD.scala:65)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
 at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
 at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
 at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
 at org.apache.spark.scheduler.Task.run(Task.scala:121)
 at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
 at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
```
I think the exception message caused me a lot of confusion.
This PR adds meaningful logging for the exception information.
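As a rough sketch of the kind of diagnostics this change is after (reusing the names from the snippet quoted above; not the exact patch):

```scala
// Illustrative sketch only: surface the daemon's liveness and exit code in the error
// instead of the bare "No port number" message.
// `daemon`, `in` and `daemonModule` are the same values as in the snippet quoted above.
try {
  daemonPort = in.readInt()
} catch {
  case _: EOFException =>
    val state =
      if (daemon.isAlive) "is still alive" else s"exited with code ${daemon.exitValue()}"
    throw new SparkException(s"No port number in $daemonModule's stdout; daemon $state")
}
```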

### Why are the changes needed?
In order to clarify the exception message, and to retry three times by default.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Exists UT.

Closes #26510 from beliefer/improve-except-message.

Authored-by: gengjiaan <gengjiaan@360.cn>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-11-21 16:13:42 +09:00
Sean Owen 1febd373ea [MINOR][TESTS] Replace JVM assert with JUnit Assert in tests
### What changes were proposed in this pull request?

Use JUnit assertions in tests uniformly, not JVM assert() statements.

### Why are the changes needed?

assert() statements do not produce as useful errors when they fail, and, if they were somehow disabled, would fail to test anything.
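For illustration only (not lines from the patch), the flavor of the change is:

```scala
import org.junit.Assert.assertEquals

// JVM assert() is skipped entirely when assertions are disabled (-da) and reports
// little on failure; JUnit's assertEquals always runs and prints expected vs. actual.
val result = 6 * 7
// Before: assert(result == 42)
assertEquals(42, result)
```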

### Does this PR introduce any user-facing change?

No. The assertion logic should be identical.

### How was this patch tested?

Existing tests.

Closes #26581 from srowen/assertToJUnit.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-11-20 14:04:15 -06:00
jiake a8d98833b8 [SPARK-29893] improve the local shuffle reader performance by changing the reading task number from 1 to multi
### What changes were proposed in this pull request?
This PR updates the local reader task number from 1 to `partitionStartIndices.length`.

### Why are the changes needed?
Improve the performance of local shuffle reader.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing UTs

Closes #26516 from JkSelf/improveLocalShuffleReader.

Authored-by: jiake <ke.a.jia@intel.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-11-19 19:18:08 +08:00
yudovin 2e71a6e7ba [SPARK-27558][CORE] Gracefully cleanup task when it fails with OOM exception
### What changes were proposed in this pull request?

When a task fails with an OOM exception, `UnsafeInMemorySorter.array` could be `null`. Meanwhile, `cleanupResources()` on task completion would call `UnsafeInMemorySorter.getMemoryUsage` in turn, which leads to another NPE being thrown.

### Why are the changes needed?

Checking if `array` is null in `UnsafeInMemorySorter.getMemoryUsage` helps to avoid the NPE.
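A minimal sketch of that null guard (the real method lives in Java inside `UnsafeInMemorySorter`; the signature here is illustrative):

```scala
// Illustrative null guard: if the sort array was already freed after an OOM,
// report zero memory usage instead of dereferencing null.
def getMemoryUsage(array: Array[Long]): Long = {
  if (array == null) 0L else array.length.toLong * 8L
}
```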

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
It was tested manually.

Closes #26349 from ayudovin/fix-npe-in-listener.

Authored-by: yudovin <artsiom.yudovin@profitero.com>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
2019-11-18 22:05:34 -08:00
Pavithra Ramachandran 5336473004 [SPARK-29476][WEBUI] add tooltip for Thread
### What changes were proposed in this pull request?
Adding tooltip for Thread Dump - Thread Locks

Before:
![Screenshot from 2019-11-04 17-11-22](https://user-images.githubusercontent.com/51401130/68127349-b963f580-ff3b-11e9-8547-e01907382632.png)

After:
![Screenshot from 2019-11-13 18-12-54](https://user-images.githubusercontent.com/51401130/68768698-08e7a700-0649-11ea-804b-2eb4d5f162b4.png)

### Why are the changes needed?
The Thread Dump tab does not have any tooltips for its columns, while some pages do provide them. This change resolves the inconsistency and improves the user experience.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Manual

Closes #26386 from PavithraRamachandran/threadDump_tooltip.

Authored-by: Pavithra Ramachandran <pavi.rams@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-11-16 13:20:05 -06:00
HyukjinKwon 17321782de [SPARK-26923][R][SQL][FOLLOW-UP] Show stderr in the exception whenever possible in RRunner
### What changes were proposed in this pull request?

This is a followup of https://github.com/apache/spark/pull/23977. I made a mistake related to this line: 3725b1324f (diff-71c2cad03f08cb5f6c70462aa4e28d3aL112)

Previously,

1. the reader iterator for R worker read some initial data eagerly during RDD materialization. So it read the data before actual execution. For some reasons, in this case, it showed standard error from R worker.

2. After that, when error happens during actual execution, stderr wasn't shown: 3725b1324f (diff-71c2cad03f08cb5f6c70462aa4e28d3aL260)

After my change 3725b1324f (diff-71c2cad03f08cb5f6c70462aa4e28d3aL112), it now skips case 1. and only does 2. of the previous code path, because 1. does not happen anymore as I avoided such eager execution (which is consistent with the PySpark code path).

This PR proposes to always do 1., both before and after execution, because it is quite possible that the R worker failed during actual execution and it is best to show the stderr from the R worker whenever possible.

### Why are the changes needed?

It currently swallows standard error from the R worker, which makes debugging harder.

### Does this PR introduce any user-facing change?

Yes,

```R
df <- createDataFrame(list(list(n=1)))
collect(dapply(df, function(x) {
  stop("asdkjasdjkbadskjbsdajbk")
  x
}, structType("a double")))
```

**Before:**

```
Error in handleErrors(returnStatus, conn) :
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 13.0 failed 1 times, most recent failure: Lost task 0.0 in stage 13.0 (TID 13, 192.168.35.193, executor driver): org.apache.spark.SparkException: R worker exited unexpectedly (cranshed)
	at org.apache.spark.api.r.RRunner$$anon$1.read(RRunner.scala:130)
	at org.apache.spark.api.r.BaseRRunner$ReaderIterator.hasNext(BaseRRunner.scala:118)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:726)
	at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:337)
	at org.apache.spark.
```

**After:**

```
Error in handleErrors(returnStatus, conn) :
  org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, 192.168.35.193, executor driver): org.apache.spark.SparkException: R unexpectedly exited.
R worker produced errors: Error in computeFunc(inputData) : asdkjasdjkbadskjbsdajbk

	at org.apache.spark.api.r.BaseRRunner$ReaderIterator$$anonfun$1.applyOrElse(BaseRRunner.scala:144)
	at org.apache.spark.api.r.BaseRRunner$ReaderIterator$$anonfun$1.applyOrElse(BaseRRunner.scala:137)
	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
	at org.apache.spark.api.r.RRunner$$anon$1.read(RRunner.scala:128)
	at org.apache.spark.api.r.BaseRRunner$ReaderIterator.hasNext(BaseRRunner.scala:113)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegen
```

### How was this patch tested?

Manually tested and unittest was added.

Closes #26517 from HyukjinKwon/SPARK-26923-followup.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-11-15 11:13:36 +09:00
turbofei ab981f10a6 [SPARK-29857][WEB UI] Defer render the spark UI dataTables
### What changes were proposed in this pull request?
This PR supports deferred rendering of the Spark UI dataTables.
### Why are the changes needed?
When there are many items, such as task and application lists, rendering the dataTables is heavy; we can enable deferRender to optimize it.
See details in https://datatables.net/examples/ajax/defer_render.html
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Not needed.

Closes #26482 from turboFei/SPARK-29857-defer-render.

Authored-by: turbofei <fwang12@ebay.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-11-14 18:16:45 -06:00
Liang-Chi Hsieh 39596b913b [SPARK-29649][SQL] Stop task set if FileAlreadyExistsException was thrown when writing to output file
### What changes were proposed in this pull request?

We already know that task attempts that do not clean up output files in the staging directory can cause job failure (SPARK-27194). There were proposals trying to fix it by changing the output filename or deleting existing output files. These proposals are not completely reliable.

The difficulty is that, since the previously failed task attempt wrote the output file, at the next task attempt the output file is still under the same staging directory, even if the output file name is different.

If the job is going to fail eventually, there is no point in re-running the task until the max number of attempts is reached. For long-running jobs, re-running the task can waste a lot of time.

This patch proposes to let Spark detect such a file-already-exists exception and stop the task set early.
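Conceptually (a sketch, not the actual scheduler code), the detection amounts to treating this particular failure as non-retriable:

```scala
import org.apache.hadoop.fs.FileAlreadyExistsException

// Illustrative classification only: a FileAlreadyExistsException from a write task
// will recur on every retry, so the task set should be aborted instead of retried
// up to spark.task.maxFailures times.
def shouldAbortTaskSet(failure: Throwable): Boolean = failure match {
  case _: FileAlreadyExistsException => true
  case _                             => false
}
```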

### Why are the changes needed?

For now, if FileAlreadyExistsException is thrown during a data writing job in SQL, the job will continue re-running task attempts until the max failure number is reached. There is no point in re-running the tasks, as the new attempts will also fail because they cannot write to the existing file either. We should stop the task set early.

### Does this PR introduce any user-facing change?

Yes. If FileAlreadyExistsException is thrown during a data writing job in SQL, no more task attempts are retried and the task set will be stopped early.

### How was this patch tested?

Unit test.

Closes #26312 from viirya/stop-taskset-if-outputfile-exists.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-13 18:01:38 -08:00
Kent Yao 15a72f3755 [SPARK-29287][CORE] Add LaunchedExecutor message to tell driver which executor is ready for making offers
### What changes were proposed in this pull request?

Add a `LaunchedExecutor` message and send it to the driver when the executor is fully constructed; the driver can then assign the associated executor's totalCores to freeCores for making offers.

### Why are the changes needed?
The executors send RegisterExecutor messages to the driver in onStart.

The driver puts the executor data in “the ready to serve map” if it can, then sends RegisteredExecutor back to the executor. The driver can now make offers to this executor.

But the executor is not fully constructed yet. When it receives RegisteredExecutor, it starts to construct itself: initializing the block manager, maybe registering with the local shuffle server with retries, then starting the heartbeat to the driver...

The tasks allocated here may fail if the executor fails to start or cannot get its heartbeat to the driver in time.

Even worse, when dynamic allocation and blacklisting are enabled and the runtime executor number is down to the min executor setting, if those executors receive tasks before they are fully constructed and any error happens, the application may be blocked or torn down.
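Purely as an illustration of the description above, a plausible shape for the new message is (the actual definition in Spark may differ):

```scala
// Illustrative sketch: sent by the executor once it has finished constructing itself;
// on receiving it, the driver treats the executor's cores as free and starts making offers.
case class LaunchedExecutor(executorId: String)
```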

### Does this PR introduce any user-facing change?

NO

### How was this patch tested?

Closes #25964 from yaooqinn/SPARK-29287.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
2019-11-13 16:14:12 -08:00
Nishchal Venkataramana 833a9f12e2 [SPARK-24203][CORE] Make executor's bindAddress configurable
### What changes were proposed in this pull request?
With this change, executor's bindAddress is passed as an input parameter for RPCEnv.create.
A previous PR https://github.com/apache/spark/pull/21261 which addressed the same, was using a Spark Conf property to get the bindAddress which wouldn't have worked for multiple executors.
This PR is to enable anyone overriding CoarseGrainedExecutorBackend with their custom one to be able to invoke CoarseGrainedExecutorBackend.main() along with the option to configure bindAddress.

### Why are the changes needed?
This is required when Kernel-based Virtual Machines (KVMs) are used inside a Linux container where the hostname is not the same as the container hostname.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Tested by running jobs with executors on KVMs inside a linux container.

Closes #26331 from nishchalv/SPARK-29670.

Lead-authored-by: Nishchal Venkataramana <nishchal@apple.com>
Co-authored-by: nishchal <nishchal@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
2019-11-13 22:01:48 +00:00
Marcelo Vanzin 56a0b5421e [SPARK-29399][CORE] Remove old ExecutorPlugin interface
SPARK-29397 added new interfaces for creating driver and executor
plugins. These were added in a new, more isolated package that does
not pollute the main o.a.s package.

The old interface is now redundant. Since it's a DeveloperApi and
we're about to have a new major release, let's remove it instead of
carrying more baggage forward.

Closes #26390 from vanzin/SPARK-29399.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-11-13 09:52:40 +09:00
Ankitraj 45e212e161 [SPARK-29570][WEBUI] Improve tooltip for Executor Tab for Shuffle Write,Blacklisted,Logs,Threaddump columns
### What changes were proposed in this pull request?
All tooltip messages will now display in the centre.

### Why are the changes needed?
Sometimes tooltips hide the column data, and the tooltip display position is inconsistent in the UI.

### Does this PR introduce any user-facing change?
Yes.

![Screenshot 2019-10-26 at 3 08 51 AM](https://user-images.githubusercontent.com/8948111/67606124-04dd0d80-f79e-11e9-865a-b7e9bffc9890.png)

### How was this patch tested?
Manual test.

Closes #26263 from 07ARB/SPARK-29570.

Lead-authored-by: Ankitraj <8948111+07ARB@users.noreply.github.com>
Co-authored-by: 07ARB <ankitrajboudh@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-11-12 18:49:54 -06:00
lajin 5cb05f4100 [SPARK-29298][CORE] Separate block manager heartbeat endpoint from driver endpoint
### What changes were proposed in this pull request?
The executor's heartbeat is sent synchronously to BlockManagerMaster to let it know that the block manager is still alive. In a heavy cluster, it can time out and cause the block manager to re-register unexpectedly.
This improvement separates a heartbeat endpoint from the driver endpoint. In our production environment, this was really helpful in preventing executors from unstably going up and down.

### Why are the changes needed?
`BlockManagerMasterEndpoint` handles many events from executors like `RegisterBlockManager`, `GetLocations`, `RemoveShuffle`, `RemoveExecutor` etc. In a heavy cluster/app, it is always busy. The `BlockManagerHeartbeat` event was also handled in this endpoint. We found it may time out when it's busy. So we added a new endpoint `BlockManagerMasterHeartbeatEndpoint` to handle heartbeats separately.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Exist UTs

Closes #25971 from LantaoJin/SPARK-29298.

Lead-authored-by: lajin <lajin@ebay.com>
Co-authored-by: Alan Jin <lajin@ebay.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-11-12 16:24:48 +08:00
Xingbo Jiang 0346afa8fc [SPARK-29001][CORE] Print events that take too long time to process
### What changes were proposed in this pull request?
Print events that take too long to process, to help find out what type of events are slow.
Introduce two extra configs (see the sketch after this list):
* **spark.scheduler.listenerbus.logSlowEvent.enabled** Whether to enable logging of events that are slow to process
* **spark.scheduler.listenerbus.logSlowEvent.threshold** The time threshold above which an event is considered slow.
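For illustration, the two configs above might be set like this (the `1s` time-string value is an assumed format, not taken from the patch):

```scala
import org.apache.spark.SparkConf

// Illustrative only: enable slow-event logging and flag events slower than one second.
val conf = new SparkConf()
  .set("spark.scheduler.listenerbus.logSlowEvent.enabled", "true")
  .set("spark.scheduler.listenerbus.logSlowEvent.threshold", "1s")
```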

### How was this patch tested?
N/A

Closes #25702 from jiangxb1987/SPARK-29001.

Authored-by: Xingbo Jiang <xingbo.jiang@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-11-12 14:08:13 +08:00
Jungtaek Lim (HeartSaVioR) df08e903b5 [SPARK-29755][CORE] Provide @JsonDeserialize for Option[Long] in LogInfo & AttemptInfoWrapper
### What changes were proposed in this pull request?

This patch adds the `JsonDeserialize` annotation for the fields whose type is `Option[Long]` in LogInfo/AttemptInfoWrapper. It hits https://github.com/FasterXML/jackson-module-scala/wiki/FAQ#deserializing-optionint-and-other-primitive-challenges - other existing JSON models take care of this, but we missed adding the annotation to these classes.
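The annotation pattern the linked FAQ recommends looks roughly like this (the field names here are hypothetical, not the actual LogInfo fields):

```scala
import com.fasterxml.jackson.databind.annotation.JsonDeserialize

// Hypothetical model: without the annotation, Jackson can deserialize the Option's
// content as an Integer and fail later when a Long is expected.
case class LogInfoLike(
    logPath: String,
    @JsonDeserialize(contentAs = classOf[java.lang.Long])
    fileSize: Option[Long])
```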

### Why are the changes needed?

Without this change, SHS will throw ClassNotFoundException when rebuilding App UI.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually tested.

Closes #26397 from HeartSaVioR/SPARK-29755.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-11-11 15:49:16 -08:00
Luca Canali 2888009d66 [SPARK-29654][CORE] Add configuration to allow disabling registration of static sources to the metrics system
### What changes were proposed in this pull request?
The Spark metrics system produces many different metrics, and not all of them are used at the same time. This PR proposes to introduce a configuration parameter to allow disabling the registration of metrics in the "static sources" category.

### Why are the changes needed?

This allows reducing the load and clutter on the sink in the cases when the metrics in question are not needed. The metrics registered as "static sources" are under the namespaces CodeGenerator and HiveExternalCatalog and can produce a significant amount of data, as they are registered for the driver and executors.

### Does this PR introduce any user-facing change?
It introduces a new configuration parameter `spark.metrics.register.static.sources.enabled`

### How was this patch tested?
Manually tested.

```
$ cat conf/metrics.properties
*.sink.prometheusServlet.class=org.apache.spark.metrics.sink.PrometheusServlet
*.sink.prometheusServlet.path=/metrics/prometheus
master.sink.prometheusServlet.path=/metrics/master/prometheus
applications.sink.prometheusServlet.path=/metrics/applications/prometheus

$ bin/spark-shell

$ curl -s http://localhost:4040/metrics/prometheus/ | grep Hive
metrics_local_1573330115306_driver_HiveExternalCatalog_fileCacheHits_Count 0
metrics_local_1573330115306_driver_HiveExternalCatalog_filesDiscovered_Count 0
metrics_local_1573330115306_driver_HiveExternalCatalog_hiveClientCalls_Count 0
metrics_local_1573330115306_driver_HiveExternalCatalog_parallelListingJobCount_Count 0
metrics_local_1573330115306_driver_HiveExternalCatalog_partitionsFetched_Count 0

$ bin/spark-shell --conf spark.metrics.static.sources.enabled=false
$ curl -s http://localhost:4040/metrics/prometheus/ | grep Hive
```

Closes #26320 from LucaCanali/addConfigRegisterStaticMetrics.

Authored-by: Luca Canali <luca.canali@cern.ch>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-09 12:13:13 -08:00
Sean Owen 4d9c36d5ba [SPARK-29795][CORE] Explicitly clear registered metrics on MetricSystem shutdown
### What changes were proposed in this pull request?

Explicitly clear registered metrics when `MetricsSystem` shuts down.

### Why are the changes needed?

See https://issues.apache.org/jira/browse/SPARK-29795 for a complete explanation. The TL;DR is there is some evidence this could leak resources after Spark is shut down, and that may be a minor issue in Spark 3+ for apps or tests that re-start SparkContexts in the same JVM.

### Does this PR introduce any user-facing change?

The possible difference here is that, after Spark is stopped, metrics are no longer available. It's unclear to me whether this is intended behavior anyway.

### How was this patch tested?

See https://issues.apache.org/jira/browse/SPARK-29795 for more context:
- Spark 3 already passes tests without this change
- Spark 2.4 does too, as exists in branch-2.4 now
- Spark 2.4 fails tests if metrics 4.x is used, without this change

The last point is not directly relevant, as Spark 2.4 will not use metrics 4.x. It's evidence that it addresses some potential issue, however.

Closes #26427 from srowen/SPARK-29795.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-11-09 08:47:05 -06:00
Gabor Somogyi 12598e1b93 [MINOR] FsHistoryProvider import cleanup
### What changes were proposed in this pull request?
As it has been discussed in https://github.com/apache/spark/pull/26397#discussion_r343726691 `FsHistoryProvider` import section has to be cleaned up.

### Why are the changes needed?
Unused imports.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Existing unit tests.

Closes #26436 from gaborgsomogyi/SPARK-29755.

Authored-by: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-11-09 08:40:56 -06:00
HyukjinKwon 4ec04e5ef3 [SPARK-22340][PYTHON] Add a mode to pin Python thread into JVM's
## What changes were proposed in this pull request?

This PR proposes to add **Single threading model design (pinned thread model)** mode which is an experimental mode to sync threads on PVM and JVM. See https://www.py4j.org/advanced_topics.html#using-single-threading-model-pinned-thread

### Multi threading model

Currently, PySpark uses this model. Threads on PVM and JVM are independent. For instance, in a different Python thread, callbacks are received and relevant Python codes are executed. JVM threads are reused when possible.

Py4J will create a new thread every time a command is received and there is no thread available. See the current model we're using - https://www.py4j.org/advanced_topics.html#the-multi-threading-model

One problem in this model is that we can't sync threads on PVM and JVM out of the box. This leads to some problems, in particular in some code related to threading on the JVM side. See:
7056e004ee/core/src/main/scala/org/apache/spark/SparkContext.scala (L334)
Due to reusing JVM threads, it seems the job groups in Python threads cannot be set per thread, as described in the JIRA.

### Single threading model design (pinned thread model)

This mode pins and syncs the threads on PVM and JVM to work around the problem above. For instance, in the same Python thread, callbacks are received and relevant Python codes are executed. See https://www.py4j.org/advanced_topics.html#the-single-threading-model

Even though this mode can sync threads on PVM and JVM for other thread-related code paths, it might cause another problem: local properties seem unable to be inherited, as below (assuming multi-thread mode still creates new threads when existing threads are busy, I suspect this issue already exists when multiple jobs are submitted in multi-thread mode; however, it can always be seen in single threading mode):

```bash
$ PYSPARK_PIN_THREAD=true ./bin/pyspark
```

```python
import threading

spark.sparkContext.setLocalProperty("a", "hi")
def print_prop():
    print(spark.sparkContext.getLocalProperty("a"))

threading.Thread(target=print_prop).start()
```

```
None
```

Unlike Scala side:

```scala
spark.sparkContext.setLocalProperty("a", "hi")
new Thread(new Runnable {
  def run() = println(spark.sparkContext.getLocalProperty("a"))
}).start()
```

```
hi
```

This behaviour could potentially cause weird issues, but this PR does not target fixing this for now since this mode is experimental.

### How does this PR fix it?

Basically there are two types of Py4J servers `GatewayServer` and `ClientServer`.  The former is for multi threading and the latter is for single threading. This PR adds a switch to use the latter.

In Scala side:
The logic to select a server is encapsulated in `Py4JServer`, which is used at `PythonRunner` for spark-submit and at `PythonGatewayServer` for the Spark shell. Each uses `ClientServer` when `PYSPARK_PIN_THREAD` is `true` and `GatewayServer` otherwise.

In Python side:
Simply do an if-else to switch the server to talk. It uses `ClientServer` when `PYSPARK_PIN_THREAD` is `true` and `GatewayServer` otherwise.

This is disabled by default for now.

## How was this patch tested?

Manually tested. This can be tested via:

```bash
PYSPARK_PIN_THREAD=true ./bin/pyspark
```

and/or

```bash
cd python
./run-tests --python-executables=python --testnames "pyspark.tests.test_pin_thread"
```

Also, ran the Jenkins tests with `PYSPARK_PIN_THREAD` enabled.

Closes #24898 from HyukjinKwon/pinned-thread.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-11-08 06:44:58 +09:00
Jungtaek Lim (HeartSaVioR) 252ecd333f [SPARK-29635][SS] Extract base test suites between Kafka micro-batch sink and Kafka continuous sink
### What changes were proposed in this pull request?

This patch leverages the V2 continuous memory stream to extract tests from the Kafka micro-batch sink suite and continuous sink suite and deduplicate them. These tests are basically doing the same thing, except for how they run and verify the result.

### Why are the changes needed?

We no longer have the same tests duplicated in two places - this brings a 300-line deletion.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing UTs.

Closes #26292 from HeartSaVioR/SPARK-29635.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-11-06 17:08:42 -08:00
Thomas Graves 075cd557f1 [SPARK-29763] Fix Stage UI Page not showing all accumulators in Task Table
### What changes were proposed in this pull request?
Fix the task table UI to show all accumulators.

The example below creates 2 accumulators:

```scala
scala> val accum = sc.longAccumulator("My Accumulator")
scala> val accum2 = sc.longAccumulator("My Accumulator")
scala> sc.parallelize(Array(1, 2, 3, 4)).foreach(x => {
         accum2.add(x)
         accum.add(x)
       })
```

Before this change, only a single one is shown in the task table:
![beforefixtaskui](https://user-images.githubusercontent.com/4563792/68225858-b0fcd080-ffb6-11e9-8561-3dc25a81a106.png)

After this change you can see all of them:
![tasktablegood](https://user-images.githubusercontent.com/4563792/68225911-c5d96400-ffb6-11e9-952a-18d3738711d1.png)

### Why are the changes needed?

It's not showing all accumulators now.

### Does this PR introduce any user-facing change?

no

### How was this patch tested?

Manual testing the UI.

Closes #26402 from tgravescs/SPARK-29763.

Authored-by: Thomas Graves <tgraves@nvidia.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-11-05 14:15:14 -08:00
Alessandro Bellina 3cb18d90c4 [SPARK-29151][CORE] Support fractional resources for task resource scheduling
### What changes were proposed in this pull request?
This PR adds the ability for tasks to request fractional resources, in order to be able to execute more than 1 task per resource. For example, if you have 1 GPU in the executor, and the task configuration is 0.5 GPU/task, the executor can schedule two tasks to run on that 1 GPU.

### Why are the changes needed?
Currently there is no good way to share a resource such that multiple tasks can run on a single unit. This allows multiple tasks to share an executor resource.

### Does this PR introduce any user-facing change?
Yes: There is a configuration change where `spark.task.resource.[resource type].amount` can now be fractional.
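For illustration (a sketch; `spark.executor.resource.gpu.amount` as the executor-side counterpart is an assumption here):

```scala
import org.apache.spark.SparkConf

// Illustrative only: with 1 GPU per executor and 0.5 GPU per task,
// the executor can schedule two tasks concurrently on that single GPU.
val conf = new SparkConf()
  .set("spark.executor.resource.gpu.amount", "1")
  .set("spark.task.resource.gpu.amount", "0.5")
```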

### How was this patch tested?
Unit tests and manually on standalone mode, and yarn.

Closes #26078 from abellina/SPARK-29151.

Authored-by: Alessandro Bellina <abellina@nvidia.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>
2019-11-05 08:57:43 -06:00
Marcelo Vanzin d51d228048 [SPARK-29397][CORE] Extend plugin interface to include the driver
Spark 2.4 added the ability for executor plugins to be loaded into
Spark (see SPARK-24918). That feature intentionally skipped the
driver to keep changes small, and also because it is possible to
load code into the Spark driver using listeners + configuration.

But that is a bit awkward, because the listener interface does not
provide hooks into a lot of Spark functionality. This change reworks
the executor plugin interface to also extend to the driver.

- there's a "SparkPlugin" main interface that provides APIs to
  load driver and executor components.
- custom metric support (added in SPARK-28091) can be used by
  plugins to register metrics both in the driver process and in
  executors.
- a communication channel now exists that allows the plugin's
  executor components to send messages to the plugin's driver
  component easily, using the existing Spark RPC system.

The latter was a feature intentionally left out of the original
plugin design (also because it didn't include a driver component).

To avoid polluting the "org.apache.spark" namespace, I added the new
interfaces to the "org.apache.spark.api" package, which seems like
a better place in any case. The actual implementation is kept in
an internal package.
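A bare-bones plugin against the new interface might look like the sketch below (assuming the `org.apache.spark.api.plugin` package and the `spark.plugins` conf; treat the exact names as illustrative):

```scala
import org.apache.spark.api.plugin.{DriverPlugin, ExecutorPlugin, SparkPlugin}

// Minimal sketch: a plugin supplying both a driver and an executor component,
// relying on the interfaces' default no-op methods.
class MySparkPlugin extends SparkPlugin {
  override def driverPlugin(): DriverPlugin = new DriverPlugin {}
  override def executorPlugin(): ExecutorPlugin = new ExecutorPlugin {}
}
```

It would then be enabled with something like `spark.plugins=MySparkPlugin`.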

The change includes unit tests for the new interface and features,
but I've also been running a custom plugin that extends the new
API in real applications.

Closes #26170 from vanzin/SPARK-29397.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-11-04 14:33:17 -08:00
shahid 9023c69db8 [SPARK-29590][WEBUI] JDBC/ODBC tab in the spark UI support hide tables, to make it consistent with other tabs
### What changes were proposed in this pull request?

Currently, the JDBC/ODBC tab in the Web UI doesn't support hiding tables. Other tabs in the Web UI, like Jobs, Stages, SQL etc., support hiding tables (refer https://github.com/apache/spark/pull/22592).
In this PR, support for hiding tables is added to the JDBC/ODBC tab as well.

### Why are the changes needed?
The tables in the Spark UI need hide and show features when they contain many records. Sometimes you do not care about the records of a table and just want to see the contents of the next one, but you have to scroll for a long time to reach it.

### Does this PR introduce any user-facing change?
No, except for the support of hiding tables.

### How was this patch tested?
Manually tested
 ![Screenshot 2019-11-01 at 12 10 05 PM](https://user-images.githubusercontent.com/23054875/68007364-61aa5d80-fca1-11e9-841e-c5a7382871fa.png)
![Screenshot 2019-11-01 at 12 10 43 PM](https://user-images.githubusercontent.com/23054875/68007355-5a834f80-fca1-11e9-844a-f4ba1a333db7.png)

Closes #26353 from shahidki31/hideTable.

Authored-by: shahid <shahidki31@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-11-04 09:44:10 -06:00
Kent Yao 8cf76f8d61 [SPARK-29285][SHUFFLE] Temporary shuffle files should be able to handle disk failures
### What changes were proposed in this pull request?

The `getFile` method in `DiskBlockManager` may return a file in an existing subdirectory. But when a disk failure occurs on that subdirectory, the file is inaccessible.
Then a FileNotFoundException like the following usually tears down the entire task, which is a bit heavy.
```
java.io.FileNotFoundException: /mnt/dfs/4/yarn/local/usercache/da_haitao/appcache/application_1568691584183_1953115/blockmgr-cc4689f5-eddd-4b99-8af4-4166a86ec30b/10/temp_shuffle_79be5049-d1d5-4a81-8e67-4ef236d3834f (No such file or directory)
	at java.io.FileOutputStream.open0(Native Method)
	at java.io.FileOutputStream.open(FileOutputStream.java:270)
	at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
	at org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:103)
	at org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:116)
	at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:249)
	at org.apache.spark.shuffle.sort.ShuffleExternalSorter.writeSortedFile(ShuffleExternalSorter.java:209)
	at org.apache.spark.shuffle.sort.ShuffleExternalSorter.closeAndGetSpills(ShuffleExternalSorter.java:416)
	at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.closeAndWriteOutput(UnsafeShuffleWriter.java:230)
	at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:190)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
	at org.apache.spark.scheduler.Task.run(Task.scala:109)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
```
This change pre-touches the temporary file to check whether the parent directory is available. If not, we may try another possibly healthy disk until we reach the max attempts.
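Conceptually the pre-touch-and-retry idea looks like the following sketch (`pickHealthyDir` and the attempt count are illustrative, not the actual `DiskBlockManager` code):

```scala
import java.io.{File, IOException}
import java.util.UUID

// Illustrative sketch: "pre-touch" the temp file so a dead disk surfaces as an
// IOException here, then fall back to another directory instead of failing the task.
def createTempShuffleFile(pickHealthyDir: () => File, attemptsLeft: Int = 3): File = {
  val candidate = new File(pickHealthyDir(), s"temp_shuffle_${UUID.randomUUID()}")
  try {
    candidate.createNewFile()   // fails with IOException if the parent dir's disk is bad
    candidate
  } catch {
    case _: IOException if attemptsLeft > 1 =>
      createTempShuffleFile(pickHealthyDir, attemptsLeft - 1)   // try another directory
  }
}
```
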
### Why are the changes needed?

Re-running the whole task is much heavier than picking another healthy disk to output the temporary results.

### Does this PR introduce any user-facing change?

NO

### How was this patch tested?

ADD UT

Closes #25962 from yaooqinn/SPARK-29285.

Authored-by: Kent Yao <yaooqinn@hotmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-11-04 18:21:57 +08:00
Sean Owen 19b8c71436 [SPARK-29674][CORE] Update dropwizard metrics to 4.1.x for JDK 9+
### What changes were proposed in this pull request?

Update the version of dropwizard metrics that Spark uses for metrics to 4.1.x, from 3.2.x.

### Why are the changes needed?

This helps JDK 9+ support; see for example https://github.com/dropwizard/metrics/pull/1236

### Does this PR introduce any user-facing change?

No, although downstream users with custom metrics may be affected.

### How was this patch tested?

Existing tests.

Closes #26332 from srowen/SPARK-29674.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-03 15:13:06 -08:00
Maxim Gekk 80a89873b2 [SPARK-29733][TESTS] Fix wrong order of parameters passed to assertEquals
### What changes were proposed in this pull request?
The `assertEquals` method of JUnit Assert requires the first parameter to be the expected value. In this PR, I propose to swap the parameters in calls where the expected value is passed as the second parameter.

### Why are the changes needed?
The wrong order of assert parameters is confusing when the assert fails and the parameters have a special string representation. For example:
```java
assertEquals(input1.add(input2), new CalendarInterval(5, 5, 367200000000L));
```
```
java.lang.AssertionError:
Expected :interval 5 months 5 days 101 hours
Actual   :interval 5 months 5 days 102 hours
```

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
By existing tests.

Closes #26377 from MaxGekk/fix-order-in-assert-equals.

Authored-by: Maxim Gekk <max.gekk@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-11-03 11:21:28 -08:00
root1 39fff9258a [SPARK-29452][WEBUI] Improve Storage tab tooltip
### What changes were proposed in this pull request?
Added tooltips for each column in the Storage tab of the Web UI.

### Why are the changes needed?
Tooltips will help users understand the columns of the Storage tab.

### Does this PR introduce any user-facing change?
Yes

### How was this patch tested?
Manually Tested.

Closes #26226 from iRakson/storage_tooltip.

Authored-by: root1 <raksonrakesh@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-11-01 08:27:34 -05:00
Gabor Somogyi 9c817a83fc [SPARK-29637][CORE] Add description to Job SHS web API
### Why are the changes needed?
Starting from Spark 2.3, the SHS REST API endpoint `/applications/<app_id>/jobs/` does not include `description` in the returned JobData; up to Spark 2.2 it did.

In this PR I've added the mentioned field.

### Does this PR introduce any user-facing change?
Yes.

Old API response:
```
[ {
  "jobId" : 0,
  "name" : "foreach at <console>:26",
  "submissionTime" : "2019-10-28T12:41:54.301GMT",
  "completionTime" : "2019-10-28T12:41:54.731GMT",
  "stageIds" : [ 0 ],
  "jobGroup" : "test",
  "status" : "SUCCEEDED",
  "numTasks" : 1,
  "numActiveTasks" : 0,
  "numCompletedTasks" : 1,
  "numSkippedTasks" : 0,
  "numFailedTasks" : 0,
  "numKilledTasks" : 0,
  "numCompletedIndices" : 1,
  "numActiveStages" : 0,
  "numCompletedStages" : 1,
  "numSkippedStages" : 0,
  "numFailedStages" : 0,
  "killedTasksSummary" : { }
} ]
```
New API response:
```
[ {
  "jobId" : 0,
  "name" : "foreach at <console>:26",
  "description" : "job",                            <= This is the addition here
  "submissionTime" : "2019-10-28T13:37:24.107GMT",
  "completionTime" : "2019-10-28T13:37:24.613GMT",
  "stageIds" : [ 0 ],
  "jobGroup" : "test",
  "status" : "SUCCEEDED",
  "numTasks" : 1,
  "numActiveTasks" : 0,
  "numCompletedTasks" : 1,
  "numSkippedTasks" : 0,
  "numFailedTasks" : 0,
  "numKilledTasks" : 0,
  "numCompletedIndices" : 1,
  "numActiveStages" : 0,
  "numCompletedStages" : 1,
  "numSkippedStages" : 0,
  "numFailedStages" : 0,
  "killedTasksSummary" : { }
} ]
```

### How was this patch tested?
Extended + existing unit tests.

Manually:
* Open spark-shell
```
scala> sc.setJobGroup("test", "job", false);
scala> val foo = sc.textFile("/user/foo.txt");
foo: org.apache.spark.rdd.RDD[String] = /user/foo.txt MapPartitionsRDD[1] at textFile at <console>:24
scala> foo.foreach(println);
```
* Access REST API `http://SHS-host:port/api/v1/applications/<app-id>/jobs/`

Closes #26295 from gaborgsomogyi/SPARK-29637.

Authored-by: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-10-29 12:00:52 -07:00
Liang-Chi Hsieh ae5b60da32 [SPARK-29182][CORE][FOLLOWUP] Cache preferred locations of checkpointed RDD
### What changes were proposed in this pull request?

This is a followup to #25856. It fixes the documentation about the config value of spark.rdd.checkpoint.cachePreferredLocsExpireTime.

### Why are the changes needed?

The documentation is not correct: spark.rdd.checkpoint.cachePreferredLocsExpireTime cannot be 0.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

This is a documentation-only change.

Closes #26251 from viirya/SPARK-29182-followup.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-25 11:13:06 -07:00
Pavithra Ramachandran 1296bbb8ac [SPARK-29504][WEBUI] Toggle full job description on click
### What changes were proposed in this pull request?
On clicking a job description on the Jobs page, the description was not shown fully.
This adds a handler for the click event on the description.

### Why are the changes needed?
When there is a long job description, it cannot be seen fully in the UI.
The feature was added in https://github.com/apache/spark/pull/24145
But it is missed after https://github.com/apache/spark/pull/25374

Before change:
![Screenshot from 2019-10-23 11-23-00](https://user-images.githubusercontent.com/51401130/67361914-827b0080-f587-11e9-9181-d49a6a836046.png)
After change: on double click over the description
![Screenshot from 2019-10-23 11-20-02](https://user-images.githubusercontent.com/51401130/67361936-932b7680-f587-11e9-9e59-d290abed4b70.png)

### Does this PR introduce any user-facing change?

No

### How was this patch tested?
Manually test

Closes #26222 from PavithraRamachandran/jobs_description_tooltip.

Authored-by: Pavithra Ramachandran <pavi.rams@gmail.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
2019-10-24 11:14:31 +02:00
DB Tsai fd899d6331 [SPARK-29576][CORE] Use Spark's CompressionCodec for Ser/Deser of MapOutputStatus
### What changes were proposed in this pull request?
Instead of using the ZStd codec directly, we use Spark's CompressionCodec, which wraps the ZStd codec in a buffered stream to avoid the excessive overhead of JNI calls when compressing/decompressing small amounts of data.

Also, by using Spark's CompressionCodec, we can easily make it configurable in the future if needed.

### Why are the changes needed?
Faster performance.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Existing tests.

Closes #26235 from dbtsai/optimizeDeser.

Lead-authored-by: DB Tsai <d_tsai@apple.com>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-23 18:17:37 -07:00
Xianyang Liu 0a7095156b [SPARK-29499][CORE][PYSPARK] Add mapPartitionsWithIndex for RDDBarrier
### What changes were proposed in this pull request?

Add mapPartitionsWithIndex for RDDBarrier.

### Why are the changes needed?

There is only one method in `RDDBarrier`. We often use the partition index as a label for the current partition, but we would need to get the index from `TaskContext` inside `mapPartitions`, which is not convenient.
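Illustrative usage of the new method, assuming it mirrors `RDD.mapPartitionsWithIndex`:

```scala
// Sketch only: tag each element with the index of its barrier-stage partition.
val rdd = sc.parallelize(1 to 8, 4)
val tagged = rdd.barrier().mapPartitionsWithIndex { (idx, iter) =>
  iter.map(x => s"partition-$idx: $x")
}
tagged.collect().foreach(println)
```
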
### Does this PR introduce any user-facing change?

No

### How was this patch tested?

New UT.

Closes #26148 from ConeyLiu/barrier-index.

Authored-by: Xianyang Liu <xianyang.liu@intel.com>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
2019-10-23 13:46:09 +02:00
Sean Owen 80094688fd [SPARK-29556][CORE] Avoid putting request path in error response in ErrorServlet
### What changes were proposed in this pull request?

Don't include `$path` from user query in the error response.

### Why are the changes needed?

The path could contain input that is then rendered as HTML in the error response. It's not clear whether it's exploitable, but better safe than sorry as the path info really isn't that important in this context.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Existing tests.

Closes #26211 from srowen/SPARK-29556.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-22 14:14:59 -07:00
chenjuanni 2036a8cca7 [SPARK-29488][WEBUI] In Web UI, stage page has js error when sort table
### What changes were proposed in this pull request?
In the Web UI, the Stage page has a js error when sorting a table.
https://issues.apache.org/jira/browse/SPARK-29488

### Why are the changes needed?
In the Web UI, following the steps below produces the js error "Uncaught TypeError: Failed to execute 'removeChild' on 'Node': parameter 1 is not of type 'Node'.".
 1) Click "Summary Metrics..." 's tablehead "Min"
 2) Click "Aggregated Metrics by Executor" 's tablehead "Task Time"
 3) Click "Summary Metrics..." 's tablehead "Min"(the same as step 1.)

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
In the Web UI, following the steps below, no error occurs.
 1) Click "Summary Metrics..." 's tablehead "Min"
 2) Click "Aggregated Metrics by Executor" 's tablehead "Task Time"
 3) Click "Summary Metrics..." 's tablehead "Min"(the same as step 1.)
![image](https://user-images.githubusercontent.com/7802338/66899878-464b1b80-f02e-11e9-9660-6cdaab283491.png)

Closes #26136 from cjn082030/SPARK-1.

Authored-by: chenjuanni <chenjuanni@inspur.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-10-22 08:58:12 -05:00
DB Tsai f4d5aa4213 [SPARK-29434][CORE] Improve the MapStatuses Serialization Performance
### What changes were proposed in this pull request?
Instead of using GZIP for compressing the serialized `MapStatuses`, we use ZStd, which provides a better compression rate and faster compression time.

The original approach is serializing and writing data directly into `GZIPOutputStream` as one step; however, the compression time is faster if a bigger chunk of the data is processed by the codec at once. As a result, in this PR, the serialized data is written into an uncompressed byte array first, and then the data is compressed. For smaller `MapStatuses`, we find it's 2x faster.

Here is the benchmark result.

#### 20k map outputs, and each has 500 blocks
1. ZStd two steps in this PR: 0.402 ops/ms, 89,066 bytes
2. ZStd one step as the original approach: 0.370 ops/ms, 89,069 bytes
3. GZip: 0.092 ops/ms, 217,345 bytes

#### 20k map outputs, and each has 5 blocks
1. ZStd two steps in this PR: 0.9 ops/ms, 75,449 bytes
2. ZStd one step as the original approach: 0.38 ops/ms, 75,452 bytes
3. GZip: 0.21 ops/ms, 160,094 bytes

### Why are the changes needed?
Decrease the time for serializing the `MapStatuses` in large scale job.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Existing tests.

Closes #26085 from dbtsai/mapStatus.

Lead-authored-by: DB Tsai <d_tsai@apple.com>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-20 13:56:23 -07:00
Dongjoon Hyun e4b4a35de2 [SPARK-29466][WEBUI] Show Duration for running drivers in Standalone master web UI
### What changes were proposed in this pull request?

This PR aims to add a new column `Duration` for running drivers in the Apache Spark `Standalone` master web UI in order to improve UX. This helps users like the other `Duration` columns in the `Running` and `Completed` application tables.

### Why are the changes needed?

When we use `--supervise`, the drivers can survive longer.
Technically, the `Duration` column is not the same. (Please see the image below.)

### Does this PR introduce any user-facing change?

Yes. The red box is added newly.

<img width="1312" alt="Screen Shot 2019-10-14 at 12 53 43 PM" src="https://user-images.githubusercontent.com/9700541/66779127-50301b80-ee82-11e9-853f-72222cd011ac.png">

### How was this patch tested?

Manual since this is a UI column. After starting standalone cluster and jobs, kill the `DriverWrapper` and see the UI.

```
$ sbin/start-master.sh
$ sbin/start-slave.sh spark://$(hostname):7077
$ bin/spark-submit --master spark://(hostname):7077 --deploy-mode cluster --supervise --class org.apache.spark.examples.JavaSparkPi examples/target/scala-2.12/jars/spark-examples_2.12-3.0.0-SNAPSHOT.jar 1000
$ jps
41521 DriverWrapper
...
$ kill -9 41521   // kill the `DriverWrapper`.
```

Closes #26113 from dongjoon-hyun/SPARK-29466.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-18 15:39:44 -07:00
DB Tsai 23f45f1822 [SPARK-29515][CORE] MapStatuses SerDeser Benchmark

### What changes were proposed in this pull request?
Add benchmark code for MapStatuses serialization & deserialization performance.

### Why are the changes needed?
For comparing the performance differences against optimization.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
No test is required.

Closes #26169 from dbtsai/benchmark.

Lead-authored-by: DB Tsai <d_tsai@apple.com>
Co-authored-by: Dongjoon Hyun <dhyun@apple.com>
Co-authored-by: DB Tsai <dbtsai@dbtsai.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
2019-10-18 21:30:36 +00:00
Yuanjian Li 8616109061 [SPARK-9853][CORE][FOLLOW-UP] Regularize all the shuffle configurations related to adaptive execution
### What changes were proposed in this pull request?
1. Regularize all the shuffle configurations related to adaptive execution.
2. Add default value for `BlockStoreShuffleReader.shouldBatchFetch`.

### Why are the changes needed?
It's a follow-up PR for #26040.
Regularize the existing `spark.sql.adaptive.shuffle` namespace in SQLConf.

### Does this PR introduce any user-facing change?
Rename one released user config `spark.sql.adaptive.minNumPostShufflePartitions` to `spark.sql.adaptive.shuffle.minNumPostShufflePartitions`; the other changed configs are not released yet.

### How was this patch tested?
Existing UT.

Closes #26147 from xuanyuanking/SPARK-9853.

Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-10-18 15:39:35 +08:00
Ivan Gozali 00347a3c78 [SPARK-28762][CORE] Read JAR main class if JAR is not located in local file system
### What changes were proposed in this pull request?

JIRA: https://issues.apache.org/jira/browse/SPARK-28762

TL;DR: Automatically read the `Main-Class` from a JAR's manifest even if the JAR isn't in the local file system (e.g. in S3 or HDFS).

### Why are the changes needed?

When deploying a fat JAR (e.g. using `sbt-assembly`) to S3/HDFS, users might choose to include the main class for the JAR in its manifest. This change allows the user to `spark-submit` the JAR without having to specify the main class again via the `--class` argument.

### Does this PR introduce any user-facing change?

Yes. Previously, if the primary resource was a JAR and wasn't located in the local file system, it would fail with the error:
```
$ spark-submit s3a://nonexistent.jar
Exception in thread "main" org.apache.spark.SparkException: Cannot load main class from JAR s3a://nonexistent.jar with URI s3a. Please specify a class through --class.
    ...
```

With this PR, the main class will be read from the manifest, assuming the classpath contains the appropriate JAR to read the file system.
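
A minimal sketch of the idea, assuming the Hadoop FileSystem API and a hypothetical helper name (this is not the exact code in the PR): open the JAR through the configured file system and read `Main-Class` from its manifest.

```scala
import java.util.jar.JarInputStream

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Works for s3a://, hdfs:// or local paths, as long as the matching FS implementation
// is on the classpath.
def readMainClass(jarUri: String): Option[String] = {
  val path = new Path(jarUri)
  val fs = path.getFileSystem(new Configuration())
  val in = new JarInputStream(fs.open(path))
  try {
    Option(in.getManifest).flatMap(m => Option(m.getMainAttributes.getValue("Main-Class")))
  } finally {
    in.close()
  }
}
```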

### How was this patch tested?

Added some tests in `core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala`.

Closes #25910 from igozali/SPARK-28762.

Authored-by: Ivan Gozali <gozaliivan@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-10-17 14:36:01 -07:00
Jungtaek Lim (HeartSaVioR) 100fc58da5 [SPARK-28869][CORE] Roll over event log files
### What changes were proposed in this pull request?

This patch is a part of [SPARK-28594](https://issues.apache.org/jira/browse/SPARK-28594) and design doc for SPARK-28594 is linked here: https://docs.google.com/document/d/12bdCC4nA58uveRxpeo8k7kGOI2NRTXmXyBOweSi4YcY/edit?usp=sharing

This patch proposes adding new feature to event logging, rolling event log files via configured file size.

Previously, event logging was done with a single file, and the related codebase (`EventLoggingListener`/`FsHistoryProvider`) was tightly coupled with it. This patch adds a layer on both the reader (`EventLogFileReader`) and the writer (`EventLogFileWriter`) to decouple the implementation details of "handling events" from "how to read/write events from/to file".

This patch adds two properties, `spark.eventLog.rollLog` and `spark.eventLog.rollLog.maxFileSize`, which provide configurable rolling behavior. The feature is disabled by default, as we only expect a huge event log for huge/long-running applications; for other cases a single event log file would be sufficient and still simpler. A hedged config example is shown below.
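
A hedged example of turning the feature on (property names as introduced above; the size value below is purely illustrative):

```
spark.eventLog.enabled                true
spark.eventLog.rollLog                true
spark.eventLog.rollLog.maxFileSize    128m
```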

### Why are the changes needed?

This is a part of SPARK-28594 which addresses event log growing infinitely for long-running application.

This patch also provides an option for the situation where the event log file gets huge and consumes storage. End users may give up replaying their events and want to delete the event log file, but since the application is still running and writing the file, it is not safe to delete it. After adopting rolling event logs, end users will be able to delete some of the old files.

### Does this PR introduce any user-facing change?

No, as the new feature is turned off by default.

### How was this patch tested?

Added unit tests, as well as basic manual tests.

Basic manual tests - ran SHS, ran structured streaming query with roll event log enabled, verified split files are generated as well as SHS can load these files, with handling app status as incomplete/complete.

Closes #25670 from HeartSaVioR/SPARK-28869.

Lead-authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Co-authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-10-17 11:15:25 -07:00
Marcelo Vanzin 2f0a38cb50 [SPARK-29398][CORE] Support dedicated thread pools for RPC endpoints
The current RPC backend in Spark supports single- and multi-threaded
message delivery to endpoints, but they all share the same underlying
thread pool. So an RPC endpoint that blocks a dispatcher thread can
negatively affect other endpoints.

This can be more pronounced with configurations that limit the number
of RPC dispatch threads based on configuration and / or running
environment. And exposing the RPC layer to other code (for example
with something like SPARK-29396) could make it easy to affect normal
Spark operation with a badly written RPC handler.

This change adds a new RPC endpoint type that tells the RPC env to
create dedicated dispatch threads, so that those effects are minimised.
Other endpoints will still need CPU to process their messages, but
they won't be able to actively block the dispatch thread of these
isolated endpoints.
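
A minimal, generic sketch of the isolation idea (not Spark's actual RpcEnv/endpoint API): give selected endpoints their own executor so a blocked handler cannot starve the shared dispatcher threads.

```scala
import java.util.concurrent.Executors

trait Endpoint { def receive(msg: Any): Unit }

// Messages posted here run on a dedicated pool and never occupy the shared dispatcher.
final class IsolatedDispatcher(endpoint: Endpoint, numThreads: Int) {
  private val pool = Executors.newFixedThreadPool(numThreads)

  def post(msg: Any): Unit = pool.submit(new Runnable {
    override def run(): Unit = endpoint.receive(msg)
  })

  def stop(): Unit = pool.shutdown()
}
```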

As part of the change, I've changed the most important Spark endpoints
(the driver, executor and block manager endpoints) to be isolated from
others. This means a couple of extra threads are created on the driver
and executor for these endpoints.

Tested with existing unit tests, which hammer the RPC system extensively,
and also by running applications on a cluster (with a prototype of
SPARK-29396).

Closes #26059 from vanzin/SPARK-29398.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
2019-10-17 13:14:32 -05:00
Yuanjian Li 239ee3f561 [SPARK-9853][CORE] Optimize shuffle fetch of continuous partition IDs
This PR takes over #19788. After we split the shuffle fetch protocol from `OpenBlock` in #24565, this optimization can be extended in the new shuffle protocol. Credit to yucai, closes #19788.

### What changes were proposed in this pull request?
This PR adds the support for continuous shuffle block fetching in batch:

- Shuffle client changes:
    - Add new feature tag `spark.shuffle.fetchContinuousBlocksInBatch`, implement the decision logic in `BlockStoreShuffleReader`.
    - Merge the continuous shuffle block ids in batch if needed in ShuffleBlockFetcherIterator.
- Shuffle server changes:
    - Add support in `ExternalBlockHandler` for the external shuffle service side.
    - Make `ShuffleBlockResolver.getBlockData` accept getting block data by range.
- Protocol changes:
    - Add new block id type `ShuffleBlockBatchId` represent continuous shuffle block ids.
    - Extend `FetchShuffleBlocks` and `OneForOneBlockFetcher`.
    - After the new shuffle fetch protocol completed in #24565, the backward compatibility for external shuffle service can be controlled by `spark.shuffle.useOldFetchProtocol`.

### Why are the changes needed?
In adaptive execution, one reducer may fetch multiple continuous shuffle blocks from one map output file. However, with the original approach, each reducer needs to fetch those blocks one by one (10 in the example below), which requires many IOs and hurts performance. This PR adds support for fetching those continuous shuffle blocks in one IO (batch mode). See the example below:

The shuffle block is stored like below:
![image](https://user-images.githubusercontent.com/2989575/51654634-c37fbd80-1fd3-11e9-935e-5652863676c3.png)
The shuffle block ID format is s"shuffle_$shuffleId_$mapId_$reduceId" (see BlockId.scala).

In adaptive execution, one reducer may want to read the output for reducers 5 to 14, whose block IDs range from shuffle_0_x_5 to shuffle_0_x_14.
Before this PR, Spark needs 10 disk IOs + 10 network IOs for each output file.
After this PR, Spark only needs 1 disk IO and 1 network IO. This reduces IO dramatically.
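
A minimal sketch of the batching idea (illustrative names, not Spark's actual classes): collapse contiguous reduce partitions of the same map output into a single range so they can be fetched in one request.

```scala
case class BlockRange(shuffleId: Int, mapId: Long, startReduceId: Int, endReduceId: Int)

def mergeContinuous(shuffleId: Int, mapId: Long, reduceIds: Seq[Int]): Seq[BlockRange] = {
  reduceIds.sorted.foldLeft(List.empty[BlockRange]) {
    // Extend the current range when the next reduce id is contiguous.
    case (BlockRange(s, m, start, end) :: tail, r) if r == end + 1 =>
      BlockRange(s, m, start, r) :: tail
    // Otherwise start a new range.
    case (acc, r) =>
      BlockRange(shuffleId, mapId, r, r) :: acc
  }.reverse
}

// mergeContinuous(0, 7, Seq(5, 6, 7, 8, 10)) ==
//   List(BlockRange(0, 7, 5, 8), BlockRange(0, 7, 10, 10))
```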

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Add new UT.
Integrate test with setting `spark.sql.adaptive.enabled=true`.

Closes #26040 from xuanyuanking/SPARK-9853.

Lead-authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Co-authored-by: yucai <yyu1@ebay.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-10-17 14:47:56 +08:00
prasha2 57edb42582 [SPARK-27259][CORE] Allow setting -1 as length for FileBlock
### What changes were proposed in this pull request?

This PR aims to update the validation check on `length` from `length >= 0` to `length >= -1` in order to allow setting `-1` to keep the default value.

### Why are the changes needed?

At Apache Spark 2.2.0, [SPARK-18702](https://github.com/apache/spark/pull/16133/files#diff-2c5519b1cf4308d77d6f12212971544fR27-R38) adds `class FileBlock` with the default `length` value, `-1`, initially.

There is no way to set `filePath` only while keeping `length` as `-1`.
```scala
def set(filePath: String, startOffset: Long, length: Long): Unit = {
  require(filePath != null, "filePath cannot be null")
  require(startOffset >= 0, s"startOffset ($startOffset) cannot be negative")
  require(length >= 0, s"length ($length) cannot be negative")
  inputBlock.set(new FileBlock(UTF8String.fromString(filePath), startOffset, length))
}
```

For compressed files (like GZ), the size of a split can be set to -1. This was allowed up to Spark 2.1 but regressed starting with Spark 2.2.x. Note that a split length of -1 also means the length is unknown, which is a valid scenario. Thus, a split length of -1 should be acceptable, as it was before Spark 2.2. A sketch of the relaxed check follows.
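
A sketch of the relaxed check (the error message wording is illustrative):

```scala
// -1 keeps the default "unknown length" value; anything below -1 is still rejected.
require(length >= -1, s"length ($length) cannot be smaller than -1")
```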

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

This is updating the corner case on the requirement check. Manually check the code.

Closes #26123 from praneetsharma/fix-SPARK-27259.

Authored-by: prasha2 <prasha22@mail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-15 22:22:37 -07:00
Wenchen Fan 51f10ed90f [SPARK-28560][SQL][FOLLOWUP] code cleanup for local shuffle reader
### What changes were proposed in this pull request?

A followup of https://github.com/apache/spark/pull/25295

This PR proposes a few code cleanups:
1. rename the special `getMapSizesByExecutorId` to `getMapSizesByMapIndex`
2. rename the parameter `mapId` to `mapIndex` as that's really a mapper index.
3. `BlockStoreShuffleReader` should take `blocksByAddress` directly instead of a map id.
4. rename `getMapReader` to `getReaderForOneMapper` to be clearer.

### Why are the changes needed?

make code easier to understand

### Does this PR introduce any user-facing change?

no

### How was this patch tested?

existing tests

Closes #26128 from cloud-fan/followup.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-10-16 11:19:16 +08:00
Dongjoon Hyun 39d53d3e74 [SPARK-29470][BUILD] Update plugins to latest versions
### What changes were proposed in this pull request?

This PR updates plugins to latest versions.

### Why are the changes needed?

This brings bug fixes like the following.
- https://issues.apache.org/jira/projects/MCOMPILER/versions/12343484 (maven-compiler-plugin)
- https://issues.apache.org/jira/projects/MJAVADOC/versions/12345060 (maven-javadoc-plugin)
- https://issues.apache.org/jira/projects/MCHECKSTYLE/versions/12342397 (maven-checkstyle-plugin)
- https://checkstyle.sourceforge.io/releasenotes.html#Release_8.25 (checkstyle)
- https://checkstyle.sourceforge.io/releasenotes.html#Release_8.24 (checkstyle)

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins building and testing with the existing code.

Closes #26117 from dongjoon-hyun/SPARK-29470.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-15 11:55:52 -07:00
Liang-Chi Hsieh 4ecbdbb6a7 [SPARK-29182][CORE] Cache preferred locations of checkpointed RDD
### What changes were proposed in this pull request?

This proposes to add a Spark config to control the caching behavior of ReliableCheckpointRDD.getPreferredLocations. If it is enabled, getPreferredLocations will only compute preferred locations once and cache it for later usage.

The drawback of caching the preferred locations is that when the cached locations are outdated, we lose data locality. This is documented in the config description. To mitigate it, this patch also adds a config to set an expiration time (default is 60 minutes) for the cache. When the time expires, the cache becomes invalid and updated location info needs to be queried again. A sketch of the idea is shown below.
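
A minimal sketch of the caching-with-expiration idea, assuming an illustrative `compute` function (this is not the actual ReliableCheckpointRDD code):

```scala
import java.util.concurrent.TimeUnit

class CachedLocations(expireMinutes: Long, compute: Int => Seq[String]) {
  private case class Entry(locs: Seq[String], cachedAt: Long)
  private val cache = new java.util.concurrent.ConcurrentHashMap[Int, Entry]()

  def preferredLocations(partition: Int): Seq[String] = {
    val now = System.nanoTime()
    val ttl = TimeUnit.MINUTES.toNanos(expireMinutes)
    val entry = cache.get(partition)
    if (entry != null && now - entry.cachedAt < ttl) {
      entry.locs                               // still fresh: reuse cached locations
    } else {
      val locs = compute(partition)            // e.g. query the DFS for block locations
      cache.put(partition, Entry(locs, now))
      locs
    }
  }
}
```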

This adds a test case. The most suitable test suite looks like CheckpointCompressionSuite, so this renames CheckpointCompressionSuite to CheckpointStorageSuite and puts the test case there.

### Why are the changes needed?

One Spark job in our cluster fits many ALS models in parallel. The fitting goes well, but then, when we union all the factors, the union operation is very slow.

Looking into the driver stack dump, it appears the driver spends a lot of time computing preferred locations. As we checkpoint the training data before fitting ALS, the time is spent in ReliableCheckpointRDD.getPreferredLocations. This method calls the DFS interface to query file status and block locations. As we have a large number of partitions derived from the checkpointed RDD, the union spends a lot of time querying the same information.

It reduces the time of the huge union from a few hours to dozens of minutes.

This issue is not limited to ALS, so this change is not specific to ALS. Checkpointing data is actually common usage in Spark, to increase reliability and cut RDD lineage. Spark operations on the checkpointed data will benefit.

### Does this PR introduce any user-facing change?

Yes. This adds a Spark config users can use to control the cache behavior of preferred locations of checkpointed RDD.

### How was this patch tested?

Unit test added and manual test on development cluster.

Closes #25856 from viirya/cache-checkpoint-preferredloc.

Authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Signed-off-by: Liang-Chi Hsieh <liangchi@uber.com>
2019-10-15 10:45:18 -07:00
Yifei Huang 2e28622d8a [SPARK-28211][CORE][SHUFFLE] Propose Shuffle Driver Components API
### What changes were proposed in this pull request?

This is the next step of the Spark-25299 work of proposing a new Shuffle storage API. This patch includes the components of the plugin that hook into the driver, including driver shuffle initialization, application cleanup, and shuffle cleanup.

### How was this patch tested?
Existing unit tests, plus an additional test for testing the interactions between the driver and executor initialization.

Closes #25823 from yifeih/yh/upstream/driver-lifecycle.

Lead-authored-by: Yifei Huang <yifeih@palantir.com>
Co-authored-by: mccheah <mcheah@palantir.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
2019-10-15 12:26:49 -05:00
jiake 9ac4b2dbc5 [SPARK-28560][SQL] Optimize shuffle reader to local shuffle reader when smj converted to bhj in adaptive execution
## What changes were proposed in this pull request?
Implement a rule in the new adaptive execution framework introduced in [SPARK-23128](https://issues.apache.org/jira/browse/SPARK-23128). This rule is used to optimize the shuffle reader to local shuffle reader when smj is converted to bhj in adaptive execution.

## How was this patch tested?
Existing tests

Closes #25295 from JkSelf/localShuffleOptimization.

Authored-by: jiake <ke.a.jia@intel.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-10-15 21:51:15 +08:00
Marcelo Vanzin 857f109c47 [SPARK-10614][CORE] Add monotonic time to Clock interface
This change adds a new method to the Clock interface that returns
the time from a monotonic time source, so that code that needs that
feature can also mock the Clock in tests.
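
A minimal sketch of the idea (not Spark's actual Clock trait): expose a monotonic time source next to the wall-clock one, so both can be mocked in tests.

```scala
trait Clock {
  def getTimeMillis(): Long   // wall clock; subject to adjustment and drift
  def nanoTime(): Long        // monotonic; safe for measuring elapsed time
}

class SystemClock extends Clock {
  override def getTimeMillis(): Long = System.currentTimeMillis()
  override def nanoTime(): Long = System.nanoTime()
}
```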

The original getTimeMillis and waitTillTime methods are unchanged, since
streaming code that uses the Clock interface seems to rely on wall clock
semantics, not monotonic clock. So, in a way, this doesn't directly
address the problem raised in the bug, that waitTillTime can be affected
by drift, but then the places being modified to use the new API don't
really rely on that API.

The dynamic allocation code was modified to use the new time source,
since they shouldn't be basing their decisions on wall clock time.

For a longer discussion on how monotonic clocks work on Linux/x64, the
following blog post (and links within) shed a lot of light on the safety of
`System.nanoTime()`:

http://btorpey.github.io/blog/2014/02/18/clock-sources-in-linux/

Tested with unit test and also running apps with dynamic allocation on.

Closes #26058 from vanzin/SPARK-10614.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-14 19:38:04 -07:00
Thomas Graves a42d894a40 [SPARK-29417][CORE] Resource Scheduling - add TaskContext.resource java api
### What changes were proposed in this pull request?
We added a TaskContext.resources() API, but I realized it returns a Scala Map, which is not ideal for access from Java. Here I add a resourcesJMap function that returns a java.util.Map to make it easily accessible from Java.
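
A minimal sketch of the Java-friendly accessor, using illustrative value types (String instead of Spark's ResourceInformation): the Scala map is simply wrapped for Java callers.

```scala
import scala.collection.JavaConverters._

def resourcesJMap(resources: Map[String, String]): java.util.Map[String, String] =
  resources.asJava
```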

### Why are the changes needed?
Java API access

### Does this PR introduce any user-facing change?
Yes, new TaskContext function to access from Java

### How was this patch tested?
new unit test

Closes #26083 from tgravescs/SPARK-29417.

Lead-authored-by: Thomas Graves <tgraves@ngvpn01-168-221.dyn.scz.us.nvidia.com>
Co-authored-by: Thomas Graves <tgraves@TGRAVES-MLT.local>
Co-authored-by: Thomas Graves <tgraves@nvidia.com>
Signed-off-by: Xiangrui Meng <meng@databricks.com>
2019-10-14 13:27:34 -07:00
sandeep katta ba04562234 [SPARK-29435][CORE] MapId in Shuffle Block is inconsistent at the writer and reader part when spark.shuffle.useOldFetchProtocol=true
### What changes were proposed in this pull request?
Shuffle Block Construction during Shuffle Write and Read is wrong

Shuffle Map Task  (Shuffle Write)
19/10/11 22:07:32| ERROR| [Executor task launch worker for task 3] org.apache.spark.shuffle.IndexShuffleBlockResolver: ####### For Debug ############ /tmp/hadoop-root1/nm-local-dir/usercache/root1/appcache/application_1570422377362_0008/blockmgr-6d03250d-6e7c-4bc2-bbb7-22b8e3981c35/0d/**shuffle_0_3_0.index**

Result Task (Shuffle Read)
19/10/11 22:07:32| ERROR| [Executor task launch worker for task 6] org.apache.spark.storage.ShuffleBlockFetcherIterator: Error occurred while fetching local blocks
java.nio.file.NoSuchFileException: /tmp/hadoop-root1/nm-local-dir/usercache/root1/appcache/application_1570422377362_0008/blockmgr-6d03250d-6e7c-4bc2-bbb7-22b8e3981c35/30/**shuffle_0_0_0.index**
	at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)

As per [SPARK-25341](https://issues.apache.org/jira/browse/SPARK-25341), the `mapId` of `SortShuffleManager.getWriter` changed from `partitionId` to `context.taskAttemptId()`.

[code]( https://github.com/apache/spark/pull/25620/files#diff-363c53ca5a72cfdc37dac4a723309638R54)

But `MapOutputTracker.convertMapStatuses` returns the wrong ShuffleBlock: if `spark.shuffle.useOldFetchProtocol` is enabled, it returns `partitionId` as the `mapId`, which is wrong. [Code](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/MapOutputTracker.scala#L912)

### Why are the changes needed?
The MapStatus returned by the ShuffleWriter already has the mapId, e.g. [code here](https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/shuffle/sort/BypassMergeSortShuffleWriter.java#L134). So it's better to use `status.mapTaskId`.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing UT  and manually tested with `spark.shuffle.useOldFetchProtocol` as true and false

![image](https://user-images.githubusercontent.com/35216143/66716530-4f4caa80-edec-11e9-833d-7131a9fbd442.png)

Closes #26095 from sandeep-katta/shuffleIssue.

Lead-authored-by: sandeep katta <sandeep.katta2007@gmail.com>
Co-authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-10-15 00:03:13 +08:00
Pablo 782a94d289 [SPARK-29433][WEBUI] Fix tooltip stages table
### What changes were proposed in this pull request?
In the Web UI Stages table, the tooltips of the Input and Output columns are not correct.

Actual tooltip messages:

- Bytes and records read from Hadoop or from Spark storage.
- Bytes and records written to Hadoop.

In these columns we only show bytes, not records.

![image](https://user-images.githubusercontent.com/12819544/66608286-85a0e480-ebb6-11e9-812a-9760bea53664.png)
![image](https://user-images.githubusercontent.com/12819544/66608323-96515a80-ebb6-11e9-9e5f-e3f2cc99a3b3.png)
![image](https://user-images.githubusercontent.com/12819544/66608450-de707d00-ebb6-11e9-84e3-0917b5cfe6f6.png)
![image](https://user-images.githubusercontent.com/12819544/66608468-eaf4d580-ebb6-11e9-8c5b-2a9a290bea9c.png)

### Why are the changes needed?
Simple correction of a tooltip

### Does this PR introduce any user-facing change?
Yes, tooltip correction

### How was this patch tested?
Manual testing

Closes #26084 from planga82/feature/SPARK-29433_Tooltip_correction.

Lead-authored-by: Pablo <soypab@gmail.com>
Co-authored-by: Unknown <soypab@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-10-12 10:34:29 -05:00
liucht e94abd7f16 [SPARK-29323][WEBUI] Add tooltip for The Executors Tab's column names in the Spark history server Page
### What changes were proposed in this pull request?

This PR adds tooltips for the Executors tab's column names, including RDD Blocks, Disk Used, Cores, Active Tasks, Failed Tasks, Complete Tasks and Total Tasks, in the History Server page.
https://issues.apache.org/jira/browse/SPARK-29323

![image](https://user-images.githubusercontent.com/28332082/66017759-b6c24a80-e50e-11e9-807b-5b076f701d2f.png)

I have modified the following code in executorspage-template.html
Before:
<th>RDD Blocks</th>
<th>Disk Used</th>
<th>Cores</th>
<th>Active Tasks</th>
<th>Failed Tasks</th>
<th>Complete Tasks</th>
<th>Total Tasks</th>

![image](https://user-images.githubusercontent.com/28332082/66018111-4ddbd200-e510-11e9-9cfc-19f3eae81e76.png)

After:

<th><span data-toggle="tooltip" data-placement="top" title="RDD Blocks">RDD Blocks</span></th>
<th><span data-toggle="tooltip" data-placement="top" title="Disk Used">Disk Used</span></th>
<th><span data-toggle="tooltip" data-placement="top" title="Cores">Cores</span></th>
<th><span data-toggle="tooltip" data-placement="top" title="Active Tasks">Active Tasks</span></th>
<th><span data-toggle="tooltip" data-placement="top" title="Failed Tasks">Failed Tasks</span></th>
<th><span data-toggle="tooltip" data-placement="top" title="Complete Tasks">Complete Tasks</span></th>
<th><span data-toggle="tooltip" data-placement="top" title="Total Tasks">Total Tasks</span></th>

![image](https://user-images.githubusercontent.com/28332082/66018157-79f75300-e510-11e9-96ba-6230aa0940c7.png)

### Why are the changes needed?

In the Executors tab of the History Server page, the Summary section's column headers are formatted inconsistently.
Some column names have tooltips, such as Storage Memory, Task Time (GC Time), Input, Shuffle Read,
Shuffle Write and Blacklisted, but some still have no tooltip: RDD Blocks, Disk Used, Cores, Active Tasks, Failed Tasks, Complete Tasks and Total Tasks. Oddly, in the Executors section below, all of these column names do have tooltips.
It's important for open source projects to have a consistent style and a user-friendly UI, and I'm working on keeping it consistent and more user-friendly.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manual tests for Chrome, Firefox and Safari

Authored-by: liucht-inspur <liucht@inspur.com>
Signed-off-by: liucht-inspur <liucht@inspur.com>

Closes #25994 from liucht-inspur/master.

Authored-by: liucht <liucht@inspur.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-10-12 09:40:31 -05:00
Dongjoon Hyun e946104c42 [SPARK-29400][CORE] Improve PrometheusResource to use labels
### What changes were proposed in this pull request?

[SPARK-29064](https://github.com/apache/spark/pull/25770) introduced `PrometheusResource` to expose `ExecutorSummary`. This PR aims to improve it further to be more `Prometheus`-friendly by using [Prometheus labels](https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels).

### Why are the changes needed?

**BEFORE**
```
metrics_app_20191008151432_0000_driver_executor_rddBlocks_Count 0
metrics_app_20191008151432_0000_driver_executor_memoryUsed_Count 0
metrics_app_20191008151432_0000_driver_executor_diskUsed_Count 0
```

**AFTER**
```
$ curl -s http://localhost:4040/metrics/executors/prometheus/ | head -n3
metrics_executor_rddBlocks_Count{application_id="app-20191008151625-0000", application_name="Spark shell", executor_id="driver"} 0
metrics_executor_memoryUsed_Count{application_id="app-20191008151625-0000", application_name="Spark shell", executor_id="driver"} 0
metrics_executor_diskUsed_Count{application_id="app-20191008151625-0000", application_name="Spark shell", executor_id="driver"} 0
```

### Does this PR introduce any user-facing change?

No, but `Prometheus` understands the new format and presents it more intelligently.

<img width="735" alt="ui" src="https://user-images.githubusercontent.com/9700541/66438279-1756f900-e9e1-11e9-91c7-c04c6ce9172f.png">

### How was this patch tested?

Manually.

**SETUP**
```
$ sbin/start-master.sh
$ sbin/start-slave.sh spark://`hostname`:7077
$ bin/spark-shell --master spark://`hostname`:7077 --conf spark.ui.prometheus.enabled=true
```

**RESULT**
```
$ curl -s http://localhost:4040/metrics/executors/prometheus/ | head -n3
metrics_executor_rddBlocks_Count{application_id="app-20191008151625-0000", application_name="Spark shell", executor_id="driver"} 0
metrics_executor_memoryUsed_Count{application_id="app-20191008151625-0000", application_name="Spark shell", executor_id="driver"} 0
metrics_executor_diskUsed_Count{application_id="app-20191008151625-0000", application_name="Spark shell", executor_id="driver"} 0
```

Closes #26060 from dongjoon-hyun/SPARK-29400.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-10 08:47:12 -07:00
Sean Owen cc7493fa21 [SPARK-29416][CORE][ML][SQL][MESOS][TESTS] Use .sameElements to compare arrays, instead of .deep (gone in 2.13)
### What changes were proposed in this pull request?

Use `.sameElements` to compare (non-nested) arrays, as `Arrays.deep` is removed in 2.13 and wasn't the best way to do this in the first place.
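
An illustrative before/after for a flat (non-nested) array comparison:

```scala
val a = Array(1, 2, 3)
val b = Array(1, 2, 3)
// assert(a.deep == b.deep)   // 2.12-only: `deep` no longer exists in 2.13
assert(a.sameElements(b))     // works on both 2.12 and 2.13
```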

### Why are the changes needed?

To compile with 2.13.

### Does this PR introduce any user-facing change?

None.

### How was this patch tested?

Existing tests.

Closes #26073 from srowen/SPARK-29416.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-09 17:00:48 -07:00
Sean Owen 4d93fb7dfa [SPARK-29413][CORE] Rewrite ThreadUtils.parmap to avoid TraversableLike, gone in 2.13
### What changes were proposed in this pull request?

Rewrite declaration of internal `ThreadUtils.parmap` method to avoid `TraversableLike`, which is removed in Scala 2.13.

### Why are the changes needed?

To compile in Scala 2.13.

### Does this PR introduce any user-facing change?

None.

### How was this patch tested?

Existing tests.

Closes #26072 from srowen/SPARK-29413.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-09 10:53:51 -07:00
Sean Owen 3b0bca42ac [SPARK-29401][FOLLOWUP] Additional cases where a .parallelize call with Array is ambiguous in 2.13
This is just a followup on https://github.com/apache/spark/pull/26062 -- see it for more detail.

I think we will eventually find more cases of this. It's hard to get them all at once as there are many different types of compile errors in earlier modules. I'm trying to address them in as a big a chunk as possible.

Closes #26074 from srowen/SPARK-29401.2.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-09 10:27:05 -07:00
Sean Owen fa95a5c395 [SPARK-29411][CORE][ML][SQL][DSTREAM] Replace use of Unit object with () for Scala 2.13
### What changes were proposed in this pull request?

Replace `Unit` with equivalent `()` where code refers to the `Unit` companion object.

### Why are the changes needed?

It doesn't compile otherwise in Scala 2.13.
- https://github.com/scala/scala/blob/v2.13.0/src/library/scala/Unit.scala#L30

### Does this PR introduce any user-facing change?

Should be no behavior change at all.

### How was this patch tested?

Existing tests.

Closes #26070 from srowen/SPARK-29411.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-09 10:24:13 -07:00
Sean Owen ee83d09b53 [SPARK-29401][CORE][ML][SQL][GRAPHX][TESTS] Replace calls to .parallelize Arrays of tuples, ambiguous in Scala 2.13, with Seqs of tuples
### What changes were proposed in this pull request?

Invocations like `sc.parallelize(Array((1,2)))` cause a compile error in 2.13, like:
```
[ERROR] [Error] /Users/seanowen/Documents/spark_2.13/core/src/test/scala/org/apache/spark/ShuffleSuite.scala:47: overloaded method value apply with alternatives:
  (x: Unit,xs: Unit*)Array[Unit] <and>
  (x: Double,xs: Double*)Array[Double] <and>
  (x: Float,xs: Float*)Array[Float] <and>
  (x: Long,xs: Long*)Array[Long] <and>
  (x: Int,xs: Int*)Array[Int] <and>
  (x: Char,xs: Char*)Array[Char] <and>
  (x: Short,xs: Short*)Array[Short] <and>
  (x: Byte,xs: Byte*)Array[Byte] <and>
  (x: Boolean,xs: Boolean*)Array[Boolean]
 cannot be applied to ((Int, Int), (Int, Int), (Int, Int), (Int, Int))
```
Using a `Seq` instead appears to resolve it, and is effectively equivalent.
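
An illustrative before/after, assuming an existing SparkContext `sc`:

```scala
// sc.parallelize(Array((1, 2), (3, 4)))   // ambiguous overload resolution in 2.13
sc.parallelize(Seq((1, 2), (3, 4)))        // unambiguous and effectively equivalent
```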

### Why are the changes needed?

To better cross-build for 2.13.

### Does this PR introduce any user-facing change?

None.

### How was this patch tested?

Existing tests.

Closes #26062 from srowen/SPARK-29401.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-08 20:22:02 -07:00
Sean Owen 2d871ad0e7 [SPARK-29392][CORE][SQL][STREAMING] Remove symbol literal syntax 'foo, deprecated in Scala 2.13, in favor of Symbol("foo")
### What changes were proposed in this pull request?

Syntax like `'foo` is deprecated in Scala 2.13. Replace usages with `Symbol("foo")`
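
An illustrative replacement:

```scala
val before = 'foo            // symbol literal, deprecated in Scala 2.13
val after  = Symbol("foo")   // equivalent, no deprecation warning
```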

### Why are the changes needed?

Avoids ~50 deprecation warnings when attempting to build with 2.13.

### Does this PR introduce any user-facing change?

None, should be no functional change at all.

### How was this patch tested?

Existing tests.

Closes #26061 from srowen/SPARK-29392.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-08 20:15:37 -07:00
Imran Rashid 0da667d314 [SPARK-28917][CORE] Synchronize access to RDD mutable state
RDD dependencies and partitions can be simultaneously
accessed and mutated by user threads and spark's scheduler threads, so
access must be thread-safe.  In particular, as partitions and
dependencies are lazily-initialized, before this change they could get
initialized multiple times, which would lead to the scheduler having an
inconsistent view of the pendings stages and get stuck.

Tested with existing unit tests.

Closes #25951 from squito/SPARK-28917.

Authored-by: Imran Rashid <irashid@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-10-08 11:35:54 -07:00
Marcelo Vanzin d2f21b0199 [SPARK-27468][CORE] Track correct storage level of RDDs and partitions
Previously, the RDD level would change depending on the status reported
by executors for the block they were storing, and individual blocks would
reflect that. That is wrong because different blocks may be stored differently
in different executors.

So now the RDD tracks the user-provided storage level, while the individual
partitions reflect the current storage level of that particular block,
including the current number of replicas.

Closes #25779 from vanzin/SPARK-27468.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
2019-10-07 16:07:00 -05:00
Xingbo Jiang 80afc79d89 [SPARK-29263][SCHEDULER][FOLLOWUP][TEST] Update FakeTask.createTaskSet() method
### What changes were proposed in this pull request?

Update `FakeTask.createTaskSet()` method to make the generated TaskSet makes use of the `priority` param.

### How was this patch tested?

All the places referring to this method pass `priority = 0`, but we still need to fix this to avoid future test failure.

Closes #26030 from jiangxb1987/SPARK-292630-FOLLOWUP.

Authored-by: Xingbo Jiang <xingbo.jiang@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-05 14:44:58 -07:00
DB Tsai 8b71e545d5 [SPARK-29351][CORE] Avoid Full Synchronization in ShuffleMapStage
### What changes were proposed in this pull request?

This PR use read/write locks instead of `synchronized`.

### Why are the changes needed?

In one of our production streaming jobs, which has more than 1k executors, each with 20 cores, Spark spends a significant portion of time (30s) sending out the `ShuffleStatus`. We found there are two issues.

1. In the driver's message loop, it calls `serializedMapStatus`, which is in a synchronized block. When the job scales really big, this causes contention.
2. When the job is big, the `MapStatus` is huge as well, and the serialization and compression times are slow.

This PR aims to address the first problem.
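
A minimal, generic sketch of the locking change (not the actual ShuffleStatus code): replace a coarse `synchronized` with a ReentrantReadWriteLock so concurrent readers do not contend with each other.

```scala
import java.util.concurrent.locks.ReentrantReadWriteLock

final class LockedState[S](initial: S) {
  private val lock = new ReentrantReadWriteLock()
  private var state: S = initial

  // Many readers may hold the read lock at the same time.
  def read[T](f: S => T): T = {
    lock.readLock().lock()
    try f(state) finally lock.readLock().unlock()
  }

  // Writers are exclusive.
  def write(f: S => S): Unit = {
    lock.writeLock().lock()
    try state = f(state) finally lock.writeLock().unlock()
  }
}
```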

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Test with existing test cases.

Closes #26017 from dbtsai/readWriteLock.

Authored-by: DB Tsai <d_tsai@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
2019-10-04 03:14:10 +00:00
zhouyongjin aedf090ab7 [SPARK-25468][WEBUI][FOLLOWUP] Current page index keep style with dataTable in the spark UI
### What changes were proposed in this pull request?
Keep the current page index style consistent with the dataTable style in the Spark UI.
https://issues.apache.org/jira/browse/SPARK-25468

move
.paginate_button.active > a {
    color: #999999;
    text-decoration: underline;
}

add
.paginate_button.active {
  border: 1px solid #979797 !important;
  background: white linear-gradient(to bottom, #fff 0%, #dcdcdc 100%);
}

### How was this patch tested?
![image](https://user-images.githubusercontent.com/8166352/65872683-b5303f80-e3b3-11e9-9785-9e1d15b0b3cf.png)
![image](https://user-images.githubusercontent.com/8166352/65872692-ba8d8a00-e3b3-11e9-8cf6-db6f905d3387.png)

Closes #25976 from YongjinZhou/master.

Authored-by: zhouyongjin <zhouyongjin@inspur.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-10-03 14:03:21 -05:00
Sean Owen 7aca0dd658 [SPARK-29296][BUILD][CORE] Remove use of .par to make 2.13 support easier; add scala-2.13 profile to enable pulling in par collections library separately, for the future
### What changes were proposed in this pull request?

Scala 2.13 removes the parallel collections classes to a separate library, so first, this establishes a `scala-2.13` profile to bring it back, for future use.

However the library enables use of `.par` implicit conversions via a new class that is not in 2.12, which makes cross-building hard. This implements a suggested workaround from https://github.com/scala/scala-parallel-collections/issues/22 to avoid `.par` entirely.

### Why are the changes needed?

To compile for 2.13 and later to work with 2.13.

### Does this PR introduce any user-facing change?

Should not, no.

### How was this patch tested?

Existing tests.

Closes #25980 from srowen/SPARK-29296.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-10-03 08:56:08 -05:00
Jungtaek Lim (HeartSaVioR) e44d191bf7 [SPARK-29322][CORE] Enable closeFrameOnFlush on ZstdOutputStream for event log file
### What changes were proposed in this pull request?

This patch proposes to enable `closeFrameOnFlush` on ZstdOutputStream specifically for the event logger, so that a continuous zstd input stream does not get stuck when reading an "inprogress" event log file.

The issue seems to have been introduced by [SPARK-26283](https://issues.apache.org/jira/browse/SPARK-26283), which addressed a bug by reading the event log file with continuous mode enabled, but it changed the behavior of the input stream to read an open frame, which seems to wait for the frame to be closed. Enabling `closeFrameOnFlush` closes the frame whenever flush is called, so the input stream can read the frame sooner.

As a pair of `compressedContinuousInputStream`, this patch adds `compressedContinuousOutputStream` which will be only used for event logging.

### Why are the changes needed?

Without this patch, the reader thread in SHS gets stuck reading an in-progress event log file compressed with zstd until the application finishes.

### Does this PR introduce any user-facing change?

It might bring some overhead on each flush when writing a zstd-compressed event log, so some performance hit could be introduced. I've restricted the change to event logging only.

### How was this patch tested?

Manually tested, via setting Spark configuration as below:

```
spark.eventLog.enabled                     true
spark.eventLog.compress                  true
spark.eventLog.compression.codec zstd
```

and start Spark application. While the application is running, load the application in SHS webpage.

Before this patch, it may succeed to replay the event log, but high likely it will be stuck and loading page will be also stuck. After this patch, SHS can properly reads the inprogress event log file.

Closes #25996 from HeartSaVioR/SPARK-29322.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-10-02 20:48:38 -07:00
huangweiyi 85dafabeb4 [SPARK-29273][CORE] Save peakExecutionMemory value when writing task end to event log
In TaskMetrics, there is an exposed metric, peakExecutionMemory, but its value is never set. When a task finishes, it generates a SparkListenerTaskEnd event including the metrics values. The peakExecutionMemory value is actually stored in the accumulables, which are a member of TaskInfo.

So when parsing the SparkListenerTaskEnd event, we can get the `internal.metrics.peakExecutionMemory` value from the parsed TaskInfo object and set it back on the task metrics by supplying a setPeakExecutionMemory method.
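
A minimal sketch of the idea with hypothetical names (not the actual event replay code): look up the accumulable by name and write it back through a setter.

```scala
// Illustrative stand-in for Spark's AccumulableInfo.
case class AccumulableInfo(name: Option[String], value: Option[Any])

def restorePeakExecutionMemory(
    accumulables: Seq[AccumulableInfo],
    setPeakExecutionMemory: Long => Unit): Unit = {
  accumulables
    .find(_.name.contains("internal.metrics.peakExecutionMemory"))
    .flatMap(_.value)
    .foreach(v => setPeakExecutionMemory(v.toString.toLong))
}
```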

Closes #25949 from 012huang/fix-peakExecutionMemory-metrics-value.

Lead-authored-by: huangweiyi <huangweiyi_2006@qq.com>
Co-authored-by: hwy <huangweiyi_2006@qq.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-10-02 08:45:12 -07:00
Josh Rosen c6938eab57 [SPARK-29310][CORE][TESTS] TestMemoryManager should implement getExecutionMemoryUsageForTask()
### What changes were proposed in this pull request?

This PR updates `TestMemoryManager`, a class used only in unit tests, to override the `getExecutionMemoryUsageForTask()` and `releaseAllExecutionMemoryForTask()` methods. I also `synchronized` its state-accessing methods (to make the class thread-safe) and added some additional assertions to guard against freeing more memory than has been allocated.

### Why are the changes needed?

Spark uses a `TestMemoryManager` class to mock out memory manager functionality in tests, allowing test authors to exercise control over certain behaviors (e.g. to simulate OOMs).

Several of Spark's test suites have memory-leak detection to ensure that all allocated memory is cleaned up at the end of each test case; this helps to guard against bugs that could cause production memory leaks. For example, see `testWithMemoryLeakDetection` in `UnsafeFixedWidthAggregationMapSuite`.

Unfortunately, however, this leak-detection logic is broken for tests which use TestMemoryManager because it does not override the `getExecutionMemoryUsageForTask()` method that is used by the leak-detection checks.

This PR fixes that problem, thereby strengthening our existing tests.

I spotted this problem while reviewing #25953: I tried introducing a change to remove a `freePage()` call (purposely inducing a memory leak) but no tests failed.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Added a new `TestMemoryManagerSuite`, with tests covering `TestMemoryManager` itself.

Ran a subset of existing tests on my laptop and uncovered a bug in one test's `free()` calls, plus missing cleanup calls in another test suite; both of these issues are fixed in this PR.

Closes #25985 from JoshRosen/SPARK-29310-testmemorymanager-getExecutionMemoryUsageForTask.

Lead-authored-by: Josh Rosen <rosenville@gmail.com>
Co-authored-by: joshrosen-stripe <48632449+joshrosen-stripe@users.noreply.github.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-10-02 11:07:19 +08:00
Liang-Chi Hsieh 0cd436be66 [SPARK-29244][CORE] Prevent freed page in BytesToBytesMap free again
### What changes were proposed in this pull request?

When BytesToBytesMap cannot allocate a page, the allocated page is freed by TaskMemoryManager. In this case, we should not keep it in the longArray field of BytesToBytesMap; otherwise it might be freed again in the task completion listener of UnsafeFixedWidthAggregationMap and cause a confusing error.

Note that because we catch Throwable when invoking completion listeners, this error should not affect other listeners, except for the current one. The completion listener of UnsafeFixedWidthAggregationMap only performs BytesToBytesMap.free().

BytesToBytesMap.free() does two things: freeing allocated pages and deleting spilled files. When it tries to free an already-freed page, this error is hit and the remaining pages (Executor.cleanUpAllAllocatedMemory will guard against the memory leak) and spilled files are skipped.

### Why are the changes needed?

By chance, it is possible that we free an already freed page in BytesToBytesMap. Because we have some guards when freeing a page, a confusing error is hit:

```
16:07:33.550 ERROR org.apache.spark.TaskContextImpl: Error in TaskCompletionListener
java.lang.AssertionError: Called freePage() on a memory block that has already been freed
        at org.apache.spark.memory.TaskMemoryManager.freePage(TaskMemoryManager.java:332)
        at org.apache.spark.memory.MemoryConsumer.freePage(MemoryConsumer.java:129)
        at org.apache.spark.memory.MemoryConsumer.freeArray(MemoryConsumer.java:107)
        at org.apache.spark.unsafe.map.BytesToBytesMap.free(BytesToBytesMap.java:806)
        at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.free(UnsafeFixedWidthAggregationMap.java:226)
        at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.lambda$new$0(UnsafeFixedWidthAggregationMap.java:112)
        at org.apache.spark.TaskContextImpl.$anonfun$markTaskCompleted$1(TaskContextImpl.scala:119)
        at org.apache.spark.TaskContextImpl.$anonfun$markTaskCompleted$1$adapted(TaskContextImpl.scala:119)
        at org.apache.spark.TaskContextImpl.$anonfun$invokeListeners$1(TaskContextImpl.scala:132)
        at org.apache.spark.TaskContextImpl.$anonfun$invokeListeners$1$adapted(TaskContextImpl.scala:130)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
        at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:130)
        at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:119)
        at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMapSuite.$anonfun$new$19(UnsafeFixedWidthAggregationMapSuite.scala:425)
        at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMapSuite.$anonfun$testWithMemoryLeakDetection$1(UnsafeFixedWidthAggregationMapSuite.scala:87)
        at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
        at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
        at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
        at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
        at org.scalatest.Transformer.apply(Transformer.scala:22)
        at org.scalatest.Transformer.apply(Transformer.scala:20)
        at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
        at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
        at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
        at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
        at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
        at org.scalatest.FunSuiteLike.runTest(FunSuiteLike.scala:196)
        at org.scalatest.FunSuiteLike.runTest$(FunSuiteLike.scala:178)
        at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterEach$$super$runTest(SparkFunSuite.scala:56)
        at org.scalatest.BeforeAndAfterEach.runTest(BeforeAndAfterEach.scala:221)
        at org.scalatest.BeforeAndAfterEach.runTest$(BeforeAndAfterEach.scala:214)
        at org.apache.spark.SparkFunSuite.runTest(SparkFunSuite.scala:56)
        at org.scalatest.FunSuiteLike.$anonfun$runTests$1(FunSuiteLike.scala:229)
        at org.scalatest.SuperEngine.$anonfun$runTestsInBranch$1(Engine.scala:396)
        at scala.collection.immutable.List.foreach(List.scala:392)
        at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:384)
        at org.scalatest.SuperEngine.runTestsInBranch(Engine.scala:379)
        at org.scalatest.SuperEngine.runTestsImpl(Engine.scala:461)
        at org.scalatest.FunSuiteLike.runTests(FunSuiteLike.scala:229)
        at org.scalatest.FunSuiteLike.runTests$(FunSuiteLike.scala:228)
        at org.scalatest.FunSuite.runTests(FunSuite.scala:1560)
        at org.scalatest.Suite.run(Suite.scala:1147)
        at org.scalatest.Suite.run$(Suite.scala:1129)
        at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1560)
        at org.scalatest.FunSuiteLike.$anonfun$run$1(FunSuiteLike.scala:233)
        at org.scalatest.SuperEngine.runImpl(Engine.scala:521)
        at org.scalatest.FunSuiteLike.run(FunSuiteLike.scala:233)
        at org.scalatest.FunSuiteLike.run$(FunSuiteLike.scala:232)
        at org.apache.spark.SparkFunSuite.org$scalatest$BeforeAndAfterAll$$super$run(SparkFunSuite.scala:56)
        at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:213)
        at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)
        at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)
        at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:56)
        at org.scalatest.tools.Framework.org$scalatest$tools$Framework$$runSuite(Framework.scala:314)
        at org.scalatest.tools.Framework$ScalaTestTask.execute(Framework.scala:507)
        at sbt.ForkMain$Run$2.call(ForkMain.java:296)
        at sbt.ForkMain$Run$2.call(ForkMain.java:286)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
[info] - SPARK-29244 *** FAILED *** (16 milliseconds)
[info]   org.apache.spark.util.TaskCompletionListenerException: Called freePage() on a memory block that has already been freed
[info]
[info] Previous exception in task: Unable to acquire 4096 bytes of memory, got 0
[info]  org.apache.spark.memory.MemoryConsumer.throwOom(MemoryConsumer.java:157)
[info]  org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:97)
[info]  org.apache.spark.unsafe.map.BytesToBytesMap.allocate(BytesToBytesMap.java:790)
[info]  org.apache.spark.unsafe.map.BytesToBytesMap.reset(BytesToBytesMap.java:893)
[info]  org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:170)
[info]  org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.destructAndCreateExternalSorter(UnsafeFixedWidthAggregationMap.java:249)
[info]  org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMapSuite.$anonfun$new$19(UnsafeFixedWidthAggregationMapSuite.scala:421)
[info]  org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMapSuite.$anonfun$testWithMemoryLeakDetection$1(UnsafeFixedWidthAggregationMapSuite.scala:87)
```

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Added unit test.

Closes #25953 from viirya/SPARK-29244.

Lead-authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Co-authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Signed-off-by: Liang-Chi Hsieh <liangchi@uber.com>
2019-10-01 11:50:31 -07:00
Jungtaek Lim (HeartSaVioR) a4601cb44e [SPARK-29055][CORE] Update driver/executors' storage memory when block is removed from BlockManager
### What changes were proposed in this pull request?

This patch proposes to fix the issue that storage memory is not decreased even when a block is removed from the BlockManager. The issue was originally found when a removed broadcast was not reflected in the storage memory on driver/executors.

AppStatusListener expects the memory value in block update events to be a "delta" so that it can adjust driver/executors' storage memory based on it, but when removing a block, BlockManager reports the delta as 0, so the storage memory is not decreased. `BlockManager.dropFromMemory` deals with this correctly, so some paths of freeing memory already behave correctly.

### Why are the changes needed?

The storage memory metric in AppStatusListener is out of sync, which easily misleads end users into thinking a memory leak is happening.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Modified UTs. Also manually tested by running a simple query repeatedly and observing the executor page of the Spark UI to verify that the storage memory value decreases as well.

Please refer the description of [SPARK-29055](https://issues.apache.org/jira/browse/SPARK-29055) to get simple reproducer.

Closes #25973 from HeartSaVioR/SPARK-29055.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan.opensource@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-10-01 09:41:51 -07:00
Dongjoon Hyun bd031c2173 [SPARK-29307][BUILD][TESTS] Remove scalatest deprecation warnings
### What changes were proposed in this pull request?

This PR aims to remove `scalatest` deprecation warnings with the following changes.
- `org.scalatest.mockito.MockitoSugar` -> `org.scalatestplus.mockito.MockitoSugar`
- `org.scalatest.selenium.WebBrowser` -> `org.scalatestplus.selenium.WebBrowser`
- `org.scalatest.prop.Checkers` -> `org.scalatestplus.scalacheck.Checkers`
- `org.scalatest.prop.GeneratorDrivenPropertyChecks` -> `org.scalatestplus.scalacheck.ScalaCheckDrivenPropertyChecks`

### Why are the changes needed?

According to the Jenkins logs, there are 118 warnings about this.
```
 grep "is deprecated" ~/consoleText | grep scalatest | wc -l
     118
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

After Jenkins passes, we need to check the Jenkins log.

Closes #25982 from dongjoon-hyun/SPARK-29307.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-30 21:00:11 -07:00
Sean Owen e1ea806b30 [SPARK-29291][CORE][SQL][STREAMING][MLLIB] Change procedure-like declaration to function + Unit for 2.13
### What changes were proposed in this pull request?

Scala 2.13 emits a deprecation warning for procedure-like declarations:

```
def foo() {
 ...
```

This is equivalent to the following, so should be changed to avoid a warning:

```
def foo(): Unit = {
  ...
```

### Why are the changes needed?

It will avoid about a thousand compiler warnings when we start to support Scala 2.13. I wanted to make the change in 3.0, as back-ports from 3.0 to 2.4 are less likely than from 3.1 to 3.0, for example, minimizing the downside of touching so many files.

Unfortunately, that makes this quite a big change.

### Does this PR introduce any user-facing change?

No behavior change at all.

### How was this patch tested?

Existing tests.

Closes #25968 from srowen/SPARK-29291.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-30 10:03:23 -07:00
Unknown 3ea9d6825b [SPARK-29019][WEBUI] Improve tooltip JDBC/ODBC Server tab
### What changes were proposed in this pull request?
Some of the columns of the JDBC/ODBC Server tab in the Web UI are hard to understand.
We documented them in SPARK-28373, but I think it is better to have some tooltips in the SQL statistics table to explain the columns.
![image](https://user-images.githubusercontent.com/12819544/64489775-38e48980-d257-11e9-868a-5f5f6a0f1e46.png)
The columns with new tooltips are finish time, close time, execution time and duration
![image](https://user-images.githubusercontent.com/12819544/64489858-1141f100-d258-11e9-9e4e-fae3299da465.png)
The improvements in UIUtils can be reused in other tables in the Web UI to add tooltips.

### Why are the changes needed?
It is useful to improve the understanding of the Web UI.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Unit tests are added and manual test.

Closes #25723 from planga82/feature/SPARK-29019_tooltipjdbcServer.

Lead-authored-by: Unknown <soypab@gmail.com>
Co-authored-by: Pablo <soypab@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-29 18:34:24 -05:00
Dongjoon Hyun df28671800 [SPARK-29282][TESTS] Use the same VM configurations for test/benchmark
### What changes were proposed in this pull request?

This PR aims to specify the JDK8 default configurations `-XX:+UseParallelGC -XX:-UseDynamicNumberOfGCThreads` explicitly. As we see in this PR [here](https://github.com/apache/spark/pull/25966/files#diff-12b89b7ee67c63c2254b749c8f8d0694R10), this will make the comparison between JDK8 and JDK11 easier by removing a misleading regression.

**NOTE THAT THESE JVM CONFS ARE ONLY FOR BENCHMARK COMPARISON, NOT FOR A PRODUCTION**

### Why are the changes needed?

There exist many JVM-level changes between JDK8 and JDK11. For example, the following are notable changes, and it turns out that especially (1) and (2) show a misleading regression in our micro-benchmark environment because our microbenchmark uses small VM memory.
1. [JEP 248: Make G1 the Default Garbage Collector](https://bugs.openjdk.java.net/browse/JDK-8073273) **JDK9+**
2. [Enable UseDynamicNumberOfGCThreads by default](https://bugs.openjdk.java.net/browse/JDK-8198547) **JDK11+**
3. [Change default value of HeapSizePerGCThread](https://bugs.openjdk.java.net/browse/JDK-8200417) **JDK11+**

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

This is a test-only JVM configuration change. Manually, run the benchmark.

Closes #25966 from dongjoon-hyun/SPARK-29282.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-29 15:11:46 -07:00
Juliusz Sompolski 420abb457d [SPARK-29263][SCHEDULER] Update availableSlots in resourceOffers() before checking available slots for barrier taskSet
### What changes were proposed in this pull request?

availableSlots is computed before the for loop looping over all TaskSets in resourceOffers, but the number of slots changes in every iteration, as in every iteration these slots are taken. The number of available slots checked by a barrier task set therefore has to be recomputed in every iteration from availableCpus.

### Why are the changes needed?

Bugfix.
This could make resourceOffers attempt to start a barrier task set even though it does not have enough slots available. That would then be caught by the `require` in line 519, which throws an exception; the exception gets caught and ignored by Dispatcher's MessageLoop, so nothing terrible happens, but it prevents resourceOffers from considering further TaskSets.
Note that launching the barrier TaskSet can still fail if other requirements are not satisfied, and it can still be rolled back by throwing an exception in this `require`. Handling it more gracefully remains a TODO in SPARK-24818, but this fix should at least resolve the situation where it is unable to launch because of insufficient slots.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Added UT

Closes #23375

Closes #25946 from juliuszsompolski/SPARK-29263.

Authored-by: Juliusz Sompolski <julek@databricks.com>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
2019-09-27 11:18:32 -07:00
sev7e0 cd04607785 [SPARK-29246][CORE] Remove unnecessary imports in core module
### What changes were proposed in this pull request?

Remove unnecessary imports in `core` module.

### Why are the changes needed?

Clean up code for Apache Spark 3.0.0.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Local test.

Closes #25927 from sev7e0/dev_0925.

Authored-by: sev7e0 <sev7e0@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-26 20:43:08 -07:00
Xianyang Liu e07cbbe9c9 [SPARK-29236][CORE] Access 'executorDataMap' out of 'DriverEndpoint' should be protected by lock
### What changes were proposed in this pull request?

Protect `executorDataMap` with a lock when accessing it outside of `DriverEndpoint`'s methods.

### Why are the changes needed?

Just as the comments:

> // Accessing `executorDataMap` in `DriverEndpoint.receive/receiveAndReply` doesn't need any
> // protection. But accessing `executorDataMap` out of `DriverEndpoint.receive/receiveAndReply`
> // must be protected by `CoarseGrainedSchedulerBackend.this`. Besides, `executorDataMap` should
> // only be modified in `DriverEndpoint.receive/receiveAndReply` with protection by
> // `CoarseGrainedSchedulerBackend.this`.

`executorDataMap` is not thread-safe; it should be protected by a lock when accessed outside of `DriverEndpoint`.
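
A minimal sketch of that locking rule, assuming illustrative method names rather than the real CoarseGrainedSchedulerBackend code:

```scala
import scala.collection.mutable.HashMap

class SchedulerBackendSketch {
  // Not thread-safe on its own; per the comment above, all access outside the
  // endpoint's receive/receiveAndReply must hold the backend lock.
  private val executorDataMap = new HashMap[String, Int]()

  // Called from outside DriverEndpoint (e.g. by status queries), so take the lock.
  def numExistingExecutors: Int = this.synchronized {
    executorDataMap.size
  }

  // Mutations happen in the endpoint's message handlers, also under the backend lock.
  private[this] def registerExecutor(id: String, cores: Int): Unit = this.synchronized {
    executorDataMap(id) = cores
  }
}
```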

### Does this PR introduce any user-facing change?

NO

### How was this patch tested?

Existed UT.

Closes #25922 from ConeyLiu/executorDataMap.

Authored-by: Xianyang Liu <xianyang.liu@intel.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-09-25 22:38:59 +08:00
Tomoko Komiyama 58989cd1b0 [SPARK-29168][WEBUI][FOLLOW-UP] Use a dark colors on selected Executor removed on timeline view
### What changes were proposed in this pull request?
Changed Executor color settings in timeline-view.css

### Why are the changes needed?
In the Web UI, the color of an executor changes to dark blue when you click it, which might confuse users.

[ Before ]
![r_before](https://user-images.githubusercontent.com/55128575/65562403-48671080-df81-11e9-98ff-193e3f058cec.png)

[ After ]
![r_after](https://user-images.githubusercontent.com/55128575/65562427-62085800-df81-11e9-8465-6fb9a966d0bc.png)

### Does this PR introduce any user-facing change?
Yes.

### How was this patch tested?
Manually test.

Closes #25921 from TomokoKomiyama/fix-js-2.

Authored-by: Tomoko Komiyama <btkomiyamatm@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-25 05:15:29 -07:00
WeichenXu d8b0914c2e [SPARK-28957][SQL] Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar"
### What changes were proposed in this pull request?

Copy any "spark.hive.foo=bar" spark properties into hadoop conf as "hive.foo=bar"

### Why are the changes needed?
Providing spark side config entry for hive configurations.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
UT.

Closes #25661 from WeichenXu123/add_hive_conf.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-09-25 15:54:44 +08:00
sev7e0 e650f8fdba [SPARK-29230][CORE][TEST] Fix NPE in ProcfsMetricsGetterSuite
### What changes were proposed in this pull request?

When I run `ProcfsMetricsGetterSuite`, it always throws `java.lang.NullPointerException`. I think there is a problem with where `new ProcfsMetricsGetter` is located, which means `SparkEnv` is not initialized in time. This leads to the `java.lang.NullPointerException` when the method is executed.

### Why are the changes needed?
For test.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Local testing

Closes #25918 from sev7e0/dev_0924.

Authored-by: sev7e0 <sev7e0@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-24 14:09:40 -07:00
Gabor Somogyi d75588c57a [SPARK-29082][CORE] Skip delegation token generation if no credentials are available
This PR is an enhanced version of https://github.com/apache/spark/pull/25805, so I've kept the original text. The problem with the original PR can be found in the comments.

This situation can happen when an external system (e.g. Oozie) generates
delegation tokens for a Spark application. The Spark driver will then run
against secured services, have proper credentials (the tokens), but no
kerberos credentials. So trying to do things that requires a kerberos
credential fails.

Instead, if no kerberos credentials are detected, just skip the whole
delegation token code.
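
A minimal sketch of the check, with the surrounding token-manager code simplified away; `UserGroupInformation.hasKerberosCredentials` is a real Hadoop API, everything else is illustrative:

```scala
import org.apache.hadoop.security.UserGroupInformation

// Only run the delegation-token machinery when the current user actually holds
// Kerberos credentials; otherwise assume tokens were provided externally (e.g. Oozie).
def obtainTokensIfPossible(obtainTokens: () => Unit): Unit = {
  val ugi = UserGroupInformation.getCurrentUser
  if (ugi.hasKerberosCredentials) {
    obtainTokens()
  }
}
```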

Tested with an application that simulates Oozie; fails before the fix,
passes with the fix. Also with other DT-related tests to make sure other
functionality keeps working.

Closes #25901 from gaborgsomogyi/SPARK-29082.

Authored-by: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-24 11:12:26 -07:00
TomokoKomiyama cb72b10b91 [SPARK-29168][WEBUI] Use a unique color on selected item on timeline view
### What changes were proposed in this pull request?

Changed color settings in .vis-timeline .vis-item.executor.vis-selected (timeline-view.css)

### Why are the changes needed?

In the Web UI, the executor bar's color changes from blue to green when you click it, which might confuse users.

[ Before ]
![before](https://user-images.githubusercontent.com/55128575/65487629-40a45f00-dee2-11e9-8974-dc7027824b52.png)

[ After ]
![after](https://user-images.githubusercontent.com/55128575/65487674-5580f280-dee2-11e9-8e70-28f4ddcf56c3.png)

### Does this PR introduce any user-facing change?

Yes.

### How was this patch tested?

Manually test.

Closes #25846 from TomokoKomiyama/fix-js.

Authored-by: TomokoKomiyama <btkomiyamatm@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-24 00:15:54 -07:00
Kousuke Saruta 7c8596823a [SPARK-29218][WEBUI] Increase Show Additional Metrics checkbox width in StagePage
### What changes were proposed in this pull request?

Modified widths of some checkboxes in StagePage.

### Why are the changes needed?

When we increase the browser's font size, or the default font size is large, the labels of the `Show Additional Metrics` checkboxes in `StagePage` wrap as follows.

![before-modified1](https://user-images.githubusercontent.com/4736016/65449180-634c5e80-de75-11e9-9f27-88f4cc1313b7.png)
![before-modified2](https://user-images.githubusercontent.com/4736016/65449182-63e4f500-de75-11e9-96b8-46e92a61f40c.png)

### Does this PR introduce any user-facing change?

Yes.

### How was this patch tested?

Run the following and visit the `Stage Detail` page. Then, increase the font size.
```
$ bin/spark-shell
...
scala> spark.range(100000).groupBy("id").count.collect
```

Closes #25905 from sarutak/adjust-checkbox-width.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-23 23:57:08 -07:00
Liang-Chi Hsieh d50f6e6344 [SPARK-25903][CORE] TimerTask should be synchronized on ContextBarrierState
### What changes were proposed in this pull request?

BarrierCoordinator sets up a TimerTask for a round of global sync. Currently the run method is synchronized on the created TimerTask. But to be synchronized with handleRequest, it should be synchronized on the ContextBarrierState object, not TimerTask object.

### Why are the changes needed?

ContextBarrierState.handleRequest and TimerTask.run both access the internal state of a ContextBarrierState object. If the TimerTask is not synchronized on the same ContextBarrierState object, then when the timer task is triggered, handleRequest can still accept new requests and modify the requesters field of the ContextBarrierState object, which makes the behavior inconsistent.
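
A minimal sketch of the locking change, with illustrative state rather than BarrierCoordinator's real fields:

```scala
import java.util.TimerTask

class ContextBarrierStateSketch {
  private var requesters: List[String] = Nil

  // handleRequest synchronizes on the barrier state itself.
  def handleRequest(requester: String): Unit = this.synchronized {
    requesters = requester :: requesters
  }

  // The timeout task must synchronize on the same object; before the fix it was
  // synchronized on the TimerTask, which does not exclude handleRequest.
  def newTimeoutTask(): TimerTask = new TimerTask {
    override def run(): Unit = ContextBarrierStateSketch.this.synchronized {
      requesters = Nil // time out the current round of global sync
    }
  }
}
```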

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Test locally

Closes #25897 from viirya/SPARK-25903.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-09-24 00:13:38 +08:00
Daoyuan Wang c08bc37281 [SPARK-29177][CORE] fix zombie tasks after stage abort
### What changes were proposed in this pull request?
Do the task handling even when the task result exceeds the configured maxResultSize. More details are in the JIRA description: https://issues.apache.org/jira/browse/SPARK-29177 .

### Why are the changes needed?
Without this patch, the zombie tasks prevent YARN from recycling the containers running these tasks, which affects other applications.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
unit test and production test with a very large `SELECT` in spark thriftserver.

Closes #25850 from adrian-wang/zombie.

Authored-by: Daoyuan Wang <me@daoyuan.wang>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-09-23 19:46:01 +08:00
Yuanjian Li f725d472f5 [SPARK-25341][CORE] Support rolling back a shuffle map stage and re-generate the shuffle files
After the newly added shuffle block fetching protocol in #24565, we can keep this work by extending the FetchShuffleBlocks message.

### What changes were proposed in this pull request?
In this patch, we achieve the indeterminate shuffle rerun by reusing the task attempt id (unique within an application) in the shuffle id, so that each shuffle write attempt has a different file name. For an indeterminate stage, when the stage resubmits, we'll clear all existing map statuses and rerun all partitions.

All changes are summarized as follows:
- Change the mapId to mapTaskAttemptId in shuffle related id.
- Record the mapTaskAttemptId in MapStatus.
- Still keep mapId in ShuffleFetcherIterator for fetch failed scenario.
- Add the determinate flag in Stage and use it in DAGScheduler and the cleaning work for the intermediate stage.

### Why are the changes needed?
This is follow-up work for #22112's future improvement[1]: `Currently we can't rollback and rerun a shuffle map stage, and just fail.`

Spark reruns a finished shuffle write stage when it meets fetch failures. Currently, the rerun shuffle map stage only resubmits the tasks for the missing partitions and reuses the output of the other partitions. This logic is fine in most scenarios, but for indeterminate operations (like repartition), multiple shuffle write attempts may write different data, so rerunning only the missing partitions leads to a correctness bug. So for the shuffle map stage of indeterminate operations, we need to support rolling back the shuffle map stage and re-generating the shuffle files.

### Does this PR introduce any user-facing change?
Yes. After this PR, an indeterminate stage rerun is handled by rerunning the whole stage. The original behavior was aborting the stage and failing the job.

### How was this patch tested?
- UT: Add UT for all changing code and newly added function.
- Manual Test: Also providing a manual test to verify the effect.
```
import scala.sys.process._
import org.apache.spark.TaskContext

val determinateStage0 = sc.parallelize(0 until 1000 * 1000 * 100, 10)
val indeterminateStage1 = determinateStage0.repartition(200)
val indeterminateStage2 = indeterminateStage1.repartition(200)
val indeterminateStage3 = indeterminateStage2.repartition(100)
val indeterminateStage4 = indeterminateStage3.repartition(300)
val fetchFailIndeterminateStage4 = indeterminateStage4.map { x =>
if (TaskContext.get.attemptNumber == 0 && TaskContext.get.partitionId == 190 &&
  TaskContext.get.stageAttemptNumber == 0) {
  throw new Exception("pkill -f -n java".!!)
  }
  x
}
val indeterminateStage5 = fetchFailIndeterminateStage4.repartition(200)
val finalStage6 = indeterminateStage5.repartition(100).collect().distinct.length
```
It's a simple job with multiple indeterminate stages. It gets a wrong answer on older Spark versions like 2.2/2.3, and is killed after #22112. With this fix, the job can retry all indeterminate stages as in the screenshot below and get the right result.
![image](https://user-images.githubusercontent.com/4833765/63948434-3477de00-caab-11e9-9ed1-75abfe6d16bd.png)

Closes #25620 from xuanyuanking/SPARK-25341-8.27.

Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-09-23 16:16:52 +08:00
madianjun e2c47876e9 [CORE][MINOR] Correct a log message in DAGScheduler
### What changes were proposed in this pull request?

Correct a word in a log message.

### Why are the changes needed?

The log message will be clearer.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Test is not needed.

Closes #25880 from mdianjun/fix-a-word.

Authored-by: madianjun <madianjun@jd.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-09-22 17:22:37 +09:00
Jungtaek Lim (HeartSaVioR) 81b6f11a3a [SPARK-29160][CORE] Use UTF-8 explicitly for reading/writing event log file
### What changes were proposed in this pull request?

Credit to vanzin as he found and commented on this while reviewing #25670 - [comment](https://github.com/apache/spark/pull/25670#discussion_r325383512).

This patch proposes to specify UTF-8 explicitly while reading/writer event log file.

### Why are the changes needed?

The event log file is read/written using the default character set of the JVM process, which opens the chance of problems when reading event log files written on other machines. Spark's de facto standard character set is UTF-8, so it should be set explicitly.
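
A minimal sketch of the idea (illustrative helpers, not the exact EventLoggingListener/ReplayListenerBus code):

```scala
import java.io.{BufferedReader, InputStream, InputStreamReader, OutputStream, OutputStreamWriter, PrintWriter}
import java.nio.charset.StandardCharsets

// Pin the charset explicitly instead of relying on the JVM default on either side.
def eventLogWriter(out: OutputStream): PrintWriter =
  new PrintWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8))

def eventLogReader(in: InputStream): BufferedReader =
  new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
```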

### Does this PR introduce any user-facing change?

Yes, if end users have been running Spark processes with a default charset other than "UTF-8", especially their driver JVM processes. No otherwise.

### How was this patch tested?

Existing UTs, as ReplayListenerSuite contains "end-to-end" event logging/reading tests (both uncompressed/compressed).

Closes #25845 from HeartSaVioR/SPARK-29160.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-09-21 23:59:37 +09:00
colinma 076186e881 [SPARK-19147][CORE] Gracefully handle error in task after executor is stopped
### What changes were proposed in this pull request?

TransportClientFactory.createClient() is called by tasks and TransportClientFactory.close() is called by the executor.
When the executor is stopped, close() sets workerGroup = null, so an NPE occurs in createClient, which generates many exceptions in the log.
An exception that occurs after close() is now treated as an expected exception
and transformed into an InterruptedException, which can be processed by the Executor.

### Why are the changes needed?

The change reduces the exception stack traces in the log file, so users won't be confused by these expected exceptions.

### Does this PR introduce any user-facing change?

N/A

### How was this patch tested?

New tests are added in TransportClientFactorySuite and ExecutorSuite

Closes #25759 from colinmjj/spark-19147.

Authored-by: colinma <colinma@tencent.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-21 07:31:39 -05:00
Dongjoon Hyun ff3a737c75 [SPARK-29192][TESTS] Extend BenchmarkBase to write JDK9+ results separately
### What changes were proposed in this pull request?

This PR aims to extend the existing benchmarks to save JDK9+ result separately.
All `core` module benchmark test results are added. I'll run the other test suites in another PR.
After regenerating all results, we will check JDK11 performance regressions.

### Why are the changes needed?

From Apache Spark 3.0, we support both JDK8 and JDK11. We need to have a way to find the performance regression.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually run the benchmark.

Closes #25873 from dongjoon-hyun/SPARK-JDK11-PERF.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-20 19:41:25 -07:00
Holden Karau 42050c3f4f [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator
### What changes were proposed in this pull request?

This PR allows the Python toLocalIterator to prefetch the next partition while the first partition is being collected. The PR also adds a demo micro benchmark in the examples directory; we may wish to keep this or not.

### Why are the changes needed?

In https://issues.apache.org/jira/browse/SPARK-23961 / 5e79ae3b40 we changed PySpark to only pull one partition at a time. This is memory efficient, but if partitions take time to compute this can mean we're spending more time blocking.

### Does this PR introduce any user-facing change?

A new param is added to toLocalIterator

### How was this patch tested?

New unit test inside of `test_rdd.py` checks the time that the elements are evaluated at. Another test that the results remain the same are added to `test_dataframe.py`.

I also ran a micro benchmark in the examples directory `prefetch.py` which shows an improvement of ~40% in this specific use case.

> 19/08/16 17:11:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
> Setting default log level to "WARN".
> To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
> Running timers:
>
> [Stage 32:>                                                         (0 + 1) / 1]
> Results:
>
> Prefetch time:
> 100.228110831
>
> Regular time:
> 188.341721614

Closes #25515 from holdenk/SPARK-27659-allow-pyspark-tolocalitr-to-prefetch.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: Holden Karau <hkarau@apple.com>
2019-09-20 09:59:31 -07:00
Jungtaek Lim (HeartSaVioR) 27d0c3f913 [SPARK-29139][CORE][TESTS] Increase timeout to wait for executor(s) to be up in SparkContextSuite
### What changes were proposed in this pull request?

This patch proposes to increase timeout to wait for executor(s) to be up in SparkContextSuite, as we observed these tests failed due to wait timeout.

### Why are the changes needed?

There's some case that CI build is extremely slow which requires 3x or more time to pass the test.
(https://issues.apache.org/jira/browse/SPARK-29139?focusedCommentId=16934034&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16934034)

Allocating a higher timeout doesn't add latency, as the code checks the condition while sleeping 10 ms per loop iteration.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

N/A, as the case is not likely to occur frequently.

Closes #25864 from HeartSaVioR/SPARK-29139.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-20 08:57:47 -07:00
HyukjinKwon a23ad25ba4 [SPARK-29158][SQL][FOLLOW-UP] Create an actual test case under src/test and minor documentation correction
### What changes were proposed in this pull request?

This PR is a follow-up of https://github.com/apache/spark/pull/25838 and proposes to create an actual test case under `src/test`. Previously, a compile-only test existed under `src/main`.

Also, just changed the wordings in `SerializableConfiguration` just only to describe what it does (remove other words).

### Why are the changes needed?

Test code should live in `src/test`, not `src/main`. Also, it should test basic functionality.

### Does this PR introduce any user-facing change?

No except minor doc change.

### How was this patch tested?

Unit test was added.

Closes #25867 from HyukjinKwon/SPARK-29158.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-20 08:52:30 -07:00
Jungtaek Lim (HeartSaVioR) 5e92301723 [SPARK-29161][CORE][SQL][STREAMING] Unify default wait time for waitUntilEmpty
### What changes were proposed in this pull request?

This is a follow-up of the [review comment](https://github.com/apache/spark/pull/25706#discussion_r321923311).

This patch unifies the default wait time to be 10 seconds as it would fit most of UTs (as they have smaller timeouts) and doesn't bring additional latency since it will return if the condition is met.

This patch doesn't touch the one which waits 100000 milliseconds (100 seconds), to not break anything unintentionally, though it is questionable whether we really need to wait for 100 seconds.

### Why are the changes needed?

It simplifies the test code and gets rid of various heuristic timeout values.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

CI build will test the patch, as it would be the best environment to test the patch (builds are running there).

Closes #25837 from HeartSaVioR/MINOR-unify-default-wait-time-for-wait-until-empty.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-19 23:11:54 -07:00
Holden Karau bd05339171 [SPARK-29158][SQL] Expose SerializableConfiguration for DataSource V2 developers
### What changes were proposed in this pull request?

Currently the SerializableConfiguration, which makes the Hadoop configuration serializable is private. This makes it public, with a developer annotation.

### Why are the changes needed?

Many data sources depend on the Hadoop configuration, which may have driver-specific components. This is frequently used inside Spark's own DataSourceV2 implementations (Parquet, JSON, ORC, etc.).
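
A minimal sketch of how a DSv2 implementation might use it; `MyReadTask` is an illustrative name, not part of any Spark API:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.util.SerializableConfiguration

// A read task can carry the driver's Hadoop conf to the executors by wrapping it.
class MyReadTask(serializableConf: SerializableConfiguration) extends Serializable {
  def open(path: String): Unit = {
    val hadoopConf: Configuration = serializableConf.value // rehydrated on the executor
    // ... use hadoopConf to open the path via the appropriate FileSystem
  }
}
```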

### Does this PR introduce any user-facing change?

This provides a new developer API.

### How was this patch tested?

No new tests are added as this only exposes a previously developed & thoroughly used + tested component.

Closes #25838 from holdenk/SPARK-29158-expose-serializableconfiguration-for-dsv2.

Authored-by: Holden Karau <hkarau@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-09-20 14:39:24 +09:00
Dongjoon Hyun 76ebf2241a Revert "[SPARK-29082][CORE] Skip delegation token generation if no credentials are available"
This reverts commit f32f16fd68.
2019-09-19 17:54:42 -07:00
Sean Owen c5d8a51f3b [MINOR][BUILD] Fix about 15 misc build warnings
### What changes were proposed in this pull request?

This addresses about 15 miscellaneous warnings that appear in the current build.

### Why are the changes needed?

No functional changes, it just slightly reduces the amount of extra warning output.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing tests, run manually.

Closes #25852 from srowen/BuildWarnings.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-19 11:37:42 -07:00
Marcelo Vanzin f32f16fd68 [SPARK-29082][CORE] Skip delegation token generation if no credentials are available
This situation can happen when an external system (e.g. Oozie) generates
delegation tokens for a Spark application. The Spark driver will then run
against secured services, have proper credentials (the tokens), but no
kerberos credentials. So trying to do things that requires a kerberos
credential fails.

Instead, if no kerberos credentials are detected, just skip the whole
delegation token code.

Tested with an application that simulates Oozie; fails before the fix,
passes with the fix. Also with other DT-related tests to make sure other
functionality keeps working.

Closes #25805 from vanzin/SPARK-29082.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-09-18 13:30:00 -07:00
Luca Canali cd481773c3 [SPARK-28091][CORE] Extend Spark metrics system with user-defined metrics using executor plugins
## What changes were proposed in this pull request?

This proposes to improve Spark instrumentation by adding a hook for user-defined metrics, extending Spark’s Dropwizard/Codahale metrics system.
The original motivation of this work was to add instrumentation for S3 filesystem access metrics by Spark jobs. Currently, [[ExecutorSource]] instruments HDFS and local filesystem metrics. Rather than extending the code there, we propose with this JIRA to add a metrics plugin system which is more flexible and of more general use.
Context: The Spark metrics system provides a large variety of metrics, useful to monitor and troubleshoot Spark workloads. A typical workflow is to sink the metrics to a storage system and build dashboards on top of that.
Highlights:
-	The metric plugin system makes it easy to implement instrumentation for S3 access by Spark jobs.
-	The metrics plugin system allows for easy extensions of how Spark collects HDFS-related workload metrics. This is currently done using the Hadoop Filesystem GetAllStatistics method, which is deprecated in recent versions of Hadoop. Recent versions of Hadoop Filesystem recommend using method GetGlobalStorageStatistics, which also provides several additional metrics. GetGlobalStorageStatistics is not available in Hadoop 2.7 (had been introduced in Hadoop 2.8). Using a metric plugin for Spark would allow an easy way to “opt in” using such new API calls for those deploying suitable Hadoop versions.
-	We also have the use case of adding Hadoop filesystem monitoring for a custom Hadoop compliant filesystem in use in our organization (EOS using the XRootD protocol). The metrics plugin infrastructure makes this easy to do. Others may have similar use cases.
-	More generally, this method makes it straightforward to plug in Filesystem and other metrics to the Spark monitoring system. Future work on plugin implementation can address extending monitoring to measure usage of external resources (OS, filesystem, network, accelerator cards, etc), that maybe would not normally be considered general enough for inclusion in Apache Spark code, but that can be nevertheless useful for specialized use cases, tests or troubleshooting.

Implementation:
The proposed implementation extends and modifies the work on Executor Plugin of SPARK-24918. Additionally, this is related to recent work on extending Spark executor metrics, such as SPARK-25228.
As discussed during the review, the implementation of this feature modifies the Developer API for Executor Plugins, such that the new version is incompatible with the original version in Spark 2.4.

## How was this patch tested?

This modifies existing tests in ExecutorPluginSuite to adapt them to the API changes. In addition, the new functionality for registering pluginMetrics has been manually tested running Spark on YARN and K8S clusters, in particular for monitoring S3 and for extending HDFS instrumentation with the Hadoop Filesystem "GetGlobalStorageStatistics" metrics. An executor metric plugin example and the code used for testing are available, for example, at: https://github.com/cerndb/SparkExecutorPlugins

Closes #24901 from LucaCanali/executorMetricsPlugin.

Authored-by: Luca Canali <luca.canali@cern.ch>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-09-18 10:32:10 -07:00
Marcelo Vanzin 276aaaae8d [SPARK-29105][CORE] Keep driver log file size up to date in HDFS
HDFS doesn't update the file size reported by the NM if you just keep
writing to the file; this makes the SHS believe the file is inactive,
and so it may delete it after the configured max age for log files.

This change uses hsync to keep the log file as up to date as possible
when using HDFS. It also disables erasure coding by default for these
logs, since hsync (& friends) does not work with EC.
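
A minimal sketch of the syncing step, assuming the HDFS `UPDATE_LENGTH` sync flag is the mechanism used to keep the reported length fresh (the real logic lives in the driver log writer):

```scala
import java.util.EnumSet

import org.apache.hadoop.fs.FSDataOutputStream
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream
import org.apache.hadoop.hdfs.client.HdfsDataOutputStream.SyncFlag

// Keep the file length the NameNode reports up to date; for non-HDFS streams
// fall back to a plain flush.
def syncDriverLog(out: FSDataOutputStream): Unit = out match {
  case hdfsOut: HdfsDataOutputStream =>
    hdfsOut.hsync(EnumSet.of(SyncFlag.UPDATE_LENGTH))
  case other =>
    other.hflush()
}
```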

Tested with a SHS configured to aggressively clean up logs; verified
a spark-shell session kept updating the log, which was not deleted by
the SHS.

Closes #25819 from vanzin/SPARK-29105.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-09-18 09:11:55 -07:00
Pavithra Ramachandran 600a2a4ae5 [SPARK-28972][DOCS] Updating unit description in configurations, to maintain consistency
### What changes were proposed in this pull request?
Updating unit description in configurations, inorder to maintain consistency across configurations.

### Why are the changes needed?
The description does not mention the suffix that can be used when configuring this value.
This improves user understanding.

### Does this PR introduce any user-facing change?
yes. Doc description

### How was this patch tested?
generated document and checked.
![Screenshot from 2019-09-05 11-09-17](https://user-images.githubusercontent.com/51401130/64314853-07a55880-cfce-11e9-8af0-6416a50b0188.png)

Closes #25689 from PavithraRamachandran/heapsize_config.

Authored-by: Pavithra Ramachandran <pavi.rams@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-18 09:11:15 -05:00
turbofei eef5e6d348 [SPARK-29113][DOC] Fix some annotation errors and remove meaningless annotations in project
### What changes were proposed in this pull request?

In this PR, I fix some annotation errors and remove meaningless annotations in the project.
### Why are the changes needed?
There are some annotation errors and meaningless annotations in the project.
### Does this PR introduce any user-facing change?
No.
### How was this patch tested?
Verified manually.

Closes #25809 from turboFei/SPARK-29113.

Authored-by: turbofei <fwang12@ebay.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-09-18 13:12:18 +09:00
Dongjoon Hyun 3ece8ee157 [SPARK-29124][CORE] Use MurmurHash3 bytesHash(data, seed) instead of bytesHash(data)
### What changes were proposed in this pull request?

This PR replaces the `bytesHash(data)` API invocation with the underlying `bytesHash(data, arraySeed)` invocation.
```scala
def bytesHash(data: Array[Byte]): Int = bytesHash(data, arraySeed)
```

### Why are the changes needed?

The original API is changed between Scala versions by the following commit. From Scala 2.12.9, the semantic of the function is changed. If we use the underlying form, we are safe during Scala version migration.
- 846ee2b1a4 (diff-ac889f851e109fc4387cd738d52ce177)

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

This is a kind of refactoring.

Pass the Jenkins with the existing tests.

Closes #25821 from dongjoon-hyun/SPARK-SCALA-HASH.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-09-18 10:33:03 +09:00
Chris Martin 05988b256e [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs
### What changes were proposed in this pull request?

Adds a new cogroup Pandas UDF.  This allows two grouped dataframes to be cogrouped together and apply a (pandas.DataFrame, pandas.DataFrame) -> pandas.DataFrame UDF to each cogroup.

**Example usage**

```
from pyspark.sql.functions import pandas_udf, PandasUDFType
import pandas as pd

df1 = spark.createDataFrame(
    [(20000101, 1, 1.0), (20000101, 2, 2.0), (20000102, 1, 3.0), (20000102, 2, 4.0)],
    ("time", "id", "v1"))

df2 = spark.createDataFrame(
    [(20000101, 1, "x"), (20000101, 2, "y")],
    ("time", "id", "v2"))

@pandas_udf("time int, id int, v1 double, v2 string", PandasUDFType.COGROUPED_MAP)
def asof_join(l, r):
    return pd.merge_asof(l, r, on="time", by="id")

df1.groupby("id").cogroup(df2.groupby("id")).apply(asof_join).show()

```

        +--------+---+---+---+
        |    time| id| v1| v2|
        +--------+---+---+---+
        |20000101|  1|1.0|  x|
        |20000102|  1|3.0|  x|
        |20000101|  2|2.0|  y|
        |20000102|  2|4.0|  y|
        +--------+---+---+---+

### How was this patch tested?

Added unit test test_pandas_udf_cogrouped_map

Closes #24981 from d80tb7/SPARK-27463-poc-arrow-stream.

Authored-by: Chris Martin <chris@cmartinit.co.uk>
Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
2019-09-17 17:13:50 -07:00
Dongjoon Hyun 34915b22ab [SPARK-29104][CORE][TESTS] Fix PipedRDDSuite to use eventually to check thread termination
### What changes were proposed in this pull request?

`PipedRDD` invokes `stdinWriterThread.interrupt()` at task completion, and `obj.wait` gets an `InterruptedException`. However, there is a possibility that the thread termination gets delayed, because the thread resumes from `obj.wait()` with that exception. To prevent test flakiness, we need to use `eventually`. Also, this PR fixes a typo in a code comment and a variable name.

### Why are the changes needed?

```
- stdin writer thread should be exited when task is finished *** FAILED ***
  Some(Thread[stdin writer for List(cat),5,]) was not empty (PipedRDDSuite.scala:107)
```

- https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7/6867/testReport/junit/org.apache.spark.rdd/PipedRDDSuite/stdin_writer_thread_should_be_exited_when_task_is_finished/

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manual.

We can reproduce the same failure like Jenkins if we catch `InterruptedException` and sleep longer than the `eventually` timeout inside the test code. The following is the example to reproduce it.
```scala
val nums = sc.makeRDD(Array(1, 2, 3, 4), 1).map { x =>
  try {
    obj.synchronized {
      obj.wait() // make the thread waits here.
    }
  } catch {
    case ie: InterruptedException =>
      Thread.sleep(15000)
      throw ie
  }
  x
}
```

Closes #25808 from dongjoon-hyun/SPARK-29104.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-09-17 20:21:25 +09:00
WeichenXu 104b9b6f8c [SPARK-28483][FOLLOW-UP] Fix flaky test in BarrierTaskContextSuite
### What changes were proposed in this pull request?

I fix the test "barrier task killed" which is flaky:

* Split the interrupt/no-interrupt tests into separate SparkContexts, to prevent them from influencing each other.
* Only check the exception on partition 0; partition 1 hangs on sleep and may throw a different exception.

### Why are the changes needed?
Make test robust.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
N/A

Closes #25799 from WeichenXu123/oss_fix_barrier_test.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: WeichenXu <weichen.xu@databricks.com>
2019-09-17 19:08:09 +08:00
iRakson 79b10a1aab [SPARK-28929][CORE] Spark Logging level should be INFO instead of DEBUG in Executor Plugin API
### What changes were proposed in this pull request?

Log levels in Executor.scala are changed from DEBUG to INFO.

### Why are the changes needed?

Logging level DEBUG is too low here. These messages are a simple acknowledgement of successful loading and initialization of plugins, so it's better to keep them at INFO level.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Manually tested.

Closes #25634 from iRakson/ExecutorPlugin.

Authored-by: iRakson <raksonrakesh@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-17 00:53:12 -07:00
mcheah 67751e2694 [SPARK-29072][CORE] Put back usage of TimeTrackingOutputStream for UnsafeShuffleWriter and SortShuffleWriter
### What changes were proposed in this pull request?
The previous refactors of the shuffle writers using the shuffle writer plugin resulted in shuffle write metric updates - particularly write times - being lost in particular situations. This patch restores the lost metric updates.
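
A minimal, generic sketch of what a time-tracking stream does (illustrative only, not Spark's actual TimeTrackingOutputStream or its metrics API):

```scala
import java.io.{FilterOutputStream, OutputStream}

// Every write adds its elapsed time to a write-time callback, which is roughly how
// shuffle write times end up being reported to the UI.
class TimeTrackingSketch(out: OutputStream, addWriteTimeNanos: Long => Unit)
  extends FilterOutputStream(out) {

  override def write(b: Int): Unit = {
    val start = System.nanoTime()
    out.write(b)
    addWriteTimeNanos(System.nanoTime() - start)
  }

  override def write(b: Array[Byte], off: Int, len: Int): Unit = {
    val start = System.nanoTime()
    out.write(b, off, len)
    addWriteTimeNanos(System.nanoTime() - start)
  }
}
```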

### Why are the changes needed?
This fixes a regression. I'm pretty sure that without this, the Spark UI will lose shuffle write time information.

### Does this PR introduce any user-facing change?
No change from Spark 2.4. Without this, there would be a user-facing bug in Spark 3.0.

### How was this patch tested?
Existing unit tests.

Closes #25780 from mccheah/fix-write-metrics.

Authored-by: mcheah <mcheah@palantir.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
2019-09-16 09:08:25 -05:00
Dongjoon Hyun 729b3180bc [SPARK-29087][CORE][STREAMING] Use DelegatingServletContextHandler to avoid CCE
### What changes were proposed in this pull request?

[SPARK-27122](https://github.com/apache/spark/pull/24088) fixes `ClassCastException` at `yarn` module by introducing `DelegatingServletContextHandler`. Initially, this was discovered with JDK9+, but the class path issues affected JDK8 environment, too. After [SPARK-28709](https://github.com/apache/spark/pull/25439), I also hit the similar issue at `streaming` module.

This PR aims to fix `streaming` module by adding `getContextPath` to `DelegatingServletContextHandler` and using it.

### Why are the changes needed?

Currently, when we test `streaming` module independently, it fails like the following.
```
$ build/mvn test -pl streaming
...
UISeleniumSuite:
- attaching and detaching a Streaming tab *** FAILED ***
  java.lang.ClassCastException: org.sparkproject.jetty.servlet.ServletContextHandler cannot be cast to org.eclipse.jetty.servlet.ServletContextHandler
...
Tests: succeeded 337, failed 1, canceled 0, ignored 1, pending 0
*** 1 TEST FAILED ***
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
```

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Pass the Jenkins with the modified tests. And do the following manually.
Since you can observe this when you run `streaming` module test only (instead of running all), you need to install the changed `core` module and use it.

```
$ java -version
openjdk version "1.8.0_222"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_222-b10)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.222-b10, mixed mode)
$ build/mvn install -DskipTests
$ build/mvn test -pl streaming
```

Closes #25791 from dongjoon-hyun/SPARK-29087.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-15 10:15:49 -07:00
David Lewis 8c0e961f6c [SPARK-29081][CORE] Replace calls to SerializationUtils.clone on properties with a faster implementation
Replace use of `SerializationUtils.clone` with new `Utils.cloneProperties` method
Add benchmark + results showing dramatic speed up for effectively equivalent functionality.

### What changes were proposed in this pull request?
While I am not sure that SerializationUtils.clone is a performance issue in production, I am sure that it is overkill for the task it is doing (providing a distinct copy of a `Properties` object).
This PR provides a benchmark showing the dramatic improvement over the clone operation and replaces uses of `SerializationUtils.clone` on `Properties` with the more specialized `Utils.cloneProperties`.
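
A minimal sketch of the faster clone (not necessarily the exact `Utils.cloneProperties` body):

```scala
import java.util.Properties

// Strings are immutable, so a shallow entry-by-entry copy is already a distinct,
// safe copy; no serialize/deserialize round trip is needed.
def cloneProperties(props: Properties): Properties = {
  val copied = new Properties()
  val names = props.stringPropertyNames().iterator()
  while (names.hasNext) {
    val key = names.next()
    copied.setProperty(key, props.getProperty(key))
  }
  copied
}
```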

### Does this PR introduce any user-facing change?
Strings are immutable, so there is no reason to serialize and deserialize them; it just creates extra garbage.
The only functionality that changes is the unsupported insertion of non-String objects into the Spark local properties.

### How was this patch tested?

1. Pass the Jenkins with the existing tests.
2. Since this is a performance improvement PR, manually run the benchmark.

Closes #25787 from databricks-david-lewis/SPARK-29081.

Authored-by: David Lewis <david.lewis@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-15 00:28:32 -07:00
Andy Zhang 956f6e988c [SPARK-29080][CORE][SPARKR] Support R file extension case-insensitively
### What changes were proposed in this pull request?

Make r file extension check case insensitive for spark-submit.

### Why are the changes needed?

spark-submit does not accept `.r` files as R scripts. Some codebases have r files that end with lowercase file extensions. It is inconvenient to use spark-submit with lowercase extension R files. The error is not very clear (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L232).

```
$ ./bin/spark-submit examples/src/main/r/dataframe.r
Exception in thread "main" org.apache.spark.SparkException: Cannot load main class from JAR file:/Users/dongjoon/APACHE/spark-release/spark-2.4.4-bin-hadoop2.7/examples/src/main/r/dataframe.r
```

### Does this PR introduce any user-facing change?

Yes. spark-submit can now be used to run R scripts with `.r` file extension.

### How was this patch tested?

Manual.

```
$ mv examples/src/main/r/dataframe.R examples/src/main/r/dataframe.r
$ ./bin/spark-submit examples/src/main/r/dataframe.r
```

Closes #25778 from Loquats/r-case.

Authored-by: Andy Zhang <yue.zhang@databricks.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-15 00:17:11 -07:00
Shanyu Zhao 174d9104cf [SPARK-29003][CORE] Add start method to ApplicationHistoryProvider to avoid deadlock on startup
### What changes were proposed in this pull request?

During Spark History Server startup, two things happen simultaneously that call into `java.nio.file.FileSystems.getDefault()`, and we sometimes hit [JDK-8194653](https://bugs.openjdk.java.net/browse/JDK-8194653).
1) start jetty server
2) start ApplicationHistoryProvider (which reads files from HDFS)

We should do these two things sequentially instead of in parallel.
We introduce a start() method in ApplicationHistoryProvider (and its subclass FsHistoryProvider), and we do the initialization inside the start() method instead of the constructor.
In HistoryServer, we explicitly call provider.start() after we call bind() which starts the Jetty server.
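
A minimal sketch of the startup ordering (names are illustrative, not the actual HistoryServer/FsHistoryProvider signatures):

```scala
// Sequential startup instead of heavy provider initialization in the constructor,
// so the two code paths that touch FileSystems.getDefault() never run concurrently.
trait HistoryProviderSketch {
  def start(): Unit // heavy initialization moved out of the constructor
}

class HistoryServerSketch(provider: HistoryProviderSketch) {
  private def bind(): Unit = { /* start the Jetty server */ }

  def initialize(): Unit = {
    bind()           // 1) bring the web server up first
    provider.start() // 2) only then start reading event logs from HDFS
  }
}
```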

### Why are the changes needed?
It is a bug: occasionally, starting the Spark History Server results in a process hang due to a deadlock among threads.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
I stress tested this PR with a bash script to stop and start Spark History Server more than 1000 times, it worked fine. Previously I can only do the stop/start loop less than 10 times before I hit the deadlock issue.

Closes #25705 from shanyu/shanyu-29003.

Authored-by: Shanyu Zhao <shzhao@microsoft.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-13 21:09:17 -07:00
Dongjoon Hyun 8f057a9612 [SPARK-29032][CORE] Add PrometheusServlet to monitor Master/Worker/Driver
### What changes were proposed in this pull request?

This PR aims to simplify `Prometheus` support by adding `PrometheusServlet`. The main use cases are `K8s` and `Spark Standalone` cluster environments.

### Why are the changes needed?

Prometheus.io is a CNCF project used widely with K8s.
- https://github.com/prometheus/prometheus

For `Master/Worker/Driver`, `Spark JMX Sink` and `Prometheus JMX Converter` combination is used in many cases. One way to achieve that is having the followings.

**JMX Sink (conf/metrics.properties)**
```
*.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
```

**JMX Converter(conf/spark-env.sh)**
- https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/0.12.0/jmx_prometheus_javaagent-0.12.0.jar
```
export SPARK_DAEMON_JAVA_OPTS=
"-javaagent:${PWD}/jmx_prometheus_javaagent-${JMX_PROMETHEUS_VERSION}.jar=
${PORT_AGENT}:jmx_prometheus.yaml"
```

This agent approach requires `PORT_AGENT` additionally. Instead, this PR natively support `Prometheus` format exporting with reusing REST API port for the better UX.

### Does this PR introduce any user-facing change?

Yes. New web interfaces are added along with the existing JSON API.

|        | JSON End Point              | Prometheus End Point              |
| ------ | --------------------------- | --------------------------------- |
| Master | /metrics/master/json/       | /metrics/master/prometheus/       |
| Master | /metrics/applications/json/ | /metrics/applications/prometheus/ |
| Worker | /metrics/json/              | /metrics/prometheus/              |
| Driver | /metrics/json/              | /metrics/prometheus/              |

### How was this patch tested?

Manually connect the new end-points with `curl`.

**Setup (Master/Worker/Driver)**
Add the followings at `conf/metrics.properties` (`conf/metrics.properties.template` has these examples)
```
*.sink.prometheusServlet.class=org.apache.spark.metrics.sink.PrometheusServlet
*.sink.prometheusServlet.path=/metrics/prometheus
master.sink.prometheusServlet.path=/metrics/master/prometheus
applications.sink.prometheusServlet.path=/metrics/applications/prometheus
```
```
$ sbin/start-master.sh
$ sbin/start-slave.sh spark://`hostname`:7077
$ bin/spark-shell --master spark://`hostname`:7077
```

```
$ curl -s http://localhost:8080/metrics/master/json/ | jq
{
  "version": "3.1.3",
  "gauges": {
    "master.aliveWorkers": {
      "value": 1
    },
    "master.apps": {
      "value": 1
    },
    "master.waitingApps": {
      "value": 0
    },
    "master.workers": {
      "value": 1
    }
  },
...
$ curl -s http://localhost:8080/metrics/master/prometheus/ | grep master
metrics_master_aliveWorkers_Value 1
metrics_master_apps_Value 1
metrics_master_waitingApps_Value 0
metrics_master_workers_Value 1
```

```
$ curl -s http://localhost:8080/metrics/applications/json/ | jq
{
  "version": "3.1.3",
  "gauges": {
    "application.Spark shell.1568261490667.cores": {
      "value": 16
    },
    "application.Spark shell.1568261490667.runtime_ms": {
      "value": 108966
    },
    "application.Spark shell.1568261490667.status": {
      "value": "RUNNING"
    }
  },
...
$ curl -s http://localhost:8080/metrics/applications/prometheus/ | grep application
metrics_application_Spark_shell_1568261490667_cores_Value 16
metrics_application_Spark_shell_1568261490667_runtime_ms_Value 143174
```

```
$ curl -s http://localhost:8081/metrics/json/ | jq
{
  "version": "3.1.3",
  "gauges": {
    "worker.coresFree": {
      "value": 0
    },
    "worker.coresUsed": {
      "value": 16
    },
    "worker.executors": {
      "value": 1
    },
    "worker.memFree_MB": {
      "value": 30720
    },
    "worker.memUsed_MB": {
      "value": 1024
    }
  },
...
$ curl -s http://localhost:8081/metrics/prometheus/ | grep worker
metrics_worker_coresFree_Value 0
metrics_worker_coresUsed_Value 16
metrics_worker_executors_Value 1
metrics_worker_memFree_MB_Value 30720
metrics_worker_memUsed_MB_Value 1024
```

```
$ curl -s http://localhost:4040/metrics/json/ | jq
{
  "version": "3.1.3",
  "gauges": {
    "app-20190911211130-0000.driver.BlockManager.disk.diskSpaceUsed_MB": {
      "value": 0
    },
    "app-20190911211130-0000.driver.BlockManager.memory.maxMem_MB": {
      "value": 732
    },
    "app-20190911211130-0000.driver.BlockManager.memory.maxOffHeapMem_MB": {
      "value": 0
    },
    "app-20190911211130-0000.driver.BlockManager.memory.maxOnHeapMem_MB": {
      "value": 732
    },
...
$ curl -s http://localhost:4040/metrics/prometheus/ | head -n5
metrics_app_20190911211130_0000_driver_BlockManager_disk_diskSpaceUsed_MB_Value 0
metrics_app_20190911211130_0000_driver_BlockManager_memory_maxMem_MB_Value 732
metrics_app_20190911211130_0000_driver_BlockManager_memory_maxOffHeapMem_MB_Value 0
metrics_app_20190911211130_0000_driver_BlockManager_memory_maxOnHeapMem_MB_Value 732
metrics_app_20190911211130_0000_driver_BlockManager_memory_memUsed_MB_Value 0
```

Closes #25769 from dongjoon-hyun/SPARK-29032-2.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
2019-09-13 21:31:21 +00:00
Dongjoon Hyun bbfaadb280 [SPARK-29064][CORE] Add PrometheusResource to export Executor metrics
### What changes were proposed in this pull request?

At Apache Spark 3.0.0, [SPARK-23429](https://github.com/apache/spark/pull/21221) added the ability to collect executor metrics via heartbeats and to expose it as a REST API. This PR aims to extend it to support `Prometheus` format additionally.

### Why are the changes needed?

Prometheus.io is a CNCF project used widely with K8s.
- https://github.com/prometheus/prometheus

### Does this PR introduce any user-facing change?

Yes. New web interfaces are added along with the existing JSON API.

|        | JSON End Point                       | Prometheus End Point           |
| ------ | ------------------------------------ | ------------------------------ |
| Driver | /api/v1/applications/{id}/executors/ | /metrics/executors/prometheus/ |

### How was this patch tested?

Manually connect to the new end-points with `curl` and compare with JSON.

**SETUP**
```
$ sbin/start-master.sh
$ sbin/start-slave.sh spark://`hostname`:7077
$ bin/spark-shell --master spark://`hostname`:7077 --conf spark.ui.prometheus.enabled=true
```

**JSON (existing after SPARK-23429)**
```
$ curl -s http://localhost:4040/api/v1/applications/app-20190911204823-0000/executors
[ {
  "id" : "driver",
  "hostPort" : "localhost:52615",
  "isActive" : true,
  "rddBlocks" : 0,
  "memoryUsed" : 0,
  "diskUsed" : 0,
  "totalCores" : 0,
  "maxTasks" : 0,
  "activeTasks" : 0,
  "failedTasks" : 0,
  "completedTasks" : 0,
  "totalTasks" : 0,
  "totalDuration" : 0,
  "totalGCTime" : 0,
  "totalInputBytes" : 0,
  "totalShuffleRead" : 0,
  "totalShuffleWrite" : 0,
  "isBlacklisted" : false,
  "maxMemory" : 384093388,
  "addTime" : "2019-09-12T03:48:23.875GMT",
  "executorLogs" : { },
  "memoryMetrics" : {
    "usedOnHeapStorageMemory" : 0,
    "usedOffHeapStorageMemory" : 0,
    "totalOnHeapStorageMemory" : 384093388,
    "totalOffHeapStorageMemory" : 0
  },
  "blacklistedInStages" : [ ],
  "peakMemoryMetrics" : {
    "JVMHeapMemory" : 229995952,
    "JVMOffHeapMemory" : 145872280,
    "OnHeapExecutionMemory" : 0,
    "OffHeapExecutionMemory" : 0,
    "OnHeapStorageMemory" : 0,
    "OffHeapStorageMemory" : 0,
    "OnHeapUnifiedMemory" : 0,
    "OffHeapUnifiedMemory" : 0,
    "DirectPoolMemory" : 75891,
    "MappedPoolMemory" : 0,
    "ProcessTreeJVMVMemory" : 0,
    "ProcessTreeJVMRSSMemory" : 0,
    "ProcessTreePythonVMemory" : 0,
    "ProcessTreePythonRSSMemory" : 0,
    "ProcessTreeOtherVMemory" : 0,
    "ProcessTreeOtherRSSMemory" : 0,
    "MinorGCCount" : 8,
    "MinorGCTime" : 82,
    "MajorGCCount" : 3,
    "MajorGCTime" : 128
  },
  "attributes" : { },
  "resources" : { }
}, {
  "id" : "0",
  "hostPort" : "127.0.0.1:52619",
  "isActive" : true,
  "rddBlocks" : 0,
  "memoryUsed" : 0,
  "diskUsed" : 0,
  "totalCores" : 16,
  "maxTasks" : 16,
  "activeTasks" : 0,
  "failedTasks" : 0,
  "completedTasks" : 0,
  "totalTasks" : 0,
  "totalDuration" : 0,
  "totalGCTime" : 0,
  "totalInputBytes" : 0,
  "totalShuffleRead" : 0,
  "totalShuffleWrite" : 0,
  "isBlacklisted" : false,
  "maxMemory" : 384093388,
  "addTime" : "2019-09-12T03:48:25.907GMT",
  "executorLogs" : {
    "stdout" : "http://127.0.0.1:8081/logPage/?appId=app-20190911204823-0000&executorId=0&logType=stdout",
    "stderr" : "http://127.0.0.1:8081/logPage/?appId=app-20190911204823-0000&executorId=0&logType=stderr"
  },
  "memoryMetrics" : {
    "usedOnHeapStorageMemory" : 0,
    "usedOffHeapStorageMemory" : 0,
    "totalOnHeapStorageMemory" : 384093388,
    "totalOffHeapStorageMemory" : 0
  },
  "blacklistedInStages" : [ ],
  "attributes" : { },
  "resources" : { }
} ]
```

**Prometheus**
```
$ curl -s http://localhost:4040/metrics/executors/prometheus
metrics_app_20190911204823_0000_driver_executor_rddBlocks_Count 0
metrics_app_20190911204823_0000_driver_executor_memoryUsed_Count 0
metrics_app_20190911204823_0000_driver_executor_diskUsed_Count 0
metrics_app_20190911204823_0000_driver_executor_totalCores_Count 0
metrics_app_20190911204823_0000_driver_executor_maxTasks_Count 0
metrics_app_20190911204823_0000_driver_executor_activeTasks_Count 0
metrics_app_20190911204823_0000_driver_executor_failedTasks_Count 0
metrics_app_20190911204823_0000_driver_executor_completedTasks_Count 0
metrics_app_20190911204823_0000_driver_executor_totalTasks_Count 0
metrics_app_20190911204823_0000_driver_executor_totalDuration_Value 0
metrics_app_20190911204823_0000_driver_executor_totalGCTime_Value 0
metrics_app_20190911204823_0000_driver_executor_totalInputBytes_Count 0
metrics_app_20190911204823_0000_driver_executor_totalShuffleRead_Count 0
metrics_app_20190911204823_0000_driver_executor_totalShuffleWrite_Count 0
metrics_app_20190911204823_0000_driver_executor_maxMemory_Count 384093388
metrics_app_20190911204823_0000_driver_executor_usedOnHeapStorageMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_usedOffHeapStorageMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_totalOnHeapStorageMemory_Count 384093388
metrics_app_20190911204823_0000_driver_executor_totalOffHeapStorageMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_JVMHeapMemory_Count 230406336
metrics_app_20190911204823_0000_driver_executor_JVMOffHeapMemory_Count 146132592
metrics_app_20190911204823_0000_driver_executor_OnHeapExecutionMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_OffHeapExecutionMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_OnHeapStorageMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_OffHeapStorageMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_OnHeapUnifiedMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_OffHeapUnifiedMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_DirectPoolMemory_Count 97049
metrics_app_20190911204823_0000_driver_executor_MappedPoolMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_ProcessTreeJVMVMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_ProcessTreeJVMRSSMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_ProcessTreePythonVMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_ProcessTreePythonRSSMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_ProcessTreeOtherVMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_ProcessTreeOtherRSSMemory_Count 0
metrics_app_20190911204823_0000_driver_executor_MinorGCCount_Count 8
metrics_app_20190911204823_0000_driver_executor_MinorGCTime_Count 82
metrics_app_20190911204823_0000_driver_executor_MajorGCCount_Count 3
metrics_app_20190911204823_0000_driver_executor_MajorGCTime_Count 128
metrics_app_20190911204823_0000_0_executor_rddBlocks_Count 0
metrics_app_20190911204823_0000_0_executor_memoryUsed_Count 0
metrics_app_20190911204823_0000_0_executor_diskUsed_Count 0
metrics_app_20190911204823_0000_0_executor_totalCores_Count 16
metrics_app_20190911204823_0000_0_executor_maxTasks_Count 16
metrics_app_20190911204823_0000_0_executor_activeTasks_Count 0
metrics_app_20190911204823_0000_0_executor_failedTasks_Count 0
metrics_app_20190911204823_0000_0_executor_completedTasks_Count 0
metrics_app_20190911204823_0000_0_executor_totalTasks_Count 0
metrics_app_20190911204823_0000_0_executor_totalDuration_Value 0
metrics_app_20190911204823_0000_0_executor_totalGCTime_Value 0
metrics_app_20190911204823_0000_0_executor_totalInputBytes_Count 0
metrics_app_20190911204823_0000_0_executor_totalShuffleRead_Count 0
metrics_app_20190911204823_0000_0_executor_totalShuffleWrite_Count 0
metrics_app_20190911204823_0000_0_executor_maxMemory_Count 384093388
metrics_app_20190911204823_0000_0_executor_usedOnHeapStorageMemory_Count 0
metrics_app_20190911204823_0000_0_executor_usedOffHeapStorageMemory_Count 0
metrics_app_20190911204823_0000_0_executor_totalOnHeapStorageMemory_Count 384093388
metrics_app_20190911204823_0000_0_executor_totalOffHeapStorageMemory_Count 0
```

Closes #25770 from dongjoon-hyun/SPARK-29064.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
2019-09-13 21:28:27 +00:00
Liang-Chi Hsieh c610de6952 [SPARK-29042][CORE] Sampling-based RDD with unordered input should be INDETERMINATE
### What changes were proposed in this pull request?

We have already found and fixed correctness issues when RDD output is INDETERMINATE. One missing part is sampling-based RDDs. This kind of RDD is order-sensitive to its input, so a sampling-based RDD with unordered input should be INDETERMINATE.

### Why are the changes needed?

A sampling-based RDD with unordered input is just like a MapPartitionsRDD with the isOrderSensitive parameter set to true: the RDD output can be different after a rerun.
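
A minimal sketch of the rule; Spark's real DeterministicLevel is internal to the rdd package, so this sketch defines its own enumeration:

```scala
object DeterminismSketch extends Enumeration {
  val DETERMINATE, UNORDERED, INDETERMINATE = Value

  // A sampling (order-sensitive) RDD over input that is only deterministic as an
  // unordered set must report its own output as indeterminate.
  def sampledOutputLevel(inputLevel: Value): Value =
    if (inputLevel == UNORDERED) INDETERMINATE else inputLevel
}
```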

It is a problem in ML applications.

In ML, sample is used to prepare training data. The ML algorithm fits the model based on the sampled data. If rerun sample tasks produce different output during model fitting, the ML results will be unreliable and buggy.

Each sample is random output, but once you sampled, the output should be determinate.
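
A hedged spark-shell sketch of the rerun sensitivity described here (numbers and seed are illustrative, not taken from the patch):

```
// Sampling over unordered (shuffled) input: the seed fixes the sampled
// *positions*, but a rerun of the upstream shuffle tasks can place different
// values at those positions, so the sampled *values* are not reproducible
// unless the stage is treated as INDETERMINATE and fully retried.
val unordered = sc.parallelize(1 to 100000, 8).repartition(4)   // per-partition row order is not deterministic
val sampled = unordered.sample(withReplacement = false, fraction = 0.001, seed = 42L)
sampled.collect().take(5)
```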

### Does this PR introduce any user-facing change?

Previously, a sampling-based RDD could produce different output after a rerun.
After this patch, a sampling-based RDD with unordered input is INDETERMINATE. For an INDETERMINATE map stage, the Spark scheduler currently retries all tasks of the failed stage.

### How was this patch tested?

Added test.

Closes #25751 from viirya/sample-order-sensitive.

Authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Signed-off-by: Liang-Chi Hsieh <liangchi@uber.com>
2019-09-13 14:07:00 -07:00
dengziming 8f632d7045 [MINOR][DOCS] Fix few typos in the java docs
JIRA: https://issues.apache.org/jira/browse/SPARK-29050
'a hdfs' changed to 'an hdfs'
'an unique' changed to 'a unique'
'an url' changed to 'a url'
'a error' changed to 'an error'

Closes #25756 from dengziming/feature_fix_typos.

Authored-by: dengziming <dengziming@growingio.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-09-12 09:30:03 +09:00
Jungtaek Lim (HeartSaVioR) 2736efa32d [SPARK-26989][CORE][TEST] DAGSchedulerSuite: ensure listeners are fully processed before checking recorded values
### What changes were proposed in this pull request?

This patch ensures that recorded values in listeners are only accessed after the listeners have fully processed all events. To ensure this, it adds a new class that hides these values and exposes them through methods that enforce the condition above. Without this guard, two threads run concurrently - 1) the listener-processing thread and 2) the test's main thread - and a race condition can occur.

That's also why we see the very odd situation of an error message saying the condition is met while the test still fails:
```
- Barrier task failures from the same stage attempt don't trigger multiple stage retries *** FAILED ***
  ArrayBuffer(0) did not equal List(0) (DAGSchedulerSuite.scala:2656)
```
which means the verification failed, yet the condition became satisfied just before the error message was constructed.

The guard is properly placed in many spots but missing in some. This patch enforces that it cannot be missed.
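
A hedged sketch of the guard pattern (class and method names are illustrative, not the actual class added by this patch): recorded values are read only through an accessor that first waits until the listener has processed the terminal event, so the test thread never races the listener-bus dispatch thread.

```
import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.collection.mutable.ArrayBuffer
import org.apache.spark.scheduler.{SparkListener, SparkListenerJobEnd}

class RecordingListener extends SparkListener {
  private val done = new CountDownLatch(1)
  private val jobIds = ArrayBuffer[Int]()

  override def onJobEnd(jobEnd: SparkListenerJobEnd): Unit = synchronized {
    jobIds += jobEnd.jobId
    done.countDown()
  }

  // Tests read recorded values only through this accessor, which blocks until
  // the listener has actually processed the job-end event.
  def recordedJobIds(timeoutMs: Long = 10000L): Seq[Int] = {
    assert(done.await(timeoutMs, TimeUnit.MILLISECONDS), "listener did not see job end in time")
    synchronized { jobIds.toList }
  }
}
```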

### Why are the changes needed?

The UT fails intermittently and this patch addresses the flakiness.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Modified UT.

I also made the flaky tests fail artificially by applying a 50 ms sleep to each onXXX method.

![Screen Shot 2019-09-07 at 7 44 15 AM](https://user-images.githubusercontent.com/1317309/64465178-1747ad00-d146-11e9-92f6-f4ed4a1f4b08.png)

I found 3 test methods failing (marked as X). The ones marked with ! can be ignored: they failed while waiting for the listener within the given timeout, and they don't deal with these recorded values - they use a 1000 ms timeout rather than the 10000 ms used for this listener, so they were affected only as a side effect.

After applying the same fix as in this patch, all tests marked as X passed.

Closes #25706 from HeartSaVioR/SPARK-26989.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-09-11 10:24:57 -07:00
Liu,Linhong f263909ee1 [SPARK-23243][CORE][FOLLOWUP] Remove todo added by SPARK-23207
### What changes were proposed in this pull request?
PR #22112 fixed the TODO added by PR #20393 (SPARK-23207). We can remove it now.

### Why are the changes needed?
In order not to confuse developers.

### Does this PR introduce any user-facing change?
no

### How was this patch tested?
no need to test

Closes #25755 from LinhongLiu/remove-todo.

Authored-by: Liu,Linhong <liulinhong@baidu.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-09-11 21:51:48 +08:00
mcheah 7f36cd2aa5 [SPARK-28570][CORE][SHUFFLE] Make UnsafeShuffleWriter use the new API
## What changes were proposed in this pull request?

Uses the APIs introduced in SPARK-28209 in the UnsafeShuffleWriter.

## How was this patch tested?

Since this is just a refactor, existing unit tests should cover the relevant code paths. Micro-benchmarks from the original fork where this code was built show no degradation in performance.

Closes #25304 from mccheah/shuffle-writer-refactor-unsafe-writer.

Lead-authored-by: mcheah <mcheah@palantir.com>
Co-authored-by: mccheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-09-10 17:30:02 -07:00
hongdd bdc1598a43 [SPARK-28657][CORE] Fix currentContext Instance failed sometimes
## What changes were proposed in this pull request?

Running spark on yarn, I got
```
java.lang.ClassCastException: org.apache.hadoop.ipc.CallerContext$Builder cannot be cast to scala.runtime.Nothing$
```
Utils.classForName returns Class[Nothing]; I think it should be defined as Class[_] to resolve this issue.
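
A hedged sketch of the Scala type-inference pitfall (the helper below is illustrative, not Spark's exact `Utils.classForName`):

```
// If the type parameter is left to inference at a call site with no expected
// type, C is inferred as Nothing; the compiler then inserts a cast to
// scala.runtime.Nothing$ around uses of the instance, which fails at runtime
// with the ClassCastException shown above.
def classForName[C](className: String): Class[C] =
  Class.forName(className).asInstanceOf[Class[C]]

// Inferred as Class[Nothing] -- using the instance would throw:
// classForName("org.apache.hadoop.ipc.CallerContext$Builder")
//   .getDeclaredConstructor().newInstance()

// A wildcard result type (or an explicit type argument) keeps the value usable.
val clazz: Class[_] = classForName[AnyRef]("java.lang.StringBuilder")
println(clazz.getDeclaredConstructor().newInstance().getClass.getName)
```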

## How was this patch tested?

Not needed.

Closes #25389 from hddong/SPARK-28657-fix-currentContext-Instance-failed.

Lead-authored-by: hongdd <jn_hdd@163.com>
Co-authored-by: hongdongdong <hongdongdong@cmss.chinamobile.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-09 18:02:52 -05:00
Sean Owen 6378d4bc06 [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3
### What changes were proposed in this pull request?

- Remove SQLContext.createExternalTable and Catalog.createExternalTable, deprecated in favor of createTable since 2.2.0, plus tests of deprecated methods
- Remove HiveContext, deprecated in 2.0.0, in favor of `SparkSession.builder.enableHiveSupport`
- Remove deprecated KinesisUtils.createStream methods, plus tests of deprecated methods, deprecated in 2.2.0
- Remove deprecated MLlib (not Spark ML) linear method support, mostly utility constructors and 'train' methods, and associated docs. This includes methods in LinearRegression, LogisticRegression, Lasso, RidgeRegression. These have been deprecated since 2.0.0
- Remove deprecated Pyspark MLlib linear method support, including LogisticRegressionWithSGD, LinearRegressionWithSGD, LassoWithSGD
- Remove 'runs' argument in KMeans.train() method, which has been a no-op since 2.0.0
- Remove deprecated ChiSqSelector isSorted protected method
- Remove deprecated 'yarn-cluster' and 'yarn-client' master argument in favor of 'yarn' and deploy mode 'cluster', etc

Notes:

- I was not able to remove deprecated DataFrameReader.json(RDD) in favor of DataFrameReader.json(Dataset); the former was deprecated in 2.2.0, but, it is still needed to support Pyspark's .json() method, which can't use a Dataset.
- Looks like SQLContext.createExternalTable was not actually deprecated in Pyspark, but, almost certainly was meant to be? Catalog.createExternalTable was.
- I afterwards noted that the toDegrees, toRadians functions were almost removed fully in SPARK-25908, but Felix suggested keeping just the R version as they hadn't been technically deprecated. I'd like to revisit that. Do we really want the inconsistency? I'm not against reverting it again, but then that implies leaving SQLContext.createExternalTable just in Pyspark too, which seems weird.
- I *kept* LogisticRegressionWithSGD, LinearRegressionWithSGD, LassoWithSGD, RidgeRegressionWithSGD in Pyspark, though deprecated, as it is hard to remove them (still used by StreamingLogisticRegressionWithSGD?) and they are not fully removed in Scala. Maybe should not have been deprecated.

### Why are the changes needed?

Deprecated items are easiest to remove in a major release, so we should do so as much as possible for Spark 3. This does not target items deprecated 'recently' as of Spark 2.3, which is still 18 months old.

### Does this PR introduce any user-facing change?

Yes, in that deprecated items are removed from some public APIs.

### How was this patch tested?

Existing tests.

Closes #25684 from srowen/SPARK-28980.

Lead-authored-by: Sean Owen <sean.owen@databricks.com>
Co-authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-09 10:19:40 -05:00
colinma dadb72028a [SPARK-28340][CORE] Noisy exceptions when tasks are killed: "DiskBloc…
### What changes were proposed in this pull request?

If a Spark task is killed due to an intentional job kill, automated killing of redundant speculative tasks, etc., a ClosedByInterruptException occurs if the task has an unfinished I/O operation on an AbstractInterruptibleChannel. A single cancelled task can result in hundreds of ClosedByInterruptException stack traces being logged.
In this PR, the stack trace of ClosedByInterruptException won't be logged, mirroring what Executor.run does for InterruptedException.

### Why are the changes needed?

Large numbers of spurious exceptions are confusing to users when they are inspecting Spark logs to diagnose other issues.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

N/A

Closes #25674 from colinmjj/spark-28340.

Authored-by: colinma <colinma@tencent.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-09 05:27:53 -05:00
shivusondur cb488ecf41 [SPARK-28942][WEBUI] Spark in local mode hostname display localhost in the Host Column of Task Summary Page
### What changes were proposed in this pull request?
In spark-shell local mode, the host name in the task page appears as localhost.
This PR changes it to show the machine IP, as shown for "spark.driver.host" on the Environment page.

### Why are the changes needed?
To show the proper IP in the task page host column

### Does this PR introduce any user-facing change?
It updates the SPARK UI->Task page->Host Column

### How was this patch tested?
Verified in the Spark UI.

![image](https://user-images.githubusercontent.com/7912929/64079045-253d9e00-cd00-11e9-8092-26caec4e21dc.png)

Closes #25645 from shivusondur/localhost1.

Authored-by: shivusondur <shivusondur@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-08 19:45:19 -05:00
Jungtaek Lim (HeartSaVioR) 905b7f7fc7 [SPARK-28967][CORE] Include cloned version of "properties" to avoid ConcurrentModificationException
### What changes were proposed in this pull request?

This patch fixes the bug which throws a ConcurrentModificationException when a job with 0 partitions is submitted via the DAGScheduler.
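
A hedged sketch of the idea behind the fix (illustrative helper, not the exact patch): pass a cloned copy of the job properties to the listener event so that concurrent mutation of the originals cannot race the event-log serialization.

```
import java.util.Properties

def cloneProperties(props: Properties): Properties = {
  if (props == null) {
    null
  } else {
    val cloned = new Properties()
    props.stringPropertyNames().forEach(k => cloned.setProperty(k, props.getProperty(k)))
    cloned
  }
}

// e.g. post(SparkListenerJobStart(jobId, time, stageInfos, cloneProperties(properties)))
```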

### Why are the changes needed?

Without this patch, a structured streaming query throws a ConcurrentModificationException, as in the stack trace below:

```
19/09/04 09:48:49 ERROR AsyncEventQueue: Listener EventLoggingListener threw an exception
java.util.ConcurrentModificationException
	at java.util.Hashtable$Enumerator.next(Hashtable.java:1387)
	at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$6.next(Wrappers.scala:424)
	at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$6.next(Wrappers.scala:420)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at scala.collection.TraversableLike.map(TraversableLike.scala:237)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
	at scala.collection.AbstractTraversable.map(Traversable.scala:108)
	at org.apache.spark.util.JsonProtocol$.mapToJson(JsonProtocol.scala:514)
	at org.apache.spark.util.JsonProtocol$.$anonfun$propertiesToJson$1(JsonProtocol.scala:520)
	at scala.Option.map(Option.scala:163)
	at org.apache.spark.util.JsonProtocol$.propertiesToJson(JsonProtocol.scala:519)
	at org.apache.spark.util.JsonProtocol$.jobStartToJson(JsonProtocol.scala:155)
	at org.apache.spark.util.JsonProtocol$.sparkEventToJson(JsonProtocol.scala:79)
	at org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:149)
	at org.apache.spark.scheduler.EventLoggingListener.onJobStart(EventLoggingListener.scala:217)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:37)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:37)
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:99)
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:84)
	at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:102)
	at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:102)
	at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:97)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:93)
	at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1319)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:93)
```

Please refer to https://issues.apache.org/jira/browse/SPARK-28967 for a detailed reproducer.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Newly added UT. Also manually tested via running simple structured streaming query in spark-shell.

Closes #25672 from HeartSaVioR/SPARK-28967.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-06 09:06:39 -05:00
Wing Yew Poon 151b954e52 [SPARK-28770][CORE][TEST] Fix ReplayListenerSuite tests that sometimes fail
### What changes were proposed in this pull request?

`ReplayListenerSuite` depends on a listener class to listen for replayed events. This class was implemented by extending `EventLoggingListener`. `EventLoggingListener` does not log executor metrics update events, but uses them to update internal state; on a stage completion event, it then logs stage executor metrics events using this internal state. As executor metrics update events do not get written to the event log, they do not get replayed. The internal state of the replay listener can therefore be different from the original listener, leading to different stage completion events being logged.

We reimplement the replay listener to simply buffer each and every event it receives. This makes it a simpler yet better tool for verifying the events that get sent through the ReplayListenerBus.
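
A hedged sketch of the buffering approach (the class name is illustrative; `SparkFirehoseListener` routes every event type through a single callback):

```
import scala.collection.mutable.ArrayBuffer
import org.apache.spark.SparkFirehoseListener
import org.apache.spark.scheduler.SparkListenerEvent

class EventBufferingListener extends SparkFirehoseListener {
  val events = new ArrayBuffer[SparkListenerEvent]()
  override def onEvent(event: SparkListenerEvent): Unit = synchronized {
    events += event
  }
}

// The test can then compare the original event sequence with the buffered
// replayed events directly, with no internal state to diverge.
```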

### Why are the changes needed?

As explained above. Tests sometimes fail due to events being received by the `EventLoggingListener` that do not get logged (and thus do not get replayed) but influence other events that get logged.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing unit tests.

Closes #25673 from wypoon/SPARK-28770.

Authored-by: Wing Yew Poon <wypoon@cloudera.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
2019-09-05 15:55:22 -05:00
Shixiong Zhu 84a4d3a17c
[SPARK-28976][CORE] Use KeyLock to simplify MapOutputTracker.getStatuses
### What changes were proposed in this pull request?

Use `KeyLock` added in #25612 to simplify `MapOutputTracker.getStatuses`. It also has some improvement after the refactoring:
- `InterruptedException` is no longer swallowed.
- When a shuffle block is fetched, we don't need to wake up unrelated sleeping threads.

### Why are the changes needed?

`MapOutputTracker.getStatuses` is pretty hard to maintain right now because it has a special lock mechanism which we need to pay attention to whenever updating this method. As we can use `KeyLock` to hide the complexity of locking behind a dedicated lock class, it's better to refactor it to make it easy to understand and maintain.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Existing tests.

Closes #25680 from zsxwing/getStatuses.

Authored-by: Shixiong Zhu <zsxwing@gmail.com>
Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>
2019-09-04 23:20:27 -07:00
Xianjin YE ca71177868 [SPARK-28907][CORE] Review invalid usage of new Configuration()
### What changes were proposed in this pull request?
Replaces some incorrect usage of `new Configuration()` as it will load default configs defined in Hadoop

### Why are the changes needed?
An unexpected config could be accessed instead of the expected one; see SPARK-28203 for an example.
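
A hedged sketch of the guideline (illustrative): derive the Hadoop configuration from the SparkContext so `spark.hadoop.*` settings are honored, instead of `new Configuration()`, which silently loads whatever defaults happen to be on the classpath.

```
import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkContext

def hadoopConfFor(sc: SparkContext): Configuration = sc.hadoopConfiguration

// rather than: val conf = new Configuration()  // loads unrelated Hadoop defaults
```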

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Existing tests.

Closes #25616 from advancedxy/remove_invalid_configuration.

Authored-by: Xianjin YE <advancedxy@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-04 19:52:19 -05:00
maryannxue a7a3935c97 [SPARK-11150][SQL] Dynamic Partition Pruning
### What changes were proposed in this pull request?
This patch implements dynamic partition pruning by adding a dynamic-partition-pruning filter if there is a partitioned table and a filter on the dimension table. The filter is then planned using a heuristic approach:
1. As a broadcast relation if it is a broadcast hash join. The broadcast relation will then be transformed into a reused broadcast exchange by the `ReuseExchange` rule; or
2. As a subquery duplicate if the estimated benefit of partition table scan being saved is greater than the estimated cost of the extra scan of the duplicated subquery; otherwise
3. As a bypassed condition (`true`).
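
A hedged example of the shape of query that benefits (table, column, and partition names are illustrative): the filter on the small dimension table becomes a runtime filter on the fact table's partition column, so unneeded partitions are never scanned.

```
// Assume fact_sales is partitioned by store_id and dim_store is a small, filtered table.
spark.sql("""
  SELECT f.amount, d.region
  FROM fact_sales f
  JOIN dim_store d ON f.store_id = d.store_id
  WHERE d.region = 'EMEA'
""").show()
```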

### Why are the changes needed?
This is an important performance feature.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
Added UT
- Testing DPP by enabling / disabling the reuse broadcast results feature and / or the subquery duplication feature.
- Testing DPP with reused broadcast results.
- Testing the key iterators on different HashedRelation types.
- Testing the packing and unpacking of the broadcast keys in a LongType.

Closes #25600 from maryannxue/dpp.

Authored-by: maryannxue <maryannxue@apache.org>
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
2019-09-04 13:13:23 -07:00
Jungtaek Lim (HeartSaVioR) 712874fa09 [SPARK-28931][CORE][TESTS] Fix couple of bugs in FsHistoryProviderSuite
### What changes were proposed in this pull request?

This patch fixes the bugs in test code itself, FsHistoryProviderSuite.

1. When creating a log file via `newLogFile`, the codec is ignored, leading to a wrong file name. (No one tends to write tests for test code, and the bug doesn't affect existing tests, so it was not easy to catch.)
2. When writing events to a log file via `writeFile`, metadata (in the case of the new format) gets written to the file regardless of its codec, and the content is overwritten by another stream, so no Spark version information is available. This affects an existing test, which used a wrong expected value to work around the bug.

This patch also removes the redundant parameter `isNewFormat` in `writeFile` since, per the review comments, Spark no longer supports the old format.

### Why are the changes needed?

The section above explains why these are bugs, though they only reside in test code. (Note that the bugs did not come from the non-test side of the code.)

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Modified existing UTs, and also read the event log file in a console to confirm that the metadata is not overwritten by other content.

Closes #25629 from HeartSaVioR/MINOR-FIX-FsHistoryProviderSuite.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-09-04 09:41:15 -07:00
Thomas Graves 4c8f114783 [SPARK-27489][WEBUI] UI updates to show executor resource information
### What changes were proposed in this pull request?
We are adding support for other resource types to executors and Spark, so we should show the resource information for each executor on the UI Executors page.
This also adds a toggle button to show the resources column; it is off by default.

![executorui1](https://user-images.githubusercontent.com/4563792/63891432-c815b580-c9aa-11e9-9f41-62975649efbc.png)

![Screenshot from 2019-08-28 14-56-26](https://user-images.githubusercontent.com/4563792/63891516-fd220800-c9aa-11e9-9fe4-89fcdca37306.png)

### Why are the changes needed?
To show users what resources the executors have, such as GPUs, FPGAs, etc.

### Does this PR introduce any user-facing change?
Yes, it introduces UI and REST API changes to show the resources.

### How was this patch tested?
Unit tests and manual UI tests on yarn and standalone modes.

Closes #25613 from tgravescs/SPARK-27489-gpu-ui-latest.

Authored-by: Thomas Graves <tgraves@nvidia.com>
Signed-off-by: Gengliang Wang <gengliang.wang@databricks.com>
2019-09-04 09:45:44 +08:00
Shixiong Zhu 89800931aa
[SPARK-3137][CORE] Replace the global TorrentBroadcast lock with fine grained KeyLock
### What changes were proposed in this pull request?

This PR provides a new lock mechanism, `KeyLock`, which locks on a given key. It also uses this new lock in `TorrentBroadcast` to avoid blocking tasks that are fetching different broadcast values.

### Why are the changes needed?

`TorrentBroadcast.readObject` uses a global lock so only one task can be fetching the blocks at the same time. This is not optimal if we are running multiple stages concurrently because they should be able to independently fetch their own blocks.
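
A hedged sketch of a per-key lock in the spirit of `KeyLock` (illustrative and simpler than Spark's implementation; it never evicts lock objects): callers with different keys proceed in parallel, while callers with the same key serialize.

```
import java.util.concurrent.ConcurrentHashMap

class SimpleKeyLock[K] {
  private val locks = new ConcurrentHashMap[K, AnyRef]()

  def withLock[T](key: K)(body: => T): T = {
    val lock = locks.computeIfAbsent(key, (_: K) => new Object)
    lock.synchronized { body }
  }
}

// e.g. keyLock.withLock(broadcastId) { readBlocks(broadcastId) }   // illustrative usage
```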

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes #25612 from zsxwing/SPARK-3137.

Authored-by: Shixiong Zhu <zsxwing@gmail.com>
Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>
2019-09-03 14:09:07 -07:00
Sean Owen eb037a8180 [SPARK-28855][CORE][ML][SQL][STREAMING] Remove outdated usages of Experimental, Evolving annotations
### What changes were proposed in this pull request?

The Experimental and Evolving annotations are both (like Unstable) used to express that an API may change. However there are many things in the code that have been marked that way since even Spark 1.x. Per the dev thread, anything introduced at or before Spark 2.3.0 is pretty much 'stable' in that it would not change without a deprecation cycle. Therefore I'd like to remove most of these annotations. And, remove the `:: Experimental ::` scaladoc tag too. And likewise for Python, R.

The changes below can be summarized as:
- Generally, anything introduced at or before Spark 2.3.0 has been unmarked as neither Evolving nor Experimental
- Obviously experimental items like DSv2, Barrier mode, ExperimentalMethods are untouched
- I _did_ unmark a few MLlib classes introduced in 2.4, as I am quite confident they're not going to change (e.g. KolmogorovSmirnovTest, PowerIterationClustering)

It's a big change to review, so I'd suggest scanning the list of _files_ changed to see if any area seems like it should remain partly experimental and examine those.

### Why are the changes needed?

Many of these annotations are incorrect; the APIs are de facto stable. Leaving them also makes legitimate usages of the annotations less meaningful.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Existing tests.

Closes #25558 from srowen/SPARK-28855.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-09-01 10:15:00 -05:00
mcheah ea90ea6ce7 [SPARK-28571][CORE][SHUFFLE] Use the shuffle writer plugin for the SortShuffleWriter
## What changes were proposed in this pull request?

Use the shuffle writer APIs introduced in SPARK-28209 in the sort shuffle writer.

## How was this patch tested?

Existing unit tests were changed to use the plugin instead, and they used the local disk version to ensure that there were no regressions.

Closes #25342 from mccheah/shuffle-writer-refactor-sort-shuffle-writer.

Lead-authored-by: mcheah <mcheah@palantir.com>
Co-authored-by: mccheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-08-30 09:43:07 -07:00
Ryan Blue 31b59bd805 [SPARK-28843][PYTHON] Set OMP_NUM_THREADS to executor cores for python if not set
### What changes were proposed in this pull request?

When starting python processes, set `OMP_NUM_THREADS` to the number of cores allocated to an executor or driver if `OMP_NUM_THREADS` is not already set. Each python process will use the same `OMP_NUM_THREADS` setting, even if workers are not shared.

This avoids creating an OpenMP thread pool for parallel processing with a number of threads equal to the number of cores on the executor and [significantly reduces memory consumption](https://github.com/numpy/numpy/issues/10455). Instead, this threadpool should use the number of cores allocated to the executor, if available. If a setting for number of cores is not available, this doesn't change any behavior. OpenMP is used by numpy and pandas.
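
A hedged sketch of the idea (the helper and the config lookup are illustrative, not the actual patch): when building the Python worker's environment, default OMP_NUM_THREADS to the executor's core count unless the user set it explicitly.

```
import scala.collection.mutable

def withDefaultOmpThreads(env: mutable.Map[String, String], executorCores: Option[String]): Unit = {
  if (!env.contains("OMP_NUM_THREADS")) {
    executorCores.foreach(cores => env("OMP_NUM_THREADS") = cores)
  }
}

// e.g. withDefaultOmpThreads(workerEnv, conf.getOption("spark.executor.cores"))
```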

### Why are the changes needed?

To reduce memory consumption for PySpark jobs.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Validated this reduces python worker memory consumption by more than 1GB on our cluster.

Closes #25545 from rdblue/SPARK-28843-set-omp-num-cores.

Authored-by: Ryan Blue <blue@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-08-30 10:29:46 +09:00
wuyi 70f4bbccc5 [SPARK-28414][WEBUI] UI updates to show resource info in Standalone
## What changes were proposed in this pull request?

Since SPARK-27371 has supported GPU-aware resource scheduling in Standalone, this PR adds resources info in Standalone UI.

## How was this patch tested?

Updated `JsonProtocolSuite` and tested manually.

Master page:

![masterpage](https://user-images.githubusercontent.com/16397174/62835958-b933c100-bc90-11e9-814f-22bae048303d.png)

Worker page

![workerpage](https://user-images.githubusercontent.com/16397174/63417947-d2790200-c434-11e9-8979-36b8f558afd3.png)

Application page

![applicationpage](https://user-images.githubusercontent.com/16397174/62835964-cbadfa80-bc90-11e9-99a2-26e05421619a.png)

Closes #25409 from Ngone51/SPARK-28414.

Authored-by: wuyi <ngone_5451@163.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>
2019-08-27 08:59:29 -05:00
mcheah 2efa6f5dd3 [SPARK-28607][CORE][SHUFFLE] Don't store partition lengths twice
## What changes were proposed in this pull request?

The shuffle writer API introduced in SPARK-28209 has a flaw that leads to a memory usage regression - we ended up tracking the partition lengths in two places. Here, we modify the API slightly to avoid redundant tracking. The implementation of the shuffle writer plugin is now responsible for tracking the lengths of partitions, and propagating this back up to the higher shuffle writer as part of the commitAllPartitions API.

## How was this patch tested?

Existing unit tests.

Closes #25341 from mccheah/dont-redundantly-store-part-lengths.

Authored-by: mcheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-08-26 10:39:29 -07:00
Nikita Gorbachevsky 9f8c7a2804 [SPARK-28709][DSTREAMS] Fix StreamingContext leak through Streaming
## What changes were proposed in this pull request?

In my application, Spark Streaming is restarted programmatically by stopping the StreamingContext (without stopping the SparkContext) and creating/starting a new one. I use this for automatic detection of Kafka topic/partition changes and automatic failover in case of non-fatal exceptions.

However, I noticed that after multiple restarts the driver fails with an OOM. While investigating a heap dump, I figured out that the StreamingContext object isn't cleared by GC after stopping.

<img width="1901" alt="Screen Shot 2019-08-14 at 12 23 33" src="https://user-images.githubusercontent.com/13151161/63010149-83f4c200-be8e-11e9-9f48-12b6e97839f4.png">

There are several places which hold a reference to it:

1. StreamingTab registers StreamingJobProgressListener, which holds a reference to the StreamingContext, directly on the LiveListenerBus shared queue via the ssc.sc.addSparkListener(listener) method invocation. However, this listener isn't unregistered in the stop method.
2. The JSON handlers (/streaming/json and /streaming/batch/json) aren't unregistered in SparkUI while they hold a reference to StreamingJobProgressListener. Basically the same issue affects all the pages; I assume renderJsonHandler should be added to the pageToHandlers cache on attachPage invocation so it can also be unregistered on detachPage.
3. SparkUI holds a reference to StreamingJobProgressListener in the corresponding local variable, which isn't cleared after the StreamingContext is stopped.

## How was this patch tested?

Added tests to existing test suites.
After I applied these changes (via reflection) in my app, the OOM on the driver side was gone.

Closes #25439 from choojoyq/SPARK-28709-fix-streaming-context-leak-on-stop.

Authored-by: Nikita Gorbachevsky <nikitag@playtika.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-08-26 09:30:36 -05:00
Anton Kirillov f17f1d01e2 [SPARK-28778][MESOS] Fixed executors advertised address when running in virtual network
### What changes were proposed in this pull request?
Resolves [SPARK-28778: Shuffle jobs fail due to incorrect advertised address when running in a virtual network on Mesos](https://issues.apache.org/jira/browse/SPARK-28778).

This patch fixes a bug which occurs when shuffle jobs are launched by Mesos in a virtual network. Mesos scheduler sets executor `--hostname` parameter to `0.0.0.0` in the case when `spark.mesos.network.name` is provided. This makes executors use `0.0.0.0` as their advertised address and, in the presence of shuffle, executors fail to fetch shuffle blocks from each other using `0.0.0.0` as the origin. When a virtual network is used the hostname or IP address is not known upfront and assigned to a container at its start time so the executor process needs to advertise the correct dynamically assigned address to be reachable by other executors.

Changes:
- added a fallback to `Utils.localHostName()` in Spark Executors when `--hostname` is not provided
- removed setting executor address to `0.0.0.0` from Mesos scheduler
- refactored the code related to building executor command in Mesos scheduler
- added network configuration support to Docker containerizer
- added unit tests

### Why are the changes needed?
The bug described above prevents Mesos users from running any jobs which involve shuffle due to the inability of executors to fetch shuffle blocks because of incorrect advertised address when virtual network is used.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
- added unit test to `MesosCoarseGrainedSchedulerBackendSuite` which verifies the absence of `--hostname` parameter  when `spark.mesos.network.name` is provided and its presence otherwise
- added unit test to `MesosSchedulerBackendUtilSuite` which verifies that `MesosSchedulerBackendUtil.buildContainerInfo` sets network-related properties for Docker containerizer
- unit tests from this repo launched with profiles: `./build/mvn test -Pmesos -Pnetlib-lgpl -Psparkr -Phive -Phive-thriftserver`, build log attached: [mvn.test.log](https://github.com/apache/spark/files/3516891/mvn.test.log)
- integration tests from [DCOS Spark repo](https://github.com/mesosphere/spark-build), more specifically - [test_spark_cni.py](https://github.com/mesosphere/spark-build/blob/master/tests/test_spark_cni.py) which runs a specific [shuffle job](https://github.com/mesosphere/spark-build/blob/master/tests/jobs/scala/src/main/scala/ShuffleApp.scala) and verifies its successful completion, Mesos task network configuration, and IP addresses for both Mesos and Docker containerizers

Closes #25500 from akirillov/DCOS-45840-fix-advertised-ip-in-virtual-networks.

Authored-by: Anton Kirillov <akirillov@mesosophere.io>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-08-23 18:30:05 -07:00
HyukjinKwon d25cbd44ee [SPARK-28839][CORE] Avoids NPE in context cleaner when dynamic allocation and shuffle service are on
### What changes were proposed in this pull request?

This PR proposes to avoid throwing an NPE in the context cleaner when the shuffle service is on - it is a small followup of https://github.com/apache/spark/pull/24817

It seems `shuffleIds` is set to `null` when the service is on. Later, `removeShuffle` tries to remove an element from `shuffleIds`, which leads to the NPE. This PR fixes it by explicitly not sending the event (`ShuffleCleanedEvent`) in this case.

See the code path below:

cbad616d4c/core/src/main/scala/org/apache/spark/SparkContext.scala (L584)

cbad616d4c/core/src/main/scala/org/apache/spark/ContextCleaner.scala (L125)

cbad616d4c/core/src/main/scala/org/apache/spark/ContextCleaner.scala (L190)

cbad616d4c/core/src/main/scala/org/apache/spark/ContextCleaner.scala (L220-L230)

cbad616d4c/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala (L353-L357)

cbad616d4c/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala (L347)

cbad616d4c/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala (L400-L406)

cbad616d4c/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala (L475)

cbad616d4c/core/src/main/scala/org/apache/spark/scheduler/dynalloc/ExecutorMonitor.scala (L427)

### Why are the changes needed?

This is a bug fix.

### Does this PR introduce any user-facing change?

It prevents the exception:

```
19/08/21 06:44:01 ERROR AsyncEventQueue: Listener ExecutorMonitor threw an exception
java.lang.NullPointerException
	at org.apache.spark.scheduler.dynalloc.ExecutorMonitor$Tracker.removeShuffle(ExecutorMonitor.scala:479)
	at org.apache.spark.scheduler.dynalloc.ExecutorMonitor.$anonfun$cleanupShuffle$2(ExecutorMonitor.scala:408)
	at org.apache.spark.scheduler.dynalloc.ExecutorMonitor.$anonfun$cleanupShuffle$2$adapted(ExecutorMonitor.scala:407)
	at scala.collection.Iterator.foreach(Iterator.scala:941)
	at scala.collection.Iterator.foreach$(Iterator.scala:941)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at org.apache.spark.scheduler.dynalloc.ExecutorMonitor.cleanupShuffle(ExecutorMonitor.scala:407)
	at org.apache.spark.scheduler.dynalloc.ExecutorMonitor.onOtherEvent(ExecutorMonitor.sc
```

### How was this patch tested?

A unit test was added.

Closes #25551 from HyukjinKwon/SPARK-28839.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-08-23 12:44:56 -07:00
Kousuke Saruta 33e45ec7b8 [SPARK-28769][CORE] Improve warning message of BarrierExecutionMode when required slots > maximum slots
### What changes were proposed in this pull request?
Improved warning message in Barrier Execution Mode when required slots > maximum slots.
The new message contains information about required slots, maximum slots and how many times retry failed.

### Why are the changes needed?
Providing users with the number of required slots, the maximum slots, and how many retries have failed might help them decide what to do.
For example, continuing to wait for a retry to succeed, or killing the job.

### Does this PR introduce any user-facing change?
Yes.
If `spark.scheduler.barrier.maxConcurrentTasksCheck.maxFailures=3`, we get the following warning message.

Before applying this change:

```
19/08/18 15:18:09 WARN DAGScheduler: The job 2 requires to run a barrier stage that requires more slots than the total number of slots in the cluster currently.
19/08/18 15:18:24 WARN DAGScheduler: The job 2 requires to run a barrier stage that requires more slots than the total number of slots in the cluster currently.
19/08/18 15:18:39 WARN DAGScheduler: The job 2 requires to run a barrier stage that requires more slots than the total number of slots in the cluster currently.
19/08/18 15:18:54 WARN DAGScheduler: The job 2 requires to run a barrier stage that requires more slots than the total number of slots in the cluster currently.
org.apache.spark.scheduler.BarrierJobSlotsNumberCheckFailed: [SPARK-24819]: Barrier execution mode does not allow run a barrier stage that requires more slots than the total number of slots in the cluster currently. Please init a new cluster with more CPU cores or repartition the input RDD(s) to reduce the number of slots required to run this barrier stage.
  at org.apache.spark.scheduler.DAGScheduler.checkBarrierStageWithNumSlots(DAGScheduler.scala:439)
  at org.apache.spark.scheduler.DAGScheduler.createResultStage(DAGScheduler.scala:453)
  at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:983)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2140)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2132)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2121)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:749)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2080)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2120)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2145)
  at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:961)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:366)
  at org.apache.spark.rdd.RDD.collect(RDD.scala:960)
  ... 47 elided
```
After applying this change:

```
19/08/18 16:52:23 WARN DAGScheduler: The job 0 requires to run a barrier stage that requires 3 slots than the total number of slots(2) in the cluster currently.
19/08/18 16:52:38 WARN DAGScheduler: The job 0 requires to run a barrier stage that requires 3 slots than the total number of slots(2) in the cluster currently (Retry 1/3 failed).
19/08/18 16:52:53 WARN DAGScheduler: The job 0 requires to run a barrier stage that requires 3 slots than the total number of slots(2) in the cluster currently (Retry 2/3 failed).
19/08/18 16:53:08 WARN DAGScheduler: The job 0 requires to run a barrier stage that requires 3 slots than the total number of slots(2) in the cluster currently (Retry 3/3 failed).
org.apache.spark.scheduler.BarrierJobSlotsNumberCheckFailed: [SPARK-24819]: Barrier execution mode does not allow run a barrier stage that requires more slots than the total number of slots in the cluster currently. Please init a new cluster with more CPU cores or repartition the input RDD(s) to reduce the number of slots required to run this barrier stage.
  at org.apache.spark.scheduler.DAGScheduler.checkBarrierStageWithNumSlots(DAGScheduler.scala:439)
  at org.apache.spark.scheduler.DAGScheduler.createResultStage(DAGScheduler.scala:453)
  at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:983)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2140)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2132)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2121)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
  at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:749)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2080)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2101)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2120)
  at org.apache.spark.SparkContext.runJob(SparkContext.scala:2145)
  at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:961)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
  at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
  at org.apache.spark.rdd.RDD.withScope(RDD.scala:366)
  at org.apache.spark.rdd.RDD.collect(RDD.scala:960)
  ... 47 elided
```

### How was this patch tested?
I tested manually using the Spark shell with the following configuration and script, and then checked the log messages.

```
$ bin/spark-shell --master local[2] --conf spark.scheduler.barrier.maxConcurrentTasksCheck.maxFailures=3
scala> sc.parallelize(1 to 100, sc.defaultParallelism+1).barrier.mapPartitions(identity(_)).collect
```

Closes #25487 from sarutak/barrier-exec-mode-warning-message.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-08-22 14:06:58 -05:00
WeichenXu 9779a82ea0 [SPARK-28483][CORE][FOLLOW-UP] Dealing with interrupted exception in BarrierTaskContext.barrier()
### What changes were proposed in this pull request?
Dealing with interrupted exception in BarrierTaskContext.barrier()

### Why are the changes needed?
An InterruptedException can happen when the SparkContext local property "spark.job.interruptOnCancel" is set to true.

### Does this PR introduce any user-facing change?
No.

### How was this patch tested?
UT.

Closes #25519 from WeichenXu123/barrier_fl.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-08-21 19:51:45 +08:00
Dhruve Ashar a50959a7f6 [SPARK-27937][CORE] Revert partial logic for auto namespace discovery
## What changes were proposed in this pull request?
This change reverts the logic which was introduced as a part of SPARK-24149 and a subsequent followup PR.

With existing logic:
- Spark fails to launch with HDFS federation enabled while trying to get a path to a logical nameservice.
- It gets tokens for unrelated namespaces if they are used in HDFS Federation
- Automatic namespace discovery is supported only if these are on the same cluster.

Rationale for change:
- For accessing data from related namespaces, viewfs should handle getting tokens for spark
- For accessing data from unrelated namespaces, the user explicitly specifies them using existing configs, as these could be on the same or a different cluster.

Revert the changes.

## How was this patch tested?
Ran a few manual tests and a unit test.

Closes #24785 from dhruve/bug/SPARK-27937.

Authored-by: Dhruve Ashar <dhruveashar@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-08-20 12:42:35 -07:00
WeichenXu bc75ed675b [SPARK-28483][CORE] Fix canceling a spark job using barrier mode but barrier tasks blocking on BarrierTaskContext.barrier()
## What changes were proposed in this pull request?

Fix canceling a Spark job using barrier mode where the barrier tasks do not exit.
Currently, when Spark tasks are killed, `BarrierTaskContext.barrier()` cannot be killed (it blocks on an RPC request), causing the task to block and never exit.

In my PR I implement an RPC interface which supports `abort` in the class `RpcEndpointRef`:
```
  def askAbortable[T: ClassTag](
      message: Any,
      timeout: RpcTimeout): AbortableRpcFuture[T]
```

The returned `AbortableRpcFuture` instance includes an `abort` method so that we can abort the RPC before it times out.

## How was this patch tested?

Unit test added.

Manually test:

### Test code
launch spark-shell via `spark-shell --master local[4]`
and run following code:
```
sc.setLogLevel("INFO")
import org.apache.spark.BarrierTaskContext
val n = 4
def taskf(iter: Iterator[Int]): Iterator[Int] = {
  val context = BarrierTaskContext.get()
  val x = iter.next()
  if (x % 2 == 0) {
    // sleep 6000000 seconds with task killed checking
    for (i <- 0 until 6000000) {
      Thread.sleep(1000)
      if (context.isInterrupted()) {
        throw new org.apache.spark.TaskKilledException()
      }
    }
  }
  context.barrier()
  return Iterator.empty
}

// launch spark job, including 4 tasks, tasks 1/3 will enter `barrier()`, and tasks 0/2 will enter `sleep`
sc.parallelize((0 to n), n).barrier().mapPartitions(taskf).collect()
```
And then press Ctrl+C to exit the running job.

### Before
Press Ctrl+C to exit the running job, then open the Spark UI; we can see 2 tasks (tasks 1 and 3) are not killed. They are blocked.

### After
Press Ctrl+C to exit the running job; we can see in the Spark UI that all tasks are killed successfully.

Closes #25235 from WeichenXu123/sc_14848.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-08-20 14:21:47 +08:00
Yuanjian Li 0d3a783cc5 [SPARK-28699][CORE] Fix a corner case for aborting indeterminate stage
### What changes were proposed in this pull request?
Change the logic of collecting indeterminate stages: we should look at stages from mapStage, not failedStage, while handling FetchFailed.

### Why are the changes needed?
In the fetch-failed error handling logic, indeterminate stages were originally collected from the fetch-failed stage. In the scenario where the fetch failure happens in the first task of that stage, this logic causes the indeterminate stage to be resubmitted only partially. Eventually, this can produce a correctness bug.

### Does this PR introduce any user-facing change?
It makes the corner case of indeterminate stage abort as expected.

### How was this patch tested?
New UT in DAGSchedulerSuite.
Run the integration test below with `local-cluster[5, 2, 5120]` and set `spark.sql.execution.sortBeforeRepartition=false`; it aborts the indeterminate stage as expected:
```
import scala.sys.process._
import org.apache.spark.TaskContext

val res = spark.range(0, 10000 * 10000, 1).map{ x => (x % 1000, x)}
// kill an executor in the stage that performs repartition(239)
val df = res.repartition(113).map{ x => (x._1 + 1, x._2)}.repartition(239).map { x =>
  if (TaskContext.get.attemptNumber == 0 && TaskContext.get.partitionId < 1 && TaskContext.get.stageAttemptNumber == 0) {
    throw new Exception("pkill -f -n java".!!)
  }
  x
}
val r2 = df.distinct.count()
```

Closes #25498 from xuanyuanking/SPARK-28699-followup.

Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-08-20 13:47:59 +08:00
angerszhu 036fd3903f [SPARK-27637][SHUFFLE][FOLLOW-UP] For nettyBlockTransferService, if IOException occurred while create client, check whether relative executor is alive before retry #24533
### What changes were proposed in this pull request?

PR [#24533](https://github.com/apache/spark/pull/24533/files) prevents retrying against a removed executor.
In my test, I can't catch exceptions from
` new OneForOneBlockFetcher(client, appId, execId, blockIds, listener,
              transportConf, tempFileManager).start()`
Checking the code carefully, the **start()** method handles IOException in its retry logic and won't throw it out until it reaches maxRetry attempts or meets an exception that is not an IOException.

If the executor is dead when we fetch a block, then when we rerun
`RetryingBlockFetcher.BlockFetchStarter.createAndStart()`
we may fail to create a transport client to the dead executor, which throws an IOException.
We should catch this IOException.

### Why are the changes needed?
The old solution was not comprehensive and didn't cover this case.

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Existing unit tests.

Closes #25469 from AngersZhuuuu/SPARK-27637-FLLOW-UP.

Authored-by: angerszhu <angers.zhu@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-08-16 23:24:32 +08:00
Steve Loughran 2ac6163a5d [SPARK-23977][SQL] Support High Performance S3A committers [test-hadoop3.2]
This patch adds the binding classes to enable Spark to switch DataFrame output to the S3A zero-rename committers shipping in Hadoop 3.1+. It adds a source tree to the hadoop-cloud-storage module which only compiles with the hadoop-3.2 profile, and contains a binding for normal output and a specific bridge class for Parquet (as the Parquet output format requires a subclass of `ParquetOutputCommitter`).
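
A hedged example of the kind of settings used to route DataFrame output through the S3A committers once the hadoop-cloud module is on the classpath (property and class names follow the Spark/Hadoop cloud-integration docs, so verify them against your versions; the bucket path is illustrative):

```
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("s3a-committer-example")
  .config("spark.hadoop.fs.s3a.committer.name", "directory")
  .config("spark.sql.sources.commitProtocolClass",
    "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol")
  .config("spark.sql.parquet.output.committer.class",
    "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter")
  .getOrCreate()

spark.range(1000).write.mode("overwrite").parquet("s3a://my-bucket/example-output/")
```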

Commit algorithms are a critical topic. There's no formal proof of correctness, but the algorithms are documented and analysed in [A Zero Rename Committer](https://github.com/steveloughran/zero-rename-committer/releases). This also reviews the classic v1 and v2 algorithms, IBM's swift committer and the one from EMRFS which they admit was based on the concepts implemented here.

Test-wise

* There's a public set of scala test suites [on github](https://github.com/hortonworks-spark/cloud-integration)
* We have run integration tests against Spark on Yarn clusters.
* This code has been shipping for ~12 months in HDP-3.x.

Closes #24970 from steveloughran/cloud/SPARK-23977-s3a-committer.

Authored-by: Steve Loughran <stevel@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-08-15 09:39:26 -07:00
Xianjin YE 3ec24fd128 [SPARK-28203][CORE][PYTHON] PythonRDD should respect SparkContext's hadoop configuration
## What changes were proposed in this pull request?
1. PythonHadoopUtil.mapToConf generates a Configuration with loadDefaults disabled
2. Merging of the Hadoop conf in several places of PythonRDD is made consistent.

## How was this patch tested?
Added a new test, and ran existing tests.

Closes #25002 from advancedxy/SPARK-28203.

Authored-by: Xianjin YE <advancedxy@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-08-15 10:39:33 +09:00
Marcelo Vanzin 0343854f54 [SPARK-28487][K8S] More responsive dynamic allocation with K8S
This change implements a few changes to the k8s pod allocator so
that it behaves a little better when dynamic allocation is on.

(i) Allow the application to ramp up immediately when there's a
change in the target number of executors. Without this change,
scaling would only trigger when a change happened in the state of
the cluster, e.g. an executor going down, or when the periodical
snapshot was taken (default every 30s).

(ii) Get rid of pending pod requests, both acknowledged (i.e. Spark
knows that a pod is pending resource allocation) and unacknowledged
(i.e. Spark has requested the pod but the API server hasn't created it
yet), when they're not needed anymore. This avoids starting those
executors to just remove them after the idle timeout, wasting resources
in the meantime.

(iii) Re-work some of the code to avoid unnecessary logging. While not
bad without dynamic allocation, the existing logging was very chatty
when dynamic allocation was on. With the changes, all the useful
information is still there, but only when interesting changes happen.

(iv) Gracefully shut down executors when they become idle. Just deleting
the pod causes a lot of ugly logs to show up, so it's better to ask pods
to exit nicely. That also allows Spark to respect the "don't delete
pods" option when dynamic allocation is on.

Tested on a small k8s cluster running different TPC-DS workloads.

Closes #25236 from vanzin/SPARK-28487.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-08-13 17:29:54 -07:00
Kousuke Saruta 247bebcf94 [SPARK-28561][WEBUI] DAG viz for barrier-execution mode
## What changes were proposed in this pull request?

In the current UI, we cannot identify which RDDs are barrier RDDs. Visualizing them makes debugging easier.
The following images are shown after this change.

![Screenshot from 2019-07-30 16-30-35](https://user-images.githubusercontent.com/4736016/62110508-83cec100-b2e9-11e9-83b9-bc2e485a4cbe.png)
![Screenshot from 2019-07-30 16-31-09](https://user-images.githubusercontent.com/4736016/62110509-83cec100-b2e9-11e9-9e2e-47c4dae23a52.png)

The boxes in pale green denote barrier RDDs (we might need to discuss which color is most appropriate).

## How was this patch tested?

Tested manually.
The images above are shown by following operations.

```
val rdd1 = sc.parallelize(1 to 10)
val rdd2 = sc.parallelize(1 to 10)
val rdd3 = rdd1.zip(rdd2).barrier.mapPartitions(identity(_))
val rdd4 = rdd3.map(identity(_))
val rdd5 = rdd4.reduceByKey(_+_)
rdd5.collect
```

Closes #25296 from sarutak/barrierexec-dagviz.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
2019-08-12 22:38:10 -07:00
Kousuke Saruta 25857c6559 [SPARK-28647][WEBUI] Recover additional metric feature and remove additional-metrics.js
## What changes were proposed in this pull request?

SPARK-17019 introduced `On Heap Memory` and `Off Heap Memory` as optional metrics.
But they are not displayed because they are set to `display: none` in CSS and there is no way to make them appear.

I know #22595 also tries to resolve this issue, but it would use `additional-metrics.js`.
Initially, `additional-metrics.js` was created for `StagePage`, but `StagePage` currently uses `stagepage.js` to toggle its additional metrics because `DataTable` (a jQuery plugin) was introduced and we needed another mechanism to add/remove columns for additional metrics.

Now that `ExecutorsPage` also uses `DataTable`, it might be better to introduce the same mechanism as `StagePage` for additional metrics.

![Screenshot from 2019-08-10 05-37-25](https://user-images.githubusercontent.com/4736016/62807960-c4240f80-bb31-11e9-8e1a-1a44e2f91597.png)

We can then remove `additional-metrics.js`, which is no longer used anywhere.

## How was this patch tested?

After this change is applied, I confirmed `ExecutorsPage` and `StagePage` are properly rendered and all checkboxes for additional metrics work.

Closes #25374 from sarutak/remove-additional-metrics.js.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-08-12 17:02:28 -07:00
Gengliang Wang 48d04f74ca [SPARK-28638][WEBUI] Task summary should only contain successful tasks' metrics
## What changes were proposed in this pull request?

Currently, on requesting summary metrics, cached data are returned if the current number of "SUCCESS" tasks is the same as the value in cached data.
However, the number of "SUCCESS" tasks is wrong when there are running tasks. In `AppStatusStore`, the KVStore is `ElementTrackingStore`, instead of `InMemoryStore`. The value count is always the number of "SUCCESS" tasks + "RUNNING" tasks.
Thus, even when the running tasks have finished, out-of-date cached data is returned.

This PR is to fix the code in getting the number of "SUCCESS" tasks.

## How was this patch tested?

Test manually, run
```
sc.parallelize(1 to 160, 40).map(i => Thread.sleep(i*100)).collect()
```
and keep refreshing the stage page; we can see that the task summary metrics are wrong.

### Before fix:
![image](https://user-images.githubusercontent.com/1097932/62560343-6a141780-b8af-11e9-8942-d88540659a93.png)

### After fix:
![image](https://user-images.githubusercontent.com/1097932/62560355-7009f880-b8af-11e9-8ba8-10c083a48d7b.png)

Closes #25369 from gengliangwang/fixStagePage.

Authored-by: Gengliang Wang <gengliang.wang@databricks.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-08-12 11:47:29 -07:00
WeichenXu 0f2efe6825 [SPARK-28366][CORE][FOLLOW-UP] Refine logging in driver when loading single large unsplittable file
## What changes were proposed in this pull request?

* Add log in `NewHadoopRDD`
* Remove some wording in logs that referred to a specific user API.

## How was this patch tested?

Manual.

Closes #25391 from WeichenXu123/log_sf.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-08-12 19:15:00 +08:00
Kousuke Saruta 31ef268bae [SPARK-28639][CORE][DOC] Configuration doc for Barrier Execution Mode
## What changes were proposed in this pull request?

SPARK-24817 and SPARK-24819 introduced 3 new non-internal properties for barrier-execution mode, but they are not documented.
So I've added a section for barrier-execution mode to configuration.md.
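
For reference, a minimal sketch of setting these properties from application code; the property names and values below are my recollection of what SPARK-24817/SPARK-24819 added and should be treated as assumptions, with configuration.md as the authoritative source.

```scala
import org.apache.spark.SparkConf

// Sketch only; verify property names and defaults against configuration.md.
val conf = new SparkConf()
  .set("spark.barrier.sync.timeout", "365d")                                // how long barrier() waits before giving up
  .set("spark.scheduler.barrier.maxConcurrentTasksCheck.interval", "15s")   // how often to re-check that enough slots exist
  .set("spark.scheduler.barrier.maxConcurrentTasksCheck.maxFailures", "40") // failed checks allowed before aborting the job
```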

## How was this patch tested?
Built the docs using jekyll and confirmed the layout in a browser.

Closes #25370 from sarutak/barrier-exec-mode-conf-doc.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-08-11 08:13:19 -05:00
Kousuke Saruta dd5599efaf [SPARK-28677][WEBUI] "Select All" checkbox in StagePage doesn't work properly
## What changes were proposed in this pull request?

In StagePage, only the first optional column (Scheduler Delay, in this case) appears even though the "Select All" checkbox is checked.

![Screenshot from 2019-08-09 18-46-05](https://user-images.githubusercontent.com/4736016/62771600-8f379e80-bad8-11e9-9faa-6da8d57739d2.png)

The cause is that the wrong method is used to manipulate multiple columns: `columns` should have been used, but `column` was used.
I've fixed this issue by replacing `column` with `columns`.

## How was this patch tested?

Confirmed behavior of the check-box.

![Screenshot from 2019-08-09 18-54-33](https://user-images.githubusercontent.com/4736016/62771614-98c10680-bad8-11e9-9cc0-5879ac47d1e1.png)

Closes #25397 from sarutak/fix-stagepage.js.

Authored-by: Kousuke Saruta <sarutak@oss.nttdata.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-08-10 16:51:12 -05:00
younggyu chun 8535df7261 [MINOR] Fix typos in comments and replace an explicit type with <>
## What changes were proposed in this pull request?
This PR fixes typos in comments and replaces explicit types with `<>` for Java 8+.

## How was this patch tested?
Manually tested.

Closes #25338 from younggyuchun/younggyu.

Authored-by: younggyu chun <younggyuchun@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-08-10 16:47:11 -05:00
Ajith ef80c32266 [SPARK-28676][CORE] Avoid Excessive logging from ContextCleaner
## What changes were proposed in this pull request?

In high-workload environments, ContextCleaner produces excessive logging at INFO level that does not give much information. In one particular case we see that the ``INFO ContextCleaner: Cleaned accumulator`` message makes up 25-30% of the generated logs. We can log this cleanup information at DEBUG level instead.
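
A minimal sketch of the intended change (illustrative only, not the exact ContextCleaner diff):

```scala
import org.slf4j.LoggerFactory

// Illustrative sketch: the per-accumulator cleanup message moves from INFO to
// DEBUG so it no longer dominates logs in high-workload environments.
object CleanerLoggingSketch {
  private val log = LoggerFactory.getLogger(getClass)

  def onAccumulatorCleaned(accId: Long): Unit = {
    log.debug(s"Cleaned accumulator $accId") // previously logged at INFO level
  }
}
```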

## How was this patch tested?

This does not modify any functionality; it just changes the ContextCleaner cleanup log level to DEBUG.

Closes #25396 from ajithme/logss.

Authored-by: Ajith <ajith2489@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-08-09 15:49:20 -07:00
wuyi cbad616d4c [SPARK-27371][CORE] Support GPU-aware resources scheduling in Standalone
## What changes were proposed in this pull request?

In this PR, we implement a complete process of GPU-aware resource scheduling in Standalone.
The whole process looks like this: the Worker sets up isolated resources when it starts up and
registers with the master along with its resources. The Master then picks usable workers according
to the driver/executor's resource requirements to launch the driver/executor on them. The Worker
launches the driver/executor after preparing a resources file, created under the driver/executor's
working directory, with the specified resource addresses (assigned by the master). When the
driver/executor finishes, its resources can be recycled back to the worker. Finally, if a worker
stops, it should always release its resources first.

For the case where Workers and Drivers in **client** mode run on the same host, we introduce
a config option named `spark.resources.coordinate.enable` (default true) to indicate
whether Spark should coordinate resources for the user. If `spark.resources.coordinate.enable=false`, the user is responsible for configuring different resources for Workers and Drivers when using a resourcesFile or discovery script. If true, Spark helps the user assign different resources to Workers and Drivers.

The solution for Spark to coordinate resources among Workers and Drivers is:

Generally, use a shared file named *____allocated_resources____.json* to sync allocated
resources info among Workers and Drivers on the same host.

After a Worker or Driver finds all resources using the configured resourcesFile and/or
discovery script during launch, it should determine the available resources by excluding resources already allocated in *____allocated_resources____.json*, and acquire resources from the available set according to its own requirements. After that, it should write its allocated resources along with its process id (pid) into *____allocated_resources____.json*. The pid (proposed by tgravescs) is used to check whether the allocated resources are still valid, in case a Worker or Driver crashes and doesn't release its resources properly. When a Worker or Driver finishes normally, it always cleans up its own allocated resources in *____allocated_resources____.json*.

Note that we always acquire a file lock before any access to *____allocated_resources____.json*
and release the lock afterwards.
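
A hedged sketch of the file-lock pattern described above (names and the update callback are illustrative, not the actual Standalone coordination code):

```scala
import java.io.RandomAccessFile
import java.nio.channels.FileLock

// Illustrative only: serialize access to the shared allocation file with an
// OS-level file lock, run the update, and always release the lock afterwards.
object ResourceCoordinationSketch {
  def withAllocationFileLock[T](path: String)(updateAllocations: RandomAccessFile => T): T = {
    val raf = new RandomAccessFile(path, "rw")
    val lock: FileLock = raf.getChannel.lock() // blocks until the exclusive lock is acquired
    try {
      // Read current allocations, filter out what is taken, append our pid + resources.
      updateAllocations(raf)
    } finally {
      lock.release()
      raf.close()
    }
  }
}
```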

Furthermore, we append resource info to `WorkerSchedulerStateResponse` to work
around master change behaviour in HA mode.

## How was this patch tested?

Added unit tests in WorkerSuite, MasterSuite, SparkContextSuite.

Manually tested with client/cluster mode (e.g. multiple workers) in a single node Standalone.

Closes #25047 from Ngone51/SPARK-27371.

Authored-by: wuyi <ngone_5451@163.com>
Signed-off-by: Thomas Graves <tgraves@apache.org>
2019-08-09 07:49:03 -05:00
Jungtaek Lim (HeartSaVioR) 128ea37bda [SPARK-28601][CORE][SQL] Use StandardCharsets.UTF_8 instead of "UTF-8" string representation, and get rid of UnsupportedEncodingException
## What changes were proposed in this pull request?

This patch keeps things consistent wherever the UTF-8 charset is needed, by using `StandardCharsets.UTF_8` instead of the string "UTF-8". Where a String is needed, `StandardCharsets.UTF_8.name()` is used.

This change also brings the benefit of getting rid of `UnsupportedEncodingException`, as we're providing `Charset` instead of `String` whenever possible.

This also changes some private Catalyst helper methods to operate on encodings as `Charset` objects rather than strings.
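
As a generic illustration (not a specific diff from this patch), the `Charset`-based overloads avoid the checked exception entirely:

```scala
import java.nio.charset.StandardCharsets

// new String(bytes, "UTF-8") declares UnsupportedEncodingException;
// the Charset overload below cannot fail that way.
val bytes = "héllo".getBytes(StandardCharsets.UTF_8)
val text  = new String(bytes, StandardCharsets.UTF_8)

// When an API really wants the charset name as a String:
val charsetName = StandardCharsets.UTF_8.name() // "UTF-8"
```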

## How was this patch tested?

Existing unit tests.

Closes #25335 from HeartSaVioR/SPARK-28601.

Authored-by: Jungtaek Lim (HeartSaVioR) <kabhwan@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-08-05 20:45:54 -07:00
wuyi 94499af6f0 [SPARK-28486][CORE][PYTHON] Map PythonBroadcast's data file to a BroadcastBlock to avoid delete by GC
## What changes were proposed in this pull request?

Currently, PythonBroadcast may delete its data file while a Python worker still needs it. This happens because PythonBroadcast overrides the `finalize()` method to delete its data file. So, when GC happens and there are no references to the broadcast variable, `finalize()` may be triggered to delete the data file. That also means the data backing a Python broadcast variable cannot be deleted when `unpersist()`/`destroy()` is called; instead it relies on GC.

In this PR, we remove the `finalize()` method and map the PythonBroadcast data file to a BroadcastBlock (which has the same broadcast id as the broadcast variable that wraps this PythonBroadcast) when PythonBroadcast is deserialized. As a result, the data file can be deleted just like the other pieces of the broadcast variable when `unpersist()`/`destroy()` is called, and it no longer relies on GC.

## How was this patch tested?

Added a Python test, and tested manually (verified creating/deleting the broadcast block).

Closes #25262 from Ngone51/SPARK-28486.

Authored-by: wuyi <ngone_5451@163.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-08-05 20:18:53 +09:00
Yuanjian Li db39f45baf [SPARK-28593][CORE] Rename ShuffleClient to BlockStoreClient which more close to its usage
## What changes were proposed in this pull request?

After SPARK-27677, the shuffle client not only handles shuffle blocks but is also responsible for locally persisted RDD blocks. For better code scalability and more precise semantics (as in the [discussion](https://github.com/apache/spark/pull/24892#discussion_r300173331)), we made several changes:

- Rename ShuffleClient to BlockStoreClient.
- Correspondingly rename the ExternalShuffleClient to ExternalBlockStoreClient, also change the server-side class from ExternalShuffleBlockHandler to ExternalBlockHandler.
- Move MesosExternalBlockStoreClient to Mesos package.

Note that we keep the name BlockTransferService, because the `Service` contains both client and server, and the name BlockTransferService does not refer to the shuffle client only.

## How was this patch tested?

Existing UT.

Closes #25327 from xuanyuanking/SPARK-28593.

Lead-authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Co-authored-by: Yuanjian Li <yuanjian.li@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-08-05 14:54:45 +08:00
yunzoud c212c9d9ed [SPARK-28574][CORE] Allow to config different sizes for event queues
## What changes were proposed in this pull request?
Add the configuration `spark.scheduler.listenerbus.eventqueue.${name}.capacity` to allow configuring a different capacity per event queue.
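
For example (the queue name and values below are illustrative assumptions), a single queue can be sized independently of the global default:

```scala
import org.apache.spark.SparkConf

// Sketch: enlarge only the app-status queue; other queues keep the global default.
val conf = new SparkConf()
  .set("spark.scheduler.listenerbus.eventqueue.capacity", "10000")           // global default
  .set("spark.scheduler.listenerbus.eventqueue.appStatus.capacity", "30000") // per-queue override
```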

## How was this patch tested?
Unit test in core/src/test/scala/org/apache/spark/scheduler/SparkListenerSuite.scala

Closes #25307 from yunzoud/SPARK-28574.

Authored-by: yunzoud <yun.zou@databricks.com>
Signed-off-by: Shixiong Zhu <zsxwing@gmail.com>
2019-08-02 15:27:33 -07:00
Nick Karpov 6d32deeecc [SPARK-28475][CORE] Add regex MetricFilter to GraphiteSink
## What changes were proposed in this pull request?

Today all registered metric sources are reported to GraphiteSink with no filtering mechanism, although the codahale project does support it.

GraphiteReporter (ScheduledReporter) from the codahale project requires you to implement and supply the MetricFilter interface (there is only a single default implementation in the codahale project, MetricFilter.ALL).

This PR proposes to add an additional regex config to match and filter the metrics reported to the GraphiteSink.
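
For illustration, a regex-based `MetricFilter` against the Codahale API might look like the following sketch (not the exact GraphiteSink code):

```scala
import com.codahale.metrics.{Metric, MetricFilter}
import scala.util.matching.Regex

// Illustrative only: report a metric when its key matches the configured regex.
class RegexMetricFilterSketch(pattern: Regex) extends MetricFilter {
  override def matches(name: String, metric: Metric): Boolean =
    pattern.findFirstIn(name).isDefined
}

// Usage sketch: build the reporter with `new RegexMetricFilterSketch("""executor\..*""".r)`
```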

## How was this patch tested?

Included a GraphiteSinkSuite that tests:

1. Absence of regex filter (existing default behavior maintained)
2. Presence of `regex=<regexexpr>` correctly filters metric keys

Closes #25232 from nkarpov/graphite_regex.

Authored-by: Nick Karpov <nick@nickkarpov.com>
Signed-off-by: jerryshao <jerryshao@tencent.com>
2019-08-02 17:50:15 +08:00
Marcelo Vanzin 607fb87906 [SPARK-28584][CORE] Fix thread safety issue in blacklist timer, tests
There's a small, probably very hard to hit thread-safety issue in the blacklist
abort timers in the task scheduler, where they access a non-thread-safe map without
locks.

In the tests, the code was also calling methods on the TaskSetManager without
holding the proper locks, which could cause threads to call non-thread-safe
TSM methods concurrently.

Closes #25317 from vanzin/SPARK-28584.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-08-01 10:37:47 -07:00
Wing Yew Poon 80ab19b9fd [SPARK-26329][CORE] Faster polling of executor memory metrics.
## What changes were proposed in this pull request?

Prior to this change, in an executor, on each heartbeat, memory metrics are polled and sent in the heartbeat. The heartbeat interval is 10s by default. With this change, in an executor, memory metrics can optionally be polled in a separate poller at a shorter interval.

For each executor, we use a map of (stageId, stageAttemptId) to (count of running tasks, executor metric peaks) to track what stages are active as well as the per-stage memory metric peaks. When polling the executor memory metrics, we attribute the memory to the active stage(s), and update the peaks. In a heartbeat, we send the per-stage peaks (for stages active at that time), and then reset the peaks. The semantics would be that the per-stage peaks sent in each heartbeat are the peaks since the last heartbeat.
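
A minimal sketch of the per-stage peak bookkeeping described above (field names and the flat metric layout are hypothetical):

```scala
import scala.collection.concurrent.TrieMap

// Illustrative only: each poll takes the element-wise max of the stored peaks
// and the freshly sampled values; a heartbeat drains (reads and resets) them.
object StagePeaksSketch {
  type StageKey = (Int, Int) // (stageId, stageAttemptId)
  private val stagePeaks = TrieMap.empty[StageKey, Array[Long]]

  def updatePeaks(stage: StageKey, sampled: Array[Long]): Unit = {
    val peaks = stagePeaks.getOrElseUpdate(stage, Array.fill(sampled.length)(0L))
    var i = 0
    while (i < sampled.length) {
      if (sampled(i) > peaks(i)) peaks(i) = sampled(i)
      i += 1
    }
  }

  def drainForHeartbeat(stage: StageKey): Option[Array[Long]] = stagePeaks.remove(stage)
}
```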

We also keep a map of taskId to memory metric peaks. This tracks the metric peaks during the lifetime of the task, and the polling thread updates it as well. At the end of a task, we send the peak metric values in the task result. In case of task failure, we send the peak metric values in the `TaskFailedReason`.

We continue to do the stage-level aggregation in the EventLoggingListener.

For the driver, we still only poll on heartbeats. What the driver sends will be the current values of the metrics in the driver at the time of the heartbeat. This is semantically the same as before.

## How was this patch tested?

Unit tests. Manually tested applications on an actual system and checked the event logs; the metrics appear in the SparkListenerTaskEnd and SparkListenerStageExecutorMetrics events.

Closes #23767 from wypoon/wypoon_SPARK-26329.

Authored-by: Wing Yew Poon <wypoon@cloudera.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
2019-08-01 09:09:46 -05:00
WeichenXu 26d03b62e2 [SPARK-28366][CORE] Logging in driver when loading single large unsplittable file
## What changes were proposed in this pull request?

Log in the driver when loading a single large unsplittable file via `sc.textFile` or the csv/json datasource.
The current conditions triggering the logging are:
* only one partition is generated
* the file is unsplittable; possible reasons are:
   - compressed by an unsplittable compression codec such as gzip
   - multiLine mode in the csv/json datasource
   - wholeText mode in the text datasource
* the file size exceeds the config threshold `spark.io.warning.largeFileThreshold` (default value is 1GB)

## How was this patch tested?

Manually tested.
Generate one gzip file exceeding 1GB:
```
base64 -b 50 /dev/urandom | head -c 2000000000 > file1.txt
cat file1.txt | gzip > file1.gz
```
then launch spark-shell,

run
```
sc.textFile("file:///path/to/file1.gz").count()
```
Will print log like:
```
WARN HadoopRDD: Loading one large unsplittable file file:/.../f1.gz with only one partition, because the file is compressed by unsplittable compression codec
```

run
```
sc.textFile("file:///path/to/file1.txt").count()
```
Will print log like:
```
WARN HadoopRDD: Loading one large file file:/.../f1.gz with only one partition, we can increase partition numbers by the `minPartitions` argument in method `sc.textFile
```

run
```
spark.read.csv("file:///path/to/file1.gz").count
```
Will print log like:
```
WARN CSVScan: Loading one large unsplittable file file:/.../f1.gz with only one partition, the reason is: the file is compressed by unsplittable compression codec
```

run
```
spark.read.option("multiLine", true).csv("file:///path/to/file1.gz").count
```
Will print log like:
```
WARN CSVScan: Loading one large unsplittable file file:/.../f1.gz with only one partition, the reason is: the csv datasource is set multiLine mode
```

The JSON and text datasources were also tested with similar cases.

Closes #25134 from WeichenXu123/log_gz.

Authored-by: WeichenXu <weichen.xu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-08-01 20:29:18 +08:00
Marcelo Vanzin b3ffd8be14 [SPARK-24352][CORE][TESTS] De-flake StandaloneDynamicAllocationSuite blacklist test
The issue is that the test tried to stop an existing scheduler and replace it with
a new one set up for the test. That can cause issues because both were sharing the
same RpcEnv underneath, and unregistering RpcEndpoints is actually asynchronous
(see comment in Dispatcher.unregisterRpcEndpoint). So that could lead to races where
the new scheduler tried to register before the old one was fully unregistered.

The updated test avoids the issue by using a separate RpcEnv / scheduler instance
altogether, and also avoids a misleading NPE in the test logs.

Closes #25318 from vanzin/SPARK-24352.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-07-31 17:44:20 -07:00
sychen 70ef9064a8 [SPARK-28564][CORE] Access history application defaults to the last attempt id
## What changes were proposed in this pull request?
When we set ```spark.history.ui.maxApplications``` to a small value, some apps cannot be found through the page search.
If the URL is constructed manually (http://localhost:18080/history/local-xxx), it can still be accessed when the app has no attempt id.
But for apps with multiple attempts, such a URL cannot be accessed and the page displays Not Found.

## How was this patch tested?
Add UT

Closes #25301 from cxzl25/hs_app_last_attempt_id.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-07-31 13:24:36 -07:00
HyukjinKwon b8e13b0aea [SPARK-28153][PYTHON] Use AtomicReference at InputFileBlockHolder (to support input_file_name with Python UDF)
## What changes were proposed in this pull request?

This PR proposes to use `AtomicReference` so that parent and child threads can access the same file block holder.

Python UDF expressions are turned into a plan, which then launches a separate thread to consume the input iterator. In this separate child thread, the iterator calls `InputFileBlockHolder.set` before the parent does, and the parent thread is unable to read the value later.

1. In this separate child thread, if it happens to call `InputFileBlockHolder.set` first, before the parent's thread local is initialized (which happens when `ThreadLocal.get()` is first called), the child thread ends up calling its own `initialValue` to initialize.

2. After that, the parent calls its own `initialValue` to initialize at its first call of `ThreadLocal.get()`.

3. The two threads now hold two different references, so an update in the child isn't reflected in the parent.

This PR fixes it by initializing the parent's thread local with an `AtomicReference` holding the file status, so it can be shared with each task and the child thread's updates are reflected.
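
A hedged sketch of the core idea (names are illustrative, not the actual `InputFileBlockHolder` API): the thread local stores an `AtomicReference`, so a child thread that inherits the parent's reference publishes updates through the same cell instead of a private copy.

```scala
import java.util.concurrent.atomic.AtomicReference

// Illustrative only: the parent initializes the holder before spawning the child,
// the child inherits the same AtomicReference, and updates from either thread
// are visible to the other.
object SharedFileBlockSketch {
  private val holder = new InheritableThreadLocal[AtomicReference[String]] {
    override def initialValue(): AtomicReference[String] = new AtomicReference[String]("")
  }

  def initialize(): Unit = holder.get() // force creation on the parent thread first
  def set(file: String): Unit = holder.get().set(file)
  def get(): String = holder.get().get()
}
```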

I also tried to explain this a bit more at https://github.com/apache/spark/pull/24958#discussion_r297203041.

## How was this patch tested?

Manually tested, and a unit test was added.

Closes #24958 from HyukjinKwon/SPARK-28153.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
2019-07-31 22:40:01 +08:00
mcheah abef84a868 [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
## What changes were proposed in this pull request?

As part of the shuffle storage API proposed in SPARK-25299, this introduces an API for persisting shuffle data in arbitrary storage systems.

This patch introduces several concepts:
* `ShuffleDataIO`, which is the root of the entire plugin tree that will be proposed over the course of the shuffle API project.
* `ShuffleExecutorComponents` - the subset of plugins for managing shuffle-related components for each executor. This will in turn instantiate shuffle readers and writers.
* `ShuffleMapOutputWriter` interface - instantiated once per map task. This provides child `ShufflePartitionWriter` instances for persisting the bytes for each partition in the map task.

The default implementation of these plugins exactly mirrors what was done by the existing shuffle writing code - namely, writing the data to local disk and writing an index file. We leverage the APIs in the `BypassMergeSortShuffleWriter` only. Follow-up PRs will use the APIs in `SortShuffleWriter` and `UnsafeShuffleWriter`, but those are left as future work to minimize the review surface area.
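
To make the plugin hierarchy concrete, here is a non-authoritative sketch of its shape; the real interfaces live under `org.apache.spark.shuffle.api` and may differ in names and signatures.

```scala
import java.io.OutputStream

// Sketch of the plugin tree: data IO -> executor components -> per-map-task
// writer -> per-partition writer. The "Sketch" suffix stresses that these are
// illustrations, not the actual interfaces.
trait ShufflePartitionWriterSketch {
  def openStream(): OutputStream
}

trait ShuffleMapOutputWriterSketch {
  def getPartitionWriter(reducePartitionId: Int): ShufflePartitionWriterSketch
  def commitAllPartitions(): Unit
}

trait ShuffleExecutorComponentsSketch {
  def createMapOutputWriter(shuffleId: Int, mapTaskId: Long, numPartitions: Int): ShuffleMapOutputWriterSketch
}

trait ShuffleDataIOSketch {
  def executor(): ShuffleExecutorComponentsSketch
}
```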

## How was this patch tested?

New unit tests were added. Micro-benchmarks indicate there's no slowdown in the affected code paths.

Closes #25007 from mccheah/spark-shuffle-writer-refactor.

Lead-authored-by: mcheah <mcheah@palantir.com>
Co-authored-by: mccheah <mcheah@palantir.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-07-30 14:17:30 -07:00
pgandhi 70910e6ad0 [SPARK-26755][SCHEDULER] : Optimize Spark Scheduler to dequeue speculative tasks…
… more efficiently

This PR improves the performance of scheduling speculative tasks to be O(1) instead of O(numSpeculativeTasks), using the same approach used for scheduling regular tasks. The performance of this method is particularly important because a lock is held on the TaskSchedulerImpl which is a bottleneck for all scheduling operations. We ran a Join query on a large dataset with speculation enabled and out of 100000 tasks for the ShuffleMapStage, the maximum number of speculatable tasks that was noted was close to 7900-8000 at a point. That is when we start seeing the bottleneck on the scheduler lock.

In particular, this works by storing a separate stack of tasks by executor, node, and rack locality preferences. Then when trying to schedule a speculative task, rather than scanning all speculative tasks to find ones which match the given executor (or node, or rack) preference, we can jump to a quick check of tasks matching the resource offer. This technique was already used for regular tasks -- this change refactors the code to allow sharing with regular and speculative task execution.

## What changes were proposed in this pull request?

We have split the main queue, "speculatableTasks", into 5 separate queues based on locality preference, similar to how normal tasks are enqueued. Thus, the "dequeueSpeculativeTask" method avoids performing locality checks for each task at runtime and simply returns the preferred task to be executed.
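
A hedged sketch of the locality-indexed bookkeeping (field and method names are hypothetical, not the actual `TaskSetManager` members):

```scala
import scala.collection.mutable

// Illustrative only: pending speculative task indices are bucketed by locality
// preference, so matching a resource offer is a map lookup plus a pop rather
// than a scan over every speculatable task.
class SpeculativeQueuesSketch {
  val forExecutor = mutable.HashMap.empty[String, mutable.ArrayBuffer[Int]]
  val forHost     = mutable.HashMap.empty[String, mutable.ArrayBuffer[Int]]
  val forRack     = mutable.HashMap.empty[String, mutable.ArrayBuffer[Int]]
  val noPrefs     = mutable.ArrayBuffer.empty[Int]
  val anyLocality = mutable.ArrayBuffer.empty[Int]

  // Pop from the end (stack order), skipping indices that are no longer speculatable.
  def dequeue(bucket: mutable.ArrayBuffer[Int], stillSpeculatable: Int => Boolean): Option[Int] = {
    while (bucket.nonEmpty) {
      val index = bucket.remove(bucket.length - 1)
      if (stillSpeculatable(index)) return Some(index)
    }
    None
  }
}
```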

## How was this patch tested?
We ran a spark job that performed a join on a 10 TB dataset to test the code change.
Original Code:
<img width="1433" alt="screen shot 2019-01-28 at 5 07 22 pm" src="https://user-images.githubusercontent.com/22228190/51873321-572df280-2322-11e9-9149-0aae08d5edc6.png">

Optimized Code:
<img width="1435" alt="screen shot 2019-01-28 at 5 08 19 pm" src="https://user-images.githubusercontent.com/22228190/51873343-6745d200-2322-11e9-947b-2cfd0f06bcab.png">

As you can see, the run time of the ShuffleMapStage came down from roughly 40 min to 6 min, reducing the overall running time of the Spark job by a significant amount.

Another example for the same job:

Original Code:
<img width="1440" alt="screen shot 2019-01-28 at 5 11 30 pm" src="https://user-images.githubusercontent.com/22228190/51873355-70cf3a00-2322-11e9-9c3a-af035449a306.png">

Optimized Code:
<img width="1440" alt="screen shot 2019-01-28 at 5 12 16 pm" src="https://user-images.githubusercontent.com/22228190/51873367-7dec2900-2322-11e9-8d07-1b1b49285f71.png">

Closes #23677 from pgandhi999/SPARK-26755.

Lead-authored-by: pgandhi <pgandhi@verizonmedia.com>
Co-authored-by: pgandhi <pgandhi@oath.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
2019-07-30 09:54:51 -05:00
Lee Dongjin d98aa2a184 [MINOR] Trivial cleanups
These are what I found during working on #22282.

- Remove unused value: `UnsafeArraySuite#defaultTz`
- Remove redundant new modifier to the case class, `KafkaSourceRDDPartition`
- Remove unused variables from `RDD.scala`
- Remove trailing space from `structured-streaming-kafka-integration.md`
- Remove redundant parameter from `ArrowConvertersSuite`: `nullable` is `true` by default.
- Remove leading empty line: `UnsafeRow`
- Remove trailing empty line: `KafkaTestUtils`
- Remove unthrown exception type: `UnsafeMapData`
- Replace unused declarations: `expressions`
- Remove duplicated default parameter: `AnalysisErrorSuite`
- `ObjectExpressionsSuite`: remove duplicated parameters, conversions and unused variable

Closes #25251 from dongjinleekr/cleanup/201907.

Authored-by: Lee Dongjin <dongjin@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-07-29 23:38:02 +09:00
Dongjoon Hyun a428f40669 [SPARK-28549][BUILD][CORE][SQL] Use text.StringEscapeUtils instead lang3.StringEscapeUtils
## What changes were proposed in this pull request?

`org.apache.commons.lang3.StringEscapeUtils` was deprecated over two years ago at [LANG-1316](https://issues.apache.org/jira/browse/LANG-1316). There have been no bug fixes since then.
```java
/**
 * <p>Escapes and unescapes {code String}s for
 * Java, Java Script, HTML and XML.</p>
 *
 * <p>#ThreadSafe#</p>
 * since 2.0
 * deprecated as of 3.6, use commons-text
 * <a href="https://commons.apache.org/proper/commons-text/javadocs/api-release/org/apache/commons/text/StringEscapeUtils.html">
 * StringEscapeUtils</a> instead
 */
Deprecated
public class StringEscapeUtils {
```

This PR aims to use the latest one from `commons-text` module which has more bug fixes like
[TEXT-100](https://issues.apache.org/jira/browse/TEXT-100), [TEXT-118](https://issues.apache.org/jira/browse/TEXT-118) and [TEXT-120](https://issues.apache.org/jira/browse/TEXT-120) by the following replacement.
```scala
-import org.apache.commons.lang3.StringEscapeUtils
+import org.apache.commons.text.StringEscapeUtils
```

This will add a new dependency to `hadoop-2.7` profile distribution. In `hadoop-3.2` profile, we already have it.
```
+commons-text-1.6.jar
```

## How was this patch tested?

Pass the Jenkins with the existing tests.
- [Hadoop 2.7](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108281)
- [Hadoop 3.2](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/108282)

Closes #25281 from dongjoon-hyun/SPARK-28549.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-07-29 11:45:29 +09:00
Marcelo Vanzin 7f84104b39 [SPARK-28535][CORE][TEST] Slow down tasks to de-flake JobCancellationSuite
This test tries to detect correct behavior in racy code, where the event
thread is racing with the executor thread that's trying to kill the running
task.

If the event that signals the stage end arrives first, any delay in the
delivery of the message to kill the task causes the code to rapidly process
elements, and may cause the test assertion to fail. Adding a 10ms delay in
LocalSchedulerBackend before the task kill makes the test run through
~1000 elements. A longer delay can easily cause the 10000 elements to
be processed.

Instead, by adding a small delay (10ms) in the test code that processes
elements, there's a much lower probability that the kill event will not
arrive before the end; that leaves a window of 100s for the event
to be delivered to the executor. And because each element only sleeps for
10ms, the test is not really slowed down at all.

Closes #25270 from vanzin/SPARK-28535.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-07-27 11:06:35 -07:00
Luca Canali f2a2d980ed [SPARK-25285][CORE] Add startedTasks and finishedTasks to the metrics system in the executor instance
## What changes were proposed in this pull request?

The motivation for these additional metrics is to help in troubleshooting and monitoring the task execution workload when running on a cluster. Currently available metrics include executor threadpool metrics for completed tasks and for active tasks. The addition of a threadpool taskStarted metric will allow, for example, collecting info on the (approximate) number of failed tasks by computing the difference: tasks started - (active tasks + completed tasks and/or successfully finished tasks).
The proposed metric finishedTasks is also intended for this type of troubleshooting. The difference between finishedTasks and threadpool.completeTasks is that the latter is a (dropwizard library) gauge taken from the threadpool, while the former is a (dropwizard) counter computed in the [[Executor]] class when a task successfully finishes, together with several other task metric counters.
Note that there are similarities with some of the metrics introduced in SPARK-24398; however, there are key differences coming from the fact that this PR concerns the executor source: it provides metric values per executor, and the metric values do not need to pass through the listener bus in this case.

## How was this patch tested?

Manually tested on a YARN cluster

Closes #22290 from LucaCanali/AddMetricExecutorStartedTasks.

Lead-authored-by: Luca Canali <luca.canali@cern.ch>
Co-authored-by: LucaCanali <luca.canali@cern.ch>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-07-26 14:03:57 -07:00
Marcelo Vanzin a3e013391e [SPARK-28455][CORE] Avoid overflow when calculating executor timeout time
This would cause the timeout time to be negative, so executors would be
timed out immediately (instead of never).
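
A minimal sketch of an overflow-safe deadline computation (illustrative, not the exact change in the executor monitoring code):

```scala
object TimeoutSketch {
  // Clamp to Long.MaxValue ("never times out") instead of letting
  // nowMs + timeoutMs wrap around to a negative value.
  def deadline(nowMs: Long, timeoutMs: Long): Long = {
    val sum = nowMs + timeoutMs
    if (sum < nowMs) Long.MaxValue else sum
  }
}
```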

I also tweaked a couple of log messages that could get pretty long when
lots of executors were active.

Added unit test (which failed without the fix).

Closes #25208 from vanzin/SPARK-28455.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-07-22 14:31:58 -07:00
Josh Rosen 3776fbdfde [SPARK-28430][UI] Fix stage table rendering when some tasks' metrics are missing
## What changes were proposed in this pull request?

The Spark UI's stages table misrenders the input/output metrics columns when some tasks are missing input metrics. See the screenshot below for an example of the problem:

![image](https://user-images.githubusercontent.com/50748/61420042-a3abc100-a8b5-11e9-8a92-7986563ee712.png)

This is because those columns are defined as

```scala
 {if (hasInput(stage)) {
  metricInfo(task) { m =>
    ...
   <td>....</td>
  }
}
```

where `metricInfo` renders the node returned by the closure when metrics are defined, or returns `Nil` when they are not. If metrics are undefined, we fail to render even an empty `<td></td>` tag, causing columns to become misaligned as shown in the screenshot.

To fix this, this patch changes this to

```scala
 {if (hasInput(stage)) {
  <td>{
    metricInfo(task) { m =>
      ...
     Unparsed(...)
    }
  }</td>
}
```

which is an idiom that's already in use for the shuffle read / write columns.

## How was this patch tested?

It isn't. I'm arguing for correctness because the modifications are consistent with rendering methods that work correctly for other columns.

Closes #25183 from JoshRosen/joshrosen/fix-task-table-with-partial-io-metrics.

Authored-by: Josh Rosen <rosenville@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-07-18 13:15:39 -07:00
Marcelo Vanzin 2ddeff97d7 [SPARK-27963][CORE] Allow dynamic allocation without a shuffle service.
This change adds a new option that enables dynamic allocation without
the need for a shuffle service. This mode works by tracking which stages
generate shuffle files, and keeping executors that generate data for those
shuffles alive while the jobs that use them are active.

A separate timeout is also added for shuffle data; so that executors that
hold shuffle data can use a separate timeout before being removed because
of being idle. This allows the shuffle data to be kept around in case it
is needed by some new job, or allow users to be more aggressive in timing
out executors that don't have shuffle data in active use.

The code also hooks into the context cleaner so that shuffles that are garbage
collected are detected, and the respective executors are not held unnecessarily.
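
For reference, a sketch of enabling the mode; the `spark.dynamicAllocation.shuffleTracking.*` property names below are my best recollection of the configuration added here and should be treated as assumptions.

```scala
import org.apache.spark.SparkConf

// Sketch only; verify property names against the dynamic allocation documentation.
val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.shuffleTracking.enabled", "true")  // no external shuffle service required
  .set("spark.dynamicAllocation.shuffleTracking.timeout", "30min") // separate idle timeout for executors holding shuffle data
```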

Testing done with added unit tests, and also with TPC-DS workloads on
YARN without a shuffle service.

Closes #24817 from vanzin/SPARK-27963.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-07-16 16:37:38 -07:00
Angers be4a55220a [SPARK-28106][SQL] When Spark SQL use "add jar" , before add to SparkContext, check jar path exist first.
## What changes were proposed in this pull request?
ISSUE :  https://issues.apache.org/jira/browse/SPARK-28106
When we use add jar in SQL, it will have three step:

- add jar to HiveClient's classloader
- HiveClientImpl.runHiveSQL("ADD JAR" + PATH)
- SessionStateBuilder.addJar

The second step seems to have no impact on the whole process; even if it fails, we can still execute.
The first step adds the jar path to HiveClient's ClassLoader, so we can then use the jar in HiveClientImpl.
The third step adds this jar path to SparkContext. For a local file path, it calls RpcServer's FileServer to add it to the env, so if you pass a wrong path it causes an error; but if you pass an HDFS or VIEWFS path, it won't check the path and just adds it to the jar path map.

Then, when the TaskSetManager next sends out a Task, this path is carried by the TaskDescription. The Executor then calls updateDependencies, which checks all jar paths and file paths in the TaskDescription. Then an error happens like below:

![image](https://user-images.githubusercontent.com/46485123/59817635-4a527f80-9353-11e9-9e08-9407b2b54023.png)

## How was this patch tested?
Existing unit tests
Environment test

Closes #24909 from AngersZhuuuu/SPARK-28106.

Lead-authored-by: Angers <angers.zhu@gamil.com>
Co-authored-by: 朱夷 <zhuyi01@corp.netease.com>
Signed-off-by: jerryshao <jerryshao@tencent.com>
2019-07-16 15:29:05 +08:00
Marcelo Vanzin 8d1e87ac90 [SPARK-28150][CORE][FOLLOW-UP] Don't try to log in when impersonating.
When fetching delegation tokens for a proxy user, don't try to log in,
since it will fail.

Closes #25141 from vanzin/SPARK-28150.2.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-07-15 10:32:34 -07:00
Liang-Chi Hsieh 591de42351 [SPARK-28381][PYSPARK] Upgraded version of Pyrolite to 4.30
## What changes were proposed in this pull request?

This upgrades to a newer version of Pyrolite. Most updates [1] in the newer version are for dotnet. For Java, it includes a bug fix to Unpickler regarding cleaning up the Unpickler memo, and support for protocol 5.

After upgrading, we can remove the fix at SPARK-27629 for the bug in Unpickler.

[1] https://github.com/irmen/Pyrolite/compare/pyrolite-4.23...master

## How was this patch tested?

Manually tested on Python 3.6 in local on existing tests.

Closes #25143 from viirya/upgrade-pyrolite.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-07-15 12:29:58 +09:00
Jesse Cai 79e2047703 [SPARK-28355][CORE][PYTHON] Use Spark conf for threshold at which command is compressed by broadcast
## What changes were proposed in this pull request?

The `_prepare_for_python_RDD` method currently broadcasts a pickled command if its length is greater than the hardcoded value `1 << 20` (1M). This change sets this value as a Spark conf instead.

## How was this patch tested?

Unit tests, manual tests.

Closes #25123 from jessecai/SPARK-28355.

Authored-by: Jesse Cai <jesse.cai@databricks.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
2019-07-13 08:44:16 -07:00
Dongjoon Hyun 1c29212394 [SPARK-28357][CORE][TEST] Fix Flaky Test - FileAppenderSuite.rollingfile appender - size-based rolling compressed
## What changes were proposed in this pull request?

`SizeBasedRollingPolicy.shouldRollover` returns false when the size is equal to `rolloverSizeBytes`.
```scala
  /** Should rollover if the next set of bytes is going to exceed the size limit */
  def shouldRollover(bytesToBeWritten: Long): Boolean = {
    logDebug(s"$bytesToBeWritten + $bytesWrittenSinceRollover > $rolloverSizeBytes")
    bytesToBeWritten + bytesWrittenSinceRollover > rolloverSizeBytes
  }
```
- https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107553/testReport/org.apache.spark.util/FileAppenderSuite/rolling_file_appender___size_based_rolling__compressed_/
```
org.scalatest.exceptions.TestFailedException: 1000 was not less than 1000
```

## How was this patch tested?

Pass the Jenkins with the updated test.

Closes #25125 from dongjoon-hyun/SPARK-28357.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-07-12 18:40:07 +09:00
Gabor Somogyi f83000597f [SPARK-23472][CORE] Add defaultJavaOptions for driver and executor.
## What changes were proposed in this pull request?

This PR adds two new config properties: `spark.driver.defaultJavaOptions` and `spark.executor.defaultJavaOptions`. These are intended to be set by administrators in a file of defaults, for options like the JVM garbage collection algorithm. Users will still set the `extraJavaOptions` properties, and both sets of JVM options are used when starting the JVM (default options are prepended to the extra options).
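
For example (the values are illustrative), an administrator sets the defaults and a user adds extras; both end up on the JVM command line, with the defaults prepended:

```scala
import org.apache.spark.SparkConf

// Illustrative values only.
val conf = new SparkConf()
  .set("spark.executor.defaultJavaOptions", "-XX:+UseG1GC")           // typically set by admins in spark-defaults.conf
  .set("spark.executor.extraJavaOptions", "-XX:MaxGCPauseMillis=200") // set by users; appended after the defaults
```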

## How was this patch tested?

Existing + additional unit tests.
```
cd docs/
SKIP_API=1 jekyll build
```
Manual webpage check.

Closes #24804 from gaborgsomogyi/SPARK-23472.

Authored-by: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-07-11 09:37:26 -07:00
Thomas Graves f84cca2d84 [SPARK-28234][CORE][PYTHON] Add python and JavaSparkContext support to get resources
## What changes were proposed in this pull request?

Add Python API support and JavaSparkContext support for resources(). The JavaSparkContext support is needed for it to translate properly into Python via Py4J.

## How was this patch tested?

Unit tests added and manually tested in local cluster mode and on yarn.

Closes #25087 from tgravescs/SPARK-28234-python.

Authored-by: Thomas Graves <tgraves@nvidia.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-07-11 09:32:58 +09:00
Dongjoon Hyun a6506f0c8a [SPARK-28290][CORE][SQL] Use SslContextFactory.Server instead of SslContextFactory
## What changes were proposed in this pull request?

`SslContextFactory` is deprecated as of Jetty 9.4, and we are using `9.4.18.v20190429`. This PR aims to replace it with `SslContextFactory.Server`.
- https://www.eclipse.org/jetty/javadoc/9.4.19.v20190610/org/eclipse/jetty/util/ssl/SslContextFactory.html
- https://www.eclipse.org/jetty/javadoc/9.3.24.v20180605/org/eclipse/jetty/util/ssl/SslContextFactory.html

```
[WARNING] /Users/dhyun/APACHE/spark/core/src/main/scala/org/apache/spark/SSLOptions.scala:71:
constructor SslContextFactory in class SslContextFactory is deprecated:
see corresponding Javadoc for more information.
[WARNING]       val sslContextFactory = new SslContextFactory()
[WARNING]                               ^
```
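
The replacement itself is a one-liner; Jetty 9.4 provides the nested `Server` class for server-side use:

```scala
import org.eclipse.jetty.util.ssl.SslContextFactory

// Replaces the deprecated `new SslContextFactory()` for server-side TLS configuration.
val sslContextFactory = new SslContextFactory.Server()
```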

## How was this patch tested?

Pass the Jenkins with the existing tests.

Closes #25067 from dongjoon-hyun/SPARK-28290.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-07-10 09:05:56 -07:00
Dongjoon Hyun bbc2be4f42 [SPARK-28294][CORE] Support spark.history.fs.cleaner.maxNum configuration
## What changes were proposed in this pull request?

Up to now, Apache Spark maintains the given event log directory with a **time**-based policy, `spark.history.fs.cleaner.maxAge`. However, there are two issues.
1. Some file systems have a limitation on the maximum number of files in a single directory. For example, HDFS `dfs.namenode.fs-limits.max-directory-items` is 1024 * 1024 by default.
https://hadoop.apache.org/docs/r3.2.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
2. Spark is sometimes unable to clean up some old log files due to permission issues (mainly, security policy).

To handle both (1) and (2), this PR aims to support an additional policy configuration for the maximum number of files in the event log directory, `spark.history.fs.cleaner.maxNum`. Spark will try to keep the number of files in the event log directory within this limit.
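
For example (the values are illustrative), both policies can be combined:

```scala
import org.apache.spark.SparkConf

// Illustrative values: keep event logs at most 7 days old AND at most 5000 files.
val conf = new SparkConf()
  .set("spark.history.fs.cleaner.enabled", "true")
  .set("spark.history.fs.cleaner.maxAge", "7d")
  .set("spark.history.fs.cleaner.maxNum", "5000")
```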

## How was this patch tested?

Pass the Jenkins with a newly added test case.

Closes #25072 from dongjoon-hyun/SPARK-28294.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-07-10 07:19:47 -07:00
Wenchen Fan d41bd7c891 [SPARK-26713][CORE][FOLLOWUP] revert the partial fix in ShuffleBlockFetcherIterator
## What changes were proposed in this pull request?

This PR reverts the partial bug fix in `ShuffleBlockFetcherIterator` which was introduced by https://github.com/apache/spark/pull/23638 .
The reasons:
1. It's a potential bug. After fixing `PipelinedRDD` in #23638 , the original problem was resolved.
2. The fix is incomplete according to [the discussion](https://github.com/apache/spark/pull/23638#discussion_r251869084)

We should fix the potential bug completely later.

## How was this patch tested?

existing tests

Closes #25049 from cloud-fan/revert.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-07-09 11:52:12 +09:00
Gabor Somogyi 0b6c2c259c [MINOR] Add requestHeaderSize debug log
## What changes were proposed in this pull request?

`requestHeaderSize` was added in https://github.com/apache/spark/pull/23090 and applies to the Spark UI and the History Server UI as well. Without a debug log it's hard to find out which configuration value is used on which side.

In this PR I've added a log message which prints out the value.

## How was this patch tested?

Manually checked log files.

Closes #25045 from gaborgsomogyi/SPARK-26118.

Authored-by: Gabor Somogyi <gabor.g.somogyi@gmail.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-07-03 09:34:31 -07:00
ShuMingLi 378ed677a8 [SPARK-28202][CORE][TEST] Avoid noises of system props in SparkConfSuite
When the SPARK_HOME env variable is set and contains a specific `spark-defaults.conf`, the `org.apache.spark.util.loadDefaultSparkProperties` method may pollute the system properties. So when running the `core/test` module, `SparkConfSuite` may fail.

It's easy to fix by setting `loadDefaults` in `SparkConf` to false.

```
[info] - deprecated configs *** FAILED *** (79 milliseconds)
[info] 7 did not equal 4 (SparkConfSuite.scala:266)
[info] org.scalatest.exceptions.TestFailedException:
[info] at org.scalatest.Assertions.newAssertionFailedException(Assertions.scala:528)
[info] at org.scalatest.Assertions.newAssertionFailedException$(Assertions.scala:527)
[info] at org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1560)
[info] at org.scalatest.Assertions$AssertionsHelper.macroAssert(Assertions.scala:501)
[info] at org.apache.spark.SparkConfSuite.$anonfun$new$26(SparkConfSuite.scala:266)
[info] at org.scalatest.OutcomeOf.outcomeOf(OutcomeOf.scala:85)
[info] at org.scalatest.OutcomeOf.outcomeOf$(OutcomeOf.scala:83)
[info] at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info] at org.scalatest.Transformer.apply(Transformer.scala:22)
[info] at org.scalatest.Transformer.apply(Transformer.scala:20)
[info] at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:186)
[info] at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:149)
[info] at org.scalatest.FunSuiteLike.invokeWithFixture$1(FunSuiteLike.scala:184)
[info] at org.scalatest.FunSuiteLike.$anonfun$runTest$1(FunSuiteLike.scala:196)
[info] at org.scalatest.SuperEngine.runTestImpl(Engine.scala:289)
```

Closes #24998 from LiShuMing/SPARK-28202.

Authored-by: ShuMingLi <ming.moriarty@gmail.com>
Signed-off-by: jerryshao <jerryshao@tencent.com>
2019-07-02 10:04:42 +08:00
HyukjinKwon 02f4763286 [SPARK-28198][PYTHON] Add mapPartitionsInPandas to allow an iterator of DataFrames
## What changes were proposed in this pull request?

This PR proposes to add `mapPartitionsInPandas` API to DataFrame by using existing `SCALAR_ITER` as below:

1. Filtering via setting the column

```python
from pyspark.sql.functions import pandas_udf, PandasUDFType

df = spark.createDataFrame([(1, 21), (2, 30)], ("id", "age"))

pandas_udf(df.schema, PandasUDFType.SCALAR_ITER)
def filter_func(iterator):
    for pdf in iterator:
        yield pdf[pdf.id == 1]

df.mapPartitionsInPandas(filter_func).show()
```

```
+---+---+
| id|age|
+---+---+
|  1| 21|
+---+---+
```

2. `DataFrame.loc`

```python
from pyspark.sql.functions import pandas_udf, PandasUDFType
import pandas as pd

df = spark.createDataFrame([['aa'], ['bb'], ['cc'], ['aa'], ['aa'], ['aa']], ["value"])

pandas_udf(df.schema, PandasUDFType.SCALAR_ITER)
def filter_func(iterator):
    for pdf in iterator:
        yield pdf.loc[pdf.value.str.contains('^a'), :]

df.mapPartitionsInPandas(filter_func).show()
```

```
+-----+
|value|
+-----+
|   aa|
|   aa|
|   aa|
|   aa|
+-----+
```

3. `pandas.melt`

```python
from pyspark.sql.functions import pandas_udf, PandasUDFType
import pandas as pd

df = spark.createDataFrame(
    pd.DataFrame({'A': {0: 'a', 1: 'b', 2: 'c'},
                  'B': {0: 1, 1: 3, 2: 5},
                  'C': {0: 2, 1: 4, 2: 6}}))

pandas_udf("A string, variable string, value long", PandasUDFType.SCALAR_ITER)
def filter_func(iterator):
    for pdf in iterator:
        import pandas as pd
        yield pd.melt(pdf, id_vars=['A'], value_vars=['B', 'C'])

df.mapPartitionsInPandas(filter_func).show()
```

```
+---+--------+-----+
|  A|variable|value|
+---+--------+-----+
|  a|       B|    1|
|  a|       C|    2|
|  b|       B|    3|
|  b|       C|    4|
|  c|       B|    5|
|  c|       C|    6|
+---+--------+-----+
```

The current limitation of `SCALAR_ITER` is that it doesn't allow a result of a different length, which is pretty critical in practice - for instance, we cannot simply filter by using Pandas APIs; we can merely map N to N. This PR allows mapping N to M, like flatMap.

This API mimics `mapPartitions` but keeps the API shape of `SCALAR_ITER` while allowing results of different lengths.

### How does this PR implement?

This PR mimics both `dapply` with Arrow optimization and Grouped Map Pandas UDF. On the Python execution side, it reuses the existing `SCALAR_ITER` code path.

Therefore, externally, we don't introduce any new type of Pandas UDF but internally we use another evaluation type code `205` (`SQL_MAP_PANDAS_ITER_UDF`).

This approach is similar to the Pandas window function implementation with Grouped Aggregation Pandas UDFs - internally we have `203` (`SQL_WINDOW_AGG_PANDAS_UDF`) but externally we just share the same `GROUPED_AGG`.

## How was this patch tested?

Manually tested, and unit tests were added.

Closes #24997 from HyukjinKwon/scalar-udf-iter.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-07-02 10:54:16 +09:00
Marcelo Vanzin 6af47b93ec [SPARK-28150][CORE] Log in user before getting delegation tokens.
This ensures that tokens are always created with an empty UGI, which
allows multiple contexts to be (sequentially) started from the same JVM.

Tested with code attached to the bug, and also usual kerberos tests.

Closes #24955 from vanzin/SPARK-28150.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-06-27 13:30:28 -07:00
wuyi 7cbe01e8ef [SPARK-27369][CORE] Setup resources when Standalone Worker starts up
## What changes were proposed in this pull request?

To support GPU-aware scheduling in Standalone (cluster mode), the Worker should have the ability to set up resources (e.g. GPU/FPGA) when it starts up.

Similar to the driver/executor, the Worker has two ways (resourceFile & resourceDiscoveryScript) to set up resources when it starts up. Users can use `SPARK_WORKER_OPTS` to apply resource configs to the Worker in the form of "-Dx=y". For example,
```
SPARK_WORKER_OPTS="-Dspark.worker.resource.gpu.amount=2 \
                   -Dspark.worker.resource.fpga.amount=1 \
                   -Dspark.worker.resource.fpga.discoveryScript=/Users/wuyi/tmp/getFPGAResources.sh \
                   -Dspark.worker.resourcesFile=/Users/wuyi/tmp/worker-resource-file"
 ```
## How was this patch tested?

Tested manually in Standalone locally:

- Worker can start up normally when no resources are configured

- Worker should fail to start up when an exception is thrown while setting up resources (e.g. unknown directory, parse failure)

- Worker can set up resources from a resource file

- Worker can set up resources from discovery scripts

- Worker should set up resources from the resource file & discovery scripts when both are configured.

Closes #24841 from Ngone51/dev-worker-resources-setup.

Authored-by: wuyi <ngone_5451@163.com>
Signed-off-by: Xingbo Jiang <xingbo.jiang@databricks.com>
2019-06-26 19:19:00 -07:00
Bryan Cutler c277afb12b [SPARK-27992][PYTHON] Allow Python to join with connection thread to propagate errors
## What changes were proposed in this pull request?

Currently with `toLocalIterator()` and `toPandas()` with Arrow enabled, if the Spark job being run in the background serving thread errors, it will be caught and sent to Python through the PySpark serializer.
This is not the ideal solution because it only catches a SparkException, it won't handle an error that occurs in the serializer, and each method has to have its own special handling to propagate the error.

This PR instead returns the Python Server object along with the serving port and authentication info, so that it allows the Python caller to join with the serving thread. During the call to join, the serving thread Future is completed either successfully or with an exception. In the latter case, the exception will be propagated to Python through the Py4j call.

## How was this patch tested?

Existing tests

Closes #24834 from BryanCutler/pyspark-propagate-server-error-SPARK-27992.

Authored-by: Bryan Cutler <cutlerb@gmail.com>
Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
2019-06-26 13:05:41 -07:00
Dongjoon Hyun 7eeca02940 [SPARK-28157][CORE] Make SHS clear KVStore LogInfos for the blacklisted entries
## What changes were proposed in this pull request?

In Spark 2.4.0/2.3.2/2.2.3, [SPARK-24948](https://issues.apache.org/jira/browse/SPARK-24948) delegated access permission checks to the file system and maintains a blacklist of all event log files that failed once at reading. The blacklisted log files are released back after `CLEAN_INTERVAL_S` seconds.

However, released files whose sizes don't change are ignored forever due to the `info.fileSize < entry.getLen()` condition (previously [here](3c96937c7b (diff-a7befb99e7bd7e3ab5c46c2568aa5b3eR454)) and now at [shouldReloadLog](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala#L571)), which always returns `false` when the size is the same as the existing value in `KVStore`. This can only be recovered via an SHS restart.

This PR aims to remove the existing entry from `KVStore` when it goes to the blacklist.

## How was this patch tested?

Pass the Jenkins with the updated test case.

Closes #24966 from dongjoon-hyun/SPARK-28157.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: DB Tsai <d_tsai@apple.com>
2019-06-26 18:56:06 +00:00
sychen 38263f6d15 [SPARK-27630][CORE] Properly handle task end events from completed stages
## What changes were proposed in this pull request?
Track tasks separately for each stage attempt (instead of tracking by stage), and do NOT reset the numRunningTasks to 0 on StageCompleted.

In the case of a stage retry, the `taskEnd` event from the zombie stage sometimes makes the number of `totalRunningTasks` negative, which causes the job to get stuck.
Similar problem also exists with `stageIdToTaskIndices` & `stageIdToSpeculativeTaskIndices`.
If it is a failed `taskEnd` event of the zombie stage, this will cause `stageIdToTaskIndices` or `stageIdToSpeculativeTaskIndices` to remove the task index of the active stage, and the number of `totalPendingTasks` will increase unexpectedly.
## How was this patch tested?
A unit test that properly handles task end events from completed stages.

Closes #24497 from cxzl25/fix_stuck_job_follow_up.

Authored-by: sychen <sychen@ctrip.com>
Signed-off-by: Imran Rashid <irashid@cloudera.com>
2019-06-25 14:30:13 -05:00
“attilapiros” b71c130fc6 [SPARK-27622][CORE] Avoiding the network when block manager fetches disk persisted RDD blocks from the same host
## What changes were proposed in this pull request?

Before this PR, when fetching a disk-persisted RDD block, the network was always used to get the requested block content, even when both the source and the fetching executor were running on the same host.

The idea of accessing another executor's local disk files by directly reading the disk comes from the external shuffle service, where the local dirs are stored for each executor (block manager).

To make this possible the following changes are done:
- The `RegisterBlockManager` message is extended with `localDirs`, which is stored by the block manager master for each block manager as a new property of the `BlockManagerInfo`
- `GetLocationsAndStatus` is extended with the requester's host
- `BlockLocationsAndStatus` (the reply to the `GetLocationsAndStatus` message) is extended with an optional list of local directories, which is filled with the local directories of a same-host executor (if there is any; otherwise None is used). This is where the block content can be read from.

Shuffle blocks are out of scope of this PR: there will be a separate PR opened for that (for another Jira issue).

## How was this patch tested?

With a new unit test in `BlockManagerSuite`. See the test prefixed by "SPARK-27622: avoid the network when block requested from same host".

Closes #24554 from attilapiros/SPARK-27622.

Authored-by: “attilapiros” <piros.attila.zsolt@gmail.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2019-06-25 07:35:44 -07:00
Xingbo Jiang 731a60c92e [SPARK-27823][CORE][DOC][FOLLOWUP] Update doc of config spark.driver.resourcesFile
## What changes were proposed in this pull request?

Update doc of config `spark.driver.resourcesFile`

## How was this patch tested?

N/A

Closes #24954 from jiangxb1987/ResourceAllocation.

Authored-by: Xingbo Jiang <xingbo.jiang@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-06-25 13:47:47 +09:00
zhengruifeng 6064368415 [SPARK-27018][CORE] Fix incorrect removal of checkpointed file in PeriodicCheckpointer
## What changes were proposed in this pull request?
Remove the oldest checkpointed file only if the next checkpoint exists.
I think this patch needs back-porting.

## How was this patch tested?
existing test

Local check in spark-shell with the following snippet:
```
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.classification.GBTClassifier

case class Row(features: org.apache.spark.ml.linalg.Vector, label: Int)

sc.setCheckpointDir("/checkpoints")
val trainingData = sc.parallelize(1 to 2426874, 256).map(x => Row(Vectors.dense(x, x + 1, x * 2 % 10), if (x % 5 == 0) 1 else 0)).toDF
val classifier = new GBTClassifier()
  .setLabelCol("label")
  .setFeaturesCol("features")
  .setProbabilityCol("probability")
  .setMaxIter(100)
  .setMaxDepth(10)
  .setCheckpointInterval(2)

classifier.fit(trainingData)
```

Closes #24870 from zhengruifeng/ck_update.

Authored-by: zhengruifeng <ruifengz@foxmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-06-24 09:34:01 -05:00
Jose Luis Pedrosa 0671395966 [SPARK-27989][CORE] Added retries on the connection to the driver for k8s
Disabled negative DNS caching for docker images

Improved logging of DNS resolution, convenient for slow k8s clusters

## What changes were proposed in this pull request?
Added retries when building the connection to the driver in K8s.
In some scenarios DNS resolution can take longer than the timeout.
Also, openjdk-8 has negative DNS caching enabled by default, which means even retries may not help depending on the timing.

## How was this patch tested?
This patch was tested against a specific k8s cluster with slow DNS response times to ensure it works.

Closes #24702 from jlpedrosa/feature/kuberetries.

Authored-by: Jose Luis Pedrosa <jlpedrosa@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-06-24 09:25:43 -05:00