This PR solves three SerializationDebugger issues.
* SPARK-7180 - SerializationDebugger fails with ArrayOutOfBoundsException
* SPARK-8090 - SerializationDebugger does not handle classes with writeReplace correctly
* SPARK-8091 - SerializationDebugger does not handle classes with writeObject method
The solutions for each are explained as follows:
* SPARK-7180 - The wrong slot descriptor was used when getting the values of the fields in the object being tested.
* SPARK-8090 - Test the type of the replaced object.
* SPARK-8091 - Use a dummy ObjectOutputStream to collect all the objects written by the writeObject() method, and then test those objects as usual.
I also added more tests to the test suite to increase code coverage. For example, I added tests for cases where there are no serializability issues.
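To illustrate the SPARK-8091 approach, here is a minimal sketch of a dummy ObjectOutputStream that records the objects passed to it; the names and details are illustrative, not the actual SerializationDebugger code:
```scala
import java.io.{ObjectOutputStream, OutputStream}
import scala.collection.mutable.ArrayBuffer

// Discard the serialized bytes, but record every object the stream is asked to write,
// so that each recorded object can then be tested for serializability as usual.
class ListObjectOutputStream extends ObjectOutputStream(new OutputStream {
  override def write(b: Int): Unit = { }  // throw the bytes away
}) {
  private val output = new ArrayBuffer[Any]
  this.enableReplaceObject(true)          // make replaceObject() get called for every object

  override protected def replaceObject(obj: Object): Object = {
    output += obj                         // record the object, but do not actually replace it
    obj
  }

  def outputObjects: Seq[Any] = output.toSeq
}
```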
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes #6625 from tdas/SPARK-7180 and squashes the following commits:
c7cb046 [Tathagata Das] Addressed comments on docs
ae212c8 [Tathagata Das] Improved docs
304c97b [Tathagata Das] Fixed build error
26b5179 [Tathagata Das] more tests.....92% line coverage
7e2fdcf [Tathagata Das] Added more tests
d1967fb [Tathagata Das] Added comments.
da75d34 [Tathagata Das] Removed unnecessary lines.
50a608d [Tathagata Das] Fixed bugs and added support for writeObject
This patch also re-enables the tests. Now that we have access to the log4j logs, it should be easier to debug the flakiness.
yhuai brkyvz
Author: Andrew Or <andrew@databricks.com>
Closes #6886 from andrewor14/spark-submit-suite-fix and squashes the following commits:
3f99ff1 [Andrew Or] Move destroy to finally block
9a62188 [Andrew Or] Re-enable ignored tests
2382672 [Andrew Or] Check for exit code
(cherry picked from commit 68a2dca292)
Signed-off-by: Andrew Or <andrew@databricks.com>
Dependencies of artifacts in the local ivy cache were not being resolved properly; they were simply not being picked up. Now they should be.
cc andrewor14
Author: Burak Yavuz <brkyvz@gmail.com>
Closes #6788 from brkyvz/local-ivy-fix and squashes the following commits:
2875bf4 [Burak Yavuz] fix temp dir bug
48cc648 [Burak Yavuz] improve deletion
a69e3e6 [Burak Yavuz] delete cache before test as well
0037197 [Burak Yavuz] fix merge conflicts
f60772c [Burak Yavuz] use different folder for m2 cache during testing
b6ef038 [Burak Yavuz] [SPARK-8095] Resolve dependencies of Spark Packages in local ivy cache
Conflicts:
core/src/test/scala/org/apache/spark/deploy/SparkSubmitUtilsSuite.scala
```scala
def getAllNodes: Seq[RDDOperationNode] = {
  _childNodes ++ _childClusters.flatMap(_.childNodes)
}
```
When `_childClusters` contains many nodes, this traversal hangs. I think we can improve the efficiency here; one possible improvement is sketched below.
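As a rough sketch (using simplified stand-in types, not the actual RDDOperationGraph code), the predicate can be pushed into the traversal so that callers such as "find all cached nodes" never materialize the full node list first and filter it afterwards:
```scala
case class RDDOperationNode(id: Int, cached: Boolean)

class RDDOperationCluster(
    val childNodes: Seq[RDDOperationNode],
    val childClusters: Seq[RDDOperationCluster]) {

  // Collect only the nodes that satisfy the predicate, recursing through child clusters.
  def collectNodes(p: RDDOperationNode => Boolean): Seq[RDDOperationNode] =
    childNodes.filter(p) ++ childClusters.flatMap(_.collectNodes(p))
}
```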
Author: xutingjun <xutingjun@huawei.com>
Closes #6839 from XuTingjun/DAGImprove and squashes the following commits:
53b03ea [xutingjun] change code to more concise and easier to read
f98728b [xutingjun] fix words: node -> nodes
f87c663 [xutingjun] put the filter inside
81f9fd2 [xutingjun] put the filter inside
(cherry picked from commit e2cdb0568b)
Signed-off-by: Andrew Or <andrew@databricks.com>
This PR fixes the sum issue and also adds `emptyRDD` so that it's easy to create a test case.
Author: zsxwing <zsxwing@gmail.com>
Closes #6826 from zsxwing/python-emptyRDD and squashes the following commits:
b36993f [zsxwing] Update the return type to JavaRDD[T]
71df047 [zsxwing] Add emptyRDD to pyspark and fix the issue when calling sum on an empty RDD
(cherry picked from commit 0fc4b96f3e)
Signed-off-by: Andrew Or <andrew@databricks.com>
The history server may show an incorrect App ID for an incomplete application like <App ID>.inprogress. This app info will never disappear even after the app is completed.
![incorrectappinfo](https://cloud.githubusercontent.com/assets/9278199/8156147/2a10fdbe-137d-11e5-9620-c5b61d93e3c1.png)
The cause of the issue is that a log path name is used as the app ID when the app ID cannot be obtained during replay.
Author: Carson Wang <carson.wang@intel.com>
Closes #6827 from carsonwang/SPARK-8372 and squashes the following commits:
cdbb089 [Carson Wang] Fix code style
3e46b35 [Carson Wang] Update code style
90f5dde [Carson Wang] Add a unit test
d8c9cd0 [Carson Wang] Replaying events only return information when app is started
(cherry picked from commit 2837e06709)
Signed-off-by: Andrew Or <andrew@databricks.com>
externalBlockStoreInitialized is never set to true, which means blocks stored in ExternalBlockStore cannot be removed.
Author: Mingfei <mingfei.shi@intel.com>
Closes #6702 from shimingfei/SetTrue and squashes the following commits:
add61d8 [Mingfei] Set externalBlockStoreInitialized to be true, after ExternalBlockStore is initialized
(cherry picked from commit 7ad8c5d869)
Signed-off-by: Andrew Or <andrew@databricks.com>
The problem occurs because the position mask `0xEFFFFFF` is incorrect: its 25th bit (bit 24) is zero, so when the capacity grows beyond 2^24, `OpenHashMap` calculates an incorrect index into the `_values` array.
I've also added a size check in `rehash()`, so that it fails instead of reporting invalid item indices.
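A small illustration of the mask problem; `0xEFFFFFF` comes from the description above, while the wider mask is just an example of one with bit 24 set, not necessarily the value used in the patch:
```scala
object PositionMaskDemo {
  val brokenMask = 0xEFFFFFF    // top nibble is 1110: bit 24 is zero
  val widerMask  = 0x1FFFFFFF   // bits 0-28 all set

  def main(args: Array[String]): Unit = {
    val pos = 1 << 24                 // first slot index that triggers the bug
    println(pos & brokenMask)         // 0        -> wrong slot once capacity exceeds 2^24
    println(pos & widerMask)          // 16777216 -> the intended slot
  }
}
```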
Author: Vyacheslav Baranov <slavik.baranov@gmail.com>
Closes #6763 from SlavikBaranov/SPARK-8309 and squashes the following commits:
8557445 [Vyacheslav Baranov] Resolved review comments
4d5b954 [Vyacheslav Baranov] Resolved review comments
eaf1e68 [Vyacheslav Baranov] Fixed failing test
f9284fd [Vyacheslav Baranov] Resolved review comments
3920656 [Vyacheslav Baranov] SPARK-8309: Support for more than 12M items in OpenHashMap
(cherry picked from commit c13da20a55)
Signed-off-by: Sean Owen <sowen@cloudera.com>
Safeguard against DOM rewriting.
Author: Andrew Or <andrew@databricks.com>
Closes #6787 from andrewor14/dag-viz-trim and squashes the following commits:
0fb4afe [Andrew Or] Trim input metadata from DOM
(cherry picked from commit 8860405151)
Signed-off-by: Andrew Or <andrew@databricks.com>
IBM Java has an extra method when we do getStackTrace(): "getStackTraceImpl", a native method. This causes two tests within "DStreamScopeSuite" to fail when running with IBM Java, because "getStackTrace" is returned as the method name instead of "map" or "filter". This commit addresses the issue by using dropWhile: given that our current method is withScope, we look for the next method that isn't ours, ignoring whatever methods come before us in the stack trace (e.g. getStackTrace, regardless of how many levels deep that goes).
IBM:
java.lang.Thread.getStackTraceImpl(Native Method)
java.lang.Thread.getStackTrace(Thread.java:1117)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:104)
Oracle:
PRINTING STACKTRACE!!!
java.lang.Thread.getStackTrace(Thread.java:1552)
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:106)
I've tested this with Oracle and IBM Java, no side effects for other tests introduced.
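A minimal sketch of the dropWhile idea (simplified, not the exact RDDOperationScope code):
```scala
// Skip every frame up to the first withScope frame, whatever JVM-specific frames
// (e.g. getStackTraceImpl) appear before it, then take the first frame that is not ours.
val ourMethodName = "withScope"
val stackTrace = Thread.currentThread.getStackTrace()
val callerMethodName = stackTrace
  .dropWhile(_.getMethodName != ourMethodName)  // drop frames before our own frames
  .find(_.getMethodName != ourMethodName)       // first non-withScope frame = real caller
  .map(_.getMethodName)
  .getOrElse("unknown")
```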
Author: Adam Roberts <aroberts@uk.ibm.com>
Author: a-roberts <aroberts@uk.ibm.com>
Closes #6740 from a-roberts/RDDScopeStackCrawlFix and squashes the following commits:
13ce390 [Adam Roberts] Ensure consistency with String equality checking
a4fc0e0 [a-roberts] Update RDDOperationScope.scala
(cherry picked from commit 19e30b48f3)
Signed-off-by: Andrew Or <andrew@databricks.com>
Read number of threads for RBackend from configuration.
[SPARK-8282] #comment Linking with JIRA
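For illustration, reading such a setting might look like the following; the config key and default shown here are assumptions, not necessarily the ones added by this patch:
```scala
import org.apache.spark.SparkConf

// Assumed key name and default; see the patch for the real ones.
val conf = new SparkConf()
val numRBackendThreads = conf.getInt("spark.r.numRBackendThreads", 2)
```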
Author: Hossein <hossein@databricks.com>
Closes #6730 from falaki/SPARK-8282 and squashes the following commits:
33b3d98 [Hossein] Documented new config parameter
70f2a9c [Hossein] Fixing import
ec44225 [Hossein] Read number of threads for RBackend from configuration
(cherry picked from commit 30ebf1a233)
Signed-off-by: Andrew Or <andrew@databricks.com>
Just as a safeguard against DOM rewriting.
Author: Andrew Or <andrew@databricks.com>
Closes #6732 from andrewor14/dag-viz-trim and squashes the following commits:
7e9bacb [Andrew Or] [MINOR] [UI] DAG visualization: trim whitespace from input
(cherry picked from commit 0d5892dc72)
Signed-off-by: Andrew Or <andrew@databricks.com>
This was caused by this commit: f271347
This patch does not attempt to fix the root cause of why the `VisibleForTesting` annotation causes a NPE in the shell. We should find a way to fix that separately.
Author: Andrew Or <andrew@databricks.com>
Closes #6711 from andrewor14/fix-spark-shell and squashes the following commits:
bf62ecc [Andrew Or] Prevent NPE in spark-shell
Even with all the efforts to cleanup the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.
After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.
Also make a slight change to a unit test to make it not pollute the
source directory with test data.
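As a rough sketch (assumed, not the exact change in this patch), an sbt fragment that forks tests and points java.io.tmpdir at the build output could look like:
```scala
// build.sbt fragment: the directory should be created before tests run.
fork in Test := true

javaOptions in Test += s"-Djava.io.tmpdir=${(baseDirectory.value / "target" / "tmp").getAbsolutePath}"
```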
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes #6674 from vanzin/SPARK-8126 and squashes the following commits:
0f8ad41 [Marcelo Vanzin] Make sure tmp dir exists when tests run.
643e916 [Marcelo Vanzin] [MINOR] [BUILD] Use custom temp directory during build.
…moved if dynamic allocation is enabled.
This is a work in progress. This patch ensures that an executor that has cached RDD blocks is not removed,
but makes no attempt to find another executor to remove. This is meant to get some feedback on the current
approach, and if it makes sense then I will look at choosing another executor to remove. No testing has been done either.
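For illustration only, the kind of configuration this change is concerned with might look like the following; the timeout key name is an assumption based on the description, not taken from the patch:
```scala
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .set("spark.dynamicAllocation.enabled", "true")
  // Assumed key: how long an idle executor holding cached blocks is kept around.
  .set("spark.dynamicAllocation.cachedExecutorIdleTimeout", "600s")
```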
Author: Hari Shreedharan <hshreedharan@apache.org>
Closes #6508 from harishreedharan/dymanic-caching and squashes the following commits:
dddf1eb [Hari Shreedharan] Minor configuration description update.
10130e2 [Hari Shreedharan] Fix compile issue.
5417b53 [Hari Shreedharan] Add documentation for new config. Remove block from cachedBlocks when it is dropped.
875916a [Hari Shreedharan] Make some code more readable.
39940ca [Hari Shreedharan] Handle the case where the executor has not yet registered.
90ad711 [Hari Shreedharan] Remove unused imports and unused methods.
063985c [Hari Shreedharan] Send correct message instead of recursively calling same method.
ec2fd7e [Hari Shreedharan] Add file missed in last commit
5d10fad [Hari Shreedharan] Update cached blocks status using local info, rather than doing an RPC.
193af4c [Hari Shreedharan] WIP. Use local state rather than via RPC.
ae932ff [Hari Shreedharan] Fix config param name.
272969d [Hari Shreedharan] Fix seconds to millis bug.
5a1993f [Hari Shreedharan] Add timeout for cache executors. Ignore broadcast blocks while checking if there are cached blocks.
57fefc2 [Hari Shreedharan] [SPARK-7955][Core] Ensure executors with cached RDD blocks are not removed if dynamic allocation is enabled.
(cherry picked from commit 3285a51121)
Signed-off-by: Andrew Or <andrew@databricks.com>
Even with all the efforts to cleanup the temp directories created by
unit tests, Spark leaves a lot of garbage in /tmp after a test run.
This change overrides java.io.tmpdir to place those files under the
build directory instead.
After an sbt full unit test run, I was left with > 400 MB of temp
files. Since they're now under the build dir, it's much easier to
clean them up.
Also make a slight change to a unit test to make it not pollute the
source directory with test data.
Author: Marcelo Vanzin <vanzin@cloudera.com>
Closes #6653 from vanzin/unit-test-tmp and squashes the following commits:
31e2dd5 [Marcelo Vanzin] Fix tests that depend on each other.
aa92944 [Marcelo Vanzin] [minor] [build] Use custom temp directory during build.
(cherry picked from commit b16b5434ff)
Signed-off-by: Sean Owen <sowen@cloudera.com>
Completely trivial but I noticed this wrinkle in a log message today; `$sender` doesn't refer to anything and isn't interpolated here.
Author: Sean Owen <sowen@cloudera.com>
Closes #6650 from srowen/Interpolation and squashes the following commits:
518687a [Sean Owen] Actually interpolate log string
7edb866 [Sean Owen] Trivial: remove unused interpolation var in log message
(cherry picked from commit 3a5c4da473)
Signed-off-by: Reynold Xin <rxin@databricks.com>
The log page should only show the desired number of bytes. Currently it shows bytes from the startIndex to the end of the file. The "Next" button on the page is always disabled.
Author: Carson Wang <carson.wang@intel.com>
Closes #6640 from carsonwang/logpage and squashes the following commits:
58cb3fd [Carson Wang] Show correct length of bytes on log page
(cherry picked from commit 63bc0c4430)
Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>
If maxTaskFailures is 1, the task set is aborted after 1 task failure. Other documentation and the code supports this reading, I think it's just this comment that was off. It's easy to make this mistake — can you please double-check if I'm correct? Thanks!
Author: Daniel Darabos <darabos.daniel@gmail.com>
Closes #6621 from darabos/patch-2 and squashes the following commits:
dfebdec [Daniel Darabos] Fix comment.
(cherry picked from commit 10ba188087)
Signed-off-by: Sean Owen <sowen@cloudera.com>
This includes the following commits:
original: 9eb222c
hotfix1: 8c99793
hotfix2: a4f2412
scalastyle check: 609c492
---
Original patch #6441
Branch-1.3 patch #6602
Author: Andrew Or <andrew@databricks.com>
Closes #6598 from andrewor14/demarcate-tests-1.4 and squashes the following commits:
4c3c566 [Andrew Or] Merge branch 'branch-1.4' of github.com:apache/spark into demarcate-tests-1.4
e217b78 [Andrew Or] [SPARK-7558] Guard against direct uses of FunSuite / FunSuiteLike
46d4361 [Andrew Or] Various whitespace changes (minor)
3d9bf04 [Andrew Or] Make all test suites extend SparkFunSuite instead of FunSuite
eaa520e [Andrew Or] Fix tests?
b4d93de [Andrew Or] Fix tests
634a777 [Andrew Or] Fix log message
a932e8d [Andrew Or] Fix manual things that cannot be covered through automation
8bc355d [Andrew Or] Add core tests as dependencies in all modules
75d361f [Andrew Or] Introduce base abstract class for all test suites
Author: Ryan Williams <ryan.blake.williams@gmail.com>
Closes #6624 from ryan-williams/execs and squashes the following commits:
b6f71d4 [Ryan Williams] don't attempt to lower number of executors by 0
(cherry picked from commit 51898b5158)
Signed-off-by: Andrew Or <andrew@databricks.com>
The flaky tests in ExternalShuffleServiceSuite and SparkListenerWithClusterSuite will fail if there are not enough executors up before running the jobs.
This PR adds `JobProgressListener.waitUntilExecutorsUp`. The tests for the cluster mode can use it to wait until the expected executors are up.
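A hedged sketch of how a cluster-mode test might use such a helper; the exact signature and visibility of `waitUntilExecutorsUp` are assumptions here:
```scala
import org.apache.spark.{SparkConf, SparkContext}

// Assumed usage from a test inside the org.apache.spark package; the signature is
// assumed to be (numExecutors, timeoutMillis).
val conf = new SparkConf().setMaster("local-cluster[2,1,512]").setAppName("cluster-test")
val sc = new SparkContext(conf)
sc.jobProgressListener.waitUntilExecutorsUp(2, 10000)
// ... it is now safe to run jobs that assume both executors are registered
sc.stop()
```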
Author: zsxwing <zsxwing@gmail.com>
Closes #6546 from zsxwing/SPARK-7989 and squashes the following commits:
5560e09 [zsxwing] Fix a typo
3b69840 [zsxwing] Fix flaky tests in ExternalShuffleServiceSuite and SparkListenerWithClusterSuite
(cherry picked from commit f27134782e)
Signed-off-by: Andrew Or <andrew@databricks.com>
Conflicts:
core/src/test/scala/org/apache/spark/broadcast/BroadcastSuite.scala
core/src/test/scala/org/apache/spark/scheduler/SparkListenerWithClusterSuite.scala
Some places forget to call `assert` to check the return value of `AsynchronousListenerBus.waitUntilEmpty`. Instead of adding `assert` in these places, I think it's better to make `AsynchronousListenerBus.waitUntilEmpty` throw `TimeoutException`.
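A simplified sketch of the new behavior (not the actual AsynchronousListenerBus code); `queueIsEmpty` stands in for the real check of the listener bus event queue:
```scala
import java.util.concurrent.TimeoutException

// Instead of returning false on timeout, throw so a forgotten assert cannot hide the failure.
def waitUntilEmpty(timeoutMillis: Long, queueIsEmpty: () => Boolean): Unit = {
  val deadline = System.currentTimeMillis + timeoutMillis
  while (!queueIsEmpty()) {
    if (System.currentTimeMillis > deadline) {
      throw new TimeoutException(s"The event queue is not empty after $timeoutMillis ms")
    }
    Thread.sleep(10)
  }
}
```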
Author: zsxwing <zsxwing@gmail.com>
Closes #6550 from zsxwing/SPARK-8001 and squashes the following commits:
607674a [zsxwing] Make AsynchronousListenerBus.waitUntilEmpty throw TimeoutException if timeout
(cherry picked from commit 1d8669f15c)
Signed-off-by: Andrew Or <andrew@databricks.com>
Author: Timothy Chen <tnachen@gmail.com>
Closes #6615 from tnachen/mesos_driver_path and squashes the following commits:
4f47b7c [Timothy Chen] Use the correct base path in mesos driver page.
(cherry picked from commit bfbf12b349)
Signed-off-by: Andrew Or <andrew@databricks.com>
This prevents the spark.jars from being cleared while using `--packages` or `--jars`
cc pwendell davies brkyvz
Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Closes #6568 from shivaram/SPARK-8028 and squashes the following commits:
3a9cf1f [Shivaram Venkataraman] Use addJar instead of setJars in SparkR This prevents the spark.jars from being cleared
(cherry picked from commit 6b44278ef7)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Author: Sun Rui <rui.sun@intel.com>
Closes #6183 from sun-rui/SPARK-7227 and squashes the following commits:
dd6f5b3 [Sun Rui] Rename readEnv() back to readMap(). Add alias na.omit() for dropna().
41cf725 [Sun Rui] [SPARK-7227][SPARKR] Support fillna / dropna in R DataFrame.
(cherry picked from commit 46576ab303)
Signed-off-by: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
Author: Reynold Xin <rxin@databricks.com>
Closes #6533 from rxin/whitespace-2 and squashes the following commits:
038314c [Reynold Xin] [SPARK-3850] Trim trailing spaces for core.
(cherry picked from commit 74fdc97c72)
Signed-off-by: Reynold Xin <rxin@databricks.com>
Conflicts:
core/src/main/scala/org/apache/spark/storage/TachyonBlockManager.scala
core/src/test/scala/org/apache/spark/serializer/KryoSerializerSuite.scala
Only parse the master URL as a standalone master URL when it starts with spark://.
Author: Timothy Chen <tnachen@gmail.com>
Closes #6517 from tnachen/fix_mesos_client and squashes the following commits:
61a1198 [Timothy Chen] Fix master url parsing in rest submission client.
(cherry picked from commit 78657d53d7)
Signed-off-by: Andrew Or <andrew@databricks.com>
cc JoshRosen
Thanks for noticing this!
Author: Burak Yavuz <brkyvz@gmail.com>
Closes #6509 from brkyvz/sample-perf-reg and squashes the following commits:
497465d [Burak Yavuz] addressed code review
293f95f [Burak Yavuz] [SPARK-7957] Preserve partitioning when using randomSplit
(cherry picked from commit 7ed06c3992)
Signed-off-by: Reynold Xin <rxin@databricks.com>
Add alias names for supported cipher suites to the sample SSL configuration.
The IBM JSSE provider reports its cipher suite with an SSL_ prefix, but accepts TLS_ prefixed suite names as an alias. However, Jetty filters the requested ciphers based on the provider's reported supported suites, so the TLS_ versions are never passed through to JSSE causing an SSL handshake failure.
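For illustration, listing both aliases of the same suite might look like the following; the config key and suite names here are examples, not the exact sample config:
```scala
import org.apache.spark.SparkConf

// Listing both the TLS_ and SSL_ prefixed names of the same suite means that whichever
// alias the JSSE provider (Oracle or IBM) reports, a common cipher survives Jetty's filter.
val conf = new SparkConf()
  .set("spark.ssl.enabledAlgorithms",
    "TLS_RSA_WITH_AES_128_CBC_SHA,SSL_RSA_WITH_AES_128_CBC_SHA")
```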
Author: Tim Ellison <t.p.ellison@gmail.com>
Closes #6282 from tellison/SSLFailure and squashes the following commits:
8de8a3e [Tim Ellison] Update SecurityManagerSuite with new expected suite names
96158b2 [Tim Ellison] Update the sample configs to use ciphers that are common to both the Oracle and IBM security providers.
705421b [Tim Ellison] Merge branch 'master' of github.com:tellison/spark into SSLFailure
68b9425 [Tim Ellison] Merge branch 'master' of https://github.com/apache/spark into SSLFailure
b0c35f6 [Tim Ellison] [CORE] Add aliases used for cipher suites in IBM provider
(cherry picked from commit bf46580708)
Signed-off-by: Sean Owen <sowen@cloudera.com>
The shutdown hook for temp directories had priority 100 while SparkContext's was 50, so the local root directory was deleted before SparkContext was shut down. This leads to scary errors in running jobs at the time of shutdown. It is especially a problem when running streaming examples, where Ctrl-C is the only way to shut down.
The fix in this PR is to make the temp directory shutdown priority lower than SparkContext's, so that the temp dirs are the last thing to get deleted, after the SparkContext has been shut down. Also, the DiskBlockManager shutdown priority is changed from the default 100 to temp_dir_prio + 1, so that it gets invoked just before all temp dirs are cleared.
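A sketch of the intended ordering (higher-priority hooks run first); the constant values shown are assumptions based on the description, not copied from the patch:
```scala
// Higher priority runs first, so anything below SparkContext's priority runs after it.
val SPARK_CONTEXT_SHUTDOWN_PRIORITY = 50
val TEMP_DIR_SHUTDOWN_PRIORITY = 25                                        // temp dirs are deleted last
val DISK_BLOCK_MANAGER_SHUTDOWN_PRIORITY = TEMP_DIR_SHUTDOWN_PRIORITY + 1  // just before temp dirs are cleared
```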
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes #6482 from tdas/SPARK-7930 and squashes the following commits:
d7cbeb5 [Tathagata Das] Removed unnecessary line
1514d0b [Tathagata Das] Fixed shutdown hook priorities
(cherry picked from commit cd3d9a5c0c)
Signed-off-by: Patrick Wendell <patrick@databricks.com>
The existing code rounds down to the nearest percent when computing the proportion
of a task's time that was spent on each phase of execution, and then computes
the scheduler delay proportion as 100 - sum(all other proportions). As a result,
a few extra percent can end up in the scheduler delay. This commit eliminates
the rounding so that the time visualizations correspond properly to the real times.
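A tiny illustration of the rounding problem with made-up numbers:
```scala
// Each phase's proportion is rounded down, and all of the lost fractions
// show up as "scheduler delay".
val totalMs = 1000.0
val phaseMs = Seq(333.0, 333.0, 333.0)                       // real scheduler delay is only 1 ms
val roundedPct = phaseMs.map(t => (t / totalMs * 100).toInt) // Seq(33, 33, 33)
val schedulerDelayPct = 100 - roundedPct.sum                 // 1, i.e. 10x the real 0.1%
```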
sarutak If you could take a look at this, that would be great! Not sure if there's a good
reason to round here that I missed.
cc shivaram
Author: Kay Ousterhout <kayousterhout@gmail.com>
Closes #6484 from kayousterhout/SPARK-7932 and squashes the following commits:
1723cc4 [Kay Ousterhout] [SPARK-7932] Fix misleading scheduler delay visualization
(cherry picked from commit 04ddcd4db7)
Signed-off-by: Kay Ousterhout <kayousterhout@gmail.com>
So we can enable a whitespace enforcement rule in the style checker to save code review time.
Author: Reynold Xin <rxin@databricks.com>
Closes #6473 from rxin/whitespace-core and squashes the following commits:
058195d [Reynold Xin] Fixed tests.
fce11e9 [Reynold Xin] [SPARK-7927] whitespace fixes for core.
(cherry picked from commit 7f7505d8db)
Signed-off-by: Reynold Xin <rxin@databricks.com>
See comments on https://github.com/apache/spark/pull/3913
Author: Reynold Xin <rxin@databricks.com>
Closes #6471 from rxin/sizeestimator and squashes the following commits:
c057095 [Reynold Xin] Fixed import.
2da478b [Reynold Xin] Remove SizeEstimator from o.a.spark package.
(cherry picked from commit 0077af22ca)
Signed-off-by: Reynold Xin <rxin@databricks.com>
Author: Sandy Ryza <sandy@cloudera.com>
Closes #6440 from sryza/sandy-spark-7896 and squashes the following commits:
49d8a0d [Sandy Ryza] Fix bug introduced when reading over record boundaries
6006856 [Sandy Ryza] Fix overflow issues
006b4b2 [Sandy Ryza] Fix scalastyle by removing non ascii characters
8b000ca [Sandy Ryza] Add ascii art to describe layout of data in metaBuffer
f2053c0 [Sandy Ryza] Fix negative overflow issue
0368c78 [Sandy Ryza] Initialize size as 0
a5a4820 [Sandy Ryza] Use explicit types for all numbers in ChainedBuffer
b7e0213 [Sandy Ryza] SPARK-7896. Allow ChainedBuffer to store more than 2 GB
(cherry picked from commit bd11b01eba)
Signed-off-by: Patrick Wendell <patrick@databricks.com>
This is a somewhat obscure bug, but I think that it will seriously impact KryoSerializer users who use custom registrators which disabled auto-reset. When auto-reset is disabled, then this breaks things in some of our shuffle paths which actually end up creating multiple OutputStreams from the same shared SerializerInstance (which is unsafe).
This was introduced by a patch (SPARK-3386) which enables serializer re-use in some of the shuffle paths, since constructing new serializer instances is actually pretty costly for KryoSerializer. We had already fixed another corner-case (SPARK-7766) bug related to this, but missed this one.
I think that the root problem here is that KryoSerializerInstance can be used in a way which is unsafe even within a single thread, e.g. by creating multiple open OutputStreams from the same instance or by interleaving deserialize and deserializeStream calls. I considered a smaller patch which adds assertions to guard against this type of "misuse" but abandoned that approach after I realized how convoluted the Scaladoc became.
This patch fixes this bug by making it legal to create multiple streams from the same KryoSerializerInstance. Internally, KryoSerializerInstance now implements a `borrowKryo()` / `releaseKryo()` API that's backed by a "pool" of capacity 1. Each call to a KryoSerializerInstance method will borrow the Kryo, do its work, then release the serializer instance back to the pool. If the pool is empty and we need an instance, it will allocate a new Kryo on-demand. This makes it safe for multiple OutputStreams to be opened from the same serializer. If we try to release a Kryo back to the pool but the pool already contains a Kryo, then we'll just discard the new Kryo. I don't think there's a clear benefit to having a larger pool since our usages tend to fall into two cases, a) where we only create a single OutputStream and b) where we create a huge number of OutputStreams with the same lifecycle, then destroy the KryoSerializerInstance (this is what's happening in the bypassMergeSort code path that my regression test hits).
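A simplified sketch of the borrow/release pattern described above (a "pool" of capacity 1); this is not the actual KryoSerializerInstance code:
```scala
import com.esotericsoftware.kryo.Kryo

class KryoPool(newKryo: () => Kryo) {
  private[this] var cachedKryo: Kryo = null

  def borrowKryo(): Kryo = {
    if (cachedKryo != null) {
      val kryo = cachedKryo
      cachedKryo = null          // the pooled instance is now handed out
      kryo
    } else {
      newKryo()                  // pool is empty: allocate a fresh Kryo on demand
    }
  }

  def releaseKryo(kryo: Kryo): Unit = {
    if (cachedKryo == null) {
      cachedKryo = kryo          // put the instance back into the pool
    }
    // otherwise the pool already holds an instance; the extra one is simply discarded
  }
}

// Typical usage: every serializer method borrows, does its work, then releases.
def withKryo[T](pool: KryoPool)(f: Kryo => T): T = {
  val kryo = pool.borrowKryo()
  try f(kryo) finally pool.releaseKryo(kryo)
}
```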
Author: Josh Rosen <joshrosen@databricks.com>
Closes #6415 from JoshRosen/SPARK-7873 and squashes the following commits:
00b402e [Josh Rosen] Initialize eagerly to fix a failing test
ba55d20 [Josh Rosen] Add explanatory comments
3f1da96 [Josh Rosen] Guard against duplicate close()
ab457ca [Josh Rosen] Sketch a loan/release based solution.
9816e8f [Josh Rosen] Add a failing test showing how deserialize() and deserializeStream() can interfere.
7350886 [Josh Rosen] Add failing regression test for SPARK-7873
(cherry picked from commit 852f4de2d3)
Signed-off-by: Patrick Wendell <patrick@databricks.com>
This issue is related to #6419.
AllJobPage doesn't currently have a "kill link", but I think we should fix the issue mentioned in #6419 anyway, just in case, to avoid accidents in the future.
So it's a minor issue for now and I won't file it in JIRA.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes #6432 from sarutak/remove-ambiguity-of-link and squashes the following commits:
cd1a503 [Kousuke Saruta] Fixed ambiguity link issue in AllJobPage
(cherry picked from commit 0db76c90ad)
Signed-off-by: Andrew Or <andrew@databricks.com>
Follow-up for #6377.
Change the time to the equivalent in GMT.
/cc squito
Author: scwf <wangfei1@huawei.com>
Closes #6425 from scwf/fix-HistoryServerSuite and squashes the following commits:
4d37935 [scwf] fix HistoryServerSuite
(cherry picked from commit 4615081d7a)
Signed-off-by: Imran Rashid <irashid@cloudera.com>