…rget versions.
I basically copied the compatibility rules from the top-level pom.xml into here. Someone more familiar with all the options in the top-level pom may want to check that nothing else needs to be copied down.
This allows me to build with jdk8 and run with lower versions. The source shows as compiled for jdk6, as it's supposed to.
Author: Tom Graves <tgraves@yahoo-inc.com>
Author: Thomas Graves <tgraves@staydecay.corp.gq1.yahoo.com>
Closes #6989 from tgravescs/SPARK-8574 and squashes the following commits:
e1ea2d4 [Thomas Graves] Change to use combine.children="append"
150d645 [Tom Graves] [SPARK-8574] org/apache/spark/unsafe doesn't honor the java source/target versions
(cherry picked from commit e988adb58f)
Signed-off-by: Tom Graves <tgraves@yahoo-inc.com>
JIRA: https://issues.apache.org/jira/browse/SPARK-7800
`isDefined` is set to true twice in `Location.putNewKey`. The first assignment is unnecessary and causes problems because it happens too early, before some of the assert checks. For example, if an attempt with an incorrect `keyLengthBytes` sets `isDefined` to true and then fails an assertion, the location cannot be used later.
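To illustrate the ordering issue, here is a minimal sketch of the "validate first, mark defined last" pattern. `LocationSketch` is a hypothetical, simplified stand-in and not the real `BytesToBytesMap.Location`; the word-alignment check on `keyLengthBytes` is an assumed example of a validation rule.

```java
// Hypothetical, simplified stand-in for a map location; not Spark's actual class.
final class LocationSketch {
    private boolean isDefined = false;
    private int keyLength = -1;

    boolean isDefined() {
        return isDefined;
    }

    int keyLength() {
        return keyLength;
    }

    /** Stores a new key, marking the location as defined only after every check passes. */
    void putNewKey(int keyLengthBytes) {
        if (isDefined) {
            throw new IllegalStateException("Can only set a value once for a key");
        }
        // Validate before touching any state.
        // Assumed rule for this sketch: lengths must be non-negative and word-aligned.
        if (keyLengthBytes < 0 || keyLengthBytes % 8 != 0) {
            throw new IllegalArgumentException("Invalid key length: " + keyLengthBytes);
        }
        this.keyLength = keyLengthBytes;
        // Set the flag last, so a failed attempt leaves the location reusable.
        this.isDefined = true;
    }
}
```

With this ordering, a call with a bad length throws without flipping the flag, so a later call with a valid length on the same location still succeeds.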
ping JoshRosen
Author: Liang-Chi Hsieh <viirya@gmail.com>
Closes #6324 from viirya/dup_isdefined and squashes the following commits:
cbfe03b [Liang-Chi Hsieh] isDefined should not marked too early in putNewKey.
(cherry picked from commit 5a3c04bb92)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
This patch modifies `BytesToBytesMap.iterator()` to iterate through records in the order that they appear in the data pages rather than iterating through the hashtable pointer arrays. This results in fewer random memory accesses, significantly improving performance for scan-and-copy operations.
This is possible because our data pages are laid out as sequences of `[keyLength][data][valueLength][data]` entries. In order to mark the end of a partially-filled data page, we write `-1` as a special end-of-page length (BytesToBytesMap supports empty/zero-length keys and values, which is why we had to use a negative length).
This patch incorporates / closes #5836.
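A rough sketch of such a sequential page scan is shown below. It is illustrative only: it uses a heap `ByteBuffer` with 4-byte `int` length fields as a simplifying assumption, and `PageScanSketch`, `putRecord`, and `scan` are hypothetical names rather than anything in the patch.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of scanning a page of [keyLength][key][valueLength][value] records.
final class PageScanSketch {
    /** Special length written after the last record of a partially-filled page. */
    static final int END_OF_PAGE = -1;

    /** Appends one record at the buffer's current position. */
    static void putRecord(ByteBuffer page, byte[] key, byte[] value) {
        page.putInt(key.length).put(key).putInt(value.length).put(value);
    }

    /** Reads records front to back; each list element is {key, value}. */
    static List<byte[][]> scan(ByteBuffer page) {
        List<byte[][]> records = new ArrayList<>();
        // Work on a duplicate rewound to the start so the caller's position is untouched.
        ByteBuffer in = page.duplicate();
        in.clear();
        while (in.remaining() >= Integer.BYTES) {
            int keyLength = in.getInt();
            if (keyLength == END_OF_PAGE) {
                break; // the rest of the page is unused
            }
            byte[] key = new byte[keyLength];
            in.get(key);
            byte[] value = new byte[in.getInt()];
            in.get(value);
            records.add(new byte[][] {key, value});
        }
        return records;
    }
}
```

Because a zero length is a valid record here, an unwritten (zeroed) tail of the page would otherwise be misread as a run of empty records, which is why the marker has to be negative.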
Author: Josh Rosen <joshrosen@databricks.com>
Closes #6159 from JoshRosen/SPARK-7251 and squashes the following commits:
05bd90a [Josh Rosen] Compare capacity, not size, to MAX_CAPACITY
2a20d71 [Josh Rosen] Fix maximum BytesToBytesMap capacity
bc4854b [Josh Rosen] Guard against overflow when growing BytesToBytesMap
f5feadf [Josh Rosen] Add test for iterating over an empty map
273b842 [Josh Rosen] [SPARK-7251] Perform sequential scan when iterating over entries in BytesToBytesMap
(cherry picked from commit f2faa7af30)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>
When on-heap memory allocation is used, ExecutorMemoryManager should maintain a cache / pool of buffers for re-use by tasks. This will significantly improve the performance of the new Tungsten sort-shuffle for jobs with many short-lived tasks by eliminating a major source of GC.
This pull request is a minimum-viable-implementation of this idea. In its current form, this patch significantly improves performance on a stress test which launches huge numbers of short-lived shuffle map tasks back-to-back in the same JVM.
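The pooling idea can be sketched roughly as follows. This is not the code in this patch: `BufferPoolSketch` and its methods are hypothetical, and keying the pool by exact array length and holding buffers through `WeakReference` are assumptions made for the illustration.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.Map;

// Hypothetical sketch of a size-keyed pool of on-heap buffers.
final class BufferPoolSketch {
    // Freed buffers, grouped by length. Weak references let the GC reclaim
    // pooled buffers, so the pool itself never pins memory.
    private final Map<Integer, LinkedList<WeakReference<long[]>>> poolsBySize =
        new HashMap<Integer, LinkedList<WeakReference<long[]>>>();

    /** Returns a pooled buffer of exactly numWords longs if one survives, else a new one. */
    synchronized long[] allocate(int numWords) {
        LinkedList<WeakReference<long[]>> pool = poolsBySize.get(numWords);
        if (pool != null) {
            while (!pool.isEmpty()) {
                long[] buffer = pool.pop().get();
                if (buffer != null) {
                    return buffer; // note: contents are not zeroed on reuse
                }
            }
            poolsBySize.remove(numWords); // every reference had been cleared
        }
        return new long[numWords];
    }

    /** Hands a buffer back to the pool for reuse by a later allocation. */
    synchronized void free(long[] buffer) {
        LinkedList<WeakReference<long[]>> pool = poolsBySize.get(buffer.length);
        if (pool == null) {
            pool = new LinkedList<WeakReference<long[]>>();
            poolsBySize.put(buffer.length, pool);
        }
        pool.add(new WeakReference<long[]>(buffer));
    }
}
```

In this sketch a task that frees a buffer and a following task that requests the same size share one allocation, which is the case that matters when many short-lived tasks run back-to-back in the same JVM.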
Author: Josh Rosen <joshrosen@databricks.com>
Closes #6227 from JoshRosen/SPARK-7698 and squashes the following commits:
fd6cb55 [Josh Rosen] SoftReference -> WeakReference
b154e86 [Josh Rosen] WIP sketch of pooling in ExecutorMemoryManager
(cherry picked from commit 7956dd7ab0)
Signed-off-by: Josh Rosen <joshrosen@databricks.com>