spark-instrumented-optimizer


Wenbo Zhao 3f4bda7289 [SPARK-24578][CORE] Cap sub-region's size of returned nio buffer

## What changes were proposed in this pull request?

This PR tries to fix the performance regression introduced by SPARK-21517.

In our production job, we perform many parallel computations, so it is quite likely that a task gets scheduled to host-2 and has to read cache block data from host-1. This large transfer often causes timeouts in the cluster (it retries 3 times, each with a 120s timeout, and then recomputes to put the cache block into the local MemoryStore).

The root cause is that we no longer call `consolidateIfNeeded`, since we are using
```
Unpooled.wrappedBuffer(chunks.length, getChunks(): _*)
```
in ChunkedByteBuffer. If there are many small chunks, `buf.nioBuffer(...)` can perform very badly when it has to call `copyByteBuf(...)` many times.

## How was this patch tested?

Existing unit tests, and also tested in production.

Author: Wenbo Zhao <wzhao@twosigma.com>

Closes #21593 from WenboZhao/spark-24578.

2018-06-20 14:26:04 -07:00
kvstore	[SPARK-23103][CORE] Ensure correct sort order for negative values in LevelDB.	2018-01-19 13:32:20 -06:00
network-common	[SPARK-24578][CORE] Cap sub-region's size of returned nio buffer	2018-06-20 14:26:04 -07:00
network-shuffle	[SPARK-24356][CORE] Duplicate strings in File.path managed by FileSegmentManagedBuffer	2018-06-02 23:07:39 -05:00
network-yarn	[SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT	2018-01-13 00:37:59 +08:00
sketch	[SPARK-23381][CORE] Murmur3 hash generates a different value from other implementations	2018-02-16 17:17:55 -08:00
tags	[SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT	2018-01-13 00:37:59 +08:00
unsafe	[SPARK-23976][CORE] Detect length overflow in UTF8String.concat()/ByteArray.concat()	2018-05-02 10:41:34 +02:00