spark-instrumented-optimizer/common
Henrique Goulart d42cf4566a [SPARK-30246][CORE] OneForOneStreamManager might leak memory in connectionTerminated
### What changes were proposed in this pull request?

Ensure that all StreamStates are removed from the OneForOneStreamManager memory map even if an error occurs while releasing buffers.

### Why are the changes needed?

OneForOneStreamManager may not remove all StreamStates from its memory map when a connection is terminated. A RuntimeException might be thrown in StreamState$buffers.next() by one of ExternalShuffleBlockResolver$getBlockData... **breaking the loop through streams.entrySet() and keeping StreamStates in memory forever, which leaks memory.**
This may happen when an application is terminated abruptly and its executors are removed before the connection is terminated, or if shuffleIndexCache fails to get ShuffleIndexInformation.
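
The failure mode described above can be illustrated with a minimal, self-contained sketch. It is not the actual patch: `StreamCleanupSketch`, `State`, `terminateLeaky`, `terminateSafe`, and the helper methods are hypothetical names, and buffers are reduced to plain strings. The sketch assumes a `ConcurrentHashMap` of stream states and a buffer iterator whose `next()` throws, and contrasts a loop that aborts on the first exception with one that removes every entry before rethrowing.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a stream manager; not Spark's actual classes.
class StreamCleanupSketch {

  /** Simplified stream state: only an iterator over buffer names. */
  static final class State {
    final Iterator<String> buffers;
    State(Iterator<String> buffers) { this.buffers = buffers; }
  }

  final Map<Long, State> streams = new ConcurrentHashMap<>();

  /** Leaky variant: an exception from next() aborts the loop, so later entries stay in the map. */
  void terminateLeaky() {
    for (Map.Entry<Long, State> entry : streams.entrySet()) {
      State state = entry.getValue();
      streams.remove(entry.getKey());
      while (state.buffers.hasNext()) {
        state.buffers.next(); // may throw RuntimeException
      }
    }
  }

  /** Safer variant: remove every entry, remember the first failure, rethrow after the loop. */
  void terminateSafe() {
    RuntimeException firstFailure = null;
    for (Map.Entry<Long, State> entry : streams.entrySet()) {
      State state = entry.getValue();
      streams.remove(entry.getKey());
      try {
        while (state.buffers.hasNext()) {
          state.buffers.next();
        }
      } catch (RuntimeException e) {
        if (firstFailure == null) firstFailure = e;
      }
    }
    if (firstFailure != null) throw firstFailure;
  }

  /** Iterator that always fails, simulating block data that can no longer be resolved. */
  static Iterator<String> failingBuffers() {
    return new Iterator<String>() {
      @Override public boolean hasNext() { return true; }
      @Override public String next() { throw new RuntimeException("cannot resolve block data"); }
    };
  }

  /** Registers n failing streams, runs one cleanup variant, and returns how many entries remain. */
  static int remainingAfter(boolean safe, int n) {
    StreamCleanupSketch manager = new StreamCleanupSketch();
    for (long id = 0; id < n; id++) {
      manager.streams.put(id, new State(failingBuffers()));
    }
    try {
      if (safe) manager.terminateSafe(); else manager.terminateLeaky();
    } catch (RuntimeException expected) {
      // Both variants still surface the failure to the caller.
    }
    return manager.streams.size();
  }

  public static void main(String[] args) {
    System.out.println("leaky cleanup leaves " + remainingAfter(false, 3) + " entries");
    System.out.println("safe cleanup leaves " + remainingAfter(true, 3) + " entries");
  }
}
```

In this sketch every stream fails, so the leaky variant removes only the first entry it visits before the exception escapes, while the safe variant empties the map and then rethrows the first exception it saw.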

References:
ee050ddbc6/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java (L319)

ee050ddbc6/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java (L357)

ee050ddbc6/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java (L195)

ee050ddbc6/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java (L208)

ee050ddbc6/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalShuffleBlockResolver.java (L330)

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
Unit test added

Closes #27064 from hensg/SPARK-30246.

Lead-authored-by: Henrique Goulart <henriquedsg89@gmail.com>
Co-authored-by: Henrique Goulart <henrique.goulart@trivago.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
2020-01-15 13:27:15 -08:00
| Name | Last commit | Date |
| --- | --- | --- |
| kvstore | [SPARK-30272][SQL][CORE] Remove usage of Guava that breaks in 27; replace with workalikes | 2019-12-20 08:55:04 -06:00 |
| network-common | [SPARK-30246][CORE] OneForOneStreamManager might leak memory in connectionTerminated | 2020-01-15 13:27:15 -08:00 |
| network-shuffle | [SPARK-30290][CORE] Count for merged block when fetching continuous blocks in batch | 2019-12-25 18:57:02 +08:00 |
| network-yarn | [SPARK-30272][SQL][CORE] Remove usage of Guava that breaks in 27; replace with workalikes | 2019-12-20 08:55:04 -06:00 |
| sketch | [INFRA] Reverts commit 56dcd79 and c216ef1 | 2019-12-16 19:57:44 -07:00 |
| tags | [INFRA] Reverts commit 56dcd79 and c216ef1 | 2019-12-16 19:57:44 -07:00 |
| unsafe | [SPARK-30292][SQL] Throw Exception when invalid string is cast to numeric type in ANSI mode | 2020-01-14 17:03:10 +08:00 |