[SPARK-27338][CORE] Fix deadlock in UnsafeExternalSorter.SpillableIterator when locking both UnsafeExternalSorter.SpillableIterator and TaskMemoryManager

## What changes were proposed in this pull request?

In `UnsafeExternalSorter.SpillableIterator#loadNext()` takes lock on the `UnsafeExternalSorter` and calls `freePage` once the `lastPage` is consumed which needs to take a lock on `TaskMemoryManager`. At the same time, there can be another MemoryConsumer using `UnsafeExternalSorter` as part of sorting can try to `allocatePage` needs to get lock on `TaskMemoryManager` which can cause spill to happen which requires lock on `UnsafeExternalSorter` again causing deadlock. This is a classic deadlock situation happening similar to the SPARK-26265.

To fix this, we can move the `freePage` call in `loadNext` outside of `Synchronized` block similar to the fix in SPARK-26265

## How was this patch tested?

Manual tests were being done and will also try to add a test.

Closes #24265 from venkata91/deadlock-sorter.

Authored-by: Venkata krishnan Sowrirajan <vsowrirajan@qubole.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
This commit is contained in:
Venkata krishnan Sowrirajan 2019-04-04 09:58:05 +08:00 committed by Wenchen Fan
parent 69dd44af19
commit 6c4552c650

View file

@ -573,19 +573,31 @@ public final class UnsafeExternalSorter extends MemoryConsumer {
@Override
public void loadNext() throws IOException {
synchronized (this) {
loaded = true;
if (nextUpstream != null) {
// Just consumed the last record from in memory iterator
if (lastPage != null) {
freePage(lastPage);
lastPage = null;
MemoryBlock pageToFree = null;
try {
synchronized (this) {
loaded = true;
if (nextUpstream != null) {
// Just consumed the last record from in memory iterator
if(lastPage != null) {
// Do not free the page here, while we are locking `SpillableIterator`. The `freePage`
// method locks the `TaskMemoryManager`, and it's a bad idea to lock 2 objects in
// sequence. We may hit dead lock if another thread locks `TaskMemoryManager` and
// `SpillableIterator` in sequence, which may happen in
// `TaskMemoryManager.acquireExecutionMemory`.
pageToFree = lastPage;
lastPage = null;
}
upstream = nextUpstream;
nextUpstream = null;
}
upstream = nextUpstream;
nextUpstream = null;
numRecords--;
upstream.loadNext();
}
} finally {
if (pageToFree != null) {
freePage(pageToFree);
}
numRecords--;
upstream.loadNext();
}
}