[SPARK-27338][CORE] Fix deadlock in UnsafeExternalSorter.SpillableIterator when locking both UnsafeExternalSorter.SpillableIterator and TaskMemoryManager
## What changes were proposed in this pull request? In `UnsafeExternalSorter.SpillableIterator#loadNext()` takes lock on the `UnsafeExternalSorter` and calls `freePage` once the `lastPage` is consumed which needs to take a lock on `TaskMemoryManager`. At the same time, there can be another MemoryConsumer using `UnsafeExternalSorter` as part of sorting can try to `allocatePage` needs to get lock on `TaskMemoryManager` which can cause spill to happen which requires lock on `UnsafeExternalSorter` again causing deadlock. This is a classic deadlock situation happening similar to the SPARK-26265. To fix this, we can move the `freePage` call in `loadNext` outside of `Synchronized` block similar to the fix in SPARK-26265 ## How was this patch tested? Manual tests were being done and will also try to add a test. Closes #24265 from venkata91/deadlock-sorter. Authored-by: Venkata krishnan Sowrirajan <vsowrirajan@qubole.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com>
This commit is contained in:
parent
69dd44af19
commit
6c4552c650
|
@ -573,19 +573,31 @@ public final class UnsafeExternalSorter extends MemoryConsumer {
|
|||
|
||||
@Override
|
||||
public void loadNext() throws IOException {
|
||||
synchronized (this) {
|
||||
loaded = true;
|
||||
if (nextUpstream != null) {
|
||||
// Just consumed the last record from in memory iterator
|
||||
if (lastPage != null) {
|
||||
freePage(lastPage);
|
||||
lastPage = null;
|
||||
MemoryBlock pageToFree = null;
|
||||
try {
|
||||
synchronized (this) {
|
||||
loaded = true;
|
||||
if (nextUpstream != null) {
|
||||
// Just consumed the last record from in memory iterator
|
||||
if(lastPage != null) {
|
||||
// Do not free the page here, while we are locking `SpillableIterator`. The `freePage`
|
||||
// method locks the `TaskMemoryManager`, and it's a bad idea to lock 2 objects in
|
||||
// sequence. We may hit dead lock if another thread locks `TaskMemoryManager` and
|
||||
// `SpillableIterator` in sequence, which may happen in
|
||||
// `TaskMemoryManager.acquireExecutionMemory`.
|
||||
pageToFree = lastPage;
|
||||
lastPage = null;
|
||||
}
|
||||
upstream = nextUpstream;
|
||||
nextUpstream = null;
|
||||
}
|
||||
upstream = nextUpstream;
|
||||
nextUpstream = null;
|
||||
numRecords--;
|
||||
upstream.loadNext();
|
||||
}
|
||||
} finally {
|
||||
if (pageToFree != null) {
|
||||
freePage(pageToFree);
|
||||
}
|
||||
numRecords--;
|
||||
upstream.loadNext();
|
||||
}
|
||||
}
|
||||
|
||||
|
|
Loading…
Reference in a new issue