[SPARK-30274][CORE] Avoid BytesToBytesMap lookup hang forever when holding keys reaching max capacity

### What changes were proposed in this pull request?

We should not append keys to BytesToBytesMap to be its max capacity.

### Why are the changes needed?

BytesToBytesMap.append allows to append keys until the number of keys reaches MAX_CAPACITY. But once the the pointer array in the map holds MAX_CAPACITY keys, next time call of lookup will hang forever.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

Manually test by:
```java
Test
  public void testCapacity() {
    TestMemoryManager memoryManager2 =
            new TestMemoryManager(
                    new SparkConf()
                            .set(package$.MODULE$.MEMORY_OFFHEAP_ENABLED(), true)
                            .set(package$.MODULE$.MEMORY_OFFHEAP_SIZE(), 25600 * 1024 * 1024L)
                            .set(package$.MODULE$.SHUFFLE_SPILL_COMPRESS(), false)
                            .set(package$.MODULE$.SHUFFLE_COMPRESS(), false));
    TaskMemoryManager taskMemoryManager2 = new TaskMemoryManager(memoryManager2, 0);
    final long pageSizeBytes = 8000000 + 8; // 8 bytes for end-of-page marker
    final BytesToBytesMap map = new BytesToBytesMap(taskMemoryManager2, 1024, pageSizeBytes);

    try {
      for (long i = 0; i < BytesToBytesMap.MAX_CAPACITY + 1; i++) {
        final long[] value = new long[]{i};
        boolean succeed = map.lookup(value, Platform.LONG_ARRAY_OFFSET, 8).append(
                value,
                Platform.LONG_ARRAY_OFFSET,
                8,
                value,
                Platform.LONG_ARRAY_OFFSET,
                8);
      }
      map.free();
    } finally {
      map.free();
    }
  }
```

Once the map was appended to 536870912 keys (MAX_CAPACITY), the next lookup will hang.

Closes #26914 from viirya/fix-bytemap2.

Authored-by: Liang-Chi Hsieh <liangchi@uber.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
This commit is contained in:
Liang-Chi Hsieh 2019-12-17 11:37:05 -08:00 committed by Dongjoon Hyun
parent cdc8fc6233
commit b2baaa2fcc

View file

@ -694,7 +694,10 @@ public final class BytesToBytesMap extends MemoryConsumer {
assert (vlen % 8 == 0);
assert (longArray != null);
if (numKeys == MAX_CAPACITY
// We should not increase number of keys to be MAX_CAPACITY. The usage pattern of this map is
// lookup + append. If we append key until the number of keys to be MAX_CAPACITY, next time
// the call of lookup will hang forever because it cannot find an empty slot.
if (numKeys == MAX_CAPACITY - 1
// The map could be reused from last spill (because of no enough memory to grow),
// then we don't try to grow again if hit the `growthThreshold`.
|| !canGrowArray && numKeys >= growthThreshold) {