[SPARK-31379][CORE][TEST] Fix flaky o.a.s.scheduler.CoarseGrainedSchedulerBackendSuite.extra resources from executor
### What changes were proposed in this pull request?

This PR (SPARK-31379) adds one line, `when(ts.resourceOffers(any[IndexedSeq[WorkerOffer]])).thenReturn(Seq.empty)`, so that the mocked scheduler does not allocate any resources.

### Why are the changes needed?

The test is flaky. Here is part of the error stack:

```
sbt.ForkMain$ForkError: org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 325 times over 5.01070979 seconds. Last failure message: ArrayBuffer("1", "3") did not equal Array("0", "1", "3").
...
org.apache.spark.scheduler.CoarseGrainedSchedulerBackendSuite.eventually(CoarseGrainedSchedulerBackendSuite.scala:45)
```

You can check [here](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/120786/testReport/org.apache.spark.scheduler/CoarseGrainedSchedulerBackendSuite/extra_resources_from_executor/) for details.

The test is flaky because, after `StatusUpdate` is sent to `CoarseGrainedSchedulerBackend`, the backend calls `makeOffer` immediately after releasing the resources. So it is possible that addresses in `availableAddrs` have been allocated again before we assert `execResources(GPU).availableAddrs.sorted === Array("0", "1", "3")`.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

The issue can be reproduced reliably by inserting `Thread.sleep(3000)` after the line that sends `StatusUpdate`. After applying this fix, the issue is gone.

Closes #28145 from Ngone51/fix_flaky.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
parent a3d83948b8
commit a2789c2a51
@@ -258,6 +258,9 @@ class CoarseGrainedSchedulerBackendSuite extends SparkFunSuite with LocalSparkCo
       assert(execResources(GPU).assignedAddrs === Array("0"))
     }
 
+    // To avoid allocating any resources immediately after releasing the resource from the task to
+    // make sure that `availableAddrs` below won't change
+    when(ts.resourceOffers(any[IndexedSeq[WorkerOffer]])).thenReturn(Seq.empty)
     backend.driverEndpoint.send(
       StatusUpdate("1", 1, TaskState.FINISHED, buffer, taskResources))
 
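The race described in the commit message can be illustrated with a small dependency-free sketch. It is not Spark code: `Scheduler`, `GreedyScheduler`, `NoOpScheduler`, `Backend`, `launch`, and `statusUpdateFinished` are hypothetical names invented for this illustration, and the model is synchronous, so the re-allocation always happens here instead of only sometimes. `NoOpScheduler` plays the role of the `thenReturn(Seq.empty)` stub from the diff.

```scala
import scala.collection.mutable

// Simplified, hypothetical model of the behaviour described in the commit message.
// None of these types are Spark classes.
object FlakyOfferSketch {
  // Stand-in for the task scheduler: given the free addresses, returns those it wants to use.
  trait Scheduler {
    def resourceOffers(available: Seq[String]): Seq[String]
  }

  // Takes the lowest free address whenever it is offered anything.
  object GreedyScheduler extends Scheduler {
    def resourceOffers(available: Seq[String]): Seq[String] = available.sorted.take(1)
  }

  // Counterpart of stubbing the mock with `thenReturn(Seq.empty)`: never takes anything.
  object NoOpScheduler extends Scheduler {
    def resourceOffers(available: Seq[String]): Seq[String] = Seq.empty
  }

  class Backend(scheduler: Scheduler, initiallyFree: Seq[String]) {
    private val available = mutable.ArrayBuffer(initiallyFree: _*)
    private val assigned = mutable.ArrayBuffer.empty[String]

    def availableAddrs: Seq[String] = available.sorted.toSeq
    def assignedAddrs: Seq[String] = assigned.sorted.toSeq

    // Marks one address as in use by a running task.
    def launch(addr: String): Unit = {
      available -= addr
      assigned += addr
    }

    // A finished task releases its addresses, then a new offer is made right away.
    def statusUpdateFinished(addrs: Seq[String]): Unit = {
      assigned --= addrs
      available ++= addrs
      val taken = scheduler.resourceOffers(availableAddrs)
      available --= taken
      assigned ++= taken
    }
  }

  def main(args: Array[String]): Unit = {
    val racy = new Backend(GreedyScheduler, Seq("0", "1", "3"))
    racy.launch("0")
    racy.statusUpdateFinished(Seq("0"))
    println(racy.availableAddrs) // "0" has been handed out again

    val stable = new Backend(NoOpScheduler, Seq("0", "1", "3"))
    stable.launch("0")
    stable.statusUpdateFinished(Seq("0"))
    println(stable.availableAddrs) // all three addresses stay free
  }
}
```

With the greedy scheduler, the released address is handed out again before the caller can inspect `availableAddrs`, which matches the `ArrayBuffer("1", "3")` seen in the failure message. With the no-op scheduler, the released address stays available, which is the state the test asserts.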