97a1aa2c70
## What changes were proposed in this pull request?

In the current code, `UnboundedFollowingWindowFunctionFrame` is expensive to use because it iterates from the start to the lower bound every time the `write` method is called. When traversing the iterator, it is possible to skip some spilled files and thus save time.

## How was this patch tested?

Added a unit test.

Also ran a small benchmark: put 2000200 rows into `UnsafeExternalSorter`, giving 2 spill files (each containing 1000000 rows) and an `inMemSorter` containing 200 rows, then move the iterator forward to index=2000001.

*With this change*: `getIterator(2000001)` costs roughly 0ms~1ms.

*Without this change*: `for(int i=0; i<2000001; i++)getIterator().loadNext()` costs 300ms.

Author: jinxing <jinxing6042@126.com>

Closes #18541 from jinxing64/SPARK-21315.
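The idea of skipping whole spill files can be illustrated with a minimal sketch. This is not Spark's actual `UnsafeExternalSorter` code: the class `SpillSkipSketch`, the method `iteratorFrom`, and the use of in-memory lists to stand in for spill files are all assumptions made for illustration. The only property it relies on is that each segment knows its record count, so a segment that ends before the start index can be passed over without reading its records.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch; segments stand in for spill files plus the in-memory sorter. */
final class SpillSkipSketch {

    /**
     * Returns an iterator over the concatenation of all segments, positioned
     * at startIndex. Segments that end before startIndex are skipped using
     * only their record counts, without visiting their records.
     */
    static Iterator<Integer> iteratorFrom(List<List<Integer>> segments, int startIndex) {
        if (startIndex < 0) {
            throw new IllegalArgumentException("startIndex must be non-negative");
        }
        int remaining = startIndex; // records still to skip
        List<Iterator<Integer>> parts = new ArrayList<>();
        for (List<Integer> segment : segments) {
            if (remaining >= segment.size()) {
                // Whole segment lies before startIndex: skip it by count alone.
                remaining -= segment.size();
            } else {
                // First relevant segment starts at the offset; later ones at 0.
                parts.add(segment.subList(remaining, segment.size()).iterator());
                remaining = 0;
            }
        }
        return chain(parts);
    }

    /** Concatenates the given iterators lazily. */
    private static Iterator<Integer> chain(List<Iterator<Integer>> parts) {
        final Iterator<Iterator<Integer>> outer = parts.iterator();
        return new Iterator<Integer>() {
            private Iterator<Integer> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                while (!current.hasNext() && outer.hasNext()) {
                    current = outer.next();
                }
                return current.hasNext();
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }
}
```

In the benchmark scenario above, a call shaped like `iteratorFrom(segments, 2000001)` would pass over both 1000000-row segments in two subtractions and only step into the last, small segment. A real spill reader would still have to read forward within the first relevant file; the list `subList` call here hides that cost.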
- src
- pom.xml