spark-instrumented-optimizer/sql/hive-thriftserver
Dongjoon Hyun a2c6adcc5d
[SPARK-18857][SQL] Don't use Iterator.duplicate for incrementalCollect in Thrift Server
## What changes were proposed in this pull request?

To support `FETCH_FIRST`, SPARK-16563 used Scala `Iterator.duplicate`. However,
Scala `Iterator.duplicate` uses a **queue to buffer all items between both iterators**,
this causes GC and hangs for queries with large number of rows. We should not use this,
especially for `spark.sql.thriftServer.incrementalCollect`.

https://github.com/scala/scala/blob/2.12.x/src/library/scala/collection/Iterator.scala#L1262-L1300

## How was this patch tested?

Pass the existing tests.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #16440 from dongjoon-hyun/SPARK-18857.
2017-01-10 13:27:55 +00:00
..
if [MINOR][MLLIB][STREAMING][SQL] Fix typos 2016-05-25 10:53:57 -07:00
src [SPARK-18857][SQL] Don't use Iterator.duplicate for incrementalCollect in Thrift Server 2017-01-10 13:27:55 +00:00
pom.xml [SPARK-17807][CORE] split test-tags into test-JAR 2016-12-21 16:37:20 -08:00