spark-instrumented-optimizer/sql/hive-thriftserver/src
Dongjoon Hyun a2c6adcc5d
[SPARK-18857][SQL] Don't use Iterator.duplicate for incrementalCollect in Thrift Server
## What changes were proposed in this pull request?

To support `FETCH_FIRST`, SPARK-16563 used Scala `Iterator.duplicate`. However,
Scala `Iterator.duplicate` uses a **queue to buffer all items between both iterators**,
which causes GC pressure and hangs for queries with a large number of rows. We should not
use this, especially for `spark.sql.thriftServer.incrementalCollect`.

https://github.com/scala/scala/blob/2.12.x/src/library/scala/collection/Iterator.scala#L1262-L1300
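
The buffering behavior described above can be observed with a small standalone sketch. It is not the code from this patch: `CountingSource` and `RewindableRows` are hypothetical names introduced only for illustration. The first half shows that the trailing copy returned by `Iterator.duplicate` replays elements without asking the source again, so they must have been held in memory. The second half shows one possible alternative, assuming the result can be iterated again on demand: rewind by requesting a fresh iterator instead of buffering.

```scala
import scala.collection.mutable.ArrayBuffer

object DuplicateSketch {
  // Hypothetical source that counts how many elements it has produced.
  final class CountingSource(n: Int) {
    var produced = 0
    def iterator(): Iterator[Int] = Iterator.tabulate(n) { i => produced += 1; i }
  }

  // Hypothetical helper (not Spark's API): supports a FETCH_FIRST-style rewind
  // by asking the factory for a fresh iterator, so nothing is buffered here.
  final class RewindableRows[T](makeIterator: () => Iterator[T]) {
    private var iter: Iterator[T] = makeIterator()

    def fetchFirst(): Unit = {
      iter = makeIterator()
    }

    def fetchNext(maxRows: Int): List[T] = {
      val buf = ArrayBuffer.empty[T]
      while (buf.size < maxRows && iter.hasNext) buf += iter.next()
      buf.toList
    }
  }

  def main(args: Array[String]): Unit = {
    // 1) Iterator.duplicate: draining one copy queues every element for the other.
    val src = new CountingSource(5)
    val (ahead, behind) = src.iterator().duplicate
    ahead.foreach(_ => ())
    val replayed = behind.toList // served from the internal queue
    println(s"replayed=$replayed, produced=${src.produced}")

    // 2) Rewind by re-creating the iterator: the source is traversed again,
    //    but no rows are retained between fetches.
    val src2 = new CountingSource(5)
    val rows = new RewindableRows[Int](() => src2.iterator())
    println(rows.fetchNext(3))
    rows.fetchFirst()
    println(rows.fetchNext(3))
    println(s"produced=${src2.produced}")
  }
}
```

The trade-off in this sketch is that a rewind repeats the work of producing the rows, in exchange for memory use that does not grow with the number of rows already fetched.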

## How was this patch tested?

Pass the existing tests.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #16440 from dongjoon-hyun/SPARK-18857.
2017-01-10 13:27:55 +00:00
gen/java/org/apache/hive/service/cli/thrift [SPARK-14987][SQL] inline hive-service (cli) into sql/hive-thriftserver 2016-04-29 09:32:42 -07:00
main [SPARK-18857][SQL] Don't use Iterator.duplicate for incrementalCollect in Thrift Server 2017-01-10 13:27:55 +00:00
test [SPARK-18992][SQL] Move spark.sql.hive.thriftServer.singleSession to SQLConf 2016-12-28 10:16:22 +08:00