spark-instrumented-optimizer/sql/hive
Michael Armbrust 25bef7e695 [SQL] More aggressive defaults
- Turns on compression for in-memory cached data by default
- Changes the default Parquet compression codec back to gzip (we have seen more OOMs with production workloads due to the way Snappy allocates memory)
- Ups the in-memory columnar batch size to 10,000 rows
- Increases the broadcast join threshold to 10MB
- Uses our Parquet implementation instead of the Hive one by default
- Caches Parquet metadata by default
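For reference, the new defaults above roughly correspond to setting the following Spark SQL properties explicitly (a sketch in `spark-defaults.conf` form; key names and the 10MB-in-bytes value are based on the Spark SQL configuration of this era and should be checked against the release being used):

```
# Compress in-memory columnar (cached) data
spark.sql.inMemoryColumnarStorage.compressed   true
# Parquet compression codec (back to gzip from snappy)
spark.sql.parquet.compression.codec            gzip
# Rows per in-memory columnar batch
spark.sql.inMemoryColumnarStorage.batchSize    10000
# Broadcast join threshold: 10MB = 10 * 1024 * 1024 bytes
spark.sql.autoBroadcastJoinThreshold           10485760
# Use Spark's native Parquet support for Hive metastore Parquet tables
spark.sql.hive.convertMetastoreParquet         true
# Cache Parquet metadata
spark.sql.parquet.cacheMetadata                true
```

Users who hit regressions with the new defaults (e.g. preferring snappy for speed over gzip) can override any of these per-application.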

Author: Michael Armbrust <michael@databricks.com>

Closes #3064 from marmbrus/fasterDefaults and squashes the following commits:

97ee9f8 [Michael Armbrust] parquet codec docs
e641694 [Michael Armbrust] Remote also
a12866a [Michael Armbrust] Cache metadata.
2d73acc [Michael Armbrust] Update docs defaults.
d63d2d5 [Michael Armbrust] document parquet option
da373f9 [Michael Armbrust] More aggressive defaults
2014-11-03 14:08:27 -08:00