spark-instrumented-optimizer/sql/core
Reynold Xin 2f0b882e5c [SPARK-14482][SQL] Change default Parquet codec from gzip to snappy
## What changes were proposed in this pull request?
Based on our tests, gzip decompression is very slow (< 100 MB/s), making queries decompression-bound. Snappy can decompress at ~500 MB/s on a single core.
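
To put those figures in perspective, here is a rough back-of-the-envelope sketch. It assumes a hypothetical 10 GB of data decompressed on one core and treats the quoted rates as exact. Since 100 MB/s is an upper bound for gzip, the real gap may be larger. The function name and the workload size are illustrative only and do not come from the patch.

```python
def decompress_seconds(size_mb: float, throughput_mb_per_s: float) -> float:
    """Time to decompress size_mb at a steady single-core rate."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    return size_mb / throughput_mb_per_s


SIZE_MB = 10_000    # hypothetical 10 GB workload (assumption, not from the patch)
GZIP_MB_S = 100     # upper bound quoted above for gzip
SNAPPY_MB_S = 500   # approximate single-core rate quoted above for snappy

gzip_s = decompress_seconds(SIZE_MB, GZIP_MB_S)
snappy_s = decompress_seconds(SIZE_MB, SNAPPY_MB_S)

# Prints: gzip: >= 100 s, snappy: ~20 s, ratio: 5x
print(f"gzip: >= {gzip_s:.0f} s, snappy: ~{snappy_s:.0f} s, ratio: {gzip_s / snappy_s:.0f}x")
```

Under these assumptions the decompression step alone is at least five times slower with gzip, which is consistent with queries becoming decompression-bound.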

This patch changes the default compression codec for Parquet output from gzip to snappy, and also introduces a ParquetOptions class to be more consistent with other data sources (e.g. CSV, JSON).
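
The fallback behaviour this implies can be modelled with a minimal sketch. This is not the code in the patch: the `compression` option key, the set of accepted codec names, and every identifier below are assumptions made for illustration. Only the default of snappy (previously gzip) comes from the description above.

```python
from typing import Mapping, Optional

# Short codec names accepted by this sketch (an assumed set, not taken from the patch).
SUPPORTED_CODECS = {"none", "uncompressed", "snappy", "gzip", "lzo"}

# New default described above; before this change it would have been "gzip".
DEFAULT_CODEC = "snappy"


def resolve_codec(write_options: Mapping[str, str],
                  session_default: Optional[str] = None) -> str:
    """Pick a codec: per-write option first, then session default, then built-in default."""
    fallback = session_default if session_default is not None else DEFAULT_CODEC
    name = write_options.get("compression", fallback).lower()
    if name not in SUPPORTED_CODECS:
        raise ValueError(
            f"Codec '{name}' is not supported; expected one of {sorted(SUPPORTED_CODECS)}"
        )
    return name


print(resolve_codec({}))                                  # snappy
print(resolve_codec({"compression": "GZIP"}))             # gzip
print(resolve_codec({}, session_default="uncompressed"))  # uncompressed
```

Keeping the lookup in one small options object, as the other data sources do, means the default lives in a single place and anyone who still wants gzip output can request it explicitly per write.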

## How was this patch tested?
Should be covered by existing unit tests.

Author: Reynold Xin <rxin@databricks.com>

Closes #12256 from rxin/SPARK-14482.
2016-04-08 23:52:04 -07:00
src [SPARK-14482][SQL] Change default Parquet codec from gzip to snappy 2016-04-08 23:52:04 -07:00
pom.xml [SPARK-14103][SQL] Parse unescaped quotes in CSV data source. 2016-04-08 00:28:59 -07:00