spark-instrumented-optimizer

main	[SPARK-9618] [SQL] Use the specified schema when reading Parquet files	2015-08-05 22:16:56 +08:00
test	[SPARK-8861][SPARK-8862][SQL] Add basic instrumentation to each SparkPlan operator and add a new SQL tab	2015-08-05 01:51:22 -07:00

Nathan Howell eb8bfa3eaa [SPARK-9618] [SQL] Use the specified schema when reading Parquet files

The user-specified schema is currently ignored when loading Parquet files.

One workaround is to use the `format` and `load` methods instead of `parquet`, e.g.:

```
val schema = ???

// schema is ignored
sqlContext.read.schema(schema).parquet("hdfs:///test")

// schema is retained
sqlContext.read.schema(schema).format("parquet").load("hdfs:///test")
```

The fix is simple, but I wonder if the `parquet` method should instead be written in a similar fashion to `orc`:

```
def parquet(path: String): DataFrame = format("parquet").load(path)
```
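
To make the difference between the two shapes concrete, here is a minimal toy model. It does not use Spark: `ToyReader`, `Loaded`, and `parquetIgnoringSchema` are hypothetical names invented for illustration, and the schema is just a string. The sketch assumes the bug has the shape described above (a dedicated method that builds its result without the user-specified schema) and contrasts it with a `parquet` that delegates to `format("parquet").load(path)`.

```scala
// Toy model only: these types are hypothetical and are NOT Spark's DataFrameReader.
final case class Loaded(source: String, path: String, schema: Option[String])

final class ToyReader(
    source: String = "parquet",
    userSchema: Option[String] = None) {

  // Builder-style setters return a new reader and keep the other settings.
  def schema(s: String): ToyReader = new ToyReader(source, Some(s))
  def format(src: String): ToyReader = new ToyReader(src, userSchema)

  // Generic path: the user-specified schema is carried through to the result.
  def load(path: String): Loaded = Loaded(source, path, userSchema)

  // Assumed shape of the bug: the result is built directly and the schema is dropped.
  def parquetIgnoringSchema(path: String): Loaded = Loaded("parquet", path, None)

  // Shape of the proposal: delegate, so there is only one code path to keep correct.
  def parquet(path: String): Loaded = format("parquet").load(path)
}

object ToyReaderDemo {
  def main(args: Array[String]): Unit = {
    val reader = new ToyReader().schema("id LONG, name STRING")
    println(reader.parquetIgnoringSchema("hdfs:///test").schema) // None
    println(reader.parquet("hdfs:///test").schema)               // Some(id LONG, name STRING)
  }
}
```

In this model the delegating `parquet` cannot diverge from `format("parquet").load(path)`, which is the appeal of writing it the way `orc` is written.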

Author: Nathan Howell <nhowell@godaddy.com>

Closes #7947 from NathanHowell/SPARK-9618 and squashes the following commits:

d1ea62c [Nathan Howell] [SPARK-9618] [SQL] Use the specified schema when reading Parquet files

2015-08-05 22:16:56 +08:00
