spark-instrumented-optimizer/sql/core/src/test
Takuya UESHIN 5b28e02584 [SPARK-16189][SQL] Add ExternalRDD logical plan for input with RDD to have a chance to eliminate serialize/deserialize.
## What changes were proposed in this pull request?

Currently, the input `RDD` of a `Dataset` is always serialized to `RDD[InternalRow]` before being wrapped as a `Dataset`, but there are cases where `map` or `mapPartitions` is called immediately after the conversion to `Dataset`.
In such cases the data is serialized and then immediately deserialized, which is unnecessary.

This PR adds an `ExternalRDD` logical plan for `RDD` input, which makes it possible to eliminate the serialize/deserialize round trip.

## How was this patch tested?

Existing tests.

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes #13890 from ueshin/issues/SPARK-16189.
2016-07-12 17:16:59 +08:00
avro [SPARK-10136] [SQL] Fixes Parquet support for Avro array of primitive array 2015-08-20 11:00:29 -07:00
gen-java/org/apache/spark/sql/execution/datasources/parquet/test/avro [SPARK-13401][SQL][TESTS] Fix SQL test warnings. 2016-03-22 21:08:11 -07:00
java/test/org/apache/spark/sql [SPARK-15982][SPARK-16009][SPARK-16007][SQL] Harmonize the behavior of DataFrameReader.text/csv/json/parquet/orc 2016-06-20 14:52:28 -07:00
resources [SPARK-15887][SQL] Bring back the hive-site.xml support for Spark 2.0 2016-06-13 14:57:35 -07:00
scala/org/apache/spark/sql [SPARK-16189][SQL] Add ExternalRDD logical plan for input with RDD to have a chance to eliminate serialize/deserialize. 2016-07-12 17:16:59 +08:00
scripts [SPARK-9407] [SQL] Relaxes Parquet ValidTypeMap to allow ENUM predicates to be pushed down 2015-08-12 20:01:34 +08:00
thrift [SPARK-9407] [SQL] Relaxes Parquet ValidTypeMap to allow ENUM predicates to be pushed down 2015-08-12 20:01:34 +08:00
README.md [SPARK-9407] [SQL] Relaxes Parquet ValidTypeMap to allow ENUM predicates to be pushed down 2015-08-12 20:01:34 +08:00

Notes for Parquet compatibility tests

The following directories and files are used for Parquet compatibility tests:

.
├── README.md                   # This file
├── avro
│   ├── *.avdl                  # Testing Avro IDL(s)
│   └── *.avpr                  # !! NO TOUCH !! Protocol files generated from Avro IDL(s)
├── gen-java                    # !! NO TOUCH !! Generated Java code
├── scripts
│   ├── gen-avro.sh             # Script used to generate Java code for Avro
│   └── gen-thrift.sh           # Script used to generate Java code for Thrift
└── thrift
    └── *.thrift                # Testing Thrift schema(s)

To avoid code generation at build time, the Java code generated from the testing Thrift schema and Avro IDL is also checked in.

When updating the testing Thrift schema or Avro IDL, please run `gen-avro.sh` and `gen-thrift.sh` accordingly to update the generated Java code.

Prerequisites

Please ensure avro-tools and thrift are installed. You may install these two on Mac OS X via:

$ brew install thrift avro-tools
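
Before running the generation scripts, it may help to confirm that both tools are actually on the `PATH`. The following is a minimal sketch, not part of the repository: the helper names `require_tool` and `check_prerequisites` are hypothetical, and the only assumption is that the executables are named `thrift` and `avro-tools`, as in the install command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): reports whether a
# command is available on PATH. Returns non-zero if it is missing.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1" >&2
        return 1
    fi
}

# Checks both prerequisites named in this README and reports every
# missing tool instead of stopping at the first one.
check_prerequisites() {
    status=0
    for tool in thrift avro-tools; do
        require_tool "$tool" || status=1
    done
    return "$status"
}

# Print a hint rather than aborting, so the check is safe to run anywhere.
check_prerequisites || echo "install the missing tools before running the gen-*.sh scripts" >&2
```

If the check reports a missing tool, install it first; otherwise the generation scripts will presumably fail when they try to invoke it.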