spark-instrumented-optimizer/sql/core/src/test
Reynold Xin f23a721c10 [SPARK-8993][SQL] More comprehensive type checking in expressions.
This patch makes the following changes:

1. ExpectsInputTypes only defines expected input types, but does not perform any implicit type casting.
2. ImplicitCastInputTypes is a new trait that defines both expected input types, as well as performs implicit type casting.
3. BinaryOperator has a new abstract function "inputType", which defines the expected input type for both left/right. Concrete BinaryOperator expressions no longer perform any implicit type casting.
4. For BinaryOperators, convert NullType (i.e. null literals) into some accepted type so BinaryOperators don't need to handle NullTypes.

TODOs needed: fix unit tests for error reporting.

I'm intentionally not changing anything in aggregate expressions because yhuai is doing a big refactoring on that right now.

Author: Reynold Xin <rxin@databricks.com>

Closes #7348 from rxin/typecheck and squashes the following commits:

8fcf814 [Reynold Xin] Fixed ordering of cases.
3bb63e7 [Reynold Xin] Style fix.
f45408f [Reynold Xin] Comment update.
aa7790e [Reynold Xin] Moved RemoveNullTypes into ImplicitTypeCasts.
438ea07 [Reynold Xin] space
d55c9e5 [Reynold Xin] Removes NullTypes.
360d124 [Reynold Xin] Fixed the rule.
fb66657 [Reynold Xin] Convert NullType into some accepted type for BinaryOperators.
2e22330 [Reynold Xin] Fixed unit tests.
4932d57 [Reynold Xin] Style fix.
d061691 [Reynold Xin] Rename existing ExpectsInputTypes -> ImplicitCastInputTypes.
e4727cc [Reynold Xin] BinaryOperator should not be doing implicit cast.
d017861 [Reynold Xin] Improve expression type checking.
2015-07-14 22:52:53 -07:00
| Path | Last commit | Date |
|------|-------------|------|
| avro | [SPARK-6123] [SPARK-6775] [SPARK-6776] [SQL] Refactors Parquet read path for interoperability and backwards-compatibility | 2015-07-08 15:51:01 -07:00 |
| gen-java/org/apache/spark/sql/parquet/test/avro | [SPARK-8959] [SQL] [HOTFIX] Removes parquet-thrift and libthrift dependencies | 2015-07-09 17:09:16 -07:00 |
| java/test/org/apache/spark/sql | [SPARK-7654][SQL] Move JDBC into DataFrame's reader/writer interface. | 2015-05-16 22:01:53 -07:00 |
| resources | [SPARK-8959] [SQL] [HOTFIX] Removes parquet-thrift and libthrift dependencies | 2015-07-09 17:09:16 -07:00 |
| scala/org/apache/spark/sql | [SPARK-8993][SQL] More comprehensive type checking in expressions. | 2015-07-14 22:52:53 -07:00 |
| scripts | [SPARK-6123] [SPARK-6775] [SPARK-6776] [SQL] Refactors Parquet read path for interoperability and backwards-compatibility | 2015-07-08 15:51:01 -07:00 |
| thrift | [SPARK-6123] [SPARK-6775] [SPARK-6776] [SQL] Refactors Parquet read path for interoperability and backwards-compatibility | 2015-07-08 15:51:01 -07:00 |
| README.md | [SPARK-6123] [SPARK-6775] [SPARK-6776] [SQL] Refactors Parquet read path for interoperability and backwards-compatibility | 2015-07-08 15:51:01 -07:00 |

Notes for Parquet compatibility tests

The following directories and files are used for Parquet compatibility tests:

.
├── README.md                   # This file
├── avro
│   ├── parquet-compat.avdl     # Testing Avro IDL
│   └── parquet-compat.avpr     # !! NO TOUCH !! Protocol file generated from parquet-compat.avdl
├── gen-java                    # !! NO TOUCH !! Generated Java code
├── scripts
│   └── gen-code.sh             # Script used to generate Java code for Thrift and Avro
└── thrift
    └── parquet-compat.thrift   # Testing Thrift schema

The generated Java code is used in the following test suites (a sample invocation is shown after the list):

  • org.apache.spark.sql.parquet.ParquetAvroCompatibilitySuite
  • org.apache.spark.sql.parquet.ParquetThriftCompatibilitySuite
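
As a rough sketch, assuming a standard sbt build of this Spark checkout (the sql module name and the test-only task syntax are assumptions and may differ across sbt and Spark versions), the two suites can be run from the repository root with:

$ build/sbt "sql/test-only org.apache.spark.sql.parquet.ParquetAvroCompatibilitySuite"
$ build/sbt "sql/test-only org.apache.spark.sql.parquet.ParquetThriftCompatibilitySuite"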

To avoid code generation at build time, the Java code generated from the testing Thrift schema and Avro IDL is also checked in.

When updating the testing Thrift schema and Avro IDL, please run gen-code.sh to update all the generated Java code.
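
For example, a typical regeneration run might look like the following. This is only a sketch: it assumes the script is executed from this directory (sql/core/src/test) and that the prerequisites listed below are already installed.

$ cd sql/core/src/test
$ ./scripts/gen-code.sh

Since the gen-java directory is checked in, remember to commit the regenerated Java files together with the schema changes.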

Prerequisites

Please ensure avro-tools and thrift are installed. On Mac OS X, you can install both with Homebrew:

$ brew install thrift avro-tools