Notes for Parquet compatibility tests
The following directories and files are used for Parquet compatibility tests:
.
├── README.md                   # This file
├── avro
│   ├── parquet-compat.avdl     # Testing Avro IDL
│   └── parquet-compat.avpr     # !! NO TOUCH !! Protocol file generated from parquet-compat.avdl
├── gen-java                    # !! NO TOUCH !! Generated Java code
├── scripts
│   └── gen-code.sh             # Script used to generate Java code for Thrift and Avro
└── thrift
    └── parquet-compat.thrift   # Testing Thrift schema
The generated Java code is used in the following test suites:
org.apache.spark.sql.parquet.ParquetAvroCompatibilitySuite
org.apache.spark.sql.parquet.ParquetThriftCompatibilitySuite
To avoid code generation at build time, the Java code generated from the testing Thrift schema and Avro IDL is also checked in.
When updating the testing Thrift schema or Avro IDL, please run gen-code.sh to update all the generated Java code.
Prerequisites
Please ensure avro-tools and thrift are installed. You may install these two on Mac OS X via:
$ brew install thrift avro-tools
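Before regenerating code, it may help to confirm that both tools are actually on the PATH. The following is only a sketch: the helper name `missing_tools` is hypothetical and is not part of gen-code.sh, and it assumes the executables are named `avro-tools` and `thrift`, as in the brew command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of gen-code.sh): print the name of every
# tool in the argument list that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumption: the executables are called avro-tools and thrift.
missing=$(missing_tools avro-tools thrift)
if [ -n "$missing" ]; then
  echo "Missing prerequisites:"
  echo "$missing"
fi
```

If nothing is reported as missing, gen-code.sh should be able to find both tools.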