spark-instrumented-optimizer/sql/hive
Liang-Chi Hsieh 7bca62f790 [SPARK-6607][SQL] Check invalid characters for Parquet schema and show error messages
'(' and ')' are special characters used in Parquet schemas for type annotation. When we run an aggregation query, we obtain attribute names such as "MAX(a)".

If we directly store the generated DataFrame as a Parquet file, reading it back fails when the stored schema string is parsed.

Several methods can be adopted to solve this. This PR uses the simplest one: replacing attribute names before generating the Parquet schema based on these attributes.

Another possible method would be to change all aggregation expression names from "func(column)" to "func[column]".
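
As a rough illustration of the check named in the commit title, the sketch below rejects attribute names containing '(' or ')' and suggests an alias, as commit 463dff4 describes. It is plain Scala with no Spark dependency; the object name ParquetNameCheck, its methods, and the error wording are hypothetical and are not taken from the actual patch.

```scala
object ParquetNameCheck {
  // The two characters this commit message identifies as special in a Parquet schema.
  val specialChars: Set[Char] = Set('(', ')')

  // Returns the attribute names that contain at least one special character.
  def invalidNames(names: Seq[String]): Seq[String] =
    names.filter(_.exists(specialChars.contains))

  // Throws IllegalArgumentException if any name is invalid.
  // The message text is illustrative only.
  def check(names: Seq[String]): Unit = {
    val bad = invalidNames(names)
    require(
      bad.isEmpty,
      s"Attribute name(s) ${bad.mkString(", ")} contain invalid characters " +
        "for a Parquet schema; consider renaming them with an alias."
    )
  }
}
```

With such a check in place, a name like "MAX(a)" is rejected before the schema is written, and the user can presumably avoid the error by aliasing the aggregate (for example to "max_a") before saving.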

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #5263 from viirya/parquet_aggregation_name and squashes the following commits:

2d70542 [Liang-Chi Hsieh] Address comment.
463dff4 [Liang-Chi Hsieh] Instead of replacing special chars, showing error message to user to suggest using Alias.
1de001d [Liang-Chi Hsieh] Replace special characters '(' and ')' of Parquet schema.
2015-04-05 00:20:43 +08:00
..
compatibility/src/test/scala/org/apache/spark/sql/hive/execution [SPARK-5680][SQL] Sum function on all null values, should return zero 2015-03-21 13:24:24 -07:00
src [SPARK-6607][SQL] Check invalid characters for Parquet schema and show error messages 2015-04-05 00:20:43 +08:00
v0.12.0/src/main/scala/org/apache/spark/sql/hive [SPARK-5498][SQL]fix query exception when partition schema does not match table schema 2015-03-25 17:47:45 -07:00
v0.13.1/src/main/scala/org/apache/spark/sql/hive [SPARK-5498][SQL]fix query exception when partition schema does not match table schema 2015-03-25 17:47:45 -07:00
pom.xml SPARK-6433 hive tests to import spark-sql test JAR for QueryTest access 2015-04-01 16:26:54 +01:00