spark-instrumented-optimizer/dev/deps
Sean Owen a9d4e60a90 [SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV
### What changes were proposed in this pull request?

Spark's CSV source can optionally ignore lines that start with a comment character. Some code paths check whether the option is set (i.e. differs from its default of `\0`) before applying comment logic, but many do not, including the one that passes the option to Univocity. As a result, rows beginning with a null character were treated as comments even when comment handling was 'disabled'.
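
The intended behaviour can be sketched in a few lines of plain Python. This is only an illustration of the guard described above, not Spark's or Univocity's actual code; the names `UNSET_COMMENT` and `filter_comment_lines` are hypothetical.

```python
from typing import Iterable, List

# Assumption for this sketch: "\0" is the sentinel meaning "no comment char set",
# mirroring the default described in the PR text.
UNSET_COMMENT = "\0"


def filter_comment_lines(lines: Iterable[str], comment: str = UNSET_COMMENT) -> List[str]:
    """Drop lines starting with `comment`, but only if a comment char is set."""
    if comment == UNSET_COMMENT:
        # Comment handling is disabled: keep every line, including
        # lines that happen to start with a null character.
        return list(lines)
    return [line for line in lines if not line.startswith(comment)]


rows = ["\0abc,1", "#note", "x,2"]
kept_when_unset = filter_comment_lines(rows)      # nothing is dropped
kept_when_hash = filter_comment_lines(rows, "#")  # only "#note" is dropped
```

Without the early return, the first row would be discarded whenever the option is left at its default, which is the symptom this change addresses.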

### Why are the changes needed?

To avoid dropping rows that start with a null char when this is not requested or intended. See JIRA for an example.

### Does this PR introduce _any_ user-facing change?

Nothing beyond the effect of the bug fix.

### How was this patch tested?

Existing tests plus a new test case.

Closes #29516 from srowen/SPARK-32614.

Authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2020-08-26 00:25:58 +09:00
spark-deps-hadoop-2.7-hive-1.2 [SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV 2020-08-26 00:25:58 +09:00
spark-deps-hadoop-2.7-hive-2.3 [SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV 2020-08-26 00:25:58 +09:00
spark-deps-hadoop-3.2-hive-2.3 [SPARK-32614][SQL] Don't apply comment processing if 'comment' unset for CSV 2020-08-26 00:25:58 +09:00