cbaf595447
The current code references the schema of the DataFrame to be written before checking the save mode, which triggers expensive metadata discovery prematurely. For save modes other than `Append`, this metadata discovery is wasted work, since we later either ignore the result (for `Ignore` and `ErrorIfExists`) or delete the existing files (for `Overwrite`).
This PR fixes the issue by deferring metadata discovery until after the save mode has been checked.
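The idea can be illustrated with a minimal, self-contained sketch. It is not the code changed by this PR: `Mode`, `Relation`, `DeferredDiscovery`, and `write` are hypothetical stand-ins, and the "schema" is just a list of column names. It assumes that wrapping the expensive discovery in a `lazy val` and touching it only on the `Append` path is enough to show the ordering.

```scala
// Hypothetical stand-ins for illustration; none of these names are Spark APIs.
sealed trait Mode
case object Append extends Mode
case object Overwrite extends Mode
case object ErrorIfExists extends Mode
case object Ignore extends Mode

final class Relation(discover: () => Seq[String]) {
  // `lazy val` postpones the expensive discovery until the first access.
  lazy val schema: Seq[String] = discover()
}

object DeferredDiscovery {
  /** Returns true if the write should proceed. */
  def write(mode: Mode, pathExists: Boolean, relation: Relation): Boolean = {
    // Step 1: decide from the save mode alone, without touching the schema.
    val doWrite = (mode, pathExists) match {
      case (ErrorIfExists, true) => sys.error("path already exists")
      case (Ignore, true)        => false
      case _                     => true
    }
    // Step 2: only appending to existing data needs the existing metadata.
    if (doWrite && mode == Append && pathExists) {
      require(relation.schema.nonEmpty, "existing data has no schema")
    }
    doWrite
  }
}
```

With this ordering, calling `write` with `Ignore` or `Overwrite` never invokes the `discover` function, while `Append` against an existing path invokes it exactly once.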
Author: Cheng Lian <lian@databricks.com>
Closes #6583 from liancheng/spark-8014 and squashes the following commits:
1aafabd [Cheng Lian] Updates comments
088abaa [Cheng Lian] Avoids schema merging and partition discovery when data schema and partition schema are defined
8fbd93f [Cheng Lian] Fixes SPARK-8014
(cherry picked from commit