spark-instrumented-optimizer/core
Wenchen Fan 32407bfd2b [SPARK-9899] [SQL] log warning for direct output committer with speculation enabled
This is a follow-up of https://github.com/apache/spark/pull/8317.

When speculation is enabled, there may be multiply tasks writing to the same path. Generally it's OK as we will write to a temporary directory first and only one task can commit the temporary directory to target path.

However, when we use direct output committer, tasks will write data to target path directly without temporary directory. This causes problems like corrupted data. Please see [PR comment](https://github.com/apache/spark/pull/8191#issuecomment-131598385) for more details.

Unfortunately, we don't have a simple flag to tell if a output committer will write to temporary directory or not, so for safety, we have to disable any customized output committer when `speculation` is true.

Author: Wenchen Fan <cloud0fan@outlook.com>

Closes #8687 from cloud-fan/direct-committer.
2015-09-14 11:51:39 -07:00
..
src [SPARK-9899] [SQL] log warning for direct output committer with speculation enabled 2015-09-14 11:51:39 -07:00
pom.xml [SPARK-10421] [BUILD] Exclude curator artifacts from tachyon dependencies. 2015-09-03 15:10:30 -07:00