spark-instrumented-optimizer/sql/core/src/main
frreiss 620da3b482 [SPARK-17475][STREAMING] Delete CRC files if the filesystem doesn't use checksum files
## What changes were proposed in this pull request?

When the metadata logs for various parts of Structured Streaming are stored on non-HDFS filesystems such as NFS or ext4, the HDFSMetadataLog class leaves hidden HDFS-style checksum (CRC) files in the log directory, one file per batch. This PR modifies HDFSMetadataLog so that it detects the use of a filesystem that doesn't use CRC files and removes the CRC files.
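
The cleanup described above can be sketched as follows. This is a minimal illustration, not the code in this patch: it uses `java.nio.file` rather than Hadoop's `FileSystem` API, and the class and method names (`CrcCleanup`, `crcPathFor`, `deleteCrcIfPresent`) are hypothetical. It relies only on Hadoop's checksum naming convention, where a file named `3` gets a hidden sibling named `.3.crc` in the same directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative helper (hypothetical); not the actual HDFSMetadataLog implementation. */
final class CrcCleanup {
    private CrcCleanup() {}

    /** Returns the Hadoop-style checksum sibling of a file: "." + name + ".crc" in the same directory. */
    static Path crcPathFor(Path batchFile) {
        String name = batchFile.getFileName().toString();
        return batchFile.resolveSibling("." + name + ".crc");
    }

    /** Deletes the checksum sibling if it exists; returns true only when a file was actually removed. */
    static boolean deleteCrcIfPresent(Path batchFile) {
        try {
            return Files.deleteIfExists(crcPathFor(batchFile));
        } catch (IOException e) {
            // Surface I/O failures as unchecked so callers are not forced to handle them.
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `CrcCleanup.deleteCrcIfPresent(logDir.resolve("3"))` after a batch file is written would remove `.3.crc` when present and do nothing otherwise, so the call is safe on filesystems that never create checksum files.
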
## How was this patch tested?

Modified an existing test case in HDFSMetadataLogSuite to check whether HDFSMetadataLog correctly removes CRC files on the local POSIX filesystem. Ran the entire regression suite.

Author: frreiss <frreiss@us.ibm.com>

Closes #15027 from frreiss/fred-17475.
2016-11-01 23:00:17 -07:00
java/org/apache/spark/sql [SPARK-17830][SQL] Annotate remaining SQL APIs with InterfaceStability 2016-10-13 11:12:30 -07:00
resources [SPARK-16031] Add debug-only socket source in Structured Streaming 2016-06-19 21:27:04 -07:00
scala/org/apache/spark/sql [SPARK-17475][STREAMING] Delete CRC files if the filesystem doesn't use checksum files 2016-11-01 23:00:17 -07:00