55b7e2fdff
Bug fixes for file input stream and checkpointing

- Fixed bugs in the file input stream that caused the stream to fail on transient HDFS errors (e.g., listing files while a background thread is deleting them).
- Updated Spark's CheckpointRDD and Streaming's CheckpointWriter to use SparkContext.hadoopConfiguration, allowing checkpoints to be written to any HDFS-compatible store that requires special configuration.
- Changed the API of SparkContext.setCheckpointDir(): eliminated the unnecessary 'useExisting' parameter. SparkContext now always creates a unique subdirectory within the user-specified checkpoint directory, ensuring that previous checkpoint files are not accidentally overwritten.
- Fixed a bug where setting the checkpoint directory to a relative local path caused checkpointing to fail.
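The new setCheckpointDir behavior described above can be sketched in plain Python. This is a minimal illustration of the unique-subdirectory idea, not Spark's actual implementation; the helper name `set_checkpoint_dir` and the use of a UUID are assumptions for illustration only.

```python
import os
import tempfile
import uuid

def set_checkpoint_dir(base_dir):
    # Hypothetical helper mirroring the described behavior: always create
    # a fresh, uniquely named subdirectory under the user-specified path,
    # so checkpoint files from earlier runs are never overwritten.
    unique = os.path.join(base_dir, str(uuid.uuid4()))
    os.makedirs(unique)  # raises if the directory somehow already exists
    return unique

base = tempfile.mkdtemp()
first = set_checkpoint_dir(base)
second = set_checkpoint_dir(base)
assert first != second  # each call gets its own subdirectory
```

Because each call yields a distinct directory, two applications sharing one checkpoint root cannot clobber each other's files, which is the safety property the API change is after.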
mllib
__init__.py
accumulators.py
broadcast.py
cloudpickle.py
context.py
daemon.py
files.py
java_gateway.py
join.py
rdd.py
rddsampler.py
serializers.py
shell.py
statcounter.py
storagelevel.py
tests.py
worker.py