Commit graph

771 commits

Author SHA1 Message Date
Ryan LeCompte 22445fbea9 attempt to sleep for more accurate time period, minor cleanup 2013-01-11 13:30:49 -08:00
Matei Zaharia f7cf035b9b Merge pull request #350 from tdas/streaming
Spark Streaming
2013-01-07 17:40:11 -08:00
Tathagata Das 4719e6d8fe Changed locations for unit test logs. 2013-01-07 16:06:07 -08:00
Tathagata Das 3b0a3b89ac Added better docs for RDDCheckpointData 2013-01-07 14:55:49 -08:00
Tathagata Das 237bac36e9 Renamed examples and added documentation. 2013-01-07 14:37:21 -08:00
Tathagata Das 1346126485 Changed cleanup to clearOldValues for TimeStampedHashMap and TimeStampedHashSet. 2013-01-07 12:11:27 -08:00
Matei Zaharia ecf9c08901 Fix Accumulators in Java, and add a test for them 2013-01-05 20:54:08 -05:00
Tathagata Das 3dc87dd923 Fixed compilation bug in RDDSuite created during merge for mesos/master. 2013-01-01 16:38:04 -08:00
Tathagata Das d34dba25c2 Merge branch 'mesos' into dev-merge 2013-01-01 15:48:39 -08:00
Matei Zaharia 55809fbc6d Merge pull request #349 from woggling/cache-finally
Avoid stalls when computation of cached RDD throws exception
2013-01-01 08:21:33 -08:00
Charles Reiss 58072a7340 Remove some dead comments 2013-01-01 08:07:44 -08:00
Charles Reiss 21636ee4fa Test with exception while computing cached RDD. 2013-01-01 08:07:40 -08:00
Charles Reiss feadaf72f4 Mark key as not loading in CacheTracker even when compute() fails 2013-01-01 07:57:20 -08:00
Josh Rosen f803953998 Raise exception when hashing Java arrays (SPARK-597) 2012-12-31 20:20:11 -08:00
Tathagata Das 7e0271b438 Refactored a whole lot to push all DStreams into the spark.streaming.dstream package. 2012-12-30 15:19:55 -08:00
Tathagata Das 9e644402c1 Improved jekyll and scala docs. Made many classes and method private to remove them from scala docs. 2012-12-29 18:31:51 -08:00
Josh Rosen 397e67103c Change Utils.fetchFile() warning to SparkException. 2012-12-28 17:37:13 -08:00
Josh Rosen d64fa72d2e Add addFile() and addJar() to JavaSparkContext. 2012-12-28 17:00:57 -08:00
Josh Rosen bd237d4a9d Add synchronization to LocalScheduler.updateDependencies(). 2012-12-28 17:00:57 -08:00
Josh Rosen f1bf4f0385 Skip deletion of files in clearFiles().
This fixes an issue where Spark could delete
original files in the current working directory
that were added to the job using addFile().

There was also the potential for addFile() to
overwrite local files, which is addressed by
changing Utils.fetchFile() to log a warning
instead of overwriting a file with new contents.

This is a short-term fix; a better long-term
solution would be to remove the dependence on
storing files in the current working directory,
since we can't change the cwd from Java.
2012-12-28 17:00:57 -08:00
Tathagata Das 0bc0a60d30 Modifications to make sure LocalScheduler terminate cleanly without errors when SparkContext is shutdown, to minimize spurious exception during master failure tests. 2012-12-27 15:37:33 -08:00
Tathagata Das 7c33f76291 Merge branch 'mesos' into dev-merge 2012-12-26 19:19:07 -08:00
Tathagata Das 836042bb9f Merge branch 'dev-checkpoint' of github.com:radlab/spark into dev-merge
Conflicts:
	core/src/main/scala/spark/ParallelCollection.scala
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/rdd/BlockRDD.scala
	core/src/main/scala/spark/rdd/CartesianRDD.scala
	core/src/main/scala/spark/rdd/CoGroupedRDD.scala
	core/src/main/scala/spark/rdd/CoalescedRDD.scala
	core/src/main/scala/spark/rdd/FilteredRDD.scala
	core/src/main/scala/spark/rdd/FlatMappedRDD.scala
	core/src/main/scala/spark/rdd/GlommedRDD.scala
	core/src/main/scala/spark/rdd/HadoopRDD.scala
	core/src/main/scala/spark/rdd/MapPartitionsRDD.scala
	core/src/main/scala/spark/rdd/MapPartitionsWithSplitRDD.scala
	core/src/main/scala/spark/rdd/MappedRDD.scala
	core/src/main/scala/spark/rdd/PipedRDD.scala
	core/src/main/scala/spark/rdd/SampledRDD.scala
	core/src/main/scala/spark/rdd/ShuffledRDD.scala
	core/src/main/scala/spark/rdd/UnionRDD.scala
	core/src/main/scala/spark/scheduler/ResultTask.scala
	core/src/test/scala/spark/CheckpointSuite.scala
2012-12-26 19:09:01 -08:00
Mark Hamstra 903f3518df fall back to filter-map-collect when calling lookup() on an RDD without a partitioner 2012-12-24 13:18:45 -08:00
Mark Hamstra 61be8566e2 Allow distinct() to be called without parentheses when using the default number of splits. 2012-12-24 02:36:47 -08:00
Reynold Xin 60f7338092 Remove the call to close input stream in Kryo serializer. 2012-12-21 15:49:33 -08:00
Matei Zaharia 3334b7c6b5 Merge pull request #341 from rxin/4a3fb06ac2d11125feb08acbbd4df76d1e91b677
Kryo2 update against Spark master
2012-12-21 15:31:23 -08:00
Reynold Xin eac566a7f4 Merge branch 'master' of github.com:mesos/spark into dev
Conflicts:
	core/src/main/scala/spark/MapOutputTracker.scala
	core/src/main/scala/spark/PairRDDFunctions.scala
	core/src/main/scala/spark/ParallelCollection.scala
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/rdd/BlockRDD.scala
	core/src/main/scala/spark/rdd/CartesianRDD.scala
	core/src/main/scala/spark/rdd/CoGroupedRDD.scala
	core/src/main/scala/spark/rdd/CoalescedRDD.scala
	core/src/main/scala/spark/rdd/FilteredRDD.scala
	core/src/main/scala/spark/rdd/FlatMappedRDD.scala
	core/src/main/scala/spark/rdd/GlommedRDD.scala
	core/src/main/scala/spark/rdd/HadoopRDD.scala
	core/src/main/scala/spark/rdd/MapPartitionsRDD.scala
	core/src/main/scala/spark/rdd/MapPartitionsWithSplitRDD.scala
	core/src/main/scala/spark/rdd/MappedRDD.scala
	core/src/main/scala/spark/rdd/PipedRDD.scala
	core/src/main/scala/spark/rdd/SampledRDD.scala
	core/src/main/scala/spark/rdd/ShuffledRDD.scala
	core/src/main/scala/spark/rdd/UnionRDD.scala
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/BlockManagerId.scala
	core/src/main/scala/spark/storage/BlockManagerMaster.scala
	core/src/main/scala/spark/storage/StorageLevel.scala
	core/src/main/scala/spark/util/MetadataCleaner.scala
	core/src/main/scala/spark/util/TimeStampedHashMap.scala
	core/src/test/scala/spark/storage/BlockManagerSuite.scala
	run
2012-12-20 14:53:40 -08:00
Tathagata Das 8512dd3225 Merge branch 'dev' of github.com:radlab/spark into dev-checkpoint
Conflicts:
	core/src/main/scala/spark/ParallelCollection.scala
	core/src/test/scala/spark/CheckpointSuite.scala
	streaming/src/main/scala/spark/streaming/DStream.scala
2012-12-20 14:24:19 -08:00
Tathagata Das fe777eb77d Fixed bugs in CheckpointRDD and spark.CheckpointSuite. 2012-12-20 13:39:27 -08:00
Tathagata Das f9c5b0a6fe Changed checkpoint writing and reading process. 2012-12-20 11:52:23 -08:00
Matei Zaharia 5e51b889fe Merge pull request #327 from rxin/spark-633
Added the ability in block manager to remove blocks.
2012-12-20 11:33:38 -08:00
Reynold Xin 9397c5014e Let the slave notify the master block removal. 2012-12-20 01:37:09 -08:00
Reynold Xin 68c52d80ec Moved BlockManager's IdGenerator into BlockManager object. Removed some
excessive debug messages.
2012-12-19 15:27:23 -08:00
Tathagata Das 5184141936 Introduced getSpits, getDependencies, and getPreferredLocations in RDD and RDDCheckpointData. 2012-12-18 13:30:53 -08:00
Patrick Wendell bfac06e1f6 SPARK-616: Logging dead workers in Web UI.
This patch keeps track of which workers have died and marks them
as such in the master web UI. It also handles workers which die and
re-register using different actor ID's.
2012-12-17 23:09:05 -08:00
Tathagata Das 72eed2b95e Converted CheckpointState in RDDCheckpointData to use scala Enumeration. 2012-12-17 18:52:43 -08:00
Matei Zaharia b82a6dd2c7 Merge pull request #332 from JoshRosen/spark-607
Add try-finally to handle MapOutputTracker timeouts
2012-12-14 11:41:16 -08:00
Reynold Xin 06f855c24d Merge branch 'spark-633' of github.com:rxin/spark into spark-633 2012-12-14 00:27:24 -08:00
Reynold Xin 8c01295b85 Fixed conflicts from merging Charles' and TD's block manager changes. 2012-12-14 00:26:36 -08:00
Charles Reiss c528932a41 Code review cleanup. 2012-12-13 22:37:16 -08:00
Charles Reiss 0aad42b5e7 Have standalone cluster report exit codes to clients. Addresses SPARK-639. 2012-12-13 22:37:16 -08:00
Reynold Xin 0235667f73 Merge branch 'master' of github.com:mesos/spark into spark-633 2012-12-13 22:33:41 -08:00
Reynold Xin 97434f49b8 Merged TD's block manager refactoring. 2012-12-13 22:32:19 -08:00
Reynold Xin f4a9e1b9be Fixed the broken Java unit test from SPARK-635. 2012-12-13 22:22:12 -08:00
Reynold Xin 41e58a519a Merge branch 'master' of github.com:mesos/spark into spark-633 2012-12-13 22:06:47 -08:00
Josh Rosen cf52d9cade Add try-finally to handle MapOutputTracker timeouts. 2012-12-13 21:53:30 -08:00
Matei Zaharia 05e225f988 Merge pull request #329 from woggling/executor-status-codes
Executor exit status codes
2012-12-13 20:14:10 -08:00
Charles Reiss b054d3b222 ExecutorLostReason -> ExecutorLossReason 2012-12-13 18:44:07 -08:00
Charles Reiss 24d7aa2d15 Extra whitespace in ExecutorExitCode 2012-12-13 18:39:23 -08:00