Commit graph

308 commits

Author SHA1 Message Date
Matei Zaharia f61b61c4ac Merge branch 'master' into new-rdds 2011-05-21 21:25:58 -07:00
Matei Zaharia 24a1e7f838 Scheduler can now recover from lost map outputs 2011-05-20 00:19:53 -07:00
Matei Zaharia 82329b0b28 Updated scheduler to support running on just some partitions of final RDD 2011-05-19 12:47:09 -07:00
Matei Zaharia 328e51b693 Various minor fixes 2011-05-19 11:19:25 -07:00
Matei Zaharia fd1d255821 Stop objectifying various trackers, caches, etc. 2011-05-17 12:41:13 -07:00
Matei Zaharia 4db50e26c7 Fixed unit tests by making them clean up the SparkContext after use and thus clean up the various singletons (RDDCache, MapOutputTracker, etc). This isn't perfect yet (ideally we shouldn't use singleton objects at all) but we can fix that later. 2011-05-13 12:03:58 -07:00
Matei Zaharia aca8150c52 Ensure that AddedToCache messages make it home before tasks finish 2011-05-13 11:43:52 -07:00
Matei Zaharia 16c886a581 Optimization for count() 2011-05-13 10:41:34 -07:00
Matei Zaharia 4b1f0f1ce4 Merge pull request #48 from ankurdave/bagel-new
Bagel: Large-scale graph processing on Spark
2011-05-12 21:34:38 -07:00
Ankur Dave f40a0898a7 Rename bagel to spark.bagel and Pregel to Bagel 2011-05-09 15:23:21 -07:00
Matei Zaharia 7e20648914 Upgraded to SBT 0.7.5 2011-05-09 14:48:39 -07:00
Matei Zaharia 4bedf5b13a Merge pull request #47 from ankurdave/cache-to-disk
Merging in Ankur's code for a cache that spills to disk
2011-05-09 14:22:56 -07:00
Ankur Dave c1104058c6 Move shortest path and PageRank to bagel.examples 2011-05-03 18:53:58 -07:00
Ankur Dave 563c5e717c Refactor and add aggregator support
Refactored out the agg() and comp() methods from Pregel.run.

Defined an implicit conversion to allow applications that don't use
aggregators to avoid including a null argument for the result of the
aggregator in the compute function.
2011-05-03 15:40:45 -07:00
Ankur Dave c18fa3ebc6 Package combiner functions into a trait 2011-05-03 15:40:41 -07:00
Ankur Dave 1c8ca0ebe1 Add Bagel test suite
Note: This test suite currently fails for the same reason that the
Spark Core test suite fails: Spark currently seems to have a bug where
any test after the first one fails.
2011-05-03 15:40:31 -07:00
Ankur Dave c5b3ea755f Clean up Bagel source and interface 2011-05-03 15:40:01 -07:00
Ankur Dave 19122af787 Update ShortestPath to work with controllable partitioning 2011-05-03 15:39:39 -07:00
Ankur Dave 45ec9db8af Add Bagel classpath to run script 2011-05-03 15:39:21 -07:00
Ankur Dave 62ef620354 Clean up Pregel.run, add logging 2011-05-03 15:38:01 -07:00
Ankur Dave c0736f6f68 Add Bagel, an implementation of Pregel on Spark 2011-05-03 15:37:08 -07:00
Ankur Dave a4c04f3f6f Error handling for disk I/O in DiskSpillingCache
Also renamed the property spark.DiskSpillingCache.cacheDir to spark.diskSpillingCache.cacheDir in order to follow conventions.
2011-04-27 23:23:29 -07:00
Ankur Dave 12ff0d2dc3 Bring an entry back into memory after fetching it from disk 2011-04-27 22:59:05 -07:00
Ankur Dave e30313aa2c Added DiskSpillingCache
DiskSpillingCache is a BoundedMemoryCache that spills entries to disk
when it runs out of space. Currently the implementation is very
simple. In particular, it's missing the following features:

- Error handling for disk I/O, including checking of disk space levels
- Bringing an entry back into memory after fetching it from disk

In addition, here are some features that aren't critical but should be
implemented soon:

- Spilling based on a user-set priority in addition to LRU
- Caching into a subdirectory of spark.DiskSpillingCache.cacheDir
  rather than the root directory
2011-04-27 22:32:35 -07:00
Mosharaf Chowdhury 9d2d533493 Temporary fix for issue #42. 2011-04-21 17:40:26 -07:00
Timothy Hunter 5c9535228a fixed small bug when classpath has some strange formatting 2011-04-18 17:12:29 -07:00
Matei Zaharia 94ba95bcb2 Added flatMapValues 2011-04-12 19:51:58 -07:00
Matei Zaharia d840fa8d0c Merge remote branch 'origin/custom-serialization' into new-rdds 2011-03-09 00:40:07 -08:00
root ff5b13799a Some tweaks to make Kryo cache work better 2011-03-09 03:31:50 -05:00
Matei Zaharia 7febdfbe29 Better reuse of buffers in Kryo serialization 2011-03-08 12:36:36 -08:00
Matei Zaharia 8ee3ec29ee Merge remote branch 'origin/custom-serialization' into new-rdds 2011-03-08 11:58:19 -08:00
Matei Zaharia 7408230bfa Updated modified Kryo to use objenesis 2011-03-08 11:58:08 -08:00
Matei Zaharia ab1216cb14 Register None and Nil properly 2011-03-08 11:52:58 -08:00
Matei Zaharia d39f5dd15e Merge remote branch 'origin/custom-serialization' into new-rdds 2011-03-08 10:28:50 -08:00
Matei Zaharia 4f0d0a7b73 stuff 2011-03-08 10:28:26 -08:00
Matei Zaharia 8b6f3db415 Merge remote branch 'origin/custom-serialization' into new-rdds 2011-03-07 19:20:28 -08:00
Matei Zaharia 38f6bce33d Added SerializingCache 2011-03-07 19:16:24 -08:00
Matei Zaharia 6316c7979d Remove some logging 2011-03-07 18:56:36 -08:00
Matei Zaharia e7b4b047a6 Added pluggable serializers and Kryo serialization 2011-03-07 18:41:53 -08:00
Matei Zaharia 467f056e29 Remove commented code 2011-03-06 23:38:41 -08:00
Matei Zaharia bce95b8458 Finished cogroup stuff 2011-03-06 23:38:16 -08:00
Matei Zaharia 04c2d6a60c stuff 2011-03-06 19:27:03 -08:00
Matei Zaharia 0fb691dd28 Various fixes to get MesosScheduler working with new RDDs 2011-03-06 16:16:38 -08:00
Matei Zaharia 1df5a65a01 Pass cache locations correctly to DAGScheduler. 2011-03-06 12:16:38 -08:00
Matei Zaharia e1436f1eaa Merge remote branch 'origin/master' into new-rdds 2011-03-06 11:11:47 -08:00
Matei Zaharia 370b95816f Added sampling for large arrays in SizeEstimator 2011-03-06 11:11:20 -08:00
Matei Zaharia a789e9aaea Merge remote branch 'origin/master' into new-rdds 2011-03-01 10:33:37 -08:00
Matei Zaharia 021c50a8d4 Remove unnecessary lock which was there to work around a bug in Configuration in Hadoop 0.20.0 2011-03-01 10:28:38 -08:00
Matei Zaharia adaba4d550 Removed old slf4j jars that came with Hadoop 2011-03-01 10:28:21 -08:00
Matei Zaharia 447debb771 Updated Hadoop to 0.20.2 to include some bug fixes 2011-03-01 10:27:48 -08:00