Commit graph

341 commits

Author SHA1 Message Date
Ankur Dave 19122af787 Update ShortestPath to work with controllable partitioning 2011-05-03 15:39:39 -07:00
Ankur Dave 45ec9db8af Add Bagel classpath to run script 2011-05-03 15:39:21 -07:00
Ankur Dave 62ef620354 Clean up Pregel.run, add logging 2011-05-03 15:38:01 -07:00
Ankur Dave c0736f6f68 Add Bagel, an implementation of Pregel on Spark 2011-05-03 15:37:08 -07:00
Ankur Dave a4c04f3f6f Error handling for disk I/O in DiskSpillingCache
Also renamed the property spark.DiskSpillingCache.cacheDir to spark.diskSpillingCache.cacheDir in order to follow conventions.
2011-04-27 23:23:29 -07:00
Ankur Dave 12ff0d2dc3 Bring an entry back into memory after fetching it from disk 2011-04-27 22:59:05 -07:00
Ankur Dave e30313aa2c Added DiskSpillingCache
DiskSpillingCache is a BoundedMemoryCache that spills entries to disk
when it runs out of space. Currently the implementation is very
simple. In particular, it's missing the following features:

- Error handling for disk I/O, including checking of disk space levels
- Bringing an entry back into memory after fetching it from disk

In addition, here are some features that aren't critical but should be
implemented soon:

- Spilling based on a user-set priority in addition to LRU
- Caching into a subdirectory of spark.DiskSpillingCache.cacheDir
  rather than the root directory
2011-04-27 22:32:35 -07:00
Mosharaf Chowdhury 9d2d533493 Temporary fix for issue #42. 2011-04-21 17:40:26 -07:00
Timothy Hunter 5c9535228a fixed small bug when classpath has some strange formatting 2011-04-18 17:12:29 -07:00
Matei Zaharia 94ba95bcb2 Added flatMapValues 2011-04-12 19:51:58 -07:00
Matei Zaharia d840fa8d0c Merge remote branch 'origin/custom-serialization' into new-rdds 2011-03-09 00:40:07 -08:00
root ff5b13799a Some tweaks to make Kryo cache work better 2011-03-09 03:31:50 -05:00
Matei Zaharia 7febdfbe29 Better reuse of buffers in Kryo serialization 2011-03-08 12:36:36 -08:00
Matei Zaharia 8ee3ec29ee Merge remote branch 'origin/custom-serialization' into new-rdds 2011-03-08 11:58:19 -08:00
Matei Zaharia 7408230bfa Updated modified Kryo to use objenesis 2011-03-08 11:58:08 -08:00
Matei Zaharia ab1216cb14 Register None and Nil properly 2011-03-08 11:52:58 -08:00
Matei Zaharia d39f5dd15e Merge remote branch 'origin/custom-serialization' into new-rdds 2011-03-08 10:28:50 -08:00
Matei Zaharia 4f0d0a7b73 stuff 2011-03-08 10:28:26 -08:00
Matei Zaharia 8b6f3db415 Merge remote branch 'origin/custom-serialization' into new-rdds 2011-03-07 19:20:28 -08:00
Matei Zaharia 38f6bce33d Added SerializingCache 2011-03-07 19:16:24 -08:00
Matei Zaharia 6316c7979d Remove some logging 2011-03-07 18:56:36 -08:00
Matei Zaharia e7b4b047a6 Added pluggable serializers and Kryo serialization 2011-03-07 18:41:53 -08:00
Matei Zaharia 467f056e29 Remove commented code 2011-03-06 23:38:41 -08:00
Matei Zaharia bce95b8458 Finished cogroup stuff 2011-03-06 23:38:16 -08:00
Matei Zaharia 04c2d6a60c stuff 2011-03-06 19:27:03 -08:00
Matei Zaharia 0fb691dd28 Various fixes to get MesosScheduler working with new RDDs 2011-03-06 16:16:38 -08:00
Matei Zaharia 1df5a65a01 Pass cache locations correctly to DAGScheduler. 2011-03-06 12:16:38 -08:00
Matei Zaharia e1436f1eaa Merge remote branch 'origin/master' into new-rdds 2011-03-06 11:11:47 -08:00
Matei Zaharia 370b95816f Added sampling for large arrays in SizeEstimator 2011-03-06 11:11:20 -08:00
Matei Zaharia a789e9aaea Merge remote branch 'origin/master' into new-rdds 2011-03-01 10:33:37 -08:00
Matei Zaharia 021c50a8d4 Remove unnecessary lock which was there to work around a bug in Configuration in Hadoop 0.20.0 2011-03-01 10:28:38 -08:00
Matei Zaharia adaba4d550 Removed old slf4j jars that came with Hadoop 2011-03-01 10:28:21 -08:00
Matei Zaharia 447debb771 Updated Hadoop to 0.20.2 to include some bug fixes 2011-03-01 10:27:48 -08:00
Matei Zaharia 9e59afd710 More work on new RDD design 2011-02-27 19:15:52 -08:00
Matei Zaharia f38f86d59e More stuff 2011-02-27 14:27:12 -08:00
Matei Zaharia 2e6023f2bf stuff 2011-02-26 23:41:44 -08:00
Matei Zaharia 309367c477 Initial work towards new RDD design 2011-02-26 23:15:33 -08:00
Matei Zaharia dc24aecd8f Close record readers in HadoopFile after finishing a split 2011-02-10 12:07:48 -08:00
Matei Zaharia 62f1c6f5a8 Remove build.properties from version control 2011-02-09 11:52:56 -08:00
Matei Zaharia d3df963a13 Brought in some reorganization of build file from Hive branch 2011-02-08 21:27:36 -08:00
Matei Zaharia e8df4bbd40 Added more SBT stuff to gitignore 2011-02-08 17:06:07 -08:00
Matei Zaharia 26b77aece9 Increased SBT mem to 700 MB so that unit tests run more nicely 2011-02-08 17:03:28 -08:00
Matei Zaharia 99f3f23efa Changed default shuffle to LocalFileShuffle because it's way faster for small files 2011-02-08 17:03:03 -08:00
Matei Zaharia f4f7aa2ab2 formatting 2011-02-08 16:39:17 -08:00
Matei Zaharia ee60aaa0f5 Added a pointer to wiki in readme 2011-02-08 16:38:10 -08:00
Matei Zaharia c1c766a93c Updated readme 2011-02-02 19:21:49 -08:00
Matei Zaharia 50df43bf7b Added SBT target for building a single JAR with Spark Core and its dependencies 2011-02-02 19:08:14 -08:00
Matei Zaharia a11fe23017 Moved examples to spark.examples package 2011-02-02 16:30:27 -08:00
Matei Zaharia 82170608b1 Added IntelliJ's build directory to gitignore 2011-02-02 00:30:29 -08:00
Matei Zaharia ec28b607fd Merge branch 'master' into sbt
Conflicts:
	Makefile
	core/src/main/java/spark/compress/lzf/LZF.java
	core/src/main/java/spark/compress/lzf/LZFInputStream.java
	core/src/main/java/spark/compress/lzf/LZFOutputStream.java
	core/src/main/native/spark_compress_lzf_LZF.c
	run
2011-02-02 00:25:54 -08:00