Commit graph

10925 commits

Author SHA1 Message Date
Matei Zaharia 8b6f3db415 Merge remote branch 'origin/custom-serialization' into new-rdds 2011-03-07 19:20:28 -08:00
Matei Zaharia 38f6bce33d Added SerializingCache 2011-03-07 19:16:24 -08:00
Matei Zaharia 6316c7979d Remove some logging 2011-03-07 18:56:36 -08:00
Matei Zaharia e7b4b047a6 Added pluggable serializers and Kryo serialization 2011-03-07 18:41:53 -08:00
Matei Zaharia 467f056e29 Remove commented code 2011-03-06 23:38:41 -08:00
Matei Zaharia bce95b8458 Finished cogroup stuff 2011-03-06 23:38:16 -08:00
Matei Zaharia 04c2d6a60c stuff 2011-03-06 19:27:03 -08:00
Matei Zaharia 0fb691dd28 Various fixes to get MesosScheduler working with new RDDs 2011-03-06 16:16:38 -08:00
Matei Zaharia 1df5a65a01 Pass cache locations correctly to DAGScheduler. 2011-03-06 12:16:38 -08:00
Matei Zaharia e1436f1eaa Merge remote branch 'origin/master' into new-rdds 2011-03-06 11:11:47 -08:00
Matei Zaharia 370b95816f Added sampling for large arrays in SizeEstimator 2011-03-06 11:11:20 -08:00
Matei Zaharia a789e9aaea Merge remote branch 'origin/master' into new-rdds 2011-03-01 10:33:37 -08:00
Matei Zaharia 021c50a8d4 Remove unnecessary lock which was there to work around a bug in Configuration in Hadoop 0.20.0 2011-03-01 10:28:38 -08:00
Matei Zaharia adaba4d550 Removed old slf4j jars that came with Hadoop 2011-03-01 10:28:21 -08:00
Matei Zaharia 447debb771 Updated Hadoop to 0.20.2 to include some bug fixes 2011-03-01 10:27:48 -08:00
Matei Zaharia 9e59afd710 More work on new RDD design 2011-02-27 19:15:52 -08:00
Matei Zaharia f38f86d59e More stuff 2011-02-27 14:27:12 -08:00
Matei Zaharia 2e6023f2bf stuff 2011-02-26 23:41:44 -08:00
Matei Zaharia 309367c477 Initial work towards new RDD design 2011-02-26 23:15:33 -08:00
Mosharaf Chowdhury 0416cc22d2 Picking peers weighted by the number of rare blocks they have. A block is rare if there are at most 2 copies in the neighborhood. Better number can be used (some function of neighborhood size) 2011-02-15 16:27:44 -08:00
Mosharaf Chowdhury cf81da9485 Optimization: Master sends out at least one copy of each block first regardless of whatever a client is asking for. Once one copy of each block is out, Master then responds to specific blocks from individual receivers. 2011-02-14 15:08:33 -08:00
Mosharaf Chowdhury 2b946fb2d1 pickBlockRarestFirst and gossips commented OUT for now.
Problem with the rarestFirst implementation is that we are picking peers randomly first and then picking blocks from the random peer using rarestFirst. NOT the right way to do it. It should be the other way around.
Problem with gossip is that peers might end up overwriting newer information with older information. To fix that we either have to have timestamps or must match the bitVectors before overwriting.
2011-02-13 13:53:15 -08:00
Mosharaf Chowdhury ca2895ebb0 Fix in rarestFirst implementation.
If there is more than one rarest block, pick randomly among them (was deterministic before)
2011-02-10 20:37:44 -08:00
Mosharaf Chowdhury 520bbdc7e3 Peers now gossip about their neighbors when they talk. 2011-02-10 20:15:30 -08:00
Matei Zaharia dc24aecd8f Close record readers in HadoopFile after finishing a split 2011-02-10 12:07:48 -08:00
Mosharaf Chowdhury 8034728afc Merge branch 'master' into mos-bt 2011-02-09 15:30:41 -08:00
Mosharaf Chowdhury 441462bc7f Fixed some warnings during compilation. 2011-02-09 12:11:43 -08:00
Matei Zaharia 62f1c6f5a8 Remove build.properties from version control 2011-02-09 11:52:56 -08:00
Mosharaf Chowdhury 1a73c0d265 Merged with master. Using sbt. 2011-02-09 10:48:48 -08:00
Mosharaf Chowdhury 495b38658e Merge branch 'master' into mos-bt 2011-02-09 10:40:23 -08:00
Matei Zaharia d3df963a13 Brought in some reorganization of build file from Hive branch 2011-02-08 21:27:36 -08:00
Matei Zaharia e8df4bbd40 Added more SBT stuff to gitignore 2011-02-08 17:06:07 -08:00
Matei Zaharia 26b77aece9 Increased SBT mem to 700 MB so that unit tests run more nicely 2011-02-08 17:03:28 -08:00
Matei Zaharia 99f3f23efa Changed default shuffle to LocalFileShuffle because it's way faster for small files 2011-02-08 17:03:03 -08:00
Matei Zaharia f4f7aa2ab2 formatting 2011-02-08 16:39:17 -08:00
Matei Zaharia ee60aaa0f5 Added a pointer to wiki in readme 2011-02-08 16:38:10 -08:00
Mosharaf Chowdhury a12c0b6c00 Implemented rarest-first block request policy. Need to test it though. 2011-02-08 14:56:39 -08:00
Mosharaf Chowdhury dfbc5af6ba Added the fake shuffle to git. 2011-02-05 12:32:16 -08:00
Mosharaf Chowdhury 9b6d4074f9 Updating hasBlocks variable inside synchronized block 2011-02-05 12:14:41 -08:00
Matei Zaharia c1c766a93c Updated readme 2011-02-02 19:21:49 -08:00
Matei Zaharia 50df43bf7b Added SBT target for building a single JAR with Spark Core and its dependencies 2011-02-02 19:08:14 -08:00
Matei Zaharia a11fe23017 Moved examples to spark.examples package 2011-02-02 16:30:27 -08:00
Matei Zaharia 82170608b1 Added IntelliJ's build directory to gitignore 2011-02-02 00:30:29 -08:00
Matei Zaharia ec28b607fd Merge branch 'master' into sbt
Conflicts:
	Makefile
	core/src/main/java/spark/compress/lzf/LZF.java
	core/src/main/java/spark/compress/lzf/LZFInputStream.java
	core/src/main/java/spark/compress/lzf/LZFOutputStream.java
	core/src/main/native/spark_compress_lzf_LZF.c
	run
2011-02-02 00:25:54 -08:00
Matei Zaharia 7f74ee99f6 Added support for IntelliJ IDEA 2011-02-02 00:08:13 -08:00
Matei Zaharia e5c4cd8a5e Made examples and core subprojects 2011-02-01 15:11:08 -08:00
Mosharaf Chowdhury 9a6110fd99 Bug fix: a reducer never returns until its shuffleConsumer thread joins. 2011-01-25 20:01:31 -08:00
Mosharaf Chowdhury b9b461e78f Updating hasSplits variable inside a synchronized block now. This was causing a concurrency bug. However, this is a hacky solution and should be fixed. 2011-01-22 18:03:38 -08:00
Mosharaf Chowdhury 888a584434 Bug fix in a log statement 2011-01-20 17:01:53 -08:00
Mosharaf Chowdhury 3ab7ddd2e6 Bug fixes + updated limit connections shuffle strategy to handle endgame situation.
After the last commit the bottleneck shifted from "requesting to tracker for mappers" (now done in batches) to "notifying tracker when threads leave" (done individually)
2011-01-14 01:50:21 -08:00