Commit graph

496 commits

Author SHA1 Message Date
Mosharaf Chowdhury 495b38658e Merge branch 'master' into mos-bt 2011-02-09 10:40:23 -08:00
Matei Zaharia d3df963a13 Brought in some reorganization of build file from Hive branch 2011-02-08 21:27:36 -08:00
Matei Zaharia e8df4bbd40 Added more SBT stuff to gitignore 2011-02-08 17:06:07 -08:00
Matei Zaharia 26b77aece9 Increased SBT mem to 700 MB so that unit tests run more nicely 2011-02-08 17:03:28 -08:00
Matei Zaharia 99f3f23efa Changed default shuffle to LocalFileShuffle because it's way faster for small files 2011-02-08 17:03:03 -08:00
Matei Zaharia f4f7aa2ab2 formatting 2011-02-08 16:39:17 -08:00
Matei Zaharia ee60aaa0f5 Added a pointer to wiki in readme 2011-02-08 16:38:10 -08:00
Mosharaf Chowdhury a12c0b6c00 Implemented rarest-first block request policy. Need to test it though. 2011-02-08 14:56:39 -08:00
Mosharaf Chowdhury dfbc5af6ba Added the fake shuffle to git. 2011-02-05 12:32:16 -08:00
Mosharaf Chowdhury 9b6d4074f9 Updating hasBlocks variable inside synchronized block 2011-02-05 12:14:41 -08:00
Matei Zaharia c1c766a93c Updated readme 2011-02-02 19:21:49 -08:00
Matei Zaharia 50df43bf7b Added SBT target for building a single JAR with Spark Core and its
dependencies
2011-02-02 19:08:14 -08:00
Matei Zaharia a11fe23017 Moved examples to spark.examples package 2011-02-02 16:30:27 -08:00
Matei Zaharia 82170608b1 Added IntelliJ's build directory to gitignore 2011-02-02 00:30:29 -08:00
Matei Zaharia ec28b607fd Merge branch 'master' into sbt
Conflicts:
	Makefile
	core/src/main/java/spark/compress/lzf/LZF.java
	core/src/main/java/spark/compress/lzf/LZFInputStream.java
	core/src/main/java/spark/compress/lzf/LZFOutputStream.java
	core/src/main/native/spark_compress_lzf_LZF.c
	run
2011-02-02 00:25:54 -08:00
Matei Zaharia 7f74ee99f6 Added support for IntelliJ IDEA 2011-02-02 00:08:13 -08:00
Matei Zaharia e5c4cd8a5e Made examples and core subprojects 2011-02-01 15:11:08 -08:00
Mosharaf Chowdhury 9a6110fd99 Bug fix: a reducer never returns until its shuffleConsumer thread joins. 2011-01-25 20:01:31 -08:00
Mosharaf Chowdhury b9b461e78f Updating hasSplits variable inside a synchronized block now. This was causing a concurrency bug. However, this is a hacky solution and should be fixed. 2011-01-22 18:03:38 -08:00
Mosharaf Chowdhury 888a584434 Bug fix in a log statement 2011-01-20 17:01:53 -08:00
Mosharaf Chowdhury 3ab7ddd2e6 Bug fixes + updated limit connections shuffle strategy to handle endgame situation.
After the last commit the bottleneck shifted from "requesting to tracker for mappers" (now done in batches) to "notifying tracker when threads leave" (done individually)
2011-01-14 01:50:21 -08:00
Mosharaf Chowdhury c71124ed3d Revamped tracker interface. Now tracker can send back multiple responses at the same time.
Turned OFF timers in the reducers due to inconsistent behavior (sometimes they fire, sometimes they don't)
2011-01-14 00:08:58 -08:00
Mosharaf Chowdhury 36b21813fa Implemented a tracker strategy that allows reducers to create concurrent connections proportional to their time remaining. Which connections to create is random though. 2011-01-13 21:02:53 -08:00
Mosharaf Chowdhury 025b5485b7 Changed speed estimation and time remaining variables to Double instead of Int. 2011-01-13 19:41:04 -08:00
Mosharaf Chowdhury 5bf6369220 Added a tracker strategy that selects random mappers for reducers. This can be used to measure tracker overhead. 2011-01-13 19:22:07 -08:00
Mosharaf Chowdhury fceae31877 Bug fix in selectSuitableSources: was changing the index value before updating a bitset using that same index variable. 2011-01-10 20:58:59 -08:00
Mosharaf Chowdhury fd3fd37383 In-memory version of tracker+blocked shuffle checked in. 2011-01-10 17:13:52 -08:00
Mosharaf Chowdhury d7081a927f Updated conf/java-opts 2011-01-09 17:19:26 -08:00
Mosharaf Chowdhury de676428ac Merge branch 'mos-shuffle-tracked' of git@github.com:mesos/spark into mos-shuffle-tracked 2011-01-07 01:44:32 -08:00
Mosharaf Chowdhury ba74169162 Dummy log line added in SparkContext for #45 2011-01-06 21:21:35 -08:00
Mosharaf Chowdhury e76b4f7155 Turned OFF some log messages due to #45. 2011-01-06 17:07:14 -08:00
Mosharaf Chowdhury f5ba0b49c6 Bug fix + default config param values update.
Updated some default config parameters for broadcast parameters.
Fixed a concurrency bug in the variable that keeps track of how many receivers have finished.
2011-01-06 15:44:24 -08:00
Matei Zaharia 15829df799 Updated simple skewed group-by to generate multiple keys with each
hashcode so that Mosharaf's current block implementation works
2011-01-05 22:31:50 -05:00
Mosharaf Chowdhury 6908fa6c31 Updated BalanceRemaining strategy for a better faster-reducers identification mechanism. 2011-01-05 19:19:52 -08:00
Mosharaf Chowdhury 69845d5648 trackerStrategy can now only be accessed by one thread at a time.
Previously, synchronized were around individual methods which were working on shared variables and two such methods could be accessed by different threads simultaneously. A source of possible concurrency issues.
2011-01-05 12:18:02 -08:00
Mosharaf Chowdhury a2c7caeac8 Bug fixes, optimizations, changes to ShuffleClient/Tracker interface and updated BalanceRemainingShuffleTrackerStrategy.
There are still 1 or 10 bugs in tracker implementations. They are normally found in the last remaining clients.
 - Sometimes ShuffleClients just don't do anything for a while.
 - Sometimes the last ShuffleClient or 2 block on reading block size from ShuffleServer and the program does not proceed at all. This happens for larger shuffle (500000 keys)
2011-01-05 01:58:37 -08:00
Matei Zaharia c251b5e4d6 Added a skewed reduce test where one reducer gets more keys than the
others.
2011-01-04 17:20:13 -05:00
Mosharaf Chowdhury b35ea63c2e Updated BalanceRemainingShuffleTrackerStrategy. Better divisioning of slower and faster reducers, instead of dividing between the fastest and the rest. 2011-01-03 23:12:46 -08:00
Mosharaf Chowdhury 7eb334d97c Bug fix: splitsInRequestBitVector(splitIndex) was wrongly set to false after receiving just one block in Blocked implementations that receive multiple blocks at a time. 2011-01-03 19:05:52 -08:00
Mosharaf Chowdhury 07e778d7fa - Updated Reducer-Tracker communication protocol.
- Implemented a new tracker strategy for shuffle where, if a reducer is too fast, it is stalled until the others catch up. Basic version is working, but more work is necessary.
2011-01-02 23:49:56 -08:00
Mosharaf Chowdhury 33d59fb206 TrackedCustomBlockedLocalFileShuffle has also been updated. 2010-12-30 18:24:12 -08:00
Mosharaf Chowdhury a30f03eae6 CustomBlockedInMemoryShuffle is receiving multiple blocks after connecting to a mapper instead of just one. 2010-12-30 18:12:32 -08:00
Mosharaf Chowdhury b566be47d7 Bug fix/update: All the shuffle implementations are using consistent config parameters. 2010-12-30 17:52:01 -08:00
Mosharaf Chowdhury 4545df67cf Consumption is delayed until everything has been received. Otherwise it interferes with network performance. 2010-12-30 17:10:20 -08:00
Mosharaf Chowdhury 1e26fb3953 CustomBlockedLocalFileShuffle: reducers are reading multiple blocks per connection instead of just one.
Sometimes ShuffleServer fails to start for small shuffle data with small block size in local VM. No problem with large block size.
2010-12-30 13:33:34 -08:00
Mosharaf Chowdhury fb51df0b5b Added a skewed shuffle test example.
Output per mapper is distributed from 1/numMappers to 1 of numKVPairs.
2010-12-29 13:50:43 -08:00
Mosharaf Chowdhury 2fefbe17e4 TreeBroadcast is an extended version of ChainedBroadcast with customizable maxDegree per node. maxDegree = 1 is ChainedBroadcast. 2010-12-28 20:18:49 -08:00
Mosharaf Chowdhury 5540a99ab7 ChainedBroadcast is also reading masterHostAddress from config file until #42 is resolved. 2010-12-28 18:53:59 -08:00
Mosharaf Chowdhury b23d337c92 Updating reception stats before consuming. Can create trouble if there is any exception during consumption (less likely), but this frees up splits that threads can connect to instead of idling around. 2010-12-28 16:08:40 -08:00
Mosharaf Chowdhury 5074e8500a - Implemented TrackedCustomBlockedLocalFileShuffle.
- Fixed several bugs. (Copy-paste is the bane of coding :|)
2010-12-28 15:28:43 -08:00