Commit graph

23182 commits

Author SHA1 Message Date
Mosharaf Chowdhury c484b735bb Bug squashed. CustomParallelInMemoryShuffle is rocking!
We were serializing one (the wrong) thing, trying to deserialize another (the right thing).
2010-12-22 17:03:31 -08:00
Mosharaf Chowdhury 23586d3bef Added an in-memory implementation of CustomParallelLFS. There is a serialization/deserialization bug in the implementation. 2010-12-22 16:45:26 -08:00
Mosharaf Chowdhury c4c8f72e98 Fixed an indexing bug in HttpBlockedLocalFileShuffle. It still doesn't work on EC2 clusters with >5 nodes. 2010-12-22 12:48:11 -08:00
Mosharaf Chowdhury a5a8b7048d CustomBlockedLocalFileShuffle has separate consumer thread. 2010-12-22 12:04:12 -08:00
Mosharaf Chowdhury 92d2a9a13a Removed unnecessary stuff from HttpParallelLocalFileShuffle 2010-12-22 11:28:50 -08:00
Mosharaf Chowdhury 4ab268ee36 HttpParallelLocalFileShuffle also has a consuming thread. It works on EC2. 2010-12-21 23:50:02 -08:00
Mosharaf Chowdhury 5f7bfbc70e HttpBlockedLocalFileShuffle has also been converted to have per-reducer consumption thread. Works in local mesos, but NOT on EC2 :| 2010-12-21 23:05:32 -08:00
Mosharaf Chowdhury 5f0cdabd40 Added a separate thread to deserialize (1 thread per reducer) in CustomParallelLocalFileShuffle
Upside: No synchronized blocking on "combiners" variable. 3x faster :)
Downside: Inefficient implementation. Requiring too much temporary data. Approx. 2x increase in memory requirement :( Should be fixed at some point.
2010-12-21 21:52:37 -08:00
Mosharaf Chowdhury f4d0e917a2 Added all the options to the java-opts file. Tired of writing them for separate runs :| 2010-12-21 18:59:51 -08:00
Mosharaf Chowdhury 6ef17e918b Fixed logging. Again. 2010-12-21 18:49:35 -08:00
Mosharaf Chowdhury f47fb44479 - Divided maxConnections to max[Rx|Tx]Connections.
- Fixed config param loading bug in CustomParallelLFS
2010-12-21 17:34:51 -08:00
Mosharaf Chowdhury d92b067350 Fixed log message in CustomParallelLocalFileShuffle that was giving some problem in log processing. 2010-12-21 13:12:15 -08:00
Mosharaf Chowdhury 3b21a5fb26 Code formatting... 2010-12-19 18:03:20 -08:00
Mosharaf Chowdhury 81f78282e1 All shuffle implementations are now in the same place. Time to work on new things. 2010-12-19 14:32:40 -08:00
Mosharaf Chowdhury 272c72b405 Merge branch 'mos-shuffle' into mos-shuffle-parallel
Conflicts:
	conf/java-opts
	src/scala/spark/BasicLocalFileShuffle.scala
2010-12-19 14:25:13 -08:00
Mosharaf Chowdhury ca37e7b33d Renamed CustomParallelLocalFileShuffle 2010-12-19 14:22:05 -08:00
Mosharaf Chowdhury 864d202cda Merge branch 'mos-shuffle-parallel-http' into mos-shuffle
Conflicts:
	conf/java-opts
	src/scala/spark/BlockedLocalFileShuffle.scala
	src/scala/spark/CustomBlockedLocalFileShuffle.scala
	src/scala/spark/HttpBlockedLocalFileShuffle.scala
2010-12-19 14:08:39 -08:00
Mosharaf Chowdhury 89172fcd69 Renamed this version of BlockedLocalFileShuffle to CustomBlockedLocalFileShuffle. 2010-12-19 14:05:35 -08:00
Mosharaf Chowdhury a83a722256 Renamed BlockedLocalFileShuffle to HttpBlockedLocalFileShuffle for merging with the mos-shuffle branch. 2010-12-19 14:02:19 -08:00
Mosharaf Chowdhury 62d61ed928 - Reimplemented BlockedLocalFileShuffle without creating too many files.
- Clients now request byte ranges from the server using an INDEX file.
2010-12-18 14:03:49 -08:00
Mosharaf Chowdhury 5c5d767bc1 Modified MultiBroadcastTest. 2010-12-18 10:40:00 -08:00
Mosharaf Chowdhury d18d08ec9d Added a new BroadcastTest in the examples where 2 broadcasts are required. Should be used to experiment with how multiple broadcasts work. 2010-12-17 10:43:49 -08:00
Mosharaf Chowdhury e30fdeb025 Updated GroupByKey example. 2010-12-16 20:30:18 -08:00
Mosharaf Chowdhury a40cbc1904 Code formatting. 2010-12-16 16:54:02 -08:00
Mosharaf Chowdhury ce96d8a7d3 First version of BlockedLocalFileShuffle is in. It works! 2010-12-16 15:15:51 -08:00
Mosharaf Chowdhury fddcdf87c9 Added a small description of how ParallelLFS works. 2010-12-16 11:58:00 -08:00
Mosharaf Chowdhury 77a4017585 Fixed config param naming in ParallelLocalFileShuffle 2010-12-16 11:42:37 -08:00
Mosharaf Chowdhury c5483e39f9 - ParallelLocalFileShuffle does NOT use HttpPipelining at all.
- Config option related to pipelining has been removed.
 - Summary: Basic -> Pipelining / Parallel -> NO pipelining
2010-12-15 22:08:34 -08:00
Mosharaf Chowdhury 56d8a2afa1 - Updated java-opts file of this branch.
- Renamed some ParallelLocalFileShuffle config options for clarity.
2010-12-15 20:56:22 -08:00
Mosharaf Chowdhury 25fb3c4cf6 - Brought back Matei's LocalFileShuffle implementation as BasicLocalFileShuffle
- Renamed parallel-pull version to ParallelLocalFileShuffle
 - Note that setting max-concurrent connections to 1 in ParallelLocalFileShuffle should essentially be the same as BasicLocalFileShuffle
2010-12-15 20:33:28 -08:00
Matei Zaharia 817e722321 Merge branch 'master' of github.com:mesos/spark 2010-12-15 19:40:35 -08:00
Matei Zaharia 14c29c1b14 Fixed import 2010-12-15 19:40:27 -08:00
Mosharaf Chowdhury 5cafdd7ba2 Removed some unused imports from Broadcast.scala 2010-12-15 19:11:23 -08:00
Mosharaf Chowdhury be0ce57de2 - Fixed a compilation error due to wrong 'import' of legacy lzf libraries in DfsBroadcast.scala
- Updated to use ning libraries.
 - Passes all unit tests
2010-12-15 18:34:27 -08:00
Matei Zaharia 5c222dbe28 Merge branch 'master' into mos-bt
Conflicts:
	src/scala/spark/Broadcast.scala
2010-12-15 10:57:39 -08:00
Mosharaf Chowdhury 0a5c24ae3d - Default broadcast mechanism is set to DfsBroadcast
- Configuration parameters are renamed to follow our convention
 - Master now automatically supplies its hostAddress instead of reading from config file
 - sendBroadcast has been removed from the Broadcast trait
2010-12-13 14:36:39 -08:00
Timothy Hunter 34395730db Someone forgot to pass the parameters: fixes SPARK_MEM set from main script but not passed to executor. 2010-12-12 13:30:49 -08:00
Matei Zaharia 0d895ba636 Added BSD license 2010-12-07 10:32:17 -08:00
Mosharaf Chowdhury 06dc4a5148 - Removed config files from git's control.
- Changed DfsShuffle to default in RDD.scala.
2010-12-07 10:17:47 -08:00
Mosharaf Chowdhury f82cc17bc5 UseHttpPipelining option is brought back in. It works! 2010-12-07 10:07:30 -08:00
Joshua Hartman 799c1b19f5 Adding license file for compress-lzf 2010-12-07 08:30:29 -08:00
Joshua Hartman 2fb849502f Replacing the native lzf compression code with the ning open-source compress-lzf library. (Apache 2.0 license) 2010-12-05 21:20:15 -08:00
Mosharaf Chowdhury 7e2d72c328 Multiple connections created at a time. No upper limit on the server side though. 2010-12-04 18:55:55 -08:00
Mosharaf Chowdhury c6df327dd7 Updated logging format. 2010-12-04 16:41:13 -08:00
Mosharaf Chowdhury 7df20d681a Combined MaxRxPeers and MaxTxPeers to a single config parameter MaxConnections 2010-12-04 14:37:16 -08:00
Mosharaf Chowdhury b1745b3103 Removed an unnecessary byte array in the middle. Probably will have to bring it back if we do block level data movement. 2010-12-04 13:55:25 -08:00
Mosharaf Chowdhury 3a671ce989 Config parameters are in place. Good to go (I think) 2010-12-04 10:59:06 -08:00
Mosharaf Chowdhury 476a216d9d Parallel is working. Need to fix/finalize some config parameters. 2010-12-04 02:05:41 -08:00
Mosharaf Chowdhury c546c299bc Combining is happening inside the thread. It's still synchronized though. 2010-12-04 00:59:25 -08:00
Mosharaf Chowdhury 0d7ca7751e Bug fixes. Not yet parallel. 2010-12-04 00:06:47 -08:00