Mosharaf Chowdhury
ba71b61e40
Reading masterHostAddress from config file until #42 has been resolved.
2010-12-25 10:22:22 -08:00
Mosharaf Chowdhury
c1ff210387
Fixed some comments.
2010-12-24 20:05:00 -08:00
Mosharaf Chowdhury
8dc44bfa96
CustomBlockedInMemoryShuffle is an in-memory implementation of CustomBlockedLFS
2010-12-22 21:06:03 -08:00
Mosharaf Chowdhury
a064835808
CustomBlockedLocalFileShuffle has been added. This is essentially ManualBlockedLocalFileShuffle with our servers.
2010-12-22 19:02:20 -08:00
Mosharaf Chowdhury
3447f903da
Renamed CustomBlockedLocalFileShuffle to ManualBlockedLocalFileShuffle.
...
There will be a new CustomBlockedLocalFileShuffle where 'Custom' will mean ManualBlockedLocalFileShuffle with a custom server instead of Jetty.
2010-12-22 17:17:33 -08:00
Mosharaf Chowdhury
c484b735bb
Bug squashed. CustomParallelInMemoryShuffle is rocking!
...
We were serializing one (the wrong) thing, trying to deserialize another (the right thing).
2010-12-22 17:03:31 -08:00
Mosharaf Chowdhury
23586d3bef
Added an in-memory implementation of CustomParallelLFS. There is a serialization/deserialization bug in the implementation.
2010-12-22 16:45:26 -08:00
Mosharaf Chowdhury
c4c8f72e98
Fixed an indexing bug in HttpBlockedLocalFileShuffle. It still doesn't work on EC2 clusters with more than 5 nodes.
2010-12-22 12:48:11 -08:00
Mosharaf Chowdhury
a5a8b7048d
CustomBlockedLocalFileShuffle has a separate consumer thread.
2010-12-22 12:04:12 -08:00
Mosharaf Chowdhury
92d2a9a13a
Removed unnecessary stuff from HttpParallelLocalFileShuffle
2010-12-22 11:28:50 -08:00
Mosharaf Chowdhury
4ab268ee36
HttpParallelLocalFileShuffle also has a consuming thread. It works on EC2.
2010-12-21 23:50:02 -08:00
Mosharaf Chowdhury
5f7bfbc70e
HttpBlockedLocalFileShuffle has also been converted to have a per-reducer consumption thread. Works in local mesos, but NOT on EC2 :|
2010-12-21 23:05:32 -08:00
Mosharaf Chowdhury
5f0cdabd40
Added a separate thread to deserialize (1 thread per reducer) in CustomParallelLocalFileShuffle
...
Upside: No synchronized blocking on "combiners" variable. 3x faster :)
Downside: Inefficient implementation. Requiring too much temporary data. Approx. 2x increase in memory requirement :( Should be fixed at some point.
2010-12-21 21:52:37 -08:00
Mosharaf Chowdhury
f4d0e917a2
Added all the options to the java-opts file. Tired of writing them for separate runs :|
2010-12-21 18:59:51 -08:00
Mosharaf Chowdhury
6ef17e918b
Fixed logging. Again.
2010-12-21 18:49:35 -08:00
Mosharaf Chowdhury
f47fb44479
- Divided maxConnections into max[Rx|Tx]Connections.
...
- Fixed config param loading bug in CustomParallelLFS
2010-12-21 17:34:51 -08:00
Mosharaf Chowdhury
d92b067350
Fixed a log message in CustomParallelLocalFileShuffle that was causing problems in log processing.
2010-12-21 13:12:15 -08:00
Mosharaf Chowdhury
3b21a5fb26
Code formatting...
2010-12-19 18:03:20 -08:00
Mosharaf Chowdhury
81f78282e1
All shuffle implementations are now in the same place. Time to work on new things.
2010-12-19 14:32:40 -08:00
Mosharaf Chowdhury
272c72b405
Merge branch 'mos-shuffle' into mos-shuffle-parallel
...
Conflicts:
conf/java-opts
src/scala/spark/BasicLocalFileShuffle.scala
2010-12-19 14:25:13 -08:00
Mosharaf Chowdhury
ca37e7b33d
Renamed CustomParallelLocalFileShuffle
2010-12-19 14:22:05 -08:00
Mosharaf Chowdhury
864d202cda
Merge branch 'mos-shuffle-parallel-http' into mos-shuffle
...
Conflicts:
conf/java-opts
src/scala/spark/BlockedLocalFileShuffle.scala
src/scala/spark/CustomBlockedLocalFileShuffle.scala
src/scala/spark/HttpBlockedLocalFileShuffle.scala
2010-12-19 14:08:39 -08:00
Mosharaf Chowdhury
89172fcd69
Renamed this version of BlockedLocalFileShuffle to CustomBlockedLocalFileShuffle.
2010-12-19 14:05:35 -08:00
Mosharaf Chowdhury
a83a722256
Renamed BlockedLocalFileShuffle to HttpBlockedLocalFileShuffle for merging with the mos-shuffle branch.
2010-12-19 14:02:19 -08:00
Mosharaf Chowdhury
62d61ed928
- Reimplemented BlockedLocalFileShuffle without creating too many files.
...
- Clients now request byte ranges from the server using an INDEX file.
2010-12-18 14:03:49 -08:00
Mosharaf Chowdhury
5c5d767bc1
Modified MultiBroadcastTest.
2010-12-18 10:40:00 -08:00
Mosharaf Chowdhury
d18d08ec9d
Added a new BroadcastTest in the examples where 2 broadcasts are required. Should be used to experiment with how multiple broadcasts work.
2010-12-17 10:43:49 -08:00
Mosharaf Chowdhury
e30fdeb025
Updated GroupByKey example.
2010-12-16 20:30:18 -08:00
Mosharaf Chowdhury
a40cbc1904
Code formatting.
2010-12-16 16:54:02 -08:00
Mosharaf Chowdhury
ce96d8a7d3
First version of BlockedLocalFileShuffle is in. It works!
2010-12-16 15:15:51 -08:00
Mosharaf Chowdhury
fddcdf87c9
Added a small description of how ParallelLFS works.
2010-12-16 11:58:00 -08:00
Mosharaf Chowdhury
77a4017585
Fixed config param naming in ParallelLocalFileShuffle
2010-12-16 11:42:37 -08:00
Mosharaf Chowdhury
c5483e39f9
- ParallelLocalFileShuffle does NOT use HttpPipelining at all.
...
- Config option related to pipelining has been removed.
- Summary: Basic -> Pipelining / Parallel -> NO pipelining
2010-12-15 22:08:34 -08:00
Mosharaf Chowdhury
56d8a2afa1
- Updated java-opts file of this branch.
...
- Renamed some ParallelLocalFileShuffle config options for clarity.
2010-12-15 20:56:22 -08:00
Mosharaf Chowdhury
25fb3c4cf6
- Brought back Matei's LocalFileShuffle implementation as BasicLocalFileShuffle
...
- Renamed parallel-pull version to ParallelLocalFileShuffle
- Note that setting max-concurrent connections to 1 in ParallelLocalFileShuffle should essentially be the same as BasicLocalFileShuffle
2010-12-15 20:33:28 -08:00
Matei Zaharia
817e722321
Merge branch 'master' of github.com:mesos/spark
2010-12-15 19:40:35 -08:00
Matei Zaharia
14c29c1b14
Fixed import
2010-12-15 19:40:27 -08:00
Mosharaf Chowdhury
5cafdd7ba2
Removed some unused imports from Broadcast.scala
2010-12-15 19:11:23 -08:00
Mosharaf Chowdhury
be0ce57de2
- Fixed a compilation error due to wrong 'import' of legacy lzf libraries in DfsBroadcast.scala
...
- Updated to use ning libraries.
- Passes all unit tests
2010-12-15 18:34:27 -08:00
Matei Zaharia
5c222dbe28
Merge branch 'master' into mos-bt
...
Conflicts:
src/scala/spark/Broadcast.scala
2010-12-15 10:57:39 -08:00
Mosharaf Chowdhury
0a5c24ae3d
- Default broadcast mechanism is set to DfsBroadcast
...
- Configuration parameters are renamed to follow our convention
- Master now automatically supplies its hostAddress instead of reading from config file
- sendBroadcast has been removed from the Broadcast trait
2010-12-13 14:36:39 -08:00
Timothy Hunter
34395730db
Someone forgot to pass the parameters: fixes SPARK_MEM set from main script but not passed to executor.
2010-12-12 13:30:49 -08:00
Matei Zaharia
0d895ba636
Added BSD license
2010-12-07 10:32:17 -08:00
Mosharaf Chowdhury
06dc4a5148
- Removed config files from git's control.
...
- Changed DfsShuffle to default in RDD.scala.
2010-12-07 10:17:47 -08:00
Mosharaf Chowdhury
f82cc17bc5
UseHttpPipelining option is brought back in. It works!
2010-12-07 10:07:30 -08:00
Joshua Hartman
799c1b19f5
Adding license file for compress-lzf
2010-12-07 08:30:29 -08:00
Joshua Hartman
2fb849502f
Replacing the native lzf compression code with the ning open-source compress-lzf library. (Apache 2.0 license)
2010-12-05 21:20:15 -08:00
Mosharaf Chowdhury
7e2d72c328
Multiple connections created at a time. No upper limit on the server side though.
2010-12-04 18:55:55 -08:00
Mosharaf Chowdhury
c6df327dd7
Updated logging format.
2010-12-04 16:41:13 -08:00
Mosharaf Chowdhury
7df20d681a
Combined MaxRxPeers and MaxTxPeers into a single config parameter MaxConnections
2010-12-04 14:37:16 -08:00