Matei Zaharia
af3c9d5042
Add Apache license headers and LICENSE and NOTICE files
2013-07-16 17:21:33 -07:00
Stephen Haberman
7dfb82a992
Replace old 'master' term with 'driver'.
2013-01-25 11:03:00 -06:00
Matei Zaharia
86057ec7c8
Merge branch 'master' into streaming
...
Conflicts:
core/src/main/scala/spark/api/python/PythonRDD.scala
2013-01-20 12:47:55 -08:00
Matei Zaharia
54c0f9f185
Fix code that assumed spark.local.dir is only a single directory
2013-01-17 17:40:55 -08:00
Fernand Pajot
742bc841ad
changed HttpBroadcast server cache to be in spark.local.dir instead of java.io.tmpdir
2013-01-17 16:56:11 -08:00
Tathagata Das
cd1521cfdb
Merge branch 'master' into streaming
...
Conflicts:
core/src/main/scala/spark/rdd/CoGroupedRDD.scala
core/src/main/scala/spark/rdd/FilteredRDD.scala
docs/_layouts/global.html
docs/index.md
run
2013-01-15 12:08:51 -08:00
Josh Rosen
c5cee53f20
Merge remote-tracking branch 'origin/master' into python-api
...
Conflicts:
docs/quick-start.md
2012-12-29 16:00:51 -08:00
Reynold Xin
eac566a7f4
Merge branch 'master' of github.com:mesos/spark into dev
...
Conflicts:
core/src/main/scala/spark/MapOutputTracker.scala
core/src/main/scala/spark/PairRDDFunctions.scala
core/src/main/scala/spark/ParallelCollection.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/rdd/BlockRDD.scala
core/src/main/scala/spark/rdd/CartesianRDD.scala
core/src/main/scala/spark/rdd/CoGroupedRDD.scala
core/src/main/scala/spark/rdd/CoalescedRDD.scala
core/src/main/scala/spark/rdd/FilteredRDD.scala
core/src/main/scala/spark/rdd/FlatMappedRDD.scala
core/src/main/scala/spark/rdd/GlommedRDD.scala
core/src/main/scala/spark/rdd/HadoopRDD.scala
core/src/main/scala/spark/rdd/MapPartitionsRDD.scala
core/src/main/scala/spark/rdd/MapPartitionsWithSplitRDD.scala
core/src/main/scala/spark/rdd/MappedRDD.scala
core/src/main/scala/spark/rdd/PipedRDD.scala
core/src/main/scala/spark/rdd/SampledRDD.scala
core/src/main/scala/spark/rdd/ShuffledRDD.scala
core/src/main/scala/spark/rdd/UnionRDD.scala
core/src/main/scala/spark/storage/BlockManager.scala
core/src/main/scala/spark/storage/BlockManagerId.scala
core/src/main/scala/spark/storage/BlockManagerMaster.scala
core/src/main/scala/spark/storage/StorageLevel.scala
core/src/main/scala/spark/util/MetadataCleaner.scala
core/src/main/scala/spark/util/TimeStampedHashMap.scala
core/src/test/scala/spark/storage/BlockManagerSuite.scala
run
2012-12-20 14:53:40 -08:00
Matei Zaharia
e1d7cd2276
Search for a non-loopback address in Utils.getLocalIpAddress
2012-12-08 00:33:11 -08:00
Tathagata Das
a69a82be26
Added metadata cleaner to HttpBroadcast to clean up old broadcast files.
2012-12-03 22:37:31 -08:00
Josh Rosen
52989c8a2c
Update Python API for v0.6.0 compatibility.
2012-10-19 10:24:49 -07:00
Mosharaf Chowdhury
119e50c7b9
Conflict fixed
2012-10-02 22:25:39 -07:00
Denny
18a1faedf6
Stylistic changes and Public Accumulable and Broadcast
2012-10-02 19:28:37 -07:00
Denny
4d9f4b01af
Make classes package private
2012-10-02 19:00:19 -07:00
Matei Zaharia
802aa8aef9
Some bug fixes and logging fixes for broadcast.
2012-10-01 15:20:42 -07:00
Matei Zaharia
83143f9a5f
Fixed several bugs that caused weird behavior with files in spark-shell:
...
- SizeEstimator was following through a ClassLoader field of Hadoop
  JobConfs, which referenced the whole interpreter, Scala compiler, etc.
  Chaos ensued, giving an estimated size in the tens of gigabytes.
- Broadcast variables in local mode were only stored as MEMORY_ONLY and
  never made accessible over a server, so they fell out of the cache when
  they were deemed too large and couldn't be reloaded.
2012-09-30 21:19:39 -07:00
Matei Zaharia
009b0e37e7
Added an option to compress blocks in the block store
2012-09-27 18:45:44 -07:00
Matei Zaharia
7bcb08cef5
Renamed storage levels to something cleaner; fixes #223.
2012-09-27 17:50:59 -07:00
Matei Zaharia
995982b3c9
Added a unit test for local-cluster mode and simplified some of the code involved in that
2012-09-07 17:08:36 -07:00
Mosharaf Chowdhury
31ffe8d528
Synchronization bug fix in broadcast implementations
2012-08-30 22:26:43 -07:00
Mosharaf Chowdhury
3883532545
Bug fix. Fixed log messages. Updated BroadcastTest example to have iterations.
2012-08-30 21:43:00 -07:00
Matei Zaharia
deedb9e7b7
Fix further issues with tests and broadcast.
...
The broadcast fix is to store values as MEMORY_ONLY_DESER instead of
MEMORY_ONLY, which will save substantial time on serialization.
2012-08-23 20:31:49 -07:00
Matei Zaharia
59b831b9d1
Fixed test failures due to broadcast not stopping correctly
2012-08-23 19:59:55 -07:00
Mosharaf Chowdhury
d821dd3ccc
BroadcastManager is a class now (replaced Broadcast object)
2012-08-05 01:10:51 -07:00
Mosharaf Chowdhury
24b7eb872c
Bug fixed. Broadcast now works with BlockManager.
2012-08-04 00:27:28 -07:00
Mosharaf Chowdhury
b5be936d7c
Broadcasts using BlockManager instead of BoundedMemoryCache
2012-07-27 15:38:46 -07:00
Mosharaf Chowdhury
bb4ee580fa
Cleaning BitTorrentBroadcast code...
2012-07-13 01:04:01 -07:00
Mosharaf Chowdhury
8ccffe21da
Cleaned TreeBroadcast
2012-07-13 00:54:25 -07:00
Mosharaf Chowdhury
34999d97f5
Added stop() to the Broadcast subsystem
2012-07-10 01:03:47 -07:00
Mosharaf Chowdhury
701f49e0d9
Refactoring
2012-07-09 22:39:47 -07:00
Mosharaf Chowdhury
cf1c60a1de
Refactoring
2012-07-09 22:07:46 -07:00
Mosharaf Chowdhury
e71f69ad3d
Refactoring
2012-07-09 22:07:17 -07:00
Mosharaf Chowdhury
ca02a92332
Refactored TrackMultipleValues out.
2012-07-09 21:35:39 -07:00
Mosharaf Chowdhury
654576ef1a
Tweaks
2012-07-09 21:12:42 -07:00
Mosharaf Chowdhury
425c247269
Removed some unused stuff
2012-07-08 14:29:04 -07:00
Mosharaf Chowdhury
c7c5258e25
Compiles without Dfs
2012-07-08 13:22:12 -07:00
Mosharaf Chowdhury
178bb29f05
Removed Chained and Dfs broadcast implementations
2012-07-08 11:57:00 -07:00
Matei Zaharia
c53670b9bf
Various code style fixes, mostly from IntelliJ IDEA
2012-06-29 18:47:12 -07:00
Matei Zaharia
b3eeac55b8
Fixed HttpBroadcast to work with this branch's Serializer.
2012-06-15 23:54:38 -07:00
Matei Zaharia
4b05798c06
Further bug fix to HttpBroadcast
2012-06-09 16:24:03 -07:00
Matei Zaharia
8ed662862e
Bug fix to HttpBroadcast
2012-06-09 16:16:55 -07:00
Matei Zaharia
e75b1b5cb4
Change the default broadcast implementation to a simple HTTP-based
broadcast. Fixes #139.
2012-06-09 15:58:07 -07:00
Matei Zaharia
f48742683a
Made caches dataset-aware so that they won't cyclically evict partitions
from the same dataset.
2012-05-06 20:14:40 -07:00
haoyuan
194c42ab79
Code format.
2012-02-10 08:19:53 -08:00
Ismael Juma
620de2dd1d
Change currentThread to Thread.currentThread as the former is deprecated.
2011-08-02 10:25:16 +01:00
Ismael Juma
0fba22b3d2
Fix issue #65: Change @serializable to extends Serializable in 2.9 branch
...
Note that we use scala.Serializable introduced in Scala 2.9 instead of
java.io.Serializable. Also, case classes inherit from scala.Serializable by
default.
2011-08-02 10:16:33 +01:00
Matei Zaharia
c4dd68ae21
Merge branch 'mos-bt'
...
This merge keeps only the broadcast work in mos-bt because the structure
of shuffle has changed with the new RDD design. We still need some kind
of parallel shuffle but that will be added later.
Conflicts:
core/src/main/scala/spark/BitTorrentBroadcast.scala
core/src/main/scala/spark/ChainedBroadcast.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/Utils.scala
core/src/main/scala/spark/shuffle/BasicLocalFileShuffle.scala
core/src/main/scala/spark/shuffle/DfsShuffle.scala
2011-06-26 18:22:12 -07:00
Mosharaf Chowdhury
db7a2c4897
Issue #42 fixed.
2011-04-28 14:30:48 -07:00
Mosharaf Chowdhury
60d1121343
Refactoring: daemonThreadFactories have all been moved to the Utils
object instead of having multiple copies in Broadcast and Shuffle
objects.
2011-04-27 22:13:01 -07:00
Mosharaf Chowdhury
e898e108a3
Cleanup + refactoring...
2011-04-27 22:00:24 -07:00