Matei Zaharia
ee71fa49c1
Look for Kryo registrator using context class loader
2012-09-26 14:15:16 -07:00
Matei Zaharia
d71a358c46
Fixed a test that was getting extremely lucky before, and increased the
number of samples used for sorting
2012-09-26 00:25:34 -07:00
Matei Zaharia
051785c7e6
Several fixes to sampling issues pointed out by Henry Milner:
- takeSample was biased towards earlier partitions
- There were some range errors in takeSample
- SampledRDDs with replacement didn't produce appropriate counts
across partitions (we took exactly frac of each one)
2012-09-25 21:46:58 -07:00
Matei Zaharia
4d3339a3ec
Merge pull request #217 from rxin/dev
Added a method to RDD to expose the ClassManifest.
2012-09-24 23:52:32 -07:00
Reynold Xin
7a4cd92861
Renamed RDD.manifest to RDD.elementClassManifest
2012-09-24 23:42:33 -07:00
Matei Zaharia
296e24b440
Merge pull request #218 from rnpandya/dev
Scripts to start Spark under windows
2012-09-24 21:10:31 -07:00
Reynold Xin
348bcbca1f
Added a method to RDD to expose the ClassManifest.
2012-09-24 16:56:27 -07:00
Ravi Pandya
39215357af
Windows command scripts for sbt and run
2012-09-24 15:43:19 -07:00
Matei Zaharia
6eeb379cf8
Fix some test issues
2012-09-24 15:39:58 -07:00
Matei Zaharia
f855e4fad2
Merge pull request #208 from rxin/dev
Separated ShuffledRDD into multiple classes.
2012-09-24 12:32:01 -07:00
root
107a5ca879
Make default number of parallel fetches slightly smaller since it doesn't seem to hurt performance much and it will cause slightly less GC.
2012-09-23 06:06:12 +00:00
root
e41cab04ca
Avoid creating an extra buffer when saving a stream of values as DISK_ONLY
2012-09-23 05:56:44 +00:00
Denny
afb7ccc838
HTTP File server fixes.
2012-09-21 10:58:13 -07:00
root
6d28dde370
Rename our toIterator method into asIterator to prevent confusion with the
Scala collection one, which often *copies* a collection.
2012-09-21 06:02:55 +00:00
root
a642051ade
Fixed a performance bug in BlockManager that was creating garbage when
returning deserialized, in-memory RDDs.
2012-09-21 05:42:21 +00:00
root
8feb5caacd
Fixed an issue with ordering of classloader setup that was causing Java deserializer to break
2012-09-21 05:13:19 +00:00
Reynold Xin
6b5980da79
Set a limited number of retries in standalone deploy mode.
2012-09-19 15:41:56 -07:00
Reynold Xin
397d3816e1
Separated ShuffledRDD into multiple classes: RepartitionShuffledRDD,
ShuffledSortedRDD, and ShuffledAggregatedRDD.
2012-09-19 12:31:45 -07:00
Denny
ca64d16a2d
When a file is downloaded, make it executable. That's necessary for scripts (e.g. in Shark)
2012-09-17 10:08:37 -07:00
Matei Zaharia
840cbcf849
Change default serializer to Java.. it had accidentally become Kryo.
2012-09-13 17:19:26 -07:00
Matei Zaharia
b4dfa25c8a
Store shuffle map outputs as DISK_ONLY
2012-09-12 16:05:57 -07:00
Matei Zaharia
2d761e3353
Ported performance and FT improvements from latest streaming work
2012-09-12 14:54:40 -07:00
Matei Zaharia
9b4cd1648b
Fix bugs with Connection's shutdown callback failing to get its address
2012-09-12 14:54:14 -07:00
Matei Zaharia
9199775d41
Wait for Akka to really shut down in SparkEnv.stop()
2012-09-12 14:50:37 -07:00
Denny
5e4076e3f2
Merge branch 'dev' into feature/fileserver
Conflicts:
core/src/main/scala/spark/SparkContext.scala
2012-09-11 16:57:17 -07:00
Denny
77873d2c8e
Formatting
2012-09-11 16:51:46 -07:00
Denny
24b9b37314
Subclass URLClassLoader instead of using reflection
2012-09-11 16:51:08 -07:00
Denny
31c53e917d
Use stageId as index for fileSet caches.
2012-09-11 16:10:45 -07:00
Matei Zaharia
943df48348
Merge branch 'dev' of github.com:mesos/spark into dev
2012-09-11 16:00:37 -07:00
Matei Zaharia
6d7f907e73
Manually merge pull request #175 by Imran Rashid
2012-09-11 16:00:06 -07:00
Reynold Xin
7af7c79ce5
Updated the logError call from the previous commit to conform to
logError API.
2012-09-11 14:32:24 -07:00
Reynold Xin
38b9119c96
Log entire exception (including stack trace) in BlockManagerWorker.
2012-09-11 11:31:35 -07:00
Denny
4d3471dd07
Fix serialization bugs and added local cluster tests
2012-09-10 15:39:58 -07:00
Denny
b864c36a30
Dynamically adding jar files and caching fileSets.
2012-09-10 12:49:09 -07:00
Denny
f275fb07da
General FileServer
A general fileserver for both JARs and regular files.
2012-09-10 12:48:59 -07:00
Matei Zaharia
a13780670d
Added a unit test for local-cluster mode and simplified some of the code involved in that
2012-09-10 12:48:58 -07:00
Denny
f2ac55840c
Add shutdown hook to Executor Runner and execute code to shutdown local cluster in Scheduler Backend
2012-09-10 12:48:58 -07:00
Denny
9ead8ab14e
Set SPARK_LAUNCH_WITH_SCALA=0 in Executor Runner
2012-09-10 12:48:58 -07:00
Denny
8bb3c73977
Renamed spark-cluster to spark-local.
2012-09-10 12:48:58 -07:00
Denny
a367c20f49
Fix wrong counting
2012-09-10 12:48:57 -07:00
Denny
93fe331e6d
Delete old DeployUtils.
2012-09-10 12:48:57 -07:00
Denny
cf074f9c96
Renamed class.
2012-09-10 12:48:57 -07:00
Denny
3749f94184
Start a standalone cluster locally.
2012-09-10 12:48:57 -07:00
Matei Zaharia
995982b3c9
Added a unit test for local-cluster mode and simplified some of the code involved in that
2012-09-07 17:08:36 -07:00
Matei Zaharia
8d2fcc2832
Merge pull request #189 from dennybritz/feature/localcluster
Simulating a Spark standalone cluster locally
2012-09-07 15:43:43 -07:00
Denny
7ff9311add
Add shutdown hook to Executor Runner and execute code to shutdown local cluster in Scheduler Backend
2012-09-07 14:09:12 -07:00
Denny
4e7b264cf7
Set SPARK_LAUNCH_WITH_SCALA=0 in Executor Runner
2012-09-07 11:39:44 -07:00
root
c2da64409a
Randomize the order of block fetches in getMultiple
2012-09-06 23:16:26 +00:00
Denny
886183e591
Renamed spark-cluster to spark-local.
2012-09-05 17:10:54 -07:00
Reynold Xin
c308fbcb79
Removed cache add/remove log messages from CacheTracker.
Added log messages on BlockManagerMaster to reflect block add/remove.
Also did some minor cleanup of storage package code.
2012-09-05 15:59:48 -07:00
Denny
babbca0a2f
Fix wrong counting
2012-09-04 22:04:18 -07:00
Denny
9326509f66
Delete old DeployUtils.
2012-09-04 21:15:23 -07:00
Denny
1588d4dbe6
Renamed class.
2012-09-04 21:13:25 -07:00
Denny
22dde6e020
Start a standalone cluster locally.
2012-09-04 20:56:30 -07:00
Matei Zaharia
a842c63044
Minor formatting fixes
2012-09-03 16:24:00 -07:00
Harvey
3076b038f4
Start fetching a remote block when a received remote block has been passed
to the reduce function
2012-09-01 12:01:35 -07:00
Matei Zaharia
389fb4cc54
End runJob() with a SparkException when a task fails too many times in
one of the cluster schedulers.
2012-08-31 17:47:43 -07:00
Matei Zaharia
a480dec6b2
Deserialize multi-get results in the caller's thread. This fixes an
issue with shared buffers in the KryoSerializer.
2012-08-30 20:01:06 -07:00
Reynold Xin
5945bcdcc5
Added a new flag in Aggregator to indicate applying map side combiners.
2012-08-29 23:32:08 -07:00
Reynold Xin
c68e820b2a
Merge branch 'dev' of github.com:mesos/spark into dev
2012-08-29 23:01:19 -07:00
Reynold Xin
940869dfda
Disable running combiners on map tasks when mergeCombiners function is
not specified by the user.
2012-08-29 23:00:02 -07:00
Matei Zaharia
bf2e9cb08e
Fault tolerance and block store fixes discovered through streaming tests.
2012-08-27 23:07:50 -07:00
Reynold Xin
3a6a95dc24
Removed the deserialization cache for ShuffleMapTask because it was
causing concurrency problems (some variables in Shark get set to null).
The cost of task deserialization on slaves is trivial compared with the
execution time of the task anyway.
2012-08-27 22:33:15 -07:00
Matei Zaharia
deedb9e7b7
Fix further issues with tests and broadcast.
The broadcast fix is to store values as MEMORY_ONLY_DESER instead of
MEMORY_ONLY, which will save substantial time on serialization.
2012-08-23 20:31:49 -07:00
Matei Zaharia
59b831b9d1
Fixed test failures due to broadcast not stopping correctly
2012-08-23 19:59:55 -07:00
Matei Zaharia
7310a6f499
Merge pull request #147 from mosharaf/dev
Broadcast refactoring/cleaning up
2012-08-23 19:38:28 -07:00
Matei Zaharia
25a6a39e6d
Added other SparkContext constructors to JavaSparkContext
2012-08-19 18:59:16 -07:00
Shivaram Venkataraman
1ea269110c
Move object size and pointer size initialization into a function to enable unit-testing
2012-08-13 13:31:45 -07:00
Shivaram Venkataraman
44661df9cc
If spark.test.useCompressedOops is set, use that to infer compressed oops
setting. This is useful to get a deterministic test case
2012-08-13 13:31:39 -07:00
Shivaram Venkataraman
0dd8fe73ba
Use HotSpotDiagnosticMXBean to get if CompressedOops are in use or not
2012-08-13 13:31:29 -07:00
Shivaram Venkataraman
80104ce1da
Add link to Java wiki which specifies what changes with compressed oops
2012-08-13 13:31:21 -07:00
Shivaram Venkataraman
00ab5490b3
Changes to make size estimator more accurate. Fixes object size, pointer size
according to architecture and also aligns objects and arrays when computing
instance sizes. Verified using Eclipse Memory Analysis Tool (MAT)
2012-08-13 13:31:11 -07:00
Matei Zaharia
6ae3c375a9
Renamed apply() to call() in Java API and allowed it to throw Exceptions
2012-08-12 23:10:19 +02:00
Matei Zaharia
0141879c40
Use Promises instead of having a Future wait on a thread in
ConnectionManager.
2012-08-12 22:16:32 +02:00
Matei Zaharia
845a870242
Return remotely fetched blocks in a pipelined fashion from BlockManager
2012-08-12 20:01:38 +02:00
Matei Zaharia
e17ed9a21d
Switch to Akka futures in connection manager.
It's still not good because each Future ends up waiting on a lock, but
it seems to work better than Scala Actors, and more importantly it
allows us to use onComplete and other listeners on futures.
2012-08-12 19:40:37 +02:00
Matei Zaharia
ad8a7612a4
Changed multi-get method in BlockManager to return an iterator
2012-08-12 19:18:01 +02:00
Matei Zaharia
3c94e5c188
Merge pull request #168 from shivaram/dev
Use JavaConversion to get a scala iterator
2012-08-10 00:57:33 -07:00
Matei Zaharia
e463e7a333
Merge pull request #167 from JoshRosen/piped-rdd-fixes
Detect non-zero exit status from PipedRDD process
2012-08-10 00:56:42 -07:00
Josh Rosen
59c22fb444
Print exit status in PipedRDD failure exception.
2012-08-10 00:33:56 -07:00
Shivaram Venkataraman
1803cce692
Use an implicit conversion to get the scala iterator
2012-08-08 14:31:04 -07:00
Shivaram Venkataraman
674fcf56bf
Use JavaConversion to get a scala iterator
2012-08-08 14:10:23 -07:00
Shivaram Venkataraman
f4aaec7a48
Avoid a copy in ShuffleMapTask by creating an iterator that will be used by the
block manager.
2012-08-08 00:47:02 -07:00
Mosharaf Chowdhury
d821dd3ccc
BroadcastManager is a class now (replaced Broadcast object)
2012-08-05 01:10:51 -07:00
Mosharaf Chowdhury
b4804119f9
Merge remote-tracking branch 'upstream/dev' into dev
2012-08-04 20:42:12 -07:00
Matei Zaharia
88b016db2a
Merge pull request #160 from dennybritz/clusterscripts
Standalone cluster scripts
2012-08-04 17:45:20 -07:00
Mosharaf Chowdhury
1b0534af8f
Merge branch 'dev' into bc-bm
2012-08-04 00:30:08 -07:00
Mosharaf Chowdhury
d11b457e67
Merge remote-tracking branch 'upstream/dev' into dev
2012-08-04 00:28:10 -07:00
Mosharaf Chowdhury
24b7eb872c
Bug fixed. Broadcast now works with BlockManager.
2012-08-04 00:27:28 -07:00
Matei Zaharia
6601a6212b
Added a unit test for cross-partition balancing in sort, and changes to
RangePartitioner to make it pass. It turns out that the first partition
was always kind of small due to how we picked partition boundaries.
2012-08-03 16:40:45 -04:00
Harvey
1170de3757
Fix for partitioning when sorting in descending order
2012-08-03 16:40:38 -04:00
Paul Cavallaro
d05c0f97ca
Logging Throwables in Info and Debug
Logging Throwables in logInfo and logDebug instead of swallowing them.
Conflicts:
core/src/main/scala/spark/Logging.scala
2012-08-03 16:40:21 -04:00
Denny
0008994044
merged dev branch
2012-08-02 16:00:33 -07:00
Denny
53008c2d8a
Settings variables and bugfix for stop script.
2012-08-02 15:59:39 -07:00
Matei Zaharia
71a958b0b7
Merge branch 'dev' of github.com:mesos/spark into dev
Conflicts:
project/SparkBuild.scala
2012-08-02 17:23:13 -04:00
Denny
7312a5c30f
Use spray's implicit Marshaller for Futures.
2012-08-02 14:11:27 -07:00
Denny
ba7e30fb5e
Mostly stylistic changes.
2012-08-02 13:55:09 -07:00
Shivaram Venkataraman
1a07bb9ba4
Avoid an extra partition copy by passing an iterator to blockManager.put
2012-08-02 12:22:33 -07:00
Shivaram Venkataraman
6790908b11
Use maxMemory to better estimate memory available for BlockManager cache
2012-08-02 12:05:05 -07:00
Denny
863c31b7c1
Moved resources into static folder
2012-08-02 09:48:36 -07:00