Patrick Wendell
9fc78f8f29
Fixing some whitespace issues
2012-09-28 16:05:50 -07:00
Patrick Wendell
bc909c2903
Changes based on Matei's comments
2012-09-28 16:04:36 -07:00
Patrick Wendell
c387e40fb1
Log message which records RDD origin
This adds tracking to determine the "origin" of an RDD. Origin is defined by
the boundary between the user's code and the Spark code, during an RDD's
instantiation. It is meant to help users understand where a Spark RDD is
coming from in their code.
This patch also logs origin data when stages are submitted to the scheduler.
Finally, it adds a new log message to fix an inconsistency in the way that
dependent stages (those missing parents) and independent stages (those
without) are logged during submission.
2012-09-28 15:51:46 -07:00
Matei Zaharia
920fab23c3
Merge pull request #222 from rxin/dev
Added MapPartitionsWithSplitRDD.
2012-09-26 23:16:45 -07:00
Matei Zaharia
ea05fc130b
Updates to standalone cluster, web UI and deploy docs.
2012-09-26 22:54:39 -07:00
Matei Zaharia
1ef4f0fbd2
Allow controlling number of splits in sortByKey.
2012-09-26 19:18:47 -07:00
Reynold Xin
1ad1331a34
Added MapPartitionsWithSplitRDD.
2012-09-26 17:11:28 -07:00
Matei Zaharia
ee71fa49c1
Look for Kryo registrator using context class loader
2012-09-26 14:15:16 -07:00
Matei Zaharia
d71a358c46
Fixed a test that was getting extremely lucky before, and increased the
number of samples used for sorting
2012-09-26 00:25:34 -07:00
Matei Zaharia
051785c7e6
Several fixes to sampling issues pointed out by Henry Milner:
- takeSample was biased towards earlier partitions
- There were some range errors in takeSample
- SampledRDDs with replacement didn't produce appropriate counts
  across partitions (we took exactly frac of each one)
2012-09-25 21:46:58 -07:00
Matei Zaharia
4d3339a3ec
Merge pull request #217 from rxin/dev
Added a method to RDD to expose the ClassManifest.
2012-09-24 23:52:32 -07:00
Reynold Xin
7a4cd92861
Renamed RDD.manifest to RDD.elementClassManifest
2012-09-24 23:42:33 -07:00
Matei Zaharia
296e24b440
Merge pull request #218 from rnpandya/dev
Scripts to start Spark under Windows
2012-09-24 21:10:31 -07:00
Reynold Xin
348bcbca1f
Added a method to RDD to expose the ClassManifest.
2012-09-24 16:56:27 -07:00
Ravi Pandya
39215357af
Windows command scripts for sbt and run
2012-09-24 15:43:19 -07:00
Matei Zaharia
6eeb379cf8
Fix some test issues
2012-09-24 15:39:58 -07:00
Matei Zaharia
f855e4fad2
Merge pull request #208 from rxin/dev
Separated ShuffledRDD into multiple classes.
2012-09-24 12:32:01 -07:00
root
107a5ca879
Make default number of parallel fetches slightly smaller since it doesn't seem to hurt performance much and it will cause slightly less GC.
2012-09-23 06:06:12 +00:00
root
e41cab04ca
Avoid creating an extra buffer when saving a stream of values as DISK_ONLY
2012-09-23 05:56:44 +00:00
Denny
afb7ccc838
HTTP File server fixes.
2012-09-21 10:58:13 -07:00
root
6d28dde370
Rename our toIterator method into asIterator to prevent confusion with the
Scala collection one, which often *copies* a collection.
2012-09-21 06:02:55 +00:00
root
a642051ade
Fixed a performance bug in BlockManager that was creating garbage when
returning deserialized, in-memory RDDs.
2012-09-21 05:42:21 +00:00
root
8feb5caacd
Fixed an issue with ordering of classloader setup that was causing Java deserializer to break
2012-09-21 05:13:19 +00:00
Reynold Xin
6b5980da79
Set a limited number of retries in standalone deploy mode.
2012-09-19 15:41:56 -07:00
Reynold Xin
397d3816e1
Separated ShuffledRDD into multiple classes: RepartitionShuffledRDD,
ShuffledSortedRDD, and ShuffledAggregatedRDD.
2012-09-19 12:31:45 -07:00
Denny
ca64d16a2d
When a file is downloaded, make it executable. That's necessary for scripts (e.g. in Shark).
2012-09-17 10:08:37 -07:00
Matei Zaharia
840cbcf849
Change default serializer to Java; it had accidentally become Kryo.
2012-09-13 17:19:26 -07:00
Matei Zaharia
b4dfa25c8a
Store shuffle map outputs as DISK_ONLY
2012-09-12 16:05:57 -07:00
Matei Zaharia
2d761e3353
Ported performance and FT improvements from latest streaming work
2012-09-12 14:54:40 -07:00
Matei Zaharia
9b4cd1648b
Fix bugs with Connection's shutdown callback failing to get its address
2012-09-12 14:54:14 -07:00
Matei Zaharia
9199775d41
Wait for Akka to really shut down in SparkEnv.stop()
2012-09-12 14:50:37 -07:00
Denny
5e4076e3f2
Merge branch 'dev' into feature/fileserver
Conflicts:
core/src/main/scala/spark/SparkContext.scala
2012-09-11 16:57:17 -07:00
Denny
77873d2c8e
Formatting
2012-09-11 16:51:46 -07:00
Denny
24b9b37314
Subclass URLClassLoader instead of using reflection
2012-09-11 16:51:08 -07:00
Denny
31c53e917d
Use stageId as index for fileSet caches.
2012-09-11 16:10:45 -07:00
Matei Zaharia
943df48348
Merge branch 'dev' of github.com:mesos/spark into dev
2012-09-11 16:00:37 -07:00
Matei Zaharia
6d7f907e73
Manually merge pull request #175 by Imran Rashid
2012-09-11 16:00:06 -07:00
Reynold Xin
7af7c79ce5
Updated the logError call from the previous commit to conform to
logError API.
2012-09-11 14:32:24 -07:00
Reynold Xin
38b9119c96
Log entire exception (including stack trace) in BlockManagerWorker.
2012-09-11 11:31:35 -07:00
Denny
4d3471dd07
Fixed serialization bugs and added local cluster tests
2012-09-10 15:39:58 -07:00
Denny
b864c36a30
Dynamically adding jar files and caching fileSets.
2012-09-10 12:49:09 -07:00
Denny
f275fb07da
General FileServer
A general fileserver for both JARs and regular files.
2012-09-10 12:48:59 -07:00
Matei Zaharia
a13780670d
Added a unit test for local-cluster mode and simplified some of the code involved in that
2012-09-10 12:48:58 -07:00
Denny
f2ac55840c
Add shutdown hook to Executor Runner and execute code to shut down the local cluster in Scheduler Backend
2012-09-10 12:48:58 -07:00
Denny
9ead8ab14e
Set SPARK_LAUNCH_WITH_SCALA=0 in Executor Runner
2012-09-10 12:48:58 -07:00
Denny
8bb3c73977
Renamed spark-cluster to spark-local.
2012-09-10 12:48:58 -07:00
Denny
a367c20f49
Fix wrong counting
2012-09-10 12:48:57 -07:00
Denny
93fe331e6d
Delete old DeployUtils.
2012-09-10 12:48:57 -07:00
Denny
cf074f9c96
Renamed class.
2012-09-10 12:48:57 -07:00
Denny
3749f94184
Start a standalone cluster locally.
2012-09-10 12:48:57 -07:00