Commit graph

1031 commits

Author SHA1 Message Date
Patrick Wendell c387e40fb1 Log message which records RDD origin
This adds tracking to determine the "origin" of an RDD. Origin is defined by
the boundary between the user's code and the Spark code, during an RDD's
instantiation. It is meant to help users understand where a Spark RDD is
coming from in their code.

This patch also logs origin data when stages are submitted to the scheduler.

Finally, it adds a new log message to fix an inconsistency in the way that
dependent stages (those missing parents) and independent stages (those
without) are logged during submission.
2012-09-28 15:51:46 -07:00
Matei Zaharia 0850d641af Merge branch 'dev' of github.com:mesos/spark into dev 2012-09-26 23:53:48 -07:00
Matei Zaharia bf18e0994e Minor typos 2012-09-26 23:53:38 -07:00
Matei Zaharia a4093f7563 Minor doc fixes 2012-09-26 23:22:15 -07:00
Matei Zaharia 920fab23c3 Merge pull request #222 from rxin/dev
Added MapPartitionsWithSplitRDD.
2012-09-26 23:16:45 -07:00
Matei Zaharia ea05fc130b Updates to standalone cluster, web UI and deploy docs. 2012-09-26 22:54:39 -07:00
Matei Zaharia 1ef4f0fbd2 Allow controlling number of splits in sortByKey. 2012-09-26 19:18:47 -07:00
Matei Zaharia 874a9fd407 More updates to docs, including tuning guide 2012-09-26 19:17:58 -07:00
Reynold Xin 1ad1331a34 Added MapPartitionsWithSplitRDD. 2012-09-26 17:11:28 -07:00
Matei Zaharia ee71fa49c1 Look for Kryo registrator using context class loader 2012-09-26 14:15:16 -07:00
Matei Zaharia 58eb44acbb Doc tweaks 2012-09-26 00:32:59 -07:00
Matei Zaharia d71a358c46 Fixed a test that was getting extremely lucky before, and increased the
number of samples used for sorting
2012-09-26 00:25:34 -07:00
Matei Zaharia d51d5e0582 Doc fixes 2012-09-25 23:59:04 -07:00
Matei Zaharia c5754bb939 Fixes to Java guide 2012-09-25 23:51:04 -07:00
Matei Zaharia f1246cc7c1 Various enhancements to the programming guide and HTML/CSS 2012-09-25 23:26:56 -07:00
Matei Zaharia 051785c7e6 Several fixes to sampling issues pointed out by Henry Milner:
- takeSample was biased towards earlier partitions
- There were some range errors in takeSample
- SampledRDDs with replacement didn't produce appropriate counts
  across partitions (we took exactly frac of each one)
2012-09-25 21:46:58 -07:00
Matei Zaharia 56c90485fd More updates to documentation 2012-09-25 19:31:07 -07:00
Matei Zaharia 1821bf1f1f Merge branch 'dev' of github.com:mesos/spark into dev 2012-09-25 15:46:27 -07:00
Matei Zaharia e47e11720f Documentation updates 2012-09-25 15:46:18 -07:00
Matei Zaharia 635348b185 Merge pull request #220 from andyk/doc
One commit that makes nav dropdowns show on hover
2012-09-25 15:45:49 -07:00
Andy Konwinski 098351b735 Makes nav menu dropdowns show on hover instead of on click. 2012-09-25 15:42:50 -07:00
Andy Konwinski 8d30fe616e Merge remote-tracking branch 'public-spark/dev' into doc 2012-09-25 15:19:57 -07:00
Matei Zaharia 30362a21e7 Update license info on deploy scripts 2012-09-25 14:43:47 -07:00
Matei Zaharia 60bce9574f Update Jekyll plugin to look at Scala 2.9.2 docs 2012-09-25 14:31:53 -07:00
Matei Zaharia aa50d5b9a2 Merge pull request #219 from rnpandya/dev
Add spark-shell.cmd
2012-09-25 09:58:35 -07:00
Ravi Pandya 36349a2fb6 Add spark-shell.cmd 2012-09-25 07:26:29 -07:00
Matei Zaharia 4d3339a3ec Merge pull request #217 from rxin/dev
Added a method to RDD to expose the ClassManifest.
2012-09-24 23:52:32 -07:00
Reynold Xin 7a4cd92861 Renamed RDD.manifest to RDD.elementClassManifest 2012-09-24 23:42:33 -07:00
Matei Zaharia 296e24b440 Merge pull request #218 from rnpandya/dev
Scripts to start Spark under windows
2012-09-24 21:10:31 -07:00
Reynold Xin 348bcbca1f Added a method to RDD to expose the ClassManifest. 2012-09-24 16:56:27 -07:00
Ravi Pandya afd8fc0c66 Echo off 2012-09-24 15:46:03 -07:00
Ravi Pandya 39215357af Windows command scripts for sbt and run 2012-09-24 15:43:19 -07:00
Matei Zaharia 6eeb379cf8 Fix some test issues 2012-09-24 15:39:58 -07:00
Matei Zaharia 35cc9f13e9 Update Akka to 2.0.3 2012-09-24 14:17:10 -07:00
Matei Zaharia 1f539aa473 Update Scala version dependency to 2.9.2 2012-09-24 14:12:48 -07:00
Matei Zaharia f855e4fad2 Merge pull request #208 from rxin/dev
Separated ShuffledRDD into multiple classes.
2012-09-24 12:32:01 -07:00
root 107a5ca879 Make default number of parallel fetches slightly smaller since it doesn't seem to hurt performance much and it will cause slightly less GC. 2012-09-23 06:06:12 +00:00
root e41cab04ca Avoid creating an extra buffer when saving a stream of values as DISK_ONLY 2012-09-23 05:56:44 +00:00
Matei Zaharia 33fb373e69 Merge pull request #215 from dennybritz/dev
HTTP File server fixes
2012-09-21 17:39:28 -07:00
Denny afb7ccc838 HTTP File server fixes. 2012-09-21 10:58:13 -07:00
root 6d28dde370 Rename our toIterator method into asIterator to prevent confusion with the
Scala collection one, which often *copies* a collection.
2012-09-21 06:02:55 +00:00
root a642051ade Fixed a performance bug in BlockManager that was creating garbage when
returning deserialized, in-memory RDDs.
2012-09-21 05:42:21 +00:00
root 8feb5caacd Fixed an issue with ordering of classloader setup that was causing Java deserializer to break 2012-09-21 05:13:19 +00:00
Matei Zaharia bf891a5c18 Merge pull request #203 from JoshRosen/docs/java-programming-guide
Java Programming Guide
2012-09-20 17:35:10 -07:00
Matei Zaharia 1c3d3f4e8c Merge pull request #210 from rxin/deploy-limit-retrycount
Set a limited number of retries in standalone deploy mode.
2012-09-20 17:33:47 -07:00
Reynold Xin 6b5980da79 Set a limited number of retries in standalone deploy mode. 2012-09-19 15:41:56 -07:00
Reynold Xin 397d3816e1 Separated ShuffledRDD into multiple classes: RepartitionShuffledRDD,
ShuffledSortedRDD, and ShuffledAggregatedRDD.
2012-09-19 12:31:45 -07:00
Matei Zaharia 9a449e0063 Merge pull request #204 from dennybritz/dev
When a file is downloaded, make it executable.
2012-09-17 10:11:22 -07:00
Denny ca64d16a2d When a file is downloaded, make it executable. That's necessary for scripts (e.g. in Shark) 2012-09-17 10:08:37 -07:00
Josh Rosen c94e9cc54a Add Java Programming Guide; fix broken doc links. 2012-09-16 20:46:46 -07:00