Commit graph

3688 commits

Author SHA1 Message Date
Joseph E. Gonzalez 0598c10eb1 Merge branch 'master' of https://github.com/mesos/spark into indexed_rdd 2013-08-19 13:05:59 -07:00
Joseph E. Gonzalez 630281bf76 Corrected all indexed RDD tests.
There appears to be an issue with subtract by key tests that needs to be investigated further.
2013-08-18 17:16:45 -07:00
Matei Zaharia 8fa0747978 Merge pull request #840 from AndreSchumacher/zipegg
Implementing SPARK-878 for PySpark: adding zip and egg files to context ...
2013-08-18 17:02:54 -07:00
Joseph E. Gonzalez 8fd37adf83 Merged with changes to zipPartitions 2013-08-18 10:57:35 -07:00
Joseph E. Gonzalez 2b568520bf Merge branch 'master' of https://github.com/mesos/spark into indexed_rdd2 2013-08-18 10:45:30 -07:00
Matei Zaharia 1e137a5a21 Merge pull request #846 from rxin/rdd
Two minor RDD refactoring
2013-08-17 22:22:32 -07:00
Reynold Xin 2c00ea3efc Moved shuffle serializer setting from a constructor parameter to a setSerializer method in various RDDs that involve shuffle operations. 2013-08-17 21:43:29 -07:00
Reynold Xin 0e84fee76b Removed the mapSideCombine option in partitionBy. 2013-08-17 21:13:41 -07:00
Reynold Xin 10af952a3d Removed the mapSideCombine option in CoGroupedRDD. 2013-08-17 21:07:34 -07:00
Reynold Xin 5d050a3e1f Removed the unused shuffleId in ShuffleDependency's constructor. 2013-08-16 23:23:16 -07:00
Matei Zaharia e89ffc7b3c Merge pull request #839 from jegonzal/zip_partitions
Currying RDD.zipPartitions
2013-08-16 14:02:34 -07:00
Joseph E. Gonzalez 53b2639a1e Reversing the argument order in zipPartitions to enable stronger type inference. 2013-08-16 12:38:59 -07:00
Andre Schumacher c7e348faec Implementing SPARK-878 for PySpark: adding zip and egg files to context and passing it down to workers which add these to their sys.path 2013-08-16 11:58:20 -07:00
Reynold Xin 1fb1b09928 Merge pull request #841 from rxin/json
Use the JSON formatter from Scala library and removed dependency on lift-json.
2013-08-15 22:15:05 -07:00
Matei Zaharia c69c48947d Merge pull request #843 from Reinvigorate/bug-879
fixing typo in conf/slaves
2013-08-15 20:55:09 -07:00
seanm a5193a8fac fixing typo 2013-08-15 20:52:58 -06:00
Reynold Xin c961c19b7b Use the JSON formatter from Scala library and removed dependency on lift-json.
It made the JSON creation slightly more complicated, but removes one external dependency. The Scala library also properly escapes "/" (which lift-json doesn't).
2013-08-15 18:23:01 -07:00
Reynold Xin eddbf43b54 Revert "Merge pull request #834 from Daemoen/master"
This reverts commit 230ab2722e, reversing
changes made to 659553b21d.
2013-08-15 17:49:37 -07:00
Reynold Xin 230ab2722e Merge pull request #834 from Daemoen/master
Updated json output to allow for display of worker state
2013-08-15 17:45:17 -07:00
Patrick Wendell 659553b21d Merge pull request #836 from pwendell/rename
Rename `memoryBytesToString` and `memoryMegabytesToString`
2013-08-15 16:56:31 -07:00
Matei Zaharia 28369ff773 Merge pull request #829 from JoshRosen/pyspark-unit-tests-python-2.6
Fix PySpark unit tests on Python 2.6
2013-08-15 16:44:02 -07:00
Joseph E. Gonzalez 327a4db9f7 changing caching behavior on indexedrdds 2013-08-15 16:36:26 -07:00
Patrick Wendell 4c6ade1ad5 Rename memoryBytesToString and memoryMegabytesToString
These are used all over the place now and they are not specific to memory at all.

memoryBytesToString --> bytesToString
memoryMegabytesToString --> megabytesToString
2013-08-15 15:58:07 -07:00
Reynold Xin 1a13460cb0 Merge pull request #833 from rxin/ui
Various UI improvements.
2013-08-15 15:50:44 -07:00
Reynold Xin 1a51deae8a More minor UI changes including code review feedback. 2013-08-15 14:34:07 -07:00
Joseph E. Gonzalez 3bb6e019d4 adding better error handling when indexing an RDD 2013-08-15 14:29:48 -07:00
Joseph E. Gonzalez 61281756f2 IndexedRDD passes all PairRDD Function tests 2013-08-15 14:20:59 -07:00
Daemoen ad2e8b5126 Updated json output to allow for display of worker state
Ops teams need to ensure that the cluster is functional and performant.  Having to scrape the html source for worker state won't work reliably, and will be slow.  By exposing the state in the json output, ops teams are able to ensure a fully functional environment by querying for the json output and parsing for dead nodes.
2013-08-15 12:19:14 -07:00
Reynold Xin 2d2a556bdf Various UI improvements. 2013-08-14 23:23:09 -07:00
Reynold Xin 044a088c0d Merge pull request #831 from rxin/scheduler
A few small scheduler / job description changes.
2013-08-14 20:43:49 -07:00
Reynold Xin 290e3e6e65 Renamed setCurrentJobDescription to setJobDescription. 2013-08-14 18:40:53 -07:00
Reynold Xin 3886b54933 A few small scheduler / job description changes.
1. Renamed SparkContext.addLocalProperty to setLocalProperty. And allow this function to unset a property.

2. Renamed SparkContext.setDescription to setCurrentJobDescription.

3. Throw an exception if the fair scheduler allocation file is invalid.
2013-08-14 17:19:42 -07:00
Joseph E. Gonzalez 54b54903c3 Adding testing code for indexedrdd 2013-08-14 16:35:20 -07:00
Matei Zaharia 839f2d4f3f Merge pull request #822 from pwendell/ui-features
Adding GC Stats to TaskMetrics (and three small fixes)
2013-08-14 16:17:23 -07:00
Joseph E. Gonzalez b71d4febbc Finished early prototype of IndexedRDD 2013-08-14 15:25:56 -07:00
Josh Rosen 7a9abb9ddc Fix PySpark unit tests on Python 2.6. 2013-08-14 15:12:12 -07:00
Patrick Wendell 04ad78b09d Style cleanup based on Matei feedback 2013-08-14 14:57:21 -07:00
Reynold Xin 63446f9208 Merge pull request #826 from kayousterhout/ui_fix
Fixed 2 bugs in executor UI (incl. SPARK-877)
2013-08-14 00:17:07 -07:00
Kay Ousterhout a88aa5e6ed Fixed 2 bugs in executor UI.
1) UI crashed if the executor UI was loaded before any tasks started.
2) The total tasks was incorrectly reported due to using string (rather
than int) arithmetic.
2013-08-13 23:44:58 -07:00
Matei Zaharia 3f14cbab05 Merge pull request #825 from shivaram/maven-repl-fix
Set SPARK_CLASSPATH for maven repl tests
2013-08-13 20:09:51 -07:00
Shivaram Venkataraman a1227708e9 Set SPARK_CLASSPATH for maven repl tests 2013-08-13 20:06:47 -07:00
Matei Zaharia 596adc63be Merge pull request #824 from mateiz/mesos-0.12.1
Update to Mesos 0.12.1
2013-08-13 19:41:34 -07:00
Matei Zaharia d9588183fa Update to Mesos 0.12.1 2013-08-13 18:51:35 -07:00
Patrick Wendell c223176388 Small style clean-up 2013-08-13 16:56:37 -07:00
Patrick Wendell fab5cee111 Correcting terminology in RDD page 2013-08-13 16:25:55 -07:00
Patrick Wendell 024e5c5ce1 Correct sorting order for stages 2013-08-13 16:25:55 -07:00
Patrick Wendell 4e9f0c2df6 Capturing GC details in TaskMetrics 2013-08-13 16:25:55 -07:00
Patrick Wendell f0382007dc Bug fix for display of shuffle read/write metrics.
This fixes an error where empty cells are missing if a given task
has no shuffle read/write.
2013-08-13 16:25:55 -07:00
Matei Zaharia d316af9c84 Merge pull request #821 from pwendell/print-launch-command
Print run command to stderr rather than stdout
2013-08-13 15:31:01 -07:00
Matei Zaharia 1f79d21f33 Merge pull request #818 from kayousterhout/killed_fix
Properly account for killed tasks.
2013-08-13 15:23:54 -07:00