Commit graph

2917 commits

Author SHA1 Message Date
Patrick Wendell faefea3fd8 Adding driver ID to submission response 2013-12-29 11:31:10 -08:00
Patrick Wendell 6ffa9bb226 Documentation and adding supervise option 2013-12-29 11:26:56 -08:00
Patrick Wendell 35f6dc252a Changes to allow fate sharing of drivers/executors and workers. 2013-12-29 11:14:36 -08:00
Matei Zaharia cd00225db9 Add SparkConf support in Python 2013-12-29 14:03:39 -05:00
Tor Myklebust d812aeece9 Factor call site reporting out to SparkContext. 2013-12-28 23:21:49 -05:00
Matei Zaharia 20631348d1 Fix other failing tests 2013-12-28 23:17:58 -05:00
Matei Zaharia 5bbe73864e Fix Executor not getting properties in local mode 2013-12-28 17:31:58 -05:00
Matei Zaharia a16c52ed1b Check for SPARK_YARN_MODE through a system property too since it can
sometimes be set that way (undoes a change in previous commit)
2013-12-28 17:24:21 -05:00
Matei Zaharia 642029e7f4 Various fixes to configuration code
- Got rid of global SparkContext.globalConf
- Pass SparkConf to serializers and compression codecs
- Made SparkConf public instead of private[spark]
- Improved API of SparkContext and SparkConf
- Switched executor environment vars to be passed through SparkConf
- Fixed some places that were still using system properties
- Fixed some tests, though others are still failing

This still fails several tests in core, repl and streaming, likely due
to properties not being set or cleared correctly (some of the tests run
fine in isolation).
2013-12-28 17:13:15 -05:00
Patrick Wendell 7375047d51 Merge pull request #304 from kayousterhout/remove_unused
Removed unused failed and causeOfFailure variables (in TaskSetManager)
2013-12-28 13:25:06 -08:00
Matei Zaharia ad3dfd1531 Merge pull request #307 from kayousterhout/other_failure
Removed unused OtherFailure TaskEndReason.

The OtherFailure TaskEndReason was added by @mateiz 3 years ago in this commit: 24a1e7f838

Unless I am missing something, it doesn't seem to have been used then, and is not used now, so seems safe for deletion.
2013-12-27 22:10:14 -05:00
Kay Ousterhout b4619e509b Changed naming of StageCompleted event to be consistent
The rest of the SparkListener events are named with "SparkListener"
as the prefix of the name; this commit renames the StageCompleted
event to SparkListenerStageCompleted for consistency.
2013-12-27 17:45:20 -08:00
Kay Ousterhout e17d7518ab Removed unused OtherFailure TaskEndReason. 2013-12-27 15:51:27 -08:00
Kay Ousterhout 8419148e5f Remove unused hasPendingTasks methods 2013-12-27 15:19:42 -08:00
Patrick Wendell c8c8b42a6f Some notes and TODO about dependencies 2013-12-27 15:13:11 -08:00
Kay Ousterhout 0c71ffe924 Style fixes as per Reynold's review 2013-12-27 12:19:38 -08:00
Kay Ousterhout 8c81068e16 Fixed >100char lines in DAGScheduler.scala 2013-12-27 11:36:54 -08:00
Binh Nguyen 2c5bade4ee Fix failed unit tests
Also clean up a bit.
2013-12-27 11:24:30 -08:00
Kay Ousterhout baaabcedc9 Removed unused failed and causeOfFailure variables 2013-12-27 11:12:36 -08:00
Aaron Davidson 2a7b3511f4 Add Apache headers 2013-12-27 10:55:16 -08:00
Reynold Xin 7be1e57786 Merge pull request #298 from aarondav/minor
Minor: Decrease margin of left side of Log page

Before
![before](https://f.cloud.github.com/assets/1400247/1812647/1a4be53e-6e87-11e3-9d5b-f851274be0e9.png)

After
![after](https://f.cloud.github.com/assets/1400247/1812648/1ca1ea2c-6e87-11e3-946c-31be9258f450.png)

It's a start anyway...
2013-12-26 23:41:40 -10:00
Andrew Or d0cfbc41e2 Rename spark.shuffle.buffer variables 2013-12-27 00:07:09 -08:00
Andrew Or 8f3175773c Final cleanup 2013-12-26 23:40:08 -08:00
Aaron Davidson 1dc0440c1a Use real serializer & manual ordering 2013-12-26 23:40:08 -08:00
Aaron Davidson 0f66b7f2fc Return efficient iterator if no spillage happened 2013-12-26 23:40:08 -08:00
Andrew Or ec8c5dc644 Sort AppendOnlyMap in-place 2013-12-26 23:40:08 -08:00
Aaron Davidson 0289eb752a Allow Product2 rather than just tuple kv pairs 2013-12-26 23:40:07 -08:00
Andrew Or 64b2d54a02 Move maps to util, and refactor more 2013-12-26 23:40:07 -08:00
Aaron Davidson 804beb43be SamplingSizeTracker + Map + test suite 2013-12-26 23:40:07 -08:00
Andrew Or 7ad4408255 New minor edits 2013-12-26 23:40:07 -08:00
Aaron Davidson fcc443b3db Minor cleanup for Scala style 2013-12-26 23:40:07 -08:00
Andrew Or 2a2ca2a661 Add toggle for ExternalAppendOnlyMap in Aggregator and CoGroupedRDD 2013-12-26 23:40:07 -08:00
Andrew Or 28685a4820 Provide for cases when mergeCombiners is not specified in ExternalAppendOnlyMap 2013-12-26 23:40:07 -08:00
Andrew Or 17def8cc11 Refactor ExternalAppendOnlyMap to take in KVC instead of just KV 2013-12-26 23:40:07 -08:00
Andrew Or 6a45ec1972 Working ExternalAppendOnlyMap for both CoGroupedRDDs and Aggregator 2013-12-26 23:40:07 -08:00
Andrew Or 97fbb3ec52 Working ExternalAppendOnlyMap for Aggregator, but not for CoGroupedRDD 2013-12-26 23:40:07 -08:00
Patrick Wendell 55c8bb741c Intermediate clean-up of tests to appease jenkins 2013-12-26 15:43:15 -08:00
Aaron Davidson 4f2fb761b0 Decrease margin of left side of log page 2013-12-26 15:38:45 -08:00
Patrick Wendell 5c1b4f6405 Minor fixes 2013-12-26 14:39:39 -08:00
Tathagata Das 5fde4566ea Added Apache boilerplate and class docs to PartitionerAwareUnionRDD. 2013-12-26 14:33:37 -08:00
Patrick Wendell c23d640516 Addressing smaller changes from Aaron's review 2013-12-26 12:38:39 -08:00
Tathagata Das 3579647cdc Merge branch 'apache-master' into window-improvement 2013-12-26 12:12:10 -08:00
Patrick Wendell da20270b83 Merge pull request #1 from aarondav/driver
Refactor DriverClient to be more Actor-based
2013-12-26 12:11:52 -08:00
Patrick Wendell a97ad55c45 Removing accidental file 2013-12-26 12:11:28 -08:00
Tathagata Das c4a54f51b5 Merge branch 'master' into window-improvement 2013-12-26 12:03:11 -08:00
Patrick Wendell 5938cfc153 Updated approach to driver restarting 2013-12-26 12:02:19 -08:00
Mark Hamstra c529dceaff Avoid a lump of coal (NPE) in JobProgressListener's stocking. 2013-12-25 23:10:02 -08:00
Tathagata Das 94479673eb Fixed bug in PartitionAwareUnionRDD 2013-12-26 00:07:45 +00:00
Aaron Davidson 61372b11f4 Refactor DriverClient to be more Actor-based 2013-12-25 10:55:25 -08:00
walker 0af4b4f3e8 Bug fixes for updating the RDD block's memory and disk usage information 2013-12-25 20:07:01 +08:00
Patrick Wendell bbc362833b Removing un-used variable 2013-12-25 01:38:57 -08:00
Patrick Wendell 18ad419b52 Small fix from rebase 2013-12-25 01:22:38 -08:00
Patrick Wendell 55f833803a Minor bug fix 2013-12-25 01:19:25 -08:00
Patrick Wendell c9c0f745af Minor style clean-up 2013-12-25 01:19:25 -08:00
Patrick Wendell b2b7514ba3 Import clean-up (yay Aaron) 2013-12-25 01:19:25 -08:00
Patrick Wendell d5f23e0083 Adding scheduling and reporting based on cores 2013-12-25 01:19:01 -08:00
Patrick Wendell 760823d393 Adding better option parsing 2013-12-25 01:19:01 -08:00
Patrick Wendell 6a4acc4c2d Initial cut at driver submission. 2013-12-25 01:19:01 -08:00
Patrick Wendell 1070b566d4 Renaming Client => AppClient 2013-12-25 01:17:01 -08:00
Patrick Wendell 85a344b4f0 Merge pull request #127 from kayousterhout/consolidate_schedulers
Deduplicate Local and Cluster schedulers.

The code in LocalScheduler/LocalTaskSetManager was nearly identical
to the code in ClusterScheduler/ClusterTaskSetManager. The redundancy
made making updating the schedulers unnecessarily painful and error-
prone. This commit combines the two into a single TaskScheduler/
TaskSetManager.

Unfortunately the diff makes this change look much more invasive than it is -- TaskScheduler.scala is only superficially changed (names updated, overrides removed) from the old ClusterScheduler.scala, and the same with
TaskSetManager.scala.

Thanks @rxin for suggesting this change!
2013-12-24 16:35:06 -08:00
Binh Nguyen 786f393a98 Fix imports order 2013-12-24 14:59:30 -08:00
Binh Nguyen 9115a5de62 Remove import * and fix some formatting 2013-12-24 14:59:30 -08:00
Binh Nguyen 040dd3ecd5 upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final 2013-12-24 14:58:18 -08:00
Patrick Wendell c2dd6bcd6e Merge pull request #279 from aarondav/shuffle-cleanup0
Clean up shuffle files once their metadata is gone

Previously, we would only clean the in-memory metadata for consolidated shuffle files.

Additionally, fixes a bug where the Metadata Cleaner was ignoring type-specific TTLs.
2013-12-24 14:36:47 -08:00
Kay Ousterhout 1efe3adf56 Responded to Reynold's style comments 2013-12-24 14:18:39 -08:00
Tathagata Das d4dfab503a Fixed Python API for sc.setCheckpointDir. Also other fixes based on Reynold's comments on PR 289. 2013-12-24 14:01:13 -08:00
Tathagata Das 9f79fd89dc Merge branch 'apache-master' into filestream-fix 2013-12-24 11:38:17 -08:00
Prashant Sharma 2573add94c spark-544, introducing SparkConf and related configuration overhaul. 2013-12-25 00:09:36 +05:30
Matei Zaharia 23a9ae6be3 Merge pull request #277 from tdas/scheduler-update
Refactored the streaming scheduler and added StreamingListener interface

- Refactored the streaming scheduler for cleaner code. Specifically, the JobManager was renamed to JobScheduler, as it does the actual scheduling of Spark jobs to the SparkContext. The earlier Scheduler was renamed to JobGenerator, as it actually generates the jobs from the DStreams. The JobScheduler starts the JobGenerator. Also, moved all the scheduler related code from spark.streaming to spark.streaming.scheduler package.
- Implemented the StreamingListener interface, similar to SparkListener. The streaming version of StatusReportListener prints the batch processing time statistics (for now). Added StreamingListernerSuite to test it.
- Refactored streaming TestSuiteBase for deduping code in the other streaming testsuites.
2013-12-24 00:08:48 -05:00
Reynold Xin 11107c9de5 Merge pull request #244 from leftnoteasy/master
Added SPARK-968 implementation for review

Added SPARK-968 implementation for review
2013-12-23 10:38:20 -08:00
wangda.tan 2f689ba97b SPARK-968, added executor address showing in aggregated metrics by executors table 2013-12-23 15:03:45 +08:00
Kay Ousterhout b7bfae1afe Correctly merged in maxTaskFailures fix 2013-12-22 07:34:44 -08:00
wangda.tan c979eecdf6 added changes according to comments from rxin 2013-12-22 21:43:15 +08:00
Kay Ousterhout b8ae096a40 Fix build error in test 2013-12-21 23:28:48 -08:00
Kay Ousterhout 30186aa264 Renamed ClusterScheduler to TaskSchedulerImpl 2013-12-20 14:58:04 -08:00
Kay Ousterhout c06945cfe0 Merge remote branch 'upstream/master' into consolidate_schedulers
Conflicts:
	core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
	core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
	core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
2013-12-20 14:39:30 -08:00
Patrick Wendell 0bc57c5767 Merge pull request #280 from aarondav/minor
Minor cleanup for standalone scheduler

See commit messages
2013-12-20 11:56:54 -08:00
Tathagata Das 61f4bbda0d Added tests for PartitionerAwareUnionRDD in the CheckpointSuite. Refactored CheckpointSuite to make the tests simpler and more reliable. Added missing test for ZippedRDD. 2013-12-20 00:41:47 -08:00
Patrick Wendell eca68d4425 Merge pull request #272 from tmyklebu/master
Track and report task result serialisation time.

 - DirectTaskResult now has a ByteBuffer valueBytes instead of a T value.
 - DirectTaskResult now has a member function T value() that deserialises valueBytes.
 - Executor serialises value into a ByteBuffer and passes it to DTR's ctor.
 - Executor tracks the time taken to do so and puts it in a new field in TaskMetrics.
 - StagePage now reports serialisation time from TaskMetrics along with the other things it reported.
2013-12-19 18:12:22 -08:00
Aaron Davidson 6613ab663d Fix compiler warning in SparkZooKeeperSession 2013-12-19 17:56:13 -08:00
Aaron Davidson 4d74b899b7 Remove firstApp from the standalone scheduler Master
As a lonely child with no one to care for it... we had to put it down.
2013-12-19 17:53:41 -08:00
Aaron Davidson 1ab031eaff Extraordinarily minor code/comment cleanup 2013-12-19 17:51:29 -08:00
Aaron Davidson 0647ec9757 Clean up shuffle files once their metadata is gone
Previously, we would only clean the in-memory metadata for consolidated
shuffle files.

Additionally, fixes a bug where the Metadata Cleaner was ignoring type-
specific TTLs.
2013-12-19 15:40:48 -08:00
Reynold Xin 7990c56375 Merge pull request #276 from shivaram/collectPartition
Add collectPartition to JavaRDD interface.

This interface is useful for implementing `take` from other language frontends where the data is serialized. Also remove `takePartition` from PythonRDD and use `collectPartition` in rdd.py.

Thanks @concretevitamin for the original change and tests.
2013-12-19 13:35:09 -08:00
Tathagata Das de41c436a0 Merge branch 'scheduler-update' into window-improvement
Conflicts:
	streaming/src/main/scala/org/apache/spark/streaming/dstream/WindowedDStream.scala
2013-12-19 12:05:08 -08:00
Shivaram Venkataraman 9cc3a6d3c0 Add comment explaining collectPartitions's use 2013-12-19 11:49:17 -08:00
Shivaram Venkataraman d3234f9726 Make collectPartitions take an array of partitions
Change the implementation to use runJob instead of PartitionPruningRDD.
Also update the unit tests and the python take implementation
to use the new interface.
2013-12-19 11:40:34 -08:00
Tathagata Das 984c582487 Merge branch 'scheduler-update' into filestream-fix
Conflicts:
	core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala
	streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
	streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala
	streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala
	streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala
2013-12-19 11:20:48 -08:00
Nick Pentreath a76f53416c Add toString to Java RDD, and __repr__ to Python RDD 2013-12-19 14:38:20 +02:00
Tathagata Das ec71b445ad Minor changes. 2013-12-18 23:39:28 -08:00
Aaron Davidson 293a0af5a1 In experimental clusters we've observed that a 10 second timeout was insufficient,
despite having a low number of nodes and relatively small workload (16 nodes, <1.5 TB data).
This would cause an entire job to fail at the beginning of the reduce phase.
There is no particular reason for this value to be small as a timeout should only occur
in an exceptional situation.

Also centralized the reading of spark.akka.askTimeout to AkkaUtils (surely this can later
be cleaned up to use Typesafe).

Finally, deleted some lurking implicits. If anyone can think of a reason they should still
be there, please let me know.
2013-12-18 21:42:29 -08:00
Tathagata Das e93b391d75 Merge branch 'apache-master' into scheduler-update
Conflicts:
	streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala
	streaming/src/main/scala/org/apache/spark/streaming/dstream/ForEachDStream.scala
2013-12-18 17:51:14 -08:00
Tathagata Das b80ec05635 Added StatsReportListener to generate processing time statistics across multiple batches. 2013-12-18 15:35:24 -08:00
Shivaram Venkataraman af0cd6bd27 Add collectPartition to JavaRDD interface.
Also remove takePartition from PythonRDD and use collectPartition in rdd.py.
2013-12-18 11:40:07 -08:00
Tor Myklebust d3b1af4b6c Add a serialisation time column to the StagePage. 2013-12-18 14:25:56 -05:00
Tor Myklebust 717c7fddb2 objectSer -> valueSer in a test. 2013-12-17 23:02:21 -05:00
Reynold Xin 9a6864d016 Fixed a performance problem in RDD.top and BoundedPriorityQueue (size in BoundedPriority was actually traversing the entire queue to calculate the size, resulting in bad performance in insertion). 2013-12-17 18:44:39 -08:00
wangda.tan 59e53fa21c spark-968, changes for avoid a NPE 2013-12-17 17:57:27 +08:00
wangda.tan 36060f4f50 spark-898, changes according to review comments 2013-12-17 17:55:38 +08:00
Patrick Wendell c1fec89895 Cleanup 2013-12-16 21:56:21 -08:00
Patrick Wendell c6f95e603e Attempt with extra repositories 2013-12-16 21:53:51 -08:00
Tor Myklebust b2f0329511 Missed a spot; had an objectSer here too. 2013-12-17 00:18:46 -05:00
Tor Myklebust 25fa976580 Merge branch 'master' of git://github.com/apache/incubator-spark 2013-12-16 23:48:37 -05:00
Tor Myklebust 963d6f065a Incorporate pwendell's code review suggestions. 2013-12-16 23:14:52 -05:00
Reynold Xin 883e034aeb Merge pull request #245 from gregakespret/task-maxfailures-fix
Fix for spark.task.maxFailures not enforced correctly.

Docs at http://spark.incubator.apache.org/docs/latest/configuration.html say:

```
spark.task.maxFailures

Number of individual task failures before giving up on the job. Should be greater than or equal to 1. Number of allowed retries = this value - 1.
```

Previous implementation worked incorrectly. When for example `spark.task.maxFailures` was set to 1, the job was aborted only after the second task failure, not after the first one.
2013-12-16 14:16:02 -08:00
Tor Myklebust 882d544856 UI to display serialisation time of a stage. 2013-12-16 13:27:03 -05:00
Tor Myklebust 8a397a959b Track task value serialisation time in TaskMetrics. 2013-12-16 12:07:39 -05:00
wangda.tan 8ab8c6a526 Merge branch 'master' of git://github.com/apache/incubator-spark 2013-12-16 21:45:43 +08:00
Mark Hamstra 09ed7ddfa0 Use scala.binary.version in POMs 2013-12-15 12:39:58 -08:00
Josh Rosen 2fd781d347 Merge pull request #249 from ngbinh/partitionInJavaSortByKey
Expose numPartitions parameter in JavaPairRDD.sortByKey()

This change makes Java and Scala API on sortByKey() the same.
2013-12-14 12:59:37 -08:00
Prashant Sharma 1ae3c0fc5e Added a comment about ActorRef and ActorSelection difference. 2013-12-14 10:44:24 +05:30
Prashant Sharma a854cc536d Review comments on the PR for scala 2.10 migration. 2013-12-13 15:19:51 +05:30
Tathagata Das 097e120c0c Refactored streaming scheduler and added listener interface.
- Refactored Scheduler + JobManager to JobGenerator + JobScheduler and
  added JobSet for cleaner code. Moved scheduler related code to
  streaming.scheduler package.
- Added StreamingListener trait (similar to SparkListener) to enable
  gathering to streaming stats like processing times and delays.
  StreamingContext.addListener() to added listeners.
- Deduped some code in streaming tests by modifying TestSuiteBase, and
  added StreamingListenerSuite.
2013-12-12 20:48:02 -08:00
Tathagata Das 5e9ce83d68 Fixed multiple file stream and checkpointing bugs.
- Made file stream more robust to transient failures.
- Changed Spark.setCheckpointDir API to not have the second
  'useExisting' parameter. Spark will always create a unique directory
  for checkpointing underneath the directory provide to the funtion.
- Fixed bug wrt local relative paths as checkpoint directory.
- Made DStream and RDD checkpointing use
  SparkContext.hadoopConfiguration, so that more HDFS compatible
  filesystems are supported for checkpointing.
2013-12-11 14:01:36 -08:00
Prashant Sharma 603af51bb5 Merge branch 'master' into akka-bug-fix
Conflicts:
	core/pom.xml
	core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
	pom.xml
	project/SparkBuild.scala
	streaming/pom.xml
	yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala
2013-12-11 10:21:53 +05:30
Hossein Falaki 49bf47e1b7 Removed superfluous abs call from test cases. 2013-12-10 19:50:50 -08:00
Binh Nguyen 0b494f7db4 Hook directly to Scala API 2013-12-10 11:17:52 -08:00
Binh Nguyen e85af50767 Leave default value of numPartitions to Scala code. 2013-12-10 11:04:14 -08:00
Grega Kespret 558af87334 Fix tests. 2013-12-10 11:43:42 +01:00
Binh Nguyen c82d4f079b Use braces to shorten the line. 2013-12-10 01:04:52 -08:00
Binh Nguyen 5013fb64b2 Expose numPartitions parameter in JavaPairRDD.sortByKey()
This change make Java and Scala API on sortByKey() the same.
2013-12-10 00:38:16 -08:00
Prashant Sharma 17db6a9041 Style fixes and addressed review comments at #221 2013-12-10 11:47:16 +05:30
Patrick Wendell 5b74609d97 License headers 2013-12-09 16:41:01 -08:00
Grega Kespret 14a1df6572 Fix for spark.task.maxFailures not enforced correctly. 2013-12-09 10:39:02 +01:00
wangda.tan ee68a85cff SPARK-968, added sc finalize code to avoid akka rebinding to the same port 2013-12-09 09:38:58 +08:00
Aaron Davidson 40f63eb034 Merge master into 127 2013-12-08 11:16:52 -08:00
wangda.tan 850c4b709a Merge branch 'master' of https://github.com/leftnoteasy/incubator-spark-1 2013-12-09 00:12:46 +08:00
wangda.tan 48e4f2ad14 SPARK-968, In stage UI, add an overview section that shows task stats grouped by executor id 2013-12-09 00:02:59 +08:00
Prashant Sharma 7ad6921ae0 Incorporated Patrick's feedback comment on #211 and made maven build/dep-resolution atleast a bit faster. 2013-12-07 12:45:57 +05:30
Matei Zaharia e0392343a0 Merge pull request #190 from markhamstra/Stages4Jobs
stageId <--> jobId mapping in DAGScheduler

Okay, I think this one is ready to go -- or at least it's ready for review and discussion.  It's a carry-over of https://github.com/mesos/spark/pull/842 with updates for the newer job cancellation functionality.  The prior discussion still applies.  I've actually changed the job cancellation flow a bit: Instead of ``cancelTasks`` going to the TaskScheduler and then ``taskSetFailed`` coming back to the DAGScheduler (resulting in ``abortStage`` there), the DAGScheduler now takes care of figuring out which stages should be cancelled, tells the TaskScheduler to cancel tasks for those stages, then does the cleanup within the DAGScheduler directly without the need for any further prompting by the TaskScheduler.

I know of three outstanding issues, each of which can and should, I believe, be handled in follow-up pull requests:

1) https://spark-project.atlassian.net/browse/SPARK-960
2) JobLogger should be re-factored to eliminate duplication
3) Related to 2), the WebUI should also become a consumer of the DAGScheduler's new understanding of the relationship between jobs and stages so that it can display progress indication and the like grouped by job.  Right now, some of this information is just being sent out as part of ``SparkListenerJobStart`` messages, but more or different job <--> stage information may need to be exported from the DAGScheduler to meet listeners needs.

Except for the eventQueue -> Actor commit, the rest can be cherry-picked almost cleanly into branch-0.8.  A little merging is needed in MapOutputTracker and the DAGScheduler.  Merged versions of those files are in aba2b40ce0

Note that between the recent Actor change in the DAGScheduler and the cleaning up of DAGScheduler data structures on job completion in this PR, some races have been introduced into the DAGSchedulerSuite.  Those tests usually pass, and I don't think that better-behaved code that doesn't directly inspect DAGScheduler data structures should be seeing any problems, but I'll work on fixing DAGSchedulerSuite as either an addition to this PR or as a separate request.

UPDATE: Fixed the race that I introduced.  Created a JIRA issue (SPARK-965) for the one that was introduced with the switch to eventProcessorActor in the DAGScheduler.
2013-12-06 11:49:59 -08:00
Matei Zaharia bfa68609d9 Merge pull request #233 from hsaputra/changecontexttobackend
Change the name of input argument in ClusterScheduler#initialize from context to backend.

The SchedulerBackend used to be called ClusterSchedulerContext so just want to make small
change of the input param in the ClusterScheduler#initialize to reflect this.
2013-12-06 11:04:03 -08:00
Matei Zaharia 3fb302c08d Merge pull request #205 from kayousterhout/logging
Added logging of scheduler delays to UI

This commit adds two metrics to the UI:

1) The time to get task results, if they're fetched remotely

2) The scheduler delay.  When the scheduler starts getting overwhelmed (because it can't keep up with the rate at which tasks are being submitted), the result is that tasks get delayed on the tail-end: the message from the worker saying that the task has completed ends up in a long queue and takes a while to be processed by the scheduler.  This commit records that delay in the UI so that users can tell when the scheduler is becoming the bottleneck.
2013-12-06 11:03:32 -08:00
Matei Zaharia 87676a6af2 Merge pull request #220 from rxin/zippart
Memoize preferred locations in ZippedPartitionsBaseRDD

so preferred location computation doesn't lead to exponential explosion.

This was a problem in GraphX where we have a whole chain of RDDs that are ZippedPartitionsRDD's, and the preferred locations were taking eternity to compute.

(cherry picked from commit e36fe55a03)
Signed-off-by: Reynold Xin <rxin@apache.org>
2013-12-06 11:01:42 -08:00
Aaron Davidson 94b5881ee9 Fix long lines 2013-12-06 00:22:00 -08:00
Aaron Davidson 5a864e3fce Rename SparkActorSystem to IndestructibleActorSystem 2013-12-06 00:21:43 -08:00
Prashant Sharma c9cd2af71e Merge branch 'wip-scala-2.10' into akka-bug-fix 2013-12-06 13:32:15 +05:30
Mark Hamstra ee888f6b25 FutureAction result tests 2013-12-05 23:01:18 -08:00
Prashant Sharma 4e70480038 A left over akka -> akka.tcp changes 2013-12-06 12:29:53 +05:30
Henry Saputra 1cb259cb57 Change the name of input ragument in ClusterScheduler#initialize from context to backend.
The SchedulerBackend used to be called ClusterSchedulerContext so just want to make small
change of the input param in the ClusterScheduler#initialize to reflect this.
2013-12-05 18:50:26 -08:00
Mark Hamstra aebb123fd3 jobWaiter.synchronized before jobWaiter.wait 2013-12-05 17:16:44 -08:00
Patrick Wendell 5d460253d6 Merge pull request #228 from pwendell/master
Document missing configs and set shuffle consolidation to false.
2013-12-05 12:31:24 -08:00
Patrick Wendell 75d161b357 Forcing shuffle consolidation in DiskBlockManagerSuite 2013-12-05 11:36:41 -08:00
Matei Zaharia 72b696156c Merge pull request #199 from harveyfeng/yarn-2.2
Hadoop 2.2 migration

Includes support for the YARN API stabilized in the Hadoop 2.2 release, and a few style patches.

Short description for each set of commits:

a98f5a0 - "Misc style changes in the 'yarn' package"
a67ebf4 - "A few more style fixes in the 'yarn' package"
Both of these are some minor style changes, such as fixing lines over 100 chars, to the existing YARN code.

ab8652f - "Add a 'new-yarn' directory ... "
Copies everything from `SPARK_HOME/yarn` to `SPARK_HOME/new-yarn`. No actual code changes here.

4f1c3fa - "Hadoop 2.2 YARN API migration ..."
API patches to code in the `SPARK_HOME/new-yarn` directory. There are a few more small style changes mixed in, too.
Based on @colorant's Hadoop 2.2 support for the scala-2.10 branch in #141.

a1a1c62 - "Add optional Hadoop 2.2 settings in sbt build ... "
If Spark should be built against Hadoop 2.2, then:
a) the `org.apache.spark.deploy.yarn` package will be compiled from the `new-yarn` directory.
b) Protobuf v2.5 will be used as a Spark dependency, since Hadoop 2.2 depends on it. Also, Spark will be built against a version of Akka v2.0.5 that's built against Protobuf 2.5, named `akka-2.0.5-protobuf-2.5`. The patched Akka is here: https://github.com/harveyfeng/akka/tree/2.0.5-protobuf-2.5, and was published to local Ivy during testing.

There's also a new boolean environment variable, `SPARK_IS_NEW_HADOOP`, that users can manually set if their `SPARK_HADOOP_VERSION` specification does not start with `2.2`, which is how the build file tries to detect a 2.2 version. Not sure if this is necessary or done in the best way, though...
2013-12-04 23:33:04 -08:00
Patrick Wendell b1c6fa1584 Document missing configs and set shuffle consolidation to false. 2013-12-04 18:39:34 -08:00
Patrick Wendell 182f9baeed Merge pull request #227 from pwendell/master
Fix small bug in web UI and minor clean-up.

There was a bug where sorting order didn't work correctly for write time metrics.

I also cleaned up some earlier code that fixed the same issue for read and
write bytes.
2013-12-04 15:52:07 -08:00
Patrick Wendell 380b90b9b3 Fix small bug in web UI and minor clean-up.
There was a bug where sorting order didn't work correctly for write time metrics.

I also cleaned up some earlier code that fixed the same issue for read and
write bytes.
2013-12-04 14:41:48 -08:00
Andrew Ash 217611680d Add missing space after "Serialized" in StorageLevel
Current code creates outputs like:

scala> res0.getStorageLevel.description
res2: String = Serialized1x Replicated
2013-12-04 11:29:20 -08:00
Matei Zaharia d6e5473872 Merge pull request #223 from rxin/transient
Mark partitioner, name, and generator field in RDD as @transient.

As part of the effort to reduce serialized task size.
2013-12-04 10:28:50 -08:00
Reynold Xin 974a69d79c Marked doCheckpointCalled as transient. 2013-12-03 11:34:38 -08:00
Mark Hamstra 403234dd0d SparkListenerJobStart posted from local jobs 2013-12-03 09:57:32 -08:00
Mark Hamstra f55d0b935d Synchronous, inline cleanup after runLocally 2013-12-03 09:57:32 -08:00
Mark Hamstra c9fcd909d0 Local jobs post SparkListenerJobEnd, and DAGScheduler data structure
cleanup always occurs before any posting of SparkListenerJobEnd.
2013-12-03 09:57:32 -08:00
Mark Hamstra 9ae2d094a9 Tightly couple stageIdToJobIds and jobIdToStageIds 2013-12-03 09:57:32 -08:00
Mark Hamstra 27c45e5236 Cleaned up job cancellation handling 2013-12-03 09:57:32 -08:00
Mark Hamstra 686a420ddc Refactoring to make job removal, stage removal, task cancellation clearer 2013-12-03 09:57:32 -08:00
Mark Hamstra 205566e56e Improved comment 2013-12-03 09:57:32 -08:00
Mark Hamstra 94087c463b Removed redundant residual re: reverted refactoring. 2013-12-03 09:57:31 -08:00
Mark Hamstra 982797dcba Fixed intended side-effects 2013-12-03 09:57:31 -08:00
Mark Hamstra 6f8359b5ad Actor instead of eventQueue for LocalJobCompleted 2013-12-03 09:57:31 -08:00
Mark Hamstra 51458ab4a1 Added stageId <--> jobId mapping in DAGScheduler
...and make sure that DAGScheduler data structures are cleaned up on job completion.
  Initial effort and discussion at https://github.com/mesos/spark/pull/842
2013-12-03 09:57:31 -08:00
Raymond Liu 4738818dd6 Fix pom.xml for maven build 2013-12-03 16:36:05 +08:00
Reynold Xin 58d9bbcfec Merge pull request #217 from aarondav/mesos-urls
Re-enable zk:// urls for Mesos SparkContexts

This was broken in PR #71 when we explicitly disallow anything that didn't fit a mesos:// url.
Although it is not really clear that a zk:// url should match Mesos, it is what the docs say and it is necessary for backwards compatibility.

Additionally added a unit test for the creation of all types of TaskSchedulers. Since YARN and Mesos are not necessarily available in the system, they are allowed to pass as long as the YARN/Mesos code paths are exercised.
2013-12-02 21:58:53 -08:00
Prashant Sharma 09e8be9a62 Made running SparkActorSystem specific to executors only. 2013-12-03 11:27:45 +05:30
Aaron Davidson 0f24576c08 Cleanup and documentation of SparkActorSystem 2013-12-03 11:05:12 +05:30
Reynold Xin e34b4693d3 Mark partitioner, name, and generator field in RDD as @transient. 2013-12-02 21:24:44 -08:00
Kay Ousterhout 58b3aff9a8 Fixed problem with scheduler delay 2013-12-02 20:30:03 -08:00
Aaron Davidson f6c8c1c7b6 Cleanup and documentation of SparkActorSystem 2013-12-02 11:42:53 -08:00
Prashant Sharma 5b11028a04 Made akka capable of tolerating fatal exceptions and moving on. 2013-12-02 10:47:39 +05:30
Reynold Xin 740922f25d Merge pull request #219 from sundeepn/schedulerexception
Scheduler quits when newStage fails

The current scheduler thread does not handle exceptions from newStage stage while launching new jobs. The thread fails on any exception that gets triggered at that level, leaving the cluster hanging with no schduler.
2013-12-01 12:46:58 -08:00
Sundeep Narravula be3ea2394f Log exception in scheduler in addition to passing it to the caller.
Code Styling changes.
2013-12-01 00:50:34 -08:00
Reynold Xin 9cf7f31e4d Memoize preferred locations in ZippedPartitionsBaseRDD so preferred location computation doesn't lead to exponential explosion.
(cherry picked from commit e36fe55a03)
Signed-off-by: Reynold Xin <rxin@apache.org>
2013-11-30 18:10:52 -08:00
Sundeep Narravula 4d53830eb7 Scheduler quits when createStage fails.
The current scheduler thread does not handle exceptions from createStage stage while launching new jobs. The thread fails on any exception that gets triggered at that level, leaving the cluster hanging with no schduler.
2013-11-30 16:18:12 -08:00
Aaron Davidson 96df26be47 Add spaces between tests 2013-11-29 13:20:43 -08:00
Prashant Sharma 5618af6803 Merge branch 'master' into wip-scala-2.10 2013-11-29 13:41:21 +05:30
Prashant Sharma 1bc83ca791 Changed defaults for akka to almost disable failure detector. 2013-11-29 13:41:05 +05:30
Lian, Cheng 4a1d966e26 More comments 2013-11-29 16:02:58 +08:00
Lian, Cheng 1e25086009 Updated some inline comments in DAGScheduler 2013-11-29 15:56:47 +08:00
Aaron Davidson 081a0b6861 Add unit test for SparkContext scheduler creation
Since YARN and Mesos are not necessarily available in the system,
they are allowed to pass as long as the YARN/Mesos code paths are
exercised.
2013-11-28 20:40:57 -08:00
Aaron Davidson 37f161cf6b Re-enable zk:// urls for Mesos SparkContexts
This was broken in PR #71 when we explicitly disallow anything that
didn't fit a mesos:// url.

Although it is not really clear that a zk:// url should match Mesos,
it is what the docs say and it is necessary for backwards compatibility.
2013-11-28 20:37:56 -08:00
Lian, Cheng 18def5d6f2 Bugfix: SPARK-965 & SPARK-966
SPARK-965: https://spark-project.atlassian.net/browse/SPARK-965
SPARK-966: https://spark-project.atlassian.net/browse/SPARK-966

* Add back DAGScheduler.start(), eventProcessActor is created and started here.

  Notice that function is only called by SparkContext.

* Cancel the scheduled stage resubmission task when stopping eventProcessActor

* Add a new DAGSchedulerEvent ResubmitFailedStages

  This event message is sent by the scheduled stage resubmission task to eventProcessActor.  In this way, DAGScheduler.resubmitFailedStages is guaranteed to be executed from the same thread that runs DAGScheduler.processEvent.

  Please refer to discussion in SPARK-966 for details.
2013-11-28 17:46:06 +08:00
Prashant Sharma 3ec5d74766 Fixed the broken build. 2013-11-28 13:02:28 +05:30
Matei Zaharia 743a31a7ca Merge pull request #210 from haitaoyao/http-timeout
add http timeout for httpbroadcast

While pulling task bytecode from HttpBroadcast server, there's no timeout value set. This may cause spark executor code hang and other task in the same executor process wait for the lock. I have encountered the issue in my cluster. Here's the stacktrace I captured  : https://gist.github.com/haitaoyao/7655830

So add a time out value to ensure the task fail fast.
2013-11-27 18:24:39 -08:00
Prashant Sharma 17987778da Merge branch 'master' into wip-scala-2.10
Conflicts:
	core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
	core/src/main/scala/org/apache/spark/rdd/MapPartitionsRDD.scala
	core/src/main/scala/org/apache/spark/rdd/MapPartitionsWithContextRDD.scala
	core/src/main/scala/org/apache/spark/rdd/RDD.scala
	python/pyspark/rdd.py
2013-11-27 14:44:12 +05:30
Prashant Sharma 54862af5ee Improvements from the review comments and followed Boy Scout Rule. 2013-11-27 14:26:28 +05:30
Matei Zaharia fb6875dd5c Merge pull request #146 from JoshRosen/pyspark-custom-serializers
Custom Serializers for PySpark

This pull request adds support for custom serializers to PySpark.  For now, all Python-transformed (or parallelize()d RDDs) are serialized with the same serializer that's specified when creating SparkContext.

For now, PySpark includes `PickleSerDe` and `MarshalSerDe` classes for using Python's `pickle` and `marshal` serializers.  It's pretty easy to add support for other serializers, although I still need to add instructions on this.

A few notable changes:

- The Scala `PythonRDD` class no longer manipulates Pickled objects; data from `textFile` is written to Python as MUTF-8 strings.  The Python code performs the appropriate bookkeeping to track which deserializer should be used when reading an underlying JavaRDD.  This mechanism could also be used to support other data exchange formats, such as MsgPack.
- Several magic numbers were refactored into constants.
- Batching is implemented by wrapping / decorating an unbatched SerDe.
2013-11-26 20:55:40 -08:00
Matei Zaharia 330ada1766 Merge pull request #207 from henrydavidge/master
Log a warning if a task's serialized size is very big

As per Reynold's instructions, we now create a warning level log entry if a task's serialized size is too big. "Too big" is currently defined as 100kb. This warning message is generated at most once for each stage.
2013-11-26 19:08:33 -08:00
Harvey Feng afe4fe7f5e Merge remote-tracking branch 'origin/master' into yarn-2.2
Conflicts:
	yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala
2013-11-26 15:03:03 -08:00
hhd 57579934f0 Emit warning when task size > 100KB 2013-11-26 16:58:39 -05:00
Mark Hamstra ed7ecb93ce [SPARK-963] Wait for SparkListenerBus eventQueue to be empty before checking jobLogger state 2013-11-26 13:30:17 -08:00
Reynold Xin cb976dfb50 Merge pull request #209 from pwendell/better-docs
Improve docs for shuffle instrumentation
2013-11-26 10:23:19 -08:00
Prashant Sharma 560e44a8e1 Restored master address for client. 2013-11-26 18:18:05 +05:30
haitao.yao db998a6e14 add http timeout for httpbroadcast 2013-11-26 18:23:48 +08:00
Prashant Sharma d092a8cc6a Fixed compile time warnings and formatting post merge. 2013-11-26 15:21:50 +05:30
Matei Zaharia 18d6df0e17 Merge pull request #86 from holdenk/master
Add histogram functionality to DoubleRDDFunctions

This pull request add histogram functionality to the DoubleRDDFunctions.
2013-11-26 00:00:07 -08:00
Patrick Wendell 297c09d4bb Improve docs for shuffle instrumentation 2013-11-25 22:53:28 -08:00
Holden Karau 7222ee2977 Fix the test 2013-11-25 21:06:42 -08:00
Matei Zaharia 0e2109ddb2 Merge pull request #204 from rxin/hash
OpenHashSet fixes

Incorporated ideas from pull request #200.
- Use Murmur Hash 3 finalization step to scramble the bits of HashCode
  instead of the simpler version in java.util.HashMap; the latter one
  had trouble with ranges of consecutive integers. Murmur Hash 3 is used
  by fastutil.
- Don't check keys for equality when re-inserting due to growing the
  table; the keys will already be unique.
- Remember the grow threshold instead of recomputing it on each insert

Also added unit tests for size estimation for specialized hash sets and maps.
2013-11-25 20:48:37 -08:00
Matei Zaharia 14bb465bb3 Merge pull request #201 from rxin/mappartitions
Use the proper partition index in mapPartitionsWIthIndex

mapPartitionsWithIndex uses TaskContext.partitionId as the partition index. TaskContext.partitionId used to be identical to the partition index in a RDD. However, pull request #186 introduced a scenario (with partition pruning) that the two can be different. This pull request uses the right partition index in all mapPartitionsWithIndex related calls.

Also removed the extra MapPartitionsWIthContextRDD and put all the mapPartitions related functionality in MapPartitionsRDD.
2013-11-25 18:50:18 -08:00
Matei Zaharia eb4296c8f7 Merge pull request #101 from colorant/yarn-client-scheduler
For SPARK-527, Support spark-shell when running on YARN

sync to trunk and resubmit here

In current YARN mode approaching, the application is run in the Application Master as a user program thus the whole spark context is on remote.

This approaching won't support application that involve local interaction and need to be run on where it is launched.

So In this pull request I have a YarnClientClusterScheduler and backend added.

With this scheduler, the user application is launched locally,While the executor will be launched by YARN on remote nodes with a thin AM which only launch the executor and monitor the Driver Actor status, so that when client app is done, it can finish the YARN Application as well.

This enables spark-shell to run upon YARN.

This also enable other Spark applications to have the spark context to run locally with a master-url "yarn-client". Thus e.g. SparkPi could have the result output locally on console instead of output in the log of the remote machine where AM is running on.

Docs also updated to show how to use this yarn-client mode.
2013-11-25 15:25:29 -08:00
Prashant Sharma 44fd30d3fb Merge branch 'master' into scala-2.10-wip
Conflicts:
	core/src/main/scala/org/apache/spark/rdd/RDD.scala
	project/SparkBuild.scala
2013-11-25 18:10:54 +05:30
Prashant Sharma 489862a657 Remote death watch has a funny bug.
https://gist.github.com/ScrapCodes/4805fd84906e40b7b03d
2013-11-25 18:00:02 +05:30
Reynold Xin 466fd06475 Incorporated ideas from pull request #200.
- Use Murmur Hash 3 finalization step to scramble the bits of HashCode
  instead of the simpler version in java.util.HashMap; the latter one
  had trouble with ranges of consecutive integers. Murmur Hash 3 is used
  by fastutil.

- Don't check keys for equality when re-inserting due to growing the
  table; the keys will already be unique

- Remember the grow threshold instead of recomputing it on each insert
2013-11-25 18:27:26 +08:00
Reynold Xin 95c55df1c2 Added unit tests for size estimation for specialized hash sets and maps. 2013-11-25 18:27:06 +08:00
Prashant Sharma 77929cfeed Fine tuning defaults for akka and restored tracking of dissassociated events, for they are delivered when a remote TCP socket is closed. Also made transport failure heartbeats larger interval for it is mostly not needed. As we are using remote death watch instead. 2013-11-25 14:13:21 +05:30
Matei Zaharia 859d62dc2a Merge pull request #151 from russellcardullo/add-graphite-sink
Add graphite sink for metrics

This adds a metrics sink for graphite.  The sink must
be configured with the host and port of a graphite node
and optionally may be configured with a prefix that will
be prepended to all metrics that are sent to graphite.
2013-11-24 16:19:51 -08:00
Matei Zaharia 65de73c7f8 Merge pull request #185 from mkolod/random-number-generator
XORShift RNG with unit tests and benchmark

This patch was introduced to address SPARK-950 - the discussion below the ticket explains not only the rationale, but also the design and testing decisions: https://spark-project.atlassian.net/browse/SPARK-950

To run unit test, start SBT console and type:
compile
test-only org.apache.spark.util.XORShiftRandomSuite
To run benchmark, type:
project core
console
Once the Scala console starts, type:
org.apache.spark.util.XORShiftRandom.benchmark(100000000)
XORShiftRandom is also an object with a main method taking the
number of iterations as an argument, so you can also run it
from the command line.
2013-11-24 15:52:33 -08:00
Reynold Xin 972171b9d9 Merge pull request #197 from aarondav/patrick-fix
Fix 'timeWriting' stat for shuffle files

Due to concurrent git branches, changes from shuffle file consolidation patch
caused the shuffle write timing patch to no longer actually measure the time,
since it requires time be measured after the stream has been closed.
2013-11-25 07:50:46 +08:00
Reynold Xin e9ff13ec72 Consolidated both mapPartitions related RDDs into a single MapPartitionsRDD.
Also changed the semantics of the index parameter in mapPartitionsWithIndex from the partition index of the output partition to the partition index in the current RDD.
2013-11-24 17:56:43 +08:00
Matei Zaharia 9837a60234 Some other optimizations to AppendOnlyMap:
- Don't check keys for equality when re-inserting due to growing the
  table; the keys will already be unique
- Remember the grow threshold instead of recomputing it on each insert
2013-11-23 17:38:29 -08:00
Matei Zaharia 7535d7fbcb Fixes to AppendOnlyMap:
- Use Murmur Hash 3 finalization step to scramble the bits of HashCode
  instead of the simpler version in java.util.HashMap; the latter one
  had trouble with ranges of consecutive integers. Murmur Hash 3 is used
  by fastutil.
- Use Object.equals() instead of Scala's == to compare keys, because the
  latter does extra casts for numeric types (see the equals method in
  https://github.com/scala/scala/blob/master/src/library/scala/runtime/BoxesRunTime.java)
2013-11-23 17:21:37 -08:00
Harvey Feng 4f1c3fa5d7 Hadoop 2.2 YARN API migration for SPARK_HOME/new-yarn 2013-11-23 17:08:30 -08:00
Ankur Dave c1507afc6c Support preservesPartitioning in RDD.zipPartitions 2013-11-23 03:03:31 -08:00
Aaron Davidson ccea38b759 Fix 'timeWriting' stat for shuffle files
Due to concurrent git branches, changes from shuffle file consolidation patch
caused the shuffle write timing patch to no longer actually measure the time,
since it requires time be measured after the stream has been closed.
2013-11-21 21:36:08 -08:00
Reynold Xin f20093c3af Merge pull request #196 from pwendell/master
TimeTrackingOutputStream should pass on calls to close() and flush().

Without this fix you get a huge number of open files when running shuffles.
2013-11-22 10:12:13 +08:00
Raymond Liu ab3cefde53 Add YarnClientClusterScheduler and Backend.
With this scheduler, the user application is launched locally,
While the executor will be launched by YARN on remote nodes.

This enables spark-shell to run upon YARN.
2013-11-22 09:23:27 +08:00
Patrick Wendell 53b94ef2f5 TimeTrackingOutputStream should pass on calls to close() and flush().
Without this fix you get a huge number of open shuffles after running
shuffles.
2013-11-21 17:20:15 -08:00
Kay Ousterhout fc78f67da2 Added logging of scheduler delays to UI 2013-11-21 16:54:23 -08:00
Tathagata Das fd031679df Added partitioner aware union, modified DStream.window. 2013-11-21 11:28:37 -08:00
Prashant Sharma 95d8dbce91 Merge branch 'master' of github.com:apache/incubator-spark into scala-2.10-temp
Conflicts:
	core/src/main/scala/org/apache/spark/util/collection/PrimitiveVector.scala
	streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala
2013-11-21 12:34:46 +05:30
Prashant Sharma 199e9cf02d Merge branch 'scala210-master' of github.com:colorant/incubator-spark into scala-2.10
Conflicts:
	core/src/main/scala/org/apache/spark/deploy/client/Client.scala
	core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala
	core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
	core/src/test/scala/org/apache/spark/MapOutputTrackerSuite.scala
2013-11-21 11:55:48 +05:30
Reynold Xin 2fead510f7 Merge branch 'master' of github.com:tbfenet/incubator-spark
PartitionPruningRDD is using index from parent

I was getting a ArrayIndexOutOfBoundsException exception after doing union on pruned RDD. The index it was using on the partition was the index in the original RDD not the new pruned RDD.
2013-11-21 07:15:55 +08:00
Marek Kolodziej 22724659db Make XORShiftRandom explicit in KMeans and roll it back for RDD 2013-11-20 07:03:36 -05:00
Marek Kolodziej bcc6ed30bf Formatting and scoping (private[spark]) updates 2013-11-19 20:50:38 -05:00
Henry Saputra 43dfac5132 Merge branch 'master' into removesemicolonscala 2013-11-19 16:57:57 -08:00
Henry Saputra 10be58f251 Another set of changes to remove unnecessary semicolon (;) from Scala code.
Passed the sbt/sbt compile and test
2013-11-19 16:56:23 -08:00
Matei Zaharia f568912f85 Merge pull request #181 from BlackNiuza/fix_tasks_number
correct number of tasks in ExecutorsUI

Index `a` is not `execId` here
2013-11-19 16:11:31 -08:00
tgravescs 4093e9393a Impove Spark on Yarn Error handling 2013-11-19 12:44:00 -06:00
Henry Saputra 9c934b640f Remove the semicolons at the end of Scala code to make it more pure Scala code.
Also remove unused imports as I found them along the way.
Remove return statements when returning value in the Scala code.

Passing compile and tests.
2013-11-19 10:19:03 -08:00
Matthew Taylor f639b65eab PartitionPruningRDD is using index from parent(review changes) 2013-11-19 10:48:48 +00:00
Matthew Taylor 13b9bf494b PartitionPruningRDD is using index from parent 2013-11-19 06:27:33 +00:00
Holden Karau e163e31c20 Add spaces 2013-11-18 20:13:25 -08:00
Holden Karau 7de180fd13 Remove explicit boxing 2013-11-18 20:05:05 -08:00
Marek Kolodziej 99cfe89c68 Updates to reflect pull request code review 2013-11-18 22:00:36 -05:00
Marek Kolodziej 09bdfe3b16 XORShift RNG with unit tests and benchmark
To run unit test, start SBT console and type:
compile
test-only org.apache.spark.util.XORShiftRandomSuite
To run benchmark, type:
project core
console
Once the Scala console starts, type:
org.apache.spark.util.XORShiftRandom.benchmark(100000000)
2013-11-18 15:21:43 -05:00
Russell Cardullo 1360f62d15 Cleanup GraphiteSink.scala based on feedback
* Reorder imports according to the style guide
* Consistently use propertyToOption in all places
2013-11-18 08:53:39 -08:00
shiyun.wxm eda05fa439 use HashSet.empty[Long] instead of Seq[Long] 2013-11-18 13:31:14 +08:00
Aaron Davidson 85763f4942 Add PrimitiveVectorSuite and fix bug in resize() 2013-11-17 18:16:51 -08:00
Reynold Xin 16a2286d6d Return the vector itself for trim and resize method in PrimitiveVector. 2013-11-17 17:52:02 -08:00
BlackNiuza ecfbaf2442 rename "a" to "statusId" 2013-11-18 09:51:40 +08:00
Reynold Xin c30979c7d6 Slightly enhanced PrimitiveVector:
1. Added trim() method
2. Added size method.
3. Renamed getUnderlyingArray to array.
4. Minor documentation update.
2013-11-17 17:09:40 -08:00
BlackNiuza b60839e56a correct number of tasks in ExecutorsUI 2013-11-17 21:38:57 +08:00
Matei Zaharia 1b5b358309 Merge pull request #178 from hsaputra/simplecleanupcode
Simple cleanup on Spark's Scala code

Simple cleanup on Spark's Scala code while testing some modules:
-) Remove some of unused imports as I found them
-) Remove ";" in the imports statements
-) Remove () at the end of method calls like size that does not have size effect.
2013-11-16 11:44:10 -08:00
Kay Ousterhout 2b0a6e7d92 Fixed error message in ClusterScheduler to be consistent with the old LocalScheduler 2013-11-15 18:34:28 -08:00
Kay Ousterhout 0913c22971 Merge remote-tracking branch 'upstream/master' into consolidate_schedulers
Conflicts:
	core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
2013-11-15 10:59:33 -08:00
Henry Saputra c33f802044 Simple cleanup on Spark's Scala code while testing core and yarn modules:
-) Remove some of unused imports as I found them
-) Remove ";" in the imports statements
-) Remove () at the end of method call like size that does not have size effect.
2013-11-15 10:32:20 -08:00
Matei Zaharia 96e0fb4630 Merge pull request #173 from kayousterhout/scheduler_hang
Fix bug where scheduler could hang after task failure.

When a task fails, we need to call reviveOffers() so that the
task can be rescheduled on a different machine. In the current code,
the state in ClusterTaskSetManager indicating which tasks are
pending may be updated after revive offers is called (there's a
race condition here), so when revive offers is called, the task set
manager does not yet realize that there are failed tasks that need
to be relaunched.

This isn't currently unit tested but will be once my pull request for
merging the cluster and local schedulers goes in -- at which point
many more of the unit tests will exercise the code paths through
the cluster scheduler (currently the failure test suite uses the local
scheduler, which is why we didn't see this bug before).
2013-11-14 22:29:28 -08:00
Aaron Davidson f629ba95b6 Various merge corrections
I've diff'd this patch against my own -- since they were both created
independently, this means that two sets of eyes have gone over all the
merge conflicts that were created, so I'm feeling significantly more
confident in the resulting PR.

@rxin has looked at the changes to the repl and is resoundingly
confident that they are correct.
2013-11-14 22:13:09 -08:00
Matei Zaharia dfd40e9f6f Merge pull request #175 from kayousterhout/no_retry_not_serializable
Don't retry tasks when they fail due to a NotSerializableException

As with my previous pull request, this will be unit tested once the Cluster and Local schedulers get merged.
2013-11-14 19:44:50 -08:00
Matei Zaharia ed25105fd9 Merge pull request #174 from ahirreddy/master
Write Spark UI url to driver file on HDFS

This makes the SIMR code path simpler
2013-11-14 19:43:55 -08:00
Raymond Liu d4cd32330e Some fixes for previous master merge commits 2013-11-15 10:22:31 +08:00