ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Matei Zaharia	ca67909cd4	Merge pull request #311 from tmyklebu/master SPARK-991: Report information gleaned from a Python stacktrace in the UI Scala: - Added setCallSite/clearCallSite to SparkContext and JavaSparkContext. These functions mutate a LocalProperty called "externalCallSite." - Add a wrapper, getCallSite, that checks for an externalCallSite and, if none is found, calls the usual Utils.formatSparkCallSite. - Change everything that calls Utils.formatSparkCallSite to call getCallSite instead. Except getCallSite. - Add setCallSite/clearCallSite wrappers to JavaSparkContext. Python: - Add a gruesome hack to rdd.py that inspects the traceback and guesses what you want to see in the UI. - Add a RAII wrapper around said gruesome hack that calls setCallSite/clearCallSite as appropriate. - Wire said RAII wrapper up around three calls into the Scala code. I'm not sure that I hit all the spots with the RAII wrapper. I'm also not sure that my gruesome hack does exactly what we want. One could also approach this change by refactoring runJob/submitJob/runApproximateJob to take a call site, then threading that parameter through everything that needs to know it. One might object to the pointless-looking wrappers in JavaSparkContext. Unfortunately, I can't directly access the SparkContext from Python---or, if I can, I don't know how---so I need to wrap everything that matters in JavaSparkContext. Conflicts: core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala	2014-01-02 15:54:54 -05:00
Kay Ousterhout	a1b438d94d	Remove erroneous FAILED state for killed tasks. Currently, when tasks are killed, the Executor first sends a status update for the task with a "KILLED" state, and then sends a second status update with a "FAILED" state saying that the task failed due to an exception. The second FAILED state is misleading/unnecessary, and occurs due to a NonLocalReturnControl Exception that gets thrown due to the way we kill tasks. This commit eliminates that problem.	2014-01-02 12:34:46 -08:00
Kay Ousterhout	5a3c00c958	Removed redundant TaskSetManager.error() function. This function was leftover from a while ago, and now just passes all calls through to the abort() function, so this commit deletes it.	2014-01-02 11:13:58 -08:00
Sean Owen	66d501276b	Suggested small changes to Java code for slightly more standard style, encapsulation and in some cases performance	2014-01-02 16:17:57 +00:00
Prashant Sharma	980afd280a	Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into spark-915-segregate-scripts Conflicts: bin/spark-shell core/pom.xml core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala core/src/test/scala/org/apache/spark/DriverSuite.scala python/run-tests sbin/compute-classpath.sh sbin/spark-class sbin/stop-slaves.sh	2014-01-02 17:55:21 +05:30
Matei Zaharia	0f6060733d	Fixed two uses of conf.get with no default value in Mesos	2014-01-01 22:09:42 -05:00
Matei Zaharia	e2c68642c6	Miscellaneous fixes from code review. Also replaced SparkConf.getOrElse with just a "get" that takes a default value, and added getInt, getLong, etc to make code that uses this simpler later on.	2014-01-01 22:03:39 -05:00
Matei Zaharia	45ff8f413d	Merge remote-tracking branch 'apache/master' into conf2 Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala	2014-01-01 21:25:00 -05:00
Patrick Wendell	f8d245bdfc	Merge remote-tracking branch 'apache-github/master' into log4j-fix-2 Conflicts: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala	2014-01-01 16:10:51 -08:00
Andrew Or	92c304fd03	Simplify ExternalAppendOnlyMap on the assumption that the mergeCombiners function is specified	2014-01-01 11:42:33 -08:00
Matei Zaharia	0e5b2adb5c	Merge remote-tracking branch 'apache/master' into conf2 Conflicts: project/SparkBuild.scala	2014-01-01 13:28:54 -05:00
Andrew Or	3bc9e391a3	Merge branch 'master' of github.com:andrewor14/incubator-spark	2013-12-31 20:02:12 -08:00
Andrew Or	83dfa16664	Address Patrick's and Reynold's comments	2013-12-31 20:02:05 -08:00
Reynold Xin	8b8e70ebde	Merge pull request #73 from falaki/ApproximateDistinctCount Approximate distinct count Added countApproxDistinct() to RDD and countApproxDistinctByKey() to PairRDDFunctions to approximately count distinct number of elements and distinct number of values per key, respectively. Both functions use HyperLogLog from stream-lib for counting. Both functions take a parameter that controls the trade-off between accuracy and memory consumption. Also added Scala docs and test suites for both methods.	2013-12-31 17:48:24 -08:00
Aaron Davidson	08302b113a	Rename IntermediateBlockId to TempBlockId	2013-12-31 17:44:15 -08:00
Patrick Wendell	37c43c9dd1	Adding outer checkout when initializing logging	2013-12-31 17:36:56 -08:00
Andrew Or	8bbe08b21e	Merge branch 'master' of github.com:andrewor14/incubator-spark	2013-12-31 17:26:26 -08:00
Andrew Or	53d8d36684	Add support and test for null keys in ExternalAppendOnlyMap Also add safeguard against use of destructively sorted AppendOnlyMap	2013-12-31 17:19:02 -08:00
Hossein Falaki	bee445c927	Made the code more compact and readable	2013-12-31 16:58:18 -08:00
Hossein Falaki	acb0323053	minor improvements	2013-12-31 15:34:26 -08:00
Matei Zaharia	ba9338f104	Merge remote-tracking branch 'apache/master' into conf2 Conflicts: core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala	2013-12-31 18:23:14 -05:00
Patrick Wendell	63b411dd86	Merge pull request #238 from ngbinh/upgradeNetty upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final the changes are listed at https://github.com/netty/netty/wiki/New-and-noteworthy	2013-12-31 14:31:28 -08:00
Andrew Or	3ce22df954	Add warning message for spilling	2013-12-31 11:33:10 -08:00
Andrew Or	94ddc91d06	Address Aaron's and Jerry's comments	2013-12-31 10:50:08 -08:00
Patrick Wendell	55b7e2fdff	Merge pull request #289 from tdas/filestream-fix Bug fixes for file input stream and checkpointing - Fixed bugs in the file input stream that led the stream to fail due to transient HDFS errors (listing files when a background thread is deleting fails caused errors, etc.) - Updated Spark's CheckpointRDD and Streaming's CheckpointWriter to use SparkContext.hadoopConfiguration, to allow checkpoints to be written to any HDFS compatible store requiring special configuration. - Changed the API of SparkContext.setCheckpointDir() - eliminated the unnecessary 'useExisting' parameter. Now SparkContext will always create a unique subdirectory within the user specified checkpoint directory. This is to ensure that previous checkpoint files are not accidentally overwritten. - Fixed bug where setting checkpoint directory as a relative local path caused the checkpointing to fail.	2013-12-31 10:12:51 -08:00
Tathagata Das	fcd17a1e8e	Fixed comments and long lines based on comments on PR 289.	2013-12-31 02:01:45 -08:00
Patrick Wendell	4abb0c57ab	Tiny typo fix	2013-12-31 00:05:03 -08:00
Patrick Wendell	3c254f2eec	Minor fixes	2013-12-30 23:55:33 -08:00
Aaron Davidson	375d11743c	Add new line at end of file	2013-12-30 23:42:37 -08:00
Patrick Wendell	18181e6c41	Removing initLogging entirely	2013-12-30 23:39:47 -08:00
Aaron Davidson	daa7792ad6	Refactor SamplingSizeTracker into SizeTrackingAppendOnlyMap	2013-12-30 23:39:02 -08:00
Hossein Falaki	c3073b6cf2	Added Java API for countApproxDistinct	2013-12-30 19:31:06 -08:00
Hossein Falaki	ed06500d30	Added Java API for countApproxDistinctByKey	2013-12-30 19:30:42 -08:00
Hossein Falaki	a7de8e9b1c	Renamed countDistinct and countDistinctByKey methods to include Approx	2013-12-30 19:28:03 -08:00
Matei Zaharia	0fa5809768	Updated docs for SparkConf and handled review comments	2013-12-30 22:17:28 -05:00
Hossein Falaki	d50ccc5ca9	Using origin version	2013-12-30 15:08:34 -08:00
Andrew Or	347fafe4fc	Fix CheckpointSuite test fail	2013-12-30 13:10:33 -08:00
Andrew Or	d6e7910d92	Simplify merge logic based on the invariant that all spills contain unique keys	2013-12-30 13:01:00 -08:00
Patrick Wendell	1cbef081e3	Response to Shivaram's review	2013-12-30 12:46:09 -08:00
Andrew Or	2b71ab97c4	Merge pull request from aarondav: Utilize DiskBlockManager pathway for temp file writing This gives us a couple advantages: - Uses spark.local.dir and randomly selects a directory/disk. - Ensure files are deleted on normal DiskBlockManager cleanup. - Availability of same stats as usual DiskBlockObjectWriter (currently unused). Also enable basic cleanup when iterator is fully drained. Still requires cleanup for operations that fail or don't go through all elements.	2013-12-30 11:01:30 -08:00
Patrick Wendell	50e3b8ec4c	Merge pull request #308 from kayousterhout/stage_naming Changed naming of StageCompleted event to be consistent The rest of the SparkListener events are named with "SparkListener" as the prefix of the name; this commit renames the StageCompleted event to SparkListenerStageCompleted for consistency.	2013-12-30 07:44:26 -08:00
Patrick Wendell	cffe1c1d5c	SPARK-1008: Logging improvements 1. Adds a default log4j file that gets loaded if users haven't specified a log4j file. 2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings after building with SBT (and I've seen similar warnings on the mailing list).	2013-12-29 23:14:33 -08:00
Andrew Or	015a510b0a	Merge branch 'master' of github.com:andrewor14/incubator-spark	2013-12-29 22:03:47 -08:00
Andrew Or	4a014dc59c	Make serializer a parameter to ExternalAppendOnlyMap	2013-12-29 21:55:53 -08:00
Kay Ousterhout	c2c1af39f5	Updated code style according to Patrick's comments	2013-12-29 21:10:08 -08:00
Aaron Davidson	e3cac47e65	Use Comparator instead of Ordering to lower object creation costs	2013-12-29 19:58:37 -08:00
Matei Zaharia	994f080f8a	Properly show Spark properties on web UI, and change app name property	2013-12-29 22:19:33 -05:00
Andrew Or	8fbff9f5d0	Address Aaron's comments	2013-12-29 16:22:44 -08:00
Matei Zaharia	11540b798d	Added tests for SparkConf and fixed a bug Typesafe Config caches system properties the first time it's invoked by default, ignoring later changes unless you do something special	2013-12-29 18:44:06 -05:00
Matei Zaharia	1ee7f5aee4	Fix a change that was lost during merge	2013-12-29 18:15:46 -05:00
Matei Zaharia	0bd1900cbc	Fix a few settings that were being read as system properties after merge	2013-12-29 15:38:46 -05:00
Patrick Wendell	7a99702ce2	Respect supervise option at Master	2013-12-29 12:12:58 -08:00
Matei Zaharia	b4ceed40d6	Merge remote-tracking branch 'origin/master' into conf2 Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/main/scala/org/apache/spark/scheduler/local/LocalScheduler.scala core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala core/src/test/scala/org/apache/spark/scheduler/TaskResultGetterSuite.scala core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala new-yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala streaming/src/test/scala/org/apache/spark/streaming/WindowOperationsSuite.scala	2013-12-29 15:08:08 -05:00
Patrick Wendell	a8729770f5	Slight change to retry logic	2013-12-29 11:57:57 -08:00
Patrick Wendell	8da1012f9b	TODO clean-up	2013-12-29 11:38:12 -08:00
Patrick Wendell	faefea3fd8	Adding driver ID to submission response	2013-12-29 11:31:10 -08:00
Patrick Wendell	6ffa9bb226	Documentation and adding supervise option	2013-12-29 11:26:56 -08:00
Patrick Wendell	35f6dc252a	Changes to allow fate sharing of drivers/executors and workers.	2013-12-29 11:14:36 -08:00
Matei Zaharia	cd00225db9	Add SparkConf support in Python	2013-12-29 14:03:39 -05:00
Tor Myklebust	d812aeece9	Factor call site reporting out to SparkContext.	2013-12-28 23:21:49 -05:00
Matei Zaharia	20631348d1	Fix other failing tests	2013-12-28 23:17:58 -05:00
Matei Zaharia	5bbe73864e	Fix Executor not getting properties in local mode	2013-12-28 17:31:58 -05:00
Matei Zaharia	a16c52ed1b	Check for SPARK_YARN_MODE through a system property too since it can sometimes be set that way (undoes a change in previous commit)	2013-12-28 17:24:21 -05:00
Matei Zaharia	642029e7f4	Various fixes to configuration code - Got rid of global SparkContext.globalConf - Pass SparkConf to serializers and compression codecs - Made SparkConf public instead of private[spark] - Improved API of SparkContext and SparkConf - Switched executor environment vars to be passed through SparkConf - Fixed some places that were still using system properties - Fixed some tests, though others are still failing This still fails several tests in core, repl and streaming, likely due to properties not being set or cleared correctly (some of the tests run fine in isolation).	2013-12-28 17:13:15 -05:00
Patrick Wendell	7375047d51	Merge pull request #304 from kayousterhout/remove_unused Removed unused failed and causeOfFailure variables (in TaskSetManager)	2013-12-28 13:25:06 -08:00
Matei Zaharia	ad3dfd1531	Merge pull request #307 from kayousterhout/other_failure Removed unused OtherFailure TaskEndReason. The OtherFailure TaskEndReason was added by @mateiz 3 years ago in this commit: `24a1e7f838` Unless I am missing something, it doesn't seem to have been used then, and is not used now, so seems safe for deletion.	2013-12-27 22:10:14 -05:00
Kay Ousterhout	b4619e509b	Changed naming of StageCompleted event to be consistent The rest of the SparkListener events are named with "SparkListener" as the prefix of the name; this commit renames the StageCompleted event to SparkListenerStageCompleted for consistency.	2013-12-27 17:45:20 -08:00
Kay Ousterhout	e17d7518ab	Removed unused OtherFailure TaskEndReason.	2013-12-27 15:51:27 -08:00
Kay Ousterhout	8419148e5f	Remove unused hasPendingTasks methods	2013-12-27 15:19:42 -08:00
Patrick Wendell	c8c8b42a6f	Some notes and TODO about dependencies	2013-12-27 15:13:11 -08:00
Kay Ousterhout	0c71ffe924	Style fixes as per Reynold's review	2013-12-27 12:19:38 -08:00
Kay Ousterhout	8c81068e16	Fixed >100char lines in DAGScheduler.scala	2013-12-27 11:36:54 -08:00
Binh Nguyen	2c5bade4ee	Fix failed unit tests Also clean up a bit.	2013-12-27 11:24:30 -08:00
Kay Ousterhout	baaabcedc9	Removed unused failed and causeOfFailure variables	2013-12-27 11:12:36 -08:00
Aaron Davidson	2a7b3511f4	Add Apache headers	2013-12-27 10:55:16 -08:00
Reynold Xin	7be1e57786	Merge pull request #298 from aarondav/minor Minor: Decrease margin of left side of Log page Before ![before](https://f.cloud.github.com/assets/1400247/1812647/1a4be53e-6e87-11e3-9d5b-f851274be0e9.png) After ![after](https://f.cloud.github.com/assets/1400247/1812648/1ca1ea2c-6e87-11e3-946c-31be9258f450.png) It's a start anyway...	2013-12-26 23:41:40 -10:00
Andrew Or	d0cfbc41e2	Rename spark.shuffle.buffer variables	2013-12-27 00:07:09 -08:00
Andrew Or	8f3175773c	Final cleanup	2013-12-26 23:40:08 -08:00
Aaron Davidson	1dc0440c1a	Use real serializer & manual ordering	2013-12-26 23:40:08 -08:00
Aaron Davidson	0f66b7f2fc	Return efficient iterator if no spillage happened	2013-12-26 23:40:08 -08:00
Andrew Or	ec8c5dc644	Sort AppendOnlyMap in-place	2013-12-26 23:40:08 -08:00
Aaron Davidson	0289eb752a	Allow Product2 rather than just tuple kv pairs	2013-12-26 23:40:07 -08:00
Andrew Or	64b2d54a02	Move maps to util, and refactor more	2013-12-26 23:40:07 -08:00
Aaron Davidson	804beb43be	SamplingSizeTracker + Map + test suite	2013-12-26 23:40:07 -08:00
Andrew Or	7ad4408255	New minor edits	2013-12-26 23:40:07 -08:00
Aaron Davidson	fcc443b3db	Minor cleanup for Scala style	2013-12-26 23:40:07 -08:00
Andrew Or	2a2ca2a661	Add toggle for ExternalAppendOnlyMap in Aggregator and CoGroupedRDD	2013-12-26 23:40:07 -08:00
Andrew Or	28685a4820	Provide for cases when mergeCombiners is not specified in ExternalAppendOnlyMap	2013-12-26 23:40:07 -08:00
Andrew Or	17def8cc11	Refactor ExternalAppendOnlyMap to take in KVC instead of just KV	2013-12-26 23:40:07 -08:00
Andrew Or	6a45ec1972	Working ExternalAppendOnlyMap for both CoGroupedRDDs and Aggregator	2013-12-26 23:40:07 -08:00
Andrew Or	97fbb3ec52	Working ExternalAppendOnlyMap for Aggregator, but not for CoGroupedRDD	2013-12-26 23:40:07 -08:00
Aaron Davidson	4f2fb761b0	Decrease margin of left side of log page	2013-12-26 15:38:45 -08:00
Patrick Wendell	5c1b4f6405	Minor fixes	2013-12-26 14:39:39 -08:00
Tathagata Das	5fde4566ea	Added Apache boilerplate and class docs to PartitionerAwareUnionRDD.	2013-12-26 14:33:37 -08:00
Patrick Wendell	c23d640516	Addressing smaller changes from Aaron's review	2013-12-26 12:38:39 -08:00
Tathagata Das	3579647cdc	Merge branch 'apache-master' into window-improvement	2013-12-26 12:12:10 -08:00
Patrick Wendell	da20270b83	Merge pull request #1 from aarondav/driver Refactor DriverClient to be more Actor-based	2013-12-26 12:11:52 -08:00
Patrick Wendell	a97ad55c45	Removing accidental file	2013-12-26 12:11:28 -08:00
Tathagata Das	c4a54f51b5	Merge branch 'master' into window-improvement	2013-12-26 12:03:11 -08:00
Patrick Wendell	5938cfc153	Updated approach to driver restarting	2013-12-26 12:02:19 -08:00
Mark Hamstra	c529dceaff	Avoid a lump of coal (NPE) in JobProgressListener's stocking.	2013-12-25 23:10:02 -08:00
Tathagata Das	94479673eb	Fixed bug in PartitionAwareUnionRDD	2013-12-26 00:07:45 +00:00
Aaron Davidson	61372b11f4	Refactor DriverClient to be more Actor-based	2013-12-25 10:55:25 -08:00
walker	0af4b4f3e8	Bug fixes for updating the RDD block's memory and disk usage information	2013-12-25 20:07:01 +08:00
Patrick Wendell	bbc362833b	Removing un-used variable	2013-12-25 01:38:57 -08:00
Patrick Wendell	18ad419b52	Small fix from rebase	2013-12-25 01:22:38 -08:00
Patrick Wendell	55f833803a	Minor bug fix	2013-12-25 01:19:25 -08:00
Patrick Wendell	c9c0f745af	Minor style clean-up	2013-12-25 01:19:25 -08:00
Patrick Wendell	b2b7514ba3	Import clean-up (yay Aaron)	2013-12-25 01:19:25 -08:00
Patrick Wendell	d5f23e0083	Adding scheduling and reporting based on cores	2013-12-25 01:19:01 -08:00
Patrick Wendell	760823d393	Adding better option parsing	2013-12-25 01:19:01 -08:00
Patrick Wendell	6a4acc4c2d	Initial cut at driver submission.	2013-12-25 01:19:01 -08:00
Patrick Wendell	1070b566d4	Renaming Client => AppClient	2013-12-25 01:17:01 -08:00
Patrick Wendell	85a344b4f0	Merge pull request #127 from kayousterhout/consolidate_schedulers Deduplicate Local and Cluster schedulers. The code in LocalScheduler/LocalTaskSetManager was nearly identical to the code in ClusterScheduler/ClusterTaskSetManager. The redundancy made updating the schedulers unnecessarily painful and error-prone. This commit combines the two into a single TaskScheduler/TaskSetManager. Unfortunately the diff makes this change look much more invasive than it is -- TaskScheduler.scala is only superficially changed (names updated, overrides removed) from the old ClusterScheduler.scala, and the same with TaskSetManager.scala. Thanks @rxin for suggesting this change!	2013-12-24 16:35:06 -08:00
Binh Nguyen	786f393a98	Fix imports order	2013-12-24 14:59:30 -08:00
Binh Nguyen	9115a5de62	Remove import * and fix some formatting	2013-12-24 14:59:30 -08:00
Binh Nguyen	040dd3ecd5	upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final	2013-12-24 14:58:18 -08:00
Patrick Wendell	c2dd6bcd6e	Merge pull request #279 from aarondav/shuffle-cleanup0 Clean up shuffle files once their metadata is gone Previously, we would only clean the in-memory metadata for consolidated shuffle files. Additionally, fixes a bug where the Metadata Cleaner was ignoring type-specific TTLs.	2013-12-24 14:36:47 -08:00
Kay Ousterhout	1efe3adf56	Responded to Reynold's style comments	2013-12-24 14:18:39 -08:00
Tathagata Das	d4dfab503a	Fixed Python API for sc.setCheckpointDir. Also other fixes based on Reynold's comments on PR 289.	2013-12-24 14:01:13 -08:00
Tathagata Das	9f79fd89dc	Merge branch 'apache-master' into filestream-fix	2013-12-24 11:38:17 -08:00
Prashant Sharma	2573add94c	spark-544, introducing SparkConf and related configuration overhaul.	2013-12-25 00:09:36 +05:30
Matei Zaharia	23a9ae6be3	Merge pull request #277 from tdas/scheduler-update Refactored the streaming scheduler and added StreamingListener interface - Refactored the streaming scheduler for cleaner code. Specifically, the JobManager was renamed to JobScheduler, as it does the actual scheduling of Spark jobs to the SparkContext. The earlier Scheduler was renamed to JobGenerator, as it actually generates the jobs from the DStreams. The JobScheduler starts the JobGenerator. Also, moved all the scheduler related code from spark.streaming to spark.streaming.scheduler package. - Implemented the StreamingListener interface, similar to SparkListener. The streaming version of StatusReportListener prints the batch processing time statistics (for now). Added StreamingListenerSuite to test it. - Refactored streaming TestSuiteBase for deduping code in the other streaming testsuites.	2013-12-24 00:08:48 -05:00
Reynold Xin	11107c9de5	Merge pull request #244 from leftnoteasy/master Added SPARK-968 implementation for review	2013-12-23 10:38:20 -08:00
wangda.tan	2f689ba97b	SPARK-968, added executor address showing in aggregated metrics by executors table	2013-12-23 15:03:45 +08:00
Kay Ousterhout	b7bfae1afe	Correctly merged in maxTaskFailures fix	2013-12-22 07:34:44 -08:00
wangda.tan	c979eecdf6	added changes according to comments from rxin	2013-12-22 21:43:15 +08:00
Kay Ousterhout	30186aa264	Renamed ClusterScheduler to TaskSchedulerImpl	2013-12-20 14:58:04 -08:00
Kay Ousterhout	c06945cfe0	Merge remote branch 'upstream/master' into consolidate_schedulers Conflicts: core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala	2013-12-20 14:39:30 -08:00
Patrick Wendell	0bc57c5767	Merge pull request #280 from aarondav/minor Minor cleanup for standalone scheduler See commit messages	2013-12-20 11:56:54 -08:00
Tathagata Das	61f4bbda0d	Added tests for PartitionerAwareUnionRDD in the CheckpointSuite. Refactored CheckpointSuite to make the tests simpler and more reliable. Added missing test for ZippedRDD.	2013-12-20 00:41:47 -08:00
Patrick Wendell	eca68d4425	Merge pull request #272 from tmyklebu/master Track and report task result serialisation time. - DirectTaskResult now has a ByteBuffer valueBytes instead of a T value. - DirectTaskResult now has a member function T value() that deserialises valueBytes. - Executor serialises value into a ByteBuffer and passes it to DTR's ctor. - Executor tracks the time taken to do so and puts it in a new field in TaskMetrics. - StagePage now reports serialisation time from TaskMetrics along with the other things it reported.	2013-12-19 18:12:22 -08:00
Aaron Davidson	6613ab663d	Fix compiler warning in SparkZooKeeperSession	2013-12-19 17:56:13 -08:00
Aaron Davidson	4d74b899b7	Remove firstApp from the standalone scheduler Master As a lonely child with no one to care for it... we had to put it down.	2013-12-19 17:53:41 -08:00
Aaron Davidson	1ab031eaff	Extraordinarily minor code/comment cleanup	2013-12-19 17:51:29 -08:00
Aaron Davidson	0647ec9757	Clean up shuffle files once their metadata is gone Previously, we would only clean the in-memory metadata for consolidated shuffle files. Additionally, fixes a bug where the Metadata Cleaner was ignoring type-specific TTLs.	2013-12-19 15:40:48 -08:00
Reynold Xin	7990c56375	Merge pull request #276 from shivaram/collectPartition Add collectPartition to JavaRDD interface. This interface is useful for implementing `take` from other language frontends where the data is serialized. Also remove `takePartition` from PythonRDD and use `collectPartition` in rdd.py. Thanks @concretevitamin for the original change and tests.	2013-12-19 13:35:09 -08:00
Tathagata Das	de41c436a0	Merge branch 'scheduler-update' into window-improvement Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/WindowedDStream.scala	2013-12-19 12:05:08 -08:00
Shivaram Venkataraman	9cc3a6d3c0	Add comment explaining collectPartitions's use	2013-12-19 11:49:17 -08:00
Shivaram Venkataraman	d3234f9726	Make collectPartitions take an array of partitions Change the implementation to use runJob instead of PartitionPruningRDD. Also update the unit tests and the python take implementation to use the new interface.	2013-12-19 11:40:34 -08:00
Tathagata Das	984c582487	Merge branch 'scheduler-update' into filestream-fix Conflicts: core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala	2013-12-19 11:20:48 -08:00
Nick Pentreath	a76f53416c	Add toString to Java RDD, and __repr__ to Python RDD	2013-12-19 14:38:20 +02:00
Tathagata Das	ec71b445ad	Minor changes.	2013-12-18 23:39:28 -08:00
Aaron Davidson	293a0af5a1	In experimental clusters we've observed that a 10 second timeout was insufficient, despite having a low number of nodes and relatively small workload (16 nodes, <1.5 TB data). This would cause an entire job to fail at the beginning of the reduce phase. There is no particular reason for this value to be small as a timeout should only occur in an exceptional situation. Also centralized the reading of spark.akka.askTimeout to AkkaUtils (surely this can later be cleaned up to use Typesafe). Finally, deleted some lurking implicits. If anyone can think of a reason they should still be there, please let me know.	2013-12-18 21:42:29 -08:00
Tathagata Das	e93b391d75	Merge branch 'apache-master' into scheduler-update Conflicts: streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/dstream/ForEachDStream.scala	2013-12-18 17:51:14 -08:00
Tathagata Das	b80ec05635	Added StatsReportListener to generate processing time statistics across multiple batches.	2013-12-18 15:35:24 -08:00
Shivaram Venkataraman	af0cd6bd27	Add collectPartition to JavaRDD interface. Also remove takePartition from PythonRDD and use collectPartition in rdd.py.	2013-12-18 11:40:07 -08:00
Tor Myklebust	d3b1af4b6c	Add a serialisation time column to the StagePage.	2013-12-18 14:25:56 -05:00
Reynold Xin	9a6864d016	Fixed a performance problem in RDD.top and BoundedPriorityQueue (size in BoundedPriority was actually traversing the entire queue to calculate the size, resulting in bad performance in insertion).	2013-12-17 18:44:39 -08:00
wangda.tan	59e53fa21c	spark-968, changes to avoid an NPE	2013-12-17 17:57:27 +08:00
wangda.tan	36060f4f50	spark-898, changes according to review comments	2013-12-17 17:55:38 +08:00
Tor Myklebust	b2f0329511	Missed a spot; had an objectSer here too.	2013-12-17 00:18:46 -05:00
Tor Myklebust	25fa976580	Merge branch 'master' of git://github.com/apache/incubator-spark	2013-12-16 23:48:37 -05:00
Tor Myklebust	963d6f065a	Incorporate pwendell's code review suggestions.	2013-12-16 23:14:52 -05:00
Reynold Xin	883e034aeb	Merge pull request #245 from gregakespret/task-maxfailures-fix Fix for spark.task.maxFailures not enforced correctly. Docs at http://spark.incubator.apache.org/docs/latest/configuration.html say: ``` spark.task.maxFailures Number of individual task failures before giving up on the job. Should be greater than or equal to 1. Number of allowed retries = this value - 1. ``` Previous implementation worked incorrectly. When for example `spark.task.maxFailures` was set to 1, the job was aborted only after the second task failure, not after the first one.	2013-12-16 14:16:02 -08:00
Tor Myklebust	882d544856	UI to display serialisation time of a stage.	2013-12-16 13:27:03 -05:00
Tor Myklebust	8a397a959b	Track task value serialisation time in TaskMetrics.	2013-12-16 12:07:39 -05:00
wangda.tan	8ab8c6a526	Merge branch 'master' of git://github.com/apache/incubator-spark	2013-12-16 21:45:43 +08:00
Reynold Xin	bad85b051d	Use murmur3 hash for open hashset. (cherry picked from commit 212ff6834515543163aa63a3f4f762ebe641f8ca) Signed-off-by: Ankur Dave <ankurdave@gmail.com>	2013-12-15 17:23:15 -08:00
Josh Rosen	2fd781d347	Merge pull request #249 from ngbinh/partitionInJavaSortByKey Expose numPartitions parameter in JavaPairRDD.sortByKey() This change makes Java and Scala API on sortByKey() the same.	2013-12-14 12:59:37 -08:00
Prashant Sharma	1ae3c0fc5e	Added a comment about ActorRef and ActorSelection difference.	2013-12-14 10:44:24 +05:30
Prashant Sharma	a854cc536d	Review comments on the PR for scala 2.10 migration.	2013-12-13 15:19:51 +05:30
Tathagata Das	097e120c0c	Refactored streaming scheduler and added listener interface. - Refactored Scheduler + JobManager to JobGenerator + JobScheduler and added JobSet for cleaner code. Moved scheduler related code to streaming.scheduler package. - Added StreamingListener trait (similar to SparkListener) to enable gathering of streaming stats like processing times and delays. Added StreamingContext.addListener() to add listeners. - Deduped some code in streaming tests by modifying TestSuiteBase, and added StreamingListenerSuite.	2013-12-12 20:48:02 -08:00
Tathagata Das	5e9ce83d68	Fixed multiple file stream and checkpointing bugs. - Made file stream more robust to transient failures. - Changed Spark.setCheckpointDir API to not have the second 'useExisting' parameter. Spark will always create a unique directory for checkpointing underneath the directory provided to the function. - Fixed bug wrt local relative paths as checkpoint directory. - Made DStream and RDD checkpointing use SparkContext.hadoopConfiguration, so that more HDFS compatible filesystems are supported for checkpointing.	2013-12-11 14:01:36 -08:00
Prashant Sharma	603af51bb5	Merge branch 'master' into akka-bug-fix Conflicts: core/pom.xml core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala pom.xml project/SparkBuild.scala streaming/pom.xml yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala	2013-12-11 10:21:53 +05:30
Binh Nguyen	0b494f7db4	Hook directly to Scala API	2013-12-10 11:17:52 -08:00
Binh Nguyen	e85af50767	Leave default value of numPartitions to Scala code.	2013-12-10 11:04:14 -08:00
Binh Nguyen	c82d4f079b	Use braces to shorten the line.	2013-12-10 01:04:52 -08:00
Binh Nguyen	5013fb64b2	Expose numPartitions parameter in JavaPairRDD.sortByKey() This change makes Java and Scala API on sortByKey() the same.	2013-12-10 00:38:16 -08:00
Prashant Sharma	17db6a9041	Style fixes and addressed review comments at #221	2013-12-10 11:47:16 +05:30
Patrick Wendell	5b74609d97	License headers	2013-12-09 16:41:01 -08:00
Grega Kespret	14a1df6572	Fix for spark.task.maxFailures not enforced correctly.	2013-12-09 10:39:02 +01:00
Aaron Davidson	40f63eb034	Merge master into 127	2013-12-08 11:16:52 -08:00
wangda.tan	850c4b709a	Merge branch 'master' of https://github.com/leftnoteasy/incubator-spark-1	2013-12-09 00:12:46 +08:00
wangda.tan	48e4f2ad14	SPARK-968, In stage UI, add an overview section that shows task stats grouped by executor id	2013-12-09 00:02:59 +08:00
Matei Zaharia	e0392343a0	Merge pull request #190 from markhamstra/Stages4Jobs stageId <--> jobId mapping in DAGScheduler Okay, I think this one is ready to go -- or at least it's ready for review and discussion. It's a carry-over of https://github.com/mesos/spark/pull/842 with updates for the newer job cancellation functionality. The prior discussion still applies. I've actually changed the job cancellation flow a bit: Instead of ``cancelTasks`` going to the TaskScheduler and then ``taskSetFailed`` coming back to the DAGScheduler (resulting in ``abortStage`` there), the DAGScheduler now takes care of figuring out which stages should be cancelled, tells the TaskScheduler to cancel tasks for those stages, then does the cleanup within the DAGScheduler directly without the need for any further prompting by the TaskScheduler. I know of three outstanding issues, each of which can and should, I believe, be handled in follow-up pull requests: 1) https://spark-project.atlassian.net/browse/SPARK-960 2) JobLogger should be re-factored to eliminate duplication 3) Related to 2), the WebUI should also become a consumer of the DAGScheduler's new understanding of the relationship between jobs and stages so that it can display progress indication and the like grouped by job. Right now, some of this information is just being sent out as part of ``SparkListenerJobStart`` messages, but more or different job <--> stage information may need to be exported from the DAGScheduler to meet listeners needs. Except for the eventQueue -> Actor commit, the rest can be cherry-picked almost cleanly into branch-0.8. A little merging is needed in MapOutputTracker and the DAGScheduler. Merged versions of those files are in `aba2b40ce0` Note that between the recent Actor change in the DAGScheduler and the cleaning up of DAGScheduler data structures on job completion in this PR, some races have been introduced into the DAGSchedulerSuite. Those tests usually pass, and I don't think that better-behaved code that doesn't directly inspect DAGScheduler data structures should be seeing any problems, but I'll work on fixing DAGSchedulerSuite as either an addition to this PR or as a separate request. UPDATE: Fixed the race that I introduced. Created a JIRA issue (SPARK-965) for the one that was introduced with the switch to eventProcessorActor in the DAGScheduler.	2013-12-06 11:49:59 -08:00
Matei Zaharia	bfa68609d9	Merge pull request #233 from hsaputra/changecontexttobackend Change the name of input argument in ClusterScheduler#initialize from context to backend. The SchedulerBackend used to be called ClusterSchedulerContext so just want to make small change of the input param in the ClusterScheduler#initialize to reflect this.	2013-12-06 11:04:03 -08:00
Matei Zaharia	3fb302c08d	Merge pull request #205 from kayousterhout/logging Added logging of scheduler delays to UI This commit adds two metrics to the UI: 1) The time to get task results, if they're fetched remotely 2) The scheduler delay. When the scheduler starts getting overwhelmed (because it can't keep up with the rate at which tasks are being submitted), the result is that tasks get delayed on the tail-end: the message from the worker saying that the task has completed ends up in a long queue and takes a while to be processed by the scheduler. This commit records that delay in the UI so that users can tell when the scheduler is becoming the bottleneck.	2013-12-06 11:03:32 -08:00
Matei Zaharia	87676a6af2	Merge pull request #220 from rxin/zippart Memoize preferred locations in ZippedPartitionsBaseRDD so preferred location computation doesn't lead to exponential explosion. This was a problem in GraphX where we have a whole chain of RDDs that are ZippedPartitionsRDD's, and the preferred locations were taking eternity to compute. (cherry picked from commit `e36fe55a03`) Signed-off-by: Reynold Xin <rxin@apache.org>	2013-12-06 11:01:42 -08:00
Aaron Davidson	94b5881ee9	Fix long lines	2013-12-06 00:22:00 -08:00
Aaron Davidson	5a864e3fce	Rename SparkActorSystem to IndestructibleActorSystem	2013-12-06 00:21:43 -08:00
Prashant Sharma	c9cd2af71e	Merge branch 'wip-scala-2.10' into akka-bug-fix	2013-12-06 13:32:15 +05:30
Prashant Sharma	4e70480038	A left over akka -> akka.tcp changes	2013-12-06 12:29:53 +05:30
Henry Saputra	1cb259cb57	Change the name of input argument in ClusterScheduler#initialize from context to backend. The SchedulerBackend used to be called ClusterSchedulerContext so just want to make small change of the input param in the ClusterScheduler#initialize to reflect this.	2013-12-05 18:50:26 -08:00
Mark Hamstra	aebb123fd3	jobWaiter.synchronized before jobWaiter.wait	2013-12-05 17:16:44 -08:00
Reynold Xin	3fc4534d19	wip delta join.	2013-12-05 14:55:26 -08:00
Patrick Wendell	5d460253d6	Merge pull request #228 from pwendell/master Document missing configs and set shuffle consolidation to false.	2013-12-05 12:31:24 -08:00
Matei Zaharia	72b696156c	Merge pull request #199 from harveyfeng/yarn-2.2 Hadoop 2.2 migration Includes support for the YARN API stabilized in the Hadoop 2.2 release, and a few style patches. Short description for each set of commits: `a98f5a0` - "Misc style changes in the 'yarn' package" `a67ebf4` - "A few more style fixes in the 'yarn' package" Both of these are some minor style changes, such as fixing lines over 100 chars, to the existing YARN code. `ab8652f` - "Add a 'new-yarn' directory ... " Copies everything from `SPARK_HOME/yarn` to `SPARK_HOME/new-yarn`. No actual code changes here. `4f1c3fa` - "Hadoop 2.2 YARN API migration ..." API patches to code in the `SPARK_HOME/new-yarn` directory. There are a few more small style changes mixed in, too. Based on @colorant's Hadoop 2.2 support for the scala-2.10 branch in #141. `a1a1c62` - "Add optional Hadoop 2.2 settings in sbt build ... " If Spark should be built against Hadoop 2.2, then: a) the `org.apache.spark.deploy.yarn` package will be compiled from the `new-yarn` directory. b) Protobuf v2.5 will be used as a Spark dependency, since Hadoop 2.2 depends on it. Also, Spark will be built against a version of Akka v2.0.5 that's built against Protobuf 2.5, named `akka-2.0.5-protobuf-2.5`. The patched Akka is here: https://github.com/harveyfeng/akka/tree/2.0.5-protobuf-2.5, and was published to local Ivy during testing. There's also a new boolean environment variable, `SPARK_IS_NEW_HADOOP`, that users can manually set if their `SPARK_HADOOP_VERSION` specification does not start with `2.2`, which is how the build file tries to detect a 2.2 version. Not sure if this is necessary or done in the best way, though...	2013-12-04 23:33:04 -08:00
Patrick Wendell	b1c6fa1584	Document missing configs and set shuffle consolidation to false.	2013-12-04 18:39:34 -08:00
Patrick Wendell	182f9baeed	Merge pull request #227 from pwendell/master Fix small bug in web UI and minor clean-up. There was a bug where sorting order didn't work correctly for write time metrics. I also cleaned up some earlier code that fixed the same issue for read and write bytes.	2013-12-04 15:52:07 -08:00
Patrick Wendell	380b90b9b3	Fix small bug in web UI and minor clean-up. There was a bug where sorting order didn't work correctly for write time metrics. I also cleaned up some earlier code that fixed the same issue for read and write bytes.	2013-12-04 14:41:48 -08:00
Andrew Ash	217611680d	Add missing space after "Serialized" in StorageLevel Current code creates outputs like: scala> res0.getStorageLevel.description res2: String = Serialized1x Replicated	2013-12-04 11:29:20 -08:00
Matei Zaharia	d6e5473872	Merge pull request #223 from rxin/transient Mark partitioner, name, and generator field in RDD as @transient. As part of the effort to reduce serialized task size.	2013-12-04 10:28:50 -08:00
Reynold Xin	974a69d79c	Marked doCheckpointCalled as transient.	2013-12-03 11:34:38 -08:00
Mark Hamstra	403234dd0d	SparkListenerJobStart posted from local jobs	2013-12-03 09:57:32 -08:00
Mark Hamstra	f55d0b935d	Synchronous, inline cleanup after runLocally	2013-12-03 09:57:32 -08:00
Mark Hamstra	c9fcd909d0	Local jobs post SparkListenerJobEnd, and DAGScheduler data structure cleanup always occurs before any posting of SparkListenerJobEnd.	2013-12-03 09:57:32 -08:00
Mark Hamstra	9ae2d094a9	Tightly couple stageIdToJobIds and jobIdToStageIds	2013-12-03 09:57:32 -08:00
Mark Hamstra	27c45e5236	Cleaned up job cancellation handling	2013-12-03 09:57:32 -08:00
Mark Hamstra	686a420ddc	Refactoring to make job removal, stage removal, task cancellation clearer	2013-12-03 09:57:32 -08:00
Mark Hamstra	205566e56e	Improved comment	2013-12-03 09:57:32 -08:00
Mark Hamstra	94087c463b	Removed redundant residual re: reverted refactoring.	2013-12-03 09:57:31 -08:00
Mark Hamstra	982797dcba	Fixed intended side-effects	2013-12-03 09:57:31 -08:00
Mark Hamstra	6f8359b5ad	Actor instead of eventQueue for LocalJobCompleted	2013-12-03 09:57:31 -08:00
Mark Hamstra	51458ab4a1	Added stageId <--> jobId mapping in DAGScheduler ...and make sure that DAGScheduler data structures are cleaned up on job completion. Initial effort and discussion at https://github.com/mesos/spark/pull/842	2013-12-03 09:57:31 -08:00
Reynold Xin	58d9bbcfec	Merge pull request #217 from aarondav/mesos-urls Re-enable zk:// urls for Mesos SparkContexts This was broken in PR #71 when we explicitly disallow anything that didn't fit a mesos:// url. Although it is not really clear that a zk:// url should match Mesos, it is what the docs say and it is necessary for backwards compatibility. Additionally added a unit test for the creation of all types of TaskSchedulers. Since YARN and Mesos are not necessarily available in the system, they are allowed to pass as long as the YARN/Mesos code paths are exercised.	2013-12-02 21:58:53 -08:00
Prashant Sharma	09e8be9a62	Made running SparkActorSystem specific to executors only.	2013-12-03 11:27:45 +05:30
Aaron Davidson	0f24576c08	Cleanup and documentation of SparkActorSystem	2013-12-03 11:05:12 +05:30
Reynold Xin	e34b4693d3	Mark partitioner, name, and generator field in RDD as @transient.	2013-12-02 21:24:44 -08:00
Kay Ousterhout	58b3aff9a8	Fixed problem with scheduler delay	2013-12-02 20:30:03 -08:00
Aaron Davidson	f6c8c1c7b6	Cleanup and documentation of SparkActorSystem	2013-12-02 11:42:53 -08:00
Prashant Sharma	5b11028a04	Made akka capable of tolerating fatal exceptions and moving on.	2013-12-02 10:47:39 +05:30
Reynold Xin	740922f25d	Merge pull request #219 from sundeepn/schedulerexception Scheduler quits when newStage fails The current scheduler thread does not handle exceptions from newStage stage while launching new jobs. The thread fails on any exception that gets triggered at that level, leaving the cluster hanging with no scheduler.	2013-12-01 12:46:58 -08:00
Sundeep Narravula	be3ea2394f	Log exception in scheduler in addition to passing it to the caller. Code Styling changes.	2013-12-01 00:50:34 -08:00
Reynold Xin	9cf7f31e4d	Memoize preferred locations in ZippedPartitionsBaseRDD so preferred location computation doesn't lead to exponential explosion. (cherry picked from commit `e36fe55a03`) Signed-off-by: Reynold Xin <rxin@apache.org>	2013-11-30 18:10:52 -08:00
Reynold Xin	e36fe55a03	Memoize preferred locations in ZippedPartitionsBaseRDD so preferred location computation doesn't lead to exponential explosion.	2013-11-30 18:07:36 -08:00
Sundeep Narravula	4d53830eb7	Scheduler quits when createStage fails. The current scheduler thread does not handle exceptions from createStage stage while launching new jobs. The thread fails on any exception that gets triggered at that level, leaving the cluster hanging with no scheduler.	2013-11-30 16:18:12 -08:00
Prashant Sharma	5618af6803	Merge branch 'master' into wip-scala-2.10	2013-11-29 13:41:21 +05:30
Prashant Sharma	1bc83ca791	Changed defaults for akka to almost disable failure detector.	2013-11-29 13:41:05 +05:30
Lian, Cheng	4a1d966e26	More comments	2013-11-29 16:02:58 +08:00
Lian, Cheng	1e25086009	Updated some inline comments in DAGScheduler	2013-11-29 15:56:47 +08:00
Aaron Davidson	081a0b6861	Add unit test for SparkContext scheduler creation Since YARN and Mesos are not necessarily available in the system, they are allowed to pass as long as the YARN/Mesos code paths are exercised.	2013-11-28 20:40:57 -08:00
Aaron Davidson	37f161cf6b	Re-enable zk:// urls for Mesos SparkContexts This was broken in PR #71 when we explicitly disallow anything that didn't fit a mesos:// url. Although it is not really clear that a zk:// url should match Mesos, it is what the docs say and it is necessary for backwards compatibility.	2013-11-28 20:37:56 -08:00
Lian, Cheng	18def5d6f2	Bugfix: SPARK-965 & SPARK-966 SPARK-965: https://spark-project.atlassian.net/browse/SPARK-965 SPARK-966: https://spark-project.atlassian.net/browse/SPARK-966 * Add back DAGScheduler.start(), eventProcessActor is created and started here. Notice that function is only called by SparkContext. * Cancel the scheduled stage resubmission task when stopping eventProcessActor * Add a new DAGSchedulerEvent ResubmitFailedStages This event message is sent by the scheduled stage resubmission task to eventProcessActor. In this way, DAGScheduler.resubmitFailedStages is guaranteed to be executed from the same thread that runs DAGScheduler.processEvent. Please refer to discussion in SPARK-966 for details.	2013-11-28 17:46:06 +08:00
Prashant Sharma	3ec5d74766	Fixed the broken build.	2013-11-28 13:02:28 +05:30
Matei Zaharia	743a31a7ca	Merge pull request #210 from haitaoyao/http-timeout add http timeout for httpbroadcast While pulling task bytecode from HttpBroadcast server, there's no timeout value set. This may cause the Spark executor code to hang and other tasks in the same executor process to wait for the lock. I have encountered the issue in my cluster. Here's the stacktrace I captured : https://gist.github.com/haitaoyao/7655830 So add a timeout value to ensure the task fails fast.	2013-11-27 18:24:39 -08:00
Prashant Sharma	17987778da	Merge branch 'master' into wip-scala-2.10 Conflicts: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala core/src/main/scala/org/apache/spark/rdd/MapPartitionsRDD.scala core/src/main/scala/org/apache/spark/rdd/MapPartitionsWithContextRDD.scala core/src/main/scala/org/apache/spark/rdd/RDD.scala python/pyspark/rdd.py	2013-11-27 14:44:12 +05:30
Prashant Sharma	54862af5ee	Improvements from the review comments and followed Boy Scout Rule.	2013-11-27 14:26:28 +05:30
Reynold Xin	95e83af209	More, bigger cleaning for better encapsulation of VertexSetRDD and VertexPartition. This is work in progress as stuff doesn't really run.	2013-11-27 00:30:26 -08:00
Reynold Xin	caba162861	Added join and aggregateUsingIndex to VertexPartition.	2013-11-26 21:02:39 -08:00
Matei Zaharia	fb6875dd5c	Merge pull request #146 from JoshRosen/pyspark-custom-serializers Custom Serializers for PySpark This pull request adds support for custom serializers to PySpark. For now, all Python-transformed (or parallelize()d RDDs) are serialized with the same serializer that's specified when creating SparkContext. For now, PySpark includes `PickleSerDe` and `MarshalSerDe` classes for using Python's `pickle` and `marshal` serializers. It's pretty easy to add support for other serializers, although I still need to add instructions on this. A few notable changes: - The Scala `PythonRDD` class no longer manipulates Pickled objects; data from `textFile` is written to Python as MUTF-8 strings. The Python code performs the appropriate bookkeeping to track which deserializer should be used when reading an underlying JavaRDD. This mechanism could also be used to support other data exchange formats, such as MsgPack. - Several magic numbers were refactored into constants. - Batching is implemented by wrapping / decorating an unbatched SerDe.	2013-11-26 20:55:40 -08:00
Matei Zaharia	330ada1766	Merge pull request #207 from henrydavidge/master Log a warning if a task's serialized size is very big As per Reynold's instructions, we now create a warning level log entry if a task's serialized size is too big. "Too big" is currently defined as 100kb. This warning message is generated at most once for each stage.	2013-11-26 19:08:33 -08:00
Harvey Feng	afe4fe7f5e	Merge remote-tracking branch 'origin/master' into yarn-2.2 Conflicts: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala	2013-11-26 15:03:03 -08:00
hhd	57579934f0	Emit warning when task size > 100KB	2013-11-26 16:58:39 -05:00
Reynold Xin	2d19d0381b	Merge branch 'simplify' into clean	2013-11-26 13:55:26 -08:00
Reynold Xin	d58bfa8573	Code cleaning to improve readability.	2013-11-26 13:54:46 -08:00
Reynold Xin	cb976dfb50	Merge pull request #209 from pwendell/better-docs Improve docs for shuffle instrumentation	2013-11-26 10:23:19 -08:00
Prashant Sharma	560e44a8e1	Restored master address for client.	2013-11-26 18:18:05 +05:30
Reynold Xin	d074e4c6ab	Bring PrimitiveVector up to date.	2013-11-26 02:49:41 -08:00
haitao.yao	db998a6e14	add http timeout for httpbroadcast	2013-11-26 18:23:48 +08:00
Prashant Sharma	d092a8cc6a	Fixed compile time warnings and formatting post merge.	2013-11-26 15:21:50 +05:30
Matei Zaharia	18d6df0e17	Merge pull request #86 from holdenk/master Add histogram functionality to DoubleRDDFunctions This pull request adds histogram functionality to the DoubleRDDFunctions.	2013-11-26 00:00:07 -08:00
Patrick Wendell	297c09d4bb	Improve docs for shuffle instrumentation	2013-11-25 22:53:28 -08:00
Holden Karau	7222ee2977	Fix the test	2013-11-25 21:06:42 -08:00
Matei Zaharia	0e2109ddb2	Merge pull request #204 from rxin/hash OpenHashSet fixes Incorporated ideas from pull request #200. - Use Murmur Hash 3 finalization step to scramble the bits of HashCode instead of the simpler version in java.util.HashMap; the latter one had trouble with ranges of consecutive integers. Murmur Hash 3 is used by fastutil. - Don't check keys for equality when re-inserting due to growing the table; the keys will already be unique. - Remember the grow threshold instead of recomputing it on each insert Also added unit tests for size estimation for specialized hash sets and maps.	2013-11-25 20:48:37 -08:00
Matei Zaharia	14bb465bb3	Merge pull request #201 from rxin/mappartitions Use the proper partition index in mapPartitionsWithIndex mapPartitionsWithIndex uses TaskContext.partitionId as the partition index. TaskContext.partitionId used to be identical to the partition index in a RDD. However, pull request #186 introduced a scenario (with partition pruning) that the two can be different. This pull request uses the right partition index in all mapPartitionsWithIndex related calls. Also removed the extra MapPartitionsWithContextRDD and put all the mapPartitions related functionality in MapPartitionsRDD.	2013-11-25 18:50:18 -08:00
Matei Zaharia	eb4296c8f7	Merge pull request #101 from colorant/yarn-client-scheduler For SPARK-527, Support spark-shell when running on YARN sync to trunk and resubmit here In the current YARN mode, the application runs in the Application Master as a user program, so the whole Spark context is remote. This approach does not support applications that involve local interaction and need to run where they are launched. So in this pull request I have added a YarnClientClusterScheduler and backend. With this scheduler, the user application is launched locally, while the executors are launched by YARN on remote nodes with a thin AM which only launches the executors and monitors the Driver Actor status, so that when the client app is done, it can finish the YARN Application as well. This enables spark-shell to run on YARN. It also enables other Spark applications to run the Spark context locally with a master-url "yarn-client". Thus e.g. SparkPi could output its result locally on the console instead of in the log of the remote machine where the AM is running. Docs are also updated to show how to use this yarn-client mode.	2013-11-25 15:25:29 -08:00
Prashant Sharma	44fd30d3fb	Merge branch 'master' into scala-2.10-wip Conflicts: core/src/main/scala/org/apache/spark/rdd/RDD.scala project/SparkBuild.scala	2013-11-25 18:10:54 +05:30
Prashant Sharma	489862a657	Remote death watch has a funny bug. https://gist.github.com/ScrapCodes/4805fd84906e40b7b03d	2013-11-25 18:00:02 +05:30
Reynold Xin	466fd06475	Incorporated ideas from pull request #200 . - Use Murmur Hash 3 finalization step to scramble the bits of HashCode instead of the simpler version in java.util.HashMap; the latter one had trouble with ranges of consecutive integers. Murmur Hash 3 is used by fastutil. - Don't check keys for equality when re-inserting due to growing the table; the keys will already be unique - Remember the grow threshold instead of recomputing it on each insert	2013-11-25 18:27:26 +08:00


2723 commits