ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
eklavya	6a65feebc7	Added foreachPartition method to JavaRDD.	2014-01-13 17:56:47 +05:30
eklavya	dbadc6b994	Added mapPartitions method to JavaRDD.	2014-01-13 17:56:10 +05:30
eklavya	aae8a01425	Added setter method setGenerator to JavaRDD.	2014-01-13 17:53:35 +05:30
Andrew Or	a1f0992fae	Report bytes spilled for both memory and disk on Web UI	2014-01-12 23:42:57 -08:00
Andrew Or	69c9aebed0	Enable external sorting by default	2014-01-12 22:43:01 -08:00
Reynold Xin	e6ed13f255	Merge pull request #397 from pwendell/host-port Remove now un-needed hostPort option I noticed this was logging some scary error messages in various places. After I looked into it, this is no longer really used. I removed the option and re-wrote the one remaining use case (it was unnecessary there anyways).	2014-01-12 22:35:14 -08:00
Andrew Or	8d40e7222f	Get rid of spill map in SparkEnv	2014-01-12 22:34:33 -08:00
Patrick Wendell	0b96d85c20	Merge pull request #399 from pwendell/consolidate-off Disable shuffle file consolidation by default After running various performance tests for the 0.9 release, this still seems to have performance issues even on XFS. So let's keep this off-by-default for 0.9 and users can experiment with it depending on their disk configurations.	2014-01-12 21:31:43 -08:00
Patrick Wendell	0ab505a29e	Merge pull request #395 from hsaputra/remove_simpleredundantreturn_scala Remove simple redundant return statements for Scala methods/functions Remove simple redundant return statements for Scala methods/functions: -) Only change simple return statements at the end of method -) Ignore the complex if-else check -) Ignore the ones inside synchronized -) Add small changes to making var to val if possible and remove () for simple get This hopefully makes the review simpler =) Pass compile and tests.	2014-01-12 21:31:04 -08:00
Patrick Wendell	2802cc80bc	Disable shuffle file consolidation by default	2014-01-12 19:16:43 -08:00
Henry Saputra	5a8abfb70e	Address code review concerns and comments.	2014-01-12 19:15:09 -08:00
Tathagata Das	aa2c993858	Merge remote-tracking branch 'apache/master' into error-handling	2014-01-12 17:37:46 -08:00
Patrick Wendell	074f50232f	Merge pull request #396 from pwendell/executor-env Setting load defaults to true in executor This preserves the behavior in earlier releases. If properties are set for the executors via `spark-env.sh` on the slaves, then they should take precedence over spark defaults. This is useful for if system administrators are setting properties for a standalone cluster, such as shuffle locations. /cc @andrewor14 who initially reported this issue.	2014-01-12 17:01:13 -08:00
Reynold Xin	82e2b92c6d	Merge pull request #392 from rxin/listenerbus Stop SparkListenerBus daemon thread when DAGScheduler is stopped. Otherwise this leads to hundreds of SparkListenerBus daemon threads in our unit tests (and is also problematic if a user application launches multiple SparkContexts).	2014-01-12 16:55:11 -08:00
Patrick Wendell	0bb33076e2	Removing mentions in tests	2014-01-12 16:53:58 -08:00
Patrick Wendell	0d4886c000	Remove now un-needed hostPort option	2014-01-12 16:47:52 -08:00
Patrick Wendell	cfb1e6c13c	Setting load defaults to true in executor	2014-01-12 15:35:08 -08:00
Henry Saputra	f1c5eca494	Fix accidental comment modification.	2014-01-12 10:40:21 -08:00
Henry Saputra	91a563608e	Merge branch 'master' into remove_simpleredundantreturn_scala	2014-01-12 10:34:13 -08:00
Henry Saputra	93a65e5fde	Remove simple redundant return statement for Scala methods/functions: -) Only change simple return statements at the end of method -) Ignore the complex if-else check -) Ignore the ones inside synchronized	2014-01-12 10:30:04 -08:00
Tathagata Das	18f4889d96	Merge remote-tracking branch 'apache/master' into error-handling	2014-01-11 23:40:57 -08:00
Tathagata Das	f5108ffc24	Converted JobScheduler to use actors for event handling. Changed protected[streaming] to private[streaming] in StreamingContext and DStream. Added waitForStop to StreamingContext, and StreamingContextSuite.	2014-01-11 23:15:09 -08:00
Reynold Xin	288a878999	Merge pull request #389 from rxin/clone-writables Minor update for clone writables and more documentation.	2014-01-11 21:53:19 -08:00
Reynold Xin	dbc11df411	Merge pull request #388 from pwendell/master Fix UI bug introduced in #244. The 'duration' field was incorrectly renamed to 'task time' in the table that lists stages.	2014-01-11 18:07:13 -08:00
Reynold Xin	362cda18bc	Renamed cloneKeyValues to cloneRecords; updated docs.	2014-01-11 18:01:29 -08:00
Patrick Wendell	07b952e1d1	Revert "Fix default TTL for metadata cleaner" This reverts commit `669ba4caa9`.	2014-01-11 16:07:10 -08:00
Reynold Xin	2180c87188	Stop SparkListenerBus daemon thread when DAGScheduler is stopped.	2014-01-11 13:36:37 -08:00
Reynold Xin	b0fbfccadc	Minor update for clone writables and more documentation.	2014-01-11 12:35:10 -08:00
Reynold Xin	ee6e7f9b8c	Merge pull request #359 from ScrapCodes/clone-writables We clone hadoop key and values by default and reuse objects if asked to. We try to clone for most common types of writables and we call WritableUtils.clone otherwise intention is to optimize, for example for NullWritable there is no need and for Long, int and String creating a new object with value set would be faster than doing copy on object hopefully. There is another way to do this PR where we ask for both key and values whether to clone them or not, but could not think of a use case for it except either of them is actually a NullWritable for which I have already worked around. So thought that would be unnecessary.	2014-01-11 12:07:55 -08:00
Patrick Wendell	b313e15616	Fix UI bug introduced in #244 . The 'duration' field was incorrectly renamed to 'task time' in the table that lists stages.	2014-01-11 10:52:57 -08:00
Reynold Xin	0b5ce7af17	Merge pull request #386 from pwendell/typo-fix Small typo fix	2014-01-10 23:23:21 -08:00
Andrew Or	bb8098f203	Add number of bytes spilled to Web UI	2014-01-10 21:40:55 -08:00
Ankur Dave	d1d2b6d9b6	Remove blank lines added to Spark core	2014-01-10 21:17:32 -08:00
Matei Zaharia	1d7bef0c91	Merge pull request #381 from mateiz/default-ttl Fix default TTL for metadata cleaner It seems to have been set to 3500 in a previous commit for debugging, but it should be off by default.	2014-01-10 18:53:03 -08:00
Andrew Or	e6447152b3	Induce spilling in ExternalAppendOnlyMapSuite	2014-01-10 18:33:48 -08:00
Ankur Dave	41d6586e8e	Revert changes to Spark's (PrimitiveKey)OpenHashMap; copy PKOHM to graphx	2014-01-10 18:00:54 -08:00
Patrick Wendell	44d6a8e3d8	Merge pull request #382 from RongGu/master Fix a type error in comment lines Fix a type error in comment lines	2014-01-10 17:51:50 -08:00
Patrick Wendell	08370a52b8	Small typo fix	2014-01-10 17:47:15 -08:00
Patrick Wendell	f26553102c	Merge pull request #383 from tdas/driver-test API for automatic driver recovery for streaming programs and other bug fixes 1. Added Scala and Java API for automatically loading checkpoint if it exists in the provided checkpoint directory. Scala API: `StreamingContext.getOrCreate(<checkpoint dir>, <function to create new StreamingContext>)` returns a StreamingContext Java API: `JavaStreamingContext.getOrCreate(<checkpoint dir>, <factory obj of type JavaStreamingContextFactory>)`, returns a JavaStreamingContext See the RecoverableNetworkWordCount below as an example of how to use it. 2. Refactored streaming.Checkpoint*** code to fix bugs and make the DStream metadata checkpoint writing and reading more robust. Specifically, it fixes and improves the logic behind backing up and writing metadata checkpoint files. Also, it ensures that spark.driver.* and spark.hostPort are cleared from SparkConf before being written to checkpoint. 3. Fixed bug in cleaning up of checkpointed RDDs created by DStream. Specifically, this fix ensures that checkpointed RDD's files are not prematurely cleaned up, thus ensuring reliable recovery. 4. TimeStampedHashMap is upgraded to optionally update the timestamp on map.get(key). This allows clearing of data based on access time (i.e., clear records that were last accessed before a threshold timestamp). 5. Added caching for file modification time in FileInputDStream using the updated TimeStampedHashMap. Without the caching, enumerating the mod times to find new files can take seconds if there are 1000s of files. This cache is automatically cleared. This PR is not entirely final as I may make some minor additions - a Java example, and adding StreamingContext.getOrCreate to unit test. Edit: Java example to be added later, unit test added.	2014-01-10 16:25:44 -08:00
Patrick Wendell	d37408f39c	Merge pull request #377 from andrewor14/master External Sorting for Aggregator and CoGroupedRDDs (Revisited) (This pull request is re-opened from https://github.com/apache/incubator-spark/pull/303, which was closed because Jenkins / github was misbehaving) The target issue for this patch is the out-of-memory exceptions triggered by aggregate operations such as reduce, groupBy, join, and cogroup. The existing AppendOnlyMap used by these operations resides purely in memory, and grows with the size of the input data until the amount of allocated memory is exceeded. Under large workloads, this problem is aggravated by the fact that OOM frequently occurs only after a very long (> 1 hour) map phase, in which case the entire job must be restarted. The solution is to spill the contents of this map to disk once a certain memory threshold is exceeded. This functionality is provided by ExternalAppendOnlyMap, which additionally sorts this buffer before writing it out to disk, and later merges these buffers back in sorted order. Under normal circumstances in which OOM is not triggered, ExternalAppendOnlyMap is simply a wrapper around AppendOnlyMap and incurs little overhead. Only when the memory usage is expected to exceed the given threshold does ExternalAppendOnlyMap spill to disk.	2014-01-10 16:25:01 -08:00
Tathagata Das	4f39e79c23	Merge remote-tracking branch 'apache/master' into driver-test Conflicts: streaming/src/main/scala/org/apache/spark/streaming/DStreamGraph.scala	2014-01-10 15:47:01 -08:00
Reynold Xin	0eaf01c5ed	Merge pull request #369 from pillis/master SPARK-961 Add a Vector.random() method Added method and testcases	2014-01-10 15:32:19 -08:00
Andrew Or	e4c51d2113	Address Patrick's and Reynold's comments Aside from trivial formatting changes, use nulls instead of Options for DiskMapIterator, and add documentation for spark.shuffle.externalSorting and spark.shuffle.memoryFraction. Also, set spark.shuffle.memoryFraction to 0.3, and spark.storage.memoryFraction = 0.6.	2014-01-10 15:09:51 -08:00
RongGu	94776f753f	fix a type error in comment lines	2014-01-11 05:43:56 +08:00
Thomas Graves	7cef8435d7	Merge pull request #371 from tgravescs/yarn_client_addjar_misc_fixes Yarn client addjar and misc fixes Fix the addJar functionality in yarn-client mode, add support for the other options supported in yarn-standalone mode, set the application type on yarn in hadoop 2.X, add documentation, change heartbeat interval to be same code as the yarn-standalone so it doesn't take so long to get containers and exit.	2014-01-10 15:34:15 -06:00
Patrick Wendell	7b58f116e5	Merge pull request #384 from pwendell/debug-logs Make DEBUG-level logs consumable. Removes two things that caused issues with the debug logs: (a) Internal polling in the DAGScheduler was polluting the logs. (b) The Scala REPL logs were really noisy.	2014-01-10 12:47:46 -08:00
Tathagata Das	e4bb845238	Updated docs based on Patrick's comments in PR 383.	2014-01-10 12:17:09 -08:00
Patrick Wendell	e9ed2d9e82	Make DEBUG-level logs consumable. Removes two things that caused issues with the debug logs: (a) Internal polling in the DAGScheduler was polluting the logs. (b) The Scala REPL logs were really noisy.	2014-01-10 10:33:24 -08:00
Tathagata Das	740730a179	Fixed conf/slaves and updated docs.	2014-01-10 05:06:15 -08:00
Matei Zaharia	669ba4caa9	Fix default TTL for metadata cleaner It seems to have been set to 3500 in a previous commit for debugging, but it should be off by default	2014-01-10 00:21:36 -08:00
Pillis	8d021b42bc	SPARK-961. Add a Vector.random() method - update 1	2014-01-10 00:07:36 -08:00
Matei Zaharia	0ebc97305a	Merge pull request #375 from mateiz/option-fix Fix bug added when we changed AppDescription.maxCores to an Option The Scala compiler warned about this -- we were comparing an Option against an integer now.	2014-01-09 23:58:49 -08:00
Patrick Wendell	460f655cc6	Enable shuffle consolidation by default. Bump this to being enabled for 0.9.0.	2014-01-09 22:42:50 -08:00
Andrew Or	aa5002bb96	Defensively allocate memory from global pool This is an alternative to the existing approach, which evenly distributes the collective shuffle memory among all running tasks. In the new approach, each thread requests a chunk of memory whenever its map is about to multiplicatively grow. If there is sufficient memory in the global pool, the thread allocates it and grows its map. Otherwise, it spills. A danger with the previous approach is that a new task may quickly fill up its map before old tasks finish spilling, potentially causing an OOM. This approach prevents this scenario as it favors existing tasks over new tasks; any thread that may step over the boundary of other threads defensively backs off and starts spilling. Testing through spark-perf reveals: (1) When no spills have occurred, the performance of external sorting using this memory management approach is essentially the same as without external sorting. (2) When one or more spills have occurred, the performance of external sorting is a small multiple (3x) worse	2014-01-09 21:43:58 -08:00
Andrew Or	d76e1f90a8	Merge github.com:apache/incubator-spark Conflicts: core/src/main/scala/org/apache/spark/SparkEnv.scala streaming/src/test/java/org/apache/spark/streaming/JavaAPISuite.java	2014-01-09 21:38:48 -08:00
Tathagata Das	38d75e18fa	Merge remote-tracking branch 'apache/master' into driver-test	2014-01-09 19:31:36 -08:00
Reynold Xin	4b074fac05	Merge pull request #374 from mateiz/completeness Add some missing Java API methods These are primarily for setting job groups, canceling jobs, and setting names on RDDs. Seemed like useful stuff to expose in Java.	2014-01-09 19:03:55 -08:00
Reynold Xin	a9d533333d	Merge pull request #294 from RongGu/master Bug fixes for updating the RDD block's memory and disk usage information Bug fixes for updating the RDD block's memory and disk usage information. From the code context, we can find that the memSize and diskSize here are both always equal to the size of the block. Actually, they are never zero. Thus, the logic here is wrong for recording the block usage in BlockStatus, especially for the blocks which are dropped from memory to ensure space for the new input rdd blocks. I have tested and confirmed that this causes the storage metrics shown on the Storage web page to be wrong and misleading. With this patch, the metrics will be okay. Finally, Merry Christmas, guys:)	2014-01-09 18:46:46 -08:00
Patrick Wendell	d86a85e9ca	Merge pull request #293 from pwendell/standalone-driver SPARK-998: Support Launching Driver Inside of Standalone Mode [NOTE: I need to bring the tests up to date with new changes, so for now they will fail] This patch provides support for launching driver programs inside of a standalone cluster manager. It also supports monitoring and re-launching of driver programs which is useful for long running, recoverable applications such as Spark Streaming jobs. For those jobs, this patch allows a deployment mode which is resilient to the failure of any worker node, failure of a master node (provided a multi-master setup), and even failures of the application itself, provided they are recoverable on a restart. Driver information, such as the status and logs from a driver, is displayed in the UI. There are a few small TODO's here, but the code is generally feature-complete. They are: - Bring tests up to date and add test coverage - Restarting on failure should be optional and maybe off by default. - See if we can re-use akka connections to facilitate clients behind a firewall A sensible place to start for review would be to look at the `DriverClient` class which presents users the ability to launch their driver program. I've also added an example program (`DriverSubmissionTest`) that allows you to test this locally and play around with killing workers, etc. Most of the code is devoted to persisting driver state in the cluster manager, exposing it in the UI, and dealing correctly with various types of failures.
Instructions to test locally: - `sbt/sbt assembly/assembly examples/assembly` - start a local version of the standalone cluster manager ``` ./spark-class org.apache.spark.deploy.client.DriverClient \ -j -Dspark.test.property=something \ -e SPARK_TEST_KEY=SOMEVALUE \ launch spark://10.99.1.14:7077 \ ../path-to-examples-assembly-jar \ org.apache.spark.examples.DriverSubmissionTest 1000 some extra options --some-option-here -X 13 ``` - Go in the UI and make sure it started correctly, look at the output etc - Kill workers, the driver program, masters, etc.	2014-01-09 18:37:52 -08:00
Matei Zaharia	c43eb00644	Fix bug added when we changed AppDescription.maxCores to an Option The Scala compiler warned about this -- we were comparing an Option against an integer now.	2014-01-09 18:14:20 -08:00
Matei Zaharia	142921c6c0	Add some missing Java API methods	2014-01-09 18:11:12 -08:00
Patrick Wendell	26cdb5f68a	Merge pull request #372 from pwendell/log4j-fix-1 Send logs to stderr by default (instead of stdout).	2014-01-09 17:16:34 -08:00
Patrick Wendell	2af98198ad	Send logs to stderr by default (instead of stdout).	2014-01-09 15:57:44 -08:00
Matei Zaharia	12f414ed43	Merge pull request #362 from mateiz/conf-getters Use typed getters for configuration settings This improves some of the code style after SPARK-544.	2014-01-09 15:31:30 -08:00
Tathagata Das	f1d206c6b4	Merge branch 'standalone-driver' into driver-test Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala examples/src/main/java/org/apache/spark/streaming/examples/JavaNetworkWordCount.java streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala	2014-01-09 15:06:24 -08:00
Tathagata Das	6f713e2a3e	Changed the way StreamingContext finds and reads checkpoint files, and added JavaStreamingContext.getOrCreate.	2014-01-09 13:42:04 -08:00
Patrick Wendell	67b9a33628	Some usability improvements	2014-01-09 12:42:37 -08:00
Thomas Graves	c617083e47	yarn-client addJar fix and misc other	2014-01-09 10:24:35 -06:00
Pillis	181471906e	SPARK-961 Add a Vector.random() method	2014-01-09 10:16:19 +01:00
Reynold Xin	365cac9465	Merge pull request #361 from rxin/clean Minor style cleanup. Mostly on indenting & line width changes. Focused on the few important files since they are the files that new contributors usually read first.	2014-01-09 00:56:16 -08:00
Reynold Xin	295d82583a	Minor update on SparkContext.broadcast's JavaDoc.	2014-01-09 00:30:22 -08:00
Ankur Dave	7309a29c75	Removed Kryo dependency and graphx-shell	2014-01-09 00:13:23 -08:00
Matei Zaharia	a01f3401e3	Use typed getters for configuration settings	2014-01-09 00:07:29 -08:00
Prashant Sharma	59b03e015d	Fixes corresponding to Reynolds feedback comments	2014-01-09 12:26:30 +05:30
Ankur Dave	78d6b13ac8	Fix mis-merge in `44fd30d3fb`	2014-01-08 21:19:14 -08:00
Ankur Dave	91227566bc	Merge remote-tracking branch 'spark-upstream/master' into HEAD Conflicts: README.md core/src/main/scala/org/apache/spark/util/collection/OpenHashMap.scala core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala core/src/main/scala/org/apache/spark/util/collection/PrimitiveKeyOpenHashMap.scala pom.xml project/SparkBuild.scala repl/src/main/scala/org/apache/spark/repl/SparkILoop.scala	2014-01-08 21:19:08 -08:00
Patrick Wendell	112c0a1776	Fixing config option "retained_stages" => "retainedStages". This is a very esoteric option and it's out of sync with the style we use. So it seems fitting to fix it for 0.9.0.	2014-01-08 21:16:16 -08:00
Patrick Wendell	0f9d2ace6b	Adding polling to driver submission client.	2014-01-08 16:56:26 -08:00
Reynold Xin	46f6a3b6aa	Minor style cleanup. Mostly on indenting & line width changes.	2014-01-08 14:55:04 -08:00
Reynold Xin	56ebfeaa52	Merge pull request #357 from hsaputra/set_boolean_paramname Set boolean param name for call to SparkHadoopMapReduceUtil.newTaskAttemptID Set boolean param name for call to SparkHadoopMapReduceUtil.newTaskAttemptID to make it clear which param being set.	2014-01-08 11:50:06 -08:00
Reynold Xin	5cae05f59e	Merge pull request #356 from hsaputra/remove_deprecated_cleanup_method Remove calls to deprecated mapred's OutputCommitter.cleanupJob Since Hadoop 1.0.4 the mapred OutputCommitter.commitJob should do cleanup job via call to OutputCommitter.cleanupJob, Remove SparkHadoopWriter.cleanup since it is used only by PairRDDFunctions. In fact the implementation of mapred OutputCommitter.commitJob looks like this: public void commitJob(JobContext jobContext) throws IOException { cleanupJob(jobContext); }	2014-01-08 11:47:28 -08:00
walker	d942f95d7e	Merge remote branch 'upstream/master'	2014-01-09 01:22:26 +08:00
Prashant Sharma	277b4a36c5	we clone hadoop key and values by default and reuse if specified.	2014-01-08 16:32:55 +05:30
Patrick Wendell	62b08faac5	Adding mockito to maven build	2014-01-08 00:45:41 -08:00
Patrick Wendell	bc81ce040d	Merge remote-tracking branch 'apache-github/master' into standalone-driver Conflicts: core/src/test/scala/org/apache/spark/deploy/JsonProtocolSuite.scala pom.xml	2014-01-08 00:38:31 -08:00
Henry Saputra	aa56585d21	Resolve PR review over 100 chars	2014-01-08 00:38:29 -08:00
Patrick Wendell	3ec21f2eee	Show more helpful information in UI	2014-01-08 00:30:10 -08:00
Patrick Wendell	c78b381e91	Fixes	2014-01-08 00:09:12 -08:00
Patrick Wendell	d0533f7046	Rename to Client	2014-01-07 23:38:51 -08:00
Patrick Wendell	3d939e5fe8	Adding --verbose option to DriverClient	2014-01-07 23:27:18 -08:00
Henry Saputra	f6b6f88367	Set boolean param name for two files call to SparkHadoopMapReduceUtil.newTaskAttemptID to make it clear which param being set.	2014-01-07 23:23:17 -08:00
Henry Saputra	4517326ec6	Remove calls to deprecated mapred's OutputCommitter.cleanupJob because since Hadoop 1.0.4 the mapred OutputCommitter.commitJob should do cleanup job. In fact the implementation of mapred OutputCommitter.commitJob looks like this: public void commitJob(JobContext jobContext) throws IOException { cleanupJob(jobContext); } (The jobContext input argument is type of org.apache.hadoop.mapred.JobContext)	2014-01-07 22:55:56 -08:00
Patrick Wendell	f5f12dc282	Merge pull request #336 from liancheng/akka-remote-lookup Get rid of `Either[ActorRef, ActorSelection]` In this pull request, instead of returning an `Either[ActorRef, ActorSelection]`, `registerOrLookup` identifies the remote actor blockingly to obtain an `ActorRef`, or throws an exception if the remote actor doesn't exist or the lookup times out (configured by `spark.akka.lookupTimeout`). This function is only called when a `SparkEnv` is constructed (instantiating driver or executor), so the blocking call is considered acceptable. Executor side `ActorSelection`s/`ActorRef`s to driver side `MapOutputTrackerMasterActor` and `BlockManagerMasterActor` are affected by this pull request. `ActorSelection` is dangerous and should be used with care. It's only absolutely safe to send messages via an `ActorSelection` when the remote actor is stateless, so that actor incarnation is irrelevant. But as pointed out by @ScrapCodes in the comments below, since the executor exits immediately once the connection to the driver is lost, `ActorSelection`s are not harmful in this scenario. So this pull request is mostly a code style patch.	2014-01-07 21:56:35 -08:00
Matei Zaharia	d75dc428da	Merge pull request #350 from mateiz/standalone-limit Add way to limit default # of cores used by apps in standalone mode Also documents the spark.deploy.spreadOut option, and fixes a config option that had a dash in its name.	2014-01-08 00:30:03 -05:00
Patrick Wendell	61674bcadf	Merge pull request #352 from markhamstra/oldArch Don't leave os.arch unset after BlockManagerSuite Recent SparkConf changes meant that BlockManagerSuite was now leaving the os.arch System.property unset. That's a problem for any subsequent tests that rely upon having a valid os.arch. This is true for CompressionCodecSuite in the usual maven build test order, even though it isn't usually true for the sbt build.	2014-01-07 18:32:13 -08:00
Mark Hamstra	86ed1ad252	Fix BlockManagerSuite#after	2014-01-07 16:39:37 -08:00
Matei Zaharia	2c421749ea	Address review comments	2014-01-07 19:30:23 -05:00
Patrick Wendell	e21a707a13	Adding unit tests and some refactoring to promote testability.	2014-01-07 15:39:47 -08:00
Matei Zaharia	044c8ad3a4	Fix unit test compilation	2014-01-07 16:12:20 -05:00
Patrick Wendell	e688e11206	Add log4j exclusion rule to maven. To make this work I had to rename the defaults file. Otherwise maven's pattern matching rules included it when trying to match other log4j.properties files. I also fixed a bug in the existing maven build where two <transformers> tags were present in assembly/pom.xml such that one overwrote the other.	2014-01-07 12:56:24 -08:00
Andrew Or	80ba9f8ba0	Get SparkConf from SparkEnv, rather than creating new ones	2014-01-07 12:44:22 -08:00
Matei Zaharia	d8bcc8e9a0	Add way to limit default # of cores used by applications on standalone mode Also documents the spark.deploy.spreadOut option.	2014-01-07 14:35:52 -05:00
Reynold Xin	15d9534501	Merge pull request #318 from srowen/master Suggested small changes to Java code for slightly more standard style, encapsulation and in some cases performance Sorry if this is too abrupt or not a welcome set of changes, but thought I'd see if I could contribute a little. I'm a Java developer and just getting seriously into Spark. So I thought I'd suggest a number of small changes to the couple Java parts of the code to make it a little tighter, more standard and even a bit faster. Feel free to take all, some or none of this. Happy to explain any of it.	2014-01-07 08:10:02 -08:00
Prashant Sharma	c729fa7c8e	formatting related fixes suggested by Patrick.	2014-01-07 13:08:16 +05:30
Prashant Sharma	b84dc780d3	Allow configuration to be printed in logs for diagnosis.	2014-01-07 13:01:43 +05:30
Prashant Sharma	b3018811e1	Allow users to set arbitrary akka configurations via spark conf.	2014-01-07 13:01:43 +05:30
Patrick Wendell	6a3daead2d	Fixes after merge	2014-01-06 20:12:45 -08:00
Patrick Wendell	c0498f9265	Merge remote-tracking branch 'apache-github/master' into standalone-driver Conflicts: core/src/main/scala/org/apache/spark/deploy/client/AppClient.scala core/src/main/scala/org/apache/spark/deploy/client/TestClient.scala core/src/main/scala/org/apache/spark/deploy/master/Master.scala core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala core/src/main/scala/org/apache/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala	2014-01-06 17:29:21 -08:00
Patrick Wendell	f236ddd1a2	Changes based on review feedback.	2014-01-06 17:15:52 -08:00
Adam Novak	fa8ce3fdd7	Changing org.apache.spark.util.collection.PrimitiveKeyOpenHashMap to have a real no-argument constructor, instead of a one-argument constructor with a default value. The lack of a real no-argument constructor was causing "sbt/sbt publish-local" to fail thusly: ``` [error] /pod/home/anovak/build/graphx/core/src/main/scala/org/apache/spark/storage/ShuffleBlockManager.scala:172: not enough arguments for constructor PrimitiveKeyOpenHashMap: (initialCapacity: Int)(implicit evidence$3: ClassManifest[Int], implicit evidence$4: ClassManifest[Int])org.apache.spark.util.collection.PrimitiveKeyOpenHashMap[Int,Int] [error] private val mapIdToIndex = new PrimitiveKeyOpenHashMap[Int, Int]() [error] ^ [info] No documentation generated with unsucessful compiler run [error] one error found [error] (core/compile:doc) Scaladoc generation failed [error] Total time: 67 s, completed Jan 6, 2014 2:20:51 PM ``` In theory a no-argument constructor ought not to differ from one with a single argument that has a default value, but in practice there seems to be an issue.	2014-01-06 14:52:15 -08:00
Patrick Wendell	9272a004af	Fix test breaking downstream builds	2014-01-06 13:03:19 -08:00
Patrick Wendell	357083c29f	Merge pull request #330 from tgravescs/fix_addjars_null_handling Fix handling of empty SPARK_EXAMPLES_JAR Currently if SPARK_EXAMPLES_JAR is left unset you get a null pointer exception when running the examples (at least on spark on yarn). The null now gets turned into a string of "null" when it's put into the SparkConf so addJar no longer properly ignores it. This fixes that so that it can be left unset.	2014-01-06 10:29:04 -08:00
walker	2ad315e80f	add inline comments	2014-01-07 01:27:57 +08:00
walker	6ab1db8071	add inline comments	2014-01-07 01:21:25 +08:00
walker	a0c6d96e27	Merge remote branch 'upstream/master'	2014-01-07 01:05:18 +08:00
Sean Owen	7379b2915f	Merge remote-tracking branch 'upstream/master'	2014-01-06 15:13:16 +00:00
Thomas Graves	25446dd931	Add warning to null setJars check	2014-01-06 07:58:59 -06:00
Tathagata Das	ac1f4b06c1	Added a hashmap to cache file mod times.	2014-01-05 23:42:53 -08:00
Patrick Wendell	a2e7e04974	Merge pull request #333 from pwendell/logging-silence Quiet ERROR-level Akka Logs This fixes an issue I've seen where akka logs a bunch of things at ERROR level when connecting to a standalone cluster, even in the normal case. I noticed that even when lifecycle logging was disabled, the netty code inside of akka still logged away via akka's EndpointWriter class. There are also some other log streams that I think are new in akka 2.2.1 that I've disabled. Finally, I added some better logging to the standalone client. This makes it more clear when a connection failure occurs what is going on. Previously it never explicitly said if a connection attempt had failed. The commit messages here have some more detail.	2014-01-05 22:37:36 -08:00
Patrick Wendell	675d7eb4f0	Responding to Aaron's review	2014-01-05 21:23:14 -08:00
Lian, Cheng	eb24684748	Fixed test suite compilation errors	2014-01-06 11:26:59 +08:00
Reynold Xin	5b0986a1d6	Merge pull request #334 from pwendell/examples-fix Removing SPARK_EXAMPLES_JAR in the code This re-writes all of the examples to use the `SparkContext.jarOfClass` mechanism for loading the examples jar. This is necessary for environments like YARN and the Standalone mode where example programs will be submitted from inside the cluster rather than at the client using `./spark-example`. This still leaves SPARK_EXAMPLES_JAR in place in the shell scripts for setting up the classpath if `./spark-example` is run.	2014-01-05 19:25:09 -08:00
Lian, Cheng	5c152e3e21	Fixed several compilation errors in test suites	2014-01-06 10:39:05 +08:00
Tathagata Das	2394794591	Merge branch 'filestream-fix' into driver-test Conflicts: streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala	2014-01-06 02:23:53 +00:00
Tathagata Das	8e88db3ca5	Bug fixes to the DriverRunner and minor changes here and there.	2014-01-06 02:21:56 +00:00
Lian, Cheng	a4048ff31e	Get rid of `Either[ActorRef, ActorSelection]' Although we can send messages via an ActorSelection, it would be better to identify the actor and obtain an ActorRef first, so that we can get informed earlier if the remote actor doesn't exist, and get rid of the annoying Either wrapper.	2014-01-06 09:18:17 +08:00
Reynold Xin	63f906322d	Fall back to zero-arg constructor for Serializer initialization if there is no constructor that accepts SparkConf. This maintains backward compatibility with older serializers implemented by users.	2014-01-05 15:52:43 -08:00
Patrick Wendell	94fdcda896	Provide logging when attempts to connect to the master fail. Without these it's a bit less clear what's going on for the user. One thing I realized when doing this is that akka itself actually retries the initial association. So the retry we currently have is redundant with akka's.	2014-01-05 15:16:01 -08:00
Patrick Wendell	aaaa673184	Quiet akka when remote lifecycle logging is disabled. I noticed when connecting to a standalone cluster Spark gives a bunch of Akka ERROR logs that make it seem like something is failing. This patch does two things: 1. Akka dead letter logging is turned on/off according to the existing lifecycle spark property. 2. We explicitly silence akka's EndpointWriter log in log4j. This is necessary because for some reason that log doesn't pick up on the lifecycle logging settings. After a few hours of debugging this was the only solution I found that worked.	2014-01-05 15:15:59 -08:00
Patrick Wendell	79f52809c8	Removing SPARK_EXAMPLES_JAR in the code	2014-01-05 11:49:42 -08:00
Andrew Or	4de9c9554c	Use AtomicInteger for numRunningTasks	2014-01-04 11:16:30 -08:00
Thomas Graves	ad35c1a5f2	Fix handling of empty SPARK_EXAMPLES_JAR	2014-01-04 11:42:17 -06:00
Tathagata Das	3d4474330d	Removed the exponential backoff for testing.	2014-01-04 08:39:00 -08:00
Andrew Or	2db7884f6f	Address Mark's comments	2014-01-04 01:20:09 -08:00
Andrew Or	4296d96c82	Assign spill threshold as a fraction of maximum memory Further, divide this threshold by the number of tasks running concurrently. Note that this does not guard against the following scenario: a new task quickly fills up its share of the memory before old tasks finish spilling their contents, in which case the total memory used by such maps may exceed what was specified. Currently, spark.shuffle.safetyFraction mitigates the effect of this.	2014-01-04 00:00:57 -08:00
Patrick Wendell	604fad9c39	Merge remote-tracking branch 'apache-github/master' into remove-binaries Conflicts: core/src/test/scala/org/apache/spark/DriverSuite.scala docs/python-programming-guide.md	2014-01-03 21:29:33 -08:00
Patrick Wendell	9e6f3bdcda	Changes on top of Prashant's patch. Closes #316	2014-01-03 18:30:17 -08:00
Andrew Or	333d58df86	Remove unnecessary ClassTag's	2014-01-03 17:55:26 -08:00
Andrew Or	838b0e7d15	Refactor using SparkConf	2014-01-03 16:13:40 -08:00
Patrick Wendell	4ae101ff38	Merge pull request #317 from ScrapCodes/spark-915-segregate-scripts Spark-915 segregate scripts	2014-01-03 11:24:35 -08:00
Prashant Sharma	9ae382c363	sbin/compute-classpath* -> bin/compute-classpath*	2014-01-03 15:12:29 +05:30
Prashant Sharma	74ba97fcf7	sbin/spark-class* -> bin/spark-class*	2014-01-03 15:08:01 +05:30
Prashant Sharma	bc311bb826	Restored the previously removed test	2014-01-03 14:52:37 +05:30
Prashant Sharma	94f2fffa23	fixed review comments	2014-01-03 14:43:37 +05:30
Prashant Sharma	b4bb80002b	Merge branch 'master' into spark-1002-remove-jars	2014-01-03 12:12:04 +05:30
Andrew Or	df413e996f	Merge remote-tracking branch 'spark/master' Conflicts: core/src/main/scala/org/apache/spark/rdd/CoGroupedRDD.scala	2014-01-02 20:51:23 -08:00
Tathagata Das	a1b8dd53e3	Added StreamingContext.getOrCreate for automatic recovery, and added RecoverableNetworkWordCount example to use it.	2014-01-02 19:07:22 -08:00
Reynold Xin	0475ca8f81	Merge pull request #320 from kayousterhout/erroneous_failed_msg Remove erroneous FAILED state for killed tasks. Currently, when tasks are killed, the Executor first sends a status update for the task with a "KILLED" state, and then sends a second status update with a "FAILED" state saying that the task failed due to an exception. The second FAILED state is misleading/unnecessary, and occurs due to a NonLocalReturnControl Exception that gets thrown due to the way we kill tasks. This commit eliminates that problem. I'm not at all sure that this is the best way to fix this problem, so alternate suggestions welcome. @rxin guessing you're the right person to look at this.	2014-01-02 15:17:08 -08:00
Aaron Davidson	8831923219	TempBlockId takes UUID and is explicitly non-serializable	2014-01-02 13:52:35 -08:00
Patrick Wendell	588a1695f4	Merge pull request #297 from tdas/window-improvement Improvements to DStream window ops and refactoring of Spark's CheckpointSuite - Added a new RDD - PartitionerAwareUnionRDD. Using this RDD, one can take multiple RDDs partitioned by the same partitioner and unify them into a single RDD while preserving the partitioner. So m RDDs with p partitions each will be unified to a single RDD with p partitions and the same partitioner. The preferred location for each partition of the unified RDD will be the most common preferred location of the corresponding partitions of the parent RDDs. For example, location of partition 0 of the unified RDD will be where most of partition 0 of the parent RDDs are located. - Improved the performance of DStream's reduceByKeyAndWindow and groupByKeyAndWindow. Both these operations work by doing per-batch reduceByKey/groupByKey and then using PartitionerAwareUnionRDD to union the RDDs across the window. This eliminates a shuffle related to the window operation, which can reduce batch processing time by 30-40% for simple workloads. - Fixed bugs and simplified Spark's CheckpointSuite. Some of the tests were incorrect and unreliable. Added missing tests for ZippedRDD. I can go into greater detail if necessary. - Added mapSideCombine option to combineByKeyAndWindow.	2014-01-02 13:20:54 -08:00
Matei Zaharia	7bafb68d77	Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/incubator-spark	2014-01-02 15:57:28 -05:00
Matei Zaharia	ca67909cd4	Merge pull request #311 from tmyklebu/master SPARK-991: Report information gleaned from a Python stacktrace in the UI Scala: - Added setCallSite/clearCallSite to SparkContext and JavaSparkContext. These functions mutate a LocalProperty called "externalCallSite." - Add a wrapper, getCallSite, that checks for an externalCallSite and, if none is found, calls the usual Utils.formatSparkCallSite. - Change everything that calls Utils.formatSparkCallSite to call getCallSite instead. Except getCallSite. - Add setCallSite/clearCallSite wrappers to JavaSparkContext. Python: - Add a gruesome hack to rdd.py that inspects the traceback and guesses what you want to see in the UI. - Add a RAII wrapper around said gruesome hack that calls setCallSite/clearCallSite as appropriate. - Wire said RAII wrapper up around three calls into the Scala code. I'm not sure that I hit all the spots with the RAII wrapper. I'm also not sure that my gruesome hack does exactly what we want. One could also approach this change by refactoring runJob/submitJob/runApproximateJob to take a call site, then threading that parameter through everything that needs to know it. One might object to the pointless-looking wrappers in JavaSparkContext. Unfortunately, I can't directly access the SparkContext from Python---or, if I can, I don't know how---so I need to wrap everything that matters in JavaSparkContext. Conflicts: core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala	2014-01-02 15:54:54 -05:00
Kay Ousterhout	a1b438d94d	Remove erroneous FAILED state for killed tasks. Currently, when tasks are killed, the Executor first sends a status update for the task with a "KILLED" state, and then sends a second status update with a "FAILED" state saying that the task failed due to an exception. The second FAILED state is misleading/unnecessary, and occurs due to a NonLocalReturnControl Exception that gets thrown due to the way we kill tasks. This commit eliminates that problem.	2014-01-02 12:34:46 -08:00
Kay Ousterhout	5a3c00c958	Removed redundant TaskSetManager.error() function. This function was leftover from a while ago, and now just passes all calls through to the abort() function, so this commit deletes it.	2014-01-02 11:13:58 -08:00
Sean Owen	66d501276b	Suggested small changes to Java code for slightly more standard style, encapsulation and in some cases performance	2014-01-02 16:17:57 +00:00
Prashant Sharma	980afd280a	Merge branch 'scripts-reorg' of github.com:shane-huang/incubator-spark into spark-915-segregate-scripts Conflicts: bin/spark-shell core/pom.xml core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/CoarseMesosSchedulerBackend.scala core/src/main/scala/org/apache/spark/ui/UIWorkloadGenerator.scala core/src/test/scala/org/apache/spark/DriverSuite.scala python/run-tests sbin/compute-classpath.sh sbin/spark-class sbin/stop-slaves.sh	2014-01-02 17:55:21 +05:30
Prashant Sharma	08ec10de17	Removed a repeated test and changed tests to not use uncommons jar	2014-01-02 17:32:11 +05:30
Prashant Sharma	436f3d2856	ignoring tests for now, contrary to what I assumed these tests make sense given what they are testing.	2014-01-02 16:08:35 +05:30
Matei Zaharia	0f6060733d	Fixed two uses of conf.get with no default value in Mesos	2014-01-01 22:09:42 -05:00
Matei Zaharia	e2c68642c6	Miscellaneous fixes from code review. Also replaced SparkConf.getOrElse with just a "get" that takes a default value, and added getInt, getLong, etc to make code that uses this simpler later on.	2014-01-01 22:03:39 -05:00
Matei Zaharia	45ff8f413d	Merge remote-tracking branch 'apache/master' into conf2 Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/metrics/MetricsSystem.scala core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala	2014-01-01 21:25:00 -05:00
Patrick Wendell	f8d245bdfc	Merge remote-tracking branch 'apache-github/master' into log4j-fix-2 Conflicts: streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala	2014-01-01 16:10:51 -08:00
Andrew Or	92c304fd03	Simplify ExternalAppendOnlyMap on the assumption that the mergeCombiners function is specified	2014-01-01 11:42:33 -08:00
Matei Zaharia	0e5b2adb5c	Merge remote-tracking branch 'apache/master' into conf2 Conflicts: project/SparkBuild.scala	2014-01-01 13:28:54 -05:00
Andrew Or	3bc9e391a3	Merge branch 'master' of github.com:andrewor14/incubator-spark	2013-12-31 20:02:12 -08:00
Andrew Or	83dfa16664	Address Patrick's and Reynold's comments	2013-12-31 20:02:05 -08:00
liguoqiang	b5d0b3b0f7	restore core/pom.xml file modification	2014-01-01 11:30:08 +08:00
Reynold Xin	8b8e70ebde	Merge pull request #73 from falaki/ApproximateDistinctCount Approximate distinct count Added countApproxDistinct() to RDD and countApproxDistinctByKey() to PairRDDFunctions to approximately count distinct number of elements and distinct number of values per key, respectively. Both functions use HyperLogLog from stream-lib for counting. Both functions take a parameter that controls the trade-off between accuracy and memory consumption. Also added Scala docs and test suites for both methods.	2013-12-31 17:48:24 -08:00
Aaron Davidson	08302b113a	Rename IntermediateBlockId to TempBlockId	2013-12-31 17:44:15 -08:00
Patrick Wendell	37c43c9dd1	Adding outer checkout when initializing logging	2013-12-31 17:36:56 -08:00
Andrew Or	8bbe08b21e	Merge branch 'master' of github.com:andrewor14/incubator-spark	2013-12-31 17:26:26 -08:00
Andrew Or	53d8d36684	Add support and test for null keys in ExternalAppendOnlyMap Also add safeguard against use of destructively sorted AppendOnlyMap	2013-12-31 17:19:02 -08:00
Hossein Falaki	bee445c927	Made the code more compact and readable	2013-12-31 16:58:18 -08:00
Hossein Falaki	acb0323053	minor improvements	2013-12-31 15:34:26 -08:00
Matei Zaharia	ba9338f104	Merge remote-tracking branch 'apache/master' into conf2 Conflicts: core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala	2013-12-31 18:23:14 -05:00
Patrick Wendell	63b411dd86	Merge pull request #238 from ngbinh/upgradeNetty upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final the changes are listed at https://github.com/netty/netty/wiki/New-and-noteworthy	2013-12-31 14:31:28 -08:00
Andrew Or	3ce22df954	Add warning message for spilling	2013-12-31 11:33:10 -08:00
Andrew Or	94ddc91d06	Address Aaron's and Jerry's comments	2013-12-31 10:50:08 -08:00
Patrick Wendell	55b7e2fdff	Merge pull request #289 from tdas/filestream-fix Bug fixes for file input stream and checkpointing - Fixed bugs in the file input stream that led the stream to fail due to transient HDFS errors (listing files when a background thread is deleting fails caused errors, etc.) - Updated Spark's CheckpointRDD and Streaming's CheckpointWriter to use SparkContext.hadoopConfiguration, to allow checkpoints to be written to any HDFS compatible store requiring special configuration. - Changed the API of SparkContext.setCheckpointDir() - eliminated the unnecessary 'useExisting' parameter. Now SparkContext will always create a unique subdirectory within the user specified checkpoint directory. This is to ensure that previous checkpoint files are not accidentally overwritten. - Fixed bug where setting checkpoint directory as a relative local path caused the checkpointing to fail.	2013-12-31 10:12:51 -08:00
Tathagata Das	fcd17a1e8e	Fixed comments and long lines based on comments on PR 289.	2013-12-31 02:01:45 -08:00
Patrick Wendell	4abb0c57ab	Tiny typo fix	2013-12-31 00:05:03 -08:00
Patrick Wendell	4d009dcac6	Removing use in test	2013-12-31 00:01:44 -08:00
Patrick Wendell	3c254f2eec	Minor fixes	2013-12-30 23:55:33 -08:00
Aaron Davidson	375d11743c	Add new line at end of file	2013-12-30 23:42:37 -08:00
Patrick Wendell	18181e6c41	Removing initLogging entirely	2013-12-30 23:39:47 -08:00
Aaron Davidson	daa7792ad6	Refactor SamplingSizeTracker into SizeTrackingAppendOnlyMap	2013-12-30 23:39:02 -08:00
Hossein Falaki	d6cded7155	Added Java unit tests for countApproxDistinct and countApproxDistinctByKey	2013-12-30 19:32:05 -08:00
Hossein Falaki	c3073b6cf2	Added Java API for countApproxDistinct	2013-12-30 19:31:06 -08:00
Hossein Falaki	ed06500d30	Added Java API for countApproxDistinctByKey	2013-12-30 19:30:42 -08:00
Hossein Falaki	a7de8e9b1c	Renamed countDistinct and countDistinctByKey methods to include Approx	2013-12-30 19:28:03 -08:00
Matei Zaharia	0fa5809768	Updated docs for SparkConf and handled review comments	2013-12-30 22:17:28 -05:00
Hossein Falaki	d50ccc5ca9	Using origin version	2013-12-30 15:08:34 -08:00
Andrew Or	347fafe4fc	Fix CheckpointSuite test fail	2013-12-30 13:10:33 -08:00
Andrew Or	d6e7910d92	Simplify merge logic based on the invariant that all spills contain unique keys	2013-12-30 13:01:00 -08:00
Patrick Wendell	1cbef081e3	Response to Shivaram's review	2013-12-30 12:46:09 -08:00
Andrew Or	2b71ab97c4	Merge pull request from aarondav: Utilize DiskBlockManager pathway for temp file writing This gives us a couple advantages: - Uses spark.local.dir and randomly selects a directory/disk. - Ensure files are deleted on normal DiskBlockManager cleanup. - Availability of same stats as usual DiskBlockObjectWriter (currently unused). Also enable basic cleanup when iterator is fully drained. Still requires cleanup for operations that fail or don't go through all elements.	2013-12-30 11:01:30 -08:00
Patrick Wendell	50e3b8ec4c	Merge pull request #308 from kayousterhout/stage_naming Changed naming of StageCompleted event to be consistent The rest of the SparkListener events are named with "SparkListener" as the prefix of the name; this commit renames the StageCompleted event to SparkListenerStageCompleted for consistency.	2013-12-30 07:44:26 -08:00
Patrick Wendell	cffe1c1d5c	SPARK-1008: Logging improvements 1. Adds a default log4j file that gets loaded if users haven't specified a log4j file. 2. Isolates use of the tools assembly jar. I found this produced SLF4J warnings after building with SBT (and I've seen similar warnings on the mailing list).	2013-12-29 23:14:33 -08:00
Andrew Or	015a510b0a	Merge branch 'master' of github.com:andrewor14/incubator-spark	2013-12-29 22:03:47 -08:00
Andrew Or	2a48d71528	Add test suite for ExternalAppendOnlyMap	2013-12-29 21:56:13 -08:00
Andrew Or	4a014dc59c	Make serializer a parameter to ExternalAppendOnlyMap	2013-12-29 21:55:53 -08:00
Kay Ousterhout	c2c1af39f5	Updated code style according to Patrick's comments	2013-12-29 21:10:08 -08:00
Aaron Davidson	e3cac47e65	Use Comparator instead of Ordering to lower object creation costs	2013-12-29 19:58:37 -08:00
Matei Zaharia	994f080f8a	Properly show Spark properties on web UI, and change app name property	2013-12-29 22:19:33 -05:00
Andrew Or	8fbff9f5d0	Address Aaron's comments	2013-12-29 16:22:44 -08:00
Matei Zaharia	11540b798d	Added tests for SparkConf and fixed a bug Typesafe Config caches system properties the first time it's invoked by default, ignoring later changes unless you do something special	2013-12-29 18:44:06 -05:00
Matei Zaharia	1ee7f5aee4	Fix a change that was lost during merge	2013-12-29 18:15:46 -05:00
Matei Zaharia	0bd1900cbc	Fix a few settings that were being read as system properties after merge	2013-12-29 15:38:46 -05:00
Patrick Wendell	7a99702ce2	Respect supervise option at Master	2013-12-29 12:12:58 -08:00
Matei Zaharia	b4ceed40d6	Merge remote-tracking branch 'origin/master' into conf2 Conflicts: core/src/main/scala/org/apache/spark/SparkContext.scala core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/main/scala/org/apache/spark/scheduler/local/LocalScheduler.scala core/src/main/scala/org/apache/spark/util/MetadataCleaner.scala core/src/test/scala/org/apache/spark/scheduler/TaskResultGetterSuite.scala core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala new-yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala streaming/src/main/scala/org/apache/spark/streaming/Checkpoint.scala streaming/src/main/scala/org/apache/spark/streaming/api/java/JavaStreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala streaming/src/test/scala/org/apache/spark/streaming/BasicOperationsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala streaming/src/test/scala/org/apache/spark/streaming/TestSuiteBase.scala streaming/src/test/scala/org/apache/spark/streaming/WindowOperationsSuite.scala	2013-12-29 15:08:08 -05:00
Patrick Wendell	a8729770f5	Slight change to retry logic	2013-12-29 11:57:57 -08:00
Patrick Wendell	8da1012f9b	TODO clean-up	2013-12-29 11:38:12 -08:00
Patrick Wendell	faefea3fd8	Adding driver ID to submission response	2013-12-29 11:31:10 -08:00
Patrick Wendell	6ffa9bb226	Documentation and adding supervise option	2013-12-29 11:26:56 -08:00
Patrick Wendell	35f6dc252a	Changes to allow fate sharing of drivers/executors and workers.	2013-12-29 11:14:36 -08:00
Matei Zaharia	cd00225db9	Add SparkConf support in Python	2013-12-29 14:03:39 -05:00
Tor Myklebust	d812aeece9	Factor call site reporting out to SparkContext.	2013-12-28 23:21:49 -05:00
Matei Zaharia	20631348d1	Fix other failing tests	2013-12-28 23:17:58 -05:00
Matei Zaharia	5bbe73864e	Fix Executor not getting properties in local mode	2013-12-28 17:31:58 -05:00
Matei Zaharia	a16c52ed1b	Check for SPARK_YARN_MODE through a system property too since it can sometimes be set that way (undoes a change in previous commit)	2013-12-28 17:24:21 -05:00
Matei Zaharia	642029e7f4	Various fixes to configuration code - Got rid of global SparkContext.globalConf - Pass SparkConf to serializers and compression codecs - Made SparkConf public instead of private[spark] - Improved API of SparkContext and SparkConf - Switched executor environment vars to be passed through SparkConf - Fixed some places that were still using system properties - Fixed some tests, though others are still failing This still fails several tests in core, repl and streaming, likely due to properties not being set or cleared correctly (some of the tests run fine in isolation).	2013-12-28 17:13:15 -05:00
Patrick Wendell	7375047d51	Merge pull request #304 from kayousterhout/remove_unused Removed unused failed and causeOfFailure variables (in TaskSetManager)	2013-12-28 13:25:06 -08:00
Matei Zaharia	ad3dfd1531	Merge pull request #307 from kayousterhout/other_failure Removed unused OtherFailure TaskEndReason. The OtherFailure TaskEndReason was added by @mateiz 3 years ago in this commit: `24a1e7f838` Unless I am missing something, it doesn't seem to have been used then, and is not used now, so seems safe for deletion.	2013-12-27 22:10:14 -05:00
Kay Ousterhout	b4619e509b	Changed naming of StageCompleted event to be consistent The rest of the SparkListener events are named with "SparkListener" as the prefix of the name; this commit renames the StageCompleted event to SparkListenerStageCompleted for consistency.	2013-12-27 17:45:20 -08:00
Kay Ousterhout	e17d7518ab	Removed unused OtherFailure TaskEndReason.	2013-12-27 15:51:27 -08:00
Kay Ousterhout	8419148e5f	Remove unused hasPendingTasks methods	2013-12-27 15:19:42 -08:00
Patrick Wendell	c8c8b42a6f	Some notes and TODO about dependencies	2013-12-27 15:13:11 -08:00
Kay Ousterhout	0c71ffe924	Style fixes as per Reynold's review	2013-12-27 12:19:38 -08:00
Kay Ousterhout	8c81068e16	Fixed >100char lines in DAGScheduler.scala	2013-12-27 11:36:54 -08:00
Binh Nguyen	2c5bade4ee	Fix failed unit tests Also clean up a bit.	2013-12-27 11:24:30 -08:00
Kay Ousterhout	baaabcedc9	Removed unused failed and causeOfFailure variables	2013-12-27 11:12:36 -08:00
Aaron Davidson	2a7b3511f4	Add Apache headers	2013-12-27 10:55:16 -08:00
Reynold Xin	7be1e57786	Merge pull request #298 from aarondav/minor Minor: Decrease margin of left side of Log page Before ![before](https://f.cloud.github.com/assets/1400247/1812647/1a4be53e-6e87-11e3-9d5b-f851274be0e9.png) After ![after](https://f.cloud.github.com/assets/1400247/1812648/1ca1ea2c-6e87-11e3-946c-31be9258f450.png) It's a start anyway...	2013-12-26 23:41:40 -10:00
Andrew Or	d0cfbc41e2	Rename spark.shuffle.buffer variables	2013-12-27 00:07:09 -08:00
Andrew Or	8f3175773c	Final cleanup	2013-12-26 23:40:08 -08:00
Aaron Davidson	1dc0440c1a	Use real serializer & manual ordering	2013-12-26 23:40:08 -08:00
Aaron Davidson	0f66b7f2fc	Return efficient iterator if no spillage happened	2013-12-26 23:40:08 -08:00
Andrew Or	ec8c5dc644	Sort AppendOnlyMap in-place	2013-12-26 23:40:08 -08:00
Aaron Davidson	0289eb752a	Allow Product2 rather than just tuple kv pairs	2013-12-26 23:40:07 -08:00
Andrew Or	64b2d54a02	Move maps to util, and refactor more	2013-12-26 23:40:07 -08:00
Aaron Davidson	804beb43be	SamplingSizeTracker + Map + test suite	2013-12-26 23:40:07 -08:00
Andrew Or	7ad4408255	New minor edits	2013-12-26 23:40:07 -08:00
Aaron Davidson	fcc443b3db	Minor cleanup for Scala style	2013-12-26 23:40:07 -08:00
Andrew Or	2a2ca2a661	Add toggle for ExternalAppendOnlyMap in Aggregator and CoGroupedRDD	2013-12-26 23:40:07 -08:00
Andrew Or	28685a4820	Provide for cases when mergeCombiners is not specified in ExternalAppendOnlyMap	2013-12-26 23:40:07 -08:00
Andrew Or	17def8cc11	Refactor ExternalAppendOnlyMap to take in KVC instead of just KV	2013-12-26 23:40:07 -08:00
Andrew Or	6a45ec1972	Working ExternalAppendOnlyMap for both CoGroupedRDDs and Aggregator	2013-12-26 23:40:07 -08:00
Andrew Or	97fbb3ec52	Working ExternalAppendOnlyMap for Aggregator, but not for CoGroupedRDD	2013-12-26 23:40:07 -08:00
Patrick Wendell	55c8bb741c	Intermediate clean-up of tests to appease jenkins	2013-12-26 15:43:15 -08:00
Aaron Davidson	4f2fb761b0	Decrease margin of left side of log page	2013-12-26 15:38:45 -08:00
Patrick Wendell	5c1b4f6405	Minor fixes	2013-12-26 14:39:39 -08:00
Tathagata Das	5fde4566ea	Added Apache boilerplate and class docs to PartitionerAwareUnionRDD.	2013-12-26 14:33:37 -08:00
Patrick Wendell	c23d640516	Addressing smaller changes from Aaron's review	2013-12-26 12:38:39 -08:00
Tathagata Das	3579647cdc	Merge branch 'apache-master' into window-improvement	2013-12-26 12:12:10 -08:00
Patrick Wendell	da20270b83	Merge pull request #1 from aarondav/driver Refactor DriverClient to be more Actor-based	2013-12-26 12:11:52 -08:00
Patrick Wendell	a97ad55c45	Removing accidental file	2013-12-26 12:11:28 -08:00
Tathagata Das	c4a54f51b5	Merge branch 'master' into window-improvement	2013-12-26 12:03:11 -08:00
Patrick Wendell	5938cfc153	Updated approach to driver restarting	2013-12-26 12:02:19 -08:00
Mark Hamstra	c529dceaff	Avoid a lump of coal (NPE) in JobProgressListener's stocking.	2013-12-25 23:10:02 -08:00
Tathagata Das	94479673eb	Fixed bug in PartitionAwareUnionRDD	2013-12-26 00:07:45 +00:00
Aaron Davidson	61372b11f4	Refactor DriverClient to be more Actor-based	2013-12-25 10:55:25 -08:00
walker	0af4b4f3e8	Bug fixes for updating the RDD block's memory and disk usage information	2013-12-25 20:07:01 +08:00
Patrick Wendell	bbc362833b	Removing un-used variable	2013-12-25 01:38:57 -08:00
Patrick Wendell	18ad419b52	Small fix from rebase	2013-12-25 01:22:38 -08:00
Patrick Wendell	55f833803a	Minor bug fix	2013-12-25 01:19:25 -08:00
Patrick Wendell	c9c0f745af	Minor style clean-up	2013-12-25 01:19:25 -08:00
Patrick Wendell	b2b7514ba3	Import clean-up (yay Aaron)	2013-12-25 01:19:25 -08:00
Patrick Wendell	d5f23e0083	Adding scheduling and reporting based on cores	2013-12-25 01:19:01 -08:00
Patrick Wendell	760823d393	Adding better option parsing	2013-12-25 01:19:01 -08:00
Patrick Wendell	6a4acc4c2d	Initial cut at driver submission.	2013-12-25 01:19:01 -08:00
Patrick Wendell	1070b566d4	Renaming Client => AppClient	2013-12-25 01:17:01 -08:00
Patrick Wendell	85a344b4f0	Merge pull request #127 from kayousterhout/consolidate_schedulers Deduplicate Local and Cluster schedulers. The code in LocalScheduler/LocalTaskSetManager was nearly identical to the code in ClusterScheduler/ClusterTaskSetManager. The redundancy made updating the schedulers unnecessarily painful and error-prone. This commit combines the two into a single TaskScheduler/TaskSetManager. Unfortunately the diff makes this change look much more invasive than it is -- TaskScheduler.scala is only superficially changed (names updated, overrides removed) from the old ClusterScheduler.scala, and the same with TaskSetManager.scala. Thanks @rxin for suggesting this change!	2013-12-24 16:35:06 -08:00
Binh Nguyen	786f393a98	Fix imports order	2013-12-24 14:59:30 -08:00
Binh Nguyen	9115a5de62	Remove import * and fix some formatting	2013-12-24 14:59:30 -08:00
Binh Nguyen	040dd3ecd5	upgrade Netty from 4.0.0.Beta2 to 4.0.13.Final	2013-12-24 14:58:18 -08:00
Patrick Wendell	c2dd6bcd6e	Merge pull request #279 from aarondav/shuffle-cleanup0 Clean up shuffle files once their metadata is gone Previously, we would only clean the in-memory metadata for consolidated shuffle files. Additionally, fixes a bug where the Metadata Cleaner was ignoring type-specific TTLs.	2013-12-24 14:36:47 -08:00
Kay Ousterhout	1efe3adf56	Responded to Reynold's style comments	2013-12-24 14:18:39 -08:00
Tathagata Das	d4dfab503a	Fixed Python API for sc.setCheckpointDir. Also other fixes based on Reynold's comments on PR 289.	2013-12-24 14:01:13 -08:00
Tathagata Das	9f79fd89dc	Merge branch 'apache-master' into filestream-fix	2013-12-24 11:38:17 -08:00
Prashant Sharma	2573add94c	spark-544, introducing SparkConf and related configuration overhaul.	2013-12-25 00:09:36 +05:30
Matei Zaharia	23a9ae6be3	Merge pull request #277 from tdas/scheduler-update Refactored the streaming scheduler and added StreamingListener interface - Refactored the streaming scheduler for cleaner code. Specifically, the JobManager was renamed to JobScheduler, as it does the actual scheduling of Spark jobs to the SparkContext. The earlier Scheduler was renamed to JobGenerator, as it actually generates the jobs from the DStreams. The JobScheduler starts the JobGenerator. Also, moved all the scheduler related code from spark.streaming to spark.streaming.scheduler package. - Implemented the StreamingListener interface, similar to SparkListener. The streaming version of StatusReportListener prints the batch processing time statistics (for now). Added StreamingListenerSuite to test it. - Refactored streaming TestSuiteBase for deduping code in the other streaming testsuites.	2013-12-24 00:08:48 -05:00
Reynold Xin	11107c9de5	Merge pull request #244 from leftnoteasy/master Added SPARK-968 implementation for review	2013-12-23 10:38:20 -08:00
wangda.tan	2f689ba97b	SPARK-968, added executor address showing in aggregated metrics by executors table	2013-12-23 15:03:45 +08:00
Kay Ousterhout	b7bfae1afe	Correctly merged in maxTaskFailures fix	2013-12-22 07:34:44 -08:00
wangda.tan	c979eecdf6	added changes according to comments from rxin	2013-12-22 21:43:15 +08:00
Kay Ousterhout	b8ae096a40	Fix build error in test	2013-12-21 23:28:48 -08:00
Kay Ousterhout	30186aa264	Renamed ClusterScheduler to TaskSchedulerImpl	2013-12-20 14:58:04 -08:00
Kay Ousterhout	c06945cfe0	Merge remote branch 'upstream/master' into consolidate_schedulers Conflicts: core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala	2013-12-20 14:39:30 -08:00
Patrick Wendell	0bc57c5767	Merge pull request #280 from aarondav/minor Minor cleanup for standalone scheduler See commit messages	2013-12-20 11:56:54 -08:00
Tathagata Das	61f4bbda0d	Added tests for PartitionerAwareUnionRDD in the CheckpointSuite. Refactored CheckpointSuite to make the tests simpler and more reliable. Added missing test for ZippedRDD.	2013-12-20 00:41:47 -08:00
Patrick Wendell	eca68d4425	Merge pull request #272 from tmyklebu/master Track and report task result serialisation time. - DirectTaskResult now has a ByteBuffer valueBytes instead of a T value. - DirectTaskResult now has a member function T value() that deserialises valueBytes. - Executor serialises value into a ByteBuffer and passes it to DTR's ctor. - Executor tracks the time taken to do so and puts it in a new field in TaskMetrics. - StagePage now reports serialisation time from TaskMetrics along with the other things it reported.	2013-12-19 18:12:22 -08:00
Aaron Davidson	6613ab663d	Fix compiler warning in SparkZooKeeperSession	2013-12-19 17:56:13 -08:00
Aaron Davidson	4d74b899b7	Remove firstApp from the standalone scheduler Master As a lonely child with no one to care for it... we had to put it down.	2013-12-19 17:53:41 -08:00
Aaron Davidson	1ab031eaff	Extraordinarily minor code/comment cleanup	2013-12-19 17:51:29 -08:00
Aaron Davidson	0647ec9757	Clean up shuffle files once their metadata is gone Previously, we would only clean the in-memory metadata for consolidated shuffle files. Additionally, fixes a bug where the Metadata Cleaner was ignoring type-specific TTLs.	2013-12-19 15:40:48 -08:00
Reynold Xin	7990c56375	Merge pull request #276 from shivaram/collectPartition Add collectPartition to JavaRDD interface. This interface is useful for implementing `take` from other language frontends where the data is serialized. Also remove `takePartition` from PythonRDD and use `collectPartition` in rdd.py. Thanks @concretevitamin for the original change and tests.	2013-12-19 13:35:09 -08:00
Tathagata Das	de41c436a0	Merge branch 'scheduler-update' into window-improvement Conflicts: streaming/src/main/scala/org/apache/spark/streaming/dstream/WindowedDStream.scala	2013-12-19 12:05:08 -08:00
Shivaram Venkataraman	9cc3a6d3c0	Add comment explaining collectPartitions's use	2013-12-19 11:49:17 -08:00
Shivaram Venkataraman	d3234f9726	Make collectPartitions take an array of partitions Change the implementation to use runJob instead of PartitionPruningRDD. Also update the unit tests and the python take implementation to use the new interface.	2013-12-19 11:40:34 -08:00
Tathagata Das	984c582487	Merge branch 'scheduler-update' into filestream-fix Conflicts: core/src/main/scala/org/apache/spark/rdd/CheckpointRDD.scala streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala streaming/src/main/scala/org/apache/spark/streaming/scheduler/JobGenerator.scala streaming/src/test/scala/org/apache/spark/streaming/CheckpointSuite.scala	2013-12-19 11:20:48 -08:00
Nick Pentreath	a76f53416c	Add toString to Java RDD, and __repr__ to Python RDD	2013-12-19 14:38:20 +02:00
Tathagata Das	ec71b445ad	Minor changes.	2013-12-18 23:39:28 -08:00
Aaron Davidson	293a0af5a1	In experimental clusters we've observed that a 10 second timeout was insufficient, despite having a low number of nodes and relatively small workload (16 nodes, <1.5 TB data). This would cause an entire job to fail at the beginning of the reduce phase. There is no particular reason for this value to be small as a timeout should only occur in an exceptional situation. Also centralized the reading of spark.akka.askTimeout to AkkaUtils (surely this can later be cleaned up to use Typesafe). Finally, deleted some lurking implicits. If anyone can think of a reason they should still be there, please let me know.	2013-12-18 21:42:29 -08:00
Tathagata Das	e93b391d75	Merge branch 'apache-master' into scheduler-update Conflicts: streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala streaming/src/main/scala/org/apache/spark/streaming/dstream/ForEachDStream.scala	2013-12-18 17:51:14 -08:00
Tathagata Das	b80ec05635	Added StatsReportListener to generate processing time statistics across multiple batches.	2013-12-18 15:35:24 -08:00
Shivaram Venkataraman	af0cd6bd27	Add collectPartition to JavaRDD interface. Also remove takePartition from PythonRDD and use collectPartition in rdd.py.	2013-12-18 11:40:07 -08:00
Tor Myklebust	d3b1af4b6c	Add a serialisation time column to the StagePage.	2013-12-18 14:25:56 -05:00
Tor Myklebust	717c7fddb2	objectSer -> valueSer in a test.	2013-12-17 23:02:21 -05:00
Reynold Xin	9a6864d016	Fixed a performance problem in RDD.top and BoundedPriorityQueue (size in BoundedPriority was actually traversing the entire queue to calculate the size, resulting in bad performance in insertion).	2013-12-17 18:44:39 -08:00
wangda.tan	59e53fa21c	spark-968, changes to avoid an NPE	2013-12-17 17:57:27 +08:00
wangda.tan	36060f4f50	spark-898, changes according to review comments	2013-12-17 17:55:38 +08:00
Patrick Wendell	c1fec89895	Cleanup	2013-12-16 21:56:21 -08:00
Patrick Wendell	c6f95e603e	Attempt with extra repositories	2013-12-16 21:53:51 -08:00
Tor Myklebust	b2f0329511	Missed a spot; had an objectSer here too.	2013-12-17 00:18:46 -05:00
Tor Myklebust	25fa976580	Merge branch 'master' of git://github.com/apache/incubator-spark	2013-12-16 23:48:37 -05:00
Tor Myklebust	963d6f065a	Incorporate pwendell's code review suggestions.	2013-12-16 23:14:52 -05:00
Reynold Xin	883e034aeb	Merge pull request #245 from gregakespret/task-maxfailures-fix Fix for spark.task.maxFailures not enforced correctly. Docs at http://spark.incubator.apache.org/docs/latest/configuration.html say: ``` spark.task.maxFailures Number of individual task failures before giving up on the job. Should be greater than or equal to 1. Number of allowed retries = this value - 1. ``` Previous implementation worked incorrectly. When for example `spark.task.maxFailures` was set to 1, the job was aborted only after the second task failure, not after the first one.	2013-12-16 14:16:02 -08:00
Tor Myklebust	882d544856	UI to display serialisation time of a stage.	2013-12-16 13:27:03 -05:00
Tor Myklebust	8a397a959b	Track task value serialisation time in TaskMetrics.	2013-12-16 12:07:39 -05:00
wangda.tan	8ab8c6a526	Merge branch 'master' of git://github.com/apache/incubator-spark	2013-12-16 21:45:43 +08:00
Reynold Xin	bad85b051d	Use murmur3 hash for open hashset. (cherry picked from commit 212ff6834515543163aa63a3f4f762ebe641f8ca) Signed-off-by: Ankur Dave <ankurdave@gmail.com>	2013-12-15 17:23:15 -08:00
Mark Hamstra	09ed7ddfa0	Use scala.binary.version in POMs	2013-12-15 12:39:58 -08:00
Josh Rosen	2fd781d347	Merge pull request #249 from ngbinh/partitionInJavaSortByKey Expose numPartitions parameter in JavaPairRDD.sortByKey() This change makes Java and Scala API on sortByKey() the same.	2013-12-14 12:59:37 -08:00
Prashant Sharma	1ae3c0fc5e	Added a comment about ActorRef and ActorSelection difference.	2013-12-14 10:44:24 +05:30
Prashant Sharma	a854cc536d	Review comments on the PR for scala 2.10 migration.	2013-12-13 15:19:51 +05:30
Tathagata Das	097e120c0c	Refactored streaming scheduler and added listener interface. - Refactored Scheduler + JobManager to JobGenerator + JobScheduler and added JobSet for cleaner code. Moved scheduler related code to streaming.scheduler package. - Added StreamingListener trait (similar to SparkListener) to enable gathering of streaming stats like processing times and delays. Added StreamingContext.addListener() to add listeners. - Deduped some code in streaming tests by modifying TestSuiteBase, and added StreamingListenerSuite.	2013-12-12 20:48:02 -08:00
Tathagata Das	5e9ce83d68	Fixed multiple file stream and checkpointing bugs. - Made file stream more robust to transient failures. - Changed Spark.setCheckpointDir API to not have the second 'useExisting' parameter. Spark will always create a unique directory for checkpointing underneath the directory provided to the function. - Fixed bug wrt local relative paths as checkpoint directory. - Made DStream and RDD checkpointing use SparkContext.hadoopConfiguration, so that more HDFS compatible filesystems are supported for checkpointing.	2013-12-11 14:01:36 -08:00
Prashant Sharma	603af51bb5	Merge branch 'master' into akka-bug-fix Conflicts: core/pom.xml core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala pom.xml project/SparkBuild.scala streaming/pom.xml yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocationHandler.scala	2013-12-11 10:21:53 +05:30
Hossein Falaki	49bf47e1b7	Removed superfluous abs call from test cases.	2013-12-10 19:50:50 -08:00
Binh Nguyen	0b494f7db4	Hook directly to Scala API	2013-12-10 11:17:52 -08:00
Binh Nguyen	e85af50767	Leave default value of numPartitions to Scala code.	2013-12-10 11:04:14 -08:00
Grega Kespret	558af87334	Fix tests.	2013-12-10 11:43:42 +01:00
Binh Nguyen	c82d4f079b	Use braces to shorten the line.	2013-12-10 01:04:52 -08:00
Binh Nguyen	5013fb64b2	Expose numPartitions parameter in JavaPairRDD.sortByKey() This change makes Java and Scala API on sortByKey() the same.	2013-12-10 00:38:16 -08:00
Prashant Sharma	17db6a9041	Style fixes and addressed review comments at #221	2013-12-10 11:47:16 +05:30
Patrick Wendell	5b74609d97	License headers	2013-12-09 16:41:01 -08:00
Grega Kespret	14a1df6572	Fix for spark.task.maxFailures not enforced correctly.	2013-12-09 10:39:02 +01:00
wangda.tan	ee68a85cff	SPARK-968, added sc finalize code to avoid akka rebinding to the same port	2013-12-09 09:38:58 +08:00
Aaron Davidson	40f63eb034	Merge master into 127	2013-12-08 11:16:52 -08:00
wangda.tan	850c4b709a	Merge branch 'master' of https://github.com/leftnoteasy/incubator-spark-1	2013-12-09 00:12:46 +08:00
wangda.tan	48e4f2ad14	SPARK-968, In stage UI, add an overview section that shows task stats grouped by executor id	2013-12-09 00:02:59 +08:00
Prashant Sharma	7ad6921ae0	Incorporated Patrick's feedback comment on #211 and made maven build/dep-resolution at least a bit faster.	2013-12-07 12:45:57 +05:30
Matei Zaharia	e0392343a0	Merge pull request #190 from markhamstra/Stages4Jobs stageId <--> jobId mapping in DAGScheduler Okay, I think this one is ready to go -- or at least it's ready for review and discussion. It's a carry-over of https://github.com/mesos/spark/pull/842 with updates for the newer job cancellation functionality. The prior discussion still applies. I've actually changed the job cancellation flow a bit: Instead of ``cancelTasks`` going to the TaskScheduler and then ``taskSetFailed`` coming back to the DAGScheduler (resulting in ``abortStage`` there), the DAGScheduler now takes care of figuring out which stages should be cancelled, tells the TaskScheduler to cancel tasks for those stages, then does the cleanup within the DAGScheduler directly without the need for any further prompting by the TaskScheduler. I know of three outstanding issues, each of which can and should, I believe, be handled in follow-up pull requests: 1) https://spark-project.atlassian.net/browse/SPARK-960 2) JobLogger should be re-factored to eliminate duplication 3) Related to 2), the WebUI should also become a consumer of the DAGScheduler's new understanding of the relationship between jobs and stages so that it can display progress indication and the like grouped by job. Right now, some of this information is just being sent out as part of ``SparkListenerJobStart`` messages, but more or different job <--> stage information may need to be exported from the DAGScheduler to meet listeners needs. Except for the eventQueue -> Actor commit, the rest can be cherry-picked almost cleanly into branch-0.8. A little merging is needed in MapOutputTracker and the DAGScheduler. Merged versions of those files are in `aba2b40ce0` Note that between the recent Actor change in the DAGScheduler and the cleaning up of DAGScheduler data structures on job completion in this PR, some races have been introduced into the DAGSchedulerSuite. Those tests usually pass, and I don't think that better-behaved code that doesn't directly inspect DAGScheduler data structures should be seeing any problems, but I'll work on fixing DAGSchedulerSuite as either an addition to this PR or as a separate request. UPDATE: Fixed the race that I introduced. Created a JIRA issue (SPARK-965) for the one that was introduced with the switch to eventProcessorActor in the DAGScheduler.	2013-12-06 11:49:59 -08:00
Matei Zaharia	bfa68609d9	Merge pull request #233 from hsaputra/changecontexttobackend Change the name of input argument in ClusterScheduler#initialize from context to backend. The SchedulerBackend used to be called ClusterSchedulerContext so just want to make small change of the input param in the ClusterScheduler#initialize to reflect this.	2013-12-06 11:04:03 -08:00
Matei Zaharia	3fb302c08d	Merge pull request #205 from kayousterhout/logging Added logging of scheduler delays to UI This commit adds two metrics to the UI: 1) The time to get task results, if they're fetched remotely 2) The scheduler delay. When the scheduler starts getting overwhelmed (because it can't keep up with the rate at which tasks are being submitted), the result is that tasks get delayed on the tail-end: the message from the worker saying that the task has completed ends up in a long queue and takes a while to be processed by the scheduler. This commit records that delay in the UI so that users can tell when the scheduler is becoming the bottleneck.	2013-12-06 11:03:32 -08:00
Matei Zaharia	87676a6af2	Merge pull request #220 from rxin/zippart Memoize preferred locations in ZippedPartitionsBaseRDD so preferred location computation doesn't lead to exponential explosion. This was a problem in GraphX where we have a whole chain of RDDs that are ZippedPartitionsRDD's, and the preferred locations were taking eternity to compute. (cherry picked from commit `e36fe55a03`) Signed-off-by: Reynold Xin <rxin@apache.org>	2013-12-06 11:01:42 -08:00
Aaron Davidson	94b5881ee9	Fix long lines	2013-12-06 00:22:00 -08:00
Aaron Davidson	5a864e3fce	Rename SparkActorSystem to IndestructibleActorSystem	2013-12-06 00:21:43 -08:00
Prashant Sharma	c9cd2af71e	Merge branch 'wip-scala-2.10' into akka-bug-fix	2013-12-06 13:32:15 +05:30
Mark Hamstra	ee888f6b25	FutureAction result tests	2013-12-05 23:01:18 -08:00

3250 commits