ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Patrick Wendell	75d161b357	Forcing shuffle consolidation in DiskBlockManagerSuite	2013-12-05 11:36:41 -08:00
Matei Zaharia	72b696156c	Merge pull request #199 from harveyfeng/yarn-2.2 Hadoop 2.2 migration Includes support for the YARN API stabilized in the Hadoop 2.2 release, and a few style patches. Short description for each set of commits: `a98f5a0` - "Misc style changes in the 'yarn' package" `a67ebf4` - "A few more style fixes in the 'yarn' package" Both of these are some minor style changes, such as fixing lines over 100 chars, to the existing YARN code. `ab8652f` - "Add a 'new-yarn' directory ... " Copies everything from `SPARK_HOME/yarn` to `SPARK_HOME/new-yarn`. No actual code changes here. `4f1c3fa` - "Hadoop 2.2 YARN API migration ..." API patches to code in the `SPARK_HOME/new-yarn` directory. There are a few more small style changes mixed in, too. Based on @colorant's Hadoop 2.2 support for the scala-2.10 branch in #141. `a1a1c62` - "Add optional Hadoop 2.2 settings in sbt build ... " If Spark should be built against Hadoop 2.2, then: a) the `org.apache.spark.deploy.yarn` package will be compiled from the `new-yarn` directory. b) Protobuf v2.5 will be used as a Spark dependency, since Hadoop 2.2 depends on it. Also, Spark will be built against a version of Akka v2.0.5 that's built against Protobuf 2.5, named `akka-2.0.5-protobuf-2.5`. The patched Akka is here: https://github.com/harveyfeng/akka/tree/2.0.5-protobuf-2.5, and was published to local Ivy during testing. There's also a new boolean environment variable, `SPARK_IS_NEW_HADOOP`, that users can manually set if their `SPARK_HADOOP_VERSION` specification does not start with `2.2`, which is how the build file tries to detect a 2.2 version. Not sure if this is necessary or done in the best way, though...	2013-12-04 23:33:04 -08:00
Patrick Wendell	1450b8ef87	Small changes from Matei review	2013-12-04 18:49:32 -08:00
Patrick Wendell	b1c6fa1584	Document missing configs and set shuffle consolidation to false.	2013-12-04 18:39:34 -08:00
Patrick Wendell	182f9baeed	Merge pull request #227 from pwendell/master Fix small bug in web UI and minor clean-up. There was a bug where sorting order didn't work correctly for write time metrics. I also cleaned up some earlier code that fixed the same issue for read and write bytes.	2013-12-04 15:52:07 -08:00
Reynold Xin	b9e7609f2c	Merge pull request #225 from ash211/patch-3 Add missing space after "Serialized" in StorageLevel Current code creates outputs like: scala> res0.getStorageLevel.description res2: String = Serialized1x Replicated	2013-12-04 14:42:09 -08:00
Patrick Wendell	380b90b9b3	Fix small bug in web UI and minor clean-up. There was a bug where sorting order didn't work correctly for write time metrics. I also cleaned up some earlier code that fixed the same issue for read and write bytes.	2013-12-04 14:41:48 -08:00
Reynold Xin	055462c1d3	Merge pull request #226 from ash211/patch-4 Typo: applicaton	2013-12-04 14:02:11 -08:00
Andrew Ash	0c5af38b86	Typo: applicaton	2013-12-04 12:30:25 -08:00
Andrew Ash	217611680d	Add missing space after "Serialized" in StorageLevel Current code creates outputs like: scala> res0.getStorageLevel.description res2: String = Serialized1x Replicated	2013-12-04 11:29:20 -08:00
Matei Zaharia	d6e5473872	Merge pull request #223 from rxin/transient Mark partitioner, name, and generator field in RDD as @transient. As part of the effort to reduce serialized task size.	2013-12-04 10:28:50 -08:00
Reynold Xin	8a3475aed6	Merge pull request #218 from JoshRosen/spark-970-pyspark-unicode-error Fix UnicodeEncodeError in PySpark saveAsTextFile() (SPARK-970) This fixes [SPARK-970](https://spark-project.atlassian.net/browse/SPARK-970), an issue where PySpark's saveAsTextFile() could throw UnicodeEncodeError when called on an RDD of Unicode strings. Please merge this into master and branch-0.8.	2013-12-03 14:21:40 -08:00
Reynold Xin	974a69d79c	Marked doCheckpointCalled as transient.	2013-12-03 11:34:38 -08:00
Mark Hamstra	403234dd0d	SparkListenerJobStart posted from local jobs	2013-12-03 09:57:32 -08:00
Mark Hamstra	f55d0b935d	Synchronous, inline cleanup after runLocally	2013-12-03 09:57:32 -08:00
Mark Hamstra	c9fcd909d0	Local jobs post SparkListenerJobEnd, and DAGScheduler data structure cleanup always occurs before any posting of SparkListenerJobEnd.	2013-12-03 09:57:32 -08:00
Mark Hamstra	9ae2d094a9	Tightly couple stageIdToJobIds and jobIdToStageIds	2013-12-03 09:57:32 -08:00
Mark Hamstra	27c45e5236	Cleaned up job cancellation handling	2013-12-03 09:57:32 -08:00
Mark Hamstra	686a420ddc	Refactoring to make job removal, stage removal, task cancellation clearer	2013-12-03 09:57:32 -08:00
Mark Hamstra	205566e56e	Improved comment	2013-12-03 09:57:32 -08:00
Mark Hamstra	94087c463b	Removed redundant residual re: reverted refactoring.	2013-12-03 09:57:31 -08:00
Mark Hamstra	982797dcba	Fixed intended side-effects	2013-12-03 09:57:31 -08:00
Mark Hamstra	6f8359b5ad	Actor instead of eventQueue for LocalJobCompleted	2013-12-03 09:57:31 -08:00
Mark Hamstra	51458ab4a1	Added stageId <--> jobId mapping in DAGScheduler ...and make sure that DAGScheduler data structures are cleaned up on job completion. Initial effort and discussion at https://github.com/mesos/spark/pull/842	2013-12-03 09:57:31 -08:00
Harvey Feng	46b87b8a25	Merge pull request #2 from colorant/yarn-client-2.2 Fix pom.xml for maven build	2013-12-03 00:41:11 -08:00
Raymond Liu	4738818dd6	Fix pom.xml for maven build	2013-12-03 16:36:05 +08:00
Harvey Feng	1b6e450771	Use published "org.spark-project.akka-*" in sbt build for Hadoop-2.2 dependencies. This also includes: -Change `isNewYarn` to `isNewHadoop`, since the protobuf-2.5 dependency is from Hadoop-2.2 itself. -Regexp bugix Credits to @alig for this patch.	2013-12-03 00:28:33 -08:00
Reynold Xin	58d9bbcfec	Merge pull request #217 from aarondav/mesos-urls Re-enable zk:// urls for Mesos SparkContexts This was broken in PR #71 when we explicitly disallow anything that didn't fit a mesos:// url. Although it is not really clear that a zk:// url should match Mesos, it is what the docs say and it is necessary for backwards compatibility. Additionally added a unit test for the creation of all types of TaskSchedulers. Since YARN and Mesos are not necessarily available in the system, they are allowed to pass as long as the YARN/Mesos code paths are exercised.	2013-12-02 21:58:53 -08:00
Prashant Sharma	09e8be9a62	Made running SparkActorSystem specific to executors only.	2013-12-03 11:27:45 +05:30
Aaron Davidson	0f24576c08	Cleanup and documentation of SparkActorSystem	2013-12-03 11:05:12 +05:30
Reynold Xin	e34b4693d3	Mark partitioner, name, and generator field in RDD as @transient.	2013-12-02 21:24:44 -08:00
Kay Ousterhout	58b3aff9a8	Fixed problem with scheduler delay	2013-12-02 20:30:03 -08:00
Aaron Davidson	f6c8c1c7b6	Cleanup and documentation of SparkActorSystem	2013-12-02 11:42:53 -08:00
Prashant Sharma	5b11028a04	Made akka capable of tolerating fatal exceptions and moving on.	2013-12-02 10:47:39 +05:30
Reynold Xin	740922f25d	Merge pull request #219 from sundeepn/schedulerexception Scheduler quits when newStage fails The current scheduler thread does not handle exceptions from newStage stage while launching new jobs. The thread fails on any exception that gets triggered at that level, leaving the cluster hanging with no schduler.	2013-12-01 12:46:58 -08:00
Sundeep Narravula	be3ea2394f	Log exception in scheduler in addition to passing it to the caller. Code Styling changes.	2013-12-01 00:50:34 -08:00
Reynold Xin	60e23a58b2	Merge pull request #216 from liancheng/fix-spark-966 Bugfix: SPARK-965 & SPARK-966 SPARK-965: https://spark-project.atlassian.net/browse/SPARK-965 SPARK-966: https://spark-project.atlassian.net/browse/SPARK-966 * Add back `DAGScheduler.start()`, `eventProcessActor` is created and started here. Notice that function is only called by `SparkContext`. * Cancel the scheduled stage resubmission task when stopping `eventProcessActor` * Add a new `DAGSchedulerEvent` `ResubmitFailedStages` This event message is sent by the scheduled stage resubmission task to `eventProcessActor`. In this way, `DAGScheduler.resubmitFailedStages()` is guaranteed to be executed from the same thread that runs `DAGScheduler.processEvent()`. Please refer to discussion in [SPARK-966](https://spark-project.atlassian.net/browse/SPARK-966) for details.	2013-11-30 23:38:49 -08:00
Reynold Xin	9cf7f31e4d	Memoize preferred locations in ZippedPartitionsBaseRDD so preferred location computation doesn't lead to exponential explosion. (cherry picked from commit `e36fe55a03`) Signed-off-by: Reynold Xin <rxin@apache.org>	2013-11-30 18:10:52 -08:00
Sundeep Narravula	4d53830eb7	Scheduler quits when createStage fails. The current scheduler thread does not handle exceptions from createStage stage while launching new jobs. The thread fails on any exception that gets triggered at that level, leaving the cluster hanging with no schduler.	2013-11-30 16:18:12 -08:00
Aaron Davidson	96df26be47	Add spaces between tests	2013-11-29 13:20:43 -08:00
Prashant Sharma	5618af6803	Merge branch 'master' into wip-scala-2.10	2013-11-29 13:41:21 +05:30
Prashant Sharma	1bc83ca791	Changed defaults for akka to almost disable failure detector.	2013-11-29 13:41:05 +05:30
Lian, Cheng	4a1d966e26	More comments	2013-11-29 16:02:58 +08:00
Lian, Cheng	1e25086009	Updated some inline comments in DAGScheduler	2013-11-29 15:56:47 +08:00
Josh Rosen	3787f514d9	Fix UnicodeEncodeError in PySpark saveAsTextFile(). Fixes SPARK-970.	2013-11-28 23:44:56 -08:00
Aaron Davidson	081a0b6861	Add unit test for SparkContext scheduler creation Since YARN and Mesos are not necessarily available in the system, they are allowed to pass as long as the YARN/Mesos code paths are exercised.	2013-11-28 20:40:57 -08:00
Aaron Davidson	37f161cf6b	Re-enable zk:// urls for Mesos SparkContexts This was broken in PR #71 when we explicitly disallow anything that didn't fit a mesos:// url. Although it is not really clear that a zk:// url should match Mesos, it is what the docs say and it is necessary for backwards compatibility.	2013-11-28 20:37:56 -08:00
Lian, Cheng	18def5d6f2	Bugfix: SPARK-965 & SPARK-966 SPARK-965: https://spark-project.atlassian.net/browse/SPARK-965 SPARK-966: https://spark-project.atlassian.net/browse/SPARK-966 * Add back DAGScheduler.start(), eventProcessActor is created and started here. Notice that function is only called by SparkContext. * Cancel the scheduled stage resubmission task when stopping eventProcessActor * Add a new DAGSchedulerEvent ResubmitFailedStages This event message is sent by the scheduled stage resubmission task to eventProcessActor. In this way, DAGScheduler.resubmitFailedStages is guaranteed to be executed from the same thread that runs DAGScheduler.processEvent. Please refer to discussion in SPARK-966 for details.	2013-11-28 17:46:06 +08:00
Prashant Sharma	3ec5d74766	Fixed the broken build.	2013-11-28 13:02:28 +05:30
Matei Zaharia	743a31a7ca	Merge pull request #210 from haitaoyao/http-timeout add http timeout for httpbroadcast While pulling task bytecode from HttpBroadcast server, there's no timeout value set. This may cause spark executor code hang and other task in the same executor process wait for the lock. I have encountered the issue in my cluster. Here's the stacktrace I captured : https://gist.github.com/haitaoyao/7655830 So add a time out value to ensure the task fail fast.	2013-11-27 18:24:39 -08:00

... 3 4 5 6 7 ...

4987 commits