ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Matei Zaharia	232765f7b2	Merge pull request #26 from Du-Li/master fixed a wildcard bug in make-distribution.sh; ask sbt to check local maven repo in project/SparkBuild.scala (1) fixed a wildcard bug in make-distribution.sh: with the wildcard * in quotes, this cp command failed. it worked after moving the wildcard out quotes. (2) ask sbt to check local maven repo in SparkBuild.scala: To build Spark (0.9.0-SNAPSHOT) with the HEAD of mesos (0.15.0), I must do "make maven-install" under mesos/build, which publishes the java .jar file under ~/.m2. However, when building Spark (after pointing mesos to version 0.15.0), sbt uses ivy which by default only checks ~/.ivy2. This change is to tell sbt to also check ~/.m2.	2013-10-03 12:00:48 -07:00
Matei Zaharia	405e69bb20	Merge pull request #25 from CruncherBigData/master Update README: updated the link	2013-10-03 10:52:41 -07:00
Matei Zaharia	49dbfccf6b	Merge pull request #28 from tgravescs/sparYarnAppName Allow users to set the application name for Spark on Yarn	2013-10-03 10:52:06 -07:00
tgravescs	c021b8c202	Add default value to usage statement	2013-10-03 08:07:19 -05:00
Reynold Xin	802bfb870d	- Created AsyncRDDActions. - Make FutureJob a Scala Future instead of Java Future.	2013-10-03 01:22:28 -07:00
Reynold Xin	e8e917f209	Merge branch 'master' into kill Conflicts: core/src/main/scala/org/apache/spark/TaskEndReason.scala core/src/main/scala/org/apache/spark/executor/Executor.scala core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala	2013-10-02 23:01:34 -07:00
Matei Zaharia	e597ea34a6	Merge pull request #10 from kayousterhout/results_through-bm Send Task results through the block manager when larger than Akka frame size (fixes SPARK-669). This change requires adding an extra failure mode: tasks can complete successfully, but the result gets lost or flushed from the block manager before it's been fetched. This change also moves the deserialization of tasks into a separate thread, so it's no longer part of the DAG scheduler's tight loop. This should improve scheduler throughput, particularly when tasks are sending back large results. Thanks Josh for writing the original version of this patch! This is duplicated from the mesos/spark repo: https://github.com/mesos/spark/pull/835	2013-10-02 21:14:24 -07:00
Reynold Xin	1c48ba0d9f	Merge remote-tracking branch 'origin' into kill Conflicts: core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala	2013-10-02 16:40:44 -07:00
tgravescs	bc3b20abdc	Allow users to set the application name for Spark on Yarn	2013-10-02 12:54:17 -05:00
David McCauley	1577b373a9	SPARK-921 - Add Application UI URL to ApplicationInfo Json output	2013-10-02 15:03:41 +01:00
David McCauley	351da54676	SPARK-920 - JSON endpoint URI scheme part (spark://) duplicated	2013-10-02 13:23:38 +01:00
Du Li	9fd6bba60d	ask ivy/sbt to check local maven repo under ~/.m2	2013-10-01 15:46:51 -07:00
Du Li	0d19f00e9e	fixed a bug of using wildcard in quotes	2013-10-01 15:42:06 -07:00
CruncherBigData	c85f720588	Update README	2013-10-01 09:05:03 -07:00
Kay Ousterhout	0dcad2edcb	Added additional unit test for repeated task failures	2013-09-30 23:26:15 -07:00
Kay Ousterhout	dea4677c88	Fixed compilation errors and broken test.	2013-09-30 22:07:01 -07:00
Kay Ousterhout	8deda427bc	Merge remote-tracking branch 'upstream/master' into results_through-bm Conflicts: core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterScheduler.scala core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala core/src/main/scala/org/apache/spark/scheduler/local/LocalTaskSetManager.scala	2013-09-30 10:16:58 -07:00
Kay Ousterhout	58b764b7c6	Addressed Matei's code review comments	2013-09-30 10:11:59 -07:00
Grace Huang	892fb8ffa8	remedy the line-wrap while exceeding 100 chars	2013-09-30 20:12:55 +08:00
Harvey Feng	7d06bdde1d	Merge HadoopDatasetRDD into HadoopRDD.	2013-09-29 20:08:03 -07:00
Grace Huang	4b68be5f3c	SPARK-900 Use coarser grained naming for metrics	2013-09-27 14:47:38 +08:00
Harvey Feng	417085716a	Merge remote-tracking branch 'oldsparkme/hadoopRDD-broadcast-change' into hadoop-config-cache	2013-09-26 15:49:42 -07:00
Reynold Xin	714fdabd99	Merge pull request #17 from rxin/optimize Remove -optimize flag	2013-09-26 14:28:55 -07:00
Reynold Xin	13eced723f	Merge pull request #16 from pwendell/master Bug fix in master build	2013-09-26 14:18:19 -07:00
Reynold Xin	70a0b993d4	Merge pull request #14 from kayousterhout/untangle_scheduler Improved organization of scheduling packages. This commit does not change any code -- only file organization. Please let me know if there was some masterminded strategy behind the existing organization that I failed to understand! There are two components of this change: (1) Moving files out of the cluster package, and down a level to the scheduling package. These files are all used by the local scheduler in addition to the cluster scheduler(s), so should not be in the cluster package. As a result of this change, none of the files in the local package reference files in the cluster package. (2) Moving the mesos package to within the cluster package. The mesos scheduling code is for a cluster, and represents a specific case of cluster scheduling (the Mesos-related classes often subclass cluster scheduling classes). Thus, the most logical place for it seems to be within the cluster package. The one thing about the scheduling code that seems a little funny to me is the naming of the SchedulerBackends. The StandaloneSchedulerBackend is not just for Standalone mode, but instead is used by Mesos coarse grained mode and Yarn, and the backend that is just for Standalone mode is instead called SparkDeploySchedulerBackend. I didn't change this because I wasn't sure if there was a reason for this naming that I'm just not aware of.	2013-09-26 14:11:54 -07:00
Reynold Xin	76677b8fa1	Merge pull request #670 from jey/ec2-ssh-improvements EC2 SSH improvements	2013-09-26 14:03:46 -07:00
Reynold Xin	3f283278b0	Removed scala -optimize flag.	2013-09-26 13:58:10 -07:00
Reynold Xin	c514cd1587	Merge pull request #930 from holdenk/master Add mapPartitionsWithIndex	2013-09-26 13:48:20 -07:00
Patrick Wendell	e2ff59af72	Bug fix in master build	2013-09-26 13:06:51 -07:00
Reynold Xin	560ee5c9bb	Merge pull request #7 from wannabeast/memorystore-fixes some minor fixes to MemoryStore This is a repeat of #5, moved to its own branch in my repo. This makes all updates to on ; it skips on synchronizing the reads where it can get away with it.	2013-09-26 11:27:34 -07:00
Patrick Wendell	6566a19b38	Merge pull request #9 from rxin/limit Smarter take/limit implementation.	2013-09-26 08:01:04 -07:00
Kay Ousterhout	d85fe41b2b	Improved organization of scheduling packages. This commit does not change any code -- only file organization. There are two components of this change: (1) Moving files out of the cluster package, and down a level to the scheduling package. These files are all used by the local scheduler in addition to the cluster scheduler(s), so should not be in the cluster package. As a result of this change, none of the files in the local package reference files in the cluster package. (2) Moving the mesos package to within the cluster package. The mesos scheduling code is for a cluster, and represents a specific case of cluster scheduling (the Mesos-related classes often subclass cluster scheduling classes). Thus, the most logical place for it is within the cluster package.	2013-09-25 12:45:46 -07:00
Patrick Wendell	9d34838bde	Merge remote-tracking branch 'apache-github/pr/13' into HEAD	2013-09-24 15:31:12 -07:00
Patrick Wendell	6079721fa1	Update build version in master	2013-09-24 11:41:51 -07:00
Holden Karau	0cef683553	Fix formatting :)	2013-09-23 19:39:42 -07:00
Reynold Xin	7220e8f90b	Merge remote-tracking branch 'pr/12' Fix spacing so java.io.tmpdir doesn't run on with SPARK_JAVA_OPTS	2013-09-23 14:02:21 -07:00
$Y.CORP.YAHOO.COM\tgraves$ Y.CORP.YAHOO.COM\tgraves	a314b30733	Fix spacing so that the java.io.tmpdir doesn't run on with SPARK_JAVA_OPTS	2013-09-23 14:48:17 -05:00
Reynold Xin	0d2e5c3e24	Merge branch 'master' of https://git-wip-us.apache.org/repos/asf/incubator-spark	2013-09-23 11:55:55 -07:00
Reynold Xin	ff540a015b	Merge branch 'master' of github.com:markhamstra/incubator-spark	2013-09-23 11:55:02 -07:00
Reynold Xin	f4dc9d37f8	Merge branch 'master' of github.com:mesos/spark	2013-09-23 11:52:52 -07:00
Nick Pentreath	d952f04c8e	Merge remote-tracking branch 'upstream/master' into implicit-als	2013-09-23 13:07:40 +02:00
Kay Ousterhout	c75eb14fe5	Send Task results through the block manager when larger than Akka frame size. This change requires adding an extra failure mode: tasks can complete successfully, but the result gets lost or flushed from the block manager before it's been fetched.	2013-09-22 21:20:48 -07:00
Holden Karau	7fe0b0ff56	Switch indent from 2 to 4 spaces	2013-09-22 19:44:51 -07:00
Reynold Xin	834686b108	Merge pull request #928 from jerryshao/fairscheduler-refactor Refactor FairSchedulableBuilder	2013-09-22 15:06:48 -07:00
Harvey	ef34cfb26c	Move Configuration broadcasts to SparkContext.	2013-09-22 14:43:58 -07:00
Harvey	a6eeb5ffd5	Add a cache for HadoopRDD metadata needed during computation. Currently, the cache is in SparkHadoopUtils, since it's conveniently a member of the SparkEnv.	2013-09-22 03:09:17 -07:00
jerryshao	77e9da1f34	Change Exception to NoSuchElementException and minor style fix	2013-09-22 16:50:08 +08:00
jerryshao	85024acd2e	Remove infix style and others	2013-09-22 14:20:55 +08:00
jerryshao	5850f599dd	Refactor FairSchedulableBuilder: 1. Configuration can be read from classpath if not set explicitly. 2. Add missing close handler.	2013-09-22 14:20:55 +08:00
Reynold Xin	a2ea069a5f	Merge pull request #937 from jerryshao/localProperties-fix Fix PR926 local properties issues in Spark Streaming like scenarios	2013-09-21 23:04:42 -07:00

1 2 3 4 5 ...

4192 commits