Commit graph

4306 commits

Author SHA1 Message Date
Reynold Xin ff540a015b Merge branch 'master' of github.com:markhamstra/incubator-spark 2013-09-23 11:55:02 -07:00
Reynold Xin f4dc9d37f8 Merge branch 'master' of github.com:mesos/spark 2013-09-23 11:52:52 -07:00
Y.CORP.YAHOO.COM\tgraves 9d4246863a Support distributed cache files and archives on spark on yarn and attempt to clean up the staging directory on exit 2013-09-23 09:09:59 -05:00
Nick Pentreath d952f04c8e Merge remote-tracking branch 'upstream/master' into implicit-als 2013-09-23 13:07:40 +02:00
Kay Ousterhout c75eb14fe5 Send Task results through the block manager when larger than Akka frame size.
This change requires adding an extra failure mode: tasks can complete
successfully, but the result gets lost or flushed from the block manager
before it's been fetched.
2013-09-22 21:20:48 -07:00
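A minimal sketch of the idea in the commit above, not Spark's actual code: a result that fits in one message is sent directly, while a larger one is parked in a store and only its id is sent. Every name here (FRAME_SIZE, DirectResult, IndirectResult, ResultLost, send_result, fetch_result) is hypothetical, and a plain dict stands in for the block manager. The extra failure mode shows up as ResultLost: the task finished, but the stored result was gone before it was fetched.

```python
from dataclasses import dataclass
from typing import Dict, Union

# Hypothetical stand-in for the Akka frame size limit, in bytes.
FRAME_SIZE = 16


@dataclass
class DirectResult:
    payload: bytes  # small result, carried inside the message itself


@dataclass
class IndirectResult:
    block_id: str  # large result, stored elsewhere and referenced by id


class ResultLost(Exception):
    """The task succeeded, but its stored result is no longer available."""


def send_result(task_id: int, payload: bytes,
                store: Dict[str, bytes]) -> Union[DirectResult, IndirectResult]:
    # Small results travel in the message; large ones go through the store.
    if len(payload) <= FRAME_SIZE:
        return DirectResult(payload)
    block_id = f"taskresult_{task_id}"
    store[block_id] = payload
    return IndirectResult(block_id)


def fetch_result(msg: Union[DirectResult, IndirectResult],
                 store: Dict[str, bytes]) -> bytes:
    if isinstance(msg, DirectResult):
        return msg.payload
    payload = store.get(msg.block_id)
    if payload is None:
        # New failure mode: the block was lost or evicted before the fetch.
        raise ResultLost(msg.block_id)
    return payload
```

A caller would presumably treat ResultLost like a task failure and rerun the task, though the commit message does not say how the case is handled.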
Holden Karau 7fe0b0ff56 Switch indent from 2 to 4 spaces 2013-09-22 19:44:51 -07:00
Reynold Xin 834686b108 Merge pull request #928 from jerryshao/fairscheduler-refactor
Refactor FairSchedulableBuilder
2013-09-22 15:06:48 -07:00
Harvey ef34cfb26c Move Configuration broadcasts to SparkContext. 2013-09-22 14:43:58 -07:00
Harvey a6eeb5ffd5 Add a cache for HadoopRDD metadata needed during computation.
Currently, the cache is in SparkHadoopUtils, since it's conveniently a member of the SparkEnv.
2013-09-22 03:09:17 -07:00
jerryshao 77e9da1f34 Change Exception to NoSuchElementException and minor style fix 2013-09-22 16:50:08 +08:00
jerryshao 85024acd2e Remove infix style and others 2013-09-22 14:20:55 +08:00
jerryshao 5850f599dd Refactor FairSchedulableBuilder:
1. Configuration can be read from classpath if not set explicitly.
2. Add missing close handler.
2013-09-22 14:20:55 +08:00
Reynold Xin a2ea069a5f Merge pull request #937 from jerryshao/localProperties-fix
Fix PR926 local properties issues in Spark Streaming like scenarios
2013-09-21 23:04:42 -07:00
Reynold Xin f06f2da2cb Merge pull request #941 from ilikerps/master
Add "org.apache." prefix to packages in spark-class
2013-09-21 22:43:34 -07:00
Reynold Xin 7bb12a2af3 Merge pull request #940 from ankurdave/clear-port-properties-after-tests
After unit tests, clear port properties unconditionally
2013-09-21 22:42:46 -07:00
Harvey be0fc7246f Split HadoopRDD into one for general Hadoop datasets and one tailored to Hadoop files, which is a common case.
This is the first step to avoiding unnecessary Configuration broadcasts per HadoopRDD instantiation.
2013-09-21 21:14:14 -07:00
jerryshao aa0c29f747 Add barrier for local properties unit test and fix some styles 2013-09-22 09:53:11 +08:00
Aaron Davidson 8933f9e98e Add "org.apache." prefix to packages in spark-class
Lacking this, the if/case statements never trigger on Spark 0.8.0+.
2013-09-20 19:27:08 -07:00
Reynold Xin 42571d30d0 Smarter take/limit implementation. 2013-09-20 17:09:53 -07:00
Reynold Xin 119de80294 Merge branch 'master' of github.com:mesos/spark 2013-09-20 15:03:55 -07:00
Vadim Chekan fbe40c5806 Serialize and restore spark.cleaner.ttl to savepoint 2013-09-20 12:13:48 -07:00
Reynold Xin 1d87616b61 Made output of CoGroup and aggregations interruptible. 2013-09-19 23:31:36 -07:00
Mike 9524b943a4 Synchronize on "entries" the remaining update to "currentMemory".
Make "currentMemory" @volatile, so that its reads in ensureFreeSpace() are atomic and up-to-date--i.e., currentMemory can't increase while putLock is held (though it could decrease, which would only help ensureFreeSpace()).
2013-09-19 23:31:35 -07:00
Ankur Dave 026dba6aba After unit tests, clear port properties unconditionally
In MapOutputTrackerSuite, the "remote fetch" test sets spark.driver.port
and spark.hostPort, assuming that they will be cleared by
LocalSparkContext. However, the test never sets sc, so it remains null,
causing LocalSparkContext to skip clearing these properties. Subsequent
tests therefore fail with java.net.BindException: "Address already in
use".

This commit makes LocalSparkContext clear the properties even if sc is
null.
2013-09-19 22:05:23 -07:00
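The bug pattern described in the commit above can be sketched as follows. This is an illustration in Python rather than the Scala test code, and all names (properties, FakeContext, teardown_buggy, teardown_fixed) are hypothetical; a module-level dict stands in for JVM system properties. The point is only that the cleanup must not be nested under the null check.

```python
from typing import Dict, Optional

# Hypothetical stand-in for JVM system properties.
properties: Dict[str, str] = {}

PORT_KEYS = ("spark.driver.port", "spark.hostPort")


class FakeContext:
    """Hypothetical stand-in for a SparkContext created by a test."""

    def __init__(self) -> None:
        self.stopped = False

    def stop(self) -> None:
        self.stopped = True


def teardown_buggy(sc: Optional[FakeContext]) -> None:
    # Cleanup is skipped entirely when the test never assigned sc.
    if sc is not None:
        sc.stop()
        for key in PORT_KEYS:
            properties.pop(key, None)


def teardown_fixed(sc: Optional[FakeContext]) -> None:
    # Stop the context only if there is one, but always clear the properties.
    if sc is not None:
        sc.stop()
    for key in PORT_KEYS:
        properties.pop(key, None)
```

With the buggy version, a test that sets the port properties but leaves sc unset leaks them into the next test, which is the situation the commit message ties to the BindException.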
Reynold Xin c5e40954eb Wrap around cached data to InterruptibleIterator. 2013-09-19 18:44:38 -07:00
Reynold Xin c68e72be59 Added comment to InterruptibleIterator. 2013-09-19 18:40:06 -07:00
Reynold Xin 70953810b4 Added task killing iterator to RDDs that take inputs. 2013-09-19 18:33:16 -07:00
Reynold Xin f19984dafe More logging changes (task killing for local cluster doesn't work yet). 2013-09-19 18:14:51 -07:00
Reynold Xin 85a0dffe0f Made task killing work for standalone cluster schedulers. 2013-09-19 16:41:29 -07:00
Patrick Wendell cd7222c3dd Merge pull request #938 from ilikerps/master
Fix issue with spark_ec2 seeing empty security groups
2013-09-19 14:21:24 -07:00
Aaron Davidson f589ce771a Fix issue with spark_ec2 seeing empty security groups
Under unknown, but occasional, circumstances, reservation.groups is empty
despite reservation.instances each having groups. This means that the
spark_ec2 get_existing_clusters() method would fail to find any instances.
To fix it, we simply use the instances' groups as the source of truth.

Note that this is actually just a revival of PR #827, now that the issue
has been reproduced.
2013-09-19 14:09:26 -07:00
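A minimal sketch of the lookup change described in the commit above, assuming simplified data shapes. The Instance and Reservation classes and the find_cluster_instances helper are hypothetical stand-ins, not the EC2 client objects or the real get_existing_clusters() code; only the attribute names groups and instances come from the commit message.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    id: str
    groups: List[str] = field(default_factory=list)


@dataclass
class Reservation:
    instances: List[Instance]
    # May be empty even when every instance carries its groups.
    groups: List[str] = field(default_factory=list)


def find_cluster_instances(reservations: List[Reservation],
                           group_name: str) -> List[Instance]:
    # Use each instance's own groups as the source of truth and
    # ignore the reservation-level list, which can come back empty.
    found = []
    for reservation in reservations:
        for instance in reservation.instances:
            if group_name in instance.groups:
                found.append(instance)
    return found
```

Filtering on the reservation-level list instead would return nothing for a reservation whose groups list is empty, which matches the failure the commit describes.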
Reynold Xin 9f8190c17d Fixed a bug for zero partition in JobWaiter. 2013-09-18 22:42:35 -07:00
Reynold Xin 9332246bd0 Added a hack to kill all active jobs in SparkContext. 2013-09-18 04:38:24 -07:00
Reynold Xin bf515688e7 Allow SparkContext.submitJob to submit a job for only a subset of the partitions. 2013-09-18 04:16:18 -07:00
jerryshao ffa5f8e11d Fix issue when local properties pass from parent to child thread 2013-09-18 17:33:24 +08:00
Reynold Xin 37d8f37a8e Added a submitJob interface that returns a Future of the result. 2013-09-17 21:13:59 -07:00
Reynold Xin 1cb42e6b2d Properly handle job failure when the job gets killed. 2013-09-16 22:10:45 -07:00
Reynold Xin cbc48be13b Initial commit for job killing. 2013-09-16 18:54:06 -07:00
Reynold Xin 3443d3fd43 Merge branch 'master' of github.com:mesos/spark 2013-09-16 13:10:35 -07:00
Patrick Wendell 2aff7989ab Merge pull request #933 from jey/yarn-typo-fix
Fix typo in Maven build docs
2013-09-15 14:05:04 -07:00
Jey Kottalam ac0dd99394 Fix typo in Maven build docs 2013-09-15 13:29:22 -07:00
Patrick Wendell dbd2c4fd94 Merge pull request #932 from pwendell/mesos-version
Bumping Mesos version to 0.13.0
2013-09-15 13:20:41 -07:00
Patrick Wendell 9fb0b9d77f Merge pull request #931 from pwendell/yarn-docs
Explain yarn.version in Maven build docs
2013-09-15 13:02:53 -07:00
Patrick Wendell c856860c5b Bumping Mesos version to 0.13.0 2013-09-15 12:46:26 -07:00
Patrick Wendell 362ea0c051 Explain yarn.version in Maven build docs 2013-09-15 12:40:49 -07:00
Holden Karau 68068977b8 Fix build on ubuntu 2013-09-14 20:51:11 -07:00
Holden Karau 452db1083c Merge branch 'master' of https://github.com/mesos/spark 2013-09-14 15:54:04 -07:00
Holden Karau bfcddf4700 Make mapPartitionsWithIndex work with JavaRDDs 2013-09-14 15:53:42 -07:00
Patrick Wendell c4c1db2dd5 Merge pull request #929 from pwendell/master
Use different Hadoop version for YARN artifacts.
2013-09-13 19:52:12 -07:00
Patrick Wendell e9eba8c3ce Use different Hadoop version for YARN artifacts.
This uses a separate Hadoop version for the YARN artifacts. This means when people link against
spark-yarn, things will resolve correctly.
2013-09-13 15:34:57 -07:00