Commit graph

1671 commits

Author SHA1 Message Date
seanm ee4ce2fc51 adding takeOrdered to java API 2013-07-10 10:46:04 -07:00
seanm 24705d0f46 adding takeOrdered() to RDD 2013-07-10 10:33:11 -07:00
Karen Feng 620a6974c6 Allows for larger files, refactors lastNBytes, removes old Log column, fixes imports, uses map 2013-07-10 10:20:53 -07:00
BlackNiuza ce18b50d5f set SUCCEEDED for all master in shutdown hook 2013-07-10 19:11:43 +08:00
Karen Feng b6072b58bf Fixes style, makes "std__-page" consistent, reads only parts of files 2013-07-09 17:25:10 -07:00
Karen Feng 13fc6f248c Clean commit of log paging 2013-07-09 14:17:15 -07:00
BlackNiuza aaa7b081df Adjust the code according to mridulm's comments 2013-07-09 20:03:01 +08:00
Charles Reiss e47253e0cc Reset ClassLoader in MesosSchedulerBackend, too. (per review comments).
Also set ClassLoader for all mesos callbacks, not just statusUpdate,
registered.
2013-07-09 01:23:23 -07:00
BlackNiuza c1d44be805 Bug fix: SPARK-796 2013-07-09 15:18:28 +08:00
Matei Zaharia 7dcda9ae74 Merge pull request #688 from markhamstra/scalaDependencies
Fixed SPARK-795 with explicit dependencies
2013-07-08 23:24:23 -07:00
Mark Hamstra 0b39d66f3f pom cleanup 2013-07-08 16:07:09 -07:00
Mark Hamstra afdaf430bd Explicit dependencies for scala-library and scalap to prevent 2.9.2 vs. 2.9.3 problems 2013-07-08 15:40:50 -07:00
Charles Reiss 8c1d1c98e0 Explicitly set class loader for MesosSchedulerDriver callbacks. 2013-07-08 12:25:46 -07:00
Shivaram Venkataraman 4af0d63cb1 Remove akka LogLevel fix as we no longer use spray 2013-07-07 10:42:43 -07:00
Shivaram Venkataraman d362d0f411 Ignore stderr when calling cat on a non-existing file 2013-07-07 04:09:46 -07:00
Shivaram Venkataraman 7d6d9e6ab2 Set DriverSuite log level to WARN 2013-07-07 04:09:15 -07:00
Shivaram Venkataraman a948f06725 Suppress log messages in sbt test with two changes:
1. Set akka log level to ERROR before shutting down the actorSystem.
This avoids akka log messages (like Spray) from falling back to INFO
on the Stdout logger
2. Initialize netty to use SLF4J in LocalSparkContext. This ensures that
stack trace thrown during shutdown is handled by SLF4J instead of stdout
2013-07-07 04:09:08 -07:00
Patrick Wendell 32b9d21a97 Fix occasional failure in UI listener.
If a task fails before the metrics are initialized, it remains possible
that the metrics field will be `None`. This patch accounts for that possibility
by keeping metrics as an `Option` at all times.
2013-07-06 16:40:02 -07:00
Matei Zaharia 1ffadb2d9e Merge remote-tracking branch 'pwendell/ui-updates'
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	pom.xml
2013-07-06 15:51:41 -07:00
Matei Zaharia 94871e4703 Merge pull request #655 from tgravescs/master
Add support for running Spark on Yarn on a secure Hadoop Cluster
2013-07-06 15:26:19 -07:00
Matei Zaharia 3f918b33f8 Merge pull request #672 from holdenk/master
s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning
2013-07-06 12:45:18 -07:00
Matei Zaharia 2a36e5449b Merge pull request #673 from xiajunluan/master
Add config template file for fair scheduler feature
2013-07-06 12:43:21 -07:00
Matei Zaharia 7ba7fa110b Merge pull request #674 from liancheng/master
Bug fix: SPARK-789
2013-07-06 11:45:08 -07:00
BlackNiuza 44a2440039 Remove active job from idToActiveJob when job finished or aborted 2013-07-07 01:33:09 +08:00
Patrick Wendell 37abe84212 Tracking some task metrics even during failures. 2013-07-06 09:19:59 -07:00
Patrick Wendell 84b7fc54e6 Enforcing correct sort order for formatted strings 2013-07-05 17:21:08 -07:00
Matei Zaharia 399bd65ef5 Fixed compile error due to merge 2013-07-05 11:27:06 -07:00
Matei Zaharia 652ea0f1d8 Allow RDD.takeSample to give samples bigger than the RDD
Before, when withReplacement was set to true, we would not get a sample
bigger than the RDD's count().

Conflicts:
	core/src/main/scala/spark/RDD.scala
	core/src/test/scala/spark/RDDSuite.scala
2013-07-05 11:15:13 -07:00
Matei Zaharia 6586c5e28b Added a SparkContext accessor to RDD 2013-07-05 11:13:46 -07:00
jerryshao e4ff544a8d Clean StageToInfos periodically when spark.cleaner.ttl is enabled 2013-07-05 10:34:45 +08:00
Lian Cheng c0c3155c3c Bug fix: SPARK-789
https://spark-project.atlassian.net/browse/SPARK-789
2013-07-05 00:54:10 +08:00
Andrew xia 6ccfb73ca9 Add fair scheduler config template file 2013-07-04 19:19:44 +08:00
Holden Karau 0f06d6217d s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning 2013-07-04 01:05:39 -07:00
Prashant Sharma a5f1f6a907 Merge branch 'master' into master-merge
Conflicts:
	core/pom.xml
	core/src/main/scala/spark/MapOutputTracker.scala
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/RDDCheckpointData.scala
	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/Utils.scala
	core/src/main/scala/spark/api/python/PythonRDD.scala
	core/src/main/scala/spark/deploy/client/Client.scala
	core/src/main/scala/spark/deploy/master/MasterWebUI.scala
	core/src/main/scala/spark/deploy/worker/Worker.scala
	core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala
	core/src/main/scala/spark/rdd/BlockRDD.scala
	core/src/main/scala/spark/rdd/ZippedRDD.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/BlockManagerMaster.scala
	core/src/main/scala/spark/storage/BlockManagerMasterActor.scala
	core/src/main/scala/spark/storage/BlockManagerUI.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	core/src/test/scala/spark/SizeEstimatorSuite.scala
	pom.xml
	project/SparkBuild.scala
	repl/src/main/scala/spark/repl/SparkILoop.scala
	repl/src/test/scala/spark/repl/ReplSuite.scala
	streaming/src/main/scala/spark/streaming/StreamingContext.scala
	streaming/src/main/scala/spark/streaming/api/java/JavaStreamingContext.scala
	streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
	streaming/src/main/scala/spark/streaming/util/MasterFailureTest.scala
2013-07-03 11:43:26 +05:30
Y.CORP.YAHOO.COM\tgraves 923cf92900 Rework from pull request. Removed --user option from Spark on Yarn Client, made the use of JAVA_HOME environment
variable conditional on whether it is set, and created addCredentials in each of the SparkHadoopUtil classes
to only add the credentials when the profile is hadoop2-yarn.
2013-07-02 21:18:59 -05:00
Patrick Wendell 39e2325675 Removing dead code 2013-07-02 16:28:40 -07:00
Patrick Wendell 8ca1cc1786 Adding truncation for log files 2013-07-02 16:10:50 -07:00
Patrick Wendell 9a42d04efa Throw exception for missing resource 2013-07-01 14:43:13 -07:00
Patrick Wendell 1025d7d1ef Package refactoring 2013-07-01 14:40:53 -07:00
Patrick Wendell 30b9034241 Fixing bug where logs aren't shown 2013-07-01 13:48:01 -07:00
Patrick Wendell 8688689387 Various formatting changes 2013-07-01 13:40:12 -07:00
Patrick Wendell 735c951a09 Adding test script 2013-07-01 09:33:22 -07:00
Patrick Wendell 5de326db7d Print exception message 2013-07-01 09:19:45 -07:00
root ec31e68d5d Fixed PySpark perf regression by not using socket.makefile(), and improved
debuggability by letting "print" statements show up in the executor's stderr

Conflicts:
	core/src/main/scala/spark/api/python/PythonRDD.scala
2013-07-01 06:26:31 +00:00
root 3296d132b6 Fix performance bug with new Python code not using buffered streams 2013-07-01 06:25:43 +00:00
Matei Zaharia 03d0b858c8 Made use of spark.executor.memory setting consistent and documented it
Conflicts:

	core/src/main/scala/spark/SparkContext.scala
2013-06-30 15:46:46 -07:00
Patrick Wendell e721ff7e5a Allowing details for failed stages 2013-06-29 11:26:30 -07:00
Patrick Wendell 473961d82e Styling for progress bar 2013-06-29 08:38:04 -07:00
Patrick Wendell 249f0e54ba Minor changes from Matei's review 2013-06-28 13:25:26 -07:00
Matei Zaharia 50ca17635a Merge pull request #664 from pwendell/test-fix
Removing incorrect test statement
2013-06-27 22:24:52 -07:00
Patrick Wendell c537e869f3 Missing logo file 2013-06-27 22:02:03 -07:00
Patrick Wendell c767e74370 Removing incorrect test statement 2013-06-27 21:48:58 -07:00
Patrick Wendell 62c2c6b856 Forcing Jetty to run as daemon 2013-06-27 21:47:22 -07:00
Patrick Wendell a55190d314 Adding better tabs for UI headers. 2013-06-27 19:14:51 -07:00
Patrick Wendell 362d996c81 Handful of changes based on Matei's review
- Avoid exception when no tasks have finished for a stage
- Adding DOCTYPE so css renders properly
- Adding progress slider
2013-06-27 19:14:28 -07:00
Patrick Wendell 92a4c2a5f6 Fixing bug in local scheduler time recording 2013-06-27 12:33:06 -07:00
Stephen Haberman d7011632d1 Wrap lines. 2013-06-26 12:35:57 -05:00
Patrick Wendell ee692482a6 One more private class 2013-06-26 09:07:32 -07:00
Patrick Wendell a59c15a37e Adding config option for retained stages 2013-06-26 08:54:57 -07:00
Patrick Wendell 274193664a Bumping timeouts 2013-06-26 08:51:28 -07:00
Patrick Wendell b14ad509ba Moving static ui package 2013-06-26 08:46:51 -07:00
Patrick Wendell 2cbaa0734b Making all new classes package private 2013-06-26 08:44:55 -07:00
Stephen Haberman d11025dc6a Be cute with Option and getenv. 2013-06-26 09:53:35 -05:00
Matei Zaharia 9f0d913295 Refactored tests to share SparkContexts in some of them
Creating these seems to take a while and clutters the output with Akka
stuff, so it would be nice to share them.
2013-06-25 19:18:30 -04:00
Matei Zaharia 6c8d1b2ca6 Fix computation of classpath when we launch java directly
The previous version assumed that a CLASSPATH environment variable was
set by the "run" script when launching the process that starts the
ExecutorRunner, but unfortunately this is not true in tests. Instead, we
factor the classpath calculation into an external script and call that.

NOTE: This includes a Windows version but hasn't yet been tested there.
2013-06-25 18:21:00 -04:00
Matei Zaharia 15b00914c5 Some fixes to the launch-java-directly change:
- Split SPARK_JAVA_OPTS into multiple command-line arguments if it
  contains spaces; this splitting follows quoting rules in bash
- Add the Scala JARs to the classpath if they're not in the CLASSPATH
  variable because the ExecutorRunner is launched with "scala" (this can
  happen when using local-cluster URLs in spark-shell)
2013-06-25 17:17:27 -04:00
Matei Zaharia 7680ce0bd6 Fixed deprecated use of expect in SizeEstimatorSuite 2013-06-25 16:11:44 -04:00
Matei Zaharia 7e0191c6ea Merge remote-tracking branch 'cgrothaus/SPARK-698'
Conflicts:
	run
2013-06-25 15:47:40 -04:00
Patrick Wendell d66bd6f885 Adding another unit test to Web UI suite 2013-06-24 17:12:55 -07:00
Patrick Wendell f7389330c3 Allowing for requested port on construction 2013-06-24 16:51:52 -07:00
Patrick Wendell 42157027f2 A few bug fixes and a unit test 2013-06-24 16:25:05 -07:00
Patrick Wendell a4248138b4 Minor style cleanup 2013-06-24 14:22:28 -07:00
Patrick Wendell b5e6e8bcc8 Cleaning up some code for Job Progress 2013-06-24 14:13:24 -07:00
Patrick Wendell 93e8ed85aa Work around for initialization issue 2013-06-24 13:11:18 -07:00
Patrick Wendell f6e64b5cd6 Updating based on changes to JobLogger (and one small change to JobLogger) 2013-06-24 12:40:41 -07:00
Matei Zaharia 78ffe164b3 Clone the zero value for each key in foldByKey
The old version reused the object within each task, leading to
overwriting of the object when a mutable type is used, which is expected
to be common in fold.

Conflicts:

	core/src/test/scala/spark/ShuffleSuite.scala
2013-06-23 10:26:53 -07:00
Matei Zaharia 0e0f9d3069 Fix search path for REPL class loader to really find added JARs 2013-06-22 17:44:04 -07:00
Matei Zaharia 3e61beff7b Merge pull request #648 from shivaram/netty-dbg
Shuffle fixes and cleanup
2013-06-22 16:22:47 -07:00
Patrick Wendell 7e9f1ed0de Some cleanup of styling 2013-06-22 10:31:37 -07:00
Patrick Wendell 3b7ebdeeb8 Handling entirely failed stages 2013-06-22 10:31:37 -07:00
Patrick Wendell be6107ce44 Some tweaking with shared page header 2013-06-22 10:31:37 -07:00
Patrick Wendell 9a24d1a2d0 Using scala in XML imports 2013-06-22 10:31:37 -07:00
Patrick Wendell f91e1c4822 Linking RDD information when available in stages 2013-06-22 10:31:37 -07:00
Patrick Wendell a86bb459e2 Showing shuffle status and purging old stages 2013-06-22 10:31:37 -07:00
Patrick Wendell 3485e73376 Style cleanup 2013-06-22 10:31:37 -07:00
Patrick Wendell dd696f3a3d Some renaming and comments 2013-06-22 10:31:37 -07:00
Patrick Wendell 5c872e9ef5 Documentation and some refactoring 2013-06-22 10:31:37 -07:00
Patrick Wendell 17776323a6 More work on percentile data: 2013-06-22 10:31:37 -07:00
Patrick Wendell dcf6a68177 Refactoring into different modules 2013-06-22 10:31:36 -07:00
Patrick Wendell ce81c320ac Adding helper function to make listing tables 2013-06-22 10:31:36 -07:00
Patrick Wendell 9fd5dc3ea9 Initial steps towards job progress UI 2013-06-22 10:31:36 -07:00
Patrick Wendell bc4a811c57 Stash 2013-06-22 10:31:36 -07:00
Patrick Wendell 77c53f7868 Refactoring UI packages 2013-06-22 10:31:36 -07:00
Patrick Wendell 8b5c7e71c4 Import cleanup 2013-06-22 10:31:36 -07:00
Patrick Wendell 32a45d01b1 Removing twirl files 2013-06-22 10:31:36 -07:00
Patrick Wendell 17f145f3bc Updating Maven build 2013-06-22 10:31:36 -07:00
Patrick Wendell 4e1f202481 Removing dead code 2013-06-22 10:31:36 -07:00
Patrick Wendell d6fde4ffe4 Some JSON cleanup 2013-06-22 10:31:36 -07:00
Patrick Wendell 91ec5a1a04 Changing JSON protocol and removing spray code 2013-06-22 10:31:36 -07:00
Patrick Wendell fc94576ece Adding worker version of UI 2013-06-22 10:31:36 -07:00