Commit graph

2112 commits

Author SHA1 Message Date
Shivaram Venkataraman 471fbadd0c Java examples, tests for KMeans and ALS
- Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it
  easier to call from Java
- Renames class methods from `train` to `run` to enable static methods to be
  called from Java.
- Add unit tests which check if both static / class methods can be called.
- Also add examples which port the main() function in ALS, KMeans to the
  examples project.

Couple of minor changes to existing code:
- Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily
- Workaround a bug where using double[] from Java leads to class cast exception in
  KMeans init
2013-08-06 15:43:46 -07:00
anfeng dda2ac8b5d reformat registerFileSystemStat() 2013-08-06 15:22:25 -07:00
Karen Feng 099528b6c4 Pre-sorts stage/env tables, changes text/link of stage summaries 2013-08-06 14:52:12 -07:00
Karen Feng 254a930730 Reverse sorts StageTable by submitted time 2013-08-06 14:18:38 -07:00
Karen Feng 5ed5b73026 Sorts first column of env tables 2013-08-06 13:59:53 -07:00
anfeng 0748c60817 expose HDFS file system stats via Executor metrics 2013-08-06 11:47:06 -07:00
Reynold Xin d031f73679 Merge pull request #782 from WANdisco/master
SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD
2013-08-05 22:33:00 -07:00
Matei Zaharia 1b63dea816 Merge pull request #769 from markhamstra/NegativeCores
SPARK-847 + SPARK-845: Zombie workers and negative cores
2013-08-05 22:21:26 -07:00
Alexander Pivovarov a30866438b SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD 2013-08-05 21:48:43 -07:00
Matei Zaharia 8b277892c9 Merge pull request #774 from pwendell/job-description
Show user-defined job name in UI
2013-08-05 19:14:52 -07:00
Christopher Nguyen b1bbbe699c [HOTFIX] Mark lastSetSparkEnv @volatile in case it gets HotSpot-cached
On branch adatao-global-SparkEnv
Changes to be committed:

	modified:   core/src/main/scala/spark/SparkEnv.scala
2013-08-05 17:22:27 -07:00
Mark Hamstra 35d8f5ee52 Moved handling of timed out workers within the Master actor 2013-08-05 13:13:56 -07:00
Mark Hamstra 37ccf9301a milliseconds -> seconds in timeOutDeadWorkers logging 2013-08-05 13:13:56 -07:00
Mark Hamstra cdd1af562e Timeout zombie workers 2013-08-05 13:13:56 -07:00
Mikhail Bautin e8bec8365f Only reduce the number of cores once when removing an executor 2013-08-05 13:13:56 -07:00
Karen Feng 95025afdec Made most small fixes for SPARK-849 except for table sort, task progress overlay 2013-08-05 13:04:56 -07:00
Bill Zhao 87134b3648 SPARK-850: give better console message 2013-08-05 11:55:35 -07:00
Christopher Nguyen 39e4fda76f [HOTFIX] Extend thread safety for SparkEnv.get()
A ThreadLocal SparkEnv.env is facing various situations leading to
NullPointerExceptions, where SparkEnv.env set in one thread is not
gettable in another thread, but often assumed to be available.

See, e.g., https://groups.google.com/forum/#!topic/spark-developers/GLx8yunSj0A

This hotfixes SparkEnv.env to return either (a) the ThreadLocal
value if non-null, or (b) the previously set value in any thread.

This approach preserves SparkEnv.set() thread safety needed by
RDD.compute() and possibly other places. A refactoring that
parameterizes SparkEnv should be addressed subsequently.

On branch adatao-global-SparkEnv
Changes to be committed:

	modified:   core/src/main/scala/spark/SparkEnv.scala
2013-08-05 02:09:54 -07:00
Patrick Wendell f3660d5ab8 Make output formatting consistent between bash/scala 2013-08-03 21:30:15 -07:00
Patrick Wendell ad94fbb322 Log the launch command for Spark executors 2013-08-03 09:19:46 -07:00
Matei Zaharia 22abbc10d6 Merge pull request #772 from karenfeng/ui-843
Show app duration
2013-08-02 16:37:59 -07:00
Patrick Wendell 5b3784a79c Show user-defined job name in UI 2013-08-02 15:47:41 -07:00
Karen Feng b3ae5b25d5 Shows time the app has been running 2013-08-02 13:25:14 -07:00
Patrick Wendell 9d7dfd2d5a Merge pull request #743 from pwendell/app-metrics
Add application metrics to standalone master
2013-08-01 17:41:58 -07:00
Patrick Wendell f1d2ad550e under_scores --> camelCase for config options 2013-08-01 15:26:26 -07:00
Patrick Wendell 12d9c82c9b Small style fix 2013-08-01 15:25:52 -07:00
Patrick Wendell 37bc64a205 Adding application-level metrics.
This adds metrics for applications in the deploy Master.
2013-08-01 15:25:52 -07:00
Karen Feng 73692f3cb9 Unify, reduce body font size 2013-08-01 15:10:30 -07:00
Patrick Wendell 87fd321a5a Minor refactoring and code cleanup 2013-08-01 15:02:31 -07:00
Patrick Wendell b10199413a Slight refactoring to SparkContext functions 2013-08-01 15:00:42 -07:00
Patrick Wendell cfcd77b5da Increasing inter job arrival 2013-08-01 15:00:42 -07:00
Patrick Wendell 5faac7f4f3 Minor style fixes 2013-08-01 15:00:42 -07:00
Patrick Wendell 5e7b38fbb3 Merge pull request #695 from xiajunluan/pool_ui
Enhance job ui in spark ui system with adding pool information
2013-08-01 14:59:33 -07:00
Karen Feng 47600e9579 Removed hr margin 2013-08-01 14:57:04 -07:00
Karen Feng e648a62fc8 Inserted needed line break for log paging 2013-08-01 14:46:19 -07:00
Karen Feng 686d6266c4 Use nav pills instead of default 2013-08-01 14:41:49 -07:00
Karen Feng 86d372d17f Removed line breaks 2013-08-01 14:37:21 -07:00
Karen Feng 99803d88b9 Reduced all header sizes 2013-08-01 14:18:33 -07:00
Karen Feng d216d687ef Reduced size of table text to compact 2013-08-01 13:27:23 -07:00
Karen Feng 5dae283996 Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update 2013-08-01 11:28:28 -07:00
Matei Zaharia 0a96493ac6 Merge pull request #760 from karenfeng/heading-update
Clean up web UI page headers
2013-08-01 11:27:17 -07:00
Patrick Wendell 9177bea2b4 Removing extra imports 2013-08-01 10:42:50 -07:00
Patrick Wendell 3e4d5e5f8b Merge branch 'master' into master-json
Conflicts:
	core/src/main/scala/spark/deploy/master/ui/IndexPage.scala
2013-08-01 10:42:07 -07:00
Patrick Wendell ffc034e4fb Import cleanup 2013-08-01 10:39:56 -07:00
Andrew xia d58502a156 fix bug of spark "SubmitStage" listener as unit test error 2013-08-01 23:21:41 +08:00
Andrew xia 3b5a11e765 change function name "setName" to "setProperties" as "setName" is also member of Thread class 2013-08-01 19:37:15 +08:00
Dmitriy Lyubimov d29ee3689b Merge fixes merge commit hasn't picked 2013-08-01 00:21:26 -07:00
Dmitriy Lyubimov cb6be5bd7e Merge remote-tracking branch 'mesos/master' into SPARK-826
Conflicts:
	core/src/main/scala/spark/scheduler/cluster/ClusterTaskSetManager.scala
	core/src/main/scala/spark/scheduler/local/LocalTaskSetManager.scala
	core/src/test/scala/spark/KryoSerializerSuite.scala
2013-07-31 22:09:22 -07:00
Dmitriy Lyubimov 28f1550f01 More elegant rewrite of the same. 2013-07-31 21:41:00 -07:00
Dmitriy Lyubimov 7c52ecc6a4 (1) added reduce test case.
(2) added nested streaming in ParallelCollectionRDD
(3) added kryo with fold test which still doesn't work
2013-07-31 19:27:30 -07:00
Matei Zaharia 3097d75d6f Merge remote-tracking branch 'dlyubimov/SPARK-827'
Conflicts:
	docs/configuration.md
2013-07-31 18:36:43 -07:00
Karen Feng 7c9c5ef6c6 Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update 2013-07-31 16:39:26 -07:00
Karen Feng 02cde8efdf Replaces theme with Bootswatch Spacelab theme 2013-07-31 16:34:07 -07:00
Karen Feng 09cd67bf98 Changed bootstrap colors, fixed logpaging buttons 2013-07-31 16:18:53 -07:00
Matei Zaharia 39c75f3033 Merge pull request #757 from BlackNiuza/result_task_generation
Bug fix: SPARK-837
2013-07-31 15:52:36 -07:00
Matei Zaharia 14bf2fe039 Merge pull request #749 from benh/spark-executor-uri
Added property 'spark.executor.uri' for launching on Mesos.
2013-07-31 14:18:16 -07:00
Benjamin Hindman 4692ea4892 Used 'uri.split('/').last' instead of 'new File(uri).getName()'. 2013-07-31 12:29:44 -07:00
Karen Feng c453967f9a Reduced size of heading 2013-07-31 11:57:50 -07:00
Matei Zaharia a386ced2c6 Merge pull request #754 from rxin/compression
Compression codec change
2013-07-31 11:22:50 -07:00
Karen Feng 49e6344142 Removed master URL from job UI, reduced heading size of basic spark pages 2013-07-31 11:17:59 -07:00
Reynold Xin c61843a69f Changed other LZF uses to use the compression codec interface. 2013-07-31 10:32:13 -07:00
Patrick Wendell 89da9d94b3 Add JSON path to master index page 2013-07-31 09:47:53 -07:00
BlackNiuza 9a815de4bf write and read generation in ResultTask 2013-08-01 00:36:47 +08:00
Roman Tkalenko 0c6553714a Refactored Vector.apply(length, initializer) replacing excessive code with library method
(also removed unused variable ```ans``` as minor change)
2013-07-31 19:05:46 +03:00
Matei Zaharia 12553e5c55 Simplified nonNegativeMod to match previous version 2013-07-31 08:50:28 -07:00
Matei Zaharia d4556f4207 Merge pull request #751 from cdshines/master
Cleaned Partitioner & PythonPartitioner source by taking out non-related logic to Utils
2013-07-31 08:48:14 -07:00
Andrew xia 5670c96f29 Merge branch 'master' into Pool_UI
Conflicts:
	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/scheduler/SparkListener.scala
	core/src/main/scala/spark/scheduler/cluster/ClusterTaskSetManager.scala
	core/src/main/scala/spark/scheduler/cluster/TaskSetManager.scala
	core/src/main/scala/spark/scheduler/local/LocalTaskSetManager.scala
	core/src/main/scala/spark/ui/jobs/IndexPage.scala
	core/src/main/scala/spark/ui/jobs/JobProgressUI.scala
2013-07-31 19:36:36 +08:00
cdshines fefb03cbd7 Eliminated code duplication, refactored to pattern-matching style Partitioner and PythonPartitioner 2013-07-31 13:19:42 +03:00
Dmitriy Lyubimov 96664431cb IDEA flipped JavaSerialized import at some point to a wrong class. 2013-07-30 23:10:09 -07:00
Dmitriy Lyubimov c219fc94fd Minor, style 2013-07-30 22:08:39 -07:00
Dmitriy Lyubimov f4b4b8836e reverting back to one-by-one serialization for parallelize() 2013-07-30 19:00:58 -07:00
jerryshao bf9318091a Add Apache license header to metrics system 2013-07-31 09:42:16 +08:00
Reynold Xin 98024eadc3 Renamed compressionOutputStream and compressionInputStream to compressedOutputStream and compressedInputStream. 2013-07-30 18:28:46 -07:00
Dmitriy Lyubimov abada94ebf removing default constructor (not Externalizable any more) 2013-07-30 18:04:02 -07:00
Dmitriy Lyubimov 943c6590c9 realigning "extends" back manually 2013-07-30 18:01:35 -07:00
Dmitriy Lyubimov ca33b12e98 resetting wrap and continuation indent = 4 2013-07-30 17:51:44 -07:00
Reynold Xin dae12fef9e Updated the configuration option for Snappy block size to be consistent with the documentation. 2013-07-30 17:49:31 -07:00
Dmitriy Lyubimov 984b56155a changing approaches for parallelize(): java serialization needs to avoid writing headers! 2013-07-30 17:36:59 -07:00
Reynold Xin 311aae76a2 Added Snappy dependency to Maven build files. 2013-07-30 17:25:42 -07:00
Reynold Xin 56774b176e Added unit test for compression codecs. 2013-07-30 17:12:33 -07:00
Reynold Xin ad7e9d0d64 CompressionCodec cleanup. Moved it to spark.io package. 2013-07-30 17:11:54 -07:00
Dmitriy Lyubimov ef9529a943 refactoring using writeByteBuffer() from Utils. 2013-07-30 16:24:23 -07:00
Dmitriy Lyubimov 43394b9a6d fixing formatting 2013-07-30 16:13:41 -07:00
Dmitriy Lyubimov 13a9d66645 adding === 2013-07-30 16:10:55 -07:00
Reynold Xin 368c58eac5 Merge branch 'lazy_file_open' of github.com:lyogavin/spark into compression
Conflicts:
	project/SparkBuild.scala
2013-07-30 16:04:18 -07:00
Patrick Wendell e87de037d6 Merge pull request #744 from karenfeng/bootstrap-update
Use Bootstrap progress bars in web UI
2013-07-30 15:00:08 -07:00
Karen Feng 26144c400f Fixed wrap style 2013-07-30 12:40:41 -07:00
Karen Feng 218d7c4ed8 Fixed style, lowered height of progress bars 2013-07-30 12:39:17 -07:00
Karen Feng f1cab31b73 Removed intermediate set for activeTasks, removed progress bar margin 2013-07-30 11:06:47 -07:00
Dmitriy Lyubimov 1bca91633e + bug fixes;
test added

Conflicts:

	core/src/test/scala/spark/KryoSerializerSuite.scala
2013-07-30 11:04:11 -07:00
Benjamin Hindman f6f46455eb Added property 'spark.executor.uri' for launching on Mesos without
requiring Spark to be installed. Using 'make_distribution.sh' a user
can put a Spark distribution at a URI supported by Mesos (e.g.,
'hdfs://...') and then set that when launching their job. Also added
SPARK_EXECUTOR_URI for the REPL.
2013-07-29 23:32:52 -07:00
Josh Rosen 49be084ed3 Use File.pathSeparator instead of hardcoding ':'. 2013-07-29 22:08:57 -07:00
Josh Rosen b95732632b Do not inherit master's PYTHONPATH on workers.
This fixes SPARK-832, an issue where PySpark
would not work when the master and workers used
different SPARK_HOME paths.

This change may potentially break code that relied
on the master's PYTHONPATH being used on workers.
To have custom PYTHONPATH additions used on the
workers, users should set a custom PYTHONPATH in
spark-env.sh rather than setting it in the shell.
2013-07-29 22:08:57 -07:00
Andrew xia 5406013997 refactor codes less than 100 character per line 2013-07-30 11:41:38 +08:00
Andrew xia 614ee16cc4 refactor job ui with pool information 2013-07-30 10:57:26 +08:00
Dmitriy Lyubimov 8e5cd041bb initial externalization of ParallelCollectionRDD's split 2013-07-29 19:02:53 -07:00
Reynold Xin 81720e13fc Moved all StandaloneClusterMessage's into StandaloneClusterMessages object. 2013-07-29 17:53:01 -07:00
Reynold Xin 23b5da14ed Moved block manager messages into BlockManagerMessages object. 2013-07-29 17:42:05 -07:00
Reynold Xin 105f4d22e9 Removed Cache and SoftReferenceCache since they are no longer used. 2013-07-29 17:30:38 -07:00
Reynold Xin 17e62113d4 Moved DeployMessage's into its own DeployMessages object.
Also renamed MasterState to MasterStateResponse and WorkerState to WorkerStateResponse for clarity.
2013-07-29 17:14:44 -07:00
Karen Feng 87b821dc39 Fixed continuity of executorToTasksActive, changed color of progress bars 2013-07-29 16:50:51 -07:00
Karen Feng c7b2788948 Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
Conflicts:
	core/src/main/scala/spark/ui/jobs/IndexPage.scala
2013-07-29 16:36:07 -07:00
Patrick Wendell c99b674405 Merge pull request #735 from karenfeng/ui-807
Totals for shuffle data and CPU time
2013-07-29 16:32:55 -07:00
Karen Feng 2d6da9195a Alphabetized imports 2013-07-29 15:50:52 -07:00
Karen Feng 478a2886d9 Added started tasks to progress bar 2013-07-29 14:51:07 -07:00
Karen Feng e04a37a332 Merge branch 'master' of https://github.com/mesos/spark into bootstrap-update
2013-07-29 14:32:48 -07:00
Reynold Xin fe7298b587 Merge pull request #741 from pwendell/usability
Fix two small usability issues
2013-07-29 14:01:00 -07:00
Karen Feng 43a2cc15c0 Use Bootstrap progress bars in web UI 2013-07-29 13:37:24 -07:00
Matei Zaharia b9d6783f36 Optimize Python take() to not compute entire first partition 2013-07-29 02:51:43 -04:00
Dmitriy Lyubimov f5067abe85 changes per comments. 2013-07-27 23:08:00 -07:00
Karen Feng 077f2dad22 Fixed outdated bugs 2013-07-27 16:39:36 -07:00
Patrick Wendell bcafb36c1e Slight wording change 2013-07-27 16:03:50 -07:00
Patrick Wendell 8177165ac4 Log executor on finish 2013-07-27 16:02:06 -07:00
Patrick Wendell c2223e6801 Improve catch scope and logging for client stop()
This does two things:
1. Catches the more general `TimeoutException`, since those can be thrown.
2. Logs at info level when a timeout is detected.
2013-07-27 16:02:06 -07:00
Karen Feng 5a93e3c58c Cleaned up code based on pwendell's suggestions 2013-07-27 15:55:26 -07:00
Karen Feng dcc4743a95 Moved val now to render 2013-07-27 12:52:53 -07:00
Karen Feng 1714693324 Current time called once with value now 2013-07-27 12:24:41 -07:00
Dmitriy Lyubimov 6a47cee721 style 2013-07-26 22:35:13 -07:00
Dmitriy Lyubimov 0c391feb73 Maximum task failures configurable 2013-07-26 22:34:43 -07:00
Dmitriy Lyubimov 23f3e0f117 mixing in SharedSparkContext for the kryo-collect test 2013-07-26 19:15:11 -07:00
Karen Feng bd4cc52e30 Made metrics Option instead of Some, fixed NullPointerException 2013-07-26 17:23:18 -07:00
Reynold Xin cb366774c8 Merge pull request #738 from harsha2010/pruning
Fix bug in Partition Pruning.
2013-07-26 16:59:30 -07:00
harshars 392d7474fd Code review 2013-07-26 15:23:15 -07:00
harshars 72cf7ec0e5 Indentation 2013-07-26 15:16:41 -07:00
harshars 822aac8f5a Indentation 2013-07-26 15:10:32 -07:00
harshars 743fc4e7aa Fix Bug in Partition Pruning, index of Pruned Partitions should inherit from parent 2013-07-26 14:35:17 -07:00
Karen Feng 3fbe9eaac0 Displays shuffle read/write only if exists, wraps if statements, trims old vals, grabs current time once 2013-07-26 11:51:38 -07:00
Karen Feng 22faeab261 Split Shuffle Activity overview column for read/write 2013-07-25 17:14:18 -07:00
Karen Feng d4bbc8bd25 Shows totals for shuffle data and CPU time in Stage, homepage overviews including active time 2013-07-25 15:59:52 -07:00
Charles Reiss a6de90c927 For standalone mode, get JAVA_HOME, SPARK_JAVA_OPTS, SPARK_LIBRARY_PATH from application env, not worker env 2013-07-25 12:42:30 -07:00
Matei Zaharia 8eb8b52997 Fix Chill version in Maven 2013-07-25 08:58:02 -07:00
Matei Zaharia e2421c1311 Update Chill reference in pom.xml too 2013-07-25 00:05:43 -07:00
ryanlecompte e56aa75de0 fix wrapping 2013-07-24 22:08:09 -07:00
ryanlecompte fc4b025314 add test 2013-07-24 20:53:15 -07:00
ryanlecompte a1c515fb02 add copyright back in 2013-07-24 20:50:32 -07:00
ryanlecompte 8e0939f5a9 refactor Kryo serializer support to use chill/chill-java 2013-07-24 20:43:57 -07:00
Karen Feng 57009eef90 Fixed consistency of "success" status string 2013-07-24 13:43:09 -07:00
Karen Feng 4280e1768d Removed finished status for task info, changed name of success case 2013-07-24 12:48:48 -07:00
Karen Feng bd3931c874 Changed ifs with returns to if/else 2013-07-24 11:27:17 -07:00
Karen Feng 93c6015f82 Shows task status and running tasks on Stage Page: fixes SPARK-804 and 811 2013-07-24 10:53:02 -07:00
jerryshao 31ec72b243 Code refactor according to comments 2013-07-24 14:57:47 +08:00
jerryshao 8d1ef7f2df Code style changes 2013-07-24 14:57:47 +08:00
Andrew xia 05637de842 Change class xxxInstrumentation to class xxxSource 2013-07-24 14:57:47 +08:00
Andrew xia ed1a3bc206 continue to refactor code style and functions 2013-07-24 14:57:47 +08:00
jerryshao 5730193e0c Fix some typos 2013-07-24 14:57:47 +08:00
jerryshao a79f6077f0 Add Maven metrics library dependency and code changes 2013-07-24 14:57:47 +08:00
jerryshao 1daff54b2e Change Executor MetricsSystem initialize code to SparkEnv 2013-07-24 14:57:47 +08:00
Andrew xia 5f8802c1fb Register and init metricsSystem in SparkContext
Conflicts:

	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/SparkEnv.scala
2013-07-24 14:57:47 +08:00
Andrew xia 9cea0c2818 Refactor metricsSystem unit test, add resource files. 2013-07-24 14:57:47 +08:00
Andrew xia 7d2eada451 Add metrics source of DAGScheduler and blockManager
Conflicts:

	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/SparkEnv.scala
2013-07-24 14:57:47 +08:00
jerryshao e9ac88754d Remove twice add Source bug and code clean 2013-07-24 14:57:47 +08:00
jerryshao e080588f73 Add metrics system unit test 2013-07-24 14:57:47 +08:00
jerryshao 5ce5dc9fcd Add default properties to deal with no configure file situation 2013-07-24 14:57:47 +08:00
jerryshao 871bc1687e Add Executor instrumentation 2013-07-24 14:57:46 +08:00
jerryshao 7fb574bf66 Code clean and remarshal 2013-07-24 14:57:46 +08:00
Andrew xia 4d6dd67fa1 refactor metrics system
1.change source abstract class to support MetricRegistry
2.change master/work/jvm source class
2013-07-24 14:57:46 +08:00
jerryshao 03f9871116 MetricsSystem refactor 2013-07-24 14:57:46 +08:00
jerryshao c3daad3f65 Update metric source support for instrumentation 2013-07-24 14:57:46 +08:00
jerryshao 9dec8c73e6 Add Master and Worker instrumentation support 2013-07-24 14:57:46 +08:00
jerryshao 503acd3a37 Build metrics system framework 2013-07-24 14:57:46 +08:00
Matei Zaharia b011329040 Merge pull request #727 from rxin/scheduler
Scheduler code style cleanup.
2013-07-23 22:50:09 -07:00
Matei Zaharia 876125b997 Merge pull request #726 from rxin/spark-826
SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure
2013-07-23 22:28:21 -07:00
Reynold Xin 3dae1df66f Moved non-serializable closure catching exception from submitStage to submitMissingTasks 2013-07-23 20:29:07 -07:00
Reynold Xin d33b8a2a0f Added comments on task closure serialization. 2013-07-23 20:28:39 -07:00
Reynold Xin 85ab8114bc Moved non-serializable closure catching exception from submitStage to submitMissingTasks 2013-07-23 20:25:58 -07:00
Matei Zaharia 6a31b7191d Small bug fix 2013-07-23 16:20:24 -07:00
Matei Zaharia 2f1736c396 Merge pull request #725 from karenfeng/task-start
Creates task start events
2013-07-23 15:53:30 -07:00
Karen Feng abc78cd331 Modifies instead of copies HashSets, fixes comment style 2013-07-23 15:47:16 -07:00
Karen Feng 383684daaa Replaces Seq with HashSet, removes redundant import 2013-07-23 15:33:27 -07:00
Reynold Xin f2422d4f29 SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure. 2013-07-23 15:30:20 -07:00
Reynold Xin 5ed38b4d1d Scheduler code style cleanup. 2013-07-23 15:28:59 -07:00
Reynold Xin 101b8cc78a SPARK-829: scheduler shouldn't hang if a task contains unserializable objects in its closure. 2013-07-23 15:28:20 -07:00
Dmitriy Lyubimov 72bac09c42 Leaking spark context in the test 2013-07-23 15:19:07 -07:00
Karen Feng 9f2dbb2a7c Adds/removes active tasks only once 2013-07-23 15:10:09 -07:00
Dmitriy Lyubimov ef82ff8564 Merge branch 'master' into SPARK-826
Conflicts:
	core/src/main/scala/spark/scheduler/local/LocalScheduler.scala
2013-07-23 13:43:00 -07:00
Karen Feng 0200801a55 Tracks task start events and shows number of active tasks on Executor UI 2013-07-23 13:35:43 -07:00
Dmitriy Lyubimov 310e73d566 style 2013-07-23 13:23:25 -07:00
Matei Zaharia f369e0e51b Merge pull request #720 from ooyala/2013-07/persistent-rdds-api
Add a public method getCachedRdds to SparkContext
2013-07-23 13:22:27 -07:00
Dmitriy Lyubimov ac60d06381 Re-working in terms of changes to TaskSetManager. Verified with Standalone and Local mode. 2013-07-23 13:13:19 -07:00
Evan Chan efd6418c1b Move getPersistentRDDs testing to a new Suite 2013-07-23 10:40:41 -07:00
Evan Chan 4830e22562 Rename method per rxin feedback 2013-07-23 09:50:13 -07:00
Evan Chan 2c2bfbe294 Add toMap method to TimeStampedHashMap and use it 2013-07-23 01:36:44 -07:00
Matei Zaharia 401aac8b18 Merge pull request #719 from karenfeng/ui-808
Creates Executors tab for Jobs UI
2013-07-22 16:57:16 -07:00
Karen Feng 872c97ad82 Split task columns, memory columns sort by numeric value 2013-07-22 16:54:37 -07:00
Matei Zaharia ea1cfabfdd Merge branch 'master' of github.com:mesos/spark 2013-07-22 16:22:02 -07:00
Matei Zaharia 8e38e77232 Fix a test that was using an outdated config setting 2013-07-22 16:05:32 -07:00
Karen Feng 2eea974795 Executors UI now calls executor ID from TaskInfo instead of TaskMetrics 2013-07-22 15:15:54 -07:00
Dmitriy Lyubimov 8ca0c31944 removing non-pertinent comment 2013-07-22 14:48:46 -07:00
Dmitriy Lyubimov b4b230e606 Fixing for LocalScheduler with test, that much works .. 2013-07-22 14:42:47 -07:00
Karen Feng 85c4d7bf3b Shows number of complete/total/failed tasks (bug: failed tasks assigned to null executor) 2013-07-22 14:35:47 -07:00
Josh Rosen f649dabb4a Fix bug: DoubleRDDFunctions.sampleStdev() computed non-sample stdev().
Update JavaDoubleRDD to add new methods and docs.

Fixes SPARK-825.
2013-07-22 13:21:48 -07:00
Karen Feng 8901f379c9 Fixed memory used/remaining/total bug 2013-07-22 09:58:03 -07:00
Karen Feng 636b19f833 Merge branch 'master' of https://github.com/mesos/spark into ui-808 2013-07-22 09:53:26 -07:00
Evan Chan 0337d88321 Add a public method getCachedRdds to SparkContext 2013-07-21 18:26:14 -07:00
Karen Feng 865dc63bac Changed table format for executors 2013-07-19 15:57:01 -07:00
Karen Feng 81bb5dc640 Creates Executors tab for application with RDD block and memory/disk used, solves SPARK-808 2013-07-19 14:08:30 -07:00
Konstantin Boudnik cfce9a6a36 Regression: default webui-port can't be set via command line "--webui-port" anymore 2013-07-19 14:00:58 -07:00
Liang-Chi Hsieh 4530e8a9bf fix typo. 2013-07-20 00:04:25 +08:00
Liang-Chi Hsieh aa6f83289b A better fix for giving local jars under Yarn mode. 2013-07-19 22:25:28 +08:00
Liang-Chi Hsieh a613628c50 Do not copy local jars given to SparkContext in yarn mode since the Context is not running on local. This bug causes failure when jars can not be found. Example codes (such as spark.examples.SparkPi) can not work without this fix under yarn mode. 2013-07-19 16:59:12 +08:00
Matei Zaharia af3c9d5042 Add Apache license headers and LICENSE and NOTICE files 2013-07-16 17:21:33 -07:00
Matei Zaharia b1f9f64743 Merge branch 'master' of github.com:mesos/spark 2013-07-16 11:01:53 -07:00
Matei Zaharia 5c388808a8 SPARK-814: Result stages should be named after action 2013-07-16 11:01:14 -07:00
Matei Zaharia f347cc3f65 Fix deprecation warning and style issues 2013-07-16 10:53:30 -07:00
Reynold Xin 69316603d6 Throw a more meaningful message when runJob is called to launch tasks on non-existent partitions. 2013-07-15 22:50:11 -07:00
Karen Feng 6dc7c9bfb1 Removed job UI column, linked description to job UI 2013-07-15 16:33:50 -07:00
Karen Feng fbf5aa761e Removed log message, added field in master UI to link to log UI 2013-07-15 15:50:03 -07:00
Karen Feng eac381a957 Merge branch 'ui-802' of https://github.com/karenfeng/spark into ui-802 2013-07-15 15:48:44 -07:00
Karen Feng 3955711250 Added field to master UI with link to job UI 2013-07-15 15:47:21 -07:00
Karen Feng 0d78b6d9cd Links to job UI from standalone deploy cluster web UI: fixes SPARK-802 2013-07-15 13:47:38 -07:00
Karen Feng b2aaa1199e Adds app name in HTML page titles on job web UI: fixes SPARK-806 2013-07-15 11:44:42 -07:00
Matei Zaharia d47c16f78d Add an option to disable reference tracking in Kryo 2013-07-15 01:55:54 +00:00
Matei Zaharia c7877d5e16 Merge pull request #689 from BlackNiuza/application_status
Bug fix: SPARK-796
2013-07-14 12:58:13 -07:00
Matei Zaharia 10c05937bd Merge pull request #699 from pwendell/ui-env
Add `Environment` tab to SparkUI.
2013-07-14 11:45:18 -07:00
Patrick Wendell 4883586838 Responding to Matei's review 2013-07-14 10:37:26 -07:00
BlackNiuza 00556a94c9 add spaces before curly braces and after for if conditions 2013-07-14 17:04:53 +08:00
Matei Zaharia b91a218cea Cosmetic fixes to web UI 2013-07-14 07:31:33 +00:00
Matei Zaharia a44a7b1238 Determine Spark core classes better in getCallSite 2013-07-14 07:23:09 +00:00
root e271fde10b Fixed a delay scheduling bug in the YARN branch, found by Patrick 2013-07-14 06:24:29 +00:00
Patrick Wendell ddb97f0fdf Add Environment tab to SparkUI.
This adds a tab which displays system property and classpath information. This
can be useful in debugging various types of issues such as:

1. Extra/incorrect Hadoop jars being included in the classpath
2. Spark launching with a different JRE version than intended
3. Spark system properties not being set to intended values
4. User added jars that conflict with Spark jars
2013-07-13 16:14:40 -07:00
Matei Zaharia 77c69ae5a0 Merge pull request #697 from pwendell/block-locations
Show block locations in Web UI.
2013-07-12 23:05:21 -07:00
Matei Zaharia 5a7835c152 Merge pull request #691 from karenfeng/logpaging
Create log pages
2013-07-12 20:28:21 -07:00
Matei Zaharia 71ccca0cc1 Merge pull request #696 from woggle/executor-env
Pass executor env vars (e.g. SPARK_CLASSPATH) to compute-classpath.sh
2013-07-12 20:25:06 -07:00
Matei Zaharia 90fc3f30cd Merge pull request #692 from Reinvigorate/takeOrdered
adding takeOrdered() to RDD
2013-07-12 20:23:36 -07:00
Patrick Wendell 08150f19ab Minor style fix 2013-07-12 19:32:35 -07:00
Patrick Wendell 6855338e14 Show block locations in Web UI.
This fixes SPARK-769. Support is added for enumerating the locations of blocks
in the UI. There is also some minor cleanup in StorageUtils.
2013-07-12 19:30:32 -07:00
Karen Feng 73984b96a8 Removed unit test of nonexistent function Utils.lastNBytes 2013-07-12 14:26:56 -07:00
Charles Reiss 531a7e5574 Pass executor env vars (e.g. SPARK_CLASSPATH) to compute-classpath. 2013-07-12 12:58:25 -07:00
seanm a1662326e9 comment adjustment to takeOrdered 2013-07-12 08:38:19 -07:00
Andrew xia 2080e25006 Enhance job ui in spark ui system with adding pool information 2013-07-12 14:25:18 +08:00
seanm a2c915fba8 giving order to top and making tests more clear 2013-07-11 18:55:00 -07:00
Karen Feng 5c67ca0278 Remove "Bytes" in lieu of String notation 2013-07-11 17:31:59 -07:00
Karen Feng 6d054487bf Replace default buffer value to 100 GB, changed buttons to use String notation, removed default buffer parameter in UI URLs 2013-07-11 17:12:17 -07:00
Karen Feng a32784109d Fixed links for "Back to Master" 2013-07-11 16:57:55 -07:00
Karen Feng ece2388585 Removed logPageLength from logPage 2013-07-11 16:35:56 -07:00
Karen Feng 9ed036ccdb Replaced logPageLength with byteLength to prevent buffer shrink bug 2013-07-11 16:33:53 -07:00
Karen Feng fdc226a14c Clarified start and end byte variable names 2013-07-11 15:36:43 -07:00
Karen Feng 5d5dbc39f6 getByteRange moved to WorkerWebUI, takes converted parameters, returns only start/end offset 2013-07-11 15:22:45 -07:00
Karen Feng 15fd11d657 Removed redundant calls to request by logPage 2013-07-11 15:01:50 -07:00
Karen Feng 11872888ca Created getByteRange function for logs and log pages, removed lastNBytes function 2013-07-11 14:56:37 -07:00
Matei Zaharia 018d04c64e Merge pull request #684 from woggle/mesos-classloader
Explicitly set class loader for MesosSchedulerDriver callbacks.
2013-07-11 12:48:37 -07:00
Karen Feng e3a3fcf61b Scrollbar on log pages appear automatically 2013-07-11 12:16:38 -07:00
Karen Feng 044d4577ec Fixed capitalization of log page 2013-07-11 12:02:15 -07:00
Karen Feng 0ecc33f0c8 Added byte range, page title with log name, previous/next bytes buttons, initialization to end of log, large default buffer, buggy back to master link 2013-07-11 11:25:58 -07:00
Karen Feng 74bd3fc680 Added byte range on log pages 2013-07-10 15:44:28 -07:00
Karen Feng 24196c91f0 Changed buffer to 10,000 bytes, created scrollbar for fixed-height log 2013-07-10 15:27:52 -07:00
Karen Feng f5f3b272f8 Fixed mixup of start/end, moved more import files 2013-07-10 14:52:29 -07:00
Karen Feng dbe948d9a2 Moved appropriate import files from UISuite to UtilsSuite 2013-07-10 14:15:41 -07:00
Karen Feng 5f8a20b4a8 Moved unit tests for Utils from UISuite to UtilsSuite 2013-07-10 13:53:39 -07:00
Karen Feng 0d4580360b Fixed docstring of offsetBytes to match params and wrapped for 100+ character lines 2013-07-10 13:24:26 -07:00
Karen Feng 04263e4d46 Made some minor style changes 2013-07-10 13:15:42 -07:00
Karen Feng cfb6447ac4 Fixed for nonexistent bytes, added unit tests, changed stdout-page to stdout 2013-07-10 11:47:57 -07:00
seanm ee4ce2fc51 adding takeOrdered to java API 2013-07-10 10:46:04 -07:00
seanm 24705d0f46 adding takeOrdered() to RDD 2013-07-10 10:33:11 -07:00
Karen Feng 620a6974c6 Allows for larger files, refactors lastNBytes, removes old Log column, fixes imports, uses map 2013-07-10 10:20:53 -07:00
BlackNiuza ce18b50d5f set SUCCEEDED for all master in shutdown hook 2013-07-10 19:11:43 +08:00
Karen Feng b6072b58bf Fixes style, makes "std__-page" consistent, reads only parts of files 2013-07-09 17:25:10 -07:00
Karen Feng 13fc6f248c Clean commit of log paging 2013-07-09 14:17:15 -07:00
BlackNiuza aaa7b081df according to mridulm's comments to adjust the code 2013-07-09 20:03:01 +08:00
Charles Reiss e47253e0cc Reset ClassLoader in MesosSchedulerBackend, too. (per review comments).
Also set ClassLoader for all mesos callbacks, not just statusUpdate,
registered.
2013-07-09 01:23:23 -07:00
BlackNiuza c1d44be805 Bug fix: SPARK-796 2013-07-09 15:18:28 +08:00
Matei Zaharia 7dcda9ae74 Merge pull request #688 from markhamstra/scalaDependencies
Fixed SPARK-795 with explicit dependencies
2013-07-08 23:24:23 -07:00
Mark Hamstra 0b39d66f3f pom cleanup 2013-07-08 16:07:09 -07:00
Mark Hamstra afdaf430bd Explicit dependencies for scala-library and scalap to prevent 2.9.2 vs. 2.9.3 problems 2013-07-08 15:40:50 -07:00
Charles Reiss 8c1d1c98e0 Explicitly set class loader for MesosSchedulerDriver callbacks. 2013-07-08 12:25:46 -07:00
Shivaram Venkataraman 4af0d63cb1 Remove akka LogLevel fix as we no longer use spray 2013-07-07 10:42:43 -07:00
Shivaram Venkataraman d362d0f411 Ignore stderr when calling cat on a non-existing file 2013-07-07 04:09:46 -07:00
Shivaram Venkataraman 7d6d9e6ab2 Set DriverSuite log level to WARN 2013-07-07 04:09:15 -07:00
Shivaram Venkataraman a948f06725 Suppress log messages in sbt test with two changes:
1. Set akka log level to ERROR before shutting down the actorSystem.
This avoids akka log messages (like Spray) from falling back to INFO
on the Stdout logger
2. Initialize netty to use SLF4J in LocalSparkContext. This ensures that
stack trace thrown during shutdown is handled by SLF4J instead of stdout
2013-07-07 04:09:08 -07:00
Patrick Wendell 32b9d21a97 Fix occasional failure in UI listener.
If a task fails before the metrics are initialized, it remains possible
that the metrics field will be `None`. This patch accounts for that possibility
by keeping metrics as an `Option` at all times.
2013-07-06 16:40:02 -07:00
Matei Zaharia 1ffadb2d9e Merge remote-tracking branch 'pwendell/ui-updates'
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	pom.xml
2013-07-06 15:51:41 -07:00
Matei Zaharia 94871e4703 Merge pull request #655 from tgravescs/master
Add support for running Spark on Yarn on a secure Hadoop Cluster
2013-07-06 15:26:19 -07:00
Matei Zaharia 3f918b33f8 Merge pull request #672 from holdenk/master
s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning
2013-07-06 12:45:18 -07:00
Matei Zaharia 2a36e5449b Merge pull request #673 from xiajunluan/master
Add config template file for fair scheduler feature
2013-07-06 12:43:21 -07:00
Matei Zaharia 7ba7fa110b Merge pull request #674 from liancheng/master
Bug fix: SPARK-789
2013-07-06 11:45:08 -07:00
BlackNiuza 44a2440039 Remove active job from idToActiveJob when job finished or aborted 2013-07-07 01:33:09 +08:00
Patrick Wendell 37abe84212 Tracking some task metrics even during failures. 2013-07-06 09:19:59 -07:00
Patrick Wendell 84b7fc54e6 Enforcing correct sort order for formatted strings 2013-07-05 17:21:08 -07:00
Matei Zaharia 399bd65ef5 Fixed compile error due to merge 2013-07-05 11:27:06 -07:00
Matei Zaharia 652ea0f1d8 Allow RDD.takeSample to give samples bigger than the RDD
Before, when withReplacement was set to true, we would not get a sample
bigger than the RDD's count().

Conflicts:
	core/src/main/scala/spark/RDD.scala
	core/src/test/scala/spark/RDDSuite.scala
2013-07-05 11:15:13 -07:00
Matei Zaharia 6586c5e28b Added a SparkContext accessor to RDD 2013-07-05 11:13:46 -07:00
jerryshao e4ff544a8d Clean StageToInfos periodically when spark.cleaner.ttl is enabled 2013-07-05 10:34:45 +08:00
Lian Cheng c0c3155c3c Bug fix: SPARK-789
https://spark-project.atlassian.net/browse/SPARK-789
2013-07-05 00:54:10 +08:00
Andrew xia 6ccfb73ca9 Add fair scheduler config template file 2013-07-04 19:19:44 +08:00
Holden Karau 0f06d6217d s/ActorSystemImpl/ExtendedActorSystem/ as ActorSystemImpl results in a warning 2013-07-04 01:05:39 -07:00
Gavin Li 94238aae57 fix dependencies 2013-07-03 18:08:38 +00:00
Gavin Li 96130c30d9 add compression codec trait and snappy compression 2013-07-03 05:49:04 +00:00
Y.CORP.YAHOO.COM\tgraves 923cf92900 Rework from pull request. Removed --user option from Spark on Yarn Client, made the use of JAVA_HOME environment
variable conditional on if it's set, and created addCredentials in each of the SparkHadoopUtil classes
to only add the credentials when the profile is hadoop2-yarn.
2013-07-02 21:18:59 -05:00
Patrick Wendell 39e2325675 Removing dead code 2013-07-02 16:28:40 -07:00
Patrick Wendell 8ca1cc1786 Adding truncation for log files 2013-07-02 16:10:50 -07:00
Patrick Wendell 9a42d04efa Throw exception for missing resource 2013-07-01 14:43:13 -07:00
Patrick Wendell 1025d7d1ef Package refactoring 2013-07-01 14:40:53 -07:00
Patrick Wendell 30b9034241 Fixing bug where logs aren't shown 2013-07-01 13:48:01 -07:00
Patrick Wendell 8688689387 Various formatting changes 2013-07-01 13:40:12 -07:00
Patrick Wendell 735c951a09 Adding test script 2013-07-01 09:33:22 -07:00
Patrick Wendell 5de326db7d Print exception message 2013-07-01 09:19:45 -07:00
root ec31e68d5d Fixed PySpark perf regression by not using socket.makefile(), and improved
debuggability by letting "print" statements show up in the executor's stderr

Conflicts:
	core/src/main/scala/spark/api/python/PythonRDD.scala
2013-07-01 06:26:31 +00:00
root 3296d132b6 Fix performance bug with new Python code not using buffered streams 2013-07-01 06:25:43 +00:00
Matei Zaharia 03d0b858c8 Made use of spark.executor.memory setting consistent and documented it
Conflicts:

	core/src/main/scala/spark/SparkContext.scala
2013-06-30 15:46:46 -07:00
Patrick Wendell e721ff7e5a Allowing details for failed stages 2013-06-29 11:26:30 -07:00