ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Matei Zaharia	e89ffc7b3c	Merge pull request #839 from jegonzal/zip_partitions Currying RDD.zipPartitions	2013-08-16 14:02:34 -07:00
Joseph E. Gonzalez	53b2639a1e	Reversing the argument order in zipPartitions to enable stronger type inference.	2013-08-16 12:38:59 -07:00
Andre Schumacher	c7e348faec	Implementing SPARK-878 for PySpark: adding zip and egg files to context and passing it down to workers which add these to their sys.path	2013-08-16 11:58:20 -07:00
Reynold Xin	c961c19b7b	Use the JSON formatter from Scala library and removed dependency on lift-json. It made the JSON creation slightly more complicated, but reduces one external dependency. The Scala library also properly escapes "/" (which lift-json doesn't).	2013-08-15 18:23:01 -07:00
Reynold Xin	eddbf43b54	Revert "Merge pull request #834 from Daemoen/master" This reverts commit `230ab2722e`, reversing changes made to `659553b21d`.	2013-08-15 17:49:37 -07:00
Reynold Xin	230ab2722e	Merge pull request #834 from Daemoen/master Updated json output to allow for display of worker state	2013-08-15 17:45:17 -07:00
Patrick Wendell	659553b21d	Merge pull request #836 from pwendell/rename Rename `memoryBytesToString` and `memoryMegabytesToString`	2013-08-15 16:56:31 -07:00
Jey Kottalam	a06a9d5c5f	Rename HadoopWriter to SparkHadoopWriter since it's outside of our package	2013-08-15 16:50:37 -07:00
Jey Kottalam	8f979edef5	Fix newTaskAttemptID to work under YARN	2013-08-15 16:50:37 -07:00
Jey Kottalam	e2d7656ca3	re-enable YARN support	2013-08-15 16:50:37 -07:00
Jey Kottalam	bd0bab47c9	SparkEnv isn't available this early, and not needed anyway	2013-08-15 16:50:37 -07:00
Jey Kottalam	4f43fd791a	make SparkHadoopUtil a member of SparkEnv	2013-08-15 16:50:37 -07:00
Jey Kottalam	43ebcb8484	rename HadoopMapRedUtil => SparkHadoopMapRedUtil, HadoopMapReduceUtil => SparkHadoopMapReduceUtil	2013-08-15 16:50:37 -07:00
Jey Kottalam	8b1c1520fc	add comment	2013-08-15 16:50:37 -07:00
Jey Kottalam	69c3bbf688	dynamically detect hadoop version	2013-08-15 16:50:37 -07:00
Jey Kottalam	f67b94ad4f	remove core/src/hadoop{1,2} dirs	2013-08-15 16:50:36 -07:00
Jey Kottalam	b877e20a33	move yarn to its own directory	2013-08-15 16:50:36 -07:00
Patrick Wendell	4c6ade1ad5	Rename `memoryBytesToString` and `memoryMegabytesToString` These are used all over the place now and they are not specific to memory at all. memoryBytesToString --> bytesToString memoryMegabytesToString --> megabytesToString	2013-08-15 15:58:07 -07:00
Reynold Xin	1a51deae8a	More minor UI changes including code review feedback.	2013-08-15 14:34:07 -07:00
Daemoen	ad2e8b5126	Updated json output to allow for display of worker state Ops teams need to ensure that the cluster is functional and performant. Having to scrape the html source for worker state won't work reliably, and will be slow. By exposing the state in the json output, ops teams are able to ensure a fully functional environment by querying for the json output and parsing for dead nodes.	2013-08-15 12:19:14 -07:00
Reynold Xin	2d2a556bdf	Various UI improvements.	2013-08-14 23:23:09 -07:00
Reynold Xin	290e3e6e65	Renamed setCurrentJobDescription to setJobDescription.	2013-08-14 18:40:53 -07:00
Reynold Xin	3886b54933	A few small scheduler / job description changes. 1. Renamed SparkContext.addLocalProperty to setLocalProperty. And allow this function to unset a property. 2. Renamed SparkContext.setDescription to setCurrentJobDescription. 3. Throw an exception if the fair scheduler allocation file is invalid.	2013-08-14 17:19:42 -07:00
Matei Zaharia	839f2d4f3f	Merge pull request #822 from pwendell/ui-features Adding GC Stats to TaskMetrics (and three small fixes)	2013-08-14 16:17:23 -07:00
Patrick Wendell	04ad78b09d	Style cleanup based on Matei feedback	2013-08-14 14:57:21 -07:00
Kay Ousterhout	a88aa5e6ed	Fixed 2 bugs in executor UI. 1) UI crashed if the executor UI was loaded before any tasks started. 2) The total task count was incorrectly reported due to using string (rather than int) arithmetic.	2013-08-13 23:44:58 -07:00
Patrick Wendell	c223176388	Small style clean-up	2013-08-13 16:56:37 -07:00
Patrick Wendell	fab5cee111	Correcting terminology in RDD page	2013-08-13 16:25:55 -07:00
Patrick Wendell	024e5c5ce1	Correct sorting order for stages	2013-08-13 16:25:55 -07:00
Patrick Wendell	4e9f0c2df6	Capturing GC details in TaskMetrics	2013-08-13 16:25:55 -07:00
Patrick Wendell	f0382007dc	Bug fix for display of shuffle read/write metrics. This fixes an error where empty cells are missing if a given task has no shuffle read/write.	2013-08-13 16:25:55 -07:00
Matei Zaharia	d316af9c84	Merge pull request #821 from pwendell/print-launch-command Print run command to stderr rather than stdout	2013-08-13 15:31:01 -07:00
Patrick Wendell	a7feb69ae8	Print run command to stderr rather than stdout	2013-08-13 15:07:03 -07:00
Kay Ousterhout	1beb843a6f	Reuse the set of failed states rather than creating a new object each time	2013-08-13 14:27:40 -07:00
Kay Ousterhout	c92dd627ca	Properly account for killed tasks. The TaskState class's isFinished() method didn't return true for KILLED tasks, which means some resources are never reclaimed for tasks that are killed. This also made it inconsistent with the isFinished() method used by CoarseMesosSchedulerBackend.	2013-08-13 12:40:15 -07:00
Patrick Wendell	ed6a1646e6	Slight change to pr-784	2013-08-13 09:29:40 -07:00
Patrick Wendell	a0133bfbad	Merge pull request #784 from jerryshao/dev-metrics-servlet Add MetricsServlet for Spark metrics system	2013-08-13 09:28:18 -07:00
Matei Zaharia	65d0d91fba	Merge pull request #807 from JoshRosen/guava-optional Change scala.Option to Guava Optional in Java APIs	2013-08-12 19:00:57 -07:00
Josh Rosen	cf08bb7a3e	Fix import organization.	2013-08-12 18:55:02 -07:00
jerryshao	09c7179e81	MetricsServlet code refactor according to comments	2013-08-12 13:23:23 +08:00
jerryshao	320e87e7ab	Add MetricsServlet for Spark metrics system	2013-08-12 13:23:23 +08:00
Reynold Xin	e5b9ed2833	Merge pull request #808 from pwendell/ui_compressed_bytes Report compressed bytes read when calculating TaskMetrics	2013-08-11 17:22:47 -07:00
Patrick Wendell	3d8f281604	Report compressed bytes read when calculating TaskMetrics	2013-08-11 16:25:57 -07:00
Matei Zaharia	379648630b	Merge pull request #805 from woggle/hadoop-rdd-jobconf Use new Configuration() instead of slower new JobConf() in SerializableWritable	2013-08-11 14:51:47 -07:00
Josh Rosen	d7f78b443b	Change scala.Option to Guava Optional in Java APIs.	2013-08-11 12:05:09 -07:00
Charles Reiss	6402b539d0	Use new Configuration() instead of new JobConf() for ObjectWritable. JobConf's constructor loads default config files in some versions of Hadoop, which is quite slow, and we only need the Configuration object to pass the correct ClassLoader.	2013-08-10 21:31:05 -07:00
Matei Zaharia	71c63de22f	Merge pull request #795 from mridulm/master Fix bug reported in PR 791 : a race condition in ConnectionManager and Connection	2013-08-10 10:21:20 -07:00
Matei Zaharia	d3277a0daf	Merge remote-tracking branch 'origin/pr/792' Conflicts: core/src/main/scala/spark/ui/jobs/IndexPage.scala core/src/main/scala/spark/ui/jobs/StagePage.scala	2013-08-10 10:18:50 -07:00
Patrick Wendell	d17eeb997d	Merge pull request #785 from anfeng/master expose HDFS file system stats via Executor metrics	2013-08-10 09:02:27 -07:00
Kay Ousterhout	14d14f451a	Shortened names, as per Matei's suggestion	2013-08-10 07:50:27 -07:00
Matei Zaharia	cd247ba5bb	Merge pull request #786 from shivaram/mllib-java Java fixes, tests and examples for ALS, KMeans	2013-08-09 20:41:13 -07:00
Kay Ousterhout	7810a76512	Only print event queue full error message once	2013-08-09 18:20:48 -07:00
Kay Ousterhout	44ca8629d8	Style fix: removing unnecessary return type	2013-08-09 17:22:50 -07:00
Kay Ousterhout	29b79714f9	Style fixes based on code review	2013-08-09 16:46:34 -07:00
Kay Ousterhout	81e1d4a7d1	Refactored SparkListener to process all events asynchronously. This commit fixes issues where SparkListeners that take a while to process events slow the DAGScheduler. This commit also fixes a bug in the UI where if a user goes to a web page of a stage that does not exist, they can create a memory leak (granted, this is not an issue at small scale -- probably only an issue if someone actively tried to DOS the UI).	2013-08-09 13:27:41 -07:00
Matei Zaharia	b09d4b79e8	Merge pull request #799 from woggle/sync-fix Remove extra synchronization in ResultTask	2013-08-09 13:17:08 -07:00
Patrick Wendell	cc6b92e80e	Merge pull request #775 from pwendell/print-launch-command Log the launch command for Spark daemons	2013-08-09 13:00:33 -07:00
Patrick Wendell	3970b580c2	Using quotes when printing out command	2013-08-09 11:53:32 -07:00
Charles Reiss	9dfc280f74	Remove extra synchronization in ResultTask	2013-08-09 11:09:02 -07:00
Matei Zaharia	f94fc75c3f	Merge pull request #788 from shane-huang/sparkjavaopts For standalone mode, add worker local env setting of SPARK_JAVA_OPTS as ...	2013-08-09 10:04:03 -07:00
Matei Zaharia	d1e1c1b24d	Add test for Kryo with WrappedArray (which was failing in Chill 0.3.0)	2013-08-08 13:34:11 -07:00
Mridul Muralidharan	c230ca3b4e	Change line size	2013-08-08 22:28:40 +05:30
Mridul Muralidharan	dc47084f4e	Attempt to fix bug reported in PR 791 : a race condition in ConnectionManager and Connection	2013-08-08 22:19:27 +05:30
Kay Ousterhout	88049a214d	Fixed 3 bugs that caused UI to crash (including SPARK-810). One bug caused the UI to crash if you try to look at a job's status before any of the tasks have finished. The second bug was a concurrency issue where two different threads (the scheduling thread and a UI thread) could be reading/updating the data structures in JobProgressListener concurrently. The third bug mis-used an Option, also causing the UI to crash under certain conditions.	2013-08-07 23:09:25 -07:00
Patrick Wendell	b4321edf68	Reverting bootstrap change	2013-08-07 22:18:18 -07:00
Patrick Wendell	21392f2a73	Change I forgot to merge in	2013-08-07 21:45:32 -07:00
Patrick Wendell	706394b370	Bumping font size to 14px and fixing style issue in progress bars	2013-08-07 21:27:04 -07:00
Patrick Wendell	8c0d668468	Merge branch 'master' into bootstrap-design Conflicts: core/src/main/scala/spark/ui/UIUtils.scala core/src/main/scala/spark/ui/jobs/IndexPage.scala core/src/main/scala/spark/ui/storage/RDDPage.scala	2013-08-07 21:06:03 -07:00
Kay Ousterhout	b88e26248e	Fixed issue in UI that limited scheduler throughput. Removal of items from ArrayBuffers in the UI code was slow and significantly impacted scheduler throughput. This commit improves scheduler throughput by 5x.	2013-08-07 14:42:05 -07:00
shane-huang	cbc5107e36	For standalone mode, add worker local env setting of SPARK_JAVA_OPTS as default and let application env override default options if applicable Signed-off-by: shane-huang <shengsheng.huang@intel.com>	2013-08-07 14:36:48 +08:00
Matei Zaharia	6b043a6f11	Merge pull request #724 from dlyubimov/SPARK-826 SPARK-826: fold(), reduce(), collect() always attempt to use java serialization	2013-08-06 22:31:02 -07:00
Matei Zaharia	7c4b7a53b1	Merge remote-tracking branch 'origin/pr/781' Conflicts: core/src/main/resources/spark/ui/static/webui.css	2013-08-06 17:19:49 -07:00
Karen Feng	908032e79b	Used saturated colors for progress bars	2013-08-06 16:52:21 -07:00
Karen Feng	8bc497fa10	Lightened color of progress bars	2013-08-06 16:33:05 -07:00
Karen Feng	ca1903ea63	Overlays progress text on top of bar	2013-08-06 15:45:42 -07:00
Matei Zaharia	df4d10d630	Merge pull request #779 from adatao/adatao-global-SparkEnv [HOTFIX] Extend thread safety for SparkEnv.get()	2013-08-06 15:44:05 -07:00
Shivaram Venkataraman	471fbadd0c	Java examples, tests for KMeans and ALS - Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it easier to call from Java - Renames class methods from `train` to `run` to enable static methods to be called from Java. - Add unit tests which check if both static / class methods can be called. - Also add examples which port the main() function in ALS, KMeans to the examples project. Couple of minor changes to existing code: - Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily - Workaround a bug where using double[] from Java leads to class cast exception in KMeans init	2013-08-06 15:43:46 -07:00
anfeng	dda2ac8b5d	reformat registerFileSystemStat()	2013-08-06 15:22:25 -07:00
Karen Feng	099528b6c4	Pre-sorts stage/env tables, changes text/link of stage summaries	2013-08-06 14:52:12 -07:00
Karen Feng	254a930730	Reverse sorts StageTable by submitted time	2013-08-06 14:18:38 -07:00
Karen Feng	5ed5b73026	Sorts first column of env tables	2013-08-06 13:59:53 -07:00
anfeng	0748c60817	expose HDFS file system stats via Executor metrics	2013-08-06 11:47:06 -07:00
Reynold Xin	d031f73679	Merge pull request #782 from WANdisco/master SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD	2013-08-05 22:33:00 -07:00
Matei Zaharia	1b63dea816	Merge pull request #769 from markhamstra/NegativeCores SPARK-847 + SPARK-845: Zombie workers and negative cores	2013-08-05 22:21:26 -07:00
Alexander Pivovarov	a30866438b	SHARK-94 Log the files computed by HadoopRDD and NewHadoopRDD	2013-08-05 21:48:43 -07:00
Matei Zaharia	8b277892c9	Merge pull request #774 from pwendell/job-description Show user-defined job name in UI	2013-08-05 19:14:52 -07:00
Christopher Nguyen	b1bbbe699c	[HOTFIX] Mark lastSetSparkEnv @volatile in case it gets HotSpot-cached On branch adatao-global-SparkEnv Changes to be committed: modified: core/src/main/scala/spark/SparkEnv.scala	2013-08-05 17:22:27 -07:00
Mark Hamstra	35d8f5ee52	Moved handling of timed out workers within the Master actor	2013-08-05 13:13:56 -07:00
Mark Hamstra	37ccf9301a	milliseconds -> seconds in timeOutDeadWorkers logging	2013-08-05 13:13:56 -07:00
Mark Hamstra	cdd1af562e	Timeout zombie workers	2013-08-05 13:13:56 -07:00
Mikhail Bautin	e8bec8365f	Only reduce the number of cores once when removing an executor	2013-08-05 13:13:56 -07:00
Karen Feng	95025afdec	Made most small fixes for SPARK-849 except for table sort, task progress overlay	2013-08-05 13:04:56 -07:00
Bill Zhao	87134b3648	SPARK-850: give better console message	2013-08-05 11:55:35 -07:00
Christopher Nguyen	39e4fda76f	[HOTFIX] Extend thread safety for SparkEnv.get() A ThreadLocal SparkEnv.env is facing various situations leading to NullPointerExceptions, where SparkEnv.env set in one thread is not gettable in another thread, but often assumed to be available. See, e.g., https://groups.google.com/forum/#!topic/spark-developers/GLx8yunSj0A This hotfixes SparkEnv.env to return either (a) the ThreadLocal value if non-null, or (b) the previously set value in any thread. This approach preserves SparkEnv.set() thread safety needed by RDD.compute() and possibly other places. A refactoring that parameterizes SparkEnv should be addressed subsequently. On branch adatao-global-SparkEnv Changes to be committed: modified: core/src/main/scala/spark/SparkEnv.scala	2013-08-05 02:09:54 -07:00
Patrick Wendell	f3660d5ab8	Make output formatting consistent between bash/scala	2013-08-03 21:30:15 -07:00
Patrick Wendell	ad94fbb322	Log the launch command for Spark executors	2013-08-03 09:19:46 -07:00
Matei Zaharia	22abbc10d6	Merge pull request #772 from karenfeng/ui-843 Show app duration	2013-08-02 16:37:59 -07:00
Patrick Wendell	5b3784a79c	Show user-defined job name in UI	2013-08-02 15:47:41 -07:00
Karen Feng	b3ae5b25d5	Shows time the app has been running	2013-08-02 13:25:14 -07:00
Patrick Wendell	9d7dfd2d5a	Merge pull request #743 from pwendell/app-metrics Add application metrics to standalone master	2013-08-01 17:41:58 -07:00

1947 commits