ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Imran Rashid	d9461b15d3	cleanup a bunch of imports	2013-02-10 21:41:40 -08:00
Imran Rashid	383af599bb	SparkContext.addSparkListener; "std" listener in StatsReportListener	2013-02-10 14:19:37 -08:00
Imran Rashid	b7d9e24394	use TaskMetrics to gather all stats; lots of plumbing to get it all the way back to driver	2013-02-10 14:18:52 -08:00
Imran Rashid	04e828f7c1	general fixes to Distribution, plus some tests	2013-02-08 19:07:36 -08:00
Imran Rashid	379564c7e0	setup plumbing to get task metrics; lots of unfinished parts, but basic flow in place	2013-02-05 18:30:21 -08:00
Imran Rashid	1704b124d8	add as many fetch requests as we can, subject to maxBytesInFlight	2013-02-05 14:33:52 -08:00
Imran Rashid	696e4b2167	track remoteFetchTime	2013-02-05 14:29:16 -08:00
Imran Rashid	b29f9cc978	BlockManager.getMultiple returns a custom iterator, to enable tracking of shuffle performance	2013-02-05 14:00:44 -08:00
Imran Rashid	e319ac74c1	cogrouped RDD stores the amount of time taken to read shuffle data in each task	2013-02-05 10:18:16 -08:00
Imran Rashid	295b534398	task context keeps a handle on Task -- giant hack, temporary for tracking shuffle times & amount	2013-02-05 10:18:16 -08:00
Imran Rashid	9df7e2ae55	Shuffle Fetchers use a timed iterator	2013-02-05 10:18:16 -08:00
Imran Rashid	1ad77c4766	add TimedIterator	2013-02-05 10:18:15 -08:00
Imran Rashid	843084d69d	track total bytes written by ShuffleMapTasks	2013-02-05 10:18:15 -08:00
Imran Rashid	b430d2359d	Merge branch 'master' into stageInfo Conflicts: core/src/main/scala/spark/scheduler/DAGScheduler.scala core/src/main/scala/spark/scheduler/local/LocalScheduler.scala	2013-02-04 21:40:44 -08:00
Matei Zaharia	f7b4e428be	Merge pull request #445 from JoshRosen/pyspark_fixes Fix exit status in PySpark unit tests; fix/optimize PySpark's RDD.take()	2013-02-03 21:36:36 -08:00
Matei Zaharia	3bfaf3ab1d	Merge pull request #379 from stephenh/sparkmem Add spark.executor.memory to differentiate executor memory from spark-shell	2013-02-02 23:58:23 -08:00
Matei Zaharia	88ee6163a1	Merge pull request #422 from squito/blockmanager_info RDDInfo available from SparkContext	2013-02-02 23:44:13 -08:00
Matei Zaharia	cd4ca93679	Merge pull request #436 from stephenh/removeextraloop Once we find a split with no block, we don't have to look for more.	2013-02-02 23:39:28 -08:00
Matei Zaharia	d5daaab381	Merge pull request #442 from stephenh/fixsystemnames Fix createActorSystem not actually using the systemName parameter.	2013-02-02 23:38:46 -08:00
Matei Zaharia	9163c3705d	Formatting	2013-02-02 23:34:47 -08:00
Josh Rosen	8fbd5380b7	Fetch fewer objects in PySpark's take() method.	2013-02-03 06:44:49 +00:00
Matei Zaharia	34a7bcdb3a	Formatting	2013-02-02 19:40:30 -08:00
Stephen Haberman	7aba123f0c	Further simplify checking for Nil.	2013-02-02 13:53:28 -06:00
Charles Reiss	6107957962	Merge remote-tracking branch 'base/master' into dag-sched-tests Conflicts: core/src/main/scala/spark/scheduler/DAGScheduler.scala	2013-02-02 00:33:30 -08:00
Stephen Haberman	cae8a6795c	Fix dangling old variable names.	2013-02-02 02:15:39 -06:00
Stephen Haberman	696eec32c9	Move executorMemory up into SchedulerBackend.	2013-02-02 02:03:26 -06:00
Stephen Haberman	103c375ba0	Merge branch 'master' into sparkmem	2013-02-02 01:57:18 -06:00
Stephen Haberman	28e0cb9f31	Fix createActorSystem not actually using the systemName parameter. This meant all system names were "spark", which worked, but didn't lead to the most intuitive log output. This fixes createActorSystem to use the passed system name, and refactors Master/Worker to encapsulate their system/actor names instead of having the clients guess at them. Note that the driver system name, "spark", is left as is, and is still repeated a few times, but that seems like a separate issue.	2013-02-02 01:11:37 -06:00
Stephen Haberman	12c1eb4756	Reduce the amount of duplicate logging Akka does to stdout. Given we have Akka logging go through SLF4j to log4j, we don't need all the extra noise of Akka's stdout logger that is supposedly only used during Akka init time but seems to continue logging lots of noisy network events that we either don't care about or are in the log4j logs anyway. See: http://doc.akka.io/docs/akka/2.0/general/configuration.html # Log level for the very basic logger activated during AkkaApplication startup # Options: ERROR, WARNING, INFO, DEBUG # stdout-loglevel = "WARNING"	2013-02-01 21:21:44 -06:00
Matei Zaharia	8b3041c723	Reduced the memory usage of reduce and similar operations These operations used to wait for all the results to be available in an array on the driver program before merging them. They now merge values incrementally as they arrive.	2013-02-01 15:38:42 -08:00
Matei Zaharia	4529876db0	Merge branch 'master' of github.com:mesos/spark	2013-02-01 14:07:38 -08:00
Matei Zaharia	9970926ede	formatting	2013-02-01 14:07:34 -08:00
Matei Zaharia	79c24abe4c	Merge pull request #432 from stephenh/moreprivacy Add more private declarations.	2013-02-01 14:06:55 -08:00
Matei Zaharia	de340ddf0b	Merge pull request #437 from stephenh/cancelmetacleaner Stop BlockManagers metadataCleaner.	2013-02-01 12:59:25 -08:00
Imran Rashid	c6190067ae	remove unneeded (and unused) filter on block info	2013-02-01 09:55:25 -08:00
Stephen Haberman	59c57e48df	Stop BlockManagers metadataCleaner.	2013-02-01 10:34:02 -06:00
Matei Zaharia	571af31304	Merge pull request #433 from rxin/master Changed PartitionPruningRDD's split to make sure it returns the correct split index.	2013-02-01 00:32:41 -08:00
Imran Rashid	8a0a5ed533	track total partitions, in addition to cached partitions; use scala string formatting	2013-02-01 00:23:38 -08:00
Imran Rashid	f127f2ae76	fixup merge (master -> driver renaming)	2013-02-01 00:20:49 -08:00
Reynold Xin	f9af9cee6f	Moved PruneDependency into PartitionPruningRDD.scala.	2013-02-01 00:02:46 -08:00
Patrick Wendell	39ab83e957	Small fix from last commit	2013-01-31 21:52:52 -08:00
Patrick Wendell	c33f0ef41a	Some style cleanup	2013-01-31 21:50:02 -08:00
Patrick Wendell	3446d5c8d6	SPARK-673: Capture and re-throw Python exceptions This patch alters the Python <-> executor protocol to pass on exception data when they occur in user Python code.	2013-01-31 18:06:11 -08:00
Reynold Xin	6289d9654e	Removed the TODO comment from PartitionPruningRDD.	2013-01-31 17:49:36 -08:00
Reynold Xin	5b0fc265c2	Changed PartitionPruningRDD's split to make sure it returns the correct split index.	2013-01-31 17:48:39 -08:00
Stephen Haberman	782187c210	Once we find a split with no block, we don't have to look for more.	2013-01-31 18:27:25 -06:00
Stephen Haberman	418e36caa8	Add more private declarations.	2013-01-31 17:18:33 -06:00
Imran Rashid	02a6761589	Merge branch 'master' into blockmanager_info Conflicts: core/src/main/scala/spark/storage/BlockManagerMaster.scala	2013-01-30 18:52:35 -08:00
Imran Rashid	c1df24d085	rename Slaves --> Executor	2013-01-30 18:51:14 -08:00
Matei Zaharia	d12330bd2c	Merge pull request #426 from woggling/conn-manager-ips Remember ConnectionManagerId used to initiate SendingConnections	2013-01-30 15:02:53 -08:00

971 commits