ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Patrick Wendell	50e3b8ec4c	Merge pull request #308 from kayousterhout/stage_naming Changed naming of StageCompleted event to be consistent The rest of the SparkListener events are named with "SparkListener" as the prefix of the name; this commit renames the StageCompleted event to SparkListenerStageCompleted for consistency.	2013-12-30 07:44:26 -08:00
Kay Ousterhout	c2c1af39f5	Updated code style according to Patrick's comments	2013-12-29 21:10:08 -08:00
Reynold Xin	72a17b69f5	Revert "Merge pull request #310 from jyunfan/master" This reverts commit `79b20e4dbe`, reversing changes made to `7375047d51`.	2013-12-28 21:25:40 -10:00
Reynold Xin	79b20e4dbe	Merge pull request #310 from jyunfan/master Fix typo in the Accumulators section Change 'val' to 'var'	2013-12-28 21:13:36 -10:00
Jyun-Fan Tsai	17f6620a71	Fix typo in the Accumulators section val => var	2013-12-29 11:30:02 +08:00
Patrick Wendell	7375047d51	Merge pull request #304 from kayousterhout/remove_unused Removed unused failed and causeOfFailure variables (in TaskSetManager)	2013-12-28 13:25:06 -08:00
Matei Zaharia	ad3dfd1531	Merge pull request #307 from kayousterhout/other_failure Removed unused OtherFailure TaskEndReason. The OtherFailure TaskEndReason was added by @mateiz 3 years ago in this commit: `24a1e7f838` Unless I am missing something, it doesn't seem to have been used then, and is not used now, so seems safe for deletion.	2013-12-27 22:10:14 -05:00
Matei Zaharia	b579b83277	Merge pull request #306 from kayousterhout/remove_pending Remove unused hasPendingTasks methods	2013-12-27 22:09:04 -05:00
Kay Ousterhout	b4619e509b	Changed naming of StageCompleted event to be consistent The rest of the SparkListener events are named with "SparkListener" as the prefix of the name; this commit renames the StageCompleted event to SparkListenerStageCompleted for consistency.	2013-12-27 17:45:20 -08:00
Kay Ousterhout	e17d7518ab	Removed unused OtherFailure TaskEndReason.	2013-12-27 15:51:27 -08:00
Kay Ousterhout	8419148e5f	Remove unused hasPendingTasks methods	2013-12-27 15:19:42 -08:00
Patrick Wendell	19672dca32	Merge pull request #305 from kayousterhout/line_spacing Fixed >100char lines in DAGScheduler.scala There's no changed functionality here -- only line spacing and one grammatical fix in a comment.	2013-12-27 13:37:10 -08:00
Kay Ousterhout	0c71ffe924	Style fixes as per Reynold's review	2013-12-27 12:19:38 -08:00
Kay Ousterhout	8c81068e16	Fixed >100char lines in DAGScheduler.scala	2013-12-27 11:36:54 -08:00
Kay Ousterhout	baaabcedc9	Removed unused failed and causeOfFailure variables	2013-12-27 11:12:36 -08:00
Reynold Xin	7be1e57786	Merge pull request #298 from aarondav/minor Minor: Decrease margin of left side of Log page Before ![before](https://f.cloud.github.com/assets/1400247/1812647/1a4be53e-6e87-11e3-9d5b-f851274be0e9.png) After ![after](https://f.cloud.github.com/assets/1400247/1812648/1ca1ea2c-6e87-11e3-946c-31be9258f450.png) It's a start anyway...	2013-12-26 23:41:40 -10:00
Reynold Xin	7d811ba6f2	Merge pull request #302 from pwendell/SPARK-1007 SPARK-1007: spark-class2.cmd should change SCALA_VERSION to be 2.10 Reported by Qiuzhuang Lian	2013-12-26 23:39:58 -10:00
Patrick Wendell	0cc1e0d43d	SPARK-1007: spark-class2.cmd should change SCALA_VERSION to be 2.10	2013-12-26 23:21:08 -08:00
Matei Zaharia	5e69fc5bb4	Merge pull request #295 from markhamstra/JobProgressListenerNPE Avoid a lump of coal (NPE) in JobProgressListener's stocking.	2013-12-26 19:10:39 -05:00
Aaron Davidson	4f2fb761b0	Decrease margin of left side of log page	2013-12-26 15:38:45 -08:00
Matei Zaharia	e240bad03b	Merge pull request #296 from witgo/master Renamed ClusterScheduler to TaskSchedulerImpl for yarn and new-yarn package	2013-12-26 12:30:48 -05:00
liguoqiang	b662c88a24	fix this import order	2013-12-26 15:49:33 +08:00
Mark Hamstra	c529dceaff	Avoid a lump of coal (NPE) in JobProgressListener's stocking.	2013-12-25 23:10:02 -08:00
Matei Zaharia	c344ed04c7	Merge pull request #283 from tmyklebu/master Python bindings for mllib This pull request contains Python bindings for the regression, clustering, classification, and recommendation tools in mllib. For each 'train' frontend exposed, there is a Scala stub in PythonMLLibAPI.scala and a Python stub in mllib.py. The Python stub serialises the input RDD and any vector/matrix arguments into a mutually-understood format and calls the Scala stub. The Scala stub deserialises the RDD and the vector/matrix arguments, calls the appropriate 'train' function, serialises the resulting model, and returns the serialised model. ALSModel is slightly different since a MatrixFactorizationModel has RDDs inside. The Scala stub returns a handle to a Scala MatrixFactorizationModel; prediction is done by calling the Scala predict method. I have tested these bindings on an x86_64 machine running Linux. There is a risk that these bindings may fail on some choose-your-own-endian platform if Python's endian differs from java.nio.ByteBuffer's idea of the native byte order.	2013-12-26 01:31:06 -05:00
liguoqiang	2bd76f693d	Renamed ClusterScheduler to TaskSchedulerImpl for yarn and new-yarn	2013-12-26 11:10:35 +08:00
liguoqiang	14fcef72db	Renamed ClusterScheduler to TaskSchedulerImpl for yarn and new-yarn	2013-12-26 11:05:07 +08:00
Tor Myklebust	9cbcf81453	Remove commented code in __init__.py.	2013-12-25 14:12:42 -05:00
Tor Myklebust	5e71354cb7	Fix copypasta in __init__.py. Don't import anything directly into pyspark.mllib.	2013-12-25 14:10:55 -05:00
Matei Zaharia	56094bcd8d	Merge pull request #290 from ash211/patch-3 Typo: avaiable -> available	2013-12-25 13:14:33 -05:00
Reynold Xin	4842a07da8	Merge pull request #287 from azuryyu/master Fixed job name in the java streaming example.	2013-12-25 01:52:15 -08:00
Tor Myklebust	02208a175c	Initial weights in Scala are ones; do that too. Also fix some errors.	2013-12-25 00:53:48 -05:00
Tor Myklebust	4e821390bc	Scala stubs for updated Python bindings.	2013-12-25 00:09:00 -05:00
Tor Myklebust	05163057a1	Split the mllib bindings into a whole bunch of modules and rename some things.	2013-12-25 00:08:05 -05:00
Andrew Ash	3665c722b5	Typo: avaiable -> available	2013-12-24 17:25:04 -08:00
Patrick Wendell	85a344b4f0	Merge pull request #127 from kayousterhout/consolidate_schedulers Deduplicate Local and Cluster schedulers. The code in LocalScheduler/LocalTaskSetManager was nearly identical to the code in ClusterScheduler/ClusterTaskSetManager. The redundancy made updating the schedulers unnecessarily painful and error-prone. This commit combines the two into a single TaskScheduler/TaskSetManager. Unfortunately the diff makes this change look much more invasive than it is -- TaskScheduler.scala is only superficially changed (names updated, overrides removed) from the old ClusterScheduler.scala, and the same with TaskSetManager.scala. Thanks @rxin for suggesting this change!	2013-12-24 16:35:06 -08:00
Patrick Wendell	c2dd6bcd6e	Merge pull request #279 from aarondav/shuffle-cleanup0 Clean up shuffle files once their metadata is gone Previously, we would only clean the in-memory metadata for consolidated shuffle files. Additionally, fixes a bug where the Metadata Cleaner was ignoring type-specific TTLs.	2013-12-24 14:36:47 -08:00
Kay Ousterhout	1efe3adf56	Responded to Reynold's style comments	2013-12-24 14:18:39 -08:00
Tor Myklebust	86e38c4942	Remove useless line from test stub.	2013-12-24 16:49:31 -05:00
Tor Myklebust	4efec6eb94	Python change for move of PythonMLLibAPI.	2013-12-24 16:49:03 -05:00
Tor Myklebust	58e2a7d6d4	Move PythonMLLibAPI into its own package.	2013-12-24 16:48:40 -05:00
Matei Zaharia	3bf7c708d3	Merge pull request #275 from ueshin/wip/changeclasspathorder Change the order of CLASSPATH. SPARK_TOOLS_JAR should be placed after CLASSPATH or at least after SPARK_CLASSPATH. If SPARK_TOOLS_JAR is placed before CLASSPATH, all assembled classes and resources in spark-tools-assembly.jar beat those in CLASSPATH or SPARK_CLASSPATH, which might be replaced by customized versions.	2013-12-24 16:37:13 -05:00
Tor Myklebust	2402180b32	Fix error message ugliness.	2013-12-24 16:18:33 -05:00
azuryyu	66b7bea7f8	Make App report interval configurable during 'run on Yarn'	2013-12-24 18:16:49 +08:00
azuryyu	a8bb86389d	Fixed job name in the java streaming example.	2013-12-24 16:52:20 +08:00
Reynold Xin	d63856c361	Merge pull request #286 from rxin/build Show full stack trace and time taken in unit tests.	2013-12-23 22:07:26 -08:00
Reynold Xin	fc80b2e693	Show full stack trace and time taken in unit tests.	2013-12-23 21:20:20 -08:00
Matei Zaharia	23a9ae6be3	Merge pull request #277 from tdas/scheduler-update Refactored the streaming scheduler and added StreamingListener interface - Refactored the streaming scheduler for cleaner code. Specifically, the JobManager was renamed to JobScheduler, as it does the actual scheduling of Spark jobs to the SparkContext. The earlier Scheduler was renamed to JobGenerator, as it actually generates the jobs from the DStreams. The JobScheduler starts the JobGenerator. Also, moved all the scheduler related code from spark.streaming to spark.streaming.scheduler package. - Implemented the StreamingListener interface, similar to SparkListener. The streaming version of StatusReportListener prints the batch processing time statistics (for now). Added StreamingListernerSuite to test it. - Refactored streaming TestSuiteBase for deduping code in the other streaming testsuites.	2013-12-24 00:08:48 -05:00
Tathagata Das	6eaa050549	Minor change for PR 277.	2013-12-23 15:55:45 -08:00
Tathagata Das	f9771690a6	Minor formatting fixes.	2013-12-23 11:32:26 -08:00
Tathagata Das	dc3ee6b612	Added comments to BatchInfo and JobSet, based on Patrick's comment on PR 277.	2013-12-23 11:30:42 -08:00

4987 commits