Commit graph

4998 commits

Author SHA1 Message Date
Matei Zaharia 3bf7c708d3 Merge pull request #275 from ueshin/wip/changeclasspathorder
Change the order of CLASSPATH.

SPARK_TOOLS_JAR should be placed after CLASSPATH or at least after
SPARK_CLASSPATH.

If SPARK_TOOLS_JAR is placed before CLASSPATH, all assembled classes and
resources in spark-tools-assembly.jar beat those in CLASSPATH or
SPARK_CLASSPATH, which might be replaced by customized versions.
2013-12-24 16:37:13 -05:00
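The ordering issue described in this commit can be illustrated with a minimal sketch. It assumes a simplified first-match lookup over an ordered list of entries; the `resolve` helper and the class names are hypothetical and are not part of Spark's launch scripts.

```python
def resolve(resource, classpath):
    """Return the name of the first classpath entry that provides `resource`."""
    for entry_name, contents in classpath:
        if resource in contents:
            return entry_name
    return None


# Hypothetical entries: a user-customized class that is also bundled in the tools assembly.
user_classpath = ("SPARK_CLASSPATH", {"com/example/Custom.class"})
tools_jar = ("SPARK_TOOLS_JAR", {"com/example/Custom.class", "tools/Main.class"})

# Tools jar first: the bundled copy shadows the customized one.
before = resolve("com/example/Custom.class", [tools_jar, user_classpath])
# Tools jar last: the customized copy wins, and tools-only classes still resolve.
after = resolve("com/example/Custom.class", [user_classpath, tools_jar])
print(before)
print(after)
```

With the tools jar placed last, classes that exist only in the assembly are still found, so the reordering costs nothing for the tools themselves.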
Tor Myklebust 2402180b32 Fix error message ugliness. 2013-12-24 16:18:33 -05:00
Prashant Sharma 2573add94c spark-544, introducing SparkConf and related configuration overhaul. 2013-12-25 00:09:36 +05:30
azuryyu 66b7bea7f8 Make App report interval configurable during 'run on Yarn' 2013-12-24 18:16:49 +08:00
azuryyu a8bb86389d Fixed job name in the java streaming example. 2013-12-24 16:52:20 +08:00
Reynold Xin d63856c361 Merge pull request #286 from rxin/build
Show full stack trace and time taken in unit tests.
2013-12-23 22:07:26 -08:00
Reynold Xin fc80b2e693 Show full stack trace and time taken in unit tests. 2013-12-23 21:20:20 -08:00
Matei Zaharia 23a9ae6be3 Merge pull request #277 from tdas/scheduler-update
Refactored the streaming scheduler and added StreamingListener interface

- Refactored the streaming scheduler for cleaner code. Specifically, the JobManager was renamed to JobScheduler, as it does the actual scheduling of Spark jobs to the SparkContext. The earlier Scheduler was renamed to JobGenerator, as it actually generates the jobs from the DStreams. The JobScheduler starts the JobGenerator. Also moved all the scheduler-related code from spark.streaming to the spark.streaming.scheduler package.
- Implemented the StreamingListener interface, similar to SparkListener. The streaming version of StatusReportListener prints the batch processing time statistics (for now). Added StreamingListenerSuite to test it.
- Refactored the streaming TestSuiteBase to deduplicate code in the other streaming test suites.
2013-12-24 00:08:48 -05:00
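The listener pattern mentioned in this pull request can be sketched roughly as follows. This is an illustration in Python, not the Scala interface added by the PR; the class names, the `on_batch_completed` callback, and the dictionary fields are all assumptions made for the example.

```python
class StreamingListener:
    """Hypothetical base listener; subclasses override the callbacks they need."""

    def on_batch_completed(self, batch_info):
        pass


class StatsReportListener(StreamingListener):
    """Collects per-batch processing times so statistics can be reported."""

    def __init__(self):
        self.processing_times = []

    def on_batch_completed(self, batch_info):
        elapsed = batch_info["processing_end"] - batch_info["processing_start"]
        self.processing_times.append(elapsed)

    def mean_processing_time(self):
        return sum(self.processing_times) / len(self.processing_times)


class ListenerBus:
    """Forwards each posted event to every registered listener."""

    def __init__(self):
        self.listeners = []

    def add_listener(self, listener):
        self.listeners.append(listener)

    def post_batch_completed(self, batch_info):
        for listener in self.listeners:
            listener.on_batch_completed(batch_info)


bus = ListenerBus()
stats = StatsReportListener()
bus.add_listener(stats)
bus.post_batch_completed({"processing_start": 0, "processing_end": 2})
bus.post_batch_completed({"processing_start": 10, "processing_end": 14})
print(stats.mean_processing_time())
```

Keeping the scheduler unaware of what listeners do with the events is the usual reason for this design: reporting can change without touching the scheduling code.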
Tathagata Das 6eaa050549 Minor change for PR 277. 2013-12-23 15:55:45 -08:00
Tathagata Das f9771690a6 Minor formatting fixes. 2013-12-23 11:32:26 -08:00
Tathagata Das dc3ee6b612 Added comments to BatchInfo and JobSet, based on Patrick's comment on PR 277. 2013-12-23 11:30:42 -08:00
Reynold Xin 11107c9de5 Merge pull request #244 from leftnoteasy/master
Added SPARK-968 implementation for review
2013-12-23 10:38:20 -08:00
wangda.tan 2f689ba97b SPARK-968, added executor address showing in aggregated metrics by executors table 2013-12-23 15:03:45 +08:00
Tor Myklebust cbb2811189 Release JVM reference to the ALSModel when done. 2013-12-22 15:03:58 -05:00
Kay Ousterhout b7bfae1afe Correctly merged in maxTaskFailures fix 2013-12-22 07:34:44 -08:00
wangda.tan c979eecdf6 added changes according to comments from rxin 2013-12-22 21:43:15 +08:00
Kay Ousterhout b8ae096a40 Fix build error in test 2013-12-21 23:28:48 -08:00
Tor Myklebust 20f85eca3d Java stubs for ALSModel. 2013-12-21 14:54:13 -05:00
Tor Myklebust 076fc16221 Python stubs for ALSModel. 2013-12-21 14:54:01 -05:00
Tathagata Das 3ddbdbfbc7 Minor updates based on comments on PR 277. 2013-12-20 19:51:37 -08:00
Kay Ousterhout 30186aa264 Renamed ClusterScheduler to TaskSchedulerImpl 2013-12-20 14:58:04 -08:00
Kay Ousterhout c06945cfe0 Merge remote branch 'upstream/master' into consolidate_schedulers
Conflicts:
	core/src/main/scala/org/apache/spark/scheduler/cluster/ClusterTaskSetManager.scala
	core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
	core/src/test/scala/org/apache/spark/scheduler/TaskSetManagerSuite.scala
2013-12-20 14:39:30 -08:00
Patrick Wendell 0bc57c5767 Merge pull request #280 from aarondav/minor
Minor cleanup for standalone scheduler

See commit messages
2013-12-20 11:56:54 -08:00
Tor Myklebust b454fdc2eb Javadocs; also, declare some things private. 2013-12-20 02:10:21 -05:00
Tor Myklebust 0b494c2167 Un-semicolon mllib.py. 2013-12-20 02:05:55 -05:00
Tor Myklebust 0a5cacb961 Change some docstrings and add some others. 2013-12-20 02:05:15 -05:00
Tor Myklebust b835ddf3df Licence notice. 2013-12-20 01:55:03 -05:00
Tor Myklebust d89cc1e28a Whitespace. 2013-12-20 01:50:42 -05:00
Tor Myklebust 319520b9bb Remove gigantic endian-specific test and exception tests. 2013-12-20 01:48:44 -05:00
Tor Myklebust 2940201ad8 Tests for the Python side of the mllib bindings. 2013-12-20 01:33:32 -05:00
Kay Ousterhout 9228ec847e Merge pull request #1 from aarondav/127
Merge master into 127
2013-12-19 21:37:15 -08:00
Tor Myklebust 73e17064c6 Python stubs for classification and clustering. 2013-12-20 00:12:48 -05:00
Tor Myklebust f99970e8cd Scala classification and clustering stubs; matrix serialization/deserialization. 2013-12-20 00:12:22 -05:00
Tor Myklebust 2328bdd00f Python side of python bindings for linear, Lasso, and ridge regression 2013-12-19 22:45:16 -05:00
Tor Myklebust ded67ee90c Bindings for linear, Lasso, and ridge regression. 2013-12-19 22:42:12 -05:00
Tor Myklebust 2a41c9aad3 Un-semicolon PythonMLLibAPI. 2013-12-19 21:27:11 -05:00
Patrick Wendell eca68d4425 Merge pull request #272 from tmyklebu/master
Track and report task result serialisation time.

 - DirectTaskResult now has a ByteBuffer valueBytes instead of a T value.
 - DirectTaskResult now has a member function T value() that deserialises valueBytes.
 - Executor serialises value into a ByteBuffer and passes it to DTR's ctor.
 - Executor tracks the time taken to do so and puts it in a new field in TaskMetrics.
 - StagePage now reports serialisation time from TaskMetrics along with the other things it reported.
2013-12-19 18:12:22 -08:00
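The change listed above (store serialized bytes, deserialize on demand, and time the serialization step) might look like the following in outline. This is a Python sketch using `pickle` as a stand-in serializer; the field and metric names are assumptions and do not mirror the Scala code in the PR.

```python
import pickle
import time


class DirectTaskResult:
    """Holds the serialized result bytes; the value is deserialized on demand."""

    def __init__(self, value_bytes, metrics):
        self.value_bytes = value_bytes
        self.metrics = metrics

    def value(self):
        return pickle.loads(self.value_bytes)


def run_task(task):
    """Run `task`, serialize its result, and record how long serialization took."""
    result = task()
    start = time.perf_counter()
    value_bytes = pickle.dumps(result)
    elapsed = time.perf_counter() - start
    # Hypothetical metric name; the PR adds a comparable field to TaskMetrics.
    metrics = {"result_serialization_time": elapsed}
    return DirectTaskResult(value_bytes, metrics)


task_result = run_task(lambda: [1, 2, 3])
print(task_result.value())
```

Measuring only the span around the serializer call keeps the metric separate from task run time, which is what lets a UI report the two side by side.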
Aaron Davidson 6613ab663d Fix compiler warning in SparkZooKeeperSession 2013-12-19 17:56:13 -08:00
Aaron Davidson 4d74b899b7 Remove firstApp from the standalone scheduler Master
As a lonely child with no one to care for it... we had to put it down.
2013-12-19 17:53:41 -08:00
Aaron Davidson 1ab031eaff Extraordinarily minor code/comment cleanup 2013-12-19 17:51:29 -08:00
Aaron Davidson 0647ec9757 Clean up shuffle files once their metadata is gone
Previously, we would only clean the in-memory metadata for consolidated
shuffle files.

Additionally, fixes a bug where the Metadata Cleaner was ignoring
type-specific TTLs.
2013-12-19 15:40:48 -08:00
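Both behaviors named in this commit, deleting files when their metadata expires and honoring a type-specific TTL, can be sketched as below. The function names, the metadata layout, and the TTL values are hypothetical; this is not the Spark implementation.

```python
def effective_ttl(cleaner_type, type_ttls, global_ttl):
    """Prefer the TTL configured for this cleaner type; fall back to the global TTL."""
    return type_ttls.get(cleaner_type, global_ttl)


def clean(metadata, now, ttl, on_expire):
    """Drop entries older than `ttl` and hand their files to `on_expire`."""
    expired = [key for key, (timestamp, _) in metadata.items() if now - timestamp > ttl]
    for key in expired:
        _, files = metadata.pop(key)
        on_expire(files)  # e.g. delete the shuffle files from disk
    return expired


# Each entry maps an id to (creation time, files that belong to it).
metadata = {"shuffle_0": (0, ["a.data", "b.data"]), "shuffle_1": (90, ["c.data"])}
deleted = []
ttl = effective_ttl("shuffle", {"shuffle": 50}, 3600)
removed = clean(metadata, now=100, ttl=ttl, on_expire=deleted.extend)
print(removed, deleted)
```

Tying file deletion to the same pass that drops the metadata avoids leaving files on disk that nothing references any more.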
Reynold Xin 7990c56375 Merge pull request #276 from shivaram/collectPartition
Add collectPartition to JavaRDD interface.

This interface is useful for implementing `take` from other language frontends where the data is serialized. Also remove `takePartition` from PythonRDD and use `collectPartition` in rdd.py.

Thanks @concretevitamin for the original change and tests.
2013-12-19 13:35:09 -08:00
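As a rough illustration of why a per-partition collect helps a language frontend implement `take`, the sketch below fetches partitions one at a time and stops once enough items have arrived. Plain Python lists stand in for partitions, and both functions are hypothetical stand-ins rather than the JavaRDD or rdd.py code.

```python
def collect_partitions(partitions, indices):
    """Return the contents of the requested partitions, one list per index."""
    return [list(partitions[i]) for i in indices]


def take(partitions, n):
    """Collect up to `n` items, reading only as many partitions as needed."""
    items = []
    for index in range(len(partitions)):
        if len(items) >= n:
            break
        (part,) = collect_partitions(partitions, [index])
        items.extend(part[: n - len(items)])
    return items


data = [[1, 2], [3], [4, 5, 6]]
print(take(data, 4))
```

In this sketch, later partitions are never read when the first few already satisfy the request.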
Shivaram Venkataraman 9cc3a6d3c0 Add comment explaining collectPartitions's use 2013-12-19 11:49:17 -08:00
Shivaram Venkataraman d3234f9726 Make collectPartitions take an array of partitions
Change the implementation to use runJob instead of PartitionPruningRDD.
Also update the unit tests and the python take implementation
to use the new interface.
2013-12-19 11:40:34 -08:00
Matei Zaharia 440e531a5e Merge pull request #278 from MLnick/java-python-tostring
Add toString to Java RDD, and __repr__ to Python RDD

Addresses [SPARK-992](https://spark-project.atlassian.net/browse/SPARK-992)
2013-12-19 10:38:56 -08:00
Nick Pentreath a76f53416c Add toString to Java RDD, and __repr__ to Python RDD 2013-12-19 14:38:20 +02:00
Tor Myklebust bf20591a00 Incorporate most of Josh's style suggestions. I don't want to deal with the type and length checking errors until we've got at least one working stub that we're all happy with. 2013-12-19 03:40:57 -05:00
Reynold Xin d8d3f3e60d Merge pull request #183 from aarondav/spark-959
[SPARK-959] Explicitly depend on org.eclipse.jetty.orbit jar

Without this, in some cases, Ivy attempts to download the wrong file and fails, stopping the whole build. See [bug](https://spark-project.atlassian.net/browse/SPARK-959) for more details.

Note that this may not be the best solution, as I do not understand the root cause of why this only happens for some people. However, it is reported to work.
2013-12-19 00:06:43 -08:00
Tathagata Das ec71b445ad Minor changes. 2013-12-18 23:39:28 -08:00
Aaron Davidson eaf6a269b1 [SPARK-959] Explicitly depend on org.eclipse.jetty.orbit jar
Without this, in some cases, Ivy attempts to download the wrong file
and fails, stopping the whole build. See bug for more details.

(This is probably also the beginning of the slow death of our
recently prettified dependencies. Form follows function.)
2013-12-18 23:37:31 -08:00