Commit graph

4238 commits

Author SHA1 Message Date
Sandy Ryza 6f341310bf SPARK-5492. Thread statistics can break with older Hadoop versions
Author: Sandy Ryza <sandy@cloudera.com>

Closes #4305 from sryza/sandy-spark-5492 and squashes the following commits:

b7d4497 [Sandy Ryza] SPARK-5492. Thread statistics can break with older Hadoop versions
2015-02-02 00:54:06 -08:00
jerryshao 63dfe21dc7 [SPARK-5478][UI][Minor] Add missing right parentheses
![UI](https://dl.dropboxusercontent.com/u/19230832/Capture.PNG)

Author: jerryshao <saisai.shao@intel.com>

Closes #4267 from jerryshao/SPARK-5478 and squashes the following commits:

9fe51cc [jerryshao] Add missing right parentheses
2015-02-01 23:56:13 -08:00
Patrick Wendell a15f6e31fc [SPARK-3996]: Shade Jetty in Spark deliverables
(v2 of this patch with a fix that was only relevant for the maven build).

This patch piggy-backs on vanzin's work to simplify the Guava shading,
and adds Jetty as a shaded library in Spark. Other than adding Jetty,
it consolidates the <artifactSet>s into the root pom. I found it was
a bit easier to follow that way, since you don't need to look into
child poms to find out which artifact sets are included in shading.

Author: Patrick Wendell <patrick@databricks.com>

Closes #4285 from pwendell/jetty and squashes the following commits:

d3e7f4e [Patrick Wendell] Fix for shaded deps causing compile errors
19f0710 [Patrick Wendell] More code review feedback
961452d [Patrick Wendell] Responding to feedback from Marcello
6df25ca [Patrick Wendell] [WIP] [SPARK-3996]: Shade Jetty in Spark deliverables
2015-02-01 21:13:57 -08:00
Tom Panning 1ca0a1014e [SPARK-5176] The thrift server does not support cluster mode
Output an error message if the thrift server is started in cluster mode.

Author: Tom Panning <tom.panning@nextcentury.com>

Closes #4137 from tpanningnextcen/spark-5176-thrift-cluster-mode-error and squashes the following commits:

f5c0509 [Tom Panning] [SPARK-5176] The thrift server does not support cluster mode
2015-02-01 17:57:31 -08:00
zsxwing 883bc88d52 [SPARK-4859][Core][Streaming] Refactor LiveListenerBus and StreamingListenerBus
This PR refactors LiveListenerBus and StreamingListenerBus and extracts the common code into a parent class `ListenerBus`.

It also includes bug fixes in #3710:
1. Fix the race condition on queueFullErrorMessageLogged in LiveListenerBus and StreamingListenerBus to avoid outputting `queue-full-error` logs multiple times.
2. Ensure the SHUTDOWN message is delivered to listenerThread, so that listenerThread can always exit.
3. Log errors from listeners rather than crashing listenerThread in StreamingListenerBus.

While fixing the above bugs, we found it better to make LiveListenerBus and StreamingListenerBus behave the same way, which would otherwise leave a lot of duplicated code between the two classes.

Therefore, I extracted their common code into `ListenerBus` as a parent class: LiveListenerBus and StreamingListenerBus only need to extend `ListenerBus` and implement `onPostEvent` (how to process an event) and `onDropEvent` (what to do when dropping an event).
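
A rough Scala sketch of that shape (names and bodies here are illustrative, not the actual implementation):

```scala
// Sketch: the parent class owns listener bookkeeping and dispatch; the two
// subclasses only decide how to handle one event and what to do on a drop.
trait ListenerBusSketch[L, E] {
  protected val listeners = new java.util.concurrent.CopyOnWriteArrayList[L]()

  def addListener(listener: L): Unit = listeners.add(listener)

  final def postToAll(event: E): Unit = {
    val iter = listeners.iterator
    while (iter.hasNext) {
      onPostEvent(iter.next(), event) // e.g. dispatch to the right callback
    }
  }

  def onPostEvent(listener: L, event: E): Unit // how to process an event
  def onDropEvent(event: E): Unit              // e.g. log a queue-full error once
}
```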

Author: zsxwing <zsxwing@gmail.com>

Closes #4006 from zsxwing/SPARK-4859-refactor and squashes the following commits:

c8dade2 [zsxwing] Fix the code style after renaming
5715061 [zsxwing] Rename ListenerHelper to ListenerBus and the original ListenerBus to AsynchronousListenerBus
f0ef647 [zsxwing] Fix the code style
4e85ffc [zsxwing] Merge branch 'master' into SPARK-4859-refactor
d2ef990 [zsxwing] Add private[spark]
4539f91 [zsxwing] Remove final to pass MiMa tests
a9dccd3 [zsxwing] Remove SparkListenerShutdown
7cc04c3 [zsxwing] Refactor LiveListenerBus and StreamingListenerBus and make them share same code base
2015-02-01 17:48:41 -08:00
Ryan Williams 80bd715a3e [SPARK-5422] Add support for sending Graphite metrics via UDP
Depends on [SPARK-5413](https://issues.apache.org/jira/browse/SPARK-5413) / #4209 (included here); will rebase once the latter is merged.

Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #4218 from ryan-williams/udp and squashes the following commits:

ebae393 [Ryan Williams] Add support for sending Graphite metrics via UDP
cb58262 [Ryan Williams] bump metrics dependency to v3.1.0
2015-01-31 23:41:05 -08:00
Sean Owen c84d5a10e8 SPARK-3359 [CORE] [DOCS] sbt/sbt unidoc doesn't work with Java 8
These are more `javadoc` 8-related changes I spotted while investigating. These should be helpful in any event, but this does not nearly resolve SPARK-3359, which may never be feasible while using `unidoc` and `javadoc` 8.

Author: Sean Owen <sowen@cloudera.com>

Closes #4193 from srowen/SPARK-3359 and squashes the following commits:

5b33f66 [Sean Owen] Additional scaladoc fixes for javadoc 8; still not going to be javadoc 8 compatible
2015-01-31 10:40:42 -08:00
Reynold Xin 636408311d [SPARK-5307] Add a config option for SerializationDebugger.
Just in case there is a bug in the SerializationDebugger that makes error reporting worse than it was.

Author: Reynold Xin <rxin@databricks.com>

Closes #4297 from rxin/ser-config and squashes the following commits:

f1d4629 [Reynold Xin] [SPARK-5307] Add a config option for SerializationDebugger.
2015-01-31 00:06:36 -08:00
Reynold Xin 740a56862b [SPARK-5307] SerializationDebugger
This patch adds a SerializationDebugger that is used to add the serialization path to a NotSerializableException. When a NotSerializableException is encountered, the debugger visits the object graph to find the path to the object that cannot be serialized, and constructs information to help the user find that object.

The patch uses the internals of JVM serialization (in particular, heavy usage of ObjectStreamClass). Compared with an earlier attempt, this one provides extra information including field names, array offsets, writeExternal calls, etc.

An example serialization stack:
```
Serialization stack:
  - object not serializable (class: org.apache.spark.serializer.NotSerializable, value: org.apache.spark.serializer.NotSerializable2c43caa4)
  - element of array (index: 0)
  - array (class [Ljava.lang.Object;, size 1)
  - field (class: org.apache.spark.serializer.SerializableArray, name: arrayField, type: class [Ljava.lang.Object;)
  - object (class org.apache.spark.serializer.SerializableArray, org.apache.spark.serializer.SerializableArray193c5908)
  - writeExternal data
  - externalizable object (class org.apache.spark.serializer.ExternalizableClass, org.apache.spark.serializer.ExternalizableClass320bdadc)
```
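
A much-simplified sketch of the graph walk (plain reflection over declared fields only; the real debugger drives this through ObjectStreamClass and handles many more cases):

```scala
import java.lang.reflect.Modifier

object SerializationPathSketch {
  // Walk the object graph and return the trail of references from `root`
  // to the first non-serializable object, if any.
  def findPath(root: AnyRef): Option[List[String]] = {
    val seen = java.util.Collections.newSetFromMap(
      new java.util.IdentityHashMap[AnyRef, java.lang.Boolean]())
    def visit(obj: AnyRef, trail: List[String]): Option[List[String]] = {
      if (obj == null || !seen.add(obj)) None
      else if (!obj.isInstanceOf[java.io.Serializable])
        Some((s"object not serializable (class: ${obj.getClass.getName})" :: trail).reverse)
      else obj match {
        case arr: Array[AnyRef] =>
          arr.zipWithIndex.view.flatMap { case (e, i) =>
            visit(e, s"element of array (index: $i)" :: trail)
          }.headOption
        case _ =>
          obj.getClass.getDeclaredFields.view
            .filterNot(f => Modifier.isStatic(f.getModifiers) ||
                            Modifier.isTransient(f.getModifiers) ||
                            f.getType.isPrimitive)
            .flatMap { f =>
              f.setAccessible(true)
              visit(f.get(obj), s"field (name: ${f.getName})" :: trail)
            }.headOption
      }
    }
    visit(root, Nil)
  }
}
```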

Author: Reynold Xin <rxin@databricks.com>

Closes #4098 from rxin/SerializationDebugger and squashes the following commits:

553b3ff [Reynold Xin] Update SerializationDebuggerSuite.scala
572d0cb [Reynold Xin] Disable automatically when reflection fails.
b349b77 [Reynold Xin] [SPARK-5307] SerializationDebugger to help debug NotSerializableException - take 2
2015-01-30 22:34:10 -08:00
Sandy Ryza 254eaa4d35 SPARK-5393. Flood of util.RackResolver log messages after SPARK-1714
Previously I had tried to solve this by adding a line in Spark's log4j-defaults.properties.

The issue with the message in log4j-defaults.properties was that the log4j.properties packaged inside Hadoop was getting picked up instead. While it would be ideal to fix that as well, we still want to quiet this in situations where a user supplies their own custom log4j properties.

Author: Sandy Ryza <sandy@cloudera.com>

Closes #4192 from sryza/sandy-spark-5393 and squashes the following commits:

4d5dedc [Sandy Ryza] Only set log level if unset
46e07c5 [Sandy Ryza] SPARK-5393. Flood of util.RackResolver log messages after SPARK-1714
2015-01-30 11:31:54 -06:00
Davies Liu 5c746eedda [SPARK-5395] [PySpark] fix python process leak while coalesce()
Currently, the Python process is released into the pool only after the task has finished, which causes many processes to be forked when coalesce() is called.

This PR changes it to release the process as soon as all data has been read from it (i.e., the partition is finished), so a process can be reused to process multiple partitions within a single task.
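
A hedged sketch of the reuse pattern; `readNext` and `releaseWorker` are hypothetical stand-ins, not PySpark's actual API:

```scala
// Sketch: wrap the stream from the worker so the worker goes back to the
// pool the moment the partition is fully read, not when the task finishes.
class WorkerReaderSketch(
    worker: java.net.Socket,
    readNext: java.net.Socket => Array[Byte],  // hypothetical: null at end of stream
    releaseWorker: java.net.Socket => Unit)    // hypothetical: return worker to pool
  extends Iterator[Array[Byte]] {

  private var nextObj: Array[Byte] = read()

  override def hasNext: Boolean = nextObj != null

  override def next(): Array[Byte] = {
    val obj = nextObj
    nextObj = read()
    obj
  }

  private def read(): Array[Byte] = {
    val obj = readNext(worker)
    if (obj == null) releaseWorker(worker) // end of partition: reuse the process
    obj
  }
}
```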

Author: Davies Liu <davies@databricks.com>

Closes #4238 from davies/py_leak and squashes the following commits:

ec80a43 [Davies Liu] add @volatile
6da437a [Davies Liu] address comments
24ed322 [Davies Liu] fix python process leak while coalesce()
2015-01-29 17:28:37 -08:00
Patrick Wendell d2071e8f45 Revert "[WIP] [SPARK-3996]: Shade Jetty in Spark deliverables"
This reverts commit f240fe390b.
2015-01-29 17:14:27 -08:00
Patrick Wendell f240fe390b [WIP] [SPARK-3996]: Shade Jetty in Spark deliverables
This patch piggy-backs on vanzin's work to simplify the Guava shading,
and adds Jetty as a shaded library in Spark. Other than adding Jetty,
it consolidates the \<artifactSet\>s into the root pom. I found it was
a bit easier to follow that way, since you don't need to look into
child poms to find out which artifact sets are included in shading.

Author: Patrick Wendell <patrick@databricks.com>

Closes #4252 from pwendell/jetty and squashes the following commits:

19f0710 [Patrick Wendell] More code review feedback
961452d [Patrick Wendell] Responding to feedback from Marcello
6df25ca [Patrick Wendell] [WIP] [SPARK-3996]: Shade Jetty in Spark deliverables
2015-01-29 16:31:19 -08:00
Marcelo Vanzin f9e569452e [SPARK-5466] Add explicit guava dependencies where needed.
One side-effect of shading guava is that it disappears as a transitive
dependency. For Hadoop 2.x, this was masked by the fact that Hadoop
itself depends on guava. But certain versions of Hadoop 1.x also
shade guava, leaving either no guava or some random version pulled
by another dependency on the classpath.

So be explicit about the dependency in modules that use guava directly,
which is the right thing to do anyway.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #4272 from vanzin/SPARK-5466 and squashes the following commits:

e3f30e5 [Marcelo Vanzin] Dependency for catalyst is not needed.
d3b2c84 [Marcelo Vanzin] [SPARK-5466] Add explicit guava dependencies where needed.
2015-01-29 13:00:45 -08:00
Xiangrui Meng 4ee79c71af [SPARK-5430] move treeReduce and treeAggregate from mllib to core
We have seen many use cases of `treeAggregate`/`treeReduce` outside the ML domain. Maybe it is time to move them to Core. pwendell
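
For reference, a small spark-shell-style usage sketch of the two operations after the move (numbers are illustrative):

```scala
val rdd = sc.parallelize(1 to 1000000, numSlices = 100)

// treeReduce/treeAggregate combine partial results in a multi-level tree,
// so the driver doesn't have to merge one result per partition by itself.
val sum = rdd.treeReduce(_ + _)
val (total, count) = rdd.treeAggregate((0L, 0L))(
  seqOp  = (acc, x) => (acc._1 + x, acc._2 + 1),
  combOp = (a, b)   => (a._1 + b._1, a._2 + b._2))
```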

Author: Xiangrui Meng <meng@databricks.com>

Closes #4228 from mengxr/SPARK-5430 and squashes the following commits:

20ad40d [Xiangrui Meng] exclude tree* from mima
e89a43e [Xiangrui Meng] fix compile and update java doc
3ae1a4b [Xiangrui Meng] add treeReduce/treeAggregate to Python
6f948c5 [Xiangrui Meng] add treeReduce/treeAggregate to JavaRDDLike
d600b6c [Xiangrui Meng] move treeReduce and treeAggregate to core
2015-01-28 17:26:03 -08:00
Michael Nazario e023112d33 [SPARK-5441][pyspark] Make SerDeUtil PairRDD to Python conversions more robust
SerDeUtil.pairRDDToPython and SerDeUtil.pythonToPairRDD now both support empty RDDs by checking the result of take(1) instead of calling first, which throws an exception.
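
The gist of the change, sketched (`sampleOrDefault` is a hypothetical helper, not the patched method):

```scala
import org.apache.spark.rdd.RDD

// first() throws on an empty RDD; take(1) returns an empty array instead.
def sampleOrDefault[T](rdd: RDD[T], default: T): T =
  rdd.take(1) match {
    case Array(head) => head    // non-empty: use a real sample element
    case _           => default // empty RDD: first() would have thrown here
  }
```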

Author: Michael Nazario <mnazario@palantir.com>

Closes #4236 from mnazario/feature/empty-first and squashes the following commits:

a531c0c [Michael Nazario] Added regression tests for SPARK-5441
e3b2fb6 [Michael Nazario] Added acceptance of the empty case
2015-01-28 13:58:46 -08:00
Ryan Williams a731314c31 [SPARK-5417] Remove redundant executor-id set() call
This happens inside SparkEnv initialization as of #4194

Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #4213 from ryan-williams/exec-id-set and squashes the following commits:

b3e4f7b [Ryan Williams] Remove redundant executor-id set() call
2015-01-28 13:04:52 -08:00
Andrew Or 84b6ecdef6 [SPARK-5437] Fix DriverSuite and SparkSubmitSuite timeout issues
In DriverSuite, we currently set a timeout of 60 seconds. If after this time the process has not terminated, we leak the process because we never destroy it.

In SparkSubmitSuite, we currently do not have a timeout so the test can hang indefinitely.

Author: Andrew Or <andrew@databricks.com>

Closes #4230 from andrewor14/fix-driver-suite and squashes the following commits:

f5c80fd [Andrew Or] Fix timeout behaviors in both suites
8092c36 [Andrew Or] Stop SparkContext after every individual test
2015-01-28 12:53:22 -08:00
Sean Owen 9b18009b83 SPARK-1934 [CORE] "this" reference escape to "selectorThread" during construction in ConnectionManager
This change reshuffles the order of initialization in `ConnectionManager` so that the last thing that happens is running `selectorThread`, which invokes a method that relies on object state in `ConnectionManager`.

zsxwing also reported a similar problem in `BlockManager` in the JIRA, but I can't find a similar pattern there. Maybe it was subsequently fixed?
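
A minimal sketch of the pattern being enforced (class and field names are illustrative):

```scala
class SelectorOwnerSketch {
  // 1. Initialize ALL object state first...
  private val connections = new java.util.concurrent.ConcurrentHashMap[String, String]()

  private val selectorThread = new Thread("selector") {
    override def run(): Unit = {
      // safe: the object is fully constructed by the time this thread starts
      println(s"managing ${connections.size()} connections")
    }
  }

  // 2. ...and only then let `this` escape by starting the thread, as the
  // very last statement of the constructor body.
  selectorThread.start()
}
```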

Author: Sean Owen <sowen@cloudera.com>

Closes #4225 from srowen/SPARK-1934 and squashes the following commits:

c4dec3b [Sean Owen] Init all object state in ConnectionManager constructor before starting thread in constructor that accesses object's state
2015-01-28 12:44:35 -08:00
Winston Chen 453d7999b8 [SPARK-5361]Multiple Java RDD <-> Python RDD conversions not working correctly
This was found while reading an RDD with `sc.newAPIHadoopRDD` and writing it back using `rdd.saveAsNewAPIHadoopFile` in pyspark.

It turns out that whenever there are multiple RDD conversions from JavaRDD to PythonRDD then back to JavaRDD, the exception below happens:

```
15/01/16 10:28:31 ERROR Executor: Exception in task 0.0 in stage 3.0 (TID 7)
java.lang.ClassCastException: [Ljava.lang.Object; cannot be cast to java.util.ArrayList
	at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:157)
	at org.apache.spark.api.python.SerDeUtil$$anonfun$pythonToJava$1$$anonfun$apply$1.apply(SerDeUtil.scala:153)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
```

The test case code below reproduces it:

```
from pyspark.rdd import RDD

dl = [
    (u'2', {u'director': u'David Lean'}),
    (u'7', {u'director': u'Andrew Dominik'})
]

dl_rdd = sc.parallelize(dl)
tmp = dl_rdd._to_java_object_rdd()
tmp2 = sc._jvm.SerDe.javaToPython(tmp)
t = RDD(tmp2, sc)
t.count()

tmp = t._to_java_object_rdd()
tmp2 = sc._jvm.SerDe.javaToPython(tmp)
t = RDD(tmp2, sc)
t.count() # it blows up here during the 2nd time of conversion
```

Author: Winston Chen <wchen@quid.com>

Closes #4146 from wingchen/master and squashes the following commits:

903df7d [Winston Chen] SPARK-5361, update to toSeq based on the PR
5d90a83 [Winston Chen] SPARK-5361, make python pretty, so to pass PEP 8 checks
126be6b [Winston Chen] SPARK-5361, add in test case
4cf1187 [Winston Chen] SPARK-5361, add in test case
9f1a097 [Winston Chen] add in tuple handling while converting form python RDD back to JavaRDD
2015-01-28 11:08:44 -08:00
Kousuke Saruta 0b35fcd7f0 [SPARK-5291][CORE] Add timestamp and reason why an executor is removed to SparkListenerExecutorAdded and SparkListenerExecutorRemoved
Recently `SparkListenerExecutorAdded` and `SparkListenerExecutorRemoved` were added.
I think it's useful if they carry a timestamp and the reason why an executor was removed.
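
A hedged sketch of the enriched event shapes implied by the title (field names assumed):

```scala
// `time` carries the event timestamp; `reason` says why the executor went away.
case class ExecutorAddedSketch(time: Long, executorId: String)
case class ExecutorRemovedSketch(time: Long, executorId: String, reason: String)
```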

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4082 from sarutak/SPARK-5291 and squashes the following commits:

a026ff2 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5291
979dfe1 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5291
cf9f9080 [Kousuke Saruta] Fixed test case
1f2a89b [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5291
243f2a60 [Kousuke Saruta] Modified MesosSchedulerBackendSuite
a527c35 [Kousuke Saruta] Added timestamp to SparkListenerExecutorAdded
2015-01-28 11:02:51 -08:00
Marcelo Vanzin 37a5e272f8 [SPARK-4809] Rework Guava library shading.
The current way of shading Guava is a little problematic. Code that
depends on "spark-core" does not see the transitive dependency, yet
classes in "spark-core" actually depend on Guava. So it's tricky
to run unit tests that use spark-core classes, since you need a
compatible version of Guava in your dependencies when running the
tests. That is cumbersome and makes for a bad user experience.

This change modifies the way Guava is shaded so that it's applied
uniformly across the Spark build. This means Guava is shaded inside
spark-core itself, so that the dependency issues above are solved.
Aside from that, all Spark sub-modules have their Guava references
relocated, so that they refer to the relocated classes now packaged
inside spark-core. Before, this was only done by the time the assembly
was built, so projects that did not end up inside the assembly (such
as streaming backends) could still reference the original location
of Guava classes.

The Guava classes are added to the "first" artifact Spark generates
(network-common), so that all downstream modules have the needed
classes available. Since "network-common" is a dependency of spark-core,
all Spark apps should get the relocated classes automatically.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #3658 from vanzin/SPARK-4809 and squashes the following commits:

3c93e42 [Marcelo Vanzin] Shade Guava in the network-common artifact.
5d69ec9 [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
b3104fc [Marcelo Vanzin] Add comment.
941848f [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
f78c48a [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
8053dd4 [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
107d7da [Marcelo Vanzin] Add fix for SPARK-5052 (PR #3874).
40b8723 [Marcelo Vanzin] Merge branch 'master' into SPARK-4809
4a4ed42 [Marcelo Vanzin] [SPARK-4809] Rework Guava library shading.
2015-01-28 00:29:29 -08:00
Sandy Ryza b1b35ca2e4 SPARK-5199. FS read metrics should support CombineFileSplits and track bytes from all FSs

Author: Sandy Ryza <sandy@cloudera.com>

Closes #4050 from sryza/sandy-spark-5199 and squashes the following commits:

864514b [Sandy Ryza] Add tests and fix bug
0d504f1 [Sandy Ryza] Prettify
915c7e6 [Sandy Ryza] Get metrics from all filesystems
cdbc3e8 [Sandy Ryza] SPARK-5199. Input metrics should show up for InputFormats that return CombineFileSplits
2015-01-27 15:42:55 -08:00
Elmer Garduno 661e0fca5d [SPARK-5052] Add common/base classes to fix guava methods signatures.
Fixes problems with incorrect method signatures related to shaded classes. For discussion see the jira issue.

Author: Elmer Garduno <elmerg@google.com>

Closes #3874 from elmer-garduno/fix_guava_signatures and squashes the following commits:

aa5d8e0 [Elmer Garduno] Unshade common/base[Function|Supplier] classes to fix guava methods signatures.
2015-01-26 17:40:48 -08:00
Sean Owen 0497ea51ac SPARK-960 [CORE] [TEST] JobCancellationSuite "two jobs sharing the same stage" is broken
This reenables and fixes this test, after addressing two issues:

- The Semaphore that was intended to be shared locally was being serialized and copied; it's now a static member in the companion object, as in other tests (see the sketch below)
- Later changes to Spark mean that cancelling the first task will not cancel the shared stage, and therefore the second task should succeed
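
The first fix's core idea, sketched in Scala:

```scala
import java.util.concurrent.Semaphore

// A Semaphore captured from a local val is serialized into the task closure,
// giving each copy of the closure its own instance. One held in a companion
// or other static object lives once per JVM and is never serialized, so in
// local mode it is truly shared between the test and the tasks.
object JobCancellationSketch {
  val taskStartedSemaphore = new Semaphore(0)
}
```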

Author: Sean Owen <sowen@cloudera.com>

Closes #4180 from srowen/SPARK-960 and squashes the following commits:

43da66f [Sean Owen] Fix 'two jobs sharing the same stage' test and reenable it: truly share a Semaphore locally as intended, and update expectation of failure in non-cancelled task
2015-01-26 14:32:27 -08:00
Sean Owen 54e7b456dd SPARK-4147 [CORE] Reduce log4j dependency
Defer use of log4j class until it's known that log4j 1.2 is being used. This may avoid dealing with log4j dependencies for callers that reroute slf4j to another logging framework. The only change is to push one half of the check in the original `if` condition inside. This is a trivial change, may or may not actually solve a problem, but I think it's all that makes sense to do for SPARK-4147.

Author: Sean Owen <sowen@cloudera.com>

Closes #4190 from srowen/SPARK-4147 and squashes the following commits:

4e99942 [Sean Owen] Defer use of log4j class until it's known that log4j 1.2 is being used. This may avoid dealing with log4j dependencies for callers that reroute slf4j to another logging framework.
2015-01-26 14:23:42 -08:00
Davies Liu 142093179a [SPARK-5355] use j.u.c.ConcurrentHashMap instead of TrieMap
j.u.c.ConcurrentHashMap is more battle-tested.

cc rxin JoshRosen pwendell

Author: Davies Liu <davies@databricks.com>

Closes #4208 from davies/safe-conf and squashes the following commits:

c2182dc [Davies Liu] address comments, fix tests
3a1d821 [Davies Liu] fix test
da14ced [Davies Liu] Merge branch 'master' of github.com:apache/spark into safe-conf
ae4d305 [Davies Liu] change to j.u.c.ConcurrentMap
f8fa1cf [Davies Liu] change to TrieMap
a1d769a [Davies Liu] make SparkConf thread-safe
2015-01-26 12:51:32 -08:00
CodingCat 8df9435512 [SPARK-5268] don't stop CoarseGrainedExecutorBackend for irrelevant DisassociatedEvent
https://issues.apache.org/jira/browse/SPARK-5268

In CoarseGrainedExecutorBackend, we subscribe to DisassociatedEvent in the executor backend actor and exit the program upon receiving such an event...

let's consider the following case

The user may develop an Akka-based program which starts an actor with Spark's actor system and communicates with an external actor system (e.g. an Akka-based receiver in Spark Streaming which communicates with an external system). If the external actor system fails or deliberately disassociates from the actor within Spark's system, we may receive a DisassociatedEvent and the executor is restarted.

This is not the expected behavior.

----

This is a simple fix to check the event before making the quit decision
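
A sketch of the check (the driver-address comparison is inferred from the description, not copied from the patch):

```scala
import akka.actor.{Actor, Address}
import akka.remote.DisassociatedEvent

class ExecutorBackendSketch(driverAddress: Address) extends Actor {
  def receive = {
    case x: DisassociatedEvent if x.remoteAddress == driverAddress =>
      // only losing the driver should take the executor down
      context.system.shutdown()
    case x: DisassociatedEvent =>
      println(s"Received irrelevant DisassociatedEvent $x") // keep running
  }
}
```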

Author: CodingCat <zhunansjtu@gmail.com>

Closes #4063 from CodingCat/SPARK-5268 and squashes the following commits:

4d7d48e [CodingCat] simplify the log
18c36f4 [CodingCat] more descriptive log
f299e0b [CodingCat] clean log
1632e79 [CodingCat] check whether DisassociatedEvent is relevant before quit
2015-01-25 19:28:53 -08:00
Kay Ousterhout fc2168f04e [SPARK-5326] Show fetch wait time as optional metric in the UI
With this change, here's what the UI looks like:

![image](https://cloud.githubusercontent.com/assets/1108612/5809994/1ec8a904-9ff4-11e4-8f24-6a59a1a858f7.png)

If you want to locally test this, you need to spin up multiple executors, because the shuffle read metrics are only shown for data read remotely.

Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #4110 from kayousterhout/SPARK-5326 and squashes the following commits:

610051e [Kay Ousterhout] Josh style comments
5feaa28 [Kay Ousterhout] What is the difference here??
aa129cb [Kay Ousterhout] Removed inadvertent change
721c742 [Kay Ousterhout] Improved tooltip
f3a7111 [Kay Ousterhout] Style fix
679b4e9 [Kay Ousterhout] [SPARK-5326] Show fetch wait time as optional metric in the UI
2015-01-25 16:48:26 -08:00
Kousuke Saruta 8f5c827b01 [SPARK-5344][WebUI] HistoryServer cannot recognize that inprogress file was renamed to completed file
`FsHistoryProvider` tries to update the application status, but if `checkForLogs` is called before the `.inprogress` file is renamed to the completed file, the file is not recognized as completed.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4132 from sarutak/SPARK-5344 and squashes the following commits:

9658008 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5344
d2c72b6 [Kousuke Saruta] Fixed update issue of FsHistoryProvider
2015-01-25 15:34:20 -08:00
Ryan Williams aea25482c3 [SPARK-5402] log executor ID at executor-construction time
also rename "slaveHostname" to "executorHostname"

Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #4195 from ryan-williams/exec and squashes the following commits:

e60a7bb [Ryan Williams] log executor ID at executor-construction time
2015-01-25 14:20:02 -08:00
Ryan Williams 2d9887bae5 [SPARK-5401] set executor ID before creating MetricsSystem
Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #4194 from ryan-williams/metrics and squashes the following commits:

7c5a33f [Ryan Williams] set executor ID before creating MetricsSystem
2015-01-25 14:17:59 -08:00
Idan Zalzberg 412a58e118 Add comment about defaultMinPartitions
Added a comment about using math.min when choosing the default partition count.
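
For context, the method being commented on is essentially:

```scala
// Use math.min so the default stays small (at most 2) even on clusters with
// very large default parallelism; callers can always request more partitions.
def defaultMinPartitions: Int = math.min(defaultParallelism, 2)
```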

Author: Idan Zalzberg <idanzalz@gmail.com>

Closes #4102 from idanz/patch-2 and squashes the following commits:

50e9d58 [Idan Zalzberg] Update SparkContext.scala
2015-01-25 11:28:05 -08:00
zsxwing 0d1e67ee9b [SPARK-5214][Test] Add a test to demonstrate EventLoop can be stopped in the event thread
Author: zsxwing <zsxwing@gmail.com>

Closes #4174 from zsxwing/SPARK-5214-unittest and squashes the following commits:

443e564 [zsxwing] Change the check interval to 5ms
7aaa2d7 [zsxwing] Add a test to demonstrate EventLoop can be stopped in the event thread
2015-01-24 11:00:35 -08:00
Josh Rosen cef1f092a6 [SPARK-5063] More helpful error messages for several invalid operations
This patch adds more helpful error messages for invalid programs that define nested RDDs, broadcast RDDs, perform actions inside of transformations (e.g. calling `count()` from inside of `map()`), and call certain methods on stopped SparkContexts.  Currently, these invalid programs lead to confusing NullPointerExceptions at runtime and have been a major source of questions on the mailing list and StackOverflow.

In a few cases, I chose to log warnings instead of throwing exceptions in order to avoid any chance that this patch breaks programs that worked "by accident" in earlier Spark releases (e.g. programs that define nested RDDs but never run any jobs with them).

In SparkContext, the new `assertNotStopped()` method is used to check whether methods are being invoked on a stopped SparkContext.  In some cases, user programs will not crash in spite of calling methods on stopped SparkContexts, so I've only added `assertNotStopped()` calls to methods that always throw exceptions when called on stopped contexts (e.g. by dereferencing a null `dagScheduler` pointer).
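
A minimal sketch of such a guard (the `stopped` flag and enclosing class are illustrative):

```scala
class StoppableContextSketch {
  @volatile private var stopped = false

  // Turn a confusing NullPointerException into an actionable error message.
  private def assertNotStopped(): Unit = {
    if (stopped) {
      throw new IllegalStateException("Cannot call methods on a stopped SparkContext")
    }
  }

  def stop(): Unit = { stopped = true }
}
```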

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3884 from JoshRosen/SPARK-5063 and squashes the following commits:

a38774b [Josh Rosen] Fix spelling typo
a943e00 [Josh Rosen] Convert two exceptions into warnings in order to avoid breaking user programs in some edge-cases.
2d0d7f7 [Josh Rosen] Fix test to reflect 1.2.1 compatibility
3f0ea0c [Josh Rosen] Revert two unintentional formatting changes
8e5da69 [Josh Rosen] Remove assertNotStopped() calls for methods that were sometimes safe to call on stopped SC's in Spark 1.2
8cff41a [Josh Rosen] IllegalStateException fix
6ef68d0 [Josh Rosen] Fix Python line length issues.
9f6a0b8 [Josh Rosen] Add improved error messages to PySpark.
13afd0f [Josh Rosen] SparkException -> IllegalStateException
8d404f3 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-5063
b39e041 [Josh Rosen] Fix BroadcastSuite test which broadcasted an RDD
99cc09f [Josh Rosen] Guard against calling methods on stopped SparkContexts.
34833e8 [Josh Rosen] Add more descriptive error message.
57cc8a1 [Josh Rosen] Add error message when directly broadcasting RDD.
15b2e6b [Josh Rosen] [SPARK-5063] Useful error messages for nested RDDs and actions inside of transformations
2015-01-23 17:53:15 -08:00
Davies Liu 9bad062268 [SPARK-5355] make SparkConf thread-safe
SparkConf is not thread-safe, but it is accessed by many threads. getAll() could return only part of the configs if another thread is modifying it concurrently.

This PR changes SparkConf.settings to a thread-safe TrieMap.
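
Sketch of the change (field and method shapes assumed from the description):

```scala
import scala.collection.concurrent.TrieMap

class ConfSketch {
  // TrieMap is a lock-free concurrent map whose iteration runs against a
  // consistent snapshot, so getAll can't observe a half-applied update.
  private val settings = TrieMap[String, String]()

  def set(key: String, value: String): Unit = settings(key) = value
  def getAll: Array[(String, String)] = settings.toArray
}
```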

Author: Davies Liu <davies@databricks.com>

Closes #4143 from davies/safe-conf and squashes the following commits:

f8fa1cf [Davies Liu] change to TrieMap
a1d769a [Davies Liu] make SparkConf thread-safe
2015-01-21 16:51:54 -08:00
wangfei 3be2a887bf [SPARK-4984][CORE][WEBUI] Adding a pop-up containing the full job description when it is very long
In some cases the job description is very long, such as a long SQL statement; refer to #3718.
This PR adds a pop-up for the job description when it is long.

![image](https://cloud.githubusercontent.com/assets/7018048/5847400/c757cbbc-a207-11e4-891f-528821c2e68d.png)

![image](https://cloud.githubusercontent.com/assets/7018048/5847409/d434b2b4-a207-11e4-8813-03a74b43d766.png)

Author: wangfei <wangfei1@huawei.com>

Closes #3819 from scwf/popup-descrip-ui and squashes the following commits:

ba02b83 [wangfei] address comments
a7c5e7b [wangfei] spot that it's been truncated
fbf6162 [wangfei] Merge branch 'master' into popup-descrip-ui
0bca96d [wangfei] remove no use val
4b55c3b [wangfei] fix style issue
353c6f4 [wangfei] pop up the description of job with a styled read-only text form field
2015-01-21 15:27:42 -08:00
Sandy Ryza 2eeada373e SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnAllocator

The goal of this PR is to simplify YarnAllocator as much as possible and get it up to the level of code quality we see in the rest of Spark.

In service of this, it does a few things:
* Uses AMRMClient APIs for matching containers to requests.
* Adds calls to AMRMClient.removeContainerRequest so that, when we use a container, we don't end up requesting it again.
* Removes YarnAllocator's host->rack cache. YARN's RackResolver already does this caching, so this is redundant.
* Adds tests for basic YarnAllocator functionality.
* Breaks up the allocateResources method, which was previously nearly 300 lines.
* A little bit of stylistic cleanup.
* Fixes a bug that causes three times the requests to be filed when preferred host locations are given.

The patch is lossy. In particular, it loses the logic for trying to avoid containers bunching up on nodes. As I understand it, the logic that's gone is:

* If, in a single response from the RM, we receive a set of containers on a node, and prefer some number of containers on that node greater than 0 but less than the number we received, give back the delta between what we preferred and what we received.

This seems like a weird way to avoid bunching. E.g., it does nothing to avoid bunching when we don't request containers on particular nodes.

Author: Sandy Ryza <sandy@cloudera.com>

Closes #3765 from sryza/sandy-spark-1714 and squashes the following commits:

32a5942 [Sandy Ryza] Muffle RackResolver logs
74f56dd [Sandy Ryza] Fix a couple comments and simplify requestTotalExecutors
60ea4bd [Sandy Ryza] Fix scalastyle
ca35b53 [Sandy Ryza] Simplify further
e9cf8a6 [Sandy Ryza] Fix YarnClusterSuite
257acf3 [Sandy Ryza] Remove locality stuff and more cleanup
59a3c5e [Sandy Ryza] Take out rack stuff
5f72fd5 [Sandy Ryza] Further documentation and cleanup
89edd68 [Sandy Ryza] SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnAllocator
2015-01-21 10:31:54 -06:00
WangTao 8c06a5faac [SPARK-5336][YARN]spark.executor.cores must not be less than spark.task.cpus
https://issues.apache.org/jira/browse/SPARK-5336

Author: WangTao <barneystinson@aliyun.com>
Author: WangTaoTheTonic <barneystinson@aliyun.com>

Closes #4123 from WangTaoTheTonic/SPARK-5336 and squashes the following commits:

6c9676a [WangTao] Update ClientArguments.scala
9632d3a [WangTaoTheTonic] minor comment fix
d03d6fa [WangTaoTheTonic] import ordering should be alphabetical'
3112af9 [WangTao] spark.executor.cores must not be less than spark.task.cpus
2015-01-21 09:42:30 -06:00
Kousuke Saruta 9a151ce58b [SPARK-5294][WebUI] Hide tables in AllStagePages for "Active Stages, Completed Stages and Failed Stages" when they are empty
Related to SPARK-5228 and #4028, `AllStagesPage` should also hide the tables for `ActiveStages`, `CompleteStages` and `FailedStages` when they are empty.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4083 from sarutak/SPARK-5294 and squashes the following commits:

a7625c1 [Kousuke Saruta] Fixed conflicts
2015-01-20 16:40:46 -08:00
Kousuke Saruta 769aced9e7 [SPARK-5329][WebUI] UIWorkloadGenerator should stop SparkContext.
UIWorkloadGenerator doesn't stop its SparkContext. I ran UIWorkloadGenerator and tried to watch the result in the WebUI, but jobs were marked as finished.
It's because the SparkContext is not stopped.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4112 from sarutak/SPARK-5329 and squashes the following commits:

bcc0fa9 [Kousuke Saruta] Disabled scalastyle for a bock comment
86a3b95 [Kousuke Saruta] Fixed UIWorkloadGenerator to stop SparkContext in it
2015-01-20 12:40:55 -08:00
Jacek Lewandowski c93a57f0d6 SPARK-4660: Use correct class loader in JavaSerializer (copy of PR #3840 by Piotr Kolaczkowski)

Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>

Closes #4113 from jacek-lewandowski/SPARK-4660-master and squashes the following commits:

a5e84ca [Jacek Lewandowski] SPARK-4660: Use correct class loader in JavaSerializer (copy of PR #3840 by Piotr Kolaczkowski)
2015-01-20 12:38:01 -08:00
Jongyoul Lee 9d9294aebf [SPARK-5333][Mesos] MesosTaskLaunchData causes BufferUnderflowException
- Rewind ByteBuffer before making ByteString

(This fixes a bug introduced in #3849 / SPARK-4014)
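
The essence of the fix, sketched; protobuf's `ByteString.copyFrom` reads from the buffer's current position:

```scala
import java.nio.ByteBuffer
import com.google.protobuf.ByteString

def toByteString(buf: ByteBuffer): ByteString = {
  buf.rewind()             // position may sit at the limit after a prior read
  ByteString.copyFrom(buf) // now copies the whole payload, no underflow
}
```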

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #4119 from jongyoul/SPARK-5333 and squashes the following commits:

c6693a8 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - changed logDebug location
4141f58 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Added license information
2190606 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Adjusted imported libraries
b7f5517 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Rewind ByteBuffer before making ByteString
2015-01-20 10:18:10 -08:00
Sean Owen 306ff187af SPARK-5270 [CORE] Provide isEmpty() function in RDD API
Pretty minor, but submitted for consideration -- this would at least help people make this check in the most efficient way I know.
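
Sketched as it would sit inside RDD, using RDD's own `partitions` and `take` (mirroring the approach named in the squashed commits):

```scala
// Cheap when there are no partitions at all; otherwise fetch at most one
// element of one partition instead of counting the whole RDD.
def isEmpty(): Boolean = partitions.length == 0 || take(1).length == 0
```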

Author: Sean Owen <sowen@cloudera.com>

Closes #4074 from srowen/SPARK-5270 and squashes the following commits:

66885b8 [Sean Owen] Add note that JavaRDDLike should not be implemented by user code
2e9b490 [Sean Owen] More tests, and Mima-exclude the new isEmpty method in JavaRDDLike
28395ff [Sean Owen] Add isEmpty to Java, Python
7dd04b7 [Sean Owen] Add efficient RDD.isEmpty()
2015-01-19 22:50:45 -08:00
zsxwing e69fb8c75a [SPARK-5214][Core] Add EventLoop and change DAGScheduler to an EventLoop
This PR adds a simple `EventLoop` and uses it to replace the Actor in DAGScheduler. `EventLoop` is a general class that supports posting events from multiple threads and handling them in a single event thread.
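
A minimal sketch of the pattern (not the actual class):

```scala
import java.util.concurrent.LinkedBlockingQueue
import java.util.concurrent.atomic.AtomicBoolean

abstract class EventLoopSketch[E](name: String) {
  private val queue = new LinkedBlockingQueue[E]()
  private val stopped = new AtomicBoolean(false)

  private val eventThread = new Thread(name) {
    override def run(): Unit =
      try {
        while (!stopped.get) onReceive(queue.take()) // single consumer thread
      } catch {
        case _: InterruptedException => // stopping
      }
  }

  def start(): Unit = eventThread.start()
  def stop(): Unit = { stopped.set(true); eventThread.interrupt() }
  def post(event: E): Unit = queue.put(event) // safe from any thread

  protected def onReceive(event: E): Unit
}
```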

Author: zsxwing <zsxwing@gmail.com>

Closes #4016 from zsxwing/event-loop and squashes the following commits:

aefa1ce [zsxwing] Add protected to on*** methods
5cfac83 [zsxwing] Remove null check of eventProcessLoop
dba35b2 [zsxwing] Add a test that onReceive swallows InterruptException
460f7b3 [zsxwing] Use volatile instead of Atomic things in unit tests
227bf33 [zsxwing] Add a stop flag and some tests
37f79c6 [zsxwing] Fix docs
55fb6f6 [zsxwing] Add private[spark] to EventLoop
1f73eac [zsxwing] Fix the import order
3b2e59c [zsxwing] Add EventLoop and change DAGScheduler to an EventLoop
2015-01-19 18:15:51 -08:00
Jongyoul Lee 4a4f9ccba2 [SPARK-5088] Use spark-class for running executors directly
Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #3897 from jongyoul/SPARK-5088 and squashes the following commits:

8232aa8 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Added a listenerBus for fixing test cases
932289f [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Rebased from master
613cb47 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Fixed code if spark.executor.uri doesn't have any value - Added test cases
ff57bda [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Adjusted orders of import
97e4bd4 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Changed command for using spark-class directly - Delete sbin/spark-executor and moved some codes into spark-class' case statement
2015-01-19 02:01:56 -08:00
Ilya Ganelin 3453d578ad [SPARK-3288] All fields in TaskMetrics should be private and use getters/setters
I've updated the fields and all usages of these fields in the Spark code. I've verified that this did not break anything on my local repo.
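
Sketch of the getter/setter pattern applied; the field name is illustrative, and `private[spark]` assumes the code lives under the `org.apache.spark` package:

```scala
package org.apache.spark

class TaskMetricsSketch {
  private var _executorRunTime: Long = 0L

  // read access stays public...
  def executorRunTime: Long = _executorRunTime

  // ...but mutation is restricted to Spark internals
  private[spark] def setExecutorRunTime(v: Long): Unit = { _executorRunTime = v }
  private[spark] def incExecutorRunTime(d: Long): Unit = { _executorRunTime += d }
}
```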

Author: Ilya Ganelin <ilya.ganelin@capitalone.com>

Closes #4020 from ilganeli/SPARK-3288 and squashes the following commits:

39f3810 [Ilya Ganelin] resolved merge issues
e446287 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
b8c05cb [Ilya Ganelin] Missed making a variable private
6444391 [Ilya Ganelin] Made inc/dec functions private[spark]
1149e78 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
26b312b [Ilya Ganelin] Debugging tests
17146c2 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
5525c20 [Ilya Ganelin] Completed refactoring to make vars in TaskMetrics class private
c64da4f [Ilya Ganelin] Partially updated task metrics to make some vars private
2015-01-19 01:32:36 -08:00
Prashant Sharma 851b6a9bba SPARK-5217 Spark UI should report pending stages during job execution on AllStagesPage.
![screenshot from 2015-01-16 13 43 25](https://cloud.githubusercontent.com/assets/992952/5773256/d61df300-9d85-11e4-9b5a-6730058839fa.png)

This is a first step towards having time remaining estimates for queued and running jobs. See SPARK-5216

Author: Prashant Sharma <prashant.s@imaginea.com>

Closes #4043 from ScrapCodes/SPARK-5216/5217-show-waiting-stages and squashes the following commits:

3b11803 [Prashant Sharma] Review feedback.
0992842 [Prashant Sharma] Switched to Linked hashmap, changed the order to active->pending->completed->failed. And changed pending stages to not reverse sort.
c19d82a [Prashant Sharma] SPARK-5217 Spark UI should report pending stages during job execution on AllStagesPage.
2015-01-19 01:28:42 -08:00
Kousuke Saruta ecf943d353 [WebUI] Fix collapse of WebUI layout
When we decrease the width of the browser, the header of the WebUI wraps and collapses as in the following image.

![2015-01-11 19 49 37](https://cloud.githubusercontent.com/assets/4736016/5698887/b0b9aeee-99cd-11e4-9020-08f3f0014de0.png)

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #3995 from sarutak/fixed-collapse-webui-layout and squashes the following commits:

3e60b5b [Kousuke Saruta] Modified line-height property in webui.css
7bfb5fb [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into fixed-collapse-webui-layout
5d83e18 [Kousuke Saruta] Fixed collapse of WebUI layout
2015-01-16 12:19:08 -08:00
Kousuke Saruta e8422c521b [SPARK-5231][WebUI] History Server shows wrong job submission time.
History Server doesn't show the correct job submission time.
It's because `JobProgressListener` updates the job submission time every time the `onJobStart` method is invoked from `ReplayListenerBus`.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4029 from sarutak/SPARK-5231 and squashes the following commits:

0af9e22 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5231
da8bd14 [Kousuke Saruta] Made submissionTime in SparkListenerJobStartas and completionTime in SparkListenerJobEnd as regular Long
0412a6a [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5231
26b9b99 [Kousuke Saruta] Fixed the test cases
2d47bd3 [Kousuke Saruta] Fixed to record job submission time and completion time collectly
2015-01-16 10:05:11 -08:00
Ye Xianjin e200ac8e53 [SPARK-5201][CORE] deal with int overflow in the ParallelCollectionRDD.slice method
There is an int overflow in the ParallelCollectionRDD.slice method, originally reported by SaintBacchus.
```
sc.makeRDD(1 to (Int.MaxValue)).count       // result = 0
sc.makeRDD(1 to (Int.MaxValue - 1)).count   // result = 2147483646 = Int.MaxValue - 1
sc.makeRDD(1 until (Int.MaxValue)).count    // result = 2147483646 = Int.MaxValue - 1
```
see https://github.com/apache/spark/pull/2874 for more details.
This PR tries to fix the overflow. However, there's another issue I don't address.
```
val largeRange = Int.MinValue to Int.MaxValue
largeRange.length // throws java.lang.IllegalArgumentException: -2147483648 to 2147483647 by 1: seqs cannot contain more than Int.MaxValue elements.
```

So, the range we feed to sc.makeRDD cannot contain more than Int.MaxValue elements. This is a limitation of Scala. However, I think we may want to support that kind of range, but the fix is beyond this PR.
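
A sketch of the overflow-safe slicing arithmetic (do the multiplication in Long space and narrow to Int only at the end):

```scala
// A slice boundary like i * length / numSlices overflows Int when length is
// near Int.MaxValue; computing in Long space avoids that.
def positions(length: Long, numSlices: Int): Iterator[(Int, Int)] =
  (0 until numSlices).iterator.map { i =>
    val start = ((i * length) / numSlices).toInt
    val end   = (((i + 1) * length) / numSlices).toInt
    (start, end)
  }
```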

srowen andrewor14, would you mind taking a look at this PR?

Author: Ye Xianjin <advancedxy@gmail.com>

Closes #4002 from advancedxy/SPARk-5201 and squashes the following commits:

96265a1 [Ye Xianjin] Update slice method comment and some responding docs.
e143d7a [Ye Xianjin] Update inclusive range check for splitting inclusive range.
b3f5577 [Ye Xianjin] We can include the last element in the last slice in general for inclusive range, hence eliminate the need to check Int.MaxValue or Int.MinValue.
7d39b9e [Ye Xianjin] Convert the two cases pattern matching to one case.
651c959 [Ye Xianjin] rename sign to needsInclusiveRange. add some comments
196f8a8 [Ye Xianjin] Add test cases for ranges end with Int.MaxValue or Int.MinValue
e66e60a [Ye Xianjin] Deal with inclusive and exclusive ranges in one case. If the range is inclusive and the end of the range is (Int.MaxValue or Int.MinValue), we should use inclusive range instead of exclusive
2015-01-16 09:21:33 -08:00
WangTaoTheTonic 2be82b1e66 [SPARK-1507][YARN]specify # cores for ApplicationMaster
Based on top of changes in https://github.com/apache/spark/pull/3806.

https://issues.apache.org/jira/browse/SPARK-1507

`--driver-cores` and `spark.driver.cores` for all cluster modes and `spark.yarn.am.cores` for yarn client mode.

Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>

Closes #4018 from WangTaoTheTonic/SPARK-1507 and squashes the following commits:

01419d3 [WangTaoTheTonic] amend the args name
b255795 [WangTaoTheTonic] indet thing
d86557c [WangTaoTheTonic] some comments amend
43c9392 [WangTao] fix compile error
b39a100 [WangTao] specify # cores for ApplicationMaster
2015-01-16 09:16:56 -08:00
Kostas Sakellis a79a9f923c [SPARK-4092] [CORE] Fix InputMetrics for coalesce'd Rdds
When calculating the input metrics there was an assumption that one task only reads from one block; this is not true for some operations, including coalesce. This patch simply increments the task's input metrics if previous ones of the same read method existed.

A limitation to this patch is that if a task reads from two different blocks of different read methods, one will override the other.

Author: Kostas Sakellis <kostas@cloudera.com>

Closes #3120 from ksakellis/kostas-spark-4092 and squashes the following commits:

54e6658 [Kostas Sakellis] Drops metrics if conflicting read methods exist
f0e0cc5 [Kostas Sakellis] Add bytesReadCallback to InputMetrics
a2a36d4 [Kostas Sakellis] CR feedback
5a0c770 [Kostas Sakellis] [SPARK-4092] [CORE] Fix InputMetrics for coalesce'd Rdds
2015-01-15 18:48:39 -08:00
Kostas Sakellis 96c2c714f4 [SPARK-4857] [CORE] Adds Executor membership events to SparkListener
Adds onExecutorAdded and onExecutorRemoved events to the SparkListener. This will allow a client to get notified when an executor has been added/removed and provide additional information such as how many vcores it is consuming.

In addition, this commit adds a SparkListenerAdapter to the Java API that provides default implementations to the SparkListener. This is to get around the fact that default implementations for traits don't work in Java. Having Java clients extend SparkListenerAdapter moving forward will prevent breakage in java when we add new events to SparkListener.

Author: Kostas Sakellis <kostas@cloudera.com>

Closes #3711 from ksakellis/kostas-spark-4857 and squashes the following commits:

946d2c5 [Kostas Sakellis] Added executorAdded/Removed events to MesosSchedulerBackend
b1d054a [Kostas Sakellis] Remove executorInfo from ExecutorRemoved event
1727b38 [Kostas Sakellis] Renamed ExecutorDetails back to ExecutorInfo and other CR feedback
14fe78d [Kostas Sakellis] Added executor added/removed events to json protocol
93d087b [Kostas Sakellis] [SPARK-4857] [CORE] Adds Executor membership events to SparkListener
2015-01-15 17:53:42 -08:00
Kousuke Saruta 65858ba555 [Minor] Fix tiny typo in BlockManager
In BlockManager, there is the word `BlockTranserService`, but I think it's a typo for `BlockTransferService`.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4046 from sarutak/fix-tiny-typo and squashes the following commits:

a3e2a2f [Kousuke Saruta] Fixed tiny typo in BlockManager
2015-01-15 17:07:44 -08:00
Josh Rosen 259936be71 [SPARK-4014] Add TaskContext.attemptNumber and deprecate TaskContext.attemptId
`TaskContext.attemptId` is misleadingly-named, since it currently returns a taskId, which uniquely identifies a particular task attempt within a particular SparkContext, instead of an attempt number, which conveys how many times a task has been attempted.

This patch deprecates `TaskContext.attemptId` and add `TaskContext.taskId` and `TaskContext.attemptNumber` fields.  Prior to this change, it was impossible to determine whether a task was being re-attempted (or was a speculative copy), which made it difficult to write unit tests for tasks that fail on early attempts or speculative tasks that complete faster than original tasks.

Earlier versions of the TaskContext docs suggest that `attemptId` behaves like `attemptNumber`, so there's an argument to be made in favor of changing this method's implementation.  Since we've decided against making that change in maintenance branches, I think it's simpler to add better-named methods and retain the old behavior for `attemptId`; if `attemptId` behaved differently in different branches, then this would cause confusing build-breaks when backporting regression tests that rely on the new `attemptId` behavior.

Most of this patch is fairly straightforward, but there is a bit of trickiness related to Mesos tasks: since there's no field in MesosTaskInfo to encode the attemptId, I packed it into the `data` field alongside the task binary.
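
Sketch of the resulting surface (signatures as described; the deprecation annotation is illustrative):

```scala
abstract class TaskContextSketch {
  def taskAttemptId(): Long // unique per attempt within a SparkContext
  def attemptNumber(): Int  // 0 for the first attempt, 1 for the first retry, ...

  @deprecated("use taskAttemptId() or attemptNumber()", "1.3.0")
  def attemptId(): Long = taskAttemptId() // old behavior preserved
}
```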

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3849 from JoshRosen/SPARK-4014 and squashes the following commits:

89d03e0 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
5cfff05 [Josh Rosen] Introduce wrapper for serializing Mesos task launch data.
38574d4 [Josh Rosen] attemptId -> taskAttemptId in PairRDDFunctions
a180b88 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
1d43aa6 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
eee6a45 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
0b10526 [Josh Rosen] Use putInt instead of putLong (silly mistake)
8c387ce [Josh Rosen] Use local with maxRetries instead of local-cluster.
cbe4d76 [Josh Rosen] Preserve attemptId behavior and deprecate it:
b2dffa3 [Josh Rosen] Address some of Reynold's minor comments
9d8d4d1 [Josh Rosen] Doc typo
1e7a933 [Josh Rosen] [SPARK-4014] Change TaskContext.attemptId to return attempt number instead of task ID.
fd515a5 [Josh Rosen] Add failing test for SPARK-4014
2015-01-14 11:45:40 -08:00
Kousuke Saruta 9d4449c4b3 [SPARK-5228][WebUI] Hide tables for "Active Jobs/Completed Jobs/Failed Jobs" when they are empty
In the current WebUI, tables for Active Stages, Completed Stages, Skipped Stages and Failed Stages are hidden when they are empty, while tables for Active Jobs, Completed Jobs and Failed Jobs are not hidden even when they are empty.

This is before my patch is applied.

![2015-01-13 14 13 03](https://cloud.githubusercontent.com/assets/4736016/5730793/2b73d6f4-9b32-11e4-9a24-1784d758c644.png)

And this is after my patch is applied.

![2015-01-13 14 38 13](https://cloud.githubusercontent.com/assets/4736016/5730797/359ea2da-9b32-11e4-97b0-544739ddbf4c.png)

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4028 from sarutak/SPARK-5228 and squashes the following commits:

b1e6e8b [Kousuke Saruta] Fixed a small typo
daab563 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5228
9493a1d [Kousuke Saruta] Modified AllJobPage.scala so that hide Active Jobs/Completed Jobs/Failed Jobs when they are empty
2015-01-14 11:10:29 -08:00
WangTaoTheTonic 9dea64e53a [SPARK-4697][YARN]System properties should override environment variables
I found that some arguments in the YARN module take environment variables before system properties, while in the core module the latter override the former.

Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>

Closes #3557 from WangTaoTheTonic/SPARK4697 and squashes the following commits:

836b9ef [WangTaoTheTonic] fix type mismatch
e3e486a [WangTaoTheTonic] remove the comma
1262d57 [WangTaoTheTonic] handle spark.app.name and SPARK_YARN_APP_NAME in SparkSubmitArguments
bee9447 [WangTaoTheTonic] wrong brace
81833bb [WangTaoTheTonic] rebase
40934b4 [WangTaoTheTonic] just switch blocks
5f43f45 [WangTao] System property can override environment variable
2015-01-13 09:50:14 -08:00
WangTaoTheTonic f7741a9a72 [SPARK-5006][Deploy]spark.port.maxRetries doesn't work
https://issues.apache.org/jira/browse/SPARK-5006

I think the issue was introduced in https://github.com/apache/spark/pull/1777.

I haven't dug into Mesos's backend yet; maybe the same logic should be added there as well.

Author: WangTaoTheTonic <barneystinson@aliyun.com>
Author: WangTao <barneystinson@aliyun.com>

Closes #3841 from WangTaoTheTonic/SPARK-5006 and squashes the following commits:

8cdf96d [WangTao] indent thing
2d86d65 [WangTaoTheTonic] fix line length
7cdfd98 [WangTaoTheTonic] fit for new HttpServer constructor
61a370d [WangTaoTheTonic] some minor fixes
bc6e1ec [WangTaoTheTonic] rebase
67bcb46 [WangTaoTheTonic] put conf at 3rd position, modify suite class, add comments
f450cd1 [WangTaoTheTonic] startServiceOnPort will use a SparkConf arg
29b751b [WangTaoTheTonic] rebase as ExecutorRunnableUtil changed to ExecutorRunnable
396c226 [WangTaoTheTonic] make the grammar more like scala
191face [WangTaoTheTonic] invalid value name
62ec336 [WangTaoTheTonic] spark.port.maxRetries doesn't work
2015-01-13 09:29:25 -08:00
Michael Armbrust a3978f3e15 [SPARK-5078] Optionally read from SPARK_LOCAL_HOSTNAME
Currently Spark lets you set the IP address using SPARK_LOCAL_IP, but this is given to Akka only after a reverse DNS lookup, which makes it difficult to run Spark in Docker. You can already change the hostname that is used programmatically, but it would be nice to be able to do this with an environment variable as well.

Author: Michael Armbrust <michael@databricks.com>

Closes #3893 from marmbrus/localHostnameEnv and squashes the following commits:

85045b6 [Michael Armbrust] Optionally read from SPARK_LOCAL_HOSTNAME
2015-01-12 11:57:59 -08:00
lianhuiwang ef9224e080 [SPARK-5102][Core]subclass of MapStatus needs to be registered with Kryo
CompressedMapStatus and HighlyCompressedMapStatus need to be registered with Kryo because they are subclasses of MapStatus.
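
The gist of the fix, sketched; `Class.forName` is used here only to sidestep package-private access in the sketch:

```scala
import com.esotericsoftware.kryo.Kryo

// Kryo registration is per concrete class, so registering the parent
// MapStatus type is not enough: each subclass must be registered too.
def registerMapStatuses(kryo: Kryo): Unit = {
  kryo.register(Class.forName("org.apache.spark.scheduler.CompressedMapStatus"))
  kryo.register(Class.forName("org.apache.spark.scheduler.HighlyCompressedMapStatus"))
}
```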

Author: lianhuiwang <lianhuiwang09@gmail.com>

Closes #4007 from lianhuiwang/SPARK-5102 and squashes the following commits:

9d2238a [lianhuiwang] remove register of MapStatus
05a285d [lianhuiwang] subclass of MapStatus needs to be registered with Kryo
2015-01-12 10:57:12 -08:00
zsxwing 6942b974ad [SPARK-4951][Core] Fix the issue that a busy executor may be killed
A few changes to fix this issue:

1. Handle the case of receiving `SparkListenerTaskStart` before `SparkListenerBlockManagerAdded`.
2. Don't add `executorId` to `removeTimes` when the executor is busy.
3. Use `HashMap.retain` to safely traverse the HashMap and remove items (see the sketch after this list).
4. Use the same lock in ExecutorAllocationManager and ExecutorAllocationListener to fix the race condition in `totalPendingTasks`.
5. Move the blocking code out of the message processing code in YarnSchedulerActor.
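
For example, the `HashMap.retain` idea from item 3 might look like this (`removeTimes` and `kill` are assumed names):

```scala
import scala.collection.mutable

object IdleExecutorSketch {
  val removeTimes = mutable.HashMap[String, Long]() // executorId -> expiry time

  def removeExpired(now: Long)(kill: String => Unit): Unit =
    // retain mutates the map in place and is safe for traverse-and-remove,
    // unlike deleting entries while iterating the map directly.
    removeTimes.retain { case (executorId, expiryTime) =>
      if (expiryTime <= now) kill(executorId)
      expiryTime > now // keep only entries that have not expired yet
    }
}
```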

Author: zsxwing <zsxwing@gmail.com>

Closes #3783 from zsxwing/SPARK-4951 and squashes the following commits:

d51fa0d [zsxwing] Add comments
2e365ce [zsxwing] Remove expired executors from 'removeTimes' and add idle executors back when a new executor joins
49f61a9 [zsxwing] Eliminate duplicate executor registered warnings
d4c4e9a [zsxwing] Minor fixes for the code style
05f6238 [zsxwing] Move the blocking codes out of the message processing code
105ba3a [zsxwing] Fix the race condition in totalPendingTasks
d5c615d [zsxwing] Fix the issue that a busy executor may be killed
2015-01-11 16:23:28 -08:00
lewuathe 1656aae2b4 [SPARK-5073] spark.storage.memoryMapThreshold has two default values
Because the common OS page size is about 4KB, the default value of spark.storage.memoryMapThreshold is unified to 2 * 4096

Author: lewuathe <lewuathe@me.com>

Closes #3900 from Lewuathe/integrate-memoryMapThreshold and squashes the following commits:

e417acd [lewuathe] [SPARK-5073] Update docs/configuration
834aba4 [lewuathe] [SPARK-5073] Fix style
adcea33 [lewuathe] [SPARK-5073] Integrate memory map threshold to 2MB
fcce2e5 [lewuathe] [SPARK-5073] spark.storage.memoryMapThreshold have two default value
2015-01-11 13:50:42 -08:00
wangfei 92d9a704ce [SPARK-4871][SQL] Show sql statement in spark ui when run sql with spark-sql
Author: wangfei <wangfei1@huawei.com>

Closes #3718 from scwf/sparksqlui and squashes the following commits:

e0d6b5d [wangfei] format fix
383b505 [wangfei] fix conflicts
4d2038a [wangfei] using setJobDescription
df79837 [wangfei] fix compile error
92ce834 [wangfei] show sql statement in spark ui when run sql use spark-sql
2015-01-10 17:04:56 -08:00
mcheah e0f28e010c [SPARK-4737] Task set manager properly handles serialization errors
Dealing with [SPARK-4737], the handling of serialization errors should not be the DAGScheduler's responsibility. The task set manager now catches the error and aborts the stage.

If the TaskSetManager throws a TaskNotSerializableException, the TaskSchedulerImpl will return an empty list of task descriptions, because no tasks were started. The scheduler should abort the stage gracefully.

Note that I'm not too familiar with this part of the codebase and its place in the overall architecture of the Spark stack. If implementing it this way will have any adverse side effects, please voice that loudly.

Author: mcheah <mcheah@palantir.com>

Closes #3638 from mccheah/task-set-manager-properly-handle-ser-err and squashes the following commits:

1545984 [mcheah] Some more style fixes from Andrew Or.
5267929 [mcheah] Fixing style suggestions from Andrew Or.
dfa145b [mcheah] Fixing style from Josh Rosen's feedback
b2a430d [mcheah] Not returning empty seq when a task set cannot be serialized.
94844d7 [mcheah] Fixing compilation error, one brace too many
5f486f4 [mcheah] Adding license header for fake task class
bf5e706 [mcheah] Fixing indentation.
097e7a2 [mcheah] [SPARK-4737] Catching task serialization exception in TaskSetManager
2015-01-09 14:16:20 -08:00
WangTaoTheTonic e966452060 [SPARK-1953][YARN]yarn client mode Application Master memory size is same as driver memory size

Ways to set the Application Master's memory in yarn-client mode:
1.  `spark.yarn.am.memory` in SparkConf or System Properties
2.  default value 512m

Note: this argument is only available in yarn-client mode.

Author: WangTaoTheTonic <barneystinson@aliyun.com>

Closes #3607 from WangTaoTheTonic/SPARK4181 and squashes the following commits:

d5ceb1b [WangTaoTheTonic] spark.driver.memeory is used in both modes
6c1b264 [WangTaoTheTonic] rebase
b8410c0 [WangTaoTheTonic] minor optiminzation
ddcd592 [WangTaoTheTonic] fix the bug produced in rebase and some improvements
3bf70cc [WangTaoTheTonic] rebase and give proper hint
987b99d [WangTaoTheTonic] disable --driver-memory in client mode
2b27928 [WangTaoTheTonic] inaccurate description
b7acbb2 [WangTaoTheTonic] incorrect method invoked
2557c5e [WangTaoTheTonic] missing a single blank
42075b0 [WangTaoTheTonic] arrange the args and warn logging
69c7dba [WangTaoTheTonic] rebase
1960d16 [WangTaoTheTonic] fix wrong comment
7fa9e2e [WangTaoTheTonic] log a warning
f6bee0e [WangTaoTheTonic] docs issue
d619996 [WangTaoTheTonic] Merge branch 'master' into SPARK4181
b09c309 [WangTaoTheTonic] use code format
ab16bb5 [WangTaoTheTonic] fix bug and add comments
44e48c2 [WangTaoTheTonic] minor fix
6fd13e1 [WangTaoTheTonic] add overhead mem and remove some configs
0566bb8 [WangTaoTheTonic] yarn client mode Application Master memory size is same as driver memory size
2015-01-09 13:23:13 -08:00
Kay Ousterhout b6aa557300 [SPARK-1143] Separate pool tests into their own suite.
The current TaskSchedulerImplSuite includes some tests that are
actually for the TaskSchedulerImpl, but the remainder of the tests avoid using
the TaskSchedulerImpl entirely, and actually test the pool and scheduling
algorithm mechanisms. This commit separates the pool/scheduling algorithm
tests into their own suite, and also simplifies those tests.

The pull request replaces #339.

Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #3967 from kayousterhout/SPARK-1143 and squashes the following commits:

8a898c4 [Kay Ousterhout] [SPARK-1143] Separate pool tests into their own suite.
2015-01-09 09:47:06 -08:00
Marcelo Vanzin 48cecf673c [SPARK-4048] Enhance and extend hadoop-provided profile.
This change does a few things to make the hadoop-provided profile more useful:

- Create new profiles for other libraries / services that might be provided by the infrastructure
- Simplify and fix the poms so that the profiles are only activated while building assemblies.
- Fix tests so that they're able to run when the profiles are activated
- Add a new env variable to be used by distributions that use these profiles to provide the runtime
  classpath for Spark jobs and daemons.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #2982 from vanzin/SPARK-4048 and squashes the following commits:

82eb688 [Marcelo Vanzin] Add a comment.
eb228c0 [Marcelo Vanzin] Fix borked merge.
4e38f4e [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
9ef79a3 [Marcelo Vanzin] Alternative way to propagate test classpath to child processes.
371ebee [Marcelo Vanzin] Review feedback.
52f366d [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
83099fc [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
7377e7b [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
322f882 [Marcelo Vanzin] Fix merge fail.
f24e9e7 [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
8b00b6a [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
9640503 [Marcelo Vanzin] Cleanup child process log message.
115fde5 [Marcelo Vanzin] Simplify a comment (and make it consistent with another pom).
e3ab2da [Marcelo Vanzin] Fix hive-thriftserver profile.
7820d58 [Marcelo Vanzin] Fix CliSuite with provided profiles.
1be73d4 [Marcelo Vanzin] Restore flume-provided profile.
d1399ed [Marcelo Vanzin] Restore jetty dependency.
82a54b9 [Marcelo Vanzin] Remove unused profile.
5c54a25 [Marcelo Vanzin] Fix HiveThriftServer2Suite with *-provided profiles.
1fc4d0b [Marcelo Vanzin] Update dependencies for hive-thriftserver.
f7b3bbe [Marcelo Vanzin] Add snappy to hadoop-provided list.
9e4e001 [Marcelo Vanzin] Remove duplicate hive profile.
d928d62 [Marcelo Vanzin] Redirect child stderr to parent's log.
4d67469 [Marcelo Vanzin] Propagate SPARK_DIST_CLASSPATH on Yarn.
417d90e [Marcelo Vanzin] Introduce "SPARK_DIST_CLASSPATH".
2f95f0d [Marcelo Vanzin] Propagate classpath to child processes during testing.
1adf91c [Marcelo Vanzin] Re-enable maven-install-plugin for a few projects.
284dda6 [Marcelo Vanzin] Rework the "hadoop-provided" profile, add new ones.
2015-01-08 17:15:13 -08:00
Kousuke Saruta a00af6bec5 [SPARK-4973][CORE] Local directory in the driver of client-mode remains even after the application finishes when external shuffle is enabled
When we enable the external shuffle service, the local directories in the driver of client mode remain even after the application has finished.
I think local directories for drivers should be deleted.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #3811 from sarutak/SPARK-4973 and squashes the following commits:

ad944ab [Kousuke Saruta] Fixed DiskBlockManager to cleanup local directory if it's the driver
43770da [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-4973
88feecd [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-4973
d99718e [Kousuke Saruta] Fixed SparkSubmit.scala and DiskBlockManager.scala in order to delete local directories of the driver of local-mode when external shuffle service is enabled
2015-01-08 13:43:09 -08:00
Eric Moyer 538f221627 Document that groupByKey will OOM for large keys
This pull request is my own work and I license it under Spark's open-source license.

This contribution is an improvement to the documentation. I documented that the maximum number of values per key for groupByKey is limited by available RAM (see [Datablox][datablox link] and [the spark mailing list][list link]).

Just saying that better performance is available is not sufficient. Sometimes you need to do a group-by - your operation needs all the items available in order to complete. This warning explains the problem.

[datablox link]: http://databricks.gitbooks.io/databricks-spark-knowledge-base/content/best_practices/prefer_reducebykey_over_groupbykey.html
[list link]: http://apache-spark-user-list.1001560.n3.nabble.com/Understanding-RDD-GroupBy-OutOfMemory-Exceptions-tp11427p11466.html

Author: Eric Moyer <eric_moyer@yahoo.com>

Closes #3936 from RadixSeven/better-group-by-docs and squashes the following commits:

5b6f4e9 [Eric Moyer] groupByKey docs naming updates
238e81b [Eric Moyer] Doc that groupByKey will OOM for large keys
2015-01-08 11:55:23 -08:00
Kousuke Saruta 0a597276db [Minor] Fix the value represented by spark.executor.id for consistency.
The property  `spark.executor.id` can represent both `driver` and `<driver>`  for one driver.
It's inconsistent.

This issue is minor so I didn't file this in JIRA.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #3812 from sarutak/fix-driver-identifier and squashes the following commits:

d885498 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into fix-driver-identifier
4275663 [Kousuke Saruta] Fixed the value represented by spark.executor.id of local mode
2015-01-08 11:35:56 -08:00
Zhang, Liye 06dc4b5206 [SPARK-4989][CORE] avoid wrong eventlog conf cause cluster down in standalone mode
When enabling the event log in standalone mode, a wrong configuration can bring the standalone cluster down (the Master restarts and loses its connections to the Workers).
How to reproduce: just give an invalid value to "spark.eventLog.dir", for example: spark.eventLog.dir=hdfs://tmp/logdir1, hdfs://tmp/logdir2. This throws an IllegalArgumentException, which causes the Master to restart, and the whole cluster becomes unavailable.

Author: Zhang, Liye <liye.zhang@intel.com>

Closes #3824 from liyezhang556520/wrongConf4Cluster and squashes the following commits:

3c24d98 [Zhang, Liye] revert change with logwarning and excetption for FileNotFoundException
3c1ac2e [Zhang, Liye] change var to val
a49c52f [Zhang, Liye] revert wrong modification
12eee85 [Zhang, Liye] add more message in log and on webUI
5c1fa33 [Zhang, Liye] cache exceptions when eventlog with wrong conf
2015-01-08 10:40:26 -08:00
zsxwing 2b729d2250 [SPARK-5126][Core] Verify Spark urls before creating Actors so that invalid urls can crash the process.
Because `actorSelection` will return `deadLetters` for an invalid path, the Worker keeps quiet about an invalid master url. It's better to log an error so that people can find such problems quickly.

This PR will check the url before sending to `actorSelection`, throw and log a SparkException for an invalid url.
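
A hypothetical sketch of such a check (not the actual utility method):

```scala
import org.apache.spark.SparkException

// Accept only spark://host:port and fail loudly on anything else, so a
// misconfigured master url crashes the process instead of going unnoticed.
def parseSparkUrl(sparkUrl: String): (String, Int) = {
  val SparkUrl = """spark://([^:@/]+):([0-9]+)""".r
  sparkUrl match {
    case SparkUrl(host, port) => (host, port.toInt)
    case _ => throw new SparkException(s"Invalid master URL: $sparkUrl")
  }
}
```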

Author: zsxwing <zsxwing@gmail.com>

Closes #3927 from zsxwing/SPARK-5126 and squashes the following commits:

9d429ee [zsxwing] Create a utility method in Utils to parse Spark url; verify urls before creating Actors so that invalid urls can crash the process.
8286e51 [zsxwing] Check the url before sending to Akka and log the error if the url is invalid
2015-01-07 23:01:30 -08:00
hushan[胡珊] d345ebebd5 [SPARK-5132][Core]Correct stage Attempt Id key in stageInfofromJson
SPARK-5132: stageInfoToJson writes the attempt id under the key "Stage Attempt Id", while stageInfoFromJson reads it back under "Attempt Id", so the reader looks up a key the writer never emits (see the sketch below).
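
A minimal json4s sketch of the mismatch (illustrative, not the actual JsonProtocol code):

```scala
import org.json4s._
import org.json4s.JsonDSL._

// The writer and reader disagree on the key, so the lookup comes back empty.
val written: JValue = ("Stage Attempt Id" -> 1)
val broken = written \ "Attempt Id"       // JNothing before the fix
val fixed  = written \ "Stage Attempt Id" // JInt(1) once the keys agree
```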

Author: hushan[胡珊] <hushan@xiaomi.com>

Closes #3932 from suyanNone/json-stage and squashes the following commits:

41419ab [hushan[胡珊]] Correct stage Attempt Id key in stageInfofromJson
2015-01-07 12:09:12 -08:00
Masayoshi TSUZUKI 6e74edeca3 [SPARK-2458] Make failed application log visible on History Server
Enabled HistoryServer to show incomplete applications.
We can see the log for incomplete applications by clicking the bottom link.

Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>

Closes #3467 from tsudukim/feature/SPARK-2458-2 and squashes the following commits:

76205d2 [Masayoshi TSUZUKI] Fixed and added test code.
29a04a9 [Masayoshi TSUZUKI] Merge branch 'master' of github.com:tsudukim/spark into feature/SPARK-2458-2
f9ef854 [Masayoshi TSUZUKI] Added space between "if" and "(". Fixed "Incomplete" as capitalized in the web UI. Modified double negative variable name.
9b465b0 [Masayoshi TSUZUKI] Modified typo and better implementation.
3ed8a41 [Masayoshi TSUZUKI] Modified too long lines.
08ea14d [Masayoshi TSUZUKI] [SPARK-2458] Make failed application log visible on History Server
2015-01-07 07:32:53 -08:00
Sean Owen 4cba6eb420 SPARK-4159 [CORE] Maven build doesn't run JUnit test suites
This PR:

- Reenables `surefire`, and copies config from `scalatest` (which is itself an old fork of `surefire`, so similar)
- Tells `surefire` to test only Java tests
- Enables `surefire` and `scalatest` for all children, and in turn eliminates some duplication.

For me this causes the Scala and Java tests to be run once each, it seems, as desired. It doesn't affect the SBT build but works for Maven. I still need to verify that all of the Scala tests and Java tests are being run.

Author: Sean Owen <sowen@cloudera.com>

Closes #3651 from srowen/SPARK-4159 and squashes the following commits:

2e8a0af [Sean Owen] Remove specialized SPARK_HOME setting for REPL, YARN tests as it appears to be obsolete
12e4558 [Sean Owen] Append to unit-test.log instead of overwriting, so that both surefire and scalatest output is preserved. Also standardize/correct comments a bit.
e6f8601 [Sean Owen] Reenable Java tests by reenabling surefire with config cloned from scalatest; centralize test config in the parent
2015-01-06 12:02:08 -08:00
Reynold Xin bbcba3a943 [SPARK-5093] Set spark.network.timeout to 120s consistently.
Author: Reynold Xin <rxin@databricks.com>

Closes #3903 from rxin/timeout-120 and squashes the following commits:

7c2138e [Reynold Xin] [SPARK-5093] Set spark.network.timeout to 120s consistently.
2015-01-05 15:19:53 -08:00
Jongyoul Lee 1c0e7ce056 [SPARK-4465] runAsSparkUser doesn't affect TaskRunner in Mesos environment at all.

- fixed the scope of runAsSparkUser, moving it from MesosExecutorDriver.run to MesosExecutorBackend.launchTask
- See the Jira Issue for more details.

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #3741 from jongyoul/SPARK-4465 and squashes the following commits:

46ad71e [Jongyoul Lee] [SPARK-4465] runAsSparkUser doesn't affect TaskRunner in Mesos environment at all. - Removed unused import
3d6631f [Jongyoul Lee] [SPARK-4465] runAsSparkUser doesn't affect TaskRunner in Mesos environment at all. - Removed comments and adjusted indentations
2343f13 [Jongyoul Lee] [SPARK-4465] runAsSparkUser doesn't affect TaskRunner in Mesos environment at all. - fixed a scope of runAsSparkUser from MesosExecutorDriver.run to MesosExecutorBackend.launchTask
2015-01-05 12:05:09 -08:00
WangTao ce39b34404 [SPARK-5057] Log message in failed askWithReply attempts
https://issues.apache.org/jira/browse/SPARK-5057

Author: WangTao <barneystinson@aliyun.com>
Author: WangTaoTheTonic <barneystinson@aliyun.com>

Closes #3875 from WangTaoTheTonic/SPARK-5057 and squashes the following commits:

1503487 [WangTao] use string interpolation
706c8a7 [WangTaoTheTonic] log more messages
2015-01-05 12:00:02 -08:00
Varun Saxena d3f07fd23c [SPARK-4688] Have a single shared network timeout in Spark

Author: Varun Saxena <vsaxena.varun@gmail.com>
Author: varunsaxena <vsaxena.varun@gmail.com>

Closes #3562 from varunsaxena/SPARK-4688 and squashes the following commits:

6e97f72 [Varun Saxena] [SPARK-4688] Single shared network timeout
cd783a2 [Varun Saxena] SPARK-4688
d6f8c29 [Varun Saxena] SCALA-4688
9562b15 [Varun Saxena] SPARK-4688
a75f014 [varunsaxena] SPARK-4688
594226c [varunsaxena] SPARK-4688
2015-01-05 10:32:37 -08:00
zsxwing 5c506cecb9 [SPARK-5074][Core] Fix a non-deterministic test failure
Add `assert(sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS))` to make sure `sparkListener` receive the message.

Author: zsxwing <zsxwing@gmail.com>

Closes #3889 from zsxwing/SPARK-5074 and squashes the following commits:

e61c198 [zsxwing] Fix a non-deterministic test failure
2015-01-04 21:18:33 -08:00
zsxwing 27e7f5a723 [SPARK-5083][Core] Fix a flaky test in TaskResultGetterSuite
Because `sparkEnv.blockManager.master.removeBlock` is asynchronous, we need to make sure the block has already been removed before calling `super.enqueueSuccessfulTask`.

Author: zsxwing <zsxwing@gmail.com>

Closes #3894 from zsxwing/SPARK-5083 and squashes the following commits:

d97c03d [zsxwing] Fix a flaky test in TaskResultGetterSuite
2015-01-04 21:09:21 -08:00
zsxwing 6c726a3fbd [SPARK-5069][Core] Fix the race condition of TaskSchedulerImpl.dagScheduler
It's not necessary to set `TaskSchedulerImpl.dagScheduler` in preStart. It's safe to set it after `initializeEventProcessActor()`.

Author: zsxwing <zsxwing@gmail.com>

Closes #3887 from zsxwing/SPARK-5069 and squashes the following commits:

d95894f [zsxwing] Fix the race condition of TaskSchedulerImpl.dagScheduler
2015-01-04 21:06:04 -08:00
zsxwing 72396522bc [SPARK-5067][Core] Use '===' to compare well-defined case class
A simple fix would be adding `assert(e1.appId == e2.appId)` for `SparkListenerApplicationStart`. But actually we can use `===` for well-defined case class directly. Therefore, instead of fixing this issue, I use `===` to compare those well-defined case classes (all fields have implemented a correct `equals` method, such as primitive types)
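
A ScalaTest sketch with a stand-in case class (names hypothetical):

```scala
import org.scalatest.FunSuite

case class AppStarted(appName: String, appId: Option[String], time: Long)

class EqualitySuite extends FunSuite {
  test("well-defined case classes compare with ===") {
    val e1 = AppStarted("app", Some("id-1"), 0L)
    val e2 = AppStarted("app", Some("id-1"), 0L)
    // === reports both operands on failure, and case-class equals is
    // field-by-field, so no per-field assertions are needed.
    assert(e1 === e2)
  }
}
```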

Author: zsxwing <zsxwing@gmail.com>

Closes #3886 from zsxwing/SPARK-5067 and squashes the following commits:

0a51711 [zsxwing] Use '===' to compare well-defined case class
2015-01-04 21:03:17 -08:00
Josh Rosen 939ba1f8f6 [SPARK-4835] Disable validateOutputSpecs for Spark Streaming jobs
This patch disables output spec. validation for jobs launched through Spark Streaming, since this interferes with checkpoint recovery.

Hadoop OutputFormats have a `checkOutputSpecs` method which performs certain checks prior to writing output, such as checking whether the output directory already exists.  SPARK-1100 added checks for FileOutputFormat, SPARK-1677 (#947) added a SparkConf configuration to disable these checks, and SPARK-2309 (#1088) extended these checks to run for all OutputFormats, not just FileOutputFormat.

In Spark Streaming, we might have to re-process a batch during checkpoint recovery, so `save` actions may be called multiple times.  In addition to `DStream`'s own save actions, users might use `transform` or `foreachRDD` and call the `RDD` and `PairRDD` save actions.  When output spec. validation is enabled, the second calls to these actions will fail due to existing output.

This patch automatically disables output spec. validation for jobs submitted by the Spark Streaming scheduler.  This is done by using Scala's `DynamicVariable` to propagate the bypass setting without having to mutate SparkConf or introduce a global variable.
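
A minimal sketch of the `DynamicVariable` pattern (names are illustrative, not the actual Spark fields):

```scala
import scala.util.DynamicVariable

// The bypass flag is visible only inside withValue's dynamic scope, so
// neither SparkConf nor a global variable needs to be mutated.
val disableValidation = new DynamicVariable[Boolean](false)

def shouldValidate: Boolean = !disableValidation.value

disableValidation.withValue(true) {
  assert(!shouldValidate) // jobs submitted here skip the check
}
assert(shouldValidate)    // the default is restored outside the scope
```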

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3832 from JoshRosen/SPARK-4835 and squashes the following commits:

36eaf35 [Josh Rosen] Add comment explaining use of transform() in test.
6485cf8 [Josh Rosen] Add test case in Streaming; fix bug for transform()
7b3e06a [Josh Rosen] Remove Streaming-specific setting to undo this change; update conf. guide
bf9094d [Josh Rosen] Revise disableOutputSpecValidation() comment to not refer to Spark Streaming.
e581d17 [Josh Rosen] Deduplicate isOutputSpecValidationEnabled logic.
762e473 [Josh Rosen] [SPARK-4835] Disable validateOutputSpecs for Spark Streaming jobs.
2015-01-04 20:26:18 -08:00
Dale 3fddc9468f [SPARK-4787] Stop SparkContext if a DAGScheduler init error occurs
Author: Dale <tigerquoll@outlook.com>

Closes #3809 from tigerquoll/SPARK-4787 and squashes the following commits:

5661e01 [Dale] [SPARK-4787] Ensure that call to stop() doesn't lose the exception by using a finally block.
2172578 [Dale] [SPARK-4787] Stop context properly if an exception occurs during DAGScheduler initialization.
2015-01-04 13:29:13 -08:00
Brennon York b96008d552 [SPARK-794][Core] Remove sleep() in ClusterScheduler.stop
Removed `sleep()` from the `stop()` method of the `TaskSchedulerImpl` class which, from the JIRA ticket, is believed to be a legacy artifact slowing down testing originally introduced in the `ClusterScheduler` class.

Author: Brennon York <brennon.york@capitalone.com>

Closes #3851 from brennonyork/SPARK-794 and squashes the following commits:

04c3e64 [Brennon York] Removed sleep() from the stop() method
2015-01-04 12:40:39 -08:00
Josh Rosen 012839807c [HOTFIX] Bind web UI to ephemeral port in DriverSuite
The job launched by DriverSuite should bind the web UI to an ephemeral port, since it looks like port contention in this test has caused a large number of Jenkins failures when many builds are started simultaneously.  Our tests already disable the web UI, but this doesn't affect subprocesses launched by our tests.  In this case, I've opted to bind to an ephemeral port instead of disabling the UI because disabling features in this test may mask its ability to catch certain bugs.
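
For reference, a sketch of the ephemeral-port convention (port 0 lets the OS pick):

```scala
import org.apache.spark.SparkConf

// Port 0 asks the OS for any free ephemeral port, so concurrently running
// test JVMs no longer contend for the default UI port.
val conf = new SparkConf().set("spark.ui.port", "0")
```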

See also: e24d3a9

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3873 from JoshRosen/driversuite-webui-port and squashes the following commits:

48cd05c [Josh Rosen] [HOTFIX] Bind web UI to ephemeral port in DriverSuite.
2015-01-01 15:03:54 -08:00
Reynold Xin 7749dd6c36 [SPARK-5038] Add explicit return type for implicit functions.
As we learned in #3580, not explicitly typing implicit functions can lead to compiler bugs and potentially unexpected runtime behavior.

This is a follow up PR for rest of Spark (outside Spark SQL). The original PR for Spark SQL can be found at https://github.com/apache/spark/pull/3859
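
An illustrative contrast (hypothetical names):

```scala
object Implicits {
  // Risky: the return type is inferred, so an innocuous change to the body
  // silently changes the implicit's type at every call site.
  implicit def boxedInferred(x: Int) = Option(x)

  // Safer: an explicit return type pins the contract down.
  implicit def boxedExplicit(x: Int): Option[Int] = Option(x)
}
```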

Author: Reynold Xin <rxin@databricks.com>

Closes #3860 from rxin/implicit and squashes the following commits:

73702f9 [Reynold Xin] [SPARK-5038] Add explicit return type for implicit functions.
2014-12-31 17:07:47 -08:00
Josh Rosen e24d3a9a29 [HOTFIX] Disable Spark UI in SparkSubmitSuite tests
This should fix a major cause of build breaks when running many parallel tests.
2014-12-31 14:13:09 -08:00
Brennon York 8e14c5eb55 [SPARK-4298][Core] - The spark-submit cannot read Main-Class from Manifest.
Resolves a bug where the `Main-Class` from a .jar file wasn't being read in properly. This was caused by the fact that the `primaryResource` object was a URI and needed to be normalized through a call to `.getPath` before it could be passed into the `JarFile` object. A simplified sketch of the idea follows (hypothetical helper, not the actual SparkSubmit code):
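
```scala
import java.net.URI
import java.util.jar.JarFile

// JarFile needs a filesystem path, so a file: URI must be unwrapped with
// getPath before the manifest's Main-Class can be read.
def mainClassOf(primaryResource: String): Option[String] = {
  val jar = new JarFile(new URI(primaryResource).getPath)
  try Option(jar.getManifest).flatMap(m =>
    Option(m.getMainAttributes.getValue("Main-Class")))
  finally jar.close()
}
```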

Author: Brennon York <brennon.york@capitalone.com>

Closes #3561 from brennonyork/SPARK-4298 and squashes the following commits:

5e0fce1 [Brennon York] Use string interpolation for error messages, moved comment line from original code to above its necessary code segment
14daa20 [Brennon York] pushed mainClass assignment into match statement, removed spurious spaces, removed { } from case statements, removed return values
c6dad68 [Brennon York] Set case statement to support multiple jar URI's and enabled the 'file' URI to load the main-class
8d20936 [Brennon York] updated to reset the error message back to the default
a043039 [Brennon York] updated to split the uri and jar vals
8da7cbf [Brennon York] fixes SPARK-4298
2014-12-31 11:54:10 -08:00
Josh Rosen 352ed6bbe3 [SPARK-1010] Clean up uses of System.setProperty in unit tests
Several of our tests call System.setProperty (or test code which implicitly sets system properties) and don't always reset/clear the modified properties, which can create ordering dependencies between tests and cause hard-to-diagnose failures.

This patch removes most uses of System.setProperty from our tests, since in most cases we can use SparkConf to set these configurations (there are a few exceptions, including the tests of SparkConf itself).

For the cases where we continue to use System.setProperty, this patch introduces a `ResetSystemProperties` ScalaTest mixin class which snapshots the system properties before individual tests and to automatically restores them on test completion / failure.  See the block comment at the top of the ResetSystemProperties class for more details.
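
A condensed sketch of such a mixin (simplified; the real trait may differ in details):

```scala
import java.util.Properties
import org.scalatest.{BeforeAndAfterEach, Suite}

// Snapshot the JVM's system properties before each test and restore them
// afterwards, even when the test fails.
trait ResetSystemPropertiesSketch extends BeforeAndAfterEach { this: Suite =>
  private var saved: Properties = _

  override def beforeEach(): Unit = {
    // Hashtable#clone is a shallow copy, which is fine for String properties.
    saved = System.getProperties.clone().asInstanceOf[Properties]
    super.beforeEach()
  }

  override def afterEach(): Unit =
    try super.afterEach()
    finally System.setProperties(saved)
}
```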

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3739 from JoshRosen/cleanup-system-properties-in-tests and squashes the following commits:

0236d66 [Josh Rosen] Replace setProperty uses in two example programs / tools
3888fe3 [Josh Rosen] Remove setProperty use in LocalJavaStreamingContext
4f4031d [Josh Rosen] Add note on why SparkSubmitSuite needs ResetSystemProperties
4742a5b [Josh Rosen] Clarify ResetSystemProperties trait inheritance ordering.
0eaf0b6 [Josh Rosen] Remove setProperty call in TaskResultGetterSuite.
7a3d224 [Josh Rosen] Fix trait ordering
3fdb554 [Josh Rosen] Remove setProperty call in TaskSchedulerImplSuite
bee20df [Josh Rosen] Remove setProperty calls in SparkContextSchedulerCreationSuite
655587c [Josh Rosen] Remove setProperty calls in JobCancellationSuite
3f2f955 [Josh Rosen] Remove System.setProperty calls in DistributedSuite
cfe9cce [Josh Rosen] Remove use of system properties in SparkContextSuite
8783ab0 [Josh Rosen] Remove TestUtils.setSystemProperty, since it is subsumed by the ResetSystemProperties trait.
633a84a [Josh Rosen] Remove use of system properties in FileServerSuite
25bfce2 [Josh Rosen] Use ResetSystemProperties in UtilsSuite
1d1aa5a [Josh Rosen] Use ResetSystemProperties in SizeEstimatorSuite
dd9492b [Josh Rosen] Use ResetSystemProperties in AkkaUtilsSuite
b0daff2 [Josh Rosen] Use ResetSystemProperties in BlockManagerSuite
e9ded62 [Josh Rosen] Use ResetSystemProperties in TaskSchedulerImplSuite
5b3cb54 [Josh Rosen] Use ResetSystemProperties in SparkListenerSuite
0995c4b [Josh Rosen] Use ResetSystemProperties in SparkContextSchedulerCreationSuite
c83ded8 [Josh Rosen] Use ResetSystemProperties in SparkConfSuite
51aa870 [Josh Rosen] Use withSystemProperty in ShuffleSuite
60a63a1 [Josh Rosen] Use ResetSystemProperties in JobCancellationSuite
14a92e4 [Josh Rosen] Use withSystemProperty in FileServerSuite
628f46c [Josh Rosen] Use ResetSystemProperties in DistributedSuite
9e3e0dd [Josh Rosen] Add ResetSystemProperties test fixture mixin; use it in SparkSubmitSuite.
4dcea38 [Josh Rosen] Move withSystemProperty to TestUtils class.
2014-12-30 18:12:20 -08:00
Josh Rosen efa80a531e [SPARK-4882] Register PythonBroadcast with Kryo so that PySpark works with KryoSerializer
This PR fixes an issue where PySpark broadcast variables caused NullPointerExceptions if KryoSerializer was used.  The fix is to register PythonBroadcast with Kryo so that it's deserialized with a KryoJavaSerializer.
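
A sketch of the shape of the fix, using Kryo's stock JavaSerializer and a hypothetical class (the actual fix registers PythonBroadcast with a KryoJavaSerializer, as described above):

```scala
import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.serializers.JavaSerializer

// Hypothetical stand-in for a class that only supports Java serialization.
class JavaOnly extends java.io.Serializable

// Registering such a class with a JavaSerializer makes Kryo fall back to
// plain Java serialization for it instead of failing at runtime.
def register(kryo: Kryo): Unit =
  kryo.register(classOf[JavaOnly], new JavaSerializer())
```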

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3831 from JoshRosen/SPARK-4882 and squashes the following commits:

0466c7a [Josh Rosen] Register PythonBroadcast with Kryo.
d5b409f [Josh Rosen] Enable registrationRequired, which would have caught this bug.
069d8a7 [Josh Rosen] Add failing test for SPARK-4882
2014-12-30 09:29:52 -08:00
Zhang, Liye 9077e721cd [SPARK-4920][UI] add version on master and worker page for standalone mode
Author: Zhang, Liye <liye.zhang@intel.com>

Closes #3769 from liyezhang556520/spark-4920_WebVersion and squashes the following commits:

3bb7e0d [Zhang, Liye] add version on master and worker page
2014-12-30 09:19:47 -08:00
Yash Datta 9bc0df6804 SPARK-4968: takeOrdered to skip reduce step in case mappers return no partitions
takeOrdered should skip the reduce step in case the mapped RDDs have no partitions; a simplified sketch of the guard follows the error trace below. This prevents the mentioned exception:

Run query:
SELECT * FROM testTable WHERE market = 'market2' ORDER BY End_Time DESC LIMIT 100;
Error trace
java.lang.UnsupportedOperationException: empty collection
at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:863)
at org.apache.spark.rdd.RDD$$anonfun$reduce$1.apply(RDD.scala:863)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.reduce(RDD.scala:863)
at org.apache.spark.rdd.RDD.takeOrdered(RDD.scala:1136)
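
A simplified sketch of the guard (the real takeOrdered uses a smarter per-partition structure; this only shows the empty-partitions check):

```scala
import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

// With zero partitions there is nothing to reduce, so return an empty
// result instead of calling reduce on an empty collection.
def takeOrderedSafely[T: Ordering: ClassTag](rdd: RDD[T], num: Int): Array[T] = {
  val perPartition: RDD[Array[T]] =
    rdd.mapPartitions(it => Iterator.single(it.toArray.sorted.take(num)))
  if (perPartition.partitions.length == 0) Array.empty[T]
  else perPartition.reduce((a, b) => (a ++ b).sorted.take(num))
}
```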

Author: Yash Datta <Yash.Datta@guavus.com>

Closes #3830 from saucam/fix_takeorder and squashes the following commits:

5974d10 [Yash Datta] SPARK-4968: takeOrdered to skip reduce step in case mappers return no partitions
2014-12-29 13:49:45 -08:00
Kousuke Saruta 8d72341ab7 [Minor] Fix a typo of type parameter in JavaUtils.scala
In JavaUtils.scala, there is a typo in a type parameter. In addition, the type information is removed at compile time by erasure.

This issue is really minor so I didn't file it in JIRA.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #3789 from sarutak/fix-typo-in-javautils and squashes the following commits:

e20193d [Kousuke Saruta] Fixed a typo of type parameter
82bc5d9 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into fix-typo-in-javautils
99f6f63 [Kousuke Saruta] Fixed a typo of type parameter in JavaUtils.scala
2014-12-29 12:05:08 -08:00
YanTangZhai 815de54002 [SPARK-4946] [CORE] Using AkkaUtils.askWithReply in MapOutputTracker.askTracker to reduce the chance of communication problems

Author: YanTangZhai <hakeemzhai@tencent.com>
Author: yantangzhai <tyz0303@163.com>

Closes #3785 from YanTangZhai/SPARK-4946 and squashes the following commits:

9ca6541 [yantangzhai] [SPARK-4946] [CORE] Using AkkaUtils.askWithReply in MapOutputTracker.askTracker to reduce the chance of the communicating problem
e4c2c0a [YanTangZhai] Merge pull request #15 from apache/master
718afeb [YanTangZhai] Merge pull request #12 from apache/master
6e643f8 [YanTangZhai] Merge pull request #11 from apache/master
e249846 [YanTangZhai] Merge pull request #10 from apache/master
d26d982 [YanTangZhai] Merge pull request #9 from apache/master
76d4027 [YanTangZhai] Merge pull request #8 from apache/master
03b62b0 [YanTangZhai] Merge pull request #7 from apache/master
8a00106 [YanTangZhai] Merge pull request #6 from apache/master
cbcba66 [YanTangZhai] Merge pull request #3 from apache/master
cdef539 [YanTangZhai] Merge pull request #1 from apache/master
2014-12-29 11:30:54 -08:00
GuoQiang Li 080ceb771a [SPARK-4952][Core]Handle ConcurrentModificationExceptions in SparkEnv.environmentDetails
Author: GuoQiang Li <witgo@qq.com>

Closes #3788 from witgo/SPARK-4952 and squashes the following commits:

d903529 [GuoQiang Li] Handle ConcurrentModificationExceptions in SparkEnv.environmentDetails
2014-12-26 23:31:29 -08:00
Zhang, Liye 786808abfd [SPARK-4954][Core] add spark version information in log for standalone mode
The Master and Worker Spark versions may not be the same as the Driver's Spark version, because the Spark jar file might be replaced for a new application without restarting the Spark cluster. So the Spark version should be logged in both the Master and Worker logs.

Author: Zhang, Liye <liye.zhang@intel.com>

Closes #3790 from liyezhang556520/version4Standalone and squashes the following commits:

e05e1e3 [Zhang, Liye] add spark version infomation in log for standalone mode
2014-12-26 23:24:22 -08:00
Sean Owen 29fabb1b52 SPARK-4297 [BUILD] Build warning fixes omnibus
There are a number of warnings generated in a normal, successful build right now. They're mostly Java unchecked cast warnings, which can be suppressed. But there's a grab bag of other Scala language warnings and so on that can all be easily fixed. The forthcoming PR fixes about 90% of the build warnings I see now.

Author: Sean Owen <sowen@cloudera.com>

Closes #3157 from srowen/SPARK-4297 and squashes the following commits:

8c9e469 [Sean Owen] Suppress unchecked cast warnings, and several other build warning fixes
2014-12-24 13:32:51 -08:00
Kousuke Saruta 199e59aacd [SPARK-4881][Minor] Use SparkConf#getBoolean instead of get().toBoolean
It's really a minor issue.

In ApplicationMaster, there is code as follows.

    val preserveFiles = sparkConf.get("spark.yarn.preserve.staging.files", "false").toBoolean

I think the code can be simplified as follows.

    val preserveFiles = sparkConf.getBoolean("spark.yarn.preserve.staging.files", false)

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #3733 from sarutak/SPARK-4881 and squashes the following commits:

1771430 [Kousuke Saruta] Modified the code like sparkConf.get(...).toBoolean to sparkConf.getBoolean(...)
c63daa0 [Kousuke Saruta] Simplified code
2014-12-23 19:14:34 -08:00
Marcelo Vanzin 7e2deb71c4 [SPARK-4606] Send EOF to child JVM when there's no more data to read.
Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #3460 from vanzin/SPARK-4606 and squashes the following commits:

031207d [Marcelo Vanzin] [SPARK-4606] Send EOF to child JVM when there's no more data to read.
2014-12-23 16:07:59 -08:00
Liang-Chi Hsieh 96281cd0c3 [SPARK-4913] Fix incorrect event log path
SPARK-2261 uses a single file to log events for an app. `eventLogDir` in `ApplicationDescription` is replaced with `eventLogFile`. However, `ApplicationDescription` in `SparkDeploySchedulerBackend` is initialized with `SparkContext`'s `eventLogDir`. It is just the log directory, not the actual log file path. `Master.rebuildSparkUI` cannot correctly rebuild a new SparkUI for the app.

Because the `ApplicationDescription` is remotely registered with `Master` and the app's id is then generated in `Master`, we cannot get the app id in advance, before registration. So the received description needs to be modified with the correct `eventLogFile` value.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #3755 from viirya/fix_app_logdir and squashes the following commits:

5e0ea35 [Liang-Chi Hsieh] Revision for comment.
b5730a1 [Liang-Chi Hsieh] Fix incorrect event log path.

Closes #3777 (a duplicate PR for the same JIRA)
2014-12-23 14:58:44 -08:00
Andrew Or 27c5399f4d [SPARK-4730][YARN] Warn against deprecated YARN settings
See https://issues.apache.org/jira/browse/SPARK-4730.

Author: Andrew Or <andrew@databricks.com>

Closes #3590 from andrewor14/yarn-settings and squashes the following commits:

36e0753 [Andrew Or] Merge branch 'master' of github.com:apache/spark into yarn-settings
dcd1316 [Andrew Or] Warn against deprecated YARN settings
2014-12-23 14:28:36 -08:00
Marcelo Vanzin dd155369a0 [SPARK-4834] [standalone] Clean up application files after app finishes.
Commit 7aacb7bfa added support for sharing downloaded files among multiple
executors of the same app. That works great in Yarn, since the app's directory
is cleaned up after the app is done.

But Spark standalone mode didn't do that, so the lock/cache files created
by that change were left around and could eventually fill up the disk hosting
/tmp.

To solve that, create app-specific directories under the local dirs when
launching executors. Multiple executors launched by the same Worker will
use the same app directories, so they should be able to share the downloaded
files. When the application finishes, a new message is sent to all workers
telling them the application has finished; once that message has been received,
and all executors registered for the application shut down, then those
directories will be cleaned up by the Worker.

Note: Unit testing this is hard (if even possible), since local-cluster mode
doesn't seem to leave the Master/Worker daemons running long enough after
`sc.stop()` is called for the clean up protocol to take effect.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #3705 from vanzin/SPARK-4834 and squashes the following commits:

b430534 [Marcelo Vanzin] Remove seemingly unnecessary synchronization.
50eb4b9 [Marcelo Vanzin] Review feedback.
c0e5ea5 [Marcelo Vanzin] [SPARK-4834] [standalone] Clean up application files after app finishes.
2014-12-23 12:02:08 -08:00
zsxwing c233ab3d8d [SPARK-4818][Core] Add 'iterator' to reduce memory consumed by join
In Scala, `map` and `flatMap` of `Iterable` will copy the contents of the `Iterable` to a new `Seq`. For example,
```Scala
  val iterable = Seq(1, 2, 3).map(v => {
    println(v)
    v
  })
  println("Iterable map done")

  val iterator = Seq(1, 2, 3).iterator.map(v => {
    println(v)
    v
  })
  println("Iterator map done")
```
outputs
```
1
2
3
Iterable map done
Iterator map done
```
So we should use 'iterator' to reduce memory consumed by join.

Found by Johannes Simon in http://mail-archives.apache.org/mod_mbox/spark-user/201412.mbox/%3C5BE70814-9D03-4F61-AE2C-0D63F2DE4446%40mail.de%3E

Author: zsxwing <zsxwing@gmail.com>

Closes #3671 from zsxwing/SPARK-4824 and squashes the following commits:

48ee7b9 [zsxwing] Remove the explicit types
95d59d6 [zsxwing] Add 'iterator' to reduce memory consumed by join
2014-12-22 14:26:28 -08:00
genmao.ygm de9d7d2b5b [SPARK-4920][UI]: current spark version in UI is not prominent.
It is not convenient to see the Spark version. We can keep the same style as the Spark website.

![spark_version](https://cloud.githubusercontent.com/assets/7402327/5527025/1c8c721c-8a35-11e4-8d6a-2734f3c6bdf8.jpg)

Author: genmao.ygm <genmao.ygm@alibaba-inc.com>

Closes #3763 from uncleGen/master-clean-141222 and squashes the following commits:

0dcb9a9 [genmao.ygm] [SPARK-4920][UI]:current spark version in UI is not striking.
2014-12-22 14:14:39 -08:00
Kostas Sakellis 7c0ed13d29 [SPARK-4079] [CORE] Consolidates Errors if a CompressionCodec is not available
This commit consolidates some of the exceptions thrown if compression codecs are not available. If a bad configuration string was passed in, a ClassNotFoundException was thrown. Also, if Snappy was not available, it would throw an InvocationTargetException when the codec was being used (not when it was being initialized). Now, an IllegalArgumentException is thrown when a codec is not available at creation time - either because the class does not exist or the codec itself is not available in the system. This will allow us to have a better message and fail faster.

Author: Kostas Sakellis <kostas@cloudera.com>

Closes #3119 from ksakellis/kostas-spark-4079 and squashes the following commits:

9709c7c [Kostas Sakellis] Removed unnecessary Logging class
63bfdd0 [Kostas Sakellis] Removed isAvailable to preserve binary compatibility
1d0ef2f [Kostas Sakellis] [SPARK-4079] [CORE] Added more information to exception
64f3d27 [Kostas Sakellis] [SPARK-4079] [CORE] Code review feedback
52dfa8f [Kostas Sakellis] [SPARK-4079] [CORE] Default to LZF if Snappy not available
2014-12-22 13:07:01 -08:00
Takeshi Yamamuro fb8e85e80e [SPARK-4733] Add missing parameter comments in ShuffleDependency
Add missing Javadoc comments in ShuffleDependency.

Author: Takeshi Yamamuro <linguin.m.s@gmail.com>

Closes #3594 from maropu/DependencyJavadocFix and squashes the following commits:

32129b4 [Takeshi Yamamuro] Fix comments in @aggregator and @mapSideCombine
303c75d [Takeshi Yamamuro] [SPARK-4733] Add missing prameter comments in ShuffleDependency
2014-12-22 12:19:23 -08:00
Zhang, Liye 39272c8cdb [SPARK-4870] Add spark version to driver log
Author: Zhang, Liye <liye.zhang@intel.com>

Closes #3717 from liyezhang556520/version2Log and squashes the following commits:

ccd30d7 [Zhang, Liye] delete log in sparkConf
330f70c [Zhang, Liye] move the log from SaprkConf to SparkContext
96dc115 [Zhang, Liye] remove curly brace
e833330 [Zhang, Liye] add spark version to driver log
2014-12-22 11:38:28 -08:00
zsxwing 93b2f3a882 [SPARK-4918][Core] Reuse Text in saveAsTextFile
Reuse Text in saveAsTextFile to reduce GC.

/cc rxin

Author: zsxwing <zsxwing@gmail.com>

Closes #3762 from zsxwing/SPARK-4918 and squashes the following commits:

59f03eb [zsxwing] Reuse Text in saveAsTextFile
2014-12-22 11:20:00 -08:00
zsxwing 6ee6aa70b7 [SPARK-2075][Core] Make the compiler generate same bytes code for Hadoop 1.+ and Hadoop 2.+
`NullWritable` is a `Comparable` rather than `Comparable[NullWritable]` in Hadoop 1.+, so the compiler cannot find an implicit Ordering for it. It will generate different anonymous classes for `saveAsTextFile` in Hadoop 1.+ and Hadoop 2.+. Therefore, here we provide an Ordering for NullWritable so that the compiler will generate the same code.

I used the following commands to confirm the generated byte codes are the same.
```
mvn -Dhadoop.version=1.2.1 -DskipTests clean package -pl core -am
javap -private -c -classpath core/target/scala-2.10/classes org.apache.spark.rdd.RDD > ~/hadoop1.txt

mvn -Pyarn -Phadoop-2.2 -Dhadoop.version=2.2.0 -DskipTests clean package -pl core -am
javap -private -c -classpath core/target/scala-2.10/classes org.apache.spark.rdd.RDD > ~/hadoop2.txt

diff ~/hadoop1.txt ~/hadoop2.txt
```

However, the compiler will generate different codes for the classes which call methods of `JobContext/TaskAttemptContext`. `JobContext/TaskAttemptContext` is a class in Hadoop 1.+, and calling its method will use `invokevirtual`, while it's an interface in Hadoop 2.+, and will use `invokeinterface`.

To fix it, we can use reflection to call `JobContext/TaskAttemptContext.getConfiguration`.
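
A sketch of the reflection call:

```scala
import org.apache.hadoop.conf.Configuration

// Resolving getConfiguration by name keeps the compiled call site identical
// whether JobContext is a class (Hadoop 1, invokevirtual) or an interface
// (Hadoop 2, invokeinterface).
def getConfiguration(context: AnyRef): Configuration =
  context.getClass.getMethod("getConfiguration")
    .invoke(context).asInstanceOf[Configuration]
```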

Author: zsxwing <zsxwing@gmail.com>

Closes #3740 from zsxwing/SPARK-2075 and squashes the following commits:

39d9df2 [zsxwing] Fix the code style
e4ad8b5 [zsxwing] Use null for the implicit Ordering
734bac9 [zsxwing] Explicitly set the implicit parameters
ca03559 [zsxwing] Use reflection to access JobContext/TaskAttemptContext.getConfiguration
fa40db0 [zsxwing] Add an Ordering for NullWritable to make the compiler generate same byte codes for RDD
2014-12-21 22:10:19 -08:00
Sean Owen c6a3c0d505 SPARK-4910 [CORE] build failed (use of FileStatus.isFile in Hadoop 1.x)
Fix small Hadoop 1 compile error from SPARK-2261. In Hadoop 1.x, all we have is FileStatus.isDir, so these "is file" assertions are changed to "is not a dir". This is how similar checks are done so far in the code base.
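
For example:

```scala
import org.apache.hadoop.fs.FileStatus

// Hadoop 1.x has no FileStatus.isFile, so "is a file" checks are expressed
// as "is not a directory", which both Hadoop lines support.
def isFileCompat(status: FileStatus): Boolean = !status.isDir
```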

Author: Sean Owen <sowen@cloudera.com>

Closes #3754 from srowen/SPARK-4910 and squashes the following commits:

52c5e4e [Sean Owen] Fix small Hadoop 1 compile error from SPARK-2261
2014-12-21 13:16:57 -08:00
huangzhaowei a764960b3b [Minor] Build Failed: value defaultProperties not found
The Maven build failed: value defaultProperties not found. Maybe related to this PR:
1d648123a7
andrewor14, can you look at this problem?

Author: huangzhaowei <carlmartinmax@gmail.com>

Closes #3749 from SaintBacchus/Mvn-Build-Fail and squashes the following commits:

8e2917c [huangzhaowei] Build Failed: value defaultProperties not found
2014-12-19 23:32:56 -08:00
Kanwaljit Singh 1d648123a7 SPARK-2641: Passing num executors to spark arguments from properties file
Since we can set the Spark executor memory and executor cores using a properties file, we should also be allowed to set the number of executor instances.

Author: Kanwaljit Singh <kanwaljit.singh@guavus.com>

Closes #1657 from kjsingh/branch-1.0 and squashes the following commits:

d8a5a12 [Kanwaljit Singh] SPARK-2641: Fixing how spark arguments are loaded from properties file for num executors

Conflicts:
	core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
2014-12-19 19:27:23 -08:00
Marcelo Vanzin 456451911d [SPARK-2261] Make event logger use a single file.
Currently the event logger uses a directory and several files to
describe an app's event log, all but one of which are empty. This
is not very HDFS-friendly, since creating lots of nodes in HDFS
(especially when they don't contain any data) is frowned upon due
to the node metadata being kept in the NameNode's memory.

Instead, add a header section to the event log file that contains metadata
needed to read the events. This metadata includes things like the Spark
version (for future code that may need it for backwards compatibility) and
the compression codec used for the event data.

With the new approach, aside from reducing the load on the NN, there's
also a lot less remote calls needed when reading the log directory.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #1222 from vanzin/hist-server-single-log and squashes the following commits:

cc8f5de [Marcelo Vanzin] Store header in plain text.
c7e6123 [Marcelo Vanzin] Update comment.
59c561c [Marcelo Vanzin] Review feedback.
216c5a3 [Marcelo Vanzin] Review comments.
dce28e9 [Marcelo Vanzin] Fix log overwrite test.
f91c13e [Marcelo Vanzin] Handle "spark.eventLog.overwrite", and add unit test.
346f0b4 [Marcelo Vanzin] Review feedback.
ed0023e [Marcelo Vanzin] Merge branch 'master' into hist-server-single-log
3f4500f [Marcelo Vanzin] Unit test for SPARK-3697.
45c7a1f [Marcelo Vanzin] Version of SPARK-3697 for this branch.
b3ee30b [Marcelo Vanzin] Merge branch 'master' into hist-server-single-log
a6d5c50 [Marcelo Vanzin] Merge branch 'master' into hist-server-single-log
16fd491 [Marcelo Vanzin] Use unique log directory for each codec.
0ef3f70 [Marcelo Vanzin] Merge branch 'master' into hist-server-single-log
d93c44a [Marcelo Vanzin] Add a newline to make the header more readable.
9e928ba [Marcelo Vanzin] Add types.
bd6ba8c [Marcelo Vanzin] Review feedback.
a624a89 [Marcelo Vanzin] Merge branch 'master' into hist-server-single-log
04364dc [Marcelo Vanzin] Merge branch 'master' into hist-server-single-log
bb7c2d3 [Marcelo Vanzin] Fix scalastyle warning.
16661a3 [Marcelo Vanzin] Simplify some internal code.
cc6bce4 [Marcelo Vanzin] Some review feedback.
a722184 [Marcelo Vanzin] Do not encode metadata in log file name.
3700586 [Marcelo Vanzin] Restore log flushing.
f677930 [Marcelo Vanzin] Fix botched rebase.
ae571fa [Marcelo Vanzin] Fix end-to-end event logger test.
9db0efd [Marcelo Vanzin] Show prettier name in UI.
8f42274 [Marcelo Vanzin] Make history server parse old-style log directories.
6251dd7 [Marcelo Vanzin] Make event logger use a single file.
2014-12-19 18:23:42 -08:00
Ryan Williams 7981f96976 [SPARK-4896] don’t redundantly overwrite executor JAR deps
Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #2848 from ryan-williams/fetch-file and squashes the following commits:

c14daff [Ryan Williams] Fix copy that was changed to a move inadvertently
8e39c16 [Ryan Williams] code review feedback
788ed41 [Ryan Williams] don’t redundantly overwrite executor JAR deps
2014-12-19 15:24:41 -08:00
Ryan Williams cdb2c645ab [SPARK-4889] update history server example cmds
Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #3736 from ryan-williams/hist and squashes the following commits:

421d8ff [Ryan Williams] add another random typo fix
76d6a4c [Ryan Williams] remove hdfs example
a2d0f82 [Ryan Williams] code review feedback
9ca7629 [Ryan Williams] [SPARK-4889] update history server example cmds
2014-12-19 13:56:04 -08:00
Reynold Xin 336cd341ee Small refactoring to pass SparkEnv into Executor rather than creating SparkEnv in Executor.
This consolidates some code path and makes constructor arguments simpler for a few classes.

Author: Reynold Xin <rxin@databricks.com>

Closes #3738 from rxin/sparkEnvDepRefactor and squashes the following commits:

82e02cc [Reynold Xin] Fixed couple bugs.
217062a [Reynold Xin] Code review feedback.
bd00af7 [Reynold Xin] Small refactoring to pass SparkEnv into Executor rather than creating SparkEnv in Executor.
2014-12-19 12:51:12 -08:00
Sandy Ryza 283263ffaa SPARK-3428. TaskMetrics for running tasks is missing GC time metrics
Author: Sandy Ryza <sandy@cloudera.com>

Closes #3684 from sryza/sandy-spark-3428 and squashes the following commits:

cb827fe [Sandy Ryza] SPARK-3428. TaskMetrics for running tasks is missing GC time metrics
2014-12-18 22:40:44 -08:00
Liang-Chi Hsieh d7fc69a8b5 [SPARK-4674] Refactor getCallSite
The current version of `getCallSite` visits the collection of `StackTraceElement`s twice. However, this is unnecessary since we can perform the work in a single visit. We also do not need to keep the filtered `StackTraceElement`s.

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #3532 from viirya/refactor_getCallSite and squashes the following commits:

62aa124 [Liang-Chi Hsieh] Fix style.
e741017 [Liang-Chi Hsieh] Refactor getCallSite.
2014-12-18 21:41:02 -08:00
Andrew Or 9804a759b6 [SPARK-4754] Refactor SparkContext into ExecutorAllocationClient
This is such that the `ExecutorAllocationManager` does not take in the `SparkContext` with all of its dependencies as an argument. This prevents future developers of this class from tying it down further to the `SparkContext`, which has really become quite a monstrous object.

cc'ing pwendell who originally suggested this, and JoshRosen who may have thoughts about the trait mix-in style of `SparkContext`.

Author: Andrew Or <andrew@databricks.com>

Closes #3614 from andrewor14/dynamic-allocation-sc and squashes the following commits:

187070d [Andrew Or] Merge branch 'master' of github.com:apache/spark into dynamic-allocation-sc
59baf6c [Andrew Or] Merge branch 'master' of github.com:apache/spark into dynamic-allocation-sc
347a348 [Andrew Or] Refactor SparkContext into ExecutorAllocationClient
2014-12-18 17:38:33 -08:00
Aaron Davidson 105293a7d0 [SPARK-4837] NettyBlockTransferService should use spark.blockManager.port config
This is used in NioBlockTransferService here:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/network/nio/NioBlockTransferService.scala#L66

Author: Aaron Davidson <aaron@databricks.com>

Closes #3688 from aarondav/SPARK-4837 and squashes the following commits:

ebd2007 [Aaron Davidson] [SPARK-4837] NettyBlockTransferService should use spark.blockManager.port config
2014-12-18 16:43:16 -08:00
Ivan Vergiliev f9f58b9a01 SPARK-4743 - Use SparkEnv.serializer instead of closureSerializer in aggregateByKey and foldByKey
Author: Ivan Vergiliev <ivan@leanplum.com>

Closes #3605 from IvanVergiliev/change-serializer and squashes the following commits:

a49b7cf [Ivan Vergiliev] Use serializer instead of closureSerializer in aggregate/foldByKey.
2014-12-18 16:29:36 -08:00
Madhu Siddalingaiah d5a596d418 [SPARK-4884]: Improve Partition docs
Rewording was based on this discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-data-flow-td9804.html
This is the associated JIRA ticket: https://issues.apache.org/jira/browse/SPARK-4884

Author: Madhu Siddalingaiah <madhu@madhu.com>

Closes #3722 from msiddalingaiah/master and squashes the following commits:

79e679f [Madhu Siddalingaiah] [DOC]: improve documentation
51d14b9 [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master'
38faca4 [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master'
cbccbfe [Madhu Siddalingaiah] Documentation: replace <b> with <code> (again)
332f7a2 [Madhu Siddalingaiah] Documentation: replace <b> with <code>
cd2b05a [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master'
0fc12d7 [Madhu Siddalingaiah] Documentation: add description for repartitionAndSortWithinPartitions
2014-12-18 16:00:53 -08:00
Ilya Ganelin 3720057b8e [SPARK-3607] ConnectionManager threads.max configs on the thread pools don't work
Hi all - cleaned up the code to get rid of the unused parameter and added some discussion of the ThreadPoolExecutor parameters to explain why we can use a single threadCount instead of providing a min/max.

Author: Ilya Ganelin <ilya.ganelin@capitalone.com>

Closes #3664 from ilganeli/SPARK-3607C and squashes the following commits:

3c05690 [Ilya Ganelin] Updated documentation and refactored code to extract shared variables
2014-12-18 12:53:18 -08:00
Saisai Shao cf50631a66 [SPARK-4595][Core] Fix MetricsServlet not work issue
The `MetricsServlet` handler should be added to the web UI after it is initialized by `MetricsSystem`; otherwise the servlet handler cannot be attached.

Author: Saisai Shao <saisai.shao@intel.com>
Author: Josh Rosen <joshrosen@databricks.com>
Author: jerryshao <saisai.shao@intel.com>

Closes #3444 from jerryshao/SPARK-4595 and squashes the following commits:

434d17e [Saisai Shao] Merge pull request #10 from JoshRosen/metrics-system-cleanup
87a2292 [Josh Rosen] Guard against misuse of MetricsSystem methods.
f779fe0 [jerryshao] Fix MetricsServlet not work issue
2014-12-17 11:47:44 -08:00
Davies Liu ed362008f0 [SPARK-4437] update doc for WholeCombineFileRecordReader
update doc for WholeCombineFileRecordReader

Author: Davies Liu <davies@databricks.com>
Author: Josh Rosen <joshrosen@databricks.com>

Closes #3301 from davies/fix_doc and squashes the following commits:

1d7422f [Davies Liu] Merge pull request #2 from JoshRosen/whole-text-file-cleanup
dc3d21a [Josh Rosen] More genericization in ConfigurableCombineFileRecordReader.
95d13eb [Davies Liu] address comment
bf800b9 [Davies Liu] update doc for WholeCombineFileRecordReader
2014-12-16 11:19:36 -08:00
meiyoula c7628771da [SPARK-4792] Add error message when creating a local dir fails
Author: meiyoula <1039320815@qq.com>

Closes #3635 from XuTingjun/master and squashes the following commits:

dd1c66d [meiyoula] when old is deleted, it will throw an exception where call it
2a55bc2 [meiyoula] Update DiskBlockManager.scala
1483a4a [meiyoula] Delete multiple retries to make dir
67f7902 [meiyoula] Try some times to make dir maybe more reasonable
1c51a0c [meiyoula] Update DiskBlockManager.scala
2014-12-15 22:30:18 -08:00
wangfei 5c24759ddc [Minor][Core] fix comments in MapOutputTracker
Using driver and executor in the comments of ```MapOutputTracker``` is clearer.

Author: wangfei <wangfei1@huawei.com>

Closes #3700 from scwf/commentFix and squashes the following commits:

aa68524 [wangfei] master and worker should be driver and executor
2014-12-15 16:46:43 -08:00
Sean Owen 2a28bc6100 SPARK-785 [CORE] ClosureCleaner not invoked on most PairRDDFunctions
This looked like perhaps a simple and important one. `combineByKey` looks like it should clean its arguments' closures, and that in turn covers apparently all remaining functions in `PairRDDFunctions` which delegate to it.

Author: Sean Owen <sowen@cloudera.com>

Closes #3690 from srowen/SPARK-785 and squashes the following commits:

8df68fe [Sean Owen] Clean context of most remaining functions in PairRDDFunctions, which ultimately call combineByKey
2014-12-15 16:06:15 -08:00
Ryan Williams 8176b7a02e [SPARK-4668] Fix some documentation typos.
Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #3523 from ryan-williams/tweaks and squashes the following commits:

d2eddaa [Ryan Williams] code review feedback
ce27fc1 [Ryan Williams] CoGroupedRDD comment nit
c6cfad9 [Ryan Williams] remove unnecessary if statement
b74ea35 [Ryan Williams] comment fix
b0221f0 [Ryan Williams] fix a gendered pronoun
c71ffed [Ryan Williams] use names on a few boolean parameters
89954aa [Ryan Williams] clarify some comments in {Security,Shuffle}Manager
e465dac [Ryan Williams] Saved building-spark.md with Dillinger.io
83e8358 [Ryan Williams] fix pom.xml typo
dc4662b [Ryan Williams] typo fixes in tuning.md, configuration.md
2014-12-15 14:52:17 -08:00
Ilya Ganelin 38703bbca8 [SPARK-1037] The name of findTaskFromList & findTask in TaskSetManager.scala is confusing
Hi all - I've renamed the methods referenced in this JIRA to clarify that they modify the provided arrays (find vs. dequeue).

Author: Ilya Ganelin <ilya.ganelin@capitalone.com>

Closes #3665 from ilganeli/SPARK-1037B and squashes the following commits:

64c177c [Ilya Ganelin] Renamed deque to dequeue
f27d85e [Ilya Ganelin] Renamed private methods to clarify that they modify the provided parameters
683482a [Ilya Ganelin] Renamed private methods to clarify that they modify the provided parameters
2014-12-15 14:51:15 -08:00
Zhang, Liye 57d37f9c71 [CORE] codeStyle: make the ConcurrentHashMap definition in StorageLevel.scala uniform with other places
Author: Zhang, Liye <liye.zhang@intel.com>

Closes #2793 from liyezhang556520/uniformHashMap and squashes the following commits:

5884735 [Zhang, Liye] [CORE]codeStyle: uniform ConcurrentHashMap define in StorageLevel.scala
2014-12-10 20:44:59 -08:00
Andrew Or 4f93d0cabe [SPARK-4759] Fix driver hanging from coalescing partitions
The driver hangs sometimes when we coalesce RDD partitions. See JIRA for more details and reproduction.

This is because our use of empty string as default preferred location in `CoalescedRDDPartition` causes the `TaskSetManager` to schedule the corresponding task on host `""` (empty string). The intended semantics here, however, is that the partition does not have a preferred location, and the TSM should schedule the corresponding task accordingly.

Author: Andrew Or <andrew@databricks.com>

Closes #3633 from andrewor14/coalesce-preferred-loc and squashes the following commits:

e520d6b [Andrew Or] Oops
3ebf8bd [Andrew Or] A few comments
f370a4e [Andrew Or] Fix tests
2f7dfb6 [Andrew Or] Avoid using empty string as default preferred location
2014-12-10 14:27:53 -08:00
Ilya Ganelin 447ae2de5d [SPARK-4569] Rename 'externalSorting' in Aggregator
Hi all - I've renamed the unhelpfully named variable and added a comment clarifying what's actually happening.

Author: Ilya Ganelin <ilya.ganelin@capitalone.com>

Closes #3666 from ilganeli/SPARK-4569B and squashes the following commits:

1810394 [Ilya Ganelin] [SPARK-4569] Rename 'externalSorting' in Aggregator
e2d2092 [Ilya Ganelin] [SPARK-4569] Rename 'externalSorting' in Aggregator
d7cefec [Ilya Ganelin] [SPARK-4569] Rename 'externalSorting' in Aggregator
5b3f39c [Ilya Ganelin] [SPARK-4569] Rename  in Aggregator
2014-12-10 14:19:37 -08:00
Andrew Or faa8fd8178 [SPARK-4215] Allow requesting / killing executors only in YARN mode
Currently this doesn't do anything in other modes, so we might as well just disable it rather than having the user mistakenly rely on it.

Author: Andrew Or <andrew@databricks.com>

Closes #3615 from andrewor14/dynamic-allocation-yarn-only and squashes the following commits:

ce6487a [Andrew Or] Allow requesting / killing executors only in YARN mode
2014-12-10 12:48:24 -08:00
Kousuke Saruta 0fc637b4c2 [SPARK-4329][WebUI] HistoryPage pagenation
The current HistoryPage has links only to the previous and next pages.
I suggest adding an index to access history pages easily.

I implemented it as shown in the following screenshots.

If there are many pages, the current page +/- N pages, the head page, and the last page are indexed.

![2014-11-10 16 13 25](https://cloud.githubusercontent.com/assets/4736016/4986246/9c7bbac4-6937-11e4-8695-8634d039d5b6.png)
![2014-11-10 16 03 21](https://cloud.githubusercontent.com/assets/4736016/4986210/3951bb74-6937-11e4-8b4e-9f90d266d736.png)
![2014-11-10 16 03 39](https://cloud.githubusercontent.com/assets/4736016/4986211/3b196ad8-6937-11e4-9f81-74bc0a6dad5b.png)
![2014-11-10 16 03 49](https://cloud.githubusercontent.com/assets/4736016/4986213/40686138-6937-11e4-86c0-41100f0404f6.png)
![2014-11-10 16 04 04](https://cloud.githubusercontent.com/assets/4736016/4986215/4326c9b4-6937-11e4-87ac-0f30c86ec6e3.png)

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #3194 from sarutak/history-page-indexing and squashes the following commits:

15d3d2d [Kousuke Saruta] Simplified code
c93932e [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into history-page-indexing
1c2f605 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into history-page-indexing
76b05e3 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into history-page-indexing
b2240f8 [Kousuke Saruta] Fixed style
ec7922e [Kousuke Saruta] Simplified code
755a004 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into history-page-indexing
cfa242b [Kousuke Saruta] Added index to HistoryPage
2014-12-10 12:30:45 -08:00
Nathan Kronenfeld 94b377f944 [SPARK-4772] Clear local copies of accumulators as soon as we're done with them
Accumulators keep thread-local copies of themselves.  These copies were only cleared at the beginning of a task.  This meant that (a) the memory they used was tied up until the next task ran on that thread, and (b) if a thread died, the memory it had used for accumulators was locked up forever on that worker.

This PR clears the thread-local copies of accumulators at the end of each task, in the task's finally block, to make sure they are cleaned up between tasks.  It also stores them in a ThreadLocal object, so that if, for some reason, the thread dies, any memory they are using at the time should be freed up.
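
A self-contained sketch of the pattern (hypothetical names):

```scala
// Per-thread accumulator scratch space held in a ThreadLocal and explicitly
// cleared in the task's finally block, so the memory is reclaimable even if
// the thread dies between tasks.
object LocalAccums {
  private val local = new ThreadLocal[Map[Long, Long]] {
    override def initialValue(): Map[Long, Long] = Map.empty
  }

  def add(id: Long, delta: Long): Unit = {
    val m = local.get
    local.set(m.updated(id, m.getOrElse(id, 0L) + delta))
  }

  def snapshotAndClear(): Map[Long, Long] =
    try local.get finally local.remove()
}

// In a task runner: try { /* run the task */ } finally { LocalAccums.snapshotAndClear() }
```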

Author: Nathan Kronenfeld <nkronenfeld@oculusinfo.com>

Closes #3570 from nkronenfeld/Accumulator-Improvements and squashes the following commits:

a581f3f [Nathan Kronenfeld] Change Accumulators to private[spark] instead of adding mima exclude to get around false positive in mima tests
b6c2180 [Nathan Kronenfeld] Include MiMa exclude as per build error instructions - this version incompatibility should be irrelevent, as it will only surface if a master is talking to a worker running a different version of spark.
537baad [Nathan Kronenfeld] Fuller refactoring as intended, incorporating JR's suggestions for ThreadLocal localAccums, and keeping clear(), but also calling it in tasks' finally block, rather than just at the beginning of the task.
39a82f2 [Nathan Kronenfeld] Clear local copies of accumulators as soon as we're done with them
2014-12-09 23:53:17 -08:00
Josh Rosen f79c1cfc99 [Minor] Use <sup> tag for help icon in web UI page header
This small commit makes the `(?)` web UI help link into a superscript, which should address feedback that the current design makes it look like an error occurred or like information is missing.

Before:

![image](https://cloud.githubusercontent.com/assets/50748/5370611/a3ed0034-7fd9-11e4-870f-05bd9faad5b9.png)

After:

![image](https://cloud.githubusercontent.com/assets/50748/5370602/6c5ca8d6-7fd9-11e4-8d1a-568d71290aa7.png)

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3659 from JoshRosen/webui-help-sup and squashes the following commits:

bd72899 [Josh Rosen] Use <sup> tag for help icon in web UI page header.
2014-12-09 23:47:05 -08:00
Sandy Ryza 5e4c06f8e5 SPARK-4567. Make SparkJobInfo and SparkStageInfo serializable
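As a sketch of what this buys, assuming the fix amounts to these status-API classes implementing `java.io.Serializable` (the class and fields below are simplified stand-ins), a round trip through Java serialization:

```scala
import java.io._

// Sketch only: the class and fields loosely mirror the status API.
class JobInfoSketch(val jobId: Int, val status: String) extends Serializable

object RoundTrip {
  def main(args: Array[String]): Unit = {
    val buf = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buf)
    out.writeObject(new JobInfoSketch(1, "RUNNING"))
    out.close() // flush buffered object data before reading it back
    val in = new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    println(in.readObject().asInstanceOf[JobInfoSketch].status) // RUNNING
  }
}
```
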
Author: Sandy Ryza <sandy@cloudera.com>

Closes #3426 from sryza/sandy-spark-4567 and squashes the following commits:

cb4b8d2 [Sandy Ryza] SPARK-4567. Make SparkJobInfo and SparkStageInfo serializable
2014-12-09 16:26:07 -08:00
hushan[胡珊] 30dca924df [SPARK-4714] BlockManager.dropFromMemory() should check whether block has been removed after synchronizing on BlockInfo instance.
After synchronizing on the `info` lock in the `removeBlock`/`dropOldBlocks`/`dropFromMemory` methods in BlockManager, the block that `info` represents may have already been removed.

The three methods have the same logic to get the `info` lock:
```scala
   info = blockInfo.get(id)
   if (info != null) {
     info.synchronized {
       // do something
     }
   }
```

So there is a chance that, when a thread enters the `info.synchronized` block, `info` has already been removed from the `blockInfo` map by another thread that entered `info.synchronized` first.

The `removeBlock` and `dropOldBlocks` methods are idempotent, so it's safe for them to run on blocks that have already been removed.
But in `dropFromMemory` this may be problematic, since it may drop the data of a block that has already been removed into the disk store, invoking store operations that are not designed to handle missing blocks.

This patch fixes this issue by adding a check to `dropFromMemory` to test whether blocks have been removed by a racing thread.
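
A simplified sketch of that check, with stand-in types rather than the real BlockManager internals:

```scala
import scala.collection.concurrent.TrieMap

// Stand-in types; the point is the re-check after acquiring the lock.
object DropFromMemorySketch {
  private val blockInfo = TrieMap[String, AnyRef]()

  def dropFromMemory(id: String): Unit = {
    val info = blockInfo.get(id).orNull
    if (info != null) {
      info.synchronized {
        // The fix: another thread may have removed the block while we were
        // waiting for the lock, so re-check before touching the disk store.
        if (!blockInfo.contains(id)) {
          return // block already removed; nothing to drop
        }
        // ... move the block's bytes to the disk store, update sizes ...
      }
    }
  }
}
```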

Author: hushan[胡珊] <hushan@xiaomi.com>

Closes #3574 from suyanNone/refine-block-concurrency and squashes the following commits:

edb989d [hushan[胡珊]] Refine code style and comments position
55fa4ba [hushan[胡珊]] refine code
e57e270 [hushan[胡珊]] add check info is already remove or not while having gotten info.syn
2014-12-09 15:54:40 -08:00
Kay Ousterhout 1f5110630c [SPARK-4765] Make GC time always shown in UI.
This commit removes the GC time for each task from the set of
optional, additional metrics, and instead always shows it for
each task.

cc pwendell

Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #3622 from kayousterhout/gc_time and squashes the following commits:

15ac242 [Kay Ousterhout] Make TaskDetailsClassNames private[spark]
e71d893 [Kay Ousterhout] [SPARK-4765] Make GC time always shown in UI.
2014-12-09 15:10:36 -08:00
maji2014 b31074466a [SPARK-4691][shuffle] Restructure a few lines in shuffle code
In HashShuffleReader.scala and HashShuffleWriter.scala, there is no need to check `dep.aggregator.isEmpty` again, as this is already covered by the `dep.aggregator.isDefined` check.

In SortShuffleWriter.scala, isn't `dep.aggregator.isEmpty` clearer than `!dep.aggregator.isDefined`?
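
A small sketch of the restructured shape, with stand-in names for the shuffle reader/writer internals: match once on the `Option` instead of checking `isDefined` and then `isEmpty` again:

```scala
// Stand-in names; one pattern match replaces the paired checks.
object AggregatorCheckSketch {
  def read[K, V](aggregator: Option[Iterator[(K, V)] => Iterator[(K, V)]],
                 records: Iterator[(K, V)]): Iterator[(K, V)] =
    aggregator match {
      case Some(combine) => combine(records) // aggregation requested
      case None          => records          // pass records through unchanged
    }
}
```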

Author: maji2014 <maji3@asiainfo.com>

Closes #3553 from maji2014/spark-4691 and squashes the following commits:

bf7b14d [maji2014] change a elegant way for SortShuffleWriter.scala
10d0cf0 [maji2014] change a elegant way
d8f52dc [maji2014] code optimization for judgement
2014-12-09 13:13:12 -08:00
Sean Owen e829bfa1ab SPARK-3926 [CORE] Reopened: result of JavaRDD collectAsMap() is not serializable
My original 'fix' didn't fix anything at all. Now, there's a unit test to check whether it works. Of the two options to really fix it (copy the `Map` into a `java.util.HashMap`, or copy and modify Scala's implementation in `Wrappers.MapWrapper`), I went with the latter.
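
A rough sketch of the wrapper idea, not the actual SerializableMapWrapper: expose a Scala `Map` through `java.util.Map` while also implementing `Serializable`, which Scala's own `Wrappers.MapWrapper` does not:

```scala
import java.util.AbstractMap

// Sketch only: a read-only java.util.Map view that survives serialization.
class SerializableMapSketch[K, V](underlying: Map[K, V])
  extends AbstractMap[K, V] with java.io.Serializable {

  // AbstractMap only requires entrySet() for a read-only view.
  override def entrySet(): java.util.Set[java.util.Map.Entry[K, V]] = {
    val set = new java.util.LinkedHashSet[java.util.Map.Entry[K, V]]()
    underlying.foreach { case (k, v) =>
      set.add(new AbstractMap.SimpleImmutableEntry(k, v))
    }
    set
  }
}
```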

Author: Sean Owen <sowen@cloudera.com>

Closes #3587 from srowen/SPARK-3926 and squashes the following commits:

8586bb9 [Sean Owen] Remove unneeded no-arg constructor, and add additional note about copied code in LICENSE
7bb0e66 [Sean Owen] Make SerializableMapWrapper actually serialize, and add unit test
2014-12-08 16:13:03 -08:00
Andrew Or 65f929d5b3 [SPARK-4750] Dynamic allocation - synchronize kills
Simple omission on my part.
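
For illustration, the shape of such a fix, with stand-in names rather than the actual ExecutorAllocationManager code:

```scala
// Sketch only: the mutation now happens under the same lock the rest of the
// allocation manager uses.
class AllocationManagerSketch {
  private var executorsPendingToRemove = Set.empty[String]

  def removeExecutor(id: String): Unit = synchronized {
    executorsPendingToRemove += id // safe under concurrent kill requests
  }
}
```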

Author: Andrew Or <andrew@databricks.com>

Closes #3612 from andrewor14/dynamic-allocation-synchronization and squashes the following commits:

1f03b60 [Andrew Or] Synchronize kills
2014-12-08 16:02:33 -08:00
Christophe Préaud ab2abcb5ef [SPARK-4764] Ensure that files are fetched atomically
tempFile is created in the same directory as targetFile, so that the move from tempFile to targetFile is always atomic.
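
A self-contained sketch of the technique (names and helper are illustrative, not the actual fetch code):

```scala
import java.io.File
import java.nio.file.{Files, StandardCopyOption}

// Download into a temp file in the *same* directory as the destination,
// then rename. A rename within one directory (hence one filesystem) is
// atomic; across filesystems it would not be.
object AtomicFetchSketch {
  def fetch(write: File => Unit, targetFile: File): Unit = {
    val tempFile = File.createTempFile("fetch", ".tmp", targetFile.getParentFile)
    try {
      write(tempFile) // stream the remote contents into the temp file
      Files.move(tempFile.toPath, targetFile.toPath,
        StandardCopyOption.ATOMIC_MOVE)
    } finally {
      tempFile.delete() // no-op if the move already succeeded
    }
  }
}
```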

Author: Christophe Préaud <christophe.preaud@kelkoo.com>

Closes #2855 from preaudc/master and squashes the following commits:

9ba89ca [Christophe Préaud] Ensure that files are fetched atomically
54419ae [Christophe Préaud] Merge remote-tracking branch 'upstream/master'
c6a5590 [Christophe Préaud] Revert commit 8ea871f8130b2490f1bad7374a819bf56f0ccbbd
7456a33 [Christophe Préaud] Merge remote-tracking branch 'upstream/master'
8ea871f [Christophe Préaud] Ensure that files are fetched atomically
2014-12-08 11:44:54 -08:00
Zhang, Liye 98a7d09978 [SPARK-4005][CORE] handle message replies in receive instead of in the individual private methods
In BlockManagerMasterActor, when handling the UpdateBlockInfo message type, the replies are sent from individual private methods; they should instead be handled in the actor's Akka `receive` method.
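
A hedged sketch of the restructuring, with illustrative message and helper names: the reply is sent from `receive` itself, and the private helper just returns a value:

```scala
import akka.actor.Actor

// Sketch only: message shape and helper are stand-ins.
case class UpdateBlockInfoSketch(blockId: String, size: Long)

class MasterActorSketch extends Actor {
  def receive = {
    // Before: the helper called `sender ! ...` internally, hiding the reply.
    // After: the helper returns, and the reply stays visible here in receive.
    case UpdateBlockInfoSketch(id, size) => sender() ! updateBlockInfo(id, size)
  }

  private def updateBlockInfo(id: String, size: Long): Boolean = {
    // ... record the block's metadata ...
    true
  }
}
```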

Author: Zhang, Liye <liye.zhang@intel.com>

Closes #2853 from liyezhang556520/akkaRecv and squashes the following commits:

9b06f0a [Zhang, Liye] remove the unreachable code
bf518cd [Zhang, Liye] change the indent
242166b [Zhang, Liye] modified accroding to the comments
d4b929b [Zhang, Liye] [SPARK-4005][CORE] handle message replies in receive instead of in the individual private methods
2014-12-05 12:00:32 -08:00
Reynold Xin ed92b47e83 [SPARK-4397] Move object RDD to the front of RDD.scala.
I ran into multiple cases where the SBT/Scala compiler was confused by the implicits in continuous compilation mode. Adding explicit return types fixes the problem.
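
A simplified illustration with a stand-in conversion instead of the real ones in object RDD: annotating the implicit's return type removes the inference step that confused incremental compilation:

```scala
import scala.language.implicitConversions

// Stand-in for the pair-RDD style conversions.
class PairFunctionsSketch[K, V](self: Seq[(K, V)]) {
  def keys: Seq[K] = self.map(_._1)
}

object RDDImplicitsSketch {
  // Explicit return type instead of letting the compiler infer it:
  implicit def seqToPairFunctions[K, V](s: Seq[(K, V)]): PairFunctionsSketch[K, V] =
    new PairFunctionsSketch(s)
}
```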

Author: Reynold Xin <rxin@databricks.com>

Closes #3580 from rxin/rdd-implicit and squashes the following commits:

ee32fcd [Reynold Xin] Move object RDD to the end of the file.
b8562c9 [Reynold Xin] Merge branch 'master' of github.com:apache/spark into rdd-implicit
d4e9f85 [Reynold Xin] Code review.
a836a37 [Reynold Xin] Move object RDD to the front of RDD.scala.
2014-12-04 16:32:20 -08:00
Saldanha 743a889d27 [SPARK-4459] Change groupBy type parameter from K to U
Please see https://issues.apache.org/jira/browse/SPARK-4459
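
A stand-in sketch of why the rename matters, with simplified signatures rather than the actual Java API: when the enclosing class already binds `K`, a method type parameter also named `K` shadows it:

```scala
// Simplified stand-in signatures.
class PairCollectionSketch[K, V](data: Seq[(K, V)]) {
  // Before: def groupBy[K](f: ((K, V)) => K) -- the inner K shadows the
  // class's K. After: a fresh name keeps the two parameters distinct.
  def groupBy[U](f: ((K, V)) => U): Map[U, Seq[(K, V)]] = data.groupBy(f)
}
```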

Author: Saldanha <saldaal1@phusca-l24858.wlan.na.novartis.net>

Closes #3327 from alokito/master and squashes the following commits:

54b1095 [Saldanha] [SPARK-4459] changed type parameter for keyBy from K to U
d5f73c3 [Saldanha] [SPARK-4459] added keyBy test
316ad77 [Saldanha] SPARK-4459 changed type parameter for groupBy from K to U.
62ddd4b [Saldanha] SPARK-4459 added failing unit test
2014-12-04 14:57:41 -08:00