Commit graph

1421 commits

Author SHA1 Message Date
Hong Shen 8e1c00dbf4 [SPARK-6738] [CORE] Improve estimate the size of a large array
Currently, SizeEstimator.visitArray is not correct in the follow case,
```
array size > 200,
elem has the share object
```

when I add a debug log in SizeTracker.scala:
```
 System.err.println(s"numUpdates:$numUpdates, size:$ts, bytesPerUpdate:$bytesPerUpdate, cost time:$b")
```
I get the following log:
```
 numUpdates:1, size:262448, bytesPerUpdate:0.0, cost time:35
 numUpdates:2, size:420698, bytesPerUpdate:158250.0, cost time:35
 numUpdates:4, size:420754, bytesPerUpdate:28.0, cost time:32
 numUpdates:7, size:420754, bytesPerUpdate:0.0, cost time:27
 numUpdates:12, size:420754, bytesPerUpdate:0.0, cost time:28
 numUpdates:20, size:420754, bytesPerUpdate:0.0, cost time:25
 numUpdates:32, size:420754, bytesPerUpdate:0.0, cost time:21
 numUpdates:52, size:420754, bytesPerUpdate:0.0, cost time:20
 numUpdates:84, size:420754, bytesPerUpdate:0.0, cost time:20
 numUpdates:135, size:420754, bytesPerUpdate:0.0, cost time:20
 numUpdates:216, size:420754, bytesPerUpdate:0.0, cost time:11
 numUpdates:346, size:420754, bytesPerUpdate:0.0, cost time:6
 numUpdates:554, size:488911, bytesPerUpdate:327.67788461538464, cost time:8
 numUpdates:887, size:2312259426, bytesPerUpdate:6942253.798798799, cost time:198
15/04/21 14:27:26 INFO collection.ExternalAppendOnlyMap: Thread 51 spilling in-memory map of 3.0 GB to disk (1 time so far)
15/04/21 14:27:26 INFO collection.ExternalAppendOnlyMap: /data11/yarnenv/local/usercache/spark/appcache/application_1426746631567_11745/spark-local-20150421142719-c001/30/temp_local_066af981-c2fc-4b70-a00e-110e23006fbc
```
But in fact the file size is only 162K:
```
$ ll -h /data11/yarnenv/local/usercache/spark/appcache/application_1426746631567_11745/spark-local-20150421142719-c001/30/temp_local_066af981-c2fc-4b70-a00e-110e23006fbc
-rw-r----- 1 spark users 162K Apr 21 14:27 /data11/yarnenv/local/usercache/spark/appcache/application_1426746631567_11745/spark-local-20150421142719-c001/30/temp_local_066af981-c2fc-4b70-a00e-110e23006fbc
```

In order to test case, I change visitArray to:
```
       var size = 0l
         for (i <- 0 until length) {
          val obj = JArray.get(array, i)
          size += SizeEstimator.estimate(obj, state.visited).toLong
        }
       state.size += size
```
I get the following log:
```
...
14895 277016088 566.9046118590662 time:8470
23832 281840544 552.3308270676691 time:8031
38132 289891824 539.8294729775092 time:7897
61012 302803640 563.0265734265735 time:13044
97620 322904416 564.3276223776223 time:13554
15/04/14 11:46:43 INFO collection.ExternalAppendOnlyMap: Thread 51 spilling in-memory map of 314.5 MB to disk (1 time so far)
15/04/14 11:46:43 INFO collection.ExternalAppendOnlyMap: /data1/yarnenv/local/usercache/spark/appcache/application_1426746631567_8477/spark-local-20150414114020-2fcb/14/temp_local_5b6b98d5-5bfa-47e2-8216-059482ccbda0
```
 the file size is 85M.
```
$ ll -h /data1/yarnenv/local/usercache/spark/appcache/application_1426746631567_8477/spark- local-20150414114020-2fcb/14/
total 85M
-rw-r----- 1 spark users 85M Apr 14 11:46 temp_local_5b6b98d5-5bfa-47e2-8216-059482ccbda0
```

The following log is when I use this patch,
```
....
numUpdates:32, size:365484, bytesPerUpdate:0.0, cost time:7
numUpdates:52, size:365484, bytesPerUpdate:0.0, cost time:5
numUpdates:84, size:365484, bytesPerUpdate:0.0, cost time:5
numUpdates:135, size:372208, bytesPerUpdate:131.84313725490196, cost time:86
numUpdates:216, size:379020, bytesPerUpdate:84.09876543209876, cost time:21
numUpdates:346, size:1865208, bytesPerUpdate:11432.215384615385, cost time:23
numUpdates:554, size:2052380, bytesPerUpdate:899.8653846153846, cost time:16
numUpdates:887, size:2142820, bytesPerUpdate:271.59159159159157, cost time:15
..
numUpdates:14895, size:251675500, bytesPerUpdate:438.5263157894737, cost time:13
numUpdates:23832, size:257010268, bytesPerUpdate:596.9305135951662, cost time:14
numUpdates:38132, size:263922396, bytesPerUpdate:483.3655944055944, cost time:15
numUpdates:61012, size:268962596, bytesPerUpdate:220.28846153846155, cost time:24
numUpdates:97620, size:286980644, bytesPerUpdate:492.1888111888112, cost time:22
15/04/21 14:45:12 INFO collection.ExternalAppendOnlyMap: Thread 53 spilling in-memory map of 328.7 MB to disk (1 time so far)
15/04/21 14:45:12 INFO collection.ExternalAppendOnlyMap: /data4/yarnenv/local/usercache/spark/appcache/application_1426746631567_11758/spark-local-20150421144456-a2a5/2a/temp_local_9c109510-af16-4468-8f23-48cad04da88f
```
 the file size is 88M.
```
$ ll -h /data4/yarnenv/local/usercache/spark/appcache/application_1426746631567_11758/spark-local-20150421144456-a2a5/2a/
total 88M
-rw-r----- 1 spark users 88M Apr 21 14:45 temp_local_9c109510-af16-4468-8f23-48cad04da88f
```

Author: Hong Shen <hongshen@tencent.com>

Closes #5608 from shenh062326/my_change5 and squashes the following commits:

5506bae [Hong Shen] Fix compile error
c275dd3 [Hong Shen] Alter code style
fe202a2 [Hong Shen] Change the code style and add documentation.
a9fca84 [Hong Shen] Add test case for SizeEstimator
4877eee [Hong Shen] Improve estimate the size of a large array
a2ea7ac [Hong Shen] Alter code style
4c28e36 [Hong Shen] Improve estimate the size of a large array
2015-04-27 18:59:45 -04:00
Steven She b9de9e040a [SPARK-7103] Fix crash with SparkContext.union when RDD has no partitioner
Added a check to the SparkContext.union method to check that a partitioner is defined on all RDDs when instantiating a PartitionerAwareUnionRDD.

Author: Steven She <steven@canopylabs.com>

Closes #5679 from stevencanopy/SPARK-7103 and squashes the following commits:

5a3d846 [Steven She] SPARK-7103: Fix crash with SparkContext.union when at least one RDD has no partitioner
2015-04-27 18:55:02 -04:00
Kay Ousterhout 03e85b4a11 [SPARK-7046] Remove InputMetrics from BlockResult
This is a code cleanup.

The BlockResult class originally contained an InputMetrics object so that InputMetrics could
directly be used as the InputMetrics for the whole task. Now we copy the fields out of here, and
the presence of this object is confusing because it's only a partial input metrics (it doesn't
include the records read). Because this object is no longer useful (and is confusing), it should
be removed.

Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #5627 from kayousterhout/SPARK-7046 and squashes the following commits:

bf64bbe [Kay Ousterhout] Import fix
a08ca19 [Kay Ousterhout] [SPARK-7046] Remove InputMetrics from BlockResult
2015-04-22 21:42:09 -07:00
zsxwing 33b85620f9 [SPARK-7052][Core] Add ThreadUtils and move thread methods from Utils to ThreadUtils
As per rxin 's suggestion in https://github.com/apache/spark/pull/5392/files#r28757176

What's more, there is a race condition in the global shared `daemonThreadFactoryBuilder`. `daemonThreadFactoryBuilder` may be modified by multiple threads. This PR removed the global `daemonThreadFactoryBuilder` and created a new `ThreadFactoryBuilder` every time.

Author: zsxwing <zsxwing@gmail.com>

Closes #5631 from zsxwing/thread-utils and squashes the following commits:

9fe5b0e [zsxwing] Add ThreadUtils and move thread methods from Utils to ThreadUtils
2015-04-22 11:08:59 -07:00
zsxwing 3a3f7100f4 [SPARK-6490][Docs] Add docs for rpc configurations
Added docs for rpc configurations and also fixed two places that should have been fixed in #5595.

Author: zsxwing <zsxwing@gmail.com>

Closes #5607 from zsxwing/SPARK-6490-docs and squashes the following commits:

25a6736 [zsxwing] Increase the default timeout to 120s
6e37c30 [zsxwing] Update docs
5577540 [zsxwing] Use spark.network.timeout as the default timeout if it presents
4f07174 [zsxwing] Fix unit tests
1c2cf26 [zsxwing] Add docs for rpc configurations
2015-04-21 18:37:53 -07:00
Marcelo Vanzin e72c16e30d [SPARK-6014] [core] Revamp Spark shutdown hooks, fix shutdown races.
This change adds some new utility code to handle shutdown hooks in
Spark. The main goal is to take advantage of Hadoop 2.x's API for
shutdown hooks, which allows Spark to register a hook that will
run before the one that cleans up HDFS clients, and thus avoids
some races that would cause exceptions to show up and other issues
such as failure to properly close event logs.

Unfortunately, Hadoop 1.x does not have such APIs, so in that case
correctness is still left to chance.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #5560 from vanzin/SPARK-6014 and squashes the following commits:

edfafb1 [Marcelo Vanzin] Better scaladoc.
fcaeedd [Marcelo Vanzin] Merge branch 'master' into SPARK-6014
e7039dc [Marcelo Vanzin] [SPARK-6014] [core] Revamp Spark shutdown hooks, fix shutdown races.
2015-04-21 20:33:57 -04:00
Josh Rosen f83c0f112d [SPARK-3386] Share and reuse SerializerInstances in shuffle paths
This patch modifies several shuffle-related code paths to share and re-use SerializerInstances instead of creating new ones.  Some serializers, such as KryoSerializer or SqlSerializer, can be fairly expensive to create or may consume moderate amounts of memory, so it's probably best to avoid unnecessary serializer creation in hot code paths.

The key change in this patch is modifying `getDiskWriter()` / `DiskBlockObjectWriter` to accept `SerializerInstance`s instead of `Serializer`s (which are factories for instances).  This allows the disk writer's creator to decide whether the serializer instance can be shared or re-used.

The rest of the patch modifies several write and read paths to use shared serializers.  One big win is in `ShuffleBlockFetcherIterator`, where we used to create a new serializer per received block.  Similarly, the shuffle write path used to create a new serializer per file even though in many cases only a single thread would be writing to a file at a time.

I made a small serializer reuse optimization in CoarseGrainedExecutorBackend as well, since it seemed like a small and obvious improvement.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #5606 from JoshRosen/SPARK-3386 and squashes the following commits:

f661ce7 [Josh Rosen] Remove thread local; add comment instead
64f8398 [Josh Rosen] Use ThreadLocal for serializer instance in CoarseGrainedExecutorBackend
aeb680e [Josh Rosen] [SPARK-3386] Reuse SerializerInstance in shuffle code paths
2015-04-21 16:24:15 -07:00
zsxwing 8136810dfa [SPARK-6490][Core] Add spark.rpc.* and deprecate spark.akka.*
Deprecated `spark.akka.num.retries`, `spark.akka.retry.wait`, `spark.akka.askTimeout`,  `spark.akka.lookupTimeout`, and added `spark.rpc.num.retries`, `spark.rpc.retry.wait`, `spark.rpc.askTimeout`, `spark.rpc.lookupTimeout`.

Author: zsxwing <zsxwing@gmail.com>

Closes #5595 from zsxwing/SPARK-6490 and squashes the following commits:

e0d80a9 [zsxwing] Use getTimeAsMs and getTimeAsSeconds and other minor fixes
31dbe69 [zsxwing] Add spark.rpc.* and deprecate spark.akka.*
2015-04-20 23:18:42 -07:00
GuoQiang Li 0424da68d4 [SPARK-6963][CORE]Flaky test: o.a.s.ContextCleanerSuite automatically cleanup checkpoint
cc andrewor14

Author: GuoQiang Li <witgo@qq.com>

Closes #5548 from witgo/SPARK-6963 and squashes the following commits:

964aea7 [GuoQiang Li] review commits
b08b3c9 [GuoQiang Li] Flaky test: o.a.s.ContextCleanerSuite automatically cleanup checkpoint
2015-04-19 09:37:09 +01:00
Olivier Girardot 8fbd45c74e SPARK-6993 : Add default min, max methods for JavaDoubleRDD
The default method will use Guava's Ordering instead of
java.util.Comparator.naturalOrder() because it's not available
in Java 7, only in Java 8.

Author: Olivier Girardot <o.girardot@lateral-thoughts.com>

Closes #5571 from ogirardot/master and squashes the following commits:

7fe2e9e [Olivier Girardot] SPARK-6993 : Add default min, max methods for JavaDoubleRDD
2015-04-18 18:21:44 -07:00
Marcelo Vanzin 1991337336 [SPARK-5933] [core] Move config deprecation warnings to SparkConf.
I didn't find many deprecated configs after a grep-based search,
but the ones I could find were moved to the centralized location
in SparkConf.

While there, I deprecated a couple more HS configs that mentioned
time units.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #5562 from vanzin/SPARK-5933 and squashes the following commits:

dcb617e7 [Marcelo Vanzin] [SPARK-5933] [core] Move config deprecation warnings to SparkConf.
2015-04-17 19:02:07 -07:00
Jongyoul Lee 6fbeb82e13 [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode
- Defined executorCores from "spark.mesos.executor.cores"
- Changed the amount of mesosExecutor's cores to executorCores.
- Added new configuration option on running-on-mesos.md

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #5063 from jongyoul/SPARK-6350 and squashes the following commits:

9238d6e [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Fixed docs - Changed configuration name - Made mesosExecutorCores private
2d41241 [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Fixed docs
89edb4f [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Fixed docs
8ba7694 [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Fixed docs
7549314 [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Fixed docs
4ae7b0c [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Removed TODO
c27efce [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Fixed Mesos*Suite for supporting integer WorkerOffers - Fixed Documentation
1fe4c03 [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Change available resources of cpus to integer value beacuse WorkerOffer support the amount cpus as integer value
5f3767e [Jongyoul Lee] Revert "[SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode"
4b7c69e [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Changed configruation name and description from "spark.mesos.executor.cores" to "spark.executor.frameworkCores"
0556792 [Jongyoul Lee] [SPARK-6350][Mesos] Make mesosExecutorCores configurable in mesos "fine-grained" mode - Defined executorCores from "spark.mesos.executor.cores" - Changed the amount of mesosExecutor's cores to executorCores. - Added new configuration option on running-on-mesos.md
2015-04-17 18:30:55 -07:00
Ilya Ganelin c5ed510135 [SPARK-6703][Core] Provide a way to discover existing SparkContext's
I've added a static getOrCreate method to the static SparkContext object that allows one to either retrieve a previously created SparkContext or to instantiate a new one with the provided config. The method accepts an optional SparkConf to make usage intuitive.

Still working on a test for this, basically want to create a new context from scratch, then ensure that subsequent calls don't overwrite that.

Author: Ilya Ganelin <ilya.ganelin@capitalone.com>

Closes #5501 from ilganeli/SPARK-6703 and squashes the following commits:

db9a963 [Ilya Ganelin] Closing second spark context
1dc0444 [Ilya Ganelin] Added ref equality check
8c884fa [Ilya Ganelin] Made getOrCreate synchronized
cb0c6b7 [Ilya Ganelin] Doc updates and code cleanup
270cfe3 [Ilya Ganelin] [SPARK-6703] Documentation fixes
15e8dea [Ilya Ganelin] Updated comments and added MiMa Exclude
0e1567c [Ilya Ganelin] Got rid of unecessary option for AtomicReference
dfec4da [Ilya Ganelin] Changed activeContext to AtomicReference
733ec9f [Ilya Ganelin] Fixed some bugs in test code
8be2f83 [Ilya Ganelin] Replaced match with if
e92caf7 [Ilya Ganelin] [SPARK-6703] Added test to ensure that getOrCreate both allows creation, retrieval, and a second context if desired
a99032f [Ilya Ganelin] Spacing fix
d7a06b8 [Ilya Ganelin] Updated SparkConf class to add getOrCreate method. Started test suite implementation
2015-04-17 18:28:42 -07:00
Marcelo Vanzin 4527761bcd [SPARK-6046] [core] Reorganize deprecated config support in SparkConf.
This change tries to follow the chosen way for handling deprecated
configs in SparkConf: all values (old and new) are kept in the conf
object, and newer names take precedence over older ones when
retrieving the value.

Warnings are logged when config options are set, which generally happens
on the driver node (where the logs are most visible).

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #5514 from vanzin/SPARK-6046 and squashes the following commits:

9371529 [Marcelo Vanzin] Avoid math.
6cf3f11 [Marcelo Vanzin] Review feedback.
2445d48 [Marcelo Vanzin] Fix (and cleanup) update interval initialization.
b6824be [Marcelo Vanzin] Clean up the other deprecated config use also.
ab20351 [Marcelo Vanzin] Update FsHistoryProvider to only retrieve new config key.
2c93209 [Marcelo Vanzin] [SPARK-6046] [core] Reorganize deprecated config support in SparkConf.
2015-04-17 11:06:01 +01:00
Sean Owen f7a25644ed SPARK-6846 [WEBUI] Stage kill URL easy to accidentally trigger and possibility for security issue
kill endpoints now only accept a POST (kill stage, master kill app, master kill driver); kill link now POSTs

Author: Sean Owen <sowen@cloudera.com>

Closes #5528 from srowen/SPARK-6846 and squashes the following commits:

137ac9f [Sean Owen] Oops, fix scalastyle line length probelm
7c5f961 [Sean Owen] Add Imran's test of kill link
59f447d [Sean Owen] kill endpoints now only accept a POST (kill stage, master kill app, master kill driver); kill link now POSTs
2015-04-17 11:02:31 +01:00
Marcelo Vanzin de4fa6b6d1 [SPARK-4194] [core] Make SparkContext initialization exception-safe.
SparkContext has a very long constructor, where multiple things are
initialized, multiple threads are spawned, and multiple opportunities
for exceptions to be thrown exist. If one of these happens at an
innoportune time, lots of garbage tends to stick around.

This patch re-organizes SparkContext so that its internal state is
initialized in a big "try" block. The fields keeping state are now
completely private to SparkContext, and are "vars", because Scala
doesn't allow you to initialize a val later. The existing API interface
is kept by turning vals into defs (which works because Scala guarantees
the same binary interface for those).

On top of that, a few things in other areas were changed to avoid more
things leaking:

- Executor was changed to explicitly wait for the heartbeat thread to
  stop. LocalBackend was changed to wait for the "StopExecutor"
  message to be received, since otherwise there could be a race
  between that message arriving and the actor system being shut down.
- ConnectionManager could possibly hang during shutdown, because an
  interrupt at the wrong moment could cause the selector thread to
  still call select and then wait forever. So also wake up the
  selector so that this situation is avoided.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #5335 from vanzin/SPARK-4194 and squashes the following commits:

746b661 [Marcelo Vanzin] Fix borked merge.
80fc00e [Marcelo Vanzin] Merge branch 'master' into SPARK-4194
408dada [Marcelo Vanzin] Merge branch 'master' into SPARK-4194
2621609 [Marcelo Vanzin] Merge branch 'master' into SPARK-4194
6b73fcb [Marcelo Vanzin] Scalastyle.
c671c46 [Marcelo Vanzin] Fix merge.
3979aad [Marcelo Vanzin] Merge branch 'master' into SPARK-4194
8caa8b3 [Marcelo Vanzin] [SPARK-4194] [core] Make SparkContext initialization exception-safe.
071f16e [Marcelo Vanzin] Nits.
27456b9 [Marcelo Vanzin] More exception safety.
a0b0881 [Marcelo Vanzin] Stop alloc manager before scheduler.
5545d83 [Marcelo Vanzin] [SPARK-6650] [core] Stop ExecutorAllocationManager when context stops.
2015-04-16 10:48:31 +01:00
Kousuke Saruta 4d4b249274 [SPARK-6769][YARN][TEST] Usage of the ListenerBus in YarnClusterSuite is wrong
In YarnClusterSuite, a test case uses `SaveExecutorInfo`  to handle ExecutorAddedEvent as follows.

```
private class SaveExecutorInfo extends SparkListener {
  val addedExecutorInfos = mutable.Map[String, ExecutorInfo]()

  override def onExecutorAdded(executor: SparkListenerExecutorAdded) {
    addedExecutorInfos(executor.executorId) = executor.executorInfo
  }
}

...

    listener = new SaveExecutorInfo
    val sc = new SparkContext(new SparkConf()
      .setAppName("yarn \"test app\" 'with quotes' and \\back\\slashes and $dollarSigns"))
    sc.addSparkListener(listener)
    val status = new File(args(0))
    var result = "failure"
    try {
      val data = sc.parallelize(1 to 4, 4).collect().toSet
      assert(sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS))
      data should be (Set(1, 2, 3, 4))
      result = "success"
    } finally {
      sc.stop()
      Files.write(result, status, UTF_8)
    }
```

But, the usage is wrong because Executors will spawn during initializing SparkContext and SparkContext#addSparkListener should be invoked after the initialization, thus after Executors spawn, so SaveExecutorInfo cannot handle ExecutorAddedEvent.

Following code refers the result of the handling ExecutorAddedEvent. Because of the reason above, we cannot reach the assertion.

```
    // verify log urls are present
    listener.addedExecutorInfos.values.foreach { info =>
      assert(info.logUrlMap.nonEmpty)
    }
```

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #5417 from sarutak/SPARK-6769 and squashes the following commits:

8adc8ba [Kousuke Saruta] Fixed compile error
e258530 [Kousuke Saruta] Fixed style
591cf3e [Kousuke Saruta] Fixed style
48ec89a [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-6769
860c965 [Kousuke Saruta] Simplified code
207d325 [Kousuke Saruta] Added findListenersByClass method to ListenerBus
2408c84 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-6769
2d7e409 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-6769
3874adf [Kousuke Saruta] Fixed the usage of listener bus in LogUrlsStandaloneSuite
153a91b [Kousuke Saruta] Fixed the usage of listener bus in YarnClusterSuite
2015-04-14 14:01:55 -07:00
GuoQiang Li 25998e4d73 [SPARK-2033] Automatically cleanup checkpoint
Author: GuoQiang Li <witgo@qq.com>

Closes #855 from witgo/cleanup_checkpoint_date and squashes the following commits:

1649850 [GuoQiang Li] review commit
c0087e0 [GuoQiang Li] Automatically cleanup checkpoint
2015-04-14 12:56:47 -07:00
Timothy Chen 320bca4508 [SPARK-6081] Support fetching http/https uris in driver runner.
Currently if passed uris such as http/https, it won't able to fetch them as it only calls HadoopFs get.
This fix utilizes the existing util method to fetch remote uris as well.

Author: Timothy Chen <tnachen@gmail.com>

Closes #4832 from tnachen/driver_remote and squashes the following commits:

aa52cd6 [Timothy Chen] Support fetching remote uris in driver runner.
2015-04-14 11:49:04 -07:00
Erik van Oosten 51b306b930 SPARK-6878 [CORE] Fix for sum on empty RDD fails with exception
Author: Erik van Oosten <evanoosten@ebay.com>

Closes #5489 from erikvanoosten/master and squashes the following commits:

1c91954 [Erik van Oosten] Rewrote double range matcher to an exact equality assert (SPARK-6878)
f1708c9 [Erik van Oosten] Fix for sum on empty RDD fails with exception (SPARK-6878)
2015-04-14 12:39:56 +01:00
Ilya Ganelin c4ab255e94 [SPARK-5931][CORE] Use consistent naming for time properties
I've added new utility methods to do the conversion from times specified as e.g. 120s, 240ms, 360us to convert to a consistent internal representation. I've updated usage of these constants throughout the code to be consistent.

I believe I've captured all usages of time-based properties throughout the code. I've also updated variable names in a number of places to reflect their units for clarity and updated documentation where appropriate.

Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
Author: Ilya Ganelin <ilganeli@gmail.com>

Closes #5236 from ilganeli/SPARK-5931 and squashes the following commits:

4526c81 [Ilya Ganelin] Update configuration.md
de3bff9 [Ilya Ganelin] Fixing style errors
f5fafcd [Ilya Ganelin] Doc updates
951ca2d [Ilya Ganelin] Made the most recent round of changes
bc04e05 [Ilya Ganelin] Minor fixes and doc updates
25d3f52 [Ilya Ganelin] Minor nit fixes
642a06d [Ilya Ganelin] Fixed logic for invalid suffixes and addid matching test
8927e66 [Ilya Ganelin] Fixed handling of -1
69fedcc [Ilya Ganelin] Added test for zero
dc7bd08 [Ilya Ganelin] Fixed error in exception handling
7d19cdd [Ilya Ganelin] Added fix for possible NPE
6f651a8 [Ilya Ganelin] Now using regexes to simplify code in parseTimeString. Introduces getTimeAsSec and getTimeAsMs methods in SparkConf. Updated documentation
cbd2ca6 [Ilya Ganelin] Formatting error
1a1122c [Ilya Ganelin] Formatting fixes and added m for use as minute formatter
4e48679 [Ilya Ganelin] Fixed priority order and mixed up conversions in a couple spots
d4efd26 [Ilya Ganelin] Added time conversion for yarn.scheduler.heartbeat.interval-ms
cbf41db [Ilya Ganelin] Got rid of thrown exceptions
1465390 [Ilya Ganelin] Nit
28187bf [Ilya Ganelin] Convert straight to seconds
ff40bfe [Ilya Ganelin] Updated tests to fix small bugs
19c31af [Ilya Ganelin] Added cleaner computation of time conversions in tests
6387772 [Ilya Ganelin] Updated suffix handling to handle overlap of units more gracefully
5193d5f [Ilya Ganelin] Resolved merge conflicts
76cfa27 [Ilya Ganelin] [SPARK-5931] Minor nit fixes'
bf779b0 [Ilya Ganelin] Special handling of overlapping usffixes for java
dd0a680 [Ilya Ganelin] Updated scala code to call into java
b2fc965 [Ilya Ganelin] replaced get or default since it's not present in this version of java
39164f9 [Ilya Ganelin] [SPARK-5931] Updated Java conversion to be similar to scala conversion. Updated conversions to clean up code a little using TimeUnit.convert. Added Unit tests
3b126e1 [Ilya Ganelin] Fixed conversion to US from seconds
1858197 [Ilya Ganelin] Fixed bug where all time was being converted to us instead of the appropriate units
bac9edf [Ilya Ganelin] More whitespace
8613631 [Ilya Ganelin] Whitespace
1c0c07c [Ilya Ganelin] Updated Java code to add day, minutes, and hours
647b5ac [Ilya Ganelin] Udpated time conversion to use map iterator instead of if fall through
70ac213 [Ilya Ganelin] Fixed remaining usages to be consistent. Updated Java-side time conversion
68f4e93 [Ilya Ganelin] Updated more files to clean up usage of default time strings
3a12dd8 [Ilya Ganelin] Updated host revceiver
5232a36 [Ilya Ganelin] [SPARK-5931] Changed default behavior of time string conversion.
499bdf0 [Ilya Ganelin] Merge branch 'SPARK-5931' of github.com:ilganeli/spark into SPARK-5931
9e2547c [Ilya Ganelin] Reverting doc changes
8f741e1 [Ilya Ganelin] Update JavaUtils.java
34f87c2 [Ilya Ganelin] Update Utils.scala
9a29d8d [Ilya Ganelin] Fixed misuse of time in streaming context test
42477aa [Ilya Ganelin] Updated configuration doc with note on specifying time properties
cde9bff [Ilya Ganelin] Updated spark.streaming.blockInterval
c6a0095 [Ilya Ganelin] Updated spark.core.connection.auth.wait.timeout
5181597 [Ilya Ganelin] Updated spark.dynamicAllocation.schedulerBacklogTimeout
2fcc91c [Ilya Ganelin] Updated spark.dynamicAllocation.executorIdleTimeout
6d1518e [Ilya Ganelin] Upated spark.speculation.interval
3f1cfc8 [Ilya Ganelin] Updated spark.scheduler.revive.interval
3352d34 [Ilya Ganelin] Updated spark.scheduler.maxRegisteredResourcesWaitingTime
272c215 [Ilya Ganelin] Updated spark.locality.wait
7320c87 [Ilya Ganelin] updated spark.akka.heartbeat.interval
064ebd6 [Ilya Ganelin] Updated usage of spark.cleaner.ttl
21ef3dd [Ilya Ganelin] updated spark.shuffle.sasl.timeout
c9f5cad [Ilya Ganelin] Updated spark.shuffle.io.retryWait
4933fda [Ilya Ganelin] Updated usage of spark.storage.blockManagerSlaveTimeout
7db6d2a [Ilya Ganelin] Updated usage of spark.akka.timeout
404f8c3 [Ilya Ganelin] Updated usage of spark.core.connection.ack.wait.timeout
59bf9e1 [Ilya Ganelin] [SPARK-5931] Updated Utils and JavaUtils classes to add helper methods to handle time strings. Updated time strings in a few places to properly parse time
2015-04-13 16:28:07 -07:00
Reynold Xin c5b0b296b8 [SPARK-6765] Enable scalastyle on test code.
Turn scalastyle on for all test code. Most of the violations have been resolved in my previous pull requests:

Core: https://github.com/apache/spark/pull/5484
SQL: https://github.com/apache/spark/pull/5412
MLlib: https://github.com/apache/spark/pull/5411
GraphX: https://github.com/apache/spark/pull/5410
Streaming: https://github.com/apache/spark/pull/5409

Author: Reynold Xin <rxin@databricks.com>

Closes #5486 from rxin/test-style-enable and squashes the following commits:

01683de [Reynold Xin] Fixed new code.
a4ab46e [Reynold Xin] Fixed tests.
20adbc8 [Reynold Xin] Missed one violation.
5e36521 [Reynold Xin] [SPARK-6765] Enable scalastyle on test code.
2015-04-13 09:29:04 -07:00
Reynold Xin a1fe59dae5 [SPARK-6765] Fix test code style for core.
Author: Reynold Xin <rxin@databricks.com>

Closes #5484 from rxin/test-style-core and squashes the following commits:

e0b0100 [Reynold Xin] [SPARK-6765] Fix test code style for core.
2015-04-12 20:50:49 -07:00
WangTaoTheTonic 7d92db342e [SPARK-6758]block the right jetty package in log
https://issues.apache.org/jira/browse/SPARK-6758

I am not sure if it is ok to block them in test resources too (as we shade jetty in assembly?).

Author: WangTaoTheTonic <wangtao111@huawei.com>

Closes #5406 from WangTaoTheTonic/SPARK-6758 and squashes the following commits:

e09605b [WangTaoTheTonic] block the right jetty package
2015-04-09 17:44:08 -04:00
Kay Ousterhout 9d44ddce1d [SPARK-6753] Clone SparkConf in ShuffleSuite tests
Prior to this change, the unit test for SPARK-3426 did not clone the
original SparkConf, which meant that that test did not use the options
set by suites that subclass ShuffleSuite.scala. This commit fixes that
problem.

JoshRosen would be great if you could take a look at this, since you wrote this
test originally.

Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #5401 from kayousterhout/SPARK-6753 and squashes the following commits:

368c540 [Kay Ousterhout] [SPARK-6753] Clone SparkConf in ShuffleSuite tests
2015-04-08 10:26:45 -07:00
Josh Rosen c83e03948b [SPARK-6737] Fix memory leak in OutputCommitCoordinator
This patch fixes a memory leak in the DAGScheduler, which caused us to leak a map entry per submitted stage.  The problem is that the OutputCommitCoordinator needs to be informed when stages end in order to remove entries from its `authorizedCommitters` map, but the DAGScheduler only called it in one of the four code paths that are used to mark stages as completed.

This patch fixes this issue by consolidating the processing of stage completion into a new `markStageAsFinished` method and updates DAGSchedulerSuite's `assertDataStructuresEmpty` assertion to also check the OutputCommitCoordinator data structures.  I've also added a comment at the top of DAGScheduler so that we remember to update this test when adding new data structures.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #5397 from JoshRosen/SPARK-6737 and squashes the following commits:

af3b02f [Josh Rosen] Consolidate stage completion handling code in a single method.
e96ce3a [Josh Rosen] Consolidate stage completion handling code in a single method.
3052aea [Josh Rosen] Comment update
7896899 [Josh Rosen] Fix SPARK-6737 by informing OutputCommitCoordinator of all stage end events.
4ead1dc [Josh Rosen] Add regression tests for SPARK-6737
2015-04-07 16:18:55 -07:00
Xiangrui Meng e6f08fb42f Revert "[SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has space in its path"
This reverts commit 596ba77c5f.
2015-04-07 14:34:15 -07:00
Masayoshi TSUZUKI 596ba77c5f [SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has space in its path
escape spaces in the arguments.

Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>

Closes #5347 from tsudukim/feature/SPARK-6568 and squashes the following commits:

9180aaf [Masayoshi TSUZUKI] [SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has space in its path
2015-04-07 14:29:53 -07:00
Josh Rosen a0846c4b63 [SPARK-6716] Change SparkContext.DRIVER_IDENTIFIER from <driver> to driver
Currently, the driver's executorId is set to `<driver>`. This choice of ID was present in older Spark versions, but it has started to cause problems now that executorIds are used in more contexts, such as Ganglia metric names or driver thread-dump links the web UI. The angle brackets must be escaped when embedding this ID in XML or as part of URLs and this has led to multiple problems:

- https://issues.apache.org/jira/browse/SPARK-6484
- https://issues.apache.org/jira/browse/SPARK-4313

The simplest solution seems to be to change this id to something that does not contain any special characters, such as `driver`.

I'm not sure whether we can perform this change in a patch release, since this ID may be considered a stable API by metrics users, but it's probably okay to do this in a major release as long as we document it in the release notes.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #5372 from JoshRosen/driver-id-fix and squashes the following commits:

42d3c10 [Josh Rosen] Clarify comment
0c5d04b [Josh Rosen] Add backwards-compatibility in BlockManagerId.isDriver
7ff12e0 [Josh Rosen] Change SparkContext.DRIVER_IDENTIFIER from <driver> to driver
2015-04-06 23:33:16 -07:00
zsxwing 0b5d028a93 [SPARK-6602][Core] Update MapOutputTrackerMasterActor to MapOutputTrackerMasterEndpoint
This is the second PR for [SPARK-6602]. It updated MapOutputTrackerMasterActor and its unit tests.

cc rxin

Author: zsxwing <zsxwing@gmail.com>

Closes #5371 from zsxwing/rpc-rewrite-part2 and squashes the following commits:

fcf3816 [zsxwing] Fix the code style
4013a22 [zsxwing] Add doc for uncaught exceptions in RpcEnv
93c6c20 [zsxwing] Add an example of UnserializableException and add ErrorMonitor to monitor errors from Akka
134fe7b [zsxwing] Update MapOutputTrackerMasterActor to MapOutputTrackerMasterEndpoint
2015-04-05 21:57:15 -07:00
zsxwing f15806a8f8 [SPARK-6602][Core] Replace direct use of Akka with Spark RPC interface - part 1
This PR replaced the following `Actor`s to `RpcEndpoint`:

1. HeartbeatReceiver
1. ExecutorActor
1. BlockManagerMasterActor
1. BlockManagerSlaveActor
1. CoarseGrainedExecutorBackend and subclasses
1. CoarseGrainedSchedulerBackend.DriverActor

This is the first PR. I will split the work of SPARK-6602 to several PRs for code review.

Author: zsxwing <zsxwing@gmail.com>

Closes #5268 from zsxwing/rpc-rewrite and squashes the following commits:

287e9f8 [zsxwing] Fix the code style
26c56b7 [zsxwing] Merge branch 'master' into rpc-rewrite
9cc825a [zsxwing] Rmove setupThreadSafeEndpoint and add ThreadSafeRpcEndpoint
30a9036 [zsxwing] Make self return null after stopping RpcEndpointRef; fix docs and error messages
705245d [zsxwing] Fix some bugs after rebasing the changes on the master
003cf80 [zsxwing] Update CoarseGrainedExecutorBackend and CoarseGrainedSchedulerBackend to use RpcEndpoint
7d0e6dc [zsxwing] Update BlockManagerSlaveActor to use RpcEndpoint
f5d6543 [zsxwing] Update BlockManagerMaster to use RpcEndpoint
30e3f9f [zsxwing] Update ExecutorActor to use RpcEndpoint
478b443 [zsxwing] Update HeartbeatReceiver to use RpcEndpoint
2015-04-04 11:52:05 -07:00
Marcelo Vanzin 14632b7942 [SPARK-6688] [core] Always use resolved URIs in EventLoggingListener.
Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #5340 from vanzin/SPARK-6688 and squashes the following commits:

ccfddd9 [Marcelo Vanzin] Resolve at the source.
20d2a34 [Marcelo Vanzin] [SPARK-6688] [core] Always use resolved URIs in EventLoggingListener.
2015-04-03 11:55:04 -07:00
zsxwing 440ea31b76 [SPARK-6621][Core] Fix the bug that calling EventLoop.stop in EventLoop.onReceive/onError/onStart doesn't call onStop
Author: zsxwing <zsxwing@gmail.com>

Closes #5280 from zsxwing/SPARK-6621 and squashes the following commits:

521125e [zsxwing] Fix the bug that calling EventLoop.stop in EventLoop.onReceive and EventLoop.onError doesn't call onStop
2015-04-02 22:54:30 -07:00
Marcelo Vanzin 45134ec920 [SPARK-6650] [core] Stop ExecutorAllocationManager when context stops.
This fixes the thread leak. I also changed the unit test to keep track
of allocated contexts and make sure they're closed after tests are
run; this is needed since some tests use this pattern:

    val sc = createContext()
    doSomethingThatMayThrow()
    sc.stop()

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #5311 from vanzin/SPARK-6650 and squashes the following commits:

652c73b [Marcelo Vanzin] Nits.
5711512 [Marcelo Vanzin] More exception safety.
cc5a744 [Marcelo Vanzin] Stop alloc manager before scheduler.
9886f69 [Marcelo Vanzin] [SPARK-6650] [core] Stop ExecutorAllocationManager when context stops.
2015-04-02 19:48:55 -07:00
Hung Lin e3202aa2e9 SPARK-6414: Spark driver failed with NPE on job cancelation
Use Option for ActiveJob.properties to avoid NPE bug

Author: Hung Lin <hung.lin@gmail.com>

Closes #5124 from hunglin/SPARK-6414 and squashes the following commits:

2290b6b [Hung Lin] [SPARK-6414][core] Fix NPE in SparkContext.cancelJobGroup()
2015-04-02 14:01:43 -07:00
Patrick Wendell 6562787b96 [SPARK-6627] Some clean-up in shuffle code.
Before diving into review #4450 I did a look through the existing shuffle
code to learn how it works. Unfortunately, there are some very
confusing things in this code. This patch makes a few small changes
to simplify things. It is not easily to concisely describe the changes
because of how convoluted the issues were, but they are fairly small
logically:

1. There is a trait named `ShuffleBlockManager` that only deals with
   one logical function which is retrieving shuffle block data given shuffle
   block coordinates. This trait has two implementors FileShuffleBlockManager
   and IndexShuffleBlockManager. Confusingly the vast majority of those
   implementations have nothing to do with this particular functionality.
   So I've renamed the trait to ShuffleBlockResolver and documented it.
2. The aforementioned trait had two almost identical methods, for no good
   reason. I removed one method (getBytes) and modified callers to use the
   other one. I think the behavior is preserved in all cases.
3. The sort shuffle code uses an identifier "0" in the reduce slot of a
   BlockID as a placeholder. I made it into a constant since it needs to
   be consistent across multiple places.

I think for (3) there is actually a better solution that would avoid the
need to do this type of workaround/hack in the first place, but it's more
complex so I'm punting it for now.

Author: Patrick Wendell <patrick@databricks.com>

Closes #5286 from pwendell/cleanup and squashes the following commits:

c71fbc7 [Patrick Wendell] Open interface back up for testing
f36edd5 [Patrick Wendell] Code review feedback
d1c0494 [Patrick Wendell] Style fix
a406079 [Patrick Wendell] [HOTFIX] Some clean-up in shuffle code.
2015-04-01 23:42:09 -07:00
Josh Rosen 37326079d8 [SPARK-6614] OutputCommitCoordinator should clear authorized committer only after authorized committer fails, not after any failure
In OutputCommitCoordinator, there is some logic to clear the authorized committer's lock on committing in case that task fails.  However, it looks like the current code also clears this lock if other non-authorized tasks fail, which is an obvious bug.

In theory, it's possible that this could allow a new committer to start, run to completion, and commit output before the authorized committer finished, but it's unlikely that this race occurs often in practice due to the complex combination of failure and timing conditions that would be required to expose it.

This patch addresses this issue and adds a regression test.

Thanks to aarondav for spotting this issue.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #5276 from JoshRosen/SPARK-6614 and squashes the following commits:

d532ba7 [Josh Rosen] Check whether failed task was authorized committer
cbb3784 [Josh Rosen] Add regression test for SPARK-6614
2015-03-31 16:18:39 -07:00
zsxwing a8d53afb4e [SPARK-5124][Core] A standard RPC interface and an Akka implementation
This PR added a standard internal RPC interface for Spark and an Akka implementation. See [the design document](https://issues.apache.org/jira/secure/attachment/12698710/Pluggable%20RPC%20-%20draft%202.pdf) for more details.

I will split the whole work into multiple PRs to make it easier for code review. This is the first PR and avoid to touch too many files.

Author: zsxwing <zsxwing@gmail.com>

Closes #4588 from zsxwing/rpc-part1 and squashes the following commits:

fe3df4c [zsxwing] Move registerEndpoint and use actorSystem.dispatcher in asyncSetupEndpointRefByURI
f6f3287 [zsxwing] Remove RpcEndpointRef.toURI
8bd1097 [zsxwing] Fix docs and the code style
f459380 [zsxwing] Add RpcAddress.fromURI and rename urls to uris
b221398 [zsxwing] Move send methods above ask methods
15cfd7b [zsxwing] Merge branch 'master' into rpc-part1
9ffa997 [zsxwing] Fix MiMa tests
78a1733 [zsxwing] Merge remote-tracking branch 'origin/master' into rpc-part1
385b9c3 [zsxwing] Fix the code style and add docs
2cc3f78 [zsxwing] Add an asynchronous version of setupEndpointRefByUrl
e8dfec3 [zsxwing] Remove 'sendWithReply(message: Any, sender: RpcEndpointRef): Unit'
08564ae [zsxwing] Add RpcEnvFactory to create RpcEnv
e5df4ca [zsxwing] Handle AkkaFailure(e) in Actor
ec7c5b0 [zsxwing] Fix docs
7fc95e1 [zsxwing] Implement askWithReply in RpcEndpointRef
9288406 [zsxwing] Document thread-safety for setupThreadSafeEndpoint
3007c09 [zsxwing] Move setupDriverEndpointRef to RpcUtils and rename to makeDriverRef
c425022 [zsxwing] Fix the code style
5f87700 [zsxwing] Move the logical of processing message to a private function
3e56123 [zsxwing] Use lazy to eliminate CountDownLatch
07f128f [zsxwing] Remove ActionScheduler.scala
4d34191 [zsxwing] Remove scheduler from RpcEnv
7cdd95e [zsxwing] Add docs for RpcEnv
51e6667 [zsxwing] Add 'sender' to RpcCallContext and rename the parameter of receiveAndReply to 'context'
ffc1280 [zsxwing] Rename 'fail' to 'sendFailure' and other minor code style changes
28e6d0f [zsxwing] Add onXXX for network events and remove the companion objects of network events
3751c97 [zsxwing] Rename RpcResponse to RpcCallContext
fe7d1ff [zsxwing] Add explicit reply in rpc
7b9e0c9 [zsxwing] Fix the indentation
04a106e [zsxwing] Remove NopCancellable and add a const NOP in object SettableCancellable
2a579f4 [zsxwing] Remove RpcEnv.systemName
155b987 [zsxwing] Change newURI to uriOf and add some comments
45b2317 [zsxwing] A standard RPC interface and An Akka implementation
2015-03-29 21:25:09 -07:00
June.He 0e2753ff14 [SPARK-6585][Tests]Fix FileServerSuite testcase in some Env.
Change FileServerSuite.test("HttpFileServer should not work with SSL when the server is untrusted") catch SSLException

Author: June.He <jun.hejun@huawei.com>

Closes #5239 from sisihj/SPARK-6585 and squashes the following commits:

cb19ae3 [June.He] Change FileServerSuite.test("HttpFileServer should not work with SSL when the server is untrusted") catch SSLException
2015-03-29 12:47:22 +01:00
Sean Owen fe15ea9760 SPARK-6480 [CORE] histogram() bucket function is wrong in some simple edge cases
Fix fastBucketFunction for histogram() to handle edge conditions more correctly. Add a test, and fix existing one accordingly

Author: Sean Owen <sowen@cloudera.com>

Closes #5148 from srowen/SPARK-6480 and squashes the following commits:

974a0a0 [Sean Owen] Additional test of huge ranges, and a few more comments (and comment fixes)
23ec01e [Sean Owen] Fix fastBucketFunction for histogram() to handle edge conditions more correctly. Add a test, and fix existing one accordingly
2015-03-26 15:00:23 +00:00
Josh Rosen d44a3362ed [SPARK-6079] Use index to speed up StatusTracker.getJobIdsForGroup()
`StatusTracker.getJobIdsForGroup()` is implemented via a linear scan over a HashMap rather than using an index, which might be an expensive operation if there are many (e.g. thousands) of retained jobs.

This patch adds a new map to `JobProgressListener` in order to speed up these lookups.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #4830 from JoshRosen/statustracker-job-group-indexing and squashes the following commits:

e39c5c7 [Josh Rosen] Address review feedback
6709fb2 [Josh Rosen] Merge remote-tracking branch 'origin/master' into statustracker-job-group-indexing
2c49614 [Josh Rosen] getOrElse
97275a7 [Josh Rosen] Add jobGroup to jobId index to JobProgressListener
2015-03-25 17:40:00 -07:00
zsxwing 883b7e9030 [SPARK-6076][Block Manager] Fix a potential OOM issue when StorageLevel is MEMORY_AND_DISK_SER
In dcd1e42d6b/core/src/main/scala/org/apache/spark/storage/BlockManager.scala (L538) , when StorageLevel is `MEMORY_AND_DISK_SER`, it will copy the content from file into memory, then put it into MemoryStore.
```scala
              val copyForMemory = ByteBuffer.allocate(bytes.limit)
              copyForMemory.put(bytes)
              memoryStore.putBytes(blockId, copyForMemory, level)
              bytes.rewind()
```
However, if the file is bigger than the free memory, OOM will happen. A better approach is testing if there is enough memory. If not, copyForMemory should not be created, since this is an optional operation.

Author: zsxwing <zsxwing@gmail.com>

Closes #4827 from zsxwing/SPARK-6076 and squashes the following commits:

7d25545 [zsxwing] Add alias for tryToPut and dropFromMemory
1100a54 [zsxwing] Replace call-by-name with () => T
0cc0257 [zsxwing] Fix a potential OOM issue when StorageLevel is MEMORY_AND_DISK_SER
2015-03-25 12:17:18 -07:00
Xiangrui Meng 6930e965e2 [SPARK-6512] add contains to OpenHashMap
Add `contains` to test whether a key exists in an OpenHashMap. rxin

Author: Xiangrui Meng <meng@databricks.com>

Closes #5171 from mengxr/openhashmap-contains and squashes the following commits:

d6e6f1f [Xiangrui Meng] add contains to primitivekeyopenhashmap
748a69b [Xiangrui Meng] add contains to OpenHashMap
2015-03-24 17:06:22 -07:00
Jongyoul Lee adb2ff752f [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes
- Moved Suites from o.a.s.s.mesos to o.a.s.s.cluster.mesos

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #5126 from jongyoul/SPARK-6453 and squashes the following commits:

4f24a3e [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Fixed imports orders
8ab149d [Jongyoul Lee] [SPARK-6453][Mesos] Some Mesos*Suite have a different package with their classes - Moved Suites from o.a.s.s.mesos to o.a.s.s.cluster.mesos
2015-03-22 15:54:19 +00:00
Jongyoul Lee 49a01c7ea2 [SPARK-6423][Mesos] MemoryUtils should use memoryOverhead if it's set
- Fixed calculateTotalMemory to use spark.mesos.executor.memoryOverhead
- Added testCase

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #5099 from jongyoul/SPARK-6423 and squashes the following commits:

6747fce [Jongyoul Lee] [SPARK-6423][Mesos] MemoryUtils should use memoryOverhead if it's set - Changed a description of spark.mesos.executor.memoryOverhead
475a7c8 [Jongyoul Lee] [SPARK-6423][Mesos] MemoryUtils should use memoryOverhead if it's set - Fit the import rules
453c5a2 [Jongyoul Lee] [SPARK-6423][Mesos] MemoryUtils should use memoryOverhead if it's set - Fixed calculateTotalMemory to use spark.mesos.executor.memoryOverhead - Added testCase
2015-03-20 19:14:35 +00:00
Sean Owen 6f80c3e888 SPARK-6338 [CORE] Use standard temp dir mechanisms in tests to avoid orphaned temp files
Use `Utils.createTempDir()` to replace other temp file mechanisms used in some tests, to further ensure they are cleaned up, and simplify

Author: Sean Owen <sowen@cloudera.com>

Closes #5029 from srowen/SPARK-6338 and squashes the following commits:

27b740a [Sean Owen] Fix hive-thriftserver tests that don't expect an existing dir
4a212fa [Sean Owen] Standardize a bit more temp dir management
9004081 [Sean Owen] Revert some added recursive-delete calls
57609e4 [Sean Owen] Use Utils.createTempDir() to replace other temp file mechanisms used in some tests, to further ensure they are cleaned up, and simplify
2015-03-20 14:16:21 +00:00
mcheah 3c4e486b9c [SPARK-5843] [API] Allowing map-side combine to be specified in Java.
Specifically, when calling JavaPairRDD.combineByKey(), there is a new
six-parameter method that exposes the map-side-combine boolean as the
fifth parameter and the serializer as the sixth parameter.

Author: mcheah <mcheah@palantir.com>

Closes #4634 from mccheah/pair-rdd-map-side-combine and squashes the following commits:

5c58319 [mcheah] Fixing compiler errors.
3ce7deb [mcheah] Addressing style and documentation comments.
7455c7a [mcheah] Allowing Java combineByKey to specify Serializer as well.
6ddd729 [mcheah] [SPARK-5843] Allowing map-side combine to be specified in Java.
2015-03-19 08:51:49 -04:00
CodingCat 2c3f83c34b [SPARK-4012] stop SparkContext when the exception is thrown from an infinite loop
https://issues.apache.org/jira/browse/SPARK-4012

This patch is a resubmission for https://github.com/apache/spark/pull/2864

What I am proposing in this patch is that ***when the exception is thrown from an infinite loop, we should stop the SparkContext, instead of let JVM throws exception forever***

So, in the infinite loops where we originally wrapped with a ` logUncaughtExceptions`, I changed to `tryOrStopSparkContext`, so that the Spark component is stopped

Early stopped JVM process is helpful for HA scheme design, for example,

The user has a script checking the existence of the pid of the Spark Streaming driver for monitoring the availability; with the code before this patch, the JVM process is still available but not functional when the exceptions are thrown

andrewor14, srowen , mind taking further consideration about the change?

Author: CodingCat <zhunansjtu@gmail.com>

Closes #5004 from CodingCat/SPARK-4012-1 and squashes the following commits:

589276a [CodingCat] throw fatal error again
3c72cd8 [CodingCat] address the comments
6087864 [CodingCat] revise comments
6ad3eb0 [CodingCat] stop SparkContext instead of quit the JVM process
6322959 [CodingCat] exit JVM process when the exception is thrown from an infinite loop
2015-03-18 23:48:45 -07:00
Wenchen Fan 540b2a4eab [SPARK-6394][Core] cleanup BlockManager companion object and improve the getCacheLocs method in DAGScheduler
The current implementation include searching a HashMap many times, we can avoid this.
Actually if you look into `BlockManager.blockIdsToBlockManagers`, the core function call is [this](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/storage/BlockManager.scala#L1258), so we can call `blockManagerMaster.getLocations` directly and avoid building a HashMap.

Author: Wenchen Fan <cloud0fan@outlook.com>

Closes #5043 from cloud-fan/small and squashes the following commits:

e959d12 [Wenchen Fan] fix style
203c493 [Wenchen Fan] some cleanup in BlockManager companion object
d409099 [Wenchen Fan] address rxin's comment
faec999 [Wenchen Fan] add regression test
2fb57aa [Wenchen Fan] imporve the getCacheLocs method
2015-03-18 19:43:04 -07:00
Josh Rosen 0f673c21f6 [SPARK-3266] Use intermediate abstract classes to fix type erasure issues in Java APIs
This PR addresses a Scala compiler bug ([SI-8905](https://issues.scala-lang.org/browse/SI-8905)) that was breaking some of the Spark Java APIs.  In a nutshell, it seems that methods whose implementations are inherited from generic traits sometimes have their type parameters erased to Object.  This was causing methods like `DoubleRDD.min()` to throw confusing NoSuchMethodErrors at runtime.

The fix implemented here is to introduce an intermediate layer of abstract classes and inherit from those instead of directly extends the `Java*Like` traits.  This should not break binary compatibility.

I also improved the test coverage of the Java API, adding several new tests for methods that failed at runtime due to this bug.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #5050 from JoshRosen/javardd-si-8905-fix and squashes the following commits:

2feb068 [Josh Rosen] Use intermediate abstract classes to work around SPARK-3266
d5f3e5d [Josh Rosen] Add failing regression tests for SPARK-3266
2015-03-17 09:18:57 -07:00
lisurprise f149b8b5e5 [SPARK-6077] Remove streaming tab while stopping StreamingContext
Currently we would create a new streaming tab for each streamingContext even if there's already one on the same sparkContext which would cause duplicate StreamingTab created and none of them is taking effect.
snapshot: https://www.dropbox.com/s/t4gd6hqyqo0nivz/bad%20multiple%20streamings.png?dl=0
How to reproduce:
1)
import org.apache.spark.SparkConf
import org.apache.spark.streaming.
{Seconds, StreamingContext}
import org.apache.spark.storage.StorageLevel
val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.split(" "))
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
wordCounts.print()
ssc.start()
.....
2)
ssc.stop(false)
val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999, StorageLevel.MEMORY_AND_DISK_SER)
val words = lines.flatMap(_.split(" "))
val wordCounts = words.map(x => (x, 1)).reduceByKey(_ + _)
wordCounts.print()
ssc.start()

Author: lisurprise <zhichao.li@intel.com>

Closes #4828 from zhichao-li/master and squashes the following commits:

c329806 [lisurprise] add test for attaching/detaching streaming tab
51e6c7f [lisurprise] move detach method into StreamingTab
31a44fa [lisurprise] add unit test for attaching and detaching new tab
db25ed2 [lisurprise] clean code
8281bcb [lisurprise] clean code
193c542 [lisurprise] remove streaming tab while closing streaming context
2015-03-16 13:10:32 -07:00
DoingDone9 00e730b94c [SPARK-6300][Spark Core] sc.addFile(path) does not support the relative path.
when i run cmd like that sc.addFile("../test.txt"), it did not work and throwed an exception:
java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: file:../test.txt
at org.apache.hadoop.fs.Path.initialize(Path.java:206)
at org.apache.hadoop.fs.Path.<init>(Path.java:172)
........
.......
Caused by: java.net.URISyntaxException: Relative path in absolute URI: file:../test.txt
at java.net.URI.checkPath(URI.java:1804)
at java.net.URI.<init>(URI.java:752)
at org.apache.hadoop.fs.Path.initialize(Path.java:203)

Author: DoingDone9 <799203320@qq.com>

Closes #4993 from DoingDone9/relativePath and squashes the following commits:

ee375cd [DoingDone9] Update SparkContextSuite.scala
d594e16 [DoingDone9] Update SparkContext.scala
0ff3fa8 [DoingDone9] test for add file
dced8eb [DoingDone9] Update SparkContext.scala
e4a13fe [DoingDone9] getCanonicalPath
161cae3 [DoingDone9] Merge pull request #4 from apache/master
c87e8b6 [DoingDone9] Merge pull request #3 from apache/master
cb1852d [DoingDone9] Merge pull request #2 from apache/master
c3f046f [DoingDone9] Merge pull request #1 from apache/master
2015-03-16 12:27:15 +00:00
Sean Owen 6e94c4eadf SPARK-6225 [CORE] [SQL] [STREAMING] Resolve most build warnings, 1.3.0 edition
Resolve javac, scalac warnings of various types -- deprecations, Scala lang, unchecked cast, etc.

Author: Sean Owen <sowen@cloudera.com>

Closes #4950 from srowen/SPARK-6225 and squashes the following commits:

3080972 [Sean Owen] Ordered imports: Java, Scala, 3rd party, Spark
c67985b [Sean Owen] Resolve javac, scalac warnings of various types -- deprecations, Scala lang, unchecked cast, etc.
2015-03-11 13:15:19 +00:00
Lev Khomich c4c4b07bf6 [SPARK-6087][CORE] Provide actionable exception if Kryo buffer is not large enough
A simple try-catch wrapping KryoException to be more informative.

Author: Lev Khomich <levkhomich@gmail.com>

Closes #4947 from levkhomich/master and squashes the following commits:

0f7a947 [Lev Khomich] [SPARK-6087][CORE] Provide actionable exception if Kryo buffer is not large enough
2015-03-10 10:55:42 +00:00
Marcelo Vanzin cd7594ca6a [core] [minor] Don't pollute source directory when running UtilsSuite.
Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #4921 from vanzin/utils-suite and squashes the following commits:

7795dd4 [Marcelo Vanzin] [core] [minor] Don't pollute source directory when running UtilsSuite.
2015-03-06 09:43:24 +00:00
Josh Rosen 424a86a1ed [SPARK-6175] Fix standalone executor log links when ephemeral ports or SPARK_PUBLIC_DNS are used
This patch fixes two issues with the executor log viewing links added in Spark 1.3.  In standalone mode, the log URLs might include a port value of 0 rather than the actual bound port of the UI, which broke the ability to view logs from workers whose web UIs had been configured to bind to ephemeral ports.  In addition, the URLs used workers' local hostnames instead of respecting SPARK_PUBLIC_DNS, which prevented this feature from working properly on Spark EC2 clusters because the links would point to internal DNS names instead of external ones.

I included tests for both of these bugs:

- We now browse to the URLs and verify that they point to the expected pages.
- To test SPARK_PUBLIC_DNS, I changed the code that reads the environment variable to do so via `SparkConf.getenv`, then used a custom SparkConf subclass to mock the environment variable (this pattern is used elsewhere in Spark's tests).

Author: Josh Rosen <joshrosen@databricks.com>

Closes #4903 from JoshRosen/SPARK-6175 and squashes the following commits:

5577f41 [Josh Rosen] Remove println
cfec135 [Josh Rosen] Use webUi.boundPort and publicAddress in log links
27918c7 [Josh Rosen] Add failing unit tests for standalone log URL viewing
c250fbe [Josh Rosen] Respect SparkConf in local-cluster Workers.
422a2ef [Josh Rosen] Use conf.getenv to read SPARK_PUBLIC_DNS
2015-03-05 12:04:00 -08:00
Marcelo Vanzin 3a35a0dfe9 [SPARK-6144] [core] Fix addFile when source files are on "hdfs:"
The code failed in two modes: it complained when it tried to re-create a directory that already existed, and it was placing some files in the wrong parent directory. The patch fixes both issues.

Author: Marcelo Vanzin <vanzin@cloudera.com>
Author: trystanleftwich <trystan@atscale.com>

Closes #4894 from vanzin/SPARK-6144 and squashes the following commits:

100b3a1 [Marcelo Vanzin] Style fix.
58266aa [Marcelo Vanzin] Fix fetchHcfs file for directories.
91733b7 [trystanleftwich] [SPARK-6144]When in cluster mode using ADD JAR with a hdfs:// sourced jar will fail
2015-03-04 12:58:39 -08:00
Imran Rashid 1f1fccc5ce [SPARK-5949] HighlyCompressedMapStatus needs more classes registered w/ kryo
https://issues.apache.org/jira/browse/SPARK-5949

Author: Imran Rashid <irashid@cloudera.com>

Closes #4877 from squito/SPARK-5949_register_roaring_bitmap and squashes the following commits:

7e13316 [Imran Rashid] style style style
5f6bb6d [Imran Rashid] more style
709bfe0 [Imran Rashid] style
a5cb744 [Imran Rashid] update tests to cover both types of RoaringBitmapContainers
09610c6 [Imran Rashid] formatting
f9a0b7c [Imran Rashid] put primitive array registrations together
97beaf8 [Imran Rashid] SPARK-5949 HighlyCompressedMapStatus needs more classes registered w/ kryo
2015-03-03 15:33:19 -08:00
CodingCat 975643c256 [SPARK-6118] making package name of deploy.worker.CommandUtils and deploy.CommandUtilsSuite consistent
https://issues.apache.org/jira/browse/SPARK-6118

I found that the object CommandUtils is placed under deploy.worker package, while CommandUtilsSuite is  under deploy

Conventionally, we put the implementation and unit test class under the same package

here, to minimize the change, I move CommandUtilsSuite to worker package,

**However, CommandUtils seems to contain some general methods (though only used by worker.* classes currently**,  we may also consider to replace CommonUtils

Author: CodingCat <zhunansjtu@gmail.com>

Closes #4856 from CodingCat/SPARK-6118 and squashes the following commits:

cb93700 [CodingCat] making package name consistent
2015-03-03 10:32:57 +00:00
hushan[胡珊] e3a88d1104 [SPARK-4777][CORE] Some block memory after unrollSafely not count into used memory(memoryStore.entrys or unrollMemory)
Some memory not count into memory used by memoryStore or unrollMemory.
Thread A after unrollsafely memory, it will release 40MB unrollMemory(40MB will used by other threads). then ThreadA wait get accountingLock to tryToPut blockA(30MB). before Thread A get accountingLock, blockA memory size is not counting into unrollMemory or memoryStore.currentMemory.
IIUC, freeMemory should minus that block memory

So, put this release memory into pending, and release it in tryToPut before ensureSpace

Author: hushan[胡珊] <hushan@xiaomi.com>

Closes #3629 from suyanNone/unroll-memory and squashes the following commits:

809cc41 [hushan[胡珊]] Refine
407b2c9 [hushan[胡珊]] Refine according comments
39960d0 [hushan[胡珊]] Refine comments
0fd0213 [hushan[胡珊]] add comments
0fc2bec [hushan[胡珊]] Release pending unroll memory after put block in memoryStore
3a3f2c8 [hushan[胡珊]] Refine blockManagerSuite unroll test
3323c45 [hushan[胡珊]] Refine getOrElse
f664317 [hushan[胡珊]] Make sure not add pending in every releaseUnrollMemory call
08b32ba [hushan[胡珊]] Pending unroll memory for this block untill tryToPut
2015-03-02 16:53:54 -08:00
Andrew Or 258d154c9f [SPARK-6048] SparkConf should not translate deprecated configs on set
There are multiple issues with translating on set outlined in the JIRA.

This PR reverts the translation logic added to `SparkConf`. In the future, after the 1.3.0 release we will figure out a way to reorganize the internal structure more elegantly. For now, let's preserve the existing semantics of `SparkConf` since it's a public interface. Unfortunately this means duplicating some code for now, but this is all internal and we can always clean it up later.

Author: Andrew Or <andrew@databricks.com>

Closes #4799 from andrewor14/conf-set-translate and squashes the following commits:

11c525b [Andrew Or] Move warning to driver
10e77b5 [Andrew Or] Add documentation for deprecation precedence
a369cb1 [Andrew Or] Merge branch 'master' of github.com:apache/spark into conf-set-translate
c26a9e3 [Andrew Or] Revert all translate logic in SparkConf
fef6c9c [Andrew Or] Restore deprecation logic for spark.executor.userClassPathFirst
94b4dfa [Andrew Or] Translate on get, not set
2015-03-02 16:36:42 -08:00
Andrew Or 6776cb33ea [SPARK-6066] Make event log format easier to parse
Some users have reported difficulty in parsing the new event log format. Since we embed the metadata in the beginning of the file, when we compress the event log we need to skip the metadata because we need that information to parse the log later. This means we'll end up with a partially compressed file if event logging compression is turned on. The old format looks like:
```
sparkVersion = 1.3.0
compressionCodec = org.apache.spark.io.LZFCompressionCodec
=== LOG_HEADER_END ===
// actual events, could be compressed bytes
```
The new format in this patch puts the compression codec in the log file name instead. It also removes the metadata header altogether along with the Spark version, which was not needed. The new file name looks something like:
```
app_without_compression
app_123.lzf
app_456.snappy
```

I tested this with and without compression, using different compression codecs and event logging directories. I verified that both the `Master` and the `HistoryServer` can render both compressed and uncompressed logs as before.

Author: Andrew Or <andrew@databricks.com>

Closes #4821 from andrewor14/event-log-format and squashes the following commits:

8511141 [Andrew Or] Fix test
654883d [Andrew Or] Add back metadata with Spark version
7f537cd [Andrew Or] Address review feedback
7d6aa61 [Andrew Or] Make codec an extension
59abee9 [Andrew Or] Merge branch 'master' of github.com:apache/spark into event-log-format
27c9a6c [Andrew Or] Address review feedback
519e51a [Andrew Or] Address review feedback
ef69276 [Andrew Or] Merge branch 'master' of github.com:apache/spark into event-log-format
88a091d [Andrew Or] Add tests for new format and file name
f32d8d2 [Andrew Or] Fix tests
8db5a06 [Andrew Or] Embed metadata in the event log file name instead
2015-03-02 16:34:32 -08:00
Evan Yu 643300a6e2 SPARK-5984: Fix TimSort bug causes ArrayOutOfBoundsException
Fix TimSort bug which causes a ArrayOutOfBoundsException.

Using the proposed fix here
http://envisage-project.eu/proving-android-java-and-python-sorting-algorithm-is-broken-and-how-to-fix-it/

Author: Evan Yu <ehotou@gmail.com>

Closes #4804 from hotou/SPARK-5984 and squashes the following commits:

3421b6c [Evan Yu] SPARK-5984: Add info to LICENSE
e61c6b8 [Evan Yu] SPARK-5984: Fix license and document
6ccc280 [Evan Yu] SPARK-5984: Add License header to file
e06c0d2 [Evan Yu] SPARK-5984: Add License header to file
4d95f75 [Evan Yu] SPARK-5984: Fix TimSort bug causes ArrayOutOfBoundsException
479a106 [Evan Yu] SPARK-5984: Fix TimSort bug causes ArrayOutOfBoundsException
2015-02-28 18:55:34 -08:00
Burak Yavuz 6d8e5fbc0d [SPARK-5979][SPARK-6032] Smaller safer --packages fix
pwendell tdas
This is the safer parts of PR #4754:
 - SPARK-5979: All dependencies with the groupId `org.apache.spark` passed through `--packages`, were being excluded from the dependency tree on the assumption that they would be in the assembly jar. This is not the case, therefore the exclusion rules had to be defined more explicitly.
 - SPARK-6032: Ivy prints a whole lot of logs while retrieving dependencies. These were printed to `System.out`. Moved the logging to `System.err`.

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #4802 from brkyvz/simple-streaming-fix and squashes the following commits:

e0f38cb [Burak Yavuz] Merge branch 'master' of github.com:apache/spark into simple-streaming-fix
bad921c [Burak Yavuz] [SPARK-5979][SPARK-6032] Smaller safer fix
2015-02-27 22:59:35 -08:00
Lukasz Jastrzebski 4a8a0a8ecd SPARK-2168 [Spark core] Use relative URIs for the app links in the History Server.
As agreed in PR #1160 adding test to verify if history server generates relative links to applications.

Author: Lukasz Jastrzebski <lukasz.jastrzebski@gmail.com>

Closes #4778 from elyast/master and squashes the following commits:

0c07fab [Lukasz Jastrzebski] Incorporating comments for SPARK-2168
6d7866d [Lukasz Jastrzebski] Adjusting test for  SPARK-2168 for master branch
d6f4fbe [Lukasz Jastrzebski] Added test for  SPARK-2168
2015-02-26 22:38:06 -08:00
Hong Shen 18f2098433 [SPARK-5529][CORE]Add expireDeadHosts in HeartbeatReceiver
If a blockManager has not send heartBeat more than 120s, BlockManagerMasterActor will remove it. But coarseGrainedSchedulerBackend can only remove executor after an DisassociatedEvent.  We should expireDeadHosts at HeartbeatReceiver.

Author: Hong Shen <hongshen@tencent.com>

Closes #4363 from shenh062326/my_change3 and squashes the following commits:

2c9a46a [Hong Shen] Change some code style.
1a042ff [Hong Shen] Change some code style.
2dc456e [Hong Shen] Change some code style.
d221493 [Hong Shen] Fix test failed
7448ac6 [Hong Shen] A minor change in sparkContext and heartbeatReceiver
b904aed [Hong Shen] Fix failed test
52725af [Hong Shen] Remove assert in SparkContext.killExecutors
5bedcb8 [Hong Shen] Remove assert in SparkContext.killExecutors
a858fb5 [Hong Shen] A minor change in HeartbeatReceiver
3e221d9 [Hong Shen] A minor change in HeartbeatReceiver
6bab7aa [Hong Shen] Change a code style.
07952f3 [Hong Shen] Change configs name and code style.
ce9257e [Hong Shen] Fix test failed
bccd515 [Hong Shen] Fix test failed
8e77408 [Hong Shen] Fix test failed
c1dfda1 [Hong Shen] Fix test failed
e197e20 [Hong Shen] Fix test failed
fb5df97 [Hong Shen] Remove ExpireDeadHosts in BlockManagerMessages
b5c0441 [Hong Shen] Remove expireDeadHosts in BlockManagerMasterActor
c922cb0 [Hong Shen] Add expireDeadHosts in HeartbeatReceiver
2015-02-26 18:43:23 -08:00
Marcelo Vanzin df3d559b32 [SPARK-5801] [core] Avoid creating nested directories.
Cache the value of the local root dirs to use for storing local data,
so that the same directories are reused.

Also, to avoid an extra level of nesting, use a different env variable
to propagate the local dirs from the Worker to the executors. And make
the executor directory use a different name.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #4747 from vanzin/SPARK-5801 and squashes the following commits:

e0114e1 [Marcelo Vanzin] Update unit test.
18ee0a7 [Marcelo Vanzin] [SPARK-5801] [core] Avoid creating nested directories.
2015-02-26 17:35:03 +00:00
Kay Ousterhout 838a48036c [SPARK-5982] Remove incorrect Local Read Time Metric
This metric is incomplete, because the files are memory mapped, so much of the read from disk occurs later as tasks actually read the file's data.

This should be merged into 1.3, so that we never expose this incorrect metric to users.

CC pwendell ksakellis sryza

Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #4749 from kayousterhout/SPARK-5982 and squashes the following commits:

9737b5e [Kay Ousterhout] More fixes
a1eb300 [Kay Ousterhout] Removed one more use of local read time
cf13497 [Kay Ousterhout] [SPARK-5982] Remove incorrectwq Local Read Time Metric
2015-02-25 14:55:24 -08:00
Tathagata Das 64d2c01ff1 [Spark-5967] [UI] Correctly clean JobProgressListener.stageIdToActiveJobIds
Patch should be self-explanatory
pwendell JoshRosen

Author: Tathagata Das <tathagata.das1565@gmail.com>

Closes #4741 from tdas/SPARK-5967 and squashes the following commits:

653b5bb [Tathagata Das] Fixed the fix and added test
e2de972 [Tathagata Das] Clear stages which have no corresponding active jobs.
2015-02-24 11:02:47 -08:00
Ilya Ganelin 95cd643aa9 [SPARK-3885] Provide mechanism to remove accumulators once they are no longer used
Instead of storing a strong reference to accumulators, I've replaced this with a weak reference and updated any code that uses these accumulators to check whether the reference resolves before using the accumulator. A weak reference will be cleared when there is no longer an existing copy of the variable versus using a soft reference in which case accumulators would only be cleared when the GC explicitly ran out of memory.

Author: Ilya Ganelin <ilya.ganelin@capitalone.com>

Closes #4021 from ilganeli/SPARK-3885 and squashes the following commits:

4ba9575 [Ilya Ganelin]  Fixed error in test suite
8510943 [Ilya Ganelin] Extra code
bb76ef0 [Ilya Ganelin] File deleted somehow
283a333 [Ilya Ganelin] Added cleanup method for accumulators to remove stale references within Accumulators.original to accumulators that are now out of scope
345fd4f [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3885
7485a82 [Ilya Ganelin] Fixed build error
c8e0f2b [Ilya Ganelin] Added working test for accumulator garbage collection
94ce754 [Ilya Ganelin] Still not being properly garbage collected
8722b63 [Ilya Ganelin] Fixing gc test
7414a9c [Ilya Ganelin] Added test for accumulator garbage collection
18d62ec [Ilya Ganelin] Updated to throw Exception when accessing a GCd accumulator
9a81928 [Ilya Ganelin] Reverting permissions changes
28f705c [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3885
b820ab4b [Ilya Ganelin] reset
d78f4bf [Ilya Ganelin] Removed obsolete comment
0746e61 [Ilya Ganelin] Updated DAGSchedulerSUite to fix bug
3350852 [Ilya Ganelin] Updated DAGScheduler and Suite to correctly use new implementation of WeakRef Accumulator storage
c49066a [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3885
cbb9023 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3885
a77d11b [Ilya Ganelin] Updated Accumulators class to store weak references instead of strong references to allow garbage collection of old accumulators
2015-02-22 22:57:26 -08:00
Aaron Josephs e4f9d03d72 [SPARK-911] allow efficient queries for a range if RDD is partitioned wi...
...th RangePartitioner

Author: Aaron Josephs <ajoseph4@binghamton.edu>

Closes #1381 from aaronjosephs/PLAT-911 and squashes the following commits:

e30ade5 [Aaron Josephs] [SPARK-911] allow efficient queries for a range if RDD is partitioned with RangePartitioner
2015-02-22 22:09:06 -08:00
Evan Yu 7683982faf [SPARK-5860][CORE] JdbcRDD: overflow on large range with high number of partitions
Fix a overflow bug in JdbcRDD when calculating partitions for large BIGINT ids

Author: Evan Yu <ehotou@gmail.com>

Closes #4701 from hotou/SPARK-5860 and squashes the following commits:

9e038d1 [Evan Yu] [SPARK-5860][CORE] Prevent overflowing at the length level
7883ad9 [Evan Yu] [SPARK-5860][CORE] Prevent overflowing at the length level
c88755a [Evan Yu] [SPARK-5860][CORE] switch to BigInt instead of BigDecimal
4e9ff4f [Evan Yu] [SPARK-5860][CORE] JdbcRDD overflow on large range with high number of partitions
2015-02-21 20:40:21 +00:00
Sean Owen d3dfebebce SPARK-5744 [CORE] Take 2. RDD.isEmpty / take fails for (empty) RDD of Nothing
Follow-on to https://github.com/apache/spark/pull/4591

Document isEmpty / take / parallelize and their interaction with (an empty) RDD[Nothing] and RDD[Null]. Also, fix a marginally related minor issue with histogram() and EmptyRDD.

CC rxin since you reviewed the last one although I imagine this is an uncontroversial resolution.

Author: Sean Owen <sowen@cloudera.com>

Closes #4698 from srowen/SPARK-5744.2 and squashes the following commits:

9b2a811 [Sean Owen] 2 extra javadoc fixes
d1b9fba [Sean Owen] Document isEmpty / take / parallelize and their interaction with (an empty) RDD[Nothing] and RDD[Null]. Also, fix a marginally related minor issue with histogram() and EmptyRDD.
2015-02-20 10:21:39 +00:00
Sean Owen 34b7c35380 SPARK-4682 [CORE] Consolidate various 'Clock' classes
Another one from JoshRosen 's wish list. The first commit is much smaller and removes 2 of the 4 Clock classes. The second is much larger, necessary for consolidating the streaming one. I put together implementations in the way that seemed simplest. Almost all the change is standardizing class and method names.

Author: Sean Owen <sowen@cloudera.com>

Closes #4514 from srowen/SPARK-4682 and squashes the following commits:

5ed3a03 [Sean Owen] Javadoc Clock classes; make ManualClock private[spark]
169dd13 [Sean Owen] Add support for legacy org.apache.spark.streaming clock class names
277785a [Sean Owen] Reduce the net change in this patch by reversing some unnecessary syntax changes along the way
b5e53df [Sean Owen] FakeClock -> ManualClock; getTime() -> getTimeMillis()
160863a [Sean Owen] Consolidate Streaming Clock class into common util Clock
7c956b2 [Sean Owen] Consolidate Clocks except for Streaming Clock
2015-02-19 15:35:23 -08:00
Jacek Lewandowski fb87f44921 SPARK-5548: Fix for AkkaUtilsSuite failure - attempt 2
Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>

Closes #4653 from jacek-lewandowski/SPARK-5548-2-master and squashes the following commits:

0e199b6 [Jacek Lewandowski] SPARK-5548: applied reviewer's comments
843eafb [Jacek Lewandowski] SPARK-5548: Fix for AkkaUtilsSuite failure - attempt 2
2015-02-19 09:53:36 -08:00
Burak Yavuz ae6cfb3acd [SPARK-5811] Added documentation for maven coordinates and added Spark Packages support
Documentation for maven coordinates + Spark Package support. Added pyspark tests for `--packages`

Author: Burak Yavuz <brkyvz@gmail.com>
Author: Davies Liu <davies@databricks.com>

Closes #4662 from brkyvz/SPARK-5811 and squashes the following commits:

56ccccd [Burak Yavuz] fixed broken test
64cb8ee [Burak Yavuz] passed pep8 on local
c07b81e [Burak Yavuz] fixed pep8
a8bd6b7 [Burak Yavuz] submit PR
4ef4046 [Burak Yavuz] ready for PR
8fb02e5 [Burak Yavuz] merged master
25c9b9f [Burak Yavuz] Merge branch 'master' of github.com:apache/spark into python-jar
560d13b [Burak Yavuz] before PR
17d3f76 [Davies Liu] support .jar as python package
a3eb717 [Burak Yavuz] Merge branch 'master' of github.com:apache/spark into SPARK-5811
c60156d [Burak Yavuz] [SPARK-5811] Added documentation for maven coordinates
2015-02-17 17:23:22 -08:00
Ryan Williams d8f69cf788 [SPARK-5778] throw if nonexistent metrics config file provided
previous behavior was to log an error; this is fine in the general
case where no `spark.metrics.conf` parameter was specified, in which
case a default `metrics.properties` is looked for, and the execption
logged and suppressed if it doesn't exist.

if the user has purposefully specified a metrics.conf file, however,
it makes more sense to show them an error when said file doesn't
exist.

Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #4571 from ryan-williams/metrics and squashes the following commits:

5bccb14 [Ryan Williams] private-ize some MetricsConfig members
08ff998 [Ryan Williams] rename METRICS_CONF: DEFAULT_METRICS_CONF_FILENAME
f4d7fab [Ryan Williams] fix tests
ad24b0e [Ryan Williams] add "metrics.properties" to .rat-excludes
94e810b [Ryan Williams] throw if nonexistent Sink class is specified
31d2c30 [Ryan Williams] metrics code review feedback
56287db [Ryan Williams] throw if nonexistent metrics config file provided
2015-02-17 10:57:16 -08:00
Josh Rosen 58a82a7882 [SPARK-5849] Handle more types of invalid JSON requests in SubmitRestProtocolMessage.parseAction
This patch improves SubmitRestProtocol's handling of invalid JSON requests in cases where those requests were parsable as JSON but not as JSON objects (e.g. they could be parsed as ararys or strings).  I replaced an unchecked cast with pattern-matching and added a new test case.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #4637 from JoshRosen/rest-protocol-cast and squashes the following commits:

b3f282b [Josh Rosen] [SPARK-5849] Handle more types of invalid JSON in SubmitRestProtocolMessage.parseAction
2015-02-16 18:08:02 -08:00
Josh Rosen d06d5ee9b3 [SPARK-5227] [SPARK-5679] Disable FileSystem cache in WholeTextFileRecordReaderSuite
This patch fixes two difficult-to-reproduce Jenkins test failures in InputOutputMetricsSuite (SPARK-5227 and SPARK-5679).  The problem was that WholeTextFileRecordReaderSuite modifies the `fs.local.block.size` Hadoop configuration and this change was affecting subsequent test suites due to Hadoop's caching of FileSystem instances (see HADOOP-8490 for more details).

The fix implemented here is to disable FileSystem caching in WholeTextFileRecordReaderSuite.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #4599 from JoshRosen/inputoutputsuite-fix and squashes the following commits:

47dc447 [Josh Rosen] [SPARK-5227] [SPARK-5679] Disable FileSystem cache in WholeTextFileRecordReaderSuite
2015-02-13 17:45:31 -08:00
Andrew Or 5d3cc6b3d7 [HOTFIX] Fix build break in MesosSchedulerBackendSuite 2015-02-13 13:10:29 -08:00
Josh Rosen 077eec2d9d [SPARK-5735] Replace uses of EasyMock with Mockito
This patch replaces all uses of EasyMock with Mockito.  There are two motivations for this:

1. We should use a single mocking framework in our tests in order to keep things consistent.
2. EasyMock may be responsible for non-deterministic unit test failures due to its Objensis dependency (see SPARK-5626 for more details).

Most of these changes are fairly mechanical translations of Mockito code to EasyMock, although I made a small change that strengthens the assertions in one test in KinesisReceiverSuite.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #4578 from JoshRosen/SPARK-5735-remove-easymock and squashes the following commits:

0ab192b [Josh Rosen] Import sorting plus two minor changes to more closely match old semantics.
977565b [Josh Rosen] Remove EasyMock from build.
fae1d8f [Josh Rosen] Remove EasyMock usage in KinesisReceiverSuite.
7cca486 [Josh Rosen] Remove EasyMock usage in MesosSchedulerBackendSuite
fc5e94d [Josh Rosen] Remove EasyMock in CacheManagerSuite
2015-02-13 09:55:36 -08:00
Ryan Williams fc6d3e796a [SPARK-5783] Better eventlog-parsing error messages
Author: Ryan Williams <ryan.blake.williams@gmail.com>

Closes #4573 from ryan-williams/history and squashes the following commits:

a8647ec [Ryan Williams] fix test calls to .replay()
98aa3fe [Ryan Williams] include filename in history-parsing error message
8deecf0 [Ryan Williams] add line number to history-parsing error message
b668b52 [Ryan Williams] add log info line to history-eventlog parsing
2015-02-13 09:47:26 -08:00
Andrew Or 1d5663e92c [SPARK-5760][SPARK-5761] Fix standalone rest protocol corner cases + revamp tests
The changes are summarized in the commit message. Test or test-related code accounts for 90% of the lines changed.

Author: Andrew Or <andrew@databricks.com>

Closes #4557 from andrewor14/rest-tests and squashes the following commits:

b4dc980 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest-tests
b55e40f [Andrew Or] Add test for unknown fields
cc96993 [Andrew Or] private[spark] -> private[rest]
578cf45 [Andrew Or] Clean up test code a little
d82d971 [Andrew Or] v1 -> serverVersion
ea48f65 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest-tests
00999a8 [Andrew Or] Revamp tests + fix a few corner cases
2015-02-12 14:47:52 -08:00
Kay Ousterhout 893d6fd704 [SPARK-5645] Added local read bytes/time to task metrics
ksakellis I stumbled on your JIRA for this yesterday; I know it's assigned to you but I'd already done this for my own uses a while ago so thought I could help save you the work of doing it!  Hopefully this doesn't duplicate any work you've already done.

Here's a screenshot of what the UI looks like:
![image](https://cloud.githubusercontent.com/assets/1108612/6135352/c03e7276-b11c-11e4-8f11-c6aefe1f35b9.png)
Based on a discussion with pwendell, I put the data read remotely in as an additional metric rather than showing it in brackets as you'd suggested, Kostas.  The assumption here is that the average user doesn't care about the differentiation between local / remote data, so it's better not to pollute the UI.

I also added data about the local read time, which I've found very helpful for debugging, but I didn't put it in the UI because I think it's probably something not a ton of people will need to use.

With this change, the total read time and total write time shown in the UI will be equal, fixing a long-term source of user confusion:
![image](https://cloud.githubusercontent.com/assets/1108612/6135399/25f14490-b11d-11e4-8086-20be5f4002e6.png)

Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #4510 from kayousterhout/SPARK-5645 and squashes the following commits:

4a0182c [Kay Ousterhout] oops
5f5da1b [Kay Ousterhout] Small style fix
5da04cf [Kay Ousterhout] Addressed more comments from Kostas
ba05149 [Kay Ousterhout] Remove parens
a9dc685 [Kay Ousterhout] Kostas comment, test fix
33d2e2d [Kay Ousterhout] Merge remote-tracking branch 'upstream/master' into SPARK-5645
347e2cd [Kay Ousterhout] [SPARK-5645] Added local read bytes/time to task metrics
2015-02-12 14:36:27 -08:00
mcheah 1cb3770075 [SPARK-4879] Use driver to coordinate Hadoop output committing for speculative tasks
Previously, SparkHadoopWriter always committed its tasks without question. The problem is that when speculation is enabled sometimes this can result in multiple tasks committing their output to the same file. Even though an HDFS-writing task may be re-launched due to speculation, the original task is not killed and may eventually commit as well.

This can cause strange race conditions where multiple tasks that commit interfere with each other, with the result being that some partition files are actually lost entirely. For more context on these kinds of scenarios, see SPARK-4879.

In Hadoop MapReduce jobs, the application master is a central coordinator that authorizes whether or not any given task can commit. Before a task commits its output, it queries the application master as to whether or not such a commit is safe, and the application master does bookkeeping as tasks are requesting commits. Duplicate tasks that would write to files that were already written to from other tasks are prohibited from committing.

This patch emulates that functionality - the crucial missing component was a central arbitrator, which is now a module called the OutputCommitCoordinator. The coordinator lives on the driver and the executors can obtain a reference to this actor and request its permission to commit. As tasks commit and are reported as completed successfully or unsuccessfully by the DAGScheduler, the commit coordinator is informed of the task completion events as well to update its internal state.

Future work includes more rigorous unit testing and extra optimizations should this patch cause a performance regression. It is unclear what the overall cost of communicating back to the driver on every hadoop-committing task will be. It's also important for those hitting this issue to backport this onto previous version of Spark because the bug has serious consequences, that is, data is lost.

Currently, the OutputCommitCoordinator is only used when `spark.speculation` is true.  It can be disabled by setting `spark.hadoop.outputCommitCoordination.enabled=false` in SparkConf.

This patch is an updated version of #4155 (by mccheah), which in turn was an updated version of this PR.

Closes #4155.

Author: mcheah <mcheah@palantir.com>
Author: Josh Rosen <joshrosen@databricks.com>

Closes #4066 from JoshRosen/SPARK-4879-sparkhadoopwriter-fix and squashes the following commits:

658116b [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4879-sparkhadoopwriter-fix
ed783b2 [Josh Rosen] Address Andrew’s feedback.
e7be65a [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4879-sparkhadoopwriter-fix
14861ea [Josh Rosen] splitID -> partitionID in a few places
ed8b554 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4879-sparkhadoopwriter-fix
48d5c1c [Josh Rosen] Roll back copiesRunning change in TaskSetManager
3969f5f [Josh Rosen] Re-enable guarding of commit coordination with spark.speculation setting.
ede7590 [Josh Rosen] Add test to ensure that a job that denies all commits cannot complete successfully.
97da5fe [Josh Rosen] Use actor only for RPC; call methods directly in DAGScheduler.
f582574 [Josh Rosen] Some cleanup in OutputCommitCoordinatorSuite
a7c0e29 [Josh Rosen] Create fake TaskInfo using dummy fields instead of Mockito.
997b41b [Josh Rosen] Roll back unnecessary DAGSchedulerSingleThreadedProcessLoop refactoring:
459310a [Josh Rosen] Roll back TaskSetManager changes that broke other tests.
dd00b7c [Josh Rosen] Move CommitDeniedException to executors package; remove `@DeveloperAPI` annotation.
c79df98 [Josh Rosen] Some misc. code style + doc changes:
f7d69c5 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4879-sparkhadoopwriter-fix
92e6dc9 [Josh Rosen] Bug fix: use task ID instead of StageID to index into authorizedCommitters.
b344bad [Josh Rosen] (Temporarily) re-enable “always coordinate” for testing purposes.
0aec91e [Josh Rosen] Only coordinate when speculation is enabled; add configuration option to bypass new coordination.
594e41a [mcheah] Fixing a scalastyle error
60a47f4 [mcheah] Writing proper unit test for OutputCommitCoordinator and fixing bugs.
d63f63f [mcheah] Fixing compiler error
9fe6495 [mcheah] Fixing scalastyle
1df2a91 [mcheah] Throwing exception if SparkHadoopWriter commit denied
d431144 [mcheah] Using more concurrency to process OutputCommitCoordinator requests.
c334255 [mcheah] Properly handling messages that could be sent after actor shutdown.
8d5a091 [mcheah] Was mistakenly serializing the accumulator in test suite.
9c6a4fa [mcheah] More OutputCommitCoordinator cleanup on stop()
78eb1b5 [mcheah] Better OutputCommitCoordinatorActor stopping; simpler canCommit
83de900 [mcheah] Making the OutputCommitCoordinatorMessage serializable
abc7db4 [mcheah] TaskInfo can't be null in DAGSchedulerSuite
f135a8e [mcheah] Moving the output commit coordinator from class into method.
1c2b219 [mcheah] Renaming oudated names for test function classes
66a71cd [mcheah] Removing whitespace modifications
6b543ba [mcheah] Removing redundant accumulator in unit test
c9decc6 [mcheah] Scalastyle fixes
bc80770 [mcheah] Unit tests for OutputCommitCoordinator
6e6f748 [mcheah] [SPARK-4879] Use the Spark driver to authorize Hadoop commits.
2015-02-10 20:12:18 -08:00
Andrew Or b640c841fc [HOTFIX][SPARK-4136] Fix compilation and tests 2015-02-10 11:18:01 -08:00
Sandy Ryza 69bc3bb6cf SPARK-4136. Under dynamic allocation, cancel outstanding executor requests when no longer needed
This takes advantage of the changes made in SPARK-4337 to cancel pending requests to YARN when they are no longer needed.

Each time the timer in `ExecutorAllocationManager` strikes, we compute `maxNumNeededExecutors`, the maximum number of executors we could fill with the current load.  This is calculated as the total number of running and pending tasks divided by the number of cores per executor.  If `maxNumNeededExecutors` is below the total number of running and pending executors, we call `requestTotalExecutors(maxNumNeededExecutors)` to let the cluster manager know that it should cancel any pending requests above this amount.  If not, `maxNumNeededExecutors` is just used as a bound in alongside the configured `maxExecutors` to limit the number of new requests.

The patch modifies the API exposed by `ExecutorAllocationClient` for requesting additional executors by moving from `requestExecutors` to `requestTotalExecutors`.  This makes the communication between the `ExecutorAllocationManager` and the `YarnAllocator` easier to reason about and removes some state that needed to be kept in the `CoarseGrainedSchedulerBackend`.  I think an argument can be made that this makes for a less attractive user-facing API in `SparkContext`, but I'm having trouble envisioning situations where a user would want to use either of these APIs.

This will likely break some tests, but I wanted to get feedback on the approach before adding tests and polishing.

Author: Sandy Ryza <sandy@cloudera.com>

Closes #4168 from sryza/sandy-spark-4136 and squashes the following commits:

37ce77d [Sandy Ryza] Warn on negative number
cd3b2ff [Sandy Ryza] SPARK-4136
2015-02-10 11:12:06 -08:00
Kay Ousterhout a2d33d0b01 [SPARK-5701] Only set ShuffleReadMetrics when task has shuffle deps
The updateShuffleReadMetrics method in TaskMetrics (called by the executor heartbeater) will currently always add a ShuffleReadMetrics to TaskMetrics (with values set to 0), even when the task didn't read any shuffle data. ShuffleReadMetrics should only be added if the task reads shuffle data.

Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #4488 from kayousterhout/SPARK-5701 and squashes the following commits:

673ed58 [Kay Ousterhout] SPARK-5701: Only set ShuffleReadMetrics when task has shuffle deps
2015-02-09 21:22:09 -08:00
Marcelo Vanzin 20a6013106 [SPARK-2996] Implement userClassPathFirst for driver, yarn.
Yarn's config option `spark.yarn.user.classpath.first` does not work the same way as
`spark.files.userClassPathFirst`; Yarn's version is a lot more dangerous, in that it
modifies the system classpath, instead of restricting the changes to the user's class
loader. So this change implements the behavior of the latter for Yarn, and deprecates
the more dangerous choice.

To be able to achieve feature-parity, I also implemented the option for drivers (the existing
option only applies to executors). So now there are two options, each controlling whether
to apply userClassPathFirst to the driver or executors. The old option was deprecated, and
aliased to the new one (`spark.executor.userClassPathFirst`).

The existing "child-first" class loader also had to be fixed. It didn't handle resources, and it
was also doing some things that ended up causing JVM errors depending on how things
were being called.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #3233 from vanzin/SPARK-2996 and squashes the following commits:

9cf9cf1 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
a1499e2 [Marcelo Vanzin] Remove SPARK_HOME propagation.
fa7df88 [Marcelo Vanzin] Remove 'test.resource' file, create it dynamically.
a8c69f1 [Marcelo Vanzin] Review feedback.
cabf962 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
a1b8d7e [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
3f768e3 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
2ce3c7a [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
0e6d6be [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
70d4044 [Marcelo Vanzin] Fix pyspark/yarn-cluster test.
0fe7777 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
0e6ef19 [Marcelo Vanzin] Move class loaders around and make names more meaninful.
fe970a7 [Marcelo Vanzin] Review feedback.
25d4fed [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
3cb6498 [Marcelo Vanzin] Call the right loadClass() method on the parent.
fbb8ab5 [Marcelo Vanzin] Add locking in loadClass() to avoid deadlocks.
2e6c4b7 [Marcelo Vanzin] Mention new setting in documentation.
b6497f9 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
a10f379 [Marcelo Vanzin] Some feedback.
3730151 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
f513871 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
44010b6 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
7b57cba [Marcelo Vanzin] Remove now outdated message.
5304d64 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
35949c8 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
54e1a98 [Marcelo Vanzin] Merge branch 'master' into SPARK-2996
d1273b2 [Marcelo Vanzin] Add test file to rat exclude.
fa1aafa [Marcelo Vanzin] Remove write check on user jars.
89d8072 [Marcelo Vanzin] Cleanups.
a963ea3 [Marcelo Vanzin] Implement spark.driver.userClassPathFirst for standalone cluster mode.
50afa5f [Marcelo Vanzin] Fix Yarn executor command line.
7d14397 [Marcelo Vanzin] Register user jars in executor up front.
7f8603c [Marcelo Vanzin] Fix yarn-cluster mode without userClassPathFirst.
20373f5 [Marcelo Vanzin] Fix ClientBaseSuite.
55c88fa [Marcelo Vanzin] Run all Yarn integration tests via spark-submit.
0b64d92 [Marcelo Vanzin] Add deprecation warning to yarn option.
4a84d87 [Marcelo Vanzin] Fix the child-first class loader.
d0394b8 [Marcelo Vanzin] Add "deprecated configs" to SparkConf.
46d8cf2 [Marcelo Vanzin] Update doc with new option, change name to "userClassPathFirst".
a314f2d [Marcelo Vanzin] Enable driver class path isolation in SparkSubmit.
91f7e54 [Marcelo Vanzin] [yarn] Enable executor class path isolation.
a853e74 [Marcelo Vanzin] Re-work CoarseGrainedExecutorBackend command line arguments.
89522ef [Marcelo Vanzin] Add class path isolation support for Yarn cluster mode.
2015-02-09 21:17:28 -08:00
Andrew Or 1390e56fa8 [SPARK-5388] Provide a stable application submission gateway for standalone cluster mode
The goal is to provide a stable, REST-based application submission gateway that is not inherently based on Akka, which is unstable across versions. This PR targets standalone cluster mode, but is implemented in a general enough manner that can be potentially extended to other modes in the future. Client mode is currently not included in the changes here because there are many more Akka messages exchanged there.

As of the changes here, the Master will advertise two ports, 7077 and 6066. We need to keep around the old one (7077) for client mode and older versions of Spark submit. However, all new versions of Spark submit will use the REST gateway (6066).

By the way this includes ~700 lines of tests and ~200 lines of license.

Author: Andrew Or <andrew@databricks.com>

Closes #4216 from andrewor14/rest and squashes the following commits:

8d7ce07 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
6f0c597 [Andrew Or] Use nullable fields for integer and boolean values
dfe4bd7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
b9e2a08 [Andrew Or] Minor comments
02b5cea [Andrew Or] Fix tests
d2b1ef8 [Andrew Or] Comment changes + minor code refactoring across the board
9c82a36 [Andrew Or] Minor comment and wording updates
b4695e7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
c9a8ad7 [Andrew Or] Do not include appResource and mainClass as properties
6fc7670 [Andrew Or] Report REST server response back to the user
40e6095 [Andrew Or] Pass submit parameters through system properties
cbd670b [Andrew Or] Include unknown fields, if any, in server response
9fee16f [Andrew Or] Include server protocol version on mismatch
09f873a [Andrew Or] Fix style
8188e61 [Andrew Or] Upgrade Jackson from 2.3.0 to 2.4.4
37538e0 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
9165ae8 [Andrew Or] Fall back to Akka if endpoint was not REST
252d53c [Andrew Or] Clean up server error handling behavior further
c643f64 [Andrew Or] Fix style
bbbd329 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
792e112 [Andrew Or] Use specific HTTP response codes on error
f98660b [Andrew Or] Version the protocol and include it in REST URL
721819f [Andrew Or] Provide more REST-like interface for submit/kill/status
581f7bf [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
9e0d1af [Andrew Or] Move some classes around to reduce number of files (minor)
42e5de4 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
1f1c03f [Andrew Or] Use Jackson's DefaultScalaModule to simplify messages
9229433 [Andrew Or] Reduce duplicate naming in REST field
ade28fd [Andrew Or] Clean up REST response output in Spark submit
b2fef8b [Andrew Or] Abstract the success field to the general response
6c57b4b [Andrew Or] Increase timeout in end-to-end tests
bf696ff [Andrew Or] Add checks for enabling REST when using kill/status
7ee6737 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
e2f7f5f [Andrew Or] Provide more safeguard against missing fields
9581df7 [Andrew Or] Clean up uses of exceptions
914fdff [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
e2104e6 [Andrew Or] stable -> rest
3db7379 [Andrew Or] Fix comments and name fields for better error messages
8d43486 [Andrew Or] Replace SubmitRestProtocolAction with class name
df90e8b [Andrew Or] Use Jackson for JSON de/serialization
d7a1f9f [Andrew Or] Fix local cluster tests
efa5e18 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
e42c131 [Andrew Or] Add end-to-end tests for standalone REST protocol
837475b [Andrew Or] Show the REST port on the Master UI
d8d3717 [Andrew Or] Use a daemon thread pool for REST server
6568ca5 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest
77774ba [Andrew Or] Minor fixes
206cae4 [Andrew Or] Refactor and add tests for the REST protocol
63c05b3 [Andrew Or] Remove MASTER as a field (minor)
9e21b72 [Andrew Or] Action -> SparkSubmitAction (minor)
51c5ca6 [Andrew Or] Distinguish client and server side Spark versions
b44e103 [Andrew Or] Implement status requests + fix validation behavior
120ab9d [Andrew Or] Support kill and request driver status through SparkSubmit
544de1d [Andrew Or] Major clean ups in code and comments
e958cae [Andrew Or] Supported nested values in messages
484bd21 [Andrew Or] Specify an ordering for fields in SubmitDriverRequestMessage
6ff088d [Andrew Or] Rename classes to generalize REST protocol
af9d9cb [Andrew Or] Integrate REST protocol in standalone mode
53e7c0e [Andrew Or] Initial client, server, and all the messages
2015-02-06 15:57:06 -08:00
Kostas Sakellis dcd1e42d6b [SPARK-4874] [CORE] Collect record count metrics
Collects record counts for both Input/Output and Shuffle Metrics. For the input/output metrics, it just appends the counter every time the iterators get accessed.

For shuffle on the write side, we count the metrics post aggregation (after a map side combine) and on the read side we count the metrics pre aggregation. This allows both the bytes read/written metrics and the records read/written to line up.

For backwards compatibility, if we deserialize an older event that doesn't have record metrics, we set the metric to -1.

Author: Kostas Sakellis <kostas@cloudera.com>

Closes #4067 from ksakellis/kostas-spark-4874 and squashes the following commits:

bd919be [Kostas Sakellis] Changed 'Records Read' in shuffleReadMetrics json output to 'Total Records Read'
dad4d57 [Kostas Sakellis] Add a comment and check to BlockObjectWriter so that it cannot be reopend.
6f236a1 [Kostas Sakellis] Renamed _recordsWritten in ShuffleWriteMetrics to be more consistent
70620a0 [Kostas Sakellis] CR Feedback
17faa3a [Kostas Sakellis] Removed AtomicLong in favour of using Long
b6f9923 [Kostas Sakellis] Merge AfterNextInterceptingIterator with InterruptableIterator to save a function call
46c8186 [Kostas Sakellis] Combined Bytes and # records into one column
57551c1 [Kostas Sakellis] Conforms to SPARK-3288
6cdb44e [Kostas Sakellis] Removed the generic InterceptingIterator and repalced it with specific implementation
1aa273c [Kostas Sakellis] CR Feedback
1bb78b1 [Kostas Sakellis] [SPARK-4874] [CORE] Collect record count metrics
2015-02-06 14:31:20 -08:00
Marcelo Vanzin 5687bab8fd [SPARK-5600] [core] Clean up FsHistoryProvider test, fix app sort order.
Clean up some test setup code to remove duplicate instantiation of the
provider. Also make sure unfinished apps are sorted correctly.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #4370 from vanzin/SPARK-5600 and squashes the following commits:

0d048d5 [Marcelo Vanzin] Cleanup test code a bit.
2585119 [Marcelo Vanzin] Review feedback.
8b97544 [Marcelo Vanzin] Merge branch 'master' into SPARK-5600
be979e9 [Marcelo Vanzin] Merge branch 'master' into SPARK-5600
298371c [Marcelo Vanzin] [SPARK-5600] [core] Clean up FsHistoryProvider test, fix app sort order.
2015-02-06 14:23:09 -08:00
Josh Rosen fb6c0cbac4 [HOTFIX] Fix test build break in ExecutorAllocationManagerSuite.
This was caused because #3486 added a new field to ExecutorInfo and #4369
added new tests that created ExecutorInfos.  These patches were merged in
quick succession and were never tested together, hence the compilation error.
2015-02-06 11:48:52 -08:00
Josh Rosen 0d74bd7fd7 [SPARK-5555] Enable UISeleniumSuite tests
This patch enables UISeleniumSuite, a set of tests for the Spark application web UI.  These tests were previously disabled because they were slow, but I think we now have sufficient test time budget that the benefit of enabling them outweighs the time costs.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #4334 from JoshRosen/enable-uiseleniumsuite and squashes the following commits:

4ab9477 [Josh Rosen] Use BeforeAndAfterAll to cleanup WebDriver
71efc72 [Josh Rosen] Update broken UISeleniumSuite tests; use random port #.
a5ab595 [Josh Rosen] Enable UISeleniumSuite tests.
2015-02-06 11:14:58 -08:00
Kostas Sakellis 32e964c410 SPARK-2450 Adds executor log links to Web UI
Adds links to stderr/stdout in the executor tab of the webUI for:
1) Standalone
2) Yarn client
3) Yarn cluster

This tries to add the log url support in a general way so as to make it easy to add support for all the
cluster managers. This is done by using environment variables to pass to the executor the log urls. The
SPARK_LOG_URL_ prefix is used and so additional logs besides stderr/stdout can also be added.

To propagate this information to the UI we use the onExecutorAdded spark listener event.

Although this commit doesn't add log urls when running on a mesos cluster, it should be possible to add using the same mechanism.

Author: Kostas Sakellis <kostas@cloudera.com>
Author: Josh Rosen <joshrosen@databricks.com>

Closes #3486 from ksakellis/kostas-spark-2450 and squashes the following commits:

d190936 [Josh Rosen] Fix a few minor style / formatting nits. Reset listener after each test Don't null listener out at end of main().
8673fe1 [Kostas Sakellis] CR feedback. Hide the log column if there are no logs available
5bf6952 [Kostas Sakellis] [SPARK-2450] [CORE] Adds exeuctor log links to Web UI
2015-02-06 11:13:00 -08:00
lianhuiwang 6072fcc14e [SPARK-5593][Core]Replace BlockManagerListener with ExecutorListener in ExecutorAllocationListener
More strictly, in ExecutorAllocationListener, we need to replace onBlockManagerAdded, onBlockManagerRemoved with onExecutorAdded,onExecutorRemoved. because at some time, onExecutorAdded and onExecutorRemoved are more accurate to express these meanings. example at SPARK-5529, BlockManager has been removed,but executor is existed.
 andrewor14 sryza

Author: lianhuiwang <lianhuiwang09@gmail.com>

Closes #4369 from lianhuiwang/SPARK-5593 and squashes the following commits:

333367c [lianhuiwang] Replace BlockManagerListener with ExecutorListener in ExecutorAllocationListener
2015-02-06 11:09:37 -08:00
Stephen Haberman 9792bec596 [SPARK-4877] Allow user first classes to extend classes in the parent.
Previously, the classloader isolation was almost too good, such
that if a child class needed to load/reference a class that was
only available in the parent, it could not do so.

This adds tests for that case, the user-first Fake2 class extends
the only-in-parent Fake3 class.

It also sneaks in a fix where only the first stage seemed to work,
and on subsequent stages, a LinkageError happened because classes
from the user-first classpath were getting defined twice.

Author: Stephen Haberman <stephen@exigencecorp.com>

Closes #3725 from stephenh/4877_user_first_parent_inheritance and squashes the following commits:

dabcd35 [Stephen Haberman] [SPARK-4877] Respect userClassPathFirst for the driver code too.
3d0fa7c [Stephen Haberman] [SPARK-4877] Allow user first classes to extend classes in the parent.
2015-02-06 11:03:56 -08:00
Marcelo Vanzin 856928979f [SPARK-5582] [history] Ignore empty log directories.
Empty log directories are not useful at the moment, but if one ends
up showing in the log root, it breaks the code that checks for log
directories.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #4352 from vanzin/SPARK-5582 and squashes the following commits:

1a6a3d4 [Marcelo Vanzin] [SPARK-5582] Fix exception when looking at empty directories.
2015-02-06 10:07:20 +00:00
Jacek Lewandowski 081ac69f3d SPARK-5548: Fixed a race condition in AkkaUtilsSuite
`Await.result` and `selection.resolveOne` runs the same timeout simultaneously. When `Await.result` timeout is reached first, then `TimeoutException` is thrown. On the other hand, when `selection.resolveOne` timeout is reached first, `ActorNotFoundException` is thrown. This is an obvious race condition and the easiest way to fix it is to increase the timeout of one method to make sure the code fails on the other method first.

Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>

Closes #4343 from jacek-lewandowski/SPARK-5548-1.3 and squashes the following commits:

b9ba47e [Jacek Lewandowski] SPARK-5548: Fixed a race condition in AkkaUtilsSuite
2015-02-05 12:21:40 -08:00
Sandy Ryza c4b1108c3f SPARK-4687. Add a recursive option to the addFile API
This adds a recursive option to the addFile API to satisfy Hive's needs.  It only allows specifying HDFS dirs that will be copied down on every executor.

There are a couple outstanding questions.
* Should we allow specifying local dirs as well?  The best way to do this would probably be to archive them.  The drawback is that it would require a fair bit of code that I don't know of any current use cases for.
* The addFiles implementation has a caching component that I don't entirely understand.  What events are we caching between?  AFAICT it's users calling addFile on the same file in the same app at different times?  Do we want/need to add something similar for addDirectory.
*  The addFiles implementation will check to see if an added file already exists and has the same contents.  I imagine we want the same behavior, so planning to add this unless people think otherwise.

I plan to add some tests if people are OK with the approach.

Author: Sandy Ryza <sandy@cloudera.com>

Closes #3670 from sryza/sandy-spark-4687 and squashes the following commits:

f9fc77f [Sandy Ryza] Josh's comments
70cd24d [Sandy Ryza] Add another test
13da824 [Sandy Ryza] Revert executor changes
38bf94d [Sandy Ryza] Marcelo's comments
ca83849 [Sandy Ryza] Add addFile test
1941be3 [Sandy Ryza] Fix test and avoid HTTP server in local mode
31f15a9 [Sandy Ryza] Use cache recursively and fix some compile errors
0239c3d [Sandy Ryza] Change addDirectory to addFile with recursive
46fe70a [Sandy Ryza] SPARK-4687. Add a addDirectory API
2015-02-05 10:15:55 -08:00
Josh Rosen 9a7ce70eab [SPARK-5411] Allow SparkListeners to be specified in SparkConf and loaded when creating SparkContext
This patch introduces a new configuration option, `spark.extraListeners`, that allows SparkListeners to be specified in SparkConf and registered before the SparkContext is initialized.  From the configuration documentation:

> A comma-separated list of classes that implement SparkListener; when initializing SparkContext, instances of these classes will be created and registered with Spark's listener bus. If a class has a single-argument constructor that accepts a SparkConf, that constructor will be called; otherwise, a zero-argument constructor will be called. If no valid constructor can be found, the SparkContext creation will fail with an exception.

This motivation for this patch is to allow monitoring code to be easily injected into existing Spark programs without having to modify those programs' code.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #4111 from JoshRosen/SPARK-5190-register-sparklistener-in-sc-constructor and squashes the following commits:

8370839 [Josh Rosen] Two minor fixes after merging with master
6e0122c [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-5190-register-sparklistener-in-sc-constructor
1a5b9a0 [Josh Rosen] Remove SPARK_EXTRA_LISTENERS environment variable.
2daff9b [Josh Rosen] Add a couple of explanatory comments for SPARK_EXTRA_LISTENERS.
b9973da [Josh Rosen] Add test to ensure that conf and env var settings are merged, not overriden.
d6f3113 [Josh Rosen] Use getConstructors() instead of try-catch to find right constructor.
d0d276d [Josh Rosen] Move code into setupAndStartListenerBus() method
b22b379 [Josh Rosen] Instantiate SparkListeners from classes listed in configurations.
9c0d8f1 [Josh Rosen] Revert "[SPARK-5190] Allow SparkListeners to be registered before SparkContext starts."
217ecc0 [Josh Rosen] Revert "Add addSparkListener to JavaSparkContext"
25988f3 [Josh Rosen] Add addSparkListener to JavaSparkContext
163ba19 [Josh Rosen] [SPARK-5190] Allow SparkListeners to be registered before SparkContext starts.
2015-02-04 17:18:03 -08:00
Davies Liu 0a89b15685 [SPARK-4939] move to next locality when no pending tasks
Currently, if there are different locality in a task set, the tasks with NODE_LOCAL only get scheduled after all the PROCESS_LOCAL tasks are scheduled and timeout with spark.locality.wait.process (3 seconds by default). In local mode, the LocalScheduler will never call resourceOffer() again once it failed to get a task with same locality, then all the NODE_LOCAL tasks will be never scheduled.

This bug could be reproduced by run example python/streaming/stateful_network_wordcount.py, it will hang after finished a batch with some data.

This patch will check whether there is task for current locality level, if not, it will change to next locality level without waiting for `spark.locality.wait.process` seconds. It works for all locality levels.

Because the list of pending tasks are updated lazily, the check can be false-positive, it means it will not move to next locality level even there is no valid pending tasks, it will wait for timeout.

Author: Davies Liu <davies@databricks.com>

Closes #3779 from davies/local_streaming and squashes the following commits:

2d25fb3 [Davies Liu] Update TaskSetManager.scala
1550668 [Davies Liu] add comment
1c37aac [Davies Liu] address comments
6b13824 [Davies Liu] address comments
906f456 [Davies Liu] Merge branch 'master' of github.com:apache/spark into local_streaming
414e79e [Davies Liu] fix bug, add logging
ff8eabb [Davies Liu] Merge branch 'master' into local_streaming
28d1b3c [Davies Liu] check tasks
9d0ceab [Davies Liu] Merge branch 'master' of github.com:apache/spark into local_streaming
37a2804 [Davies Liu] fix tests
49bda82 [Davies Liu] address comment
d8fb95a [Davies Liu] move to next locality level if no more tasks
2d6ae73 [Davies Liu] add comments
32d363f [Davies Liu] add regression test
7d8c5a5 [Davies Liu] jump to next locality if no pending tasks for executors
2015-02-04 14:22:07 -08:00
Burak Yavuz 6aed719e50 [SPARK-5341] Use maven coordinates as dependencies in spark-shell and spark-submit
This PR adds support for using maven coordinates as dependencies to spark-shell.
Coordinates can be provided as a comma-delimited string after the flag `--packages`.
Additional remote repositories (like SonaType) can be supplied as a comma-delimited string after the flag
`--repositories`.

Uses the Ivy library to resolve dependencies. Unfortunately the library has no decent documentation, therefore solving more complex dependency issues can be a problem.

pwendell, mateiz, mengxr

**Note: This is still a WIP. The following need to be handled:**
- [x] add docs for the methods
- [x] take local ivy cache path as an argument
- [x] add tests
- [x] add Windows compatibility
- [x] exclude unused Ivy dependencies

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #4215 from brkyvz/SPARK-5341ivy and squashes the following commits:

9215851 [Burak Yavuz] ready to merge
db2a5cc [Burak Yavuz] changed logging to printStream
9dae87f [Burak Yavuz] file separators changed
71c374d [Burak Yavuz] merge conflicts fixed
c08dc9f [Burak Yavuz] fixed merge conflicts
3ada19a [Burak Yavuz] fixed Jenkins error (hopefully) and added comment on oro
43c2290 [Burak Yavuz] fixed that ONE line
231f72f [Burak Yavuz] addressed code review
2cd6562 [Burak Yavuz] Merge branch 'master' of github.com:apache/spark into SPARK-5341ivy
85ec5a3 [Burak Yavuz] added oro as a dependency explicitly
ea44ca4 [Burak Yavuz] add oro back to dependencies
cef0e24 [Burak Yavuz] IntelliJ is just messing things up
97c4a92 [Burak Yavuz] fix more weird IntelliJ formatting
9cf077d [Burak Yavuz] fix weird IntelliJ formatting
dcf5e13 [Burak Yavuz] fix windows command line flags
3a23f21 [Burak Yavuz] excluded ivy dependencies
53423e0 [Burak Yavuz] tests added
3705907 [Burak Yavuz] remove ivy-repo as a command line argument. Use global ivy cache as default
c04d885 [Burak Yavuz] take path to ivy cache as a conf
2edc9b5 [Burak Yavuz] managed to exclude Spark and it's dependencies
a0870af [Burak Yavuz] add docs. remove unnecesary new lines
6645af4 [Burak Yavuz] [SPARK-5341] added base implementation
882c4c8 [Burak Yavuz] added maven dependency download
2015-02-03 22:39:17 -08:00
zsxwing d37978d8aa [SPARK-4795][Core] Redesign the "primitive type => Writable" implicit APIs to make them be activated automatically
Try to redesign the "primitive type => Writable" implicit APIs to make them be activated automatically and without breaking binary compatibility.

However, this PR will breaking the source compatibility if people use `xxxToXxxWritable` occasionally. See the unit test in `graphx`.

Author: zsxwing <zsxwing@gmail.com>

Closes #3642 from zsxwing/SPARK-4795 and squashes the following commits:

914b2d6 [zsxwing] Add implicit back to the Writables methods
0b9017f [zsxwing] Add some docs
a0e8509 [zsxwing] Merge branch 'master' into SPARK-4795
39343de [zsxwing] Fix the unit test
64853af [zsxwing] Reorganize the rest 'implicit' methods in SparkContext
2015-02-03 20:17:12 -08:00
Reynold Xin bebf4c42be [SPARK-5549] Define TaskContext interface in Scala.
So the interface documentation shows up in ScalaDoc.

Author: Reynold Xin <rxin@databricks.com>

Closes #4324 from rxin/TaskContext-scala and squashes the following commits:

2480a17 [Reynold Xin] comment
573756f [Reynold Xin] style fixes and javadoc fixes.
87dd537 [Reynold Xin] [SPARK-5549] Define TaskContext interface in Scala.
2015-02-03 00:46:04 -08:00
Davies Liu 0561c45449 [SPARK-5154] [PySpark] [Streaming] Kafka streaming support in Python
This PR brings the Python API for Spark Streaming Kafka data source.

```
    class KafkaUtils(__builtin__.object)
     |  Static methods defined here:
     |
     |  createStream(ssc, zkQuorum, groupId, topics, storageLevel=StorageLevel(True, True, False, False,
2), keyDecoder=<function utf8_decoder>, valueDecoder=<function utf8_decoder>)
     |      Create an input stream that pulls messages from a Kafka Broker.
     |
     |      :param ssc:  StreamingContext object
     |      :param zkQuorum:  Zookeeper quorum (hostname:port,hostname:port,..).
     |      :param groupId:  The group id for this consumer.
     |      :param topics:  Dict of (topic_name -> numPartitions) to consume.
     |                      Each partition is consumed in its own thread.
     |      :param storageLevel:  RDD storage level.
     |      :param keyDecoder:  A function used to decode key
     |      :param valueDecoder:  A function used to decode value
     |      :return: A DStream object
```
run the example:

```
bin/spark-submit --driver-class-path external/kafka-assembly/target/scala-*/spark-streaming-kafka-assembly-*.jar examples/src/main/python/streaming/kafka_wordcount.py localhost:2181 test
```

Author: Davies Liu <davies@databricks.com>
Author: Tathagata Das <tdas@databricks.com>

Closes #3715 from davies/kafka and squashes the following commits:

d93bfe0 [Davies Liu] Update make-distribution.sh
4280d04 [Davies Liu] address comments
e6d0427 [Davies Liu] Merge branch 'master' of github.com:apache/spark into kafka
f257071 [Davies Liu] add tests for null in RDD
23b039a [Davies Liu] address comments
9af51c4 [Davies Liu] Merge branch 'kafka' of github.com:davies/spark into kafka
a74da87 [Davies Liu] address comments
dc1eed0 [Davies Liu] Update kafka_wordcount.py
31e2317 [Davies Liu] Update kafka_wordcount.py
370ba61 [Davies Liu] Update kafka.py
97386b3 [Davies Liu] address comment
2c567a5 [Davies Liu] update logging and comment
33730d1 [Davies Liu] Merge branch 'master' of github.com:apache/spark into kafka
adeeb38 [Davies Liu] Merge pull request #3 from tdas/kafka-python-api
aea8953 [Tathagata Das] Kafka-assembly for Python API
eea16a7 [Davies Liu] refactor
f6ce899 [Davies Liu] add example and fix bugs
98c8d17 [Davies Liu] fix python style
5697a01 [Davies Liu] bypass decoder in scala
048dbe6 [Davies Liu] fix python style
75d485e [Davies Liu] add mqtt
07923c4 [Davies Liu] support kafka in Python
2015-02-02 19:16:27 -08:00
Jacek Lewandowski cfea30037f Spark 3883: SSL support for HttpServer and Akka
SPARK-3883: SSL support for Akka connections and Jetty based file servers.

This story introduced the following changes:
- Introduced SSLOptions object which holds the SSL configuration and can build the appropriate configuration for Akka or Jetty. SSLOptions can be created by parsing SparkConf entries at a specified namespace.
- SSLOptions is created and kept by SecurityManager
- All Akka actor address creation snippets based on interpolated strings were replaced by a dedicated methods from AkkaUtils. Those methods select the proper Akka protocol - whether akka.tcp or akka.ssl.tcp
- Added tests cases for AkkaUtils, FileServer, SSLOptions and SecurityManager
- Added a way to use node local SSL configuration by executors and driver in standalone mode. It can be done by specifying spark.ssl.useNodeLocalConf in SparkConf.
- Made CoarseGrainedExecutorBackend not overwrite the settings which are executor startup configuration - they are passed anyway from Worker

Refer to https://github.com/apache/spark/pull/3571 for discussion and details

Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>
Author: Jacek Lewandowski <jacek.lewandowski@datastax.com>

Closes #3571 from jacek-lewandowski/SPARK-3883-master and squashes the following commits:

9ef4ed1 [Jacek Lewandowski] Merge pull request #2 from jacek-lewandowski/SPARK-3883-docs2
fb31b49 [Jacek Lewandowski] SPARK-3883: Added SSL setup documentation
2532668 [Jacek Lewandowski] SPARK-3883: Refactored AkkaUtils.protocol method to not use Try
90a8762 [Jacek Lewandowski] SPARK-3883: Refactored methods to resolve Akka address and made it possible to easily configure multiple communication layers for SSL
72b2541 [Jacek Lewandowski] SPARK-3883: A reference to the fallback SSLOptions can be provided when constructing SSLOptions
93050f4 [Jacek Lewandowski] SPARK-3883: SSL support for HttpServer and Akka
2015-02-02 17:27:26 -08:00
Jacek Lewandowski 5a5526164b SPARK-5425: Use synchronised methods in system properties to create SparkConf
SPARK-5425: Fixed usages of system properties

This patch fixes few problems caused by the fact that the Scala wrapper over system properties is not thread-safe and is basically invalid because it doesn't take into account the default values which could have been set in the properties object. The problem is fixed by modifying `Utils.getSystemProperties` method so that it uses `stringPropertyNames` method of the `Properties` class, which is thread-safe (internally it creates a defensive copy in a synchronized method) and returns keys of the properties which were set explicitly and which are defined as defaults.
The other related problem, which is fixed here. was in `ResetSystemProperties` mix-in. It created a copy of the system properties in the wrong way.

This patch also introduces a test case for thread-safeness of SparkConf creation.

Refer to the discussion in https://github.com/apache/spark/pull/4220 for more details.

Author: Jacek Lewandowski <lewandowski.jacek@gmail.com>

Closes #4222 from jacek-lewandowski/SPARK-5425-1.3 and squashes the following commits:

03da61b [Jacek Lewandowski] SPARK-5425: Modified Utils.getSystemProperties to return a map of all system properties - explicit + defaults
8faf2ea [Jacek Lewandowski] SPARK-5425: Use SerializationUtils to save properties in ResetSystemProperties trait
71aa572 [Jacek Lewandowski] SPARK-5425: Use synchronised methods in system properties to create SparkConf
2015-02-02 14:07:19 -08:00
Sandy Ryza b2047b55c5 SPARK-4585. Spark dynamic executor allocation should use minExecutors as...
... initial number

Author: Sandy Ryza <sandy@cloudera.com>

Closes #4051 from sryza/sandy-spark-4585 and squashes the following commits:

d1dd039 [Sandy Ryza] Add spark.dynamicAllocation.initialNumExecutors and make min and max not required
b7c59dc [Sandy Ryza] SPARK-4585. Spark dynamic executor allocation should use minExecutors as initial number
2015-02-02 12:27:08 -08:00
Reynold Xin 740a56862b [SPARK-5307] SerializationDebugger
This patch adds a SerializationDebugger that is used to add serialization path to a NotSerializableException. When a NotSerializableException is encountered, the debugger visits the object graph to find the path towards the object that cannot be serialized, and constructs information to help user to find the object.

The patch uses the internals of JVM serialization (in particular, heavy usage of ObjectStreamClass). Compared with an earlier attempt, this one provides extra information including field names, array offsets, writeExternal calls, etc.

An example serialization stack:
```
Serialization stack:
  - object not serializable (class: org.apache.spark.serializer.NotSerializable, value: org.apache.spark.serializer.NotSerializable2c43caa4)
  - element of array (index: 0)
  - array (class [Ljava.lang.Object;, size 1)
  - field (class: org.apache.spark.serializer.SerializableArray, name: arrayField, type: class [Ljava.lang.Object;)
  - object (class org.apache.spark.serializer.SerializableArray, org.apache.spark.serializer.SerializableArray193c5908)
  - writeExternal data
  - externalizable object (class org.apache.spark.serializer.ExternalizableClass, org.apache.spark.serializer.ExternalizableClass320bdadc)
```

Author: Reynold Xin <rxin@databricks.com>

Closes #4098 from rxin/SerializationDebugger and squashes the following commits:

553b3ff [Reynold Xin] Update SerializationDebuggerSuite.scala
572d0cb [Reynold Xin] Disable automatically when reflection fails.
b349b77 [Reynold Xin] [SPARK-5307] SerializationDebugger to help debug NotSerializableException - take 2
2015-01-30 22:34:10 -08:00
Sandy Ryza 254eaa4d35 SPARK-5393. Flood of util.RackResolver log messages after SPARK-1714
Previously I had tried to solve this with by adding a line in Spark's log4j-defaults.properties.

The issue with the message in log4j-defaults.properties was that the log4j.properties packaged inside Hadoop was getting picked up instead. While it would be ideal to fix that as well, we still want to quiet this in situations where a user supplies their own custom log4j properties.

Author: Sandy Ryza <sandy@cloudera.com>

Closes #4192 from sryza/sandy-spark-5393 and squashes the following commits:

4d5dedc [Sandy Ryza] Only set log level if unset
46e07c5 [Sandy Ryza] SPARK-5393. Flood of util.RackResolver log messages after SPARK-1714
2015-01-30 11:31:54 -06:00
Xiangrui Meng 4ee79c71af [SPARK-5430] move treeReduce and treeAggregate from mllib to core
We have seen many use cases of `treeAggregate`/`treeReduce` outside the ML domain. Maybe it is time to move them to Core. pwendell

Author: Xiangrui Meng <meng@databricks.com>

Closes #4228 from mengxr/SPARK-5430 and squashes the following commits:

20ad40d [Xiangrui Meng] exclude tree* from mima
e89a43e [Xiangrui Meng] fix compile and update java doc
3ae1a4b [Xiangrui Meng] add treeReduce/treeAggregate to Python
6f948c5 [Xiangrui Meng] add treeReduce/treeAggregate to JavaRDDLike
d600b6c [Xiangrui Meng] move treeReduce and treeAggregate to core
2015-01-28 17:26:03 -08:00
Michael Nazario e023112d33 [SPARK-5441][pyspark] Make SerDeUtil PairRDD to Python conversions more robust
SerDeUtil.pairRDDToPython and SerDeUtil.pythonToPairRDD now both support empty RDDs by checking the result of take(1) instead of calling first which throws an exception.

Author: Michael Nazario <mnazario@palantir.com>

Closes #4236 from mnazario/feature/empty-first and squashes the following commits:

a531c0c [Michael Nazario] Added regression tests for SPARK-5441
e3b2fb6 [Michael Nazario] Added acceptance of the empty case
2015-01-28 13:58:46 -08:00
Andrew Or 84b6ecdef6 [SPARK-5437] Fix DriverSuite and SparkSubmitSuite timeout issues
In DriverSuite, we currently set a timeout of 60 seconds. If after this time the process has not terminated, we leak the process because we never destroy it.

In SparkSubmitSuite, we currently do not have a timeout so the test can hang indefinitely.

Author: Andrew Or <andrew@databricks.com>

Closes #4230 from andrewor14/fix-driver-suite and squashes the following commits:

f5c80fd [Andrew Or] Fix timeout behaviors in both suites
8092c36 [Andrew Or] Stop SparkContext after every individual test
2015-01-28 12:53:22 -08:00
Kousuke Saruta 0b35fcd7f0 [SPARK-5291][CORE] Add timestamp and reason why an executor is removed to SparkListenerExecutorAdded and SparkListenerExecutorRemoved
Recently `SparkListenerExecutorAdded` and `SparkListenerExecutorRemoved` are added.
I think it's useful if they have timestamp and the reason why an executor is removed.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4082 from sarutak/SPARK-5291 and squashes the following commits:

a026ff2 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5291
979dfe1 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5291
cf9f9080 [Kousuke Saruta] Fixed test case
1f2a89b [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5291
243f2a60 [Kousuke Saruta] Modified MesosSchedulerBackendSuite
a527c35 [Kousuke Saruta] Added timestamp to SparkListenerExecutorAdded
2015-01-28 11:02:51 -08:00
Sandy Ryza b1b35ca2e4 SPARK-5199. FS read metrics should support CombineFileSplits and track bytes from all FSs
...mbineFileSplits

Author: Sandy Ryza <sandy@cloudera.com>

Closes #4050 from sryza/sandy-spark-5199 and squashes the following commits:

864514b [Sandy Ryza] Add tests and fix bug
0d504f1 [Sandy Ryza] Prettify
915c7e6 [Sandy Ryza] Get metrics from all filesystems
cdbc3e8 [Sandy Ryza] SPARK-5199. Input metrics should show up for InputFormats that return CombineFileSplits
2015-01-27 15:42:55 -08:00
Sean Owen 0497ea51ac SPARK-960 [CORE] [TEST] JobCancellationSuite "two jobs sharing the same stage" is broken
This reenables and fixes this test, after addressing two issues:

- The Semaphore that was intended to be shared locally was being serialized and copied; it's now a static member in the companion object as in other tests
- Later changes to Spark means that cancelling the first task will not cancel the shared stage and therefore the second task should succeed

Author: Sean Owen <sowen@cloudera.com>

Closes #4180 from srowen/SPARK-960 and squashes the following commits:

43da66f [Sean Owen] Fix 'two jobs sharing the same stage' test and reenable it: truly share a Semaphore locally as intended, and update expectation of failure in non-cancelled task
2015-01-26 14:32:27 -08:00
Davies Liu 142093179a [SPARK-5355] use j.u.c.ConcurrentHashMap instead of TrieMap
j.u.c.ConcurrentHashMap is more battle tested.

cc rxin JoshRosen pwendell

Author: Davies Liu <davies@databricks.com>

Closes #4208 from davies/safe-conf and squashes the following commits:

c2182dc [Davies Liu] address comments, fix tests
3a1d821 [Davies Liu] fix test
da14ced [Davies Liu] Merge branch 'master' of github.com:apache/spark into safe-conf
ae4d305 [Davies Liu] change to j.u.c.ConcurrentMap
f8fa1cf [Davies Liu] change to TrieMap
a1d769a [Davies Liu] make SparkConf thread-safe
2015-01-26 12:51:32 -08:00
Kousuke Saruta 8f5c827b01 [SPARK-5344][WebUI] HistoryServer cannot recognize that inprogress file was renamed to completed file
`FsHistoryProvider` tries to update application status but if `checkForLogs` is called before `.inprogress` file is renamed to completed file, the file is not recognized as completed.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4132 from sarutak/SPARK-5344 and squashes the following commits:

9658008 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5344
d2c72b6 [Kousuke Saruta] Fixed update issue of FsHistoryProvider
2015-01-25 15:34:20 -08:00
zsxwing 0d1e67ee9b [SPARK-5214][Test] Add a test to demonstrate EventLoop can be stopped in the event thread
Author: zsxwing <zsxwing@gmail.com>

Closes #4174 from zsxwing/SPARK-5214-unittest and squashes the following commits:

443e564 [zsxwing] Change the check interval to 5ms
7aaa2d7 [zsxwing] Add a test to demonstrate EventLoop can be stopped in the event thread
2015-01-24 11:00:35 -08:00
Josh Rosen cef1f092a6 [SPARK-5063] More helpful error messages for several invalid operations
This patch adds more helpful error messages for invalid programs that define nested RDDs, broadcast RDDs, perform actions inside of transformations (e.g. calling `count()` from inside of `map()`), and call certain methods on stopped SparkContexts.  Currently, these invalid programs lead to confusing NullPointerExceptions at runtime and have been a major source of questions on the mailing list and StackOverflow.

In a few cases, I chose to log warnings instead of throwing exceptions in order to avoid any chance that this patch breaks programs that worked "by accident" in earlier Spark releases (e.g. programs that define nested RDDs but never run any jobs with them).

In SparkContext, the new `assertNotStopped()` method is used to check whether methods are being invoked on a stopped SparkContext.  In some cases, user programs will not crash in spite of calling methods on stopped SparkContexts, so I've only added `assertNotStopped()` calls to methods that always throw exceptions when called on stopped contexts (e.g. by dereferencing a null `dagScheduler` pointer).

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3884 from JoshRosen/SPARK-5063 and squashes the following commits:

a38774b [Josh Rosen] Fix spelling typo
a943e00 [Josh Rosen] Convert two exceptions into warnings in order to avoid breaking user programs in some edge-cases.
2d0d7f7 [Josh Rosen] Fix test to reflect 1.2.1 compatibility
3f0ea0c [Josh Rosen] Revert two unintentional formatting changes
8e5da69 [Josh Rosen] Remove assertNotStopped() calls for methods that were sometimes safe to call on stopped SC's in Spark 1.2
8cff41a [Josh Rosen] IllegalStateException fix
6ef68d0 [Josh Rosen] Fix Python line length issues.
9f6a0b8 [Josh Rosen] Add improved error messages to PySpark.
13afd0f [Josh Rosen] SparkException -> IllegalStateException
8d404f3 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-5063
b39e041 [Josh Rosen] Fix BroadcastSuite test which broadcasted an RDD
99cc09f [Josh Rosen] Guard against calling methods on stopped SparkContexts.
34833e8 [Josh Rosen] Add more descriptive error message.
57cc8a1 [Josh Rosen] Add error message when directly broadcasting RDD.
15b2e6b [Josh Rosen] [SPARK-5063] Useful error messages for nested RDDs and actions inside of transformations
2015-01-23 17:53:15 -08:00
Jongyoul Lee 9d9294aebf [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException
- Rewind ByteBuffer before making ByteString

(This fixes a bug introduced in #3849 / SPARK-4014)

Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #4119 from jongyoul/SPARK-5333 and squashes the following commits:

c6693a8 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - changed logDebug location
4141f58 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Added license information
2190606 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Adjusted imported libraries
b7f5517 [Jongyoul Lee] [SPARK-5333][Mesos] MesosTaskLaunchData occurs BufferUnderflowException - Rewind ByteBuffer before making ByteString
2015-01-20 10:18:10 -08:00
Sean Owen 306ff187af SPARK-5270 [CORE] Provide isEmpty() function in RDD API
Pretty minor, but submitted for consideration -- this would at least help people make this check in the most efficient way I know.

Author: Sean Owen <sowen@cloudera.com>

Closes #4074 from srowen/SPARK-5270 and squashes the following commits:

66885b8 [Sean Owen] Add note that JavaRDDLike should not be implemented by user code
2e9b490 [Sean Owen] More tests, and Mima-exclude the new isEmpty method in JavaRDDLike
28395ff [Sean Owen] Add isEmpty to Java, Python
7dd04b7 [Sean Owen] Add efficient RDD.isEmpty()
2015-01-19 22:50:45 -08:00
zsxwing e69fb8c75a [SPARK-5214][Core] Add EventLoop and change DAGScheduler to an EventLoop
This PR adds a simple `EventLoop` and use it to replace Actor in DAGScheduler. `EventLoop` is a general class to support that posting events in multiple threads and handling events in a single event thread.

Author: zsxwing <zsxwing@gmail.com>

Closes #4016 from zsxwing/event-loop and squashes the following commits:

aefa1ce [zsxwing] Add protected to on*** methods
5cfac83 [zsxwing] Remove null check of eventProcessLoop
dba35b2 [zsxwing] Add a test that onReceive swallows InterruptException
460f7b3 [zsxwing] Use volatile instead of Atomic things in unit tests
227bf33 [zsxwing] Add a stop flag and some tests
37f79c6 [zsxwing] Fix docs
55fb6f6 [zsxwing] Add private[spark] to EventLoop
1f73eac [zsxwing] Fix the import order
3b2e59c [zsxwing] Add EventLoop and change DAGScheduler to an EventLoop
2015-01-19 18:15:51 -08:00
Jongyoul Lee 4a4f9ccba2 [SPARK-5088] Use spark-class for running executors directly
Author: Jongyoul Lee <jongyoul@gmail.com>

Closes #3897 from jongyoul/SPARK-5088 and squashes the following commits:

8232aa8 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Added a listenerBus for fixing test cases
932289f [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Rebased from master
613cb47 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Fixed code if spark.executor.uri doesn't have any value - Added test cases
ff57bda [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Adjusted orders of import
97e4bd4 [Jongyoul Lee] [SPARK-5088] Use spark-class for running executors directly - Changed command for using spark-class directly - Delete sbin/spark-executor and moved some codes into spark-class' case statement
2015-01-19 02:01:56 -08:00
Ilya Ganelin 3453d578ad [SPARK-3288] All fields in TaskMetrics should be private and use getters/setters
I've updated the fields and all usages of these fields in the Spark code. I've verified that this did not break anything on my local repo.

Author: Ilya Ganelin <ilya.ganelin@capitalone.com>

Closes #4020 from ilganeli/SPARK-3288 and squashes the following commits:

39f3810 [Ilya Ganelin] resolved merge issues
e446287 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
b8c05cb [Ilya Ganelin] Missed making a variable private
6444391 [Ilya Ganelin] Made inc/dec functions private[spark]
1149e78 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
26b312b [Ilya Ganelin] Debugging tests
17146c2 [Ilya Ganelin] Merge remote-tracking branch 'upstream/master' into SPARK-3288
5525c20 [Ilya Ganelin] Completed refactoring to make vars in TaskMetrics class private
c64da4f [Ilya Ganelin] Partially updated task metrics to make some vars private
2015-01-19 01:32:36 -08:00
Kousuke Saruta e8422c521b [SPARK-5231][WebUI] History Server shows wrong job submission time.
History Server doesn't show collect job submission time.
It's because `JobProgressListener` updates job submission time every time `onJobStart` method is invoked from `ReplayListenerBus`.

Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>

Closes #4029 from sarutak/SPARK-5231 and squashes the following commits:

0af9e22 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5231
da8bd14 [Kousuke Saruta] Made submissionTime in SparkListenerJobStartas and completionTime in SparkListenerJobEnd as regular Long
0412a6a [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5231
26b9b99 [Kousuke Saruta] Fixed the test cases
2d47bd3 [Kousuke Saruta] Fixed to record job submission time and completion time collectly
2015-01-16 10:05:11 -08:00
Ye Xianjin e200ac8e53 [SPARK-5201][CORE] deal with int overflow in the ParallelCollectionRDD.slice method
There is an int overflow in the ParallelCollectionRDD.slice method. That's originally reported by SaintBacchus.
```
sc.makeRDD(1 to (Int.MaxValue)).count       // result = 0
sc.makeRDD(1 to (Int.MaxValue - 1)).count   // result = 2147483646 = Int.MaxValue - 1
sc.makeRDD(1 until (Int.MaxValue)).count    // result = 2147483646 = Int.MaxValue - 1
```
see https://github.com/apache/spark/pull/2874 for more details.
This pr try to fix the overflow. However, There's another issue I don't address.
```
val largeRange = Int.MinValue to Int.MaxValue
largeRange.length // throws java.lang.IllegalArgumentException: -2147483648 to 2147483647 by 1: seqs cannot contain more than Int.MaxValue elements.
```

So, the range we feed to sc.makeRDD cannot contain more than Int.MaxValue elements. This is the limitation of Scala. However I think  we may want to support that kind of range. But the fix is beyond this pr.

srowen andrewor14 would you mind take a look at this pr?

Author: Ye Xianjin <advancedxy@gmail.com>

Closes #4002 from advancedxy/SPARk-5201 and squashes the following commits:

96265a1 [Ye Xianjin] Update slice method comment and some responding docs.
e143d7a [Ye Xianjin] Update inclusive range check for splitting inclusive range.
b3f5577 [Ye Xianjin] We can include the last element in the last slice in general for inclusive range, hence eliminate the need to check Int.MaxValue or Int.MinValue.
7d39b9e [Ye Xianjin] Convert the two cases pattern matching to one case.
651c959 [Ye Xianjin] rename sign to needsInclusiveRange. add some comments
196f8a8 [Ye Xianjin] Add test cases for ranges end with Int.MaxValue or Int.MinValue
e66e60a [Ye Xianjin] Deal with inclusive and exclusive ranges in one case. If the range is inclusive and the end of the range is (Int.MaxValue or Int.MinValue), we should use inclusive range instead of exclusive
2015-01-16 09:21:33 -08:00
Kostas Sakellis a79a9f923c [SPARK-4092] [CORE] Fix InputMetrics for coalesce'd Rdds
When calculating the input metrics there was an assumption that one task only reads from one block - this is not true for some operations including coalesce. This patch simply increments the task's input metrics if previous ones existed of the same read method.

A limitation to this patch is that if a task reads from two different blocks of different read methods, one will override the other.

Author: Kostas Sakellis <kostas@cloudera.com>

Closes #3120 from ksakellis/kostas-spark-4092 and squashes the following commits:

54e6658 [Kostas Sakellis] Drops metrics if conflicting read methods exist
f0e0cc5 [Kostas Sakellis] Add bytesReadCallback to InputMetrics
a2a36d4 [Kostas Sakellis] CR feedback
5a0c770 [Kostas Sakellis] [SPARK-4092] [CORE] Fix InputMetrics for coalesce'd Rdds
2015-01-15 18:48:39 -08:00
Kostas Sakellis 96c2c714f4 [SPARK-4857] [CORE] Adds Executor membership events to SparkListener
Adds onExecutorAdded and onExecutorRemoved events to the SparkListener. This will allow a client to get notified when an executor has been added/removed and provide additional information such as how many vcores it is consuming.

In addition, this commit adds a SparkListenerAdapter to the Java API that provides default implementations to the SparkListener. This is to get around the fact that default implementations for traits don't work in Java. Having Java clients extend SparkListenerAdapter moving forward will prevent breakage in java when we add new events to SparkListener.

Author: Kostas Sakellis <kostas@cloudera.com>

Closes #3711 from ksakellis/kostas-spark-4857 and squashes the following commits:

946d2c5 [Kostas Sakellis] Added executorAdded/Removed events to MesosSchedulerBackend
b1d054a [Kostas Sakellis] Remove executorInfo from ExecutorRemoved event
1727b38 [Kostas Sakellis] Renamed ExecutorDetails back to ExecutorInfo and other CR feedback
14fe78d [Kostas Sakellis] Added executor added/removed events to json protocol
93d087b [Kostas Sakellis] [SPARK-4857] [CORE] Adds Executor membership events to SparkListener
2015-01-15 17:53:42 -08:00
Josh Rosen 259936be71 [SPARK-4014] Add TaskContext.attemptNumber and deprecate TaskContext.attemptId
`TaskContext.attemptId` is misleadingly-named, since it currently returns a taskId, which uniquely identifies a particular task attempt within a particular SparkContext, instead of an attempt number, which conveys how many times a task has been attempted.

This patch deprecates `TaskContext.attemptId` and add `TaskContext.taskId` and `TaskContext.attemptNumber` fields.  Prior to this change, it was impossible to determine whether a task was being re-attempted (or was a speculative copy), which made it difficult to write unit tests for tasks that fail on early attempts or speculative tasks that complete faster than original tasks.

Earlier versions of the TaskContext docs suggest that `attemptId` behaves like `attemptNumber`, so there's an argument to be made in favor of changing this method's implementation.  Since we've decided against making that change in maintenance branches, I think it's simpler to add better-named methods and retain the old behavior for `attemptId`; if `attemptId` behaved differently in different branches, then this would cause confusing build-breaks when backporting regression tests that rely on the new `attemptId` behavior.

Most of this patch is fairly straightforward, but there is a bit of trickiness related to Mesos tasks: since there's no field in MesosTaskInfo to encode the attemptId, I packed it into the `data` field alongside the task binary.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3849 from JoshRosen/SPARK-4014 and squashes the following commits:

89d03e0 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
5cfff05 [Josh Rosen] Introduce wrapper for serializing Mesos task launch data.
38574d4 [Josh Rosen] attemptId -> taskAttemptId in PairRDDFunctions
a180b88 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
1d43aa6 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
eee6a45 [Josh Rosen] Merge remote-tracking branch 'origin/master' into SPARK-4014
0b10526 [Josh Rosen] Use putInt instead of putLong (silly mistake)
8c387ce [Josh Rosen] Use local with maxRetries instead of local-cluster.
cbe4d76 [Josh Rosen] Preserve attemptId behavior and deprecate it:
b2dffa3 [Josh Rosen] Address some of Reynold's minor comments
9d8d4d1 [Josh Rosen] Doc typo
1e7a933 [Josh Rosen] [SPARK-4014] Change TaskContext.attemptId to return attempt number instead of task ID.
fd515a5 [Josh Rosen] Add failing test for SPARK-4014
2015-01-14 11:45:40 -08:00
zsxwing 6942b974ad [SPARK-4951][Core] Fix the issue that a busy executor may be killed
A few changes to fix this issue:

1. Handle the case that receiving `SparkListenerTaskStart` before `SparkListenerBlockManagerAdded`.
2. Don't add `executorId` to `removeTimes` when the executor is busy.
3. Use `HashMap.retain` to safely traverse the HashMap and remove items.
4. Use the same lock in ExecutorAllocationManager and ExecutorAllocationListener to fix the race condition in `totalPendingTasks`.
5. Move the blocking codes out of the message processing code in YarnSchedulerActor.

Author: zsxwing <zsxwing@gmail.com>

Closes #3783 from zsxwing/SPARK-4951 and squashes the following commits:

d51fa0d [zsxwing] Add comments
2e365ce [zsxwing] Remove expired executors from 'removeTimes' and add idle executors back when a new executor joins
49f61a9 [zsxwing] Eliminate duplicate executor registered warnings
d4c4e9a [zsxwing] Minor fixes for the code style
05f6238 [zsxwing] Move the blocking codes out of the message processing code
105ba3a [zsxwing] Fix the race condition in totalPendingTasks
d5c615d [zsxwing] Fix the issue that a busy executor may be killed
2015-01-11 16:23:28 -08:00
mcheah e0f28e010c [SPARK-4737] Task set manager properly handles serialization errors
Dealing with [SPARK-4737], the handling of serialization errors should not be the DAGScheduler's responsibility. The task set manager now catches the error and aborts the stage.

If the TaskSetManager throws a TaskNotSerializableException, the TaskSchedulerImpl will return an empty list of task descriptions, because no tasks were started. The scheduler should abort the stage gracefully.

Note that I'm not too familiar with this part of the codebase and its place in the overall architecture of the Spark stack. If implementing it this way will have any averse side effects please voice that loudly.

Author: mcheah <mcheah@palantir.com>

Closes #3638 from mccheah/task-set-manager-properly-handle-ser-err and squashes the following commits:

1545984 [mcheah] Some more style fixes from Andrew Or.
5267929 [mcheah] Fixing style suggestions from Andrew Or.
dfa145b [mcheah] Fixing style from Josh Rosen's feedback
b2a430d [mcheah] Not returning empty seq when a task set cannot be serialized.
94844d7 [mcheah] Fixing compilation error, one brace too many
5f486f4 [mcheah] Adding license header for fake task class
bf5e706 [mcheah] Fixing indentation.
097e7a2 [mcheah] [SPARK-4737] Catching task serialization exception in TaskSetManager
2015-01-09 14:16:20 -08:00
Kay Ousterhout b6aa557300 [SPARK-1143] Separate pool tests into their own suite.
The current TaskSchedulerImplSuite includes some tests that are
actually for the TaskSchedulerImpl, but the remainder of the tests avoid using
the TaskSchedulerImpl entirely, and actually test the pool and scheduling
algorithm mechanisms. This commit separates the pool/scheduling algorithm
tests into their own suite, and also simplifies those tests.

The pull request replaces #339.

Author: Kay Ousterhout <kayousterhout@gmail.com>

Closes #3967 from kayousterhout/SPARK-1143 and squashes the following commits:

8a898c4 [Kay Ousterhout] [SPARK-1143] Separate pool tests into their own suite.
2015-01-09 09:47:06 -08:00
Marcelo Vanzin 48cecf673c [SPARK-4048] Enhance and extend hadoop-provided profile.
This change does a few things to make the hadoop-provided profile more useful:

- Create new profiles for other libraries / services that might be provided by the infrastructure
- Simplify and fix the poms so that the profiles are only activated while building assemblies.
- Fix tests so that they're able to run when the profiles are activated
- Add a new env variable to be used by distributions that use these profiles to provide the runtime
  classpath for Spark jobs and daemons.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #2982 from vanzin/SPARK-4048 and squashes the following commits:

82eb688 [Marcelo Vanzin] Add a comment.
eb228c0 [Marcelo Vanzin] Fix borked merge.
4e38f4e [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
9ef79a3 [Marcelo Vanzin] Alternative way to propagate test classpath to child processes.
371ebee [Marcelo Vanzin] Review feedback.
52f366d [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
83099fc [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
7377e7b [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
322f882 [Marcelo Vanzin] Fix merge fail.
f24e9e7 [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
8b00b6a [Marcelo Vanzin] Merge branch 'master' into SPARK-4048
9640503 [Marcelo Vanzin] Cleanup child process log message.
115fde5 [Marcelo Vanzin] Simplify a comment (and make it consistent with another pom).
e3ab2da [Marcelo Vanzin] Fix hive-thriftserver profile.
7820d58 [Marcelo Vanzin] Fix CliSuite with provided profiles.
1be73d4 [Marcelo Vanzin] Restore flume-provided profile.
d1399ed [Marcelo Vanzin] Restore jetty dependency.
82a54b9 [Marcelo Vanzin] Remove unused profile.
5c54a25 [Marcelo Vanzin] Fix HiveThriftServer2Suite with *-provided profiles.
1fc4d0b [Marcelo Vanzin] Update dependencies for hive-thriftserver.
f7b3bbe [Marcelo Vanzin] Add snappy to hadoop-provided list.
9e4e001 [Marcelo Vanzin] Remove duplicate hive profile.
d928d62 [Marcelo Vanzin] Redirect child stderr to parent's log.
4d67469 [Marcelo Vanzin] Propagate SPARK_DIST_CLASSPATH on Yarn.
417d90e [Marcelo Vanzin] Introduce "SPARK_DIST_CLASSPATH".
2f95f0d [Marcelo Vanzin] Propagate classpath to child processes during testing.
1adf91c [Marcelo Vanzin] Re-enable maven-install-plugin for a few projects.
284dda6 [Marcelo Vanzin] Rework the "hadoop-provided" profile, add new ones.
2015-01-08 17:15:13 -08:00
zsxwing 2b729d2250 [SPARK-5126][Core] Verify Spark urls before creating Actors so that invalid urls can crash the process.
Because `actorSelection` will return `deadLetters` for an invalid path,  Worker keeps quiet for an invalid master url. It's better to log an error so that people can find such problem quickly.

This PR will check the url before sending to `actorSelection`, throw and log a SparkException for an invalid url.

Author: zsxwing <zsxwing@gmail.com>

Closes #3927 from zsxwing/SPARK-5126 and squashes the following commits:

9d429ee [zsxwing] Create a utility method in Utils to parse Spark url; verify urls before creating Actors so that invalid urls can crash the process.
8286e51 [zsxwing] Check the url before sending to Akka and log the error if the url is invalid
2015-01-07 23:01:30 -08:00
Masayoshi TSUZUKI 6e74edeca3 [SPARK-2458] Make failed application log visible on History Server
Enabled HistoryServer to show incomplete applications.
We can see the log for incomplete applications by clicking the bottom link.

Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp>

Closes #3467 from tsudukim/feature/SPARK-2458-2 and squashes the following commits:

76205d2 [Masayoshi TSUZUKI] Fixed and added test code.
29a04a9 [Masayoshi TSUZUKI] Merge branch 'master' of github.com:tsudukim/spark into feature/SPARK-2458-2
f9ef854 [Masayoshi TSUZUKI] Added space between "if" and "(". Fixed "Incomplete" as capitalized in the web UI. Modified double negative variable name.
9b465b0 [Masayoshi TSUZUKI] Modified typo and better implementation.
3ed8a41 [Masayoshi TSUZUKI] Modified too long lines.
08ea14d [Masayoshi TSUZUKI] [SPARK-2458] Make failed application log visible on History Server
2015-01-07 07:32:53 -08:00
Sean Owen 4cba6eb420 SPARK-4159 [CORE] Maven build doesn't run JUnit test suites
This PR:

- Reenables `surefire`, and copies config from `scalatest` (which is itself an old fork of `surefire`, so similar)
- Tells `surefire` to test only Java tests
- Enables `surefire` and `scalatest` for all children, and in turn eliminates some duplication.

For me this causes the Scala and Java tests to be run once each, it seems, as desired. It doesn't affect the SBT build but works for Maven. I still need to verify that all of the Scala tests and Java tests are being run.

Author: Sean Owen <sowen@cloudera.com>

Closes #3651 from srowen/SPARK-4159 and squashes the following commits:

2e8a0af [Sean Owen] Remove specialized SPARK_HOME setting for REPL, YARN tests as it appears to be obsolete
12e4558 [Sean Owen] Append to unit-test.log instead of overwriting, so that both surefire and scalatest output is preserved. Also standardize/correct comments a bit.
e6f8601 [Sean Owen] Reenable Java tests by reenabling surefire with config cloned from scalatest; centralize test config in the parent
2015-01-06 12:02:08 -08:00
zsxwing 5c506cecb9 [SPARK-5074][Core] Fix a non-deterministic test failure
Add `assert(sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS))` to make sure `sparkListener` receive the message.

Author: zsxwing <zsxwing@gmail.com>

Closes #3889 from zsxwing/SPARK-5074 and squashes the following commits:

e61c198 [zsxwing] Fix a non-deterministic test failure
2015-01-04 21:18:33 -08:00
zsxwing 27e7f5a723 [SPARK-5083][Core] Fix a flaky test in TaskResultGetterSuite
Because `sparkEnv.blockManager.master.removeBlock` is asynchronous, we need to make sure the block has already been removed before calling `super.enqueueSuccessfulTask`.

Author: zsxwing <zsxwing@gmail.com>

Closes #3894 from zsxwing/SPARK-5083 and squashes the following commits:

d97c03d [zsxwing] Fix a flaky test in TaskResultGetterSuite
2015-01-04 21:09:21 -08:00
zsxwing 6c726a3fbd [SPARK-5069][Core] Fix the race condition of TaskSchedulerImpl.dagScheduler
It's not necessary to set `TaskSchedulerImpl.dagScheduler` in preStart. It's safe to set it after `initializeEventProcessActor()`.

Author: zsxwing <zsxwing@gmail.com>

Closes #3887 from zsxwing/SPARK-5069 and squashes the following commits:

d95894f [zsxwing] Fix the race condition of TaskSchedulerImpl.dagScheduler
2015-01-04 21:06:04 -08:00
zsxwing 72396522bc [SPARK-5067][Core] Use '===' to compare well-defined case class
A simple fix would be adding `assert(e1.appId == e2.appId)` for `SparkListenerApplicationStart`. But actually we can use `===` for well-defined case class directly. Therefore, instead of fixing this issue, I use `===` to compare those well-defined case classes (all fields have implemented a correct `equals` method, such as primitive types)

Author: zsxwing <zsxwing@gmail.com>

Closes #3886 from zsxwing/SPARK-5067 and squashes the following commits:

0a51711 [zsxwing] Use '===' to compare well-defined case class
2015-01-04 21:03:17 -08:00
Josh Rosen 012839807c [HOTFIX] Bind web UI to ephemeral port in DriverSuite
The job launched by DriverSuite should bind the web UI to an ephemeral port, since it looks like port contention in this test has caused a large number of Jenkins failures when many builds are started simultaneously.  Our tests already disable the web UI, but this doesn't affect subprocesses launched by our tests.  In this case, I've opted to bind to an ephemeral port instead of disabling the UI because disabling features in this test may mask its ability to catch certain bugs.

See also: e24d3a9

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3873 from JoshRosen/driversuite-webui-port and squashes the following commits:

48cd05c [Josh Rosen] [HOTFIX] Bind web UI to ephemeral port in DriverSuite.
2015-01-01 15:03:54 -08:00
Josh Rosen e24d3a9a29 [HOTFIX] Disable Spark UI in SparkSubmitSuite tests
This should fix a major cause of build breaks when running many parallel tests.
2014-12-31 14:13:09 -08:00
Josh Rosen 352ed6bbe3 [SPARK-1010] Clean up uses of System.setProperty in unit tests
Several of our tests call System.setProperty (or test code which implicitly sets system properties) and don't always reset/clear the modified properties, which can create ordering dependencies between tests and cause hard-to-diagnose failures.

This patch removes most uses of System.setProperty from our tests, since in most cases we can use SparkConf to set these configurations (there are a few exceptions, including the tests of SparkConf itself).

For the cases where we continue to use System.setProperty, this patch introduces a `ResetSystemProperties` ScalaTest mixin class which snapshots the system properties before individual tests and to automatically restores them on test completion / failure.  See the block comment at the top of the ResetSystemProperties class for more details.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3739 from JoshRosen/cleanup-system-properties-in-tests and squashes the following commits:

0236d66 [Josh Rosen] Replace setProperty uses in two example programs / tools
3888fe3 [Josh Rosen] Remove setProperty use in LocalJavaStreamingContext
4f4031d [Josh Rosen] Add note on why SparkSubmitSuite needs ResetSystemProperties
4742a5b [Josh Rosen] Clarify ResetSystemProperties trait inheritance ordering.
0eaf0b6 [Josh Rosen] Remove setProperty call in TaskResultGetterSuite.
7a3d224 [Josh Rosen] Fix trait ordering
3fdb554 [Josh Rosen] Remove setProperty call in TaskSchedulerImplSuite
bee20df [Josh Rosen] Remove setProperty calls in SparkContextSchedulerCreationSuite
655587c [Josh Rosen] Remove setProperty calls in JobCancellationSuite
3f2f955 [Josh Rosen] Remove System.setProperty calls in DistributedSuite
cfe9cce [Josh Rosen] Remove use of system properties in SparkContextSuite
8783ab0 [Josh Rosen] Remove TestUtils.setSystemProperty, since it is subsumed by the ResetSystemProperties trait.
633a84a [Josh Rosen] Remove use of system properties in FileServerSuite
25bfce2 [Josh Rosen] Use ResetSystemProperties in UtilsSuite
1d1aa5a [Josh Rosen] Use ResetSystemProperties in SizeEstimatorSuite
dd9492b [Josh Rosen] Use ResetSystemProperties in AkkaUtilsSuite
b0daff2 [Josh Rosen] Use ResetSystemProperties in BlockManagerSuite
e9ded62 [Josh Rosen] Use ResetSystemProperties in TaskSchedulerImplSuite
5b3cb54 [Josh Rosen] Use ResetSystemProperties in SparkListenerSuite
0995c4b [Josh Rosen] Use ResetSystemProperties in SparkContextSchedulerCreationSuite
c83ded8 [Josh Rosen] Use ResetSystemProperties in SparkConfSuite
51aa870 [Josh Rosen] Use withSystemProperty in ShuffleSuite
60a63a1 [Josh Rosen] Use ResetSystemProperties in JobCancellationSuite
14a92e4 [Josh Rosen] Use withSystemProperty in FileServerSuite
628f46c [Josh Rosen] Use ResetSystemProperties in DistributedSuite
9e3e0dd [Josh Rosen] Add ResetSystemProperties test fixture mixin; use it in SparkSubmitSuite.
4dcea38 [Josh Rosen] Move withSystemProperty to TestUtils class.
2014-12-30 18:12:20 -08:00
Josh Rosen efa80a531e [SPARK-4882] Register PythonBroadcast with Kryo so that PySpark works with KryoSerializer
This PR fixes an issue where PySpark broadcast variables caused NullPointerExceptions if KryoSerializer was used.  The fix is to register PythonBroadcast with Kryo so that it's deserialized with a KryoJavaSerializer.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3831 from JoshRosen/SPARK-4882 and squashes the following commits:

0466c7a [Josh Rosen] Register PythonBroadcast with Kryo.
d5b409f [Josh Rosen] Enable registrationRequired, which would have caught this bug.
069d8a7 [Josh Rosen] Add failing test for SPARK-4882
2014-12-30 09:29:52 -08:00
Sean Owen 29fabb1b52 SPARK-4297 [BUILD] Build warning fixes omnibus
There are a number of warnings generated in a normal, successful build right now. They're mostly Java unchecked cast warnings, which can be suppressed. But there's a grab bag of other Scala language warnings and so on that can all be easily fixed. The forthcoming PR fixes about 90% of the build warnings I see now.

Author: Sean Owen <sowen@cloudera.com>

Closes #3157 from srowen/SPARK-4297 and squashes the following commits:

8c9e469 [Sean Owen] Suppress unchecked cast warnings, and several other build warning fixes
2014-12-24 13:32:51 -08:00
Marcelo Vanzin dd155369a0 [SPARK-4834] [standalone] Clean up application files after app finishes.
Commit 7aacb7bfa added support for sharing downloaded files among multiple
executors of the same app. That works great in Yarn, since the app's directory
is cleaned up after the app is done.

But Spark standalone mode didn't do that, so the lock/cache files created
by that change were left around and could eventually fill up the disk hosting
/tmp.

To solve that, create app-specific directories under the local dirs when
launching executors. Multiple executors launched by the same Worker will
use the same app directories, so they should be able to share the downloaded
files. When the application finishes, a new message is sent to all workers
telling them the application has finished; once that message has been received,
and all executors registered for the application shut down, then those
directories will be cleaned up by the Worker.

Note: Unit testing this is hard (if even possible), since local-cluster mode
doesn't seem to leave the Master/Worker daemons running long enough after
`sc.stop()` is called for the clean up protocol to take effect.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #3705 from vanzin/SPARK-4834 and squashes the following commits:

b430534 [Marcelo Vanzin] Remove seemingly unnecessary synchronization.
50eb4b9 [Marcelo Vanzin] Review feedback.
c0e5ea5 [Marcelo Vanzin] [SPARK-4834] [standalone] Clean up application files after app finishes.
2014-12-23 12:02:08 -08:00
Kostas Sakellis 7c0ed13d29 [SPARK-4079] [CORE] Consolidates Errors if a CompressionCodec is not available
This commit consolidates some of the exceptions thrown if compression codecs are not available. If a bad configuration string was passed in, a ClassNotFoundException was through. Also, if Snappy was not available, it would throw an InvocationTargetException when the codec was being used (not when it was being initialized). Now, an IllegalArgumentException is thrown when a codec is not available at creation time - either because the class does not exist or the codec itself is not available in the system. This will allow us to have a better message and fail faster.

Author: Kostas Sakellis <kostas@cloudera.com>

Closes #3119 from ksakellis/kostas-spark-4079 and squashes the following commits:

9709c7c [Kostas Sakellis] Removed unnecessary Logging class
63bfdd0 [Kostas Sakellis] Removed isAvailable to preserve binary compatibility
1d0ef2f [Kostas Sakellis] [SPARK-4079] [CORE] Added more information to exception
64f3d27 [Kostas Sakellis] [SPARK-4079] [CORE] Code review feedback
52dfa8f [Kostas Sakellis] [SPARK-4079] [CORE] Default to LZF if Snappy not available
2014-12-22 13:07:01 -08:00
Sean Owen c6a3c0d505 SPARK-4910 [CORE] build failed (use of FileStatus.isFile in Hadoop 1.x)
Fix small Hadoop 1 compile error from SPARK-2261. In Hadoop 1.x, all we have is FileStatus.isDir, so these "is file" assertions are changed to "is not a dir". This is how similar checks are done so far in the code base.

Author: Sean Owen <sowen@cloudera.com>

Closes #3754 from srowen/SPARK-4910 and squashes the following commits:

52c5e4e [Sean Owen] Fix small Hadoop 1 compile error from SPARK-2261
2014-12-21 13:16:57 -08:00