Commit graph

1861 commits

Author SHA1 Message Date
Tejas Patil ac38bdc756 [SPARK-15601][CORE] CircularBuffer's toString() to print only the contents written if buffer isn't full
## What changes were proposed in this pull request?

1. The class allocated 4x more space than needed because it used `Int` to store the `Byte` values

2. If the CircularBuffer isn't full, `toString()` currently prints garbage chars along with the written content because it tries to print the entire array allocated for the buffer. The fix is to keep track of whether the buffer has filled up and not print the tail of the buffer if it hasn't (suggestion by sameeragarwal over https://github.com/apache/spark/pull/12194#discussion_r64495331); see the sketch after this list

3. Simplified `toString()`
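
For reference, a minimal, hypothetical sketch of the approach (not the exact class in this patch):

```scala
import java.io.OutputStream
import java.nio.charset.StandardCharsets

// Sketch: store data in an Array[Byte], remember whether the buffer has wrapped, and only
// include the tail (the oldest data) in toString() once it actually has.
class SimpleCircularBuffer(sizeInBytes: Int = 10240) extends OutputStream {
  private val buffer = new Array[Byte](sizeInBytes)
  private var pos = 0
  private var wrapped = false   // becomes true once the buffer is full

  override def write(b: Int): Unit = {
    buffer(pos) = b.toByte
    pos = (pos + 1) % buffer.length
    if (pos == 0) wrapped = true
  }

  override def toString: String = {
    val newest = buffer.take(pos)
    val oldest = if (wrapped) buffer.drop(pos) else Array.emptyByteArray
    new String(oldest ++ newest, StandardCharsets.UTF_8)
  }
}
```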

## How was this patch tested?

Added new test case

Author: Tejas Patil <tejasp@fb.com>

Closes #13351 from tejasapatil/circular_buffer.
2016-05-31 19:52:22 -05:00
Devaraj K 5b21139dbf [SPARK-10530][CORE] Kill other task attempts when one task attempt belonging to the same task succeeds in speculation
## What changes were proposed in this pull request?

With this patch, TaskSetManager kills the other running attempts when any one attempt succeeds for the same task. Killed tasks are no longer counted as failed tasks; they are listed separately in the UI and show the task state as KILLED instead of FAILED.
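
A self-contained sketch of the idea (class and method names here are illustrative, not the actual TaskSetManager code):

```scala
import scala.collection.mutable

// Once one attempt of a task index succeeds, every other still-running attempt for the same
// index is killed instead of being left to finish or fail.
case class AttemptInfo(taskId: Long, executorId: String, var running: Boolean = true)

class SpeculativeAttempts(killTask: (Long, String) => Unit) {
  private val attemptsByIndex = mutable.Map[Int, mutable.Buffer[AttemptInfo]]()

  def recordAttempt(index: Int, attempt: AttemptInfo): Unit =
    attemptsByIndex.getOrElseUpdate(index, mutable.Buffer.empty) += attempt

  def onTaskSucceeded(index: Int, succeededTaskId: Long): Unit = {
    for (a <- attemptsByIndex.getOrElse(index, mutable.Buffer.empty)
         if a.running && a.taskId != succeededTaskId) {
      killTask(a.taskId, a.executorId)   // these attempts end up KILLED, not FAILED
      a.running = false
    }
  }
}
```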

## How was this patch tested?

core\src\test\scala\org\apache\spark\ui\jobs\JobProgressListenerSuite.scala
core\src\test\scala\org\apache\spark\util\JsonProtocolSuite.scala

I also verified this patch manually with spark.speculation enabled: when any attempt succeeds, the other running attempts for the same task are killed and other pending tasks are assigned in their place. When an attempt is killed, it is counted as a KILLED task, not a FAILED task. Please find the attached screenshots for reference.

![stage-tasks-table](https://cloud.githubusercontent.com/assets/3174804/14075132/394c6a12-f4f4-11e5-8638-20ff7b8cc9bc.png)
![stages-table](https://cloud.githubusercontent.com/assets/3174804/14075134/3b60f412-f4f4-11e5-9ea6-dd0dcc86eb03.png)

Ref : https://github.com/apache/spark/pull/11916

Author: Devaraj K <devaraj@apache.org>

Closes #11996 from devaraj-kavali/SPARK-10530.
2016-05-30 14:29:27 -07:00
Reynold Xin 73178c7556 [SPARK-15633][MINOR] Make package name for Java tests consistent
## What changes were proposed in this pull request?
This is a simple patch that makes package names for Java 8 test suites consistent. I moved everything to test.org.apache.spark so we can test package-private APIs properly. Also added "java8" as the package name so we can easily run all the tests related to Java 8.

## How was this patch tested?
This is a test only change.

Author: Reynold Xin <rxin@databricks.com>

Closes #13364 from rxin/SPARK-15633.
2016-05-27 21:20:02 -07:00
dding3 88c9c467a3 [SPARK-15562][ML] Delete temp directory after program exit in DataFrameExample
## What changes were proposed in this pull request?
The temp directory used to save records is not deleted after program exit in DataFrameExample. Although it calls deleteOnExit, that doesn't work because the directory is not empty. A similar thing happened in ContextCleanerSuite. Update the code to make sure the temp directory is deleted after program exit.

## How was this patch tested?

unit tests and local build.

Author: dding3 <ding.ding@intel.com>

Closes #13328 from dding3/master.
2016-05-27 21:01:50 -05:00
Sital Kedia ce756daa4f [SPARK-15569] Reduce frequency of updateBytesWritten function in Disk…
## What changes were proposed in this pull request?

Profiling a Spark job that spills a large amount of intermediate data, we found that a significant portion of time is spent in the DiskObjectWriter.updateBytesWritten function. Looking at the code, we see that the function is called too frequently to update the number of bytes written to disk. We should reduce the frequency to avoid this.
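
A sketch of the shape of the fix (the batch size and names are assumptions, not the exact DiskObjectWriter change):

```scala
// Only pay the cost of the bytes-written update once every N records instead of on every write.
class ThrottledWriteMetrics(updateBytesWritten: () => Unit, recordsPerUpdate: Int = 16384) {
  private var recordsWritten = 0L

  def recordWritten(): Unit = {
    recordsWritten += 1
    if (recordsWritten % recordsPerUpdate == 0) {
      updateBytesWritten()   // the expensive call now happens far less frequently
    }
  }
}
```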

## How was this patch tested?

Tested by running the job on a cluster; saw a 20% CPU gain from this change.

Author: Sital Kedia <skedia@fb.com>

Closes #13332 from sitalkedia/DiskObjectWriter.
2016-05-27 11:22:39 -07:00
Sameer Agarwal fe6de16f78 [SPARK-8428][SPARK-13850] Fix integer overflows in TimSort
## What changes were proposed in this pull request?

This patch fixes a few integer overflows in `UnsafeSortDataFormat.copyRange()` and `ShuffleSortDataFormat.copyRange()` that seem to be the most likely cause behind a number of `TimSort` contract violation errors seen in Spark 2.0 and Spark 1.6 while sorting large datasets.
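
A minimal illustration of this class of bug (not the actual `copyRange` code):

```scala
// Computing a byte offset with Int arithmetic overflows once position * recordSize exceeds
// Int.MaxValue -- around 268.43 million records for 8-byte entries.
val recordSize = 8
val pos = 300000000                             // ~300 million records

val badOffset: Int = pos * recordSize           // overflows to -1894967296
val goodOffset: Long = pos.toLong * recordSize  // widen to Long first: 2400000000

println((badOffset, goodOffset))
```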

## How was this patch tested?

Added a test in `ExternalSorterSuite` that instantiates a large array of the form [150000000, 150000001, 150000002, ...., 300000000, 0, 1, 2, ..., 149999999] that triggers a `copyRange` in `TimSort.mergeLo` or `TimSort.mergeHi`. Note that the input dataset should contain at least 268.43 million rows with a certain data distribution for an overflow to occur.

Author: Sameer Agarwal <sameer@databricks.com>

Closes #13336 from sameeragarwal/timsort-bug.
2016-05-26 15:49:16 -07:00
Imran Rashid dfc9fc02cc [SPARK-10372] [CORE] basic test framework for entire spark scheduler
This is a basic framework for testing the entire scheduler.  The tests this adds aren't very interesting -- the point of this PR is just to set up the framework, to keep the initial change small, but it can be built upon to test more features (e.g., speculation, killing tasks, blacklisting, etc.).

Author: Imran Rashid <irashid@cloudera.com>

Closes #8559 from squito/SPARK-10372-scheduler-integs.
2016-05-26 00:29:09 -05:00
Takuya UESHIN 698ef762f8 [SPARK-14269][SCHEDULER] Eliminate unnecessary submitStage() call.
## What changes were proposed in this pull request?

Currently `submitStage()` is called for waiting stages on every iteration of the event loop in `DAGScheduler` to submit all waiting stages, but most of those calls are unnecessary because they are not triggered by any change in stage status.
The only case in which we should try to submit waiting stages is when their parent stages have completed successfully.

This elimination can improve `DAGScheduler` performance.
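
A simplified, self-contained sketch of the idea (not the actual `DAGScheduler` code):

```scala
import scala.collection.mutable

// Instead of walking all waiting stages on every event, only the waiting children of a stage
// that has just completed successfully are considered for submission.
case class Stage(id: Int, parents: Seq[Stage])

class WaitingStagePool(submitStage: Stage => Unit) {
  private val waiting = mutable.Set[Stage]()

  def addWaiting(stage: Stage): Unit = waiting += stage

  // Called from the handler for a successful stage completion, not on every event-loop iteration.
  def submitWaitingChildStages(completedParent: Stage): Unit = {
    val children = waiting.filter(_.parents.contains(completedParent)).toArray.sortBy(_.id)
    waiting --= children
    children.foreach(submitStage)
  }
}
```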

## How was this patch tested?

Added some checks, ran the other existing tests, and verified with our own projects.

We have a project bottle-necked by `DAGScheduler`, having about 2000 stages.

Before this patch, almost all of the execution time in the `Driver` process was spent processing `submitStage()` in the `dag-scheduler-event-loop` thread, but after this patch the performance improved as follows:

|        | total execution time | `dag-scheduler-event-loop` thread time | `submitStage()` |
|--------|---------------------:|---------------------------------------:|----------------:|
| Before |              760 sec |                                710 sec |         667 sec |
| After  |              440 sec |                                 14 sec |          10 sec |

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes #12060 from ueshin/issues/SPARK-14269.
2016-05-25 13:57:25 -07:00
Lukasz b120fba6ae [SPARK-9044] Fix "Storage" tab in UI so that it reflects RDD name change.
## What changes were proposed in this pull request?

1. Making 'name' field of RDDInfo mutable.
2. In StorageListener: catching the fact that RDD's name was changed and updating it in RDDInfo.

## How was this patch tested?

1. Manual verification - the 'Storage' tab now behaves as expected.
2. The commit also contains a new unit test which verifies this.

Author: Lukasz <lgieron@gmail.com>

Closes #13264 from lgieron/SPARK-9044.
2016-05-25 10:24:21 -07:00
Reynold Xin 14494da87b [SPARK-15518] Rename various scheduler backend for consistency
## What changes were proposed in this pull request?
This patch renames various scheduler backends to make them consistent:

- LocalScheduler -> LocalSchedulerBackend
- AppClient -> StandaloneAppClient
- AppClientListener -> StandaloneAppClientListener
- SparkDeploySchedulerBackend -> StandaloneSchedulerBackend
- CoarseMesosSchedulerBackend -> MesosCoarseGrainedSchedulerBackend
- MesosSchedulerBackend -> MesosFineGrainedSchedulerBackend

## How was this patch tested?
Updated test cases to reflect the name change.

Author: Reynold Xin <rxin@databricks.com>

Closes #13288 from rxin/SPARK-15518.
2016-05-24 20:55:47 -07:00
Dongjoon Hyun f08bf587b1 [SPARK-15512][CORE] repartition(0) should raise IllegalArgumentException
## What changes were proposed in this pull request?

Previously, SPARK-8893 added the constraint that the number of partitions must be positive for repartition/coalesce operations in general. This PR adds the one missing part of that and adds two explicit test cases.

**Before**
```scala
scala> sc.parallelize(1 to 5).coalesce(0)
java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive.
...
scala> sc.parallelize(1 to 5).repartition(0).collect()
res1: Array[Int] = Array()   // empty
scala> spark.sql("select 1").coalesce(0)
res2: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [1: int]
scala> spark.sql("select 1").coalesce(0).collect()
java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive.
scala> spark.sql("select 1").repartition(0)
res3: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [1: int]
scala> spark.sql("select 1").repartition(0).collect()
res4: Array[org.apache.spark.sql.Row] = Array()  // empty
```

**After**
```scala
scala> sc.parallelize(1 to 5).coalesce(0)
java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive.
...
scala> sc.parallelize(1 to 5).repartition(0)
java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive.
...
scala> spark.sql("select 1").coalesce(0)
java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive.
...
scala> spark.sql("select 1").repartition(0)
java.lang.IllegalArgumentException: requirement failed: Number of partitions (0) must be positive.
...
```

## How was this patch tested?

Pass the Jenkins tests with new testcases.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #13282 from dongjoon-hyun/SPARK-15512.
2016-05-24 18:55:23 -07:00
Dongjoon Hyun be99a99fe7 [MINOR][CORE][TEST] Update obsolete takeSample test case.
## What changes were proposed in this pull request?

This PR fixes some obsolete comments and assertions in the `takeSample` test case of `RDDSuite.scala`.

## How was this patch tested?

This fixes the testcase only.

Author: Dongjoon Hyun <dongjoon@apache.org>

Closes #13260 from dongjoon-hyun/SPARK-15481.
2016-05-24 11:09:54 -07:00
Xin Wu 01659bc50c [SPARK-15431][SQL] Support LIST FILE(s)|JAR(s) command natively
## What changes were proposed in this pull request?
Currently the command `ADD FILE|JAR <filepath | jarpath>` is supported natively in Spark SQL. However, when this command is run, the added file/jar cannot be looked up with a `LIST FILE(s)|JAR(s)` command, because the `LIST` command is passed to the Hive command processor in Spark SQL or is simply not supported in the Spark shell. There is no way for users to find out what files/jars have been added to the SparkContext.
Refer to [Hive commands](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Cli)

This PR is to support following commands:
`LIST (FILE[s] [filepath ...] | JAR[s] [jarfile ...])`

### For example:
##### LIST FILE(s)
```
scala> spark.sql("add file hdfs://bdavm009.svl.ibm.com:8020/tmp/test.txt")
res1: org.apache.spark.sql.DataFrame = []
scala> spark.sql("add file hdfs://bdavm009.svl.ibm.com:8020/tmp/test1.txt")
res2: org.apache.spark.sql.DataFrame = []

scala> spark.sql("list file hdfs://bdavm009.svl.ibm.com:8020/tmp/test1.txt").show(false)
+----------------------------------------------+
|result                                        |
+----------------------------------------------+
|hdfs://bdavm009.svl.ibm.com:8020/tmp/test1.txt|
+----------------------------------------------+

scala> spark.sql("list files").show(false)
+----------------------------------------------+
|result                                        |
+----------------------------------------------+
|hdfs://bdavm009.svl.ibm.com:8020/tmp/test1.txt|
|hdfs://bdavm009.svl.ibm.com:8020/tmp/test.txt |
+----------------------------------------------+
```

##### LIST JAR(s)
```
scala> spark.sql("add jar /Users/xinwu/spark/core/src/test/resources/TestUDTF.jar")
res9: org.apache.spark.sql.DataFrame = [result: int]

scala> spark.sql("list jar TestUDTF.jar").show(false)
+---------------------------------------------+
|result                                       |
+---------------------------------------------+
|spark://192.168.1.234:50131/jars/TestUDTF.jar|
+---------------------------------------------+

scala> spark.sql("list jars").show(false)
+---------------------------------------------+
|result                                       |
+---------------------------------------------+
|spark://192.168.1.234:50131/jars/TestUDTF.jar|
+---------------------------------------------+
```
## How was this patch tested?
New test cases are added for the Spark SQL, Spark shell and SparkContext API code paths.

Author: Xin Wu <xinwu@us.ibm.com>
Author: xin Wu <xinwu@us.ibm.com>

Closes #13212 from xwu0226/list_command.
2016-05-23 17:32:01 -07:00
Shixiong Zhu 305263954a Fix the compiler error introduced by #13153 for Scala 2.10 2016-05-19 12:36:44 -07:00
Shixiong Zhu 4e3cb7a5d9 [SPARK-15317][CORE] Don't store accumulators for every task in listeners
## What changes were proposed in this pull request?

In general, the Web UI doesn't need to store the Accumulator/AccumulableInfo for every task. It only needs the Accumulator values.

In this PR, it creates new UIData classes to store only the necessary fields and makes `JobProgressListener` store these new classes, so that `JobProgressListener` won't store Accumulator/AccumulableInfo and its size becomes pretty small. It also eliminates `AccumulableInfo` from `SQLListener` so that we don't keep any references to those unused `AccumulableInfo`s.

## How was this patch tested?

I ran two tests reported in JIRA locally:

The first one is:
```
val data = spark.range(0, 10000, 1, 10000)
data.cache().count()
```
The retained size of JobProgressListener decreases from 60.7M to 6.9M.

The second one is:
```
import org.apache.spark.ml.CC
import org.apache.spark.sql.SQLContext
val sqlContext = SQLContext.getOrCreate(sc)
CC.runTest(sqlContext)
```

This test won't cause OOM after applying this patch.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #13153 from zsxwing/memory.
2016-05-19 12:05:17 -07:00
Davies Liu ad182086cc [SPARK-15300] Fix writer lock conflict when remove a block
## What changes were proposed in this pull request?

A writer lock can be acquired when 1) creating a new block, 2) removing a block, or 3) evicting a block to disk. 1) and 3) can happen at the same time within the same task, and all of them can happen at the same time outside a task. It is OK for someone to try to grab the write lock for a block while the block is already held by another holder with the same task attempt id.

This PR removes the check.

## How was this patch tested?

Updated existing tests.

Author: Davies Liu <davies@databricks.com>

Closes #13082 from davies/write_lock_conflict.
2016-05-19 11:47:17 -07:00
Sandeep Singh 3facca5152 [CORE][MINOR] Remove redundant set master in OutputCommitCoordinatorIntegrationSuite
## What changes were proposed in this pull request?
Remove the redundant set-master call in OutputCommitCoordinatorIntegrationSuite, as it is already set for the SparkContext below on line 43.

## How was this patch tested?
existing tests

Author: Sandeep Singh <sandeep@techaddict.me>

Closes #13168 from techaddict/minor-1.
2016-05-19 10:44:26 +01:00
Davies Liu 8fb1d1c7f3 [SPARK-15357] Cooperative spilling should check consumer memory mode
## What changes were proposed in this pull request?

Since we support forced spilling for Spillable, which only works in on-heap mode, unlike other SQL operators (which can be on-heap or off-heap), we should consider the memory mode of the consumer before triggering forced spilling.
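
A hedged sketch of the check (types and names are illustrative, not the actual TaskMemoryManager code):

```scala
// Only ask another consumer to spill if it runs in the same memory mode as the requester,
// since forced spilling of Spillable only works on-heap.
sealed trait MemoryMode
case object OnHeap extends MemoryMode
case object OffHeap extends MemoryMode

trait Consumer {
  def mode: MemoryMode
  def spill(required: Long): Long   // returns the number of bytes actually released
}

def spillOthers(required: Long, requesterMode: MemoryMode, consumers: Seq[Consumer]): Long = {
  var freed = 0L
  for (c <- consumers if freed < required && c.mode == requesterMode) {
    freed += c.spill(required - freed)   // consumers in a different memory mode are skipped
  }
  freed
}
```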

## How was this patch tested?

Add new test.

Author: Davies Liu <davies@databricks.com>

Closes #13151 from davies/fix_mode.
2016-05-18 09:44:21 -07:00
WeichenXu 2f9047b5eb [SPARK-15322][MLLIB][CORE][SQL] update deprecated accumulator usage to AccumulatorV2 in the Spark project
## What changes were proposed in this pull request?

I used IntelliJ IDEA to search for usages of the deprecated SparkContext.accumulator in the whole Spark project and updated the code (except the test code for the accumulator methods themselves).
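
The typical shape of the migration looks like this (runnable locally; the accumulator name is just an example):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("acc-migration").getOrCreate()
val sc = spark.sparkContext

// Old, deprecated style:
//   val oldAcc = sc.accumulator(0L, "records")
//   sc.parallelize(1 to 100).foreach(_ => oldAcc += 1)

// New AccumulatorV2-based style:
val acc = sc.longAccumulator("records")
sc.parallelize(1 to 100).foreach(_ => acc.add(1))
println(acc.value)   // 100

spark.stop()
```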

## How was this patch tested?

Existing unit tests

Author: WeichenXu <WeichenXu123@outlook.com>

Closes #13112 from WeichenXu123/update_accuV2_in_mllib.
2016-05-18 11:48:46 +01:00
Takuya UESHIN a57aadae84 [SPARK-13902][SCHEDULER] Make DAGScheduler not to create duplicate stage.
## What changes were proposed in this pull request?

`DAGScheduler` sometimes generates an incorrect stage graph.

Suppose you have the following DAG:

```
[A] <--(s_A)-- [B] <--(s_B)-- [C] <--(s_C)-- [D]
            \                /
              <-------------
```

Note: [] means an RDD, () means a shuffle dependency.

Here, RDD `B` has a shuffle dependency on RDD `A`, and RDD `C` has a shuffle dependency on both `B` and `A`. The shuffle dependency IDs are numbers in the `DAGScheduler`, but to make the example easier to understand, let's call the shuffled data from `A` shuffle dependency ID `s_A` and the shuffled data from `B` shuffle dependency ID `s_B`.
The `getAncestorShuffleDependencies` method in `DAGScheduler` (incorrectly) does not check for duplicates when adding ShuffleDependencies to the parents data structure, so for this DAG, when `getAncestorShuffleDependencies` gets called on `C` (the RDD just before the final RDD), `getAncestorShuffleDependencies` will return `s_A`, `s_B`, `s_A` (`s_A` gets added twice: once when the method "visit"s RDD `C`, and once when the method "visit"s RDD `B`). This is problematic because this line of code: https://github.com/apache/spark/blob/8ef3399/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L289 then generates a new shuffle stage for each dependency returned by `getAncestorShuffleDependencies`, resulting in duplicate map stages that compute the map output from RDD `A`.

As a result, `DAGScheduler` generates the following stages and their parents for each shuffle:

| | stage | parents |
|----|----|----|
| s_A | ShuffleMapStage 2 | List() |
| s_B | ShuffleMapStage 1 | List(ShuffleMapStage 0) |
| s_C | ShuffleMapStage 3 | List(ShuffleMapStage 1, ShuffleMapStage 2) |
| - | ResultStage 4 | List(ShuffleMapStage 3) |

The stage for `s_A` should be `ShuffleMapStage 0`, but the stage for `s_A` is generated twice: `ShuffleMapStage 0` is overwritten by `ShuffleMapStage 2`, while the stage `ShuffleMapStage 1` keeps referring to the old stage `ShuffleMapStage 0`.

This patch fixes it.
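
A simplified, self-contained sketch of the fix's idea (not the actual `DAGScheduler` code):

```scala
import scala.collection.mutable

// Minimal node types for the sketch (illustrative, not Spark's RDD/Dependency classes).
sealed trait Dep
case class ShuffleDep(shuffleId: Int, parent: Node) extends Dep
case class NarrowDep(parent: Node) extends Dep
case class Node(name: String, deps: Seq[Dep])

// While walking the lineage for ancestor shuffle dependencies, remember which shuffle IDs were
// already collected, so a dependency reachable through two paths (like s_A above) is only
// returned -- and therefore only turned into a stage -- once.
def ancestorShuffleIds(root: Node): Seq[Int] = {
  val seen = mutable.LinkedHashSet[Int]()
  var toVisit = List(root)
  while (toVisit.nonEmpty) {
    val node = toVisit.head
    toVisit = toVisit.tail
    node.deps.foreach {
      case ShuffleDep(id, parent) =>
        seen += id                    // the set silently drops duplicates
        toVisit = parent :: toVisit
      case NarrowDep(parent) =>
        toVisit = parent :: toVisit
    }
  }
  seen.toSeq
}
```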

## How was this patch tested?

I added the sample RDD graph to show the illegal stage graph to `DAGSchedulerSuite`.

Author: Takuya UESHIN <ueshin@happy-camper.st>

Closes #12655 from ueshin/issues/SPARK-13902.
2016-05-12 12:36:18 -07:00
Sandeep Singh ff92eb2e80 [SPARK-15080][CORE] Break copyAndReset into copy and reset
## What changes were proposed in this pull request?
Break copyAndReset into two methods copy and reset instead of just one.
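
A sketch of the split on an illustrative accumulator-like class (not Spark's actual AccumulatorV2):

```scala
class CountingAcc(private var count: Long = 0L) {
  def add(v: Long): Unit = count += v
  def value: Long = count

  def copy(): CountingAcc = new CountingAcc(count)  // duplicate the current state
  def reset(): Unit = count = 0L                    // zero this accumulator out

  // The old combined operation is now just a composition of the two finer-grained methods.
  def copyAndReset(): CountingAcc = {
    val copied = copy()
    copied.reset()
    copied
  }
}
```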

## How was this patch tested?
Existing Tests

Author: Sandeep Singh <sandeep@techaddict.me>

Closes #12936 from techaddict/SPARK-15080.
2016-05-12 11:12:09 +08:00
Andrew Or bb88ad4e0e [SPARK-15260] Atomically resize memory pools
## What changes were proposed in this pull request?

When we acquire execution memory, we do a lot of things between shrinking the storage memory pool and enlarging the execution memory pool. In particular, we call `memoryStore.evictBlocksToFreeSpace`, which may do a lot of I/O and can throw exceptions. If an exception is thrown, the pool sizes on that executor will be in a bad state.

This patch minimizes the things we do between the two calls to make the resizing more atomic.

## How was this patch tested?

Jenkins.

Author: Andrew Or <andrew@databricks.com>

Closes #13039 from andrewor14/safer-pool.
2016-05-11 12:58:57 -07:00
Eric Liang 6d0368ab8d [SPARK-15259] Sort time metric should not include spill and record insertion time
## What changes were proposed in this pull request?

After SPARK-14669 it seems the sort time metric includes both spill and record insertion time. This makes it not very useful since the metric becomes close to the total execution time of the node.

We should track just the time spent for in-memory sort, as before.

## How was this patch tested?

Verified metric in the UI, also unit test on UnsafeExternalRowSorter.

cc davies

Author: Eric Liang <ekl@databricks.com>
Author: Eric Liang <ekhliang@gmail.com>

Closes #13035 from ericl/fix-metrics.
2016-05-11 11:25:46 -07:00
Wenchen Fan bcfee153b1 [SPARK-12837][CORE] reduce network IO for accumulators
Sending un-updated accumulators back to the driver makes no sense, as merging a zero-value accumulator is a no-op. We should only send back updated accumulators, to save network IO.

new test in `TaskContextSuite`

Author: Wenchen Fan <wenchen@databricks.com>

Closes #12899 from cloud-fan/acc.
2016-05-10 11:16:56 -07:00
Marcelo Vanzin 0b9cae4242 [SPARK-11249][LAUNCHER] Throw error if app resource is not provided.
Without this, the code would build an invalid spark-submit command line,
and a more cryptic error would be presented to the user. Also, expose
a constant that allows users to set a dummy resource in cases where
they don't need an actual resource file; for backwards compatibility,
that uses the same "spark-internal" resource that Spark itself uses.

Tested via unit tests, run-example, spark-shell, and running the
thrift server with mixed spark and hive command line arguments.

Author: Marcelo Vanzin <vanzin@cloudera.com>

Closes #12909 from vanzin/SPARK-11249.
2016-05-10 10:35:54 -07:00
Sital Kedia a019e6efb7 [SPARK-14542][CORE] PipeRDD should allow configurable buffer size for…
## What changes were proposed in this pull request?

Currently PipedRDD internally uses a PrintWriter to write data to the stdin of the piped process, which by default uses a BufferedWriter with an 8k buffer. In our experiments, we have seen that an 8k buffer is too small and the job spends a significant amount of CPU time in system calls to copy the data. We should have a way to configure the buffer size for the writer.
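
A sketch of the idea (the parameter name and default are assumptions, not necessarily what the patch uses):

```scala
import java.io.{BufferedWriter, OutputStreamWriter, PrintWriter}
import java.nio.charset.StandardCharsets

// Wrap the child process's stdin in a BufferedWriter whose size is configurable instead of
// relying on the default 8k buffer.
def stdinWriter(proc: Process, bufferSizeBytes: Int = 64 * 1024): PrintWriter = {
  val raw = new OutputStreamWriter(proc.getOutputStream, StandardCharsets.UTF_8)
  new PrintWriter(new BufferedWriter(raw, bufferSizeBytes))
}
```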

## How was this patch tested?
Ran PipedRDDSuite tests.

Author: Sital Kedia <skedia@fb.com>

Closes #12309 from sitalkedia/bufferedPipedRDD.
2016-05-10 15:28:35 +01:00
Alex Bozarth c3e23bc0c3 [SPARK-10653][CORE] Remove unnecessary things from SparkEnv
## What changes were proposed in this pull request?

Removed blockTransferService and sparkFilesDir from SparkEnv since they're rarely used and don't need to be stored in the env. Edited their few usages to accommodate the change.

## How was this patch tested?

ran dev/run-tests locally

Author: Alex Bozarth <ajbozart@us.ibm.com>

Closes #12970 from ajbozarth/spark10653.
2016-05-09 11:51:37 -07:00
Thomas Graves cc95f1ed5f [SPARK-1239] Improve fetching of map output statuses
The main issue we are trying to solve is the memory bloat of the Driver when tasks request the map output statuses.  This means that with a large number of tasks you either need a huge amount of memory on the Driver or you have to repartition to a smaller number.  This makes it really difficult to run over, say, 50000 tasks.

The main issues that cause the memory bloat are:
1) There is no flow control on sending the map output status responses.  We serialize the map output statuses and then hand them off to netty to send.  Netty sends asynchronously and can't send them fast enough to keep up with incoming requests, so we end up with lots of copies of the serialized map output statuses sitting there, which causes huge bloat when you have tens of thousands of tasks and the map output status is in the tens of MB.
2) When the initial reduce tasks are started up, they all request the map output statuses from the Driver. These requests are handled by multiple threads in parallel, so even though we check for a cached version, initially, when we don't have a cached version yet, many of the initial requests can all end up serializing the exact same map output statuses.

This patch does a couple of things:
- When the map output status size is over a threshold (default 512K), it uses broadcast to send the map statuses.  This means we no longer send a large serialized map output status per request and thus we don't have issues with memory bloat. The message sizes are now in the 300-400 byte range and the map output statuses are broadcast. If it's under the threshold, it is sent as before; the message now contains a DIRECT indicator. (See the sketch after this list.)
- Synchronize the incoming requests so that a single thread caches the serialized output and broadcasts the map output statuses, which can then be used by everyone else.  This ensures we don't create multiple broadcast variables when we don't need to.  To ensure this happens I added a second thread pool to which the Dispatcher hands the requests, so that those threads can block without blocking the main dispatcher threads (which would cause things like heartbeats not to come through).
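
A hedged sketch of the first change (marker bytes and names are illustrative, not the actual MapOutputTracker wire format):

```scala
// Small serialized statuses travel inline with a DIRECT marker; large ones are registered as a
// broadcast variable and only a tiny handle is sent back to the requesting executor.
val minSizeForBroadcast = 512 * 1024   // the 512K threshold mentioned above

def encodeStatuses(serialized: Array[Byte], registerBroadcast: Array[Byte] => Long): Array[Byte] = {
  if (serialized.length < minSizeForBroadcast) {
    Array(0.toByte) ++ serialized                    // DIRECT: payload is sent as-is
  } else {
    val handle = registerBroadcast(serialized)       // executors fetch the data via broadcast
    Array(1.toByte) ++ BigInt(handle).toByteArray    // BROADCAST: only a small handle
  }
}
```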

Note that some of design and code was contributed by mridulm

## How was this patch tested?

Unit tests and a lot of manually testing.
Ran with akka and netty rpc. Ran with both dynamic allocation on and off.

one of the large jobs I used to test this was a join of 15TB of data.  it had 200,000 map tasks, and  20,000 reduce tasks. Executors ranged from 200 to 2000.  This job ran successfully with 5GB of memory on the driver with these changes. Without these changes I was using 20GB and only had 500 reduce tasks.  The job has 50mb of serialized map output statuses and took roughly the same amount of time for the executors to get the map output statuses as before.

Ran a variety of other jobs, from large wordcounts to small ones not using broadcasts.

Author: Thomas Graves <tgraves@staydecay.corp.gq1.yahoo.com>

Closes #12113 from tgravescs/SPARK-1239.
2016-05-06 19:31:26 -07:00
Ryan Blue 08db491265 [SPARK-9926] Parallelize partition logic in UnionRDD.
This patch has the new logic from #8512 that uses a parallel collection to compute partitions in UnionRDD. The rest of #8512 added an alternative code path for calculating splits in S3, but that isn't necessary to get the same speedup. The underlying problem wasn't that bulk listing wasn't used, it was that an extra FileStatus was retrieved for each file. The fix was just committed as [HADOOP-12810](https://issues.apache.org/jira/browse/HADOOP-12810). (I think the original commit also used a single prefix to enumerate all paths, but that isn't always helpful and it was removed in later versions so there is no need for SparkS3Utils.)

I tested this using the same table that piapiaozhexiu was using. Calculating splits for a 10-day period took 25 seconds with this change and HADOOP-12810, which is on par with the results from #8512.

Author: Ryan Blue <blue@apache.org>
Author: Cheolsoo Park <cheolsoop@netflix.com>

Closes #11242 from rdblue/SPARK-9926-parallelize-union-rdd.
2016-05-05 14:40:37 -07:00
Sebastien Rainville eb019af9a9 [SPARK-13001][CORE][MESOS] Prevent getting offers when reached max cores
Similar to https://github.com/apache/spark/pull/8639

This change rejects offers for 120s when `spark.cores.max` is reached in coarse-grained mode to mitigate offer starvation. This prevents Mesos from sending us offers again and again, starving other frameworks. This is especially problematic when running many small frameworks on the same Mesos cluster, e.g. many small Spark streaming jobs, and causes the bigger Spark jobs to stop receiving offers. By rejecting the offers for a long period of time, they become available to those other frameworks.

Author: Sebastien Rainville <sebastien@hopper.com>

Closes #10924 from sebastienrainville/master.
2016-05-04 14:32:36 -07:00
Bryan Cutler cf2e9da612 [SPARK-12299][CORE] Remove history serving functionality from Master
Remove history server functionality from standalone Master.  Previously, the Master process rebuilt a SparkUI once the application was completed which sometimes caused problems, such as OOM, when the application event log is large (see SPARK-6270).  Keeping this functionality out of the Master will help to simplify the process and increase stability.

Testing for this change included running core unit tests and manually running an application on a standalone cluster to verify that it completed successfully and that the Master UI functioned correctly.  Also added 2 unit tests to verify killing an application and driver from MasterWebUI makes the correct request to the Master.

Author: Bryan Cutler <cutlerb@gmail.com>

Closes #10991 from BryanCutler/remove-history-master-SPARK-12299.
2016-05-04 14:29:54 -07:00
Reynold Xin 6274a520fa [SPARK-15115][SQL] Reorganize whole stage codegen benchmark suites
## What changes were proposed in this pull request?
We currently have a single suite that is very large, making it difficult to maintain and play with specific primitives. This patch reorganizes the file by creating multiple benchmark suites in a single package.

Most of the changes are straightforward move of code. On top of the code moving, I did:
1. Use SparkSession instead of SQLContext.
2. Turned most benchmark scenarios into their own test cases, rather than having multiple scenarios in a single test case, which takes forever to run.

## How was this patch tested?
This is a test only change.

Author: Reynold Xin <rxin@databricks.com>

Closes #12891 from rxin/SPARK-15115.
2016-05-04 11:00:01 -07:00
Dhruve Ashar a45647746d [SPARK-4224][CORE][YARN] Support group acls
## What changes were proposed in this pull request?
Currently only a list of users can be specified for view and modify acls. This change enables a group of admins/devs/users to be provisioned for viewing and modifying Spark jobs.

**Changes Proposed in the fix**
Three new corresponding config entries have been added where the user can specify the groups to be given access.

```
spark.admin.acls.groups
spark.modify.acls.groups
spark.ui.view.acls.groups
```

New config entries were added because specifying the users and groups explicitly is a better and cleaner way compared to specifying them in the existing config entry using a delimiter.

A generic trait has been introduced to provide the user-to-group mapping, which makes it pluggable to support a variety of mapping protocols - similar to the one used in Hadoop. A default unix-shell-based implementation has been provided.
A custom user-to-group mapping protocol can be specified and configured by the entry ```spark.user.groups.mapping```
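
A hedged sketch of what such a pluggable mapping looks like (the real trait and class names in the patch may differ):

```scala
import scala.sys.process._

trait GroupMapping {
  /** Return the set of groups the given user belongs to. */
  def getGroups(userName: String): Set[String]
}

// A unix-shell-based default, analogous to running `id -Gn <user>`.
class ShellBasedGroupMapping extends GroupMapping {
  override def getGroups(userName: String): Set[String] = {
    try Seq("id", "-Gn", userName).!!.trim.split("\\s+").toSet
    catch { case _: Exception => Set.empty[String] }
  }
}
```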

**How the patch was Tested**
We ran different spark jobs setting the config entries in combinations of admin, modify and ui acls. For modify acls we tried killing the job stages from the ui and using yarn commands. For view acls we tried accessing the UI tabs and the logs. Headless accounts were used to launch these jobs and different users tried to modify and view the jobs to ensure that the groups mapping applied correctly.

Additional Unit tests have been added without modifying the existing ones. These test for different ways of setting the acls through configuration and/or API and validate the expected behavior.

Author: Dhruve Ashar <dhruveashar@gmail.com>

Closes #12760 from dhruve/impr/SPARK-4224.
2016-05-04 08:45:43 -05:00
Reynold Xin 695f0e9195 [SPARK-15107][SQL] Allow varying # iterations by test case in Benchmark
## What changes were proposed in this pull request?
This patch changes our micro-benchmark util to allow setting different iteration numbers for different test cases. For some of our benchmarks, turning off whole-stage codegen can make the runtime 20X slower, making it very difficult to run a large number of times without substantially shortening the input cardinality.

With this change, I set the default num iterations to 2 for whole stage codegen off, and 5 for whole stage codegen on. I also updated some results.

## How was this patch tested?
N/A - this is a test util.

Author: Reynold Xin <rxin@databricks.com>

Closes #12884 from rxin/SPARK-15107.
2016-05-03 22:56:40 -07:00
Thomas Graves 83ee92f603 [SPARK-11316] coalesce doesn't handle UnionRDD with partial locality properly
## What changes were proposed in this pull request?

coalesce doesn't handle a UnionRDD with partial locality properly.  I had a user who had a UnionRDD that was made up of a mapPartitionRDD without preferred locations and a checkpointed RDD with preferred locations (read from HDFS).  It took the driver over 20 minutes to set up the groups and put the partitions into those groups before it even started any tasks.  Perhaps even worse, it didn't end up with the number of partitions he was asking for because it didn't put a partition in each of the groups properly.

The changes in this patch get rid of an n^2 while loop that was causing the 20 minutes, properly distribute the partitions so that there is at least one per group, and change from using the rotation iterator, which fetched the preferred locations many times, to getting all the preferred locations once up front.

Note that the n^2 while loop that I removed in setupGroups took so long because all of the partitions with preferred locations were already assigned to a group, so it basically looped through every single one and was never able to assign it.  At the time I had 960 partitions with preferred locations and 1020 without, and the outer while loop ran 319 times because that is the number of groups left to create.  Note that each of those times through the inner while loop goes off to HDFS to get the block locations, so this is extremely inefficient.

## How was this patch tested?

Added unit tests for this case and ran existing ones that applied to make sure there were no regressions.
Also manually tested on the user's production job to make sure it fixed their issue.  It created the proper number of partitions and now takes about 6 seconds rather than 20 minutes.
I also ran some basic manual tests with spark-shell, coalescing to a smaller number, the same number, and then a greater number with shuffle.

Author: Thomas Graves <tgraves@prevailsail.corp.gq1.yahoo.com>

Closes #11327 from tgravescs/SPARK-11316.
2016-05-03 13:43:20 -07:00
Sandeep Singh 84b3a4a873 [SPARK-15082][CORE] Improve unit test coverage for AccumulatorV2
## What changes were proposed in this pull request?
Added tests for ListAccumulator and LegacyAccumulatorWrapper; the test for ListAccumulator is similar to the ones for the old collection accumulators.

## How was this patch tested?
Ran tests locally.

cc rxin

Author: Sandeep Singh <sandeep@techaddict.me>

Closes #12862 from techaddict/SPARK-15082.
2016-05-03 11:45:51 -07:00
Sandeep Singh ca813330c7 [SPARK-15087][CORE][SQL] Remove AccumulatorV2.localValue and keep only value
## What changes were proposed in this pull request?
Remove AccumulatorV2.localValue and keep only value

## How was this patch tested?
existing tests

Author: Sandeep Singh <sandeep@techaddict.me>

Closes #12865 from techaddict/SPARK-15087.
2016-05-03 11:38:43 -07:00
Reynold Xin d557a5e01e [SPARK-15081] Move AccumulatorV2 and subclasses into util package
## What changes were proposed in this pull request?
This patch moves AccumulatorV2 and subclasses into util package.

## How was this patch tested?
Updated relevant tests.

Author: Reynold Xin <rxin@databricks.com>

Closes #12863 from rxin/SPARK-15081.
2016-05-03 19:45:12 +08:00
Reynold Xin bb9ab56b96 [SPARK-15079] Support average/count/sum in Long/DoubleAccumulator
## What changes were proposed in this pull request?
This patch removes AverageAccumulator and adds the ability to compute average to LongAccumulator and DoubleAccumulator. The patch also improves documentation for the two accumulators.
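
Example of the consolidated API (assumes a running SparkContext `sc`, e.g. in spark-shell):

```scala
val acc = sc.longAccumulator("latency")
sc.parallelize(Seq(10L, 20L, 30L)).foreach(x => acc.add(x))
println(acc.sum)     // 60
println(acc.count)   // 3
println(acc.avg)     // 20.0
```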

## How was this patch tested?
Added unit tests for this.

Author: Reynold Xin <rxin@databricks.com>

Closes #12858 from rxin/SPARK-15079.
2016-05-02 21:12:48 -07:00
Marcin Tustin 8028f3a0b4 [SPARK-14685][CORE] Document heritability of localProperties
## What changes were proposed in this pull request?

This updates the java-/scala- doc for setLocalProperty to document heritability of localProperties. This also adds tests for that behaviour.
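
The documented behaviour looks like this (runnable in spark-shell, where `sc` is the SparkContext):

```scala
// A local property set in a thread is inherited by threads created afterwards.
sc.setLocalProperty("my.tag", "parent-value")

val child = new Thread(new Runnable {
  override def run(): Unit =
    println(sc.getLocalProperty("my.tag"))   // prints "parent-value"
})
child.start()
child.join()
```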

## How was this patch tested?

Tests pass. New tests were added.

Author: Marcin Tustin <marcin.tustin@gmail.com>

Closes #12455 from marcintustin/SPARK-14685.
2016-05-02 19:37:57 -07:00
Reynold Xin 44da8d8eab [SPARK-15049] Rename NewAccumulator to AccumulatorV2
## What changes were proposed in this pull request?
NewAccumulator isn't the best name if we ever come up with v3 of the API.

## How was this patch tested?
Updated tests to reflect the change.

Author: Reynold Xin <rxin@databricks.com>

Closes #12827 from rxin/SPARK-15049.
2016-05-01 20:21:02 -07:00
Allen cdf9e9753d [SPARK-14505][CORE] Fix bug: creating two SparkContext objects in the same jvm, the first one cannot run any task!
After creating two SparkContext objects in the same JVM (the second one cannot be created successfully!),
using the first one to run a job will throw an exception like the one below:

![image](https://cloud.githubusercontent.com/assets/7162889/14402832/0c8da2a6-fe73-11e5-8aba-68ee3ddaf605.png)

Author: Allen <yufan_1990@163.com>

Closes #12273 from the-sea/context-create-bug.
2016-05-01 15:39:14 +01:00
tedyu dcfaeadea7 [SPARK-15003] Use ConcurrentHashMap in place of HashMap for NewAccumulator.originals
## What changes were proposed in this pull request?

This PR proposes to use ConcurrentHashMap in place of HashMap for NewAccumulator.originals

This should result in better performance.

## How was this patch tested?

Existing unit test suite

cloud-fan

Author: tedyu <yuzhihong@gmail.com>

Closes #12776 from tedyu/master.
2016-04-30 07:54:53 +08:00
Wenchen Fan 6f9a18fe31 [HOTFIX][CORE] fix a concurrence issue in NewAccumulator
## What changes were proposed in this pull request?

`AccumulatorContext` is not thread-safe, which is why all of its methods are synchronized. However, there is one exception: `AccumulatorContext.originals`. `NewAccumulator` uses it to check if it's registered, which is wrong as that access is not synchronized.

This PR marks `AccumulatorContext.originals` as `private`, so now all access to `AccumulatorContext` is synchronized.

## How was this patch tested?

I verified it locally. To be safe, we can let jenkins test it many times to make sure this problem is gone.

Author: Wenchen Fan <wenchen@databricks.com>

Closes #12773 from cloud-fan/debug.
2016-04-28 21:57:58 -07:00
Xin Ren 5743352a28 [SPARK-14935][CORE] DistributedSuite "local-cluster format" shouldn't actually launch clusters
https://issues.apache.org/jira/browse/SPARK-14935

In DistributedSuite, the "local-cluster format" test actually launches a bunch of clusters, but this doesn't seem necessary for what should just be a unit test of a regex. We should clean up the code so that this is testable without actually launching a cluster, which should buy us about 20 seconds per build.

Passed unit test on my local machine

Author: Xin Ren <iamshrek@126.com>

Closes #12744 from keypointt/SPARK-14935.
2016-04-28 10:50:06 -07:00
Wenchen Fan bf5496dbda [SPARK-14654][CORE] New accumulator API
## What changes were proposed in this pull request?

This PR introduces a new accumulator API  which is much simpler than before:

1. the type hierarchy is simplified, now we only have an `Accumulator` class
2. Combine `initialValue` and `zeroValue` concepts into just one concept: `zeroValue`
3. there is only one `register` method; accumulator registration and cleanup registration are combined.
4. the `id`, `name` and `countFailedValues` are combined into an `AccumulatorMetadata`, which is provided during registration.

`SQLMetric` is a good example to show the simplicity of this new API.
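
A hedged sketch of a user-defined accumulator against this API, written with the class name it eventually received (`AccumulatorV2`; at the time of this commit it was still called `NewAccumulator`):

```scala
import scala.collection.mutable
import org.apache.spark.util.AccumulatorV2

class StringSetAccumulator extends AccumulatorV2[String, Set[String]] {
  private val items = mutable.Set[String]()

  override def isZero: Boolean = items.isEmpty
  override def copy(): StringSetAccumulator = {
    val acc = new StringSetAccumulator
    acc.items ++= items
    acc
  }
  override def reset(): Unit = items.clear()
  override def add(v: String): Unit = items += v
  override def merge(other: AccumulatorV2[String, Set[String]]): Unit = other match {
    case o: StringSetAccumulator => items ++= o.items
    case _ => throw new UnsupportedOperationException("cannot merge different accumulator types")
  }
  override def value: Set[String] = items.toSet
}

// The single registration entry point (sc is a SparkContext):
//   val acc = new StringSetAccumulator
//   sc.register(acc, "distinct errors")
```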

What we break:

1. no `setValue` anymore. In the new API, the intermediate type can be different from the result type, so it's very hard to implement a general `setValue`.
2. accumulator can't be serialized before registered.

Problems need to be addressed in follow-ups:

1. with this new API, `AccumulatorInfo` doesn't make a lot of sense, the partial output is not partial updates, we need to expose the intermediate value.
2. `ExceptionFailure` should not carry the accumulator updates. Why do users care about accumulator updates for failed cases? It looks like we only use this feature to update the internal metrics; how about sending a heartbeat to update internal metrics after the failure event?
3. the public event `SparkListenerTaskEnd` carries a `TaskMetrics`. Ideally this `TaskMetrics` shouldn't need to carry external accumulators, as the only method of `TaskMetrics` that can access external accumulators is `private[spark]`. However, `SQLListener` uses it to retrieve SQL metrics.

## How was this patch tested?

existing tests

Author: Wenchen Fan <wenchen@databricks.com>

Closes #12612 from cloud-fan/acc.
2016-04-28 00:26:39 -07:00
Hemant Bhanawat e4d439c831 [SPARK-14729][SCHEDULER] Refactored YARN scheduler creation code to use newly added ExternalClusterManager
## What changes were proposed in this pull request?
With the addition of the ExternalClusterManager (ECM) interface in PR #11723, any cluster manager can now be integrated with Spark. It was suggested in the ExternalClusterManager PR that one of the existing cluster managers should start using the new interface to ensure that the API is correct. Ideally, all the existing cluster managers should eventually use the ECM interface, but as a first step YARN will now use it. This PR refactors the YARN code from the SparkContext.createTaskScheduler function into a YarnClusterManager that implements the ECM interface.

## How was this patch tested?
Since this is refactoring, no new tests have been added. Existing tests have been run. Basic manual testing with YARN was done too.

Author: Hemant Bhanawat <hemant@snappydata.io>

Closes #12641 from hbhanawat/yarnClusterMgr.
2016-04-27 10:59:23 -07:00
Sean Owen be0d5d3bbe [SPARK-14873][CORE] Java sampleByKey methods take ju.Map but with Scala Double values; results in type Object
## What changes were proposed in this pull request?

Java `sampleByKey` methods should accept `Map` with `java.lang.Double` values

## How was this patch tested?

Existing (updated) Jenkins tests

Author: Sean Owen <sowen@cloudera.com>

Closes #12637 from srowen/SPARK-14873.
2016-04-23 10:47:50 -07:00
Joan bf95b8da27 [SPARK-6429] Implement hashCode and equals together
## What changes were proposed in this pull request?

Implement some `hashCode` and `equals` together in order to enable the scalastyle rule.
This is a first batch; I will continue to implement them, but I wanted to know your thoughts.
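
A minimal example of the pattern being enforced (illustrative, not one of the classes touched by this PR):

```scala
// Whenever equals is overridden, hashCode is overridden consistently with it.
class Point(val x: Int, val y: Int) {
  override def equals(other: Any): Boolean = other match {
    case p: Point => p.x == x && p.y == y
    case _        => false
  }
  // Must agree with equals: equal points always produce equal hash codes.
  override def hashCode(): Int = 31 * x + y
}
```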

Author: Joan <joan@goyeau.com>

Closes #12157 from joan38/SPARK-6429-HashCode-Equals.
2016-04-22 12:24:12 +01:00
Reynold Xin 0bf8df250e [HOTFIX] Fix Java 7 compilation break 2016-04-21 17:52:10 -07:00