This commit removes an unnecessary duplicate check in addPendingTask that meant
that scheduling a task set took time proportional to (# tasks)^2.
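As a rough illustration of why the check was quadratic (a sketch, not Spark's actual code; `add_pending_with_check`, `add_pending` and `schedule` are hypothetical names): testing membership in a list on every insertion scans the list, so adding n tasks costs on the order of n^2 comparisons. The sketch assumes duplicates can be tolerated or filtered later.

```python
def add_pending_with_check(pending, task_index):
    """Append only if absent; the `in` test scans the whole list (O(n))."""
    if task_index not in pending:
        pending.append(task_index)


def add_pending(pending, task_index):
    """Append unconditionally (O(1)). Assumption: duplicates are harmless
    or are filtered out later, when tasks are taken off the list."""
    pending.append(task_index)


def schedule(num_tasks, add):
    """Add task indices 0..num_tasks-1 to a fresh pending list."""
    pending = []
    for i in range(num_tasks):
        add(pending, i)
    return pending


# Both variants build the same list when task indices are unique,
# but the checked variant performs roughly n^2 / 2 comparisons in total.
checked = schedule(1000, add_pending_with_check)
unchecked = schedule(1000, add_pending)
```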
Author: Sital Kedia <skedia@fb.com>
Closes #11167 from sitalkedia/fix_stuck_driver and squashes the following commits:
3fe1af8 [Sital Kedia] [SPARK-13279] Remove unnecessary duplicate check in addPendingTask function
Made sure the old tables continue to use the old css and the new DataTables use the new css. Also fixed it so the Safari Web Inspector doesn't throw errors when on the new DataTables pages.
Author: Alex Bozarth <ajbozart@us.ibm.com>
Closes #11038 from ajbozarth/spark13124.
`getPersistentRDDs()` is a useful SparkContext API for getting cached RDDs, but JavaSparkContext does not have it.
Add a simple `getPersistentRDDs()` that returns a `java.util.Map<Integer, JavaRDD>` for Java users.
Author: Junyang <fly.shenjy@gmail.com>
Closes #10978 from flyjy/master.
Remove spark.closure.serializer option and use JavaSerializer always
CC andrewor14 rxin I see there's a discussion in the JIRA but just thought I'd offer this for a look at what the change would be.
Author: Sean Owen <sowen@cloudera.com>
Closes #11150 from srowen/SPARK-12414.
The right margin of the history page is a little bit off. This is a simple fix for that issue.
Author: zhuol <zhuol@yahoo-inc.com>
Closes #11029 from zhuoliu/13126.
The column width for the new DataTables now adjusts for the current page rather than being hard-coded for the entire table's data.
Author: Alex Bozarth <ajbozart@us.ibm.com>
Closes #11057 from ajbozarth/spark13163.
This is the next iteration of tnachen's previous PR: https://github.com/apache/spark/pull/4027
In that PR, we resolved with andrewor14 and pwendell to implement the Mesos scheduler's support of `spark.executor.cores` to be consistent with YARN and Standalone. This PR implements that resolution.
This PR implements two high-level features. They are co-dependent, so both are implemented here:
- Mesos support for spark.executor.cores
- Multiple executors per slave
We at Mesosphere have been working with Typesafe on a Spark/Mesos integration test suite: https://github.com/typesafehub/mesos-spark-integration-tests, which passes for this PR.
The contribution is my original work and I license the work to the project under the project's open source license.
Author: Michael Gummelt <mgummelt@mesosphere.io>
Closes #10993 from mgummelt/executor_sizing.
This PR improves the lookup of BytesToBytesMap by:
1. Generating code to calculate the hash code of grouping keys.
2. Not using MemoryLocation; the baseObject and offset for key and value are fetched directly (removing the indirection).
Author: Davies Liu <davies@databricks.com>
Closes #11010 from davies/gen_map.
Call shuffleMetrics's incRemoteBytesRead and incRemoteBlocksFetched when polling FetchResult from `results` so as to always use shuffleMetrics in one thread.
Also fix a race condition that could cause a memory leak.
Author: Shixiong Zhu <shixiong@databricks.com>
Closes #11138 from zsxwing/SPARK-13245.
Adds the benchmark results as comments.
The codegen version is slower than the interpreted version for the `simple` case for three reasons:
1. The codegen version uses a more complex hash algorithm than the interpreted version, i.e. `Murmur3_x86_32.hashInt` vs [simple multiplication and addition](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/rows.scala#L153).
2. The codegen version writes the hash value to a row first and then reads it out. I tried to create a `GenerateHasher` that generates code to return the hash value directly and got about a 60% speed-up for the `simple` case. Is it worth it?
3. The row in the `simple` case has only one int field, so the cost of runtime reflection may be effectively eliminated by branch prediction, which makes the interpreted version faster.
The `array` case is also slow for similar reasons, e.g. array elements are of the same type, so the interpreted version can probably avoid the cost of runtime reflection through branch prediction.
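To make the first reason concrete, here is a small sketch contrasting the two kinds of hash for a single int. The Murmur3 x86_32 steps follow the published algorithm; the multiply-and-add form (`37 * h + field`, starting from 37) is an assumption about the linked `rows.scala` code, and both functions return unsigned 32-bit values rather than Java's signed ints.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, r):
    """Rotate a 32-bit value left by r bits."""
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_hash_int(value, seed):
    """Murmur3 x86_32 of one 32-bit int: two multiplies and a rotate to mix
    the key, a rotate and multiply-add to mix the state, then finalization."""
    k1 = ((value & MASK32) * 0xCC9E2D51) & MASK32
    k1 = rotl32(k1, 15)
    k1 = (k1 * 0x1B873593) & MASK32

    h1 = (seed & MASK32) ^ k1
    h1 = rotl32(h1, 13)
    h1 = (h1 * 5 + 0xE6546B64) & MASK32

    # Finalization mix; 4 is the input length in bytes.
    h1 ^= 4
    h1 ^= h1 >> 16
    h1 = (h1 * 0x85EBCA6B) & MASK32
    h1 ^= h1 >> 13
    h1 = (h1 * 0xC2B2AE35) & MASK32
    h1 ^= h1 >> 16
    return h1


def simple_row_hash(int_fields):
    """Multiply-and-add hash over int fields (assumed form: 37 * h + field)."""
    h = 37
    for field in int_fields:
        h = (37 * h + (field & MASK32)) & MASK32
    return h
```

The simple hash does one multiply and one add per field, while the Murmur3 path does several multiplies, rotates and shifts per int, which is consistent with the extra cost described above.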
Author: Wenchen Fan <wenchen@databricks.com>
Closes #10917 from cloud-fan/hash-benchmark.
Since Spark requires at least JRE 1.7, it is safe to use built-in java.nio.Files.
Author: Jakob Odersky <jakob@odersky.com>
Closes #11098 from jodersky/SPARK-13176.
Additional changes to #10835, mainly related to style and visibility. This patch also adds back a few deprecated methods for backward compatibility.
Author: Andrew Or <andrew@databricks.com>
Closes #10958 from andrewor14/task-metrics-to-accums-followups.
There is a bug when we try to grow the buffer: the OOM is wrongly ignored (the assert is also skipped by the JVM). We then try to grow the array again, which triggers spilling and frees the current page, so the record we just inserted becomes invalid.
The root cause is that the JVM has less free memory than MemoryManager thinks, so it OOMs when allocating a page without triggering spilling. We should catch the OOM and acquire memory again to trigger spilling.
Also, we should not grow the array in `insertRecord` of `InMemorySorter` (it was there just for easy testing).
Author: Davies Liu <davies@databricks.com>
Closes #11095 from davies/fix_expand.
rxin srowen
I worked out a note message for the rdd.take function; please help review it.
If it's fine, I can apply it to all the other functions later.
Author: Tommy YU <tummyyu@163.com>
Closes #10874 from Wenpei/spark-5865-add-warning-for-localdatastructure.
Trivial search-and-replace to eliminate deprecation warnings in Scala 2.11.
This also works with 2.10.
Author: Jakob Odersky <jakob@odersky.com>
Closes #11085 from jodersky/SPARK-13171.
Fix for [SPARK-13002](https://issues.apache.org/jira/browse/SPARK-13002) about the initial number of executors when running with dynamic allocation on Mesos.
Instead of fixing it just for the Mesos case, I made the change in `ExecutorAllocationManager`. It already drives the number of executors running on Mesos, just not the initial value.
The `None` and `Some(0)` values are internal details of how the Mesos backend scheduler computes the resources to reserve. `executorLimitOption` has to be initialized correctly; otherwise the Mesos backend scheduler will either create too many executors at launch, or create no executors and be unable to recover from this state.
Removed the 'special case' description in the doc. It was not totally accurate, and is not needed anymore.
This doesn't fix the same problem visible with Spark standalone. There is no straightforward way to send the initial value in standalone mode.
Somebody who knows this part of the YARN support should review this change.
Author: Luc Bourlier <luc.bourlier@typesafe.com>
Closes #11047 from skyluc/issue/initial-dyn-alloc-2.
The config already describes time and accepts a general format
that is not restricted to ms. This commit renames the internal
config to use a format that's consistent in Spark.
Currently the Master would always set an application's initial executor limit to infinity. If the user specified `spark.dynamicAllocation.initialExecutors`, the config would not take effect. This is similar to #11047 but for standalone mode.
Author: Andrew Or <andrew@databricks.com>
Closes #11054 from andrewor14/standalone-da-initial.
Building with Scala 2.11 results in the warning "trait SynchronizedBuffer in package mutable is deprecated: Synchronization via traits is deprecated as it is inherently unreliable. Consider java.util.concurrent.ConcurrentLinkedQueue as an alternative." Investigation shows we are already using ConcurrentLinkedQueue in other locations, so switch our uses of SynchronizedBuffer to ConcurrentLinkedQueue.
Author: Holden Karau <holden@us.ibm.com>
Closes #11059 from holdenk/SPARK-13164-replace-deprecated-synchronized-buffer-in-core.
In the current implementation the Mesos coarse scheduler does not wait for the Mesos tasks to complete before ending the driver. This causes a race where the task has to finish cleaning up before the Mesos driver terminates it with a SIGINT (and SIGKILL after 3 seconds if the SIGINT doesn't work).
This PR causes the Mesos coarse scheduler to wait for the Mesos tasks to finish (with a timeout defined by `spark.mesos.coarse.shutdown.ms`).
This PR also fixes a regression caused by [SPARK-10987] whereby submitting a shutdown causes a race between the local shutdown procedure and the notification of the scheduler driver disconnection. If the scheduler driver disconnection wins the race, the coarse executor incorrectly exits with status 1 (instead of the proper status 0).
With this patch the Mesos coarse scheduler terminates properly, the executors clean up, and the tasks are reported as `FINISHED` in the Mesos console (as opposed to `KILLED` in < 1.6 or `FAILED` in 1.6 and later).
Author: Charles Allen <charles@allen-net.com>
Closes #10319 from drcrallen/SPARK-12330.
JIRA: https://issues.apache.org/jira/browse/SPARK-13113
Since we shift the bits right, the bitwise AND operation looks unnecessary.
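A small sketch of the reasoning, assuming an address layout in which the upper 13 bits of a 64-bit word hold the page number and the lower 51 bits hold the offset (these widths and all names below are assumptions for illustration). Values are treated as unsigned 64-bit integers, so Python's `>>` behaves like a logical right shift here.

```python
OFFSET_BITS = 51                  # assumed: low 51 bits hold the in-page offset
MASK_64 = (1 << 64) - 1
# Mask that keeps only the upper 13 bits of a 64-bit word.
MASK_UPPER_13_BITS = MASK_64 & ~((1 << OFFSET_BITS) - 1)


def encode(page_number, offset):
    """Pack a page number and an in-page offset into one 64-bit word."""
    return ((page_number << OFFSET_BITS) | offset) & MASK_64


def decode_page_number_with_mask(address):
    """Original form: mask off the offset bits, then shift."""
    return (address & MASK_UPPER_13_BITS) >> OFFSET_BITS


def decode_page_number(address):
    """Simplified form: the right shift already discards the low 51 bits,
    so masking them off first changes nothing."""
    return address >> OFFSET_BITS
```

For any 64-bit address both functions return the same page number, because every bit cleared by the mask is shifted out anyway.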
Author: Liang-Chi Hsieh <viirya@gmail.com>
Closes #11002 from viirya/improve-decodepagenumber.
Make an internal non-deprecated version of incBytesRead and incRecordsRead so we don't have unnecessary deprecation warnings in our build.
Right now incBytesRead and incRecordsRead are marked as deprecated and for internal use only. We should make private[spark] versions which are not deprecated and switch to those internally so as to not clutter up the warning messages when building.
cc andrewor14 who did the initial deprecation
Author: Holden Karau <holden@us.ibm.com>
Closes #11056 from holdenk/SPARK-13152-fix-task-metrics-deprecation-warnings.
Best time is more stable than average time. This also adds a column for nanoseconds per row (which can be used to estimate the contribution of each component in a query).
Best time and average time are shown together for more information (we can see a rough indication of the variance).
Rate, time per row and relative are all calculated using best time.
The result looks like this:
```
Intel(R) Core(TM) i7-4558U CPU 2.80GHz
rang/filter/sum:                    Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
-------------------------------------------------------------------------------------------
rang/filter/sum codegen=false          14332 / 16646         36.0          27.8       1.0X
rang/filter/sum codegen=true             845 /   940        620.0           1.6      17.0X
```
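As a sketch of how the derived columns relate to best time (the function name and the row count are hypothetical; only the two best times come from the table above):

```python
def benchmark_row(num_rows, best_ms, baseline_best_ms):
    """Derive rate, per-row time and relative speed from the best time only."""
    best_s = best_ms / 1000.0
    rate_m_per_s = num_rows / best_s / 1e6   # million rows per second
    per_row_ns = best_ms * 1e6 / num_rows    # nanoseconds per row
    relative = baseline_best_ms / best_ms    # speed-up versus the first case
    return rate_m_per_s, per_row_ns, relative


# NUM_ROWS is a made-up row count used only for illustration.
NUM_ROWS = 500_000_000
_, _, relative = benchmark_row(NUM_ROWS, best_ms=845, baseline_best_ms=14332)
print(f"{relative:.1f}X")
```

The relative column depends only on the ratio of best times, so it does not change with the assumed row count.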
Author: Davies Liu <davies@databricks.com>
Closes #11018 from davies/gen_bench.
`rpcEnv.awaitTermination()` was not added in #10854 because some Streaming Python tests hung forever.
This patch fixes the hang and adds `rpcEnv.awaitTermination()` back to SparkEnv.
Previously, the Streaming Kafka Python tests shut down the ZooKeeper server before stopping StreamingContext. Then, when stopping StreamingContext, KafkaReceiver could hang due to https://issues.apache.org/jira/browse/KAFKA-601; hence, some thread of RpcEnv's Dispatcher could not exit and `rpcEnv.awaitTermination` hung. The patch just changes the shutdown order to fix it.
Author: Shixiong Zhu <shixiong@databricks.com>
Closes #11031 from zsxwing/awaitTermination.
https://issues.apache.org/jira/browse/SPARK-13122
A race condition can occur in MemoryStore's unrollSafely() method if two threads that
return the same value for currentTaskAttemptId() execute this method concurrently. This
change makes the operation of reading the initial amount of unroll memory used, performing
the unroll, and updating the associated memory maps atomic in order to avoid this race
condition.
The initial proposed fix wraps all of unrollSafely() in a memoryManager.synchronized { } block. A cleaner approach might be to introduce a mechanism that synchronizes based on task attempt ID. An alternative option might be to track unroll/pending unroll memory based on block ID rather than task attempt ID.
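The coarse-grained approach can be sketched as follows. This is a toy model, not Spark's MemoryStore: `UnrollMemoryTracker` and its fields are hypothetical, and a single `threading.Lock` stands in for the `synchronized` block.

```python
import threading


class UnrollMemoryTracker:
    """Toy stand-in for per-task unroll-memory bookkeeping."""

    def __init__(self):
        self._lock = threading.Lock()
        self.unroll_memory = {}  # task attempt id -> bytes reserved

    def unroll_safely(self, task_attempt_id, chunk_sizes):
        # Hold one lock across read -> unroll -> update so that two threads
        # sharing a task attempt id cannot interleave these steps.
        with self._lock:
            previous = self.unroll_memory.get(task_attempt_id, 0)
            reserved = previous
            for size in chunk_sizes:
                reserved += size
            self.unroll_memory[task_attempt_id] = reserved
            return reserved - previous


# Eight threads report the same task attempt id (7) and unroll concurrently.
tracker = UnrollMemoryTracker()
threads = [
    threading.Thread(target=tracker.unroll_safely, args=(7, [10] * 100))
    for _ in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, two threads could both read the same initial amount and one update would be lost; with it, the final total reflects all eight unrolls.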
Author: Adam Budde <budde@amazon.com>
Closes #11012 from budde/master.
The driver filesystem is likely different from where the executors will run, so resolving paths (and symlinks, etc.) will lead to invalid paths on executors.
Author: Iulian Dragos <jaguarul@gmail.com>
Closes #10923 from dragos/issue/canonical-paths.
This takes over #10729 and makes sure that `spark-shell` fails with a proper error message. There is a slight behavioral change: before this change `spark-shell` would exit, while now the REPL is still there, but `sc` and `sqlContext` are not defined and the error is visible to the user.
Author: Nilanjan Raychaudhuri <nraychaudhuri@gmail.com>
Author: Iulian Dragos <jaguarul@gmail.com>
Closes #10921 from dragos/pr/10729.
Fix zookeeper dir configuration used in cluster mode, and also add documentation around these settings.
Author: Timothy Chen <tnachen@gmail.com>
Closes #10057 from tnachen/fix_mesos_dir.
Add a local property to indicate whether to checkpoint all RDDs that are marked with the checkpoint flag, and enable it in Streaming.
Author: Shixiong Zhu <shixiong@databricks.com>
Closes #10934 from zsxwing/recursive-checkpoint.
This issue is causing tests to fail consistently in master with Hadoop 2.6 / 2.7. This is because for Hadoop 2.5+ we overwrite existing values of `InputMetrics#bytesRead` in each call to `HadoopRDD#compute`. In the case of coalesce, e.g.
```
sc.textFile(..., 4).coalesce(2).count()
```
we will call `compute` multiple times in the same task, overwriting `bytesRead` values from previous calls to `compute`.
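A minimal sketch of the difference between overwriting and accumulating across several `compute` calls in one task (the class and function names are hypothetical and the byte counts are made up):

```python
class InputMetrics:
    """Toy per-task input metrics holder."""

    def __init__(self):
        self.bytes_read = 0


def compute_overwriting(metrics, partition_bytes):
    # Buggy behavior: each call replaces the value from earlier calls.
    metrics.bytes_read = partition_bytes


def compute_accumulating(metrics, partition_bytes):
    # Start from whatever earlier calls in the same task already recorded.
    existing = metrics.bytes_read
    metrics.bytes_read = existing + partition_bytes


# One coalesced task reads two parent partitions of 100 and 50 bytes.
buggy, fixed = InputMetrics(), InputMetrics()
for size in (100, 50):
    compute_overwriting(buggy, size)
    compute_accumulating(fixed, size)
```

The overwriting variant ends up reporting only the last partition's bytes, while the accumulating variant reports the total for the task.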
For a regression test, see `InputOutputMetricsSuite.input metrics for old hadoop with coalesce`. I did not add a new regression test because it's impossible without significant refactoring; there's a lot of existing duplicate code in this corner of Spark.
This was caused by #10835.
Author: Andrew Or <andrew@databricks.com>
Closes #10973 from andrewor14/fix-input-metrics-coalesce.
Apparently Chrome removed `SVGElement.prototype.getTransformToElement`, which is used by our JS library dagre-d3 when creating edges. The real diff can be found here: 7d6c0002e4, which is taken from the fix in the main repo: 1ef067f1c6
Upstream issue: https://github.com/cpettitt/dagre-d3/issues/202
Author: Andrew Or <andrew@databricks.com>
Closes #10986 from andrewor14/fix-dag-viz.
This is an existing issue uncovered recently by #10835. The exception occurred because the `SQLHistoryListener` gets all sorts of accumulators, not just the ones that represent SQL metrics. For example, the listener gets `internal.metrics.shuffleRead.remoteBlocksFetched`, which is an Int, and then proceeds to cast the Int to a Long, which fails.
The fix is to mark accumulators representing SQL metrics using some internal metadata. Then we can identify which ones are SQL metrics and only process those in the `SQLHistoryListener`.
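The idea can be sketched like this; the marker value, the `AccumulatorInfo` shape and the "number of rows" metric are all hypothetical stand-ins, not Spark's actual types:

```python
from dataclasses import dataclass
from typing import Optional

SQL_METRIC_TAG = "sql"  # hypothetical marker value


@dataclass
class AccumulatorInfo:
    name: str
    value: object
    metadata: Optional[str] = None  # set only for SQL metrics


def sql_metric_updates(accumulators):
    """Keep only accumulators tagged as SQL metrics and ignore the rest."""
    return {
        acc.name: int(acc.value)
        for acc in accumulators
        if acc.metadata == SQL_METRIC_TAG
    }


updates = sql_metric_updates([
    AccumulatorInfo("internal.metrics.shuffleRead.remoteBlocksFetched", 3),
    AccumulatorInfo("number of rows", 1200, metadata=SQL_METRIC_TAG),
])
```

Because untagged accumulators are skipped before any conversion happens, an internal metric of a different type never reaches the code that expects SQL metric values.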
Author: Andrew Or <andrew@databricks.com>
Closes #10971 from andrewor14/fix-sql-history.
[SPARK-10873] Support column sort and search for History Server using jQuery DataTable and REST API. Before this commit, the history server generated hard-coded HTML and could not support search; also, sorting was disabled if any application had more than one attempt. Supporting search and sort (over all applications rather than the 20 entries in the current page) in any case will greatly improve user experience.
1. Create historypage-template.html for displaying application information in DataTables.
2. historypage.js uses jQuery to fetch the data from the /api/v1/applications REST API and uses DataTables to display each application's information. For applications that have more than one attempt, RowsGroup is used to merge such entries while still supporting sort and search.
3. "duration" and "lastUpdated" are added to the application's "attempts" in the REST API.
4. External JavaScript and CSS files for DataTables, RowsGroup and jQuery plugins are added, with licenses clarified.
Screenshots of how it looks now:
History page view:
![historypage](https://cloud.githubusercontent.com/assets/11683054/12184383/89bad774-b55a-11e5-84e4-b0276172976f.png)
Search:
![search](https://cloud.githubusercontent.com/assets/11683054/12184385/8d3b94b0-b55a-11e5-869a-cc0ef0a4242a.png)
Sort by started time:
![sort-by-started-time](https://cloud.githubusercontent.com/assets/11683054/12184387/8f757c3c-b55a-11e5-98c8-577936366566.png)
Author: zhuol <zhuol@yahoo-inc.com>
Closes #10648 from zhuoliu/10873.
Fix the Scala 2.11 build by explicitly marking annotated parameters as vals (SI-8813).
Caused by #10835.
Author: Andrew Or <andrew@databricks.com>
Closes #10955 from andrewor14/fix-scala211.
Spark's `Partition` and `RDD.partitions` APIs have a contract which requires custom implementations of `RDD.partitions` to ensure that for all `x`, `rdd.partitions(x).index == x`; in other words, the `index` reported by a partition needs to match its position in the partitions array.
If a custom RDD implementation violates this contract, then Spark has the potential to become stuck in an infinite recomputation loop when recomputing a subset of an RDD's partitions, since the tasks that are actually run will not correspond to the missing output partitions that triggered the recomputation. Here's a link to a notebook which demonstrates this problem: 5e8a5aa8d2/Violating%2520RDD.partitions%2520contract.html
In order to guard against this infinite loop behavior, this patch modifies Spark so that it fails fast and refuses to compute RDDs whose `partitions` violate the API contract.
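A fail-fast check of this contract might look like the following sketch (`Partition` and `checked_partitions` are simplified, hypothetical stand-ins, and the error type is an arbitrary choice):

```python
from dataclasses import dataclass


@dataclass
class Partition:
    index: int


def checked_partitions(partitions):
    """Return the partitions unchanged, or raise if any index != position."""
    for position, partition in enumerate(partitions):
        if partition.index != position:
            raise ValueError(
                f"partitions({position}).index == {partition.index}, "
                f"but it should equal {position}"
            )
    return partitions
```

For example, `checked_partitions([Partition(0), Partition(1)])` passes, while `checked_partitions([Partition(1), Partition(0)])` raises immediately instead of letting a later recomputation loop forever.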
Author: Josh Rosen <joshrosen@databricks.com>
Closes #10932 from JoshRosen/SPARK-13021.
The high level idea is that instead of having the executors send both accumulator updates and TaskMetrics, we should have them send only accumulator updates. This eliminates the need to maintain both code paths since one can be implemented in terms of the other. This effort is split into two parts:
**SPARK-12895: Implement TaskMetrics using accumulators.** TaskMetrics is basically just a bunch of accumulable fields. This patch makes TaskMetrics a syntactic wrapper around a collection of accumulators so we don't need to send TaskMetrics from the executors to the driver.
**SPARK-12896: Send only accumulator updates to the driver.** Now that TaskMetrics are expressed in terms of accumulators, we can capture all TaskMetrics values if we just send accumulator updates from the executors to the driver. This completes the parent issue SPARK-10620.
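The overall shape can be sketched as follows. This is a simplified model under assumptions: the classes, the two metric names and the `from_updates` helper are illustrative only and do not mirror Spark's real TaskMetrics API.

```python
class Accumulator:
    """Toy named accumulator holding a running numeric value."""

    def __init__(self, name, value=0):
        self.name = name
        self.value = value

    def add(self, delta):
        self.value += delta


class TaskMetrics:
    """Thin wrapper: every metric is stored in a named accumulator."""

    def __init__(self):
        self._accums = {
            name: Accumulator(name)
            for name in ("executorRunTime", "resultSize")
        }

    def inc_executor_run_time(self, ms):
        self._accums["executorRunTime"].add(ms)

    @property
    def executor_run_time(self):
        return self._accums["executorRunTime"].value

    def accumulator_updates(self):
        # This is all the executor would need to send to the driver.
        return {a.name: a.value for a in self._accums.values()}

    @classmethod
    def from_updates(cls, updates):
        # Driver side: rebuild the metrics view from accumulator updates.
        metrics = cls()
        for name, value in updates.items():
            metrics._accums[name].value = value
        return metrics


executor_side = TaskMetrics()
executor_side.inc_executor_run_time(42)
driver_side = TaskMetrics.from_updates(executor_side.accumulator_updates())
```

Since the driver can reconstruct every metric from the updates alone, there is no need for a second code path that ships the metrics object itself.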
While an effort has been made to preserve as much of the public API as possible, there were a few known breaking DeveloperApi changes that would be very awkward to maintain. I will gather the full list shortly and post it here.
Note: This was once part of #10717. This patch is split out into its own patch from there to make it easier for others to review. Other smaller pieces have already been merged into master.
Author: Andrew Or <andrew@databricks.com>
Closes #10835 from andrewor14/task-metrics-use-accums.
If there's an RPC issue while sparkContext is alive but stopped (which would happen only when executing SparkContext.stop), log a warning instead. This is a common occurrence.
vanzin
Author: Nishkam Ravi <nishkamravi@gmail.com>
Author: nishkamravi2 <nishkamravi@gmail.com>
Closes #10881 from nishkamravi2/master_netty.
Right now RpcEndpointRef.ask may throw an exception in some corner cases, such as calling ask after stopping RpcEnv. It's better to avoid throwing exceptions from RpcEndpointRef.ask; instead, we can send the exception to the future returned by `ask`.
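The pattern, sketched with Python's `concurrent.futures.Future` rather than Scala futures (`EndpointRef` and `RpcEnvStoppedError` are hypothetical names for this illustration):

```python
from concurrent.futures import Future


class RpcEnvStoppedError(Exception):
    """Hypothetical error type for this sketch."""


class EndpointRef:
    def __init__(self):
        self.stopped = False

    def ask(self, message):
        future = Future()
        if self.stopped:
            # Report the failure through the future instead of raising here.
            future.set_exception(RpcEnvStoppedError("RpcEnv already stopped"))
        else:
            future.set_result(f"reply to {message}")
        return future


ref = EndpointRef()
ref.stopped = True
reply = ref.ask("ping")      # does not raise
error = reply.exception()    # the failure is observed on the future
```

Callers then handle every failure in one place, on the future, rather than needing both a try/except around `ask` and an error handler on its result.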
Author: Shixiong Zhu <shixiong@databricks.com>
Closes #10568 from zsxwing/send-ask-fail.
Call System.exit explicitly to make sure non-daemon user threads terminate. Without this, user applications might live forever if the cluster manager does not appropriately kill them. E.g., YARN had this bug: HADOOP-12441.
Author: zhuol <zhuol@yahoo-inc.com>
Closes #9946 from zhuoliu/10911.
Fix Java function API methods for flatMap and mapPartitions to require producing only an Iterator, not Iterable. Also fix DStream.flatMap to require a function producing TraversableOnce only, not Traversable.
CC rxin pwendell for API change; tdas since it also touches streaming.
Author: Sean Owen <sowen@cloudera.com>
Closes #10413 from srowen/SPARK-3369.
JIRA: https://issues.apache.org/jira/browse/SPARK-12961
To prevent a memory leak in snappy-java, just call the method once and cache the result. After the library releases a new version, we can remove this object.
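The call-once-and-cache idea, sketched generically (the function names are hypothetical, the returned string is a placeholder, and the leak itself is only simulated by counting calls):

```python
import functools

calls = {"count": 0}


def leaky_native_call():
    """Stand-in for a library call assumed to leak a little on every call."""
    calls["count"] += 1
    return "x.y.z"  # placeholder result


@functools.lru_cache(maxsize=None)
def cached_result():
    # The underlying call runs once; later calls return the cached value.
    return leaky_native_call()


for _ in range(1000):
    cached_result()
```

However many times `cached_result()` is invoked, the leaky call underneath runs only once.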
JoshRosen
Author: Liang-Chi Hsieh <viirya@gmail.com>
Closes #10875 from viirya/prevent-snappy-memory-leak.