ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Andrew Or	7b6c856367	[SPARK-10564] ThreadingSuite: assertion failures in threads don't fail the test (round 2) This is a follow-up patch to #8723. I missed one case there. Author: Andrew Or <andrew@databricks.com> Closes #8727 from andrewor14/fix-threading-suite.	2015-09-14 15:09:43 -07:00
Forest Fang	fd1e8cddf2	[SPARK-10543] [CORE] Peak Execution Memory Quantile should be Per-task Basis Read `PEAK_EXECUTION_MEMORY` using `update` to get per task partial value instead of cumulative value. I tested with this workload: ```scala val size = 1000 val repetitions = 10 val data = sc.parallelize(1 to size, 5).map(x => (util.Random.nextInt(size / repetitions),util.Random.nextDouble)).toDF("key", "value") val res = data.toDF.groupBy("key").agg(sum("value")).count ``` Before: ![image](https://cloud.githubusercontent.com/assets/4317392/9828197/07dd6874-58b8-11e5-9bd9-6ba927c38b26.png) After: ![image](https://cloud.githubusercontent.com/assets/4317392/9828151/a5ddff30-58b7-11e5-8d31-eda5dc4eae79.png) Tasks view: ![image](https://cloud.githubusercontent.com/assets/4317392/9828199/17dc2b84-58b8-11e5-92a8-be89ce4d29d1.png) cc andrewor14 I appreciate if you can give feedback on this since I think you introduced display of this metric. Author: Forest Fang <forest.fang@outlook.com> Closes #8726 from saurfang/stagepage.	2015-09-14 15:07:13 -07:00
Sean Owen	4e2242bb41	[SPARK-10576] [BUILD] Move .java files out of src/main/scala Move .java files in `src/main/scala` to `src/main/java` root, except for `package-info.java` (to stay next to package.scala) Author: Sean Owen <sowen@cloudera.com> Closes #8736 from srowen/SPARK-10576.	2015-09-14 15:03:51 -07:00
Wenchen Fan	32407bfd2b	[SPARK-9899] [SQL] log warning for direct output committer with speculation enabled This is a follow-up of https://github.com/apache/spark/pull/8317. When speculation is enabled, there may be multiply tasks writing to the same path. Generally it's OK as we will write to a temporary directory first and only one task can commit the temporary directory to target path. However, when we use direct output committer, tasks will write data to target path directly without temporary directory. This causes problems like corrupted data. Please see [PR comment](https://github.com/apache/spark/pull/8191#issuecomment-131598385) for more details. Unfortunately, we don't have a simple flag to tell if a output committer will write to temporary directory or not, so for safety, we have to disable any customized output committer when `speculation` is true. Author: Wenchen Fan <cloud0fan@outlook.com> Closes #8687 from cloud-fan/direct-committer.	2015-09-14 11:51:39 -07:00
Josh Rosen	b3a7480ab0	[SPARK-10330] Add Scalastyle rule to require use of SparkHadoopUtil JobContext methods This is a followup to #8499 which adds a Scalastyle rule to mandate the use of SparkHadoopUtil's JobContext accessor methods and fixes the existing violations. Author: Josh Rosen <joshrosen@databricks.com> Closes #8521 from JoshRosen/SPARK-10330-part2.	2015-09-12 16:23:55 -07:00
Sean Owen	22730ad54d	[SPARK-10547] [TEST] Streamline / improve style of Java API tests Fix a few Java API test style issues: unused generic types, exceptions, wrong assert argument order Author: Sean Owen <sowen@cloudera.com> Closes #8706 from srowen/SPARK-10547.	2015-09-12 10:40:10 +01:00
Nithin Asokan	8285e3b0d3	[SPARK-10554] [CORE] Fix NPE with ShutdownHook https://issues.apache.org/jira/browse/SPARK-10554 Fixes NPE when ShutdownHook tries to cleanup temporary folders Author: Nithin Asokan <Nithin.Asokan@Cerner.com> Closes #8720 from nasokan/SPARK-10554.	2015-09-12 09:50:49 +01:00
Daniel Imfeld	6d8367807c	[SPARK-10566] [CORE] SnappyCompressionCodec init exception handling masks important error information When throwing an IllegalArgumentException in SnappyCompressionCodec.init, chain the existing exception. This allows potentially important debugging info to be passed to the user. Manual testing shows the exception chained properly, and the test suite still looks fine as well. This contribution is my original work and I license the work to the project under the project's open source license. Author: Daniel Imfeld <daniel@danielimfeld.com> Closes #8725 from dimfeld/dimfeld-patch-1.	2015-09-12 09:19:59 +01:00
Andrew Or	d74c6a143c	[SPARK-10564] ThreadingSuite: assertion failures in threads don't fail the test This commit ensures if an assertion fails within a thread, it will ultimately fail the test. Otherwise we end up potentially masking real bugs by not propagating assertion failures properly. Author: Andrew Or <andrew@databricks.com> Closes #8723 from andrewor14/fix-threading-suite.	2015-09-11 15:02:59 -07:00
tedyu	b231ab8938	[SPARK-10546] Check partitionId's range in ExternalSorter#spill() See this thread for background: http://search-hadoop.com/m/q3RTt0rWvIkHAE81 We should check the range of partition Id and provide meaningful message through exception. Alternatively, we can use abs() and modulo to force the partition Id into legitimate range. However, expectation is that user should correct the logic error in his / her code. Author: tedyu <yuzhihong@gmail.com> Closes #8703 from tedyu/master.	2015-09-11 21:45:45 +01:00
Matt Massie	0eabea8a05	[SPARK-9043] Serialize key, value and combiner classes in ShuffleDependency ShuffleManager implementations are currently not given type information for the key, value and combiner classes. Serialization of shuffle objects relies on objects being JavaSerializable, with methods defined for reading/writing the object or, alternatively, serialization via Kryo which uses reflection. Serialization systems like Avro, Thrift and Protobuf generate classes with zero argument constructors and explicit schema information (e.g. IndexedRecords in Avro have get, put and getSchema methods). By serializing the key, value and combiner class names in ShuffleDependency, shuffle implementations will have access to schema information when registerShuffle() is called. Author: Matt Massie <massie@cs.berkeley.edu> Closes #7403 from massie/shuffle-classtags.	2015-09-10 17:24:33 -07:00
Sun Rui	45e3be5c13	[SPARK-10049] [SPARKR] Support collecting data of ArraryType in DataFrame. this PR : 1. Enhance reflection in RBackend. Automatically matching a Java array to Scala Seq when finding methods. Util functions like seq(), listToSeq() in R side can be removed, as they will conflict with the Serde logic that transferrs a Scala seq to R side. 2. Enhance the SerDe to support transferring a Scala seq to R side. Data of ArrayType in DataFrame after collection is observed to be of Scala Seq type. 3. Support ArrayType in createDataFrame(). Author: Sun Rui <rui.sun@intel.com> Closes #8458 from sun-rui/SPARK-10049.	2015-09-10 12:21:13 -07:00
Akash Mishra	a5ef2d0600	[SPARK-10514] [MESOS] waiting for min no of total cores acquired by Spark by implementing the sufficientResourcesRegistered method spark.scheduler.minRegisteredResourcesRatio configuration parameter works for YARN mode but not for Mesos Coarse grained mode. If the parameter specified default value of 0 will be set for spark.scheduler.minRegisteredResourcesRatio in base class and this method will always return true. There are no existing test for YARN mode too. Hence not added test for the same. Author: Akash Mishra <akash.mishra20@gmail.com> Closes #8672 from SleepyThread/master.	2015-09-10 12:04:02 -07:00
Iulian Dragos	f0562e8cdb	[SPARK-6350] [MESOS] Fine-grained mode scheduler respects mesosExecutor.cores This is a regression introduced in #4960, this commit fixes it and adds a test. tnachen andrewor14 please review, this should be an easy one. Author: Iulian Dragos <jaguarul@gmail.com> Closes #8653 from dragos/issue/mesos/fine-grained-maxExecutorCores.	2015-09-10 12:00:21 -07:00
mcheah	af3bc59d1f	[SPARK-8167] Make tasks that fail from YARN preemption not fail job The architecture is that, in YARN mode, if the driver detects that an executor has disconnected, it asks the ApplicationMaster why the executor died. If the ApplicationMaster is aware that the executor died because of preemption, all tasks associated with that executor are not marked as failed. The executor is still removed from the driver's list of available executors, however. There's a few open questions: 1. Should standalone mode have a similar "get executor loss reason" as well? I localized this change as much as possible to affect only YARN, but there could be a valid case to differentiate executor losses in standalone mode as well. 2. I make a pretty strong assumption in YarnAllocator that getExecutorLossReason(executorId) will only be called once per executor id; I do this so that I can remove the metadata from the in-memory map to avoid object accumulation. It's not clear if I'm being overly zealous to save space, however. cc vanzin specifically for review because it collided with some earlier YARN scheduling work. cc JoshRosen because it's similar to output commit coordination we did in the past cc andrewor14 for our discussion on how to get executor exit codes and loss reasons Author: mcheah <mcheah@palantir.com> Closes #8007 from mccheah/feature/preemption-handling.	2015-09-10 11:58:54 -07:00
Cheng Hao	e048111376	[SPARK-10466] [SQL] UnsafeRow SerDe exception with data spill Data Spill with UnsafeRow causes assert failure. ``` java.lang.AssertionError: assertion failed at scala.Predef$.assert(Predef.scala:165) at org.apache.spark.sql.execution.UnsafeRowSerializerInstance$$anon$2.writeKey(UnsafeRowSerializer.scala:75) at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:180) at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$2$$anonfun$apply$1.apply(ExternalSorter.scala:688) at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$2$$anonfun$apply$1.apply(ExternalSorter.scala:687) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$2.apply(ExternalSorter.scala:687) at org.apache.spark.util.collection.ExternalSorter$$anonfun$writePartitionedFile$2.apply(ExternalSorter.scala:683) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at org.apache.spark.util.collection.ExternalSorter.writePartitionedFile(ExternalSorter.scala:683) at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:80) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73) at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41) at org.apache.spark.scheduler.Task.run(Task.scala:88) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214) ``` To reproduce that with code (thanks andrewor14): ```scala bin/spark-shell --master local --conf spark.shuffle.memoryFraction=0.005 --conf spark.shuffle.sort.bypassMergeThreshold=0 sc.parallelize(1 to 2 * 1000 * 1000, 10) .map { i => (i, i) }.toDF("a", "b").groupBy("b").avg().count() ``` Author: Cheng Hao <hao.cheng@intel.com> Closes #8635 from chenghao-intel/unsafe_spill.	2015-09-10 11:48:43 -07:00
Liang-Chi Hsieh	45de518742	[SPARK-9730] [SQL] Add Full Outer Join support for SortMergeJoin This PR is based on #8383 , thanks to viirya JIRA: https://issues.apache.org/jira/browse/SPARK-9730 This patch adds the Full Outer Join support for SortMergeJoin. A new class SortMergeFullJoinScanner is added to scan rows from left and right iterators. FullOuterIterator is simply a wrapper of type RowIterator to consume joined rows from SortMergeFullJoinScanner. Closes #8383 Author: Liang-Chi Hsieh <viirya@appier.com> Author: Davies Liu <davies@databricks.com> Closes #8579 from davies/smj_fullouter.	2015-09-09 16:02:27 -07:00
Luc Bourlier	c1bc4f439f	[SPARK-10227] fatal warnings with sbt on Scala 2.11 The bulk of the changes are on `transient` annotation on class parameter. Often the compiler doesn't generate a field for this parameters, so the the transient annotation would be unnecessary. But if the class parameter are used in methods, then fields are created. So it is safer to keep the annotations. The remainder are some potential bugs, and deprecated syntax. Author: Luc Bourlier <luc.bourlier@typesafe.com> Closes #8433 from skyluc/issue/sbt-2.11.	2015-09-09 09:57:58 +01:00
Reynold Xin	5ffe752b59	[SPARK-9767] Remove ConnectionManager. We introduced the Netty network module for shuffle in Spark 1.2, and has turned it on by default for 3 releases. The old ConnectionManager is difficult to maintain. If we merge the patch now, by the time it is released, it would be 1 yr for which ConnectionManager is off by default. It's time to remove it. Author: Reynold Xin <rxin@databricks.com> Closes #8161 from rxin/SPARK-9767.	2015-09-07 10:42:30 -10:00
robbins	2e1c17553d	[SPARK-10454] [SPARK CORE] wait for empty event queue Author: robbins <robbins@uk.ibm.com> Closes #8605 from robbinspg/DAGSchedulerSuite-fix.	2015-09-04 15:23:29 -07:00
Timothy Chen	b087d23e28	[SPARK-9669] [MESOS] Support PySpark on Mesos cluster mode. Support running pyspark with cluster mode on Mesos! This doesn't upload any scripts, so if running in a remote Mesos requires the user to specify the script from a available URI. Author: Timothy Chen <tnachen@gmail.com> Closes #8349 from tnachen/mesos_python.	2015-09-04 15:21:31 -07:00
Andrew Or	cf42138643	[SPARK-10003] Improve readability of DAGScheduler Note: this is not intended to be in Spark 1.5! This patch rewrites some code in the `DAGScheduler` to make it more readable. In particular - there were blocks of code that are unnecessary and removed for simplicity - there were abstractions that are unnecessary and made the code hard to navigate - other minor changes Author: Andrew Or <andrew@databricks.com> Closes #8217 from andrewor14/dag-scheduler-readability and squashes the following commits: 57abca3 [Andrew Or] Move comment back into if case 574fb1e [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-scheduler-readability 64a9ed2 [Andrew Or] Remove unnecessary code + minor code rewrites	2015-09-03 17:55:10 -07:00
Andrew Or	08b0750974	[SPARK-10435] Spark submit should fail fast for Mesos cluster mode with R It's not supported yet so we should error with a clear message. Author: Andrew Or <andrew@databricks.com> Closes #8590 from andrewor14/mesos-cluster-r-guard.	2015-09-03 13:57:20 -07:00
jeanlyn	db4c130f9e	[SPARK-9591] [CORE] Job may fail for exception during getting remote block [SPARK-9591](https://issues.apache.org/jira/browse/SPARK-9591) When we getting the broadcast variable, we can fetch the block form several location,but now when connecting the lost blockmanager(idle for enough time removed by driver when using dynamic resource allocate and so on) will cause task fail,and the worse case will cause the job fail. Author: jeanlyn <jeanlyn92@gmail.com> Closes #7927 from jeanlyn/catch_exception.	2015-09-03 13:56:11 -07:00
Vinod K C	11ef32c5a1	[SPARK-10430] [CORE] Added hashCode methods in AccumulableInfo and RDDOperationScope Author: Vinod K C <vinod.kc@huawei.com> Closes #8581 from vinodkc/fix_RDDOperationScope_Hashcode.	2015-09-03 13:55:02 -07:00
Pat Shields	e62f4a46f4	[SPARK-9672] [MESOS] Don’t include SPARK_ENV_LOADED when passing env vars This contribution is my original work and I license the work to the project under the project's open source license. Author: Pat Shields <yeoldefortran@gmail.com> Closes #7979 from pashields/env-loading-on-driver.	2015-09-03 13:53:18 -07:00
robbins	d911c682f0	[SPARK-10431] [CORE] Fix intermittent test failure. Wait for event queue to be clear Author: robbins <robbins@uk.ibm.com> Closes #8582 from robbinspg/InputOutputMetricsSuite.	2015-09-03 13:47:25 -07:00
CHOIJAEHONG	af0e3125cb	[SPARK-8951] [SPARKR] support Unicode characters in collect() Spark gives an error message and does not show the output when a field of the result DataFrame contains characters in CJK. I changed SerDe.scala in order that Spark support Unicode characters when writes a string to R. Author: CHOIJAEHONG <redrock07@naver.com> Closes #7494 from CHOIJAEHONG1/SPARK-8951.	2015-09-03 13:38:26 -07:00
Davies Liu	62b4690d6b	[SPARK-10379] preserve first page in UnsafeShuffleExternalSorter Author: Davies Liu <davies@databricks.com> Closes #8543 from davies/preserve_page.	2015-09-02 22:15:54 -07:00
Imran Rashid	3ddb9b3233	[SPARK-10247] [CORE] improve readability of a test case in DAGSchedulerSuite This is pretty minor, just trying to improve the readability of `DAGSchedulerSuite`, I figure every bit helps. Before whenever I read this test, I never knew what "should work" and "should be ignored" really meant -- this adds some asserts & updates comments to make it more clear. Also some reformatting per a suggestion from markhamstra on https://github.com/apache/spark/pull/7699 Author: Imran Rashid <irashid@cloudera.com> Closes #8434 from squito/SPARK-10247.	2015-09-02 22:14:50 -07:00
Evan Racah	f6c447f875	Removed code duplication in ShuffleBlockFetcherIterator Added fetchUpToMaxBytes() to prevent having to update both code blocks when a change is made. Author: Evan Racah <ejracah@gmail.com> Closes #8514 from eracah/master.	2015-09-02 22:13:37 -07:00
navis.ryu	0985d2c30e	[SPARK-8707] RDD#toDebugString fails if any cached RDD has invalid partitions Added numPartitions(evaluate: Boolean) to RDD. With "evaluate=true" the method is same with "partitions.length". With "evaluate=false", it checks checked-out or already evaluated partitions in the RDD to get number of partition. If it's not those cases, returns -1. RDDInfo.partitionNum calls numPartition only when it's accessed. Author: navis.ryu <navis@apache.org> Closes #7127 from navis/SPARK-8707.	2015-09-02 22:11:51 -07:00
Ilya Ganelin	4bd85d06e0	[SPARK-5945] Spark should not retry a stage infinitely on a FetchFailedException The ```Stage``` class now tracks whether there were a sufficient number of consecutive failures of that stage to trigger an abort. To avoid an infinite loop of stage retries, we abort the job completely after 4 consecutive stage failures for one stage. We still allow more than 4 consecutive stage failures if there is an intervening successful attempt for the stage, so that in very long-lived applications, where a stage may get reused many times, we don't abort the job after failures that have been recovered from successfully. I've added test cases to exercise the most obvious scenarios. Author: Ilya Ganelin <ilya.ganelin@capitalone.com> Closes #5636 from ilganeli/SPARK-5945.	2015-09-02 22:08:24 -07:00
Marcelo Vanzin	2da3a9e98e	[SPARK-10004] [SHUFFLE] Perform auth checks when clients read shuffle data. To correctly isolate applications, when requests to read shuffle data arrive at the shuffle service, proper authorization checks need to be performed. This change makes sure that only the application that created the shuffle data can read from it. Such checks are only enabled when "spark.authenticate" is enabled, otherwise there's no secure way to make sure that the client is really who it says it is. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #8218 from vanzin/SPARK-10004.	2015-09-02 12:53:24 -07:00
Chuan Shao	c3b881a7d7	[SPARK-7336] [HISTORYSERVER] Fix bug that applications status incorrect on JobHistory UI. Author: ArcherShao <shaochuan@huawei.com> Closes #5886 from ArcherShao/SPARK-7336.	2015-09-02 11:02:57 -07:00
zhuol	ec01280533	[SPARK-4223] [CORE] Support * in acls. SPARK-4223. Currently we support setting view and modify acls but you have to specify a list of users. It would be nice to support * meaning all users have access. Manual tests to verify that: "*" works for any user in: a. Spark ui: view and kill stage. Done. b. Spark history server. Done. c. Yarn application killing. Done. Author: zhuol <zhuol@yahoo-inc.com> Closes #8398 from zhuoliu/4223.	2015-09-01 11:14:59 -10:00
Davies Liu	540bdee931	[SPARK-10341] [SQL] fix memory starving in unsafe SMJ In SMJ, the first ExternalSorter could consume all the memory before spilling, then the second can not even acquire the first page. Before we have a better memory allocator, SMJ should call prepare() before call any compute() of it's children. cc rxin JoshRosen Author: Davies Liu <davies@databricks.com> Closes #8511 from davies/smj_memory.	2015-08-31 15:55:22 -07:00
EugenCepoi	72f6dbf7b0	[SPARK-8730] Fixes - Deser objects containing a primitive class attribute Author: EugenCepoi <cepoi.eugen@gmail.com> Closes #7122 from EugenCepoi/master.	2015-08-31 13:24:35 -05:00
ihainan	1bfd934782	[SPARK-10184] [CORE] Optimization for bounds determination in RangePartitioner JIRA Issue: https://issues.apache.org/jira/browse/SPARK-10184 Change `cumWeight > target` to `cumWeight >= target` in `RangePartitioner.determineBounds` method to make the output partitions more balanced. Author: ihainan <ihainan72@gmail.com> Closes #8397 from ihainan/opt_for_rangepartitioner.	2015-08-30 08:26:14 +01:00
Marcelo Vanzin	c53c902fa9	[SPARK-9284] [TESTS] Allow all tests to run without an assembly. This change aims at speeding up the dev cycle a little bit, by making sure that all tests behave the same w.r.t. where the code to be tested is loaded from. Namely, that means that tests don't rely on the assembly anymore, rather loading all needed classes from the build directories. The main change is to make sure all build directories (classes and test-classes) are added to the classpath of child processes when running tests. YarnClusterSuite required some custom code since the executors are run differently (i.e. not through the launcher library, like standalone and Mesos do). I also found a couple of tests that could leak a SparkContext on failure, and added code to handle those. With this patch, it's possible to run the following command from a clean source directory and have all tests pass: mvn -Pyarn -Phadoop-2.4 -Phive-thriftserver install Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #7629 from vanzin/SPARK-9284.	2015-08-28 12:33:40 -07:00
Sean Owen	cc39803062	[SPARK-10295] [CORE] Dynamic allocation in Mesos does not release when RDDs are cached Remove obsolete warning about dynamic allocation not working with cached RDDs See discussion in https://issues.apache.org/jira/browse/SPARK-10295 Author: Sean Owen <sowen@cloudera.com> Closes #8489 from srowen/SPARK-10295.	2015-08-28 09:32:23 +01:00
Ram Sriharsha	de0278286c	[SPARK-10251] [CORE] some common types are not registered for Kryo Serializat… …ion by default Author: Ram Sriharsha <rsriharsha@hw11853.local> Closes #8465 from harsha2010/SPARK-10251.	2015-08-26 23:12:55 -07:00
Sun Rui	71a138cd0e	[SPARK-10048] [SPARKR] Support arbitrary nested Java array in serde. This PR: 1. supports transferring arbitrary nested array from JVM to R side in SerDe; 2. based on 1, collect() implemenation is improved. Now it can support collecting data of complex types from a DataFrame. Author: Sun Rui <rui.sun@intel.com> Closes #8276 from sun-rui/SPARK-10048.	2015-08-25 13:14:10 -07:00
Sean Owen	69c9c17716	[SPARK-9613] [CORE] Ban use of JavaConversions and migrate all existing uses to JavaConverters Replace `JavaConversions` implicits with `JavaConverters` Most occurrences I've seen so far are necessary conversions; a few have been avoidable. None are in critical code as far as I see, yet. Author: Sean Owen <sowen@cloudera.com> Closes #8033 from srowen/SPARK-9613.	2015-08-25 12:33:13 +01:00
ehnalis	7f1e507bf7	Fixed a typo in DAGScheduler. Author: ehnalis <zoltan.zvara@gmail.com> Closes #8308 from ehnalis/master.	2015-08-25 12:30:06 +01:00
Zhang, Liye	5c14890159	[DOC] add missing parameters in SparkContext.scala for scala doc Author: Zhang, Liye <liye.zhang@intel.com> Closes #8412 from liyezhang556520/minorDoc.	2015-08-25 11:48:55 +01:00
Andrew Or	662bb96676	[SPARK-10144] [UI] Actually show peak execution memory by default The peak execution memory metric was introduced in SPARK-8735. That was before Tungsten was enabled by default, so it assumed that `spark.sql.unsafe.enabled` must be explicitly set to true. The result is that the memory is not displayed by default. Author: Andrew Or <andrew@databricks.com> Closes #8345 from andrewor14/show-memory-default.	2015-08-24 14:10:50 -07:00
Imran Rashid	708036c1de	[SPARK-9439] [YARN] External shuffle service robust to NM restarts using leveldb https://issues.apache.org/jira/browse/SPARK-9439 In general, Yarn apps should be robust to NodeManager restarts. However, if you run spark with the external shuffle service on, after a NM restart all shuffles fail, b/c the shuffle service has lost some state with info on each executor. (Note the shuffle data is perfectly fine on disk across a NM restart, the problem is we've lost the small bit of state that lets us find those files.) The solution proposed here is that the external shuffle service can write out its state to leveldb (backed by a local file) every time an executor is added. When running with yarn, that file is in the NM's local dir. Whenever the service is started, it looks for that file, and if it exists, it reads the file and re-registers all executors there. Nothing is changed in non-yarn modes with this patch. The service is not given a place to save the state to, so it operates the same as before. This should make it easy to update other cluster managers as well, by just supplying the right file & the equivalent of yarn's `initializeApplication` -- I'm not familiar enough with those modes to know how to do that. Author: Imran Rashid <irashid@cloudera.com> Closes #7943 from squito/leveldb_external_shuffle_service_NM_restart and squashes the following commits: 0d285d3 [Imran Rashid] review feedback 70951d6 [Imran Rashid] Merge branch 'master' into leveldb_external_shuffle_service_NM_restart 5c71c8c [Imran Rashid] save executor to db before registering; style 2499c8c [Imran Rashid] explicit dependency on jackson-annotations 795d28f [Imran Rashid] review feedback 81f80e2 [Imran Rashid] Merge branch 'master' into leveldb_external_shuffle_service_NM_restart 594d520 [Imran Rashid] use json to serialize application executor info 1a7980b [Imran Rashid] version 8267d2a [Imran Rashid] style e9f99e8 [Imran Rashid] cleanup the handling of bad dbs a little 9378ba3 [Imran Rashid] fail gracefully on corrupt leveldb files acedb62 [Imran Rashid] switch to writing out one record per executor 79922b7 [Imran Rashid] rely on yarn to call stopApplication; assorted cleanup 12b6a35 [Imran Rashid] save registered executors when apps are removed; add tests c878fbe [Imran Rashid] better explanation of shuffle service port handling 694934c [Imran Rashid] only open leveldb connection once per service d596410 [Imran Rashid] store executor data in leveldb 59800b7 [Imran Rashid] Files.move in case renaming is unsupported 32fe5ae [Imran Rashid] Merge branch 'master' into external_shuffle_service_NM_restart d7450f0 [Imran Rashid] style f729e2b [Imran Rashid] debugging 4492835 [Imran Rashid] lol, dont use a PrintWriter b/c of scalastyle checks 0a39b98 [Imran Rashid] Merge branch 'master' into external_shuffle_service_NM_restart 55f49fc [Imran Rashid] make sure the service doesnt die if the registered executor file is corrupt; add tests 245db19 [Imran Rashid] style 62586a6 [Imran Rashid] just serialize the whole executors map bdbbf0d [Imran Rashid] comments, remove some unnecessary changes 857331a [Imran Rashid] better tests & comments bb9d1e6 [Imran Rashid] formatting bdc4b32 [Imran Rashid] rename 86e0cb9 [Imran Rashid] for tests, shuffle service finds an open port 23994ff [Imran Rashid] style 7504de8 [Imran Rashid] style a36729c [Imran Rashid] cleanup efb6195 [Imran Rashid] proper unit test, and no longer leak if apps stop during NM restart dd93dc0 [Imran Rashid] test for shuffle service w/ NM restarts d596969 [Imran Rashid] cleanup imports 0e9d69b [Imran Rashid] better names 9eae119 [Imran Rashid] cleanup lots of duplication 1136f44 [Imran Rashid] test needs to have an actual shuffle 0b588bd [Imran Rashid] more fixes ... ad122ef [Imran Rashid] more fixes 5e5a7c3 [Imran Rashid] fix build c69f46b [Imran Rashid] maybe working version, needs tests & cleanup ... bb3ba49 [Imran Rashid] minor cleanup 36127d3 [Imran Rashid] wip b9d2ced [Imran Rashid] incomplete setup for external shuffle service tests	2015-08-21 08:41:36 -05:00
Xiangrui Meng	cdd9a2bb10	[SPARK-10140] [DOC] add target fields to @Since so constructors parameters and public fields can be annotated. rxin MechCoder Author: Xiangrui Meng <meng@databricks.com> Closes #8344 from mengxr/SPARK-10140.2.	2015-08-20 20:01:13 -07:00
Alex Shkurenko	39e91fe2fd	[SPARK-9982] [SPARKR] SparkR DataFrame fail to return data of Decimal type Author: Alex Shkurenko <ashkurenko@enova.com> Closes #8239 from ashkurenko/master.	2015-08-20 10:16:38 -07:00
Timothy Chen	73431d8afb	[SPARK-10124] [MESOS] Fix removing queued driver in mesos cluster mode. Currently the spark applications can be queued to the Mesos cluster dispatcher, but when multiple jobs are in queue we don't handle removing jobs from the buffer correctly while iterating and causes null pointer exception. This patch copies the buffer before iterating them, so exceptions aren't thrown when the jobs are removed. Author: Timothy Chen <tnachen@gmail.com> Closes #8322 from tnachen/fix_cluster_mode.	2015-08-19 19:43:26 -07:00
Marcelo Vanzin	e0dd1309ac	[SPARK-10119] [CORE] Fix isDynamicAllocationEnabled when config is expliticly disabled. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #8316 from vanzin/SPARK-10119.	2015-08-19 14:33:32 -07:00
Joshi	f3391ff2b8	[SPARK-8889] [CORE] Fix for OOM for graph creation Fix for OOM for graph creation Author: Joshi <rekhajoshm@gmail.com> Author: Rekha Joshi <rekhajoshm@gmail.com> Closes #7602 from rekhajoshm/SPARK-8889.	2015-08-19 21:23:02 +01:00
Yu ISHIKAWA	2fcb9cb955	[SPARK-9856] [SPARKR] Add expression functions into SparkR whose params are complicated I added lots of Column functinos into SparkR. And I also added `rand(seed: Int)` and `randn(seed: Int)` in Scala. Since we need such APIs for R integer type. ### JIRA [[SPARK-9856] Add expression functions into SparkR whose params are complicated - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9856) Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com> Closes #8264 from yu-iskw/SPARK-9856-3.	2015-08-19 10:41:14 -07:00
Han JU	3d16a54500	[SPARK-8949] Print warnings when using preferred locations feature Add warnings according to SPARK-8949 in `SparkContext` - warnings in scaladoc - log warnings when preferred locations feature is used through `SparkContext`'s constructor However I didn't found any documentation reference of this feature. Please direct me if you know any reference to this feature. Author: Han JU <ju.han.felix@gmail.com> Closes #7874 from darkjh/SPARK-8949.	2015-08-19 13:04:16 +01:00
Tathagata Das	bc9a0e0323	[SPARK-9967] [SPARK-10099] [STREAMING] Renamed conf spark.streaming.backpressure.{enable-->enabled} and fixed deprecated annotations Small changes - Renamed conf spark.streaming.backpressure.{enable --> enabled} - Change Java Deprecated annotations to Scala deprecated annotation with more information. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #8299 from tdas/SPARK-9967.	2015-08-18 23:37:57 -07:00
Josh Rosen	010b03ed52	[SPARK-9952] Fix N^2 loop when DAGScheduler.getPreferredLocsInternal accesses cacheLocs In Scala, `Seq.fill` always seems to return a List. Accessing a list by index is an O(N) operation. Thus, the following code will be really slow (~10 seconds on my machine): ```scala val numItems = 100000 val s = Seq.fill(numItems)(1) for (i <- 0 until numItems) s(i) ``` It turns out that we had a loop like this in DAGScheduler code, although it's a little tricky to spot. In `getPreferredLocsInternal`, there's a call to `getCacheLocs(rdd)(partition)`. The `getCacheLocs` call returns a Seq. If this Seq is a List and the RDD contains many partitions, then indexing into this list will cost O(partitions). Thus, when we loop over our tasks to compute their individual preferred locations we implicitly perform an N^2 loop, reducing scheduling throughput. This patch fixes this by replacing `Seq` with `Array`. Author: Josh Rosen <joshrosen@databricks.com> Closes #8178 from JoshRosen/dagscheduler-perf.	2015-08-18 22:30:13 -07:00
Marcelo Vanzin	c1840a862e	[SPARK-7736] [CORE] Fix a race introduced in PythonRunner. The fix for SPARK-7736 introduced a race where a port value of "-1" could be passed down to the pyspark process, causing it to fail to connect back to the JVM. This change adds code to fix that race. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #8258 from vanzin/SPARK-7736.	2015-08-18 11:36:36 -07:00
CodingCat	c34e9ff0ea	[MINOR] fix the comments in IndexShuffleBlockResolver it might be a typo introduced at the first moment or some leftover after some renaming...... the name of the method accessing the index file is called `getBlockData` now (not `getBlockLocation` as indicated in the comments) Author: CodingCat <zhunansjtu@gmail.com> Closes #8238 from CodingCat/minor_1.	2015-08-18 10:31:11 +01:00
Marcelo Vanzin	f68d024096	[SPARK-7736] [CORE] [YARN] Make pyspark fail YARN app on failure. The YARN backend doesn't like when user code calls `System.exit`, since it cannot know the exit status and thus cannot set an appropriate final status for the application. So, for pyspark, avoid that call and instead throw an exception with the exit code. SparkSubmit handles that exception and exits with the given exit code, while YARN uses the exit code as the failure code for the Spark app. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #7751 from vanzin/SPARK-9416.	2015-08-17 10:34:22 -07:00
Rohit Agarwal	ed092a06c2	[SPARK-9924] [WEB UI] Don't schedule checkForLogs while some of them are already running. Author: Rohit Agarwal <rohita@qubole.com> Closes #8153 from mindprince/SPARK-9924.	2015-08-17 10:31:57 -07:00
Matei Zaharia	cf016075a0	[SPARK-10008] Ensure shuffle locality doesn't take precedence over narrow deps The shuffle locality patch made the DAGScheduler aware of shuffle data, but for RDDs that have both narrow and shuffle dependencies, it can cause them to place tasks based on the shuffle dependency instead of the narrow one. This case is common in iterative join-based algorithms like PageRank and ALS, where one RDD is hash-partitioned and one isn't. Author: Matei Zaharia <matei@databricks.com> Closes #8220 from mateiz/shuffle-loc-fix.	2015-08-16 00:34:58 -07:00
Herman van Hovell	a85fb6c07f	[SPARK-9980] [BUILD] Fix SBT publishLocal error due to invalid characters in doc Tiny modification to a few comments ```sbt publishLocal``` work again. Author: Herman van Hovell <hvanhovell@questtec.nl> Closes #8209 from hvanhovell/SPARK-9980.	2015-08-15 10:46:04 +01:00
Davies Liu	37586e5449	[HOTFIX] fix duplicated braces Author: Davies Liu <davies@databricks.com> Closes #8219 from davies/fix_typo.	2015-08-14 20:56:55 -07:00
Reynold Xin	e5fd60415f	[SPARK-9934] Deprecate NIO ConnectionManager. Deprecate NIO ConnectionManager in Spark 1.5.0, before removing it in Spark 1.6.0. Author: Reynold Xin <rxin@databricks.com> Closes #8162 from rxin/SPARK-9934.	2015-08-14 20:55:32 -07:00
jerryshao	9407baa2a7	[SPARK-9877] [CORE] Fix StandaloneRestServer NPE when submitting application Detailed exception log can be seen in [SPARK-9877](https://issues.apache.org/jira/browse/SPARK-9877), the problem is when creating `StandaloneRestServer`, `self` (`masterEndpoint`) is null. So this fix is creating `StandaloneRestServer` when `self` is available. Author: jerryshao <sshao@hortonworks.com> Closes #8127 from jerryshao/SPARK-9877.	2015-08-14 13:44:38 -07:00
Andrew Or	6518ef6303	[SPARK-9948] Fix flaky AccumulatorSuite - internal accumulators In these tests, we use a custom listener and we assert on fields in the stage / task completion events. However, these events are posted in a separate thread so they're not guaranteed to be posted in time. This commit fixes this flakiness through a job end registration callback. Author: Andrew Or <andrew@databricks.com> Closes #8176 from andrewor14/fix-accumulator-suite.	2015-08-14 13:42:53 -07:00
Carson Wang	33bae585d4	[SPARK-9809] Task crashes because the internal accumulators are not properly initialized When a stage failed and another stage was resubmitted with only part of partitions to compute, all the tasks failed with error message: java.util.NoSuchElementException: key not found: peakExecutionMemory. This is because the internal accumulators are not properly initialized for this stage while other codes assume the internal accumulators always exist. Author: Carson Wang <carson.wang@intel.com> Closes #8090 from carsonwang/SPARK-9809.	2015-08-14 13:38:25 -07:00
Neelesh Srinivas Salian	57c2d08800	[SPARK-9923] [CORE] ShuffleMapStage.numAvailableOutputs should be an Int instead of Long Modified type of ShuffleMapStage.numAvailableOutputs from Long to Int Author: Neelesh Srinivas Salian <nsalian@cloudera.com> Closes #8183 from nssalian/SPARK-9923.	2015-08-14 20:03:50 +01:00
Davies Liu	bd35385d53	[SPARK-9945] [SQL] pageSize should be calculated from executor.memory Currently, pageSize of TungstenSort is calculated from driver.memory, it should use executor.memory instead. Also, in the worst case, the safeFactor could be 4 (because of rounding), increase it to 16. cc rxin Author: Davies Liu <davies@databricks.com> Closes #8175 from davies/page_size.	2015-08-13 21:12:59 -07:00
Andrew Or	8815ba2f67	[SPARK-9649] Fix MasterSuite, third time's a charm This particular test did not load the default configurations so it continued to start the REST server, which causes port bind exceptions.	2015-08-13 11:31:10 -07:00
Davies Liu	a8ab2634c1	[SPARK-9832] [SQL] add a thread-safe lookup for BytesToBytseMap This patch add a thread-safe lookup for BytesToBytseMap, and use that in broadcasted HashedRelation. Author: Davies Liu <davies@databricks.com> Closes #8151 from davies/safeLookup.	2015-08-12 21:26:00 -07:00
Josh Rosen	7b13ed27c1	[SPARK-9870] Disable driver UI and Master REST server in SparkSubmitSuite I think that we should pass additional configuration flags to disable the driver UI and Master REST server in SparkSubmitSuite and HiveSparkSubmitSuite. This might cut down on port-contention-related flakiness in Jenkins. Author: Josh Rosen <joshrosen@databricks.com> Closes #8124 from JoshRosen/disable-ui-in-sparksubmitsuite.	2015-08-12 18:52:11 -07:00
Rohit Agarwal	0d1d146c22	[SPARK-9724] [WEB UI] Avoid unnecessary redirects in the Spark Web UI. Author: Rohit Agarwal <rohita@qubole.com> Closes #8014 from mindprince/SPARK-9724 and squashes the following commits: a7af5ff [Rohit Agarwal] [SPARK-9724] [WEB UI] Inline attachPrefix and attachPrefixForRedirect. Fix logic of attachPrefix 8a977cd [Rohit Agarwal] [SPARK-9724] [WEB UI] Address review comments: Remove unneeded code, update scaladoc. b257844 [Rohit Agarwal] [SPARK-9724] [WEB UI] Avoid unnecessary redirects in the Spark Web UI.	2015-08-12 17:48:43 -07:00
Michel Lemay	ab7e721cfe	[SPARK-9826] [CORE] Fix cannot use custom classes in log4j.properties Refactor Utils class and create ShutdownHookManager. NOTE: Wasn't able to run /dev/run-tests on windows machine. Manual tests were conducted locally using custom log4j.properties file with Redis appender and logstash formatter (bundled in the fat-jar submitted to spark) ex: log4j.rootCategory=WARN,console,redis log4j.appender.console=org.apache.log4j.ConsoleAppender log4j.appender.console.target=System.err log4j.appender.console.layout=org.apache.log4j.PatternLayout log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n log4j.logger.org.eclipse.jetty=WARN log4j.logger.org.eclipse.jetty.util.component.AbstractLifeCycle=ERROR log4j.logger.org.apache.spark.repl.SparkIMain$exprTyper=INFO log4j.logger.org.apache.spark.repl.SparkILoop$SparkILoopInterpreter=INFO log4j.logger.org.apache.spark.graphx.Pregel=INFO log4j.appender.redis=com.ryantenney.log4j.FailoverRedisAppender log4j.appender.redis.endpoints=hostname:port log4j.appender.redis.key=mykey log4j.appender.redis.alwaysBatch=false log4j.appender.redis.layout=net.logstash.log4j.JSONEventLayoutV1 Author: michellemay <mlemay@gmail.com> Closes #8109 from michellemay/SPARK-9826.	2015-08-12 16:41:35 -07:00
Niranjan Padmanabhan	738f353988	[SPARK-9092] Fixed incompatibility when both num-executors and dynamic... … allocation are set. Now, dynamic allocation is set to false when num-executors is explicitly specified as an argument. Consequently, executorAllocationManager in not initialized in the SparkContext. Author: Niranjan Padmanabhan <niranjan.padmanabhan@cloudera.com> Closes #7657 from neurons/SPARK-9092.	2015-08-12 16:10:21 -07:00
Xiangrui Meng	6f60298b1d	[SPARK-8967] [DOC] add Since annotation Add `Since` as a Scala annotation. The benefit is that we can use it without having explicit JavaDoc. This is useful for inherited methods. The limitation is that is doesn't show up in the generated Java API documentation. This might be fixed by modifying genjavadoc. I think we could leave it as a TODO. This is how the generated Scala doc looks: `since` JavaDoc tag: ![screen shot 2015-08-11 at 10 00 37 pm](https://cloud.githubusercontent.com/assets/829644/9230761/fa72865c-40d8-11e5-807e-0f3c815c5acd.png) `Since` annotation: ![screen shot 2015-08-11 at 10 00 28 pm](https://cloud.githubusercontent.com/assets/829644/9230764/0041d7f4-40d9-11e5-8124-c3f3e5d5b31f.png) rxin Author: Xiangrui Meng <meng@databricks.com> Closes #8131 from mengxr/SPARK-8967.	2015-08-12 14:28:23 -07:00
Andrew Or	e0110792ef	[SPARK-9747] [SQL] Avoid starving an unsafe operator in aggregation This is the sister patch to #8011, but for aggregation. In a nutshell: create the `TungstenAggregationIterator` before computing the parent partition. Internally this creates a `BytesToBytesMap` which acquires a page in the constructor as of this patch. This ensures that the aggregation operator is not starved since we reserve at least 1 page in advance. rxin yhuai Author: Andrew Or <andrew@databricks.com> Closes #8038 from andrewor14/unsafe-starve-memory-agg.	2015-08-12 10:08:35 -07:00
Andrew Or	be5d191207	[SPARK-9795] Dynamic allocation: avoid double counting when killing same executor twice This is based on KaiXinXiaoLei's changes in #7716. The issue is that when someone calls `sc.killExecutor("1")` on the same executor twice quickly, then the executor target will be adjusted downwards by 2 instead of 1 even though we're only actually killing one executor. In certain cases where we don't adjust the target back upwards quickly, we'll end up with jobs hanging. This is a common danger because there are many places where this is called: - `HeartbeatReceiver` kills an executor that has not been sending heartbeats - `ExecutorAllocationManager` kills an executor that has been idle - The user code might call this, which may interfere with the previous callers While it's not clear whether this fixes SPARK-9745, fixing this potential race condition seems like a strict improvement. I've added a regression test to illustrate the issue. Author: Andrew Or <andrew@databricks.com> Closes #8078 from andrewor14/da-double-kill.	2015-08-12 09:24:50 -07:00
Tom White	2e680668f7	[SPARK-8625] [CORE] Propagate user exceptions in tasks back to driver This allows clients to retrieve the original exception from the cause field of the SparkException that is thrown by the driver. If the original exception is not in fact Serializable then it will not be returned, but the message and stacktrace will be. (All Java Throwables implement the Serializable interface, but this is no guarantee that a particular implementation can actually be serialized.) Author: Tom White <tom@cloudera.com> Closes #7014 from tomwhite/propagate-user-exceptions.	2015-08-12 10:07:11 -05:00
Timothy Chen	5c99d8bf98	[SPARK-8798] [MESOS] Allow additional uris to be fetched with mesos Some users like to download additional files in their sandbox that they can refer to from their spark program, or even later mount these files to another directory. Author: Timothy Chen <tnachen@gmail.com> Closes #7195 from tnachen/mesos_files.	2015-08-11 23:26:33 -07:00
Carson Wang	bab8923285	[SPARK-9426] [WEBUI] Job page DAG visualization is not shown To reproduce the issue, go to the stage page and click DAG Visualization once, then go to the job page to show the job DAG visualization. You will only see the first stage of the job. Root cause: the java script use local storage to remember your selection. Once you click the stage DAG visualization, the local storage set `expand-dag-viz-arrow-stage` to true. When you go to the job page, the js checks `expand-dag-viz-arrow-stage` in the local storage first and will try to show stage DAG visualization on the job page. To fix this, I set an id to the DAG span to differ job page and stage page. In the js code, we check the id and local storage together to make sure we show the correct DAG visualization. Author: Carson Wang <carson.wang@intel.com> Closes #8104 from carsonwang/SPARK-9426.	2015-08-11 23:25:02 -07:00
zsxwing	4e3f4b934f	[SPARK-9829] [WEBUI] Display the update value for peak execution memory The peak execution memory is not correct because it shows the sum of finished tasks' values when a task finishes. This PR fixes it by using the update value rather than the accumulator value. Author: zsxwing <zsxwing@gmail.com> Closes #8121 from zsxwing/SPARK-9829.	2015-08-11 23:23:17 -07:00
Rohit Agarwal	a807fcbe50	[SPARK-9806] [WEB UI] Don't share ReplayListenerBus between multiple applications Author: Rohit Agarwal <rohita@qubole.com> Closes #8088 from mindprince/SPARK-9806.	2015-08-11 23:20:39 -07:00
xutingjun	b85f9a242a	[SPARK-8366] maxNumExecutorsNeeded should properly handle failed tasks Author: xutingjun <xutingjun@huawei.com> Author: meiyoula <1039320815@qq.com> Closes #6817 from XuTingjun/SPARK-8366.	2015-08-11 23:19:35 -07:00
zsxwing	f16bc68dfb	[SPARK-9824] [CORE] Fix the issue that InternalAccumulator leaks WeakReference `InternalAccumulator.create` doesn't call `registerAccumulatorForCleanup` to register itself with ContextCleaner, so `WeakReference`s for these accumulators in `Accumulators.originals` won't be removed. This PR added `registerAccumulatorForCleanup` for internal accumulators to avoid the memory leak. Author: zsxwing <zsxwing@gmail.com> Closes #8108 from zsxwing/internal-accumulators-leak.	2015-08-11 14:06:23 -07:00
Jeff Zhang	bce72797f3	Fix comment error API is updated but its doc comment is not updated. Author: Jeff Zhang <zjffdu@apache.org> Closes #8097 from zjffdu/dev.	2015-08-11 10:42:17 -07:00
Reynold Xin	d378396f86	[SPARK-9815] Rename PlatformDependent.UNSAFE -> Platform. PlatformDependent.UNSAFE is way too verbose. Author: Reynold Xin <rxin@databricks.com> Closes #8094 from rxin/SPARK-9815 and squashes the following commits: 229b603 [Reynold Xin] [SPARK-9815] Rename PlatformDependent.UNSAFE -> Platform.	2015-08-11 08:41:06 -07:00
Marcelo Vanzin	0f3366a4c7	[SPARK-9710] [TEST] Fix RPackageUtilsSuite when R is not available. RUtils.isRInstalled throws an exception if R is not installed, instead of returning false. Fix that. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #8008 from vanzin/SPARK-9710 and squashes the following commits: df72d8c [Marcelo Vanzin] [SPARK-9710] [test] Fix RPackageUtilsSuite when R is not available.	2015-08-10 10:10:40 -07:00
Shivaram Venkataraman	46025616b4	[CORE] [SPARK-9760] Use Option instead of Some for Ivy repos This was introduced in #7599 cc rxin brkyvz Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes #8055 from shivaram/spark-packages-repo-fix and squashes the following commits: 890f306 [Shivaram Venkataraman] Remove test case 51d69ee [Shivaram Venkataraman] Add test case for --packages without --repository c02e0b4 [Shivaram Venkataraman] Use Option instead of Some for Ivy repos	2015-08-09 14:30:30 -07:00
Reynold Xin	e9c36938ba	[SPARK-9752][SQL] Support UnsafeRow in Sample operator. In order for this to work, I had to disable gap sampling. Author: Reynold Xin <rxin@databricks.com> Closes #8040 from rxin/SPARK-9752 and squashes the following commits: f9e248c [Reynold Xin] Fix the test case for real this time. adbccb3 [Reynold Xin] Fixed test case. 589fb23 [Reynold Xin] Merge branch 'SPARK-9752' of github.com:rxin/spark into SPARK-9752 55ccddc [Reynold Xin] Fixed core test. 78fa895 [Reynold Xin] [SPARK-9752][SQL] Support UnsafeRow in Sample operator. c9e7112 [Reynold Xin] [SPARK-9752][SQL] Support UnsafeRow in Sample operator.	2015-08-09 10:58:36 -07:00
Carson Wang	ef062c1599	[SPARK-9731] Standalone scheduling incorrect cores if spark.executor.cores is not set The issue only happens if `spark.executor.cores` is not set and executor memory is set to a high value. For example, if we have a worker with 4G and 10 cores and we set `spark.executor.memory` to 3G, then only 1 core is assigned to the executor. The correct number should be 10 cores. I've added a unit test to illustrate the issue. Author: Carson Wang <carson.wang@intel.com> Closes #8017 from carsonwang/SPARK-9731 and squashes the following commits: d09ec48 [Carson Wang] Fix code style 86b651f [Carson Wang] Simplify the code 943cc4c [Carson Wang] fix scheduling correct cores to executors	2015-08-07 23:36:26 -07:00
Andrew Or	881548ab20	[SPARK-9674] Re-enable ignored test in SQLQuerySuite The original code that this test tests is removed in `9270bd06fd`. It was ignored shortly before that so we never caught it. This patch re-enables the test and adds the code necessary to make it pass. JoshRosen yhuai Author: Andrew Or <andrew@databricks.com> Closes #8015 from andrewor14/SPARK-9674 and squashes the following commits: 225eac2 [Andrew Or] Merge branch 'master' of github.com:apache/spark into SPARK-9674 8c24209 [Andrew Or] Fix NPE e541d64 [Andrew Or] Track aggregation memory for both sort and hash 0be3a42 [Andrew Or] Fix test	2015-08-07 14:20:13 -07:00
zsxwing	ebfd91c542	[SPARK-9467][SQL]Add SQLMetric to specialize accumulators to avoid boxing This PR adds SQLMetric/SQLMetricParam/SQLMetricValue to specialize accumulators to avoid boxing. All SQL metrics should use these classes rather than `Accumulator`. Author: zsxwing <zsxwing@gmail.com> Closes #7996 from zsxwing/sql-accu and squashes the following commits: 14a5f0a [zsxwing] Address comments 367ca23 [zsxwing] Use localValue directly to avoid changing Accumulable 42f50c3 [zsxwing] Add SQLMetric to specialize accumulators to avoid boxing	2015-08-07 00:09:58 -07:00
Davies Liu	15bd6f338d	[SPARK-9453] [SQL] support records larger than page size in UnsafeShuffleExternalSorter This patch follows exactly #7891 (except testing) Author: Davies Liu <davies@databricks.com> Closes #8005 from davies/larger_record and squashes the following commits: f9c4aff [Davies Liu] address comments 9de5c72 [Davies Liu] support records larger than page size in UnsafeShuffleExternalSorter	2015-08-06 23:40:38 -07:00
Reynold Xin	4309262ec9	[SPARK-9700] Pick default page size more intelligently. Previously, we use 64MB as the default page size, which was way too big for a lot of Spark applications (especially for single node). This patch changes it so that the default page size, if unset by the user, is determined by the number of cores available and the total execution memory available. Author: Reynold Xin <rxin@databricks.com> Closes #8012 from rxin/pagesize and squashes the following commits: 16f4756 [Reynold Xin] Fixed failing test. 5afd570 [Reynold Xin] private... 0d5fb98 [Reynold Xin] Update default value. 674a6cd [Reynold Xin] Address review feedback. dc00e05 [Reynold Xin] Merge with master. 73ebdb6 [Reynold Xin] [SPARK-9700] Pick default page size more intelligently.	2015-08-06 23:18:29 -07:00
zsxwing	672f467668	[SPARK-8057][Core]Call TaskAttemptContext.getTaskAttemptID using Reflection Someone may use the Spark core jar in the maven repo with hadoop 1. SPARK-2075 has already resolved the compatibility issue to support it. But `SparkHadoopMapRedUtil.commitTask` broke it recently. This PR uses Reflection to call `TaskAttemptContext.getTaskAttemptID` to fix the compatibility issue. Author: zsxwing <zsxwing@gmail.com> Closes #6599 from zsxwing/SPARK-8057 and squashes the following commits: f7a343c [zsxwing] Remove the redundant import 6b7f1af [zsxwing] Call TaskAttemptContext.getTaskAttemptID using Reflection	2015-08-06 21:42:42 -07:00
Andrew Or	014a9f9d8c	[SPARK-9709] [SQL] Avoid starving unsafe operators that use sort The issue is that a task may run multiple sorts, and the sorts run by the child operator (i.e. parent RDD) may acquire all available memory such that other sorts in the same task do not have enough to proceed. This manifests itself in an `IOException("Unable to acquire X bytes of memory")` thrown by `UnsafeExternalSorter`. The solution is to reserve a page in each sorter in the chain before computing the child operator's (parent RDD's) partitions. This requires us to use a new special RDD that does some preparation before computing the parent's partitions. Author: Andrew Or <andrew@databricks.com> Closes #8011 from andrewor14/unsafe-starve-memory and squashes the following commits: 35b69a4 [Andrew Or] Simplify test 0b07782 [Andrew Or] Minor: update comments 5d5afdf [Andrew Or] Merge branch 'master' of github.com:apache/spark into unsafe-starve-memory 254032e [Andrew Or] Add tests 234acbd [Andrew Or] Reserve a page in sorter when preparing each partition b889e08 [Andrew Or] MapPartitionsWithPreparationRDD	2015-08-06 19:04:57 -07:00
Reynold Xin	b87825310a	[SPARK-9692] Remove SqlNewHadoopRDD's generated Tuple2 and InterruptibleIterator. A small performance optimization – we don't need to generate a Tuple2 and then immediately discard the key. We also don't need an extra wrapper from InterruptibleIterator. Author: Reynold Xin <rxin@databricks.com> Closes #8000 from rxin/SPARK-9692 and squashes the following commits: 1d4d0b3 [Reynold Xin] [SPARK-9692] Remove SqlNewHadoopRDD's generated Tuple2 and InterruptibleIterator.	2015-08-06 18:25:38 -07:00
Marcelo Vanzin	e234ea1b49	[SPARK-9645] [YARN] [CORE] Allow shuffle service to read shuffle files. Spark should not mess with the permissions of directories created by the cluster manager. Here, by setting the block manager dir permissions to 700, the shuffle service (running as the YARN user) wouldn't be able to serve shuffle files created by applications. Also, the code to protect the local app dir was missing in standalone's Worker; that has been now added. Since all processes run as the same user in standalone, `chmod 700` should not cause problems. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #7966 from vanzin/SPARK-9645 and squashes the following commits: 6e07b31 [Marcelo Vanzin] Protect the app dir in standalone mode. 384ba6a [Marcelo Vanzin] [SPARK-9645] [yarn] [core] Allow shuffle service to read shuffle files.	2015-08-06 15:30:27 -07:00

1 2 3 4 5 ...

4821 commits