ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Patrick Wendell	1cfa7302ee	Preparing development version 1.4.2-SNAPSHOT	2015-06-22 22:21:31 -07:00
Patrick Wendell	d0a5560ce4	Preparing Spark release v1.4.1-rc1	2015-06-22 22:21:26 -07:00
Patrick Wendell	48d6830144	[BUILD] Preparing Spark release 1.4.1	2015-06-22 22:18:52 -07:00
Tathagata Das	4b2c793a27	[SPARK-7180] [SPARK-8090] [SPARK-8091] Fix a number of SerializationDebugger bugs and limitations This PR solves three SerializationDebugger issues. * SPARK-7180 - SerializationDebugger fails with ArrayOutOfBoundsException * SPARK-8090 - SerializationDebugger does not handle classes with writeReplace correctly * SPARK-8091 - SerializationDebugger does not handle classes with writeObject method The solutions for each are explained as follows * SPARK-7180 - The wrong slot desc was used for getting the value of the fields in the object being tested. * SPARK-8090 - Test the type of the replaced object. * SPARK-8091 - Use a dummy ObjectOutputStream to collect all the objects written by the writeObject() method, and then test those objects as usual. I also added more tests in the testsuite to increase code coverage. For example, added tests for cases where there are not serializability issues. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #6625 from tdas/SPARK-7180 and squashes the following commits: c7cb046 [Tathagata Das] Addressed comments on docs ae212c8 [Tathagata Das] Improved docs 304c97b [Tathagata Das] Fixed build error 26b5179 [Tathagata Das] more tests.....92% line coverage 7e2fdcf [Tathagata Das] Added more tests d1967fb [Tathagata Das] Added comments. da75d34 [Tathagata Das] Removed unnecessary lines. 50a608d [Tathagata Das] Fixed bugs and added support for writeObject	2015-06-19 11:06:32 -07:00
Dibyendu Bhattacharya	b55e4b9a52	[SPARK-8080] [STREAMING] Receiver.store with Iterator does not give correct count at Spark UI tdas zsxwing this is the new PR for Spark-8080 I have merged https://github.com/apache/spark/pull/6659 Also to mention , for MEMORY_ONLY settings , when Block is not able to unrollSafely to memory if enough space is not there, BlockManager won't try to put the block and ReceivedBlockHandler will throw SparkException as it could not find the block id in PutResult. Thus number of records in block won't be counted if Block failed to unroll in memory. Which is fine. For MEMORY_DISK settings , if BlockManager not able to unroll block to memory, block will still get deseralized to Disk. Same for WAL based store. So for those cases ( storage level = memory + disk ) number of records will be counted even though the block not able to unroll to memory. thus I added the isFullyConsumed in the CountingIterator but have not used it as such case will never happen that block not fully consumed and ReceivedBlockHandler still get the block ID. I have added few test cases to cover those block unrolling scenarios also. Author: Dibyendu Bhattacharya <dibyendu.bhattacharya1@pearson.com> Author: U-PEROOT\UBHATD1 <UBHATD1@PIN-L-PI046.PEROOT.com> Closes #6707 from dibbhatt/master and squashes the following commits: f6cb6b5 [Dibyendu Bhattacharya] [SPARK-8080][STREAMING] Receiver.store with Iterator does not give correct count at Spark UI f37cfd8 [Dibyendu Bhattacharya] [SPARK-8080][STREAMING] Receiver.store with Iterator does not give correct count at Spark UI 5a8344a [Dibyendu Bhattacharya] [SPARK-8080][STREAMING] Receiver.store with Iterator does not give correct count at Spark UI Count ByteBufferBlock as 1 count fceac72 [Dibyendu Bhattacharya] [SPARK-8080][STREAMING] Receiver.store with Iterator does not give correct count at Spark UI 0153e7e [Dibyendu Bhattacharya] [SPARK-8080][STREAMING] Receiver.store with Iterator does not give correct count at Spark UI Fixed comments given by @zsxwing 4c5931d [Dibyendu Bhattacharya] [SPARK-8080][STREAMING] Receiver.store with Iterator does not give correct count at Spark UI 01e6dc8 [U-PEROOT\UBHATD1] A	2015-06-18 20:04:29 -07:00
huangzhaowei	f287f7ea14	[SPARK-8367] [STREAMING] Add a limit for 'spark.streaming.blockInterval` since a data loss bug. Bug had reported in the jira [SPARK-8367](https://issues.apache.org/jira/browse/SPARK-8367) The relution is limitting the configuration `spark.streaming.blockInterval` to a positive number. Author: huangzhaowei <carlmartinmax@gmail.com> Author: huangzhaowei <SaintBacchus@users.noreply.github.com> Closes #6818 from SaintBacchus/SPARK-8367 and squashes the following commits: c9d1927 [huangzhaowei] Update BlockGenerator.scala bd3f71a [huangzhaowei] Use requre instead of if 3d17796 [huangzhaowei] [SPARK_8367][Streaming]Add a limit for 'spark.streaming.blockInterval' since a data loss bug. (cherry picked from commit `ccf010f27b`) Signed-off-by: Sean Owen <sowen@cloudera.com>	2015-06-16 08:16:18 +02:00
zsxwing	200c980a13	[SPARK-8112] [STREAMING] Fix the negative event count issue Author: zsxwing <zsxwing@gmail.com> Closes #6659 from zsxwing/SPARK-8112 and squashes the following commits: a5d7da6 [zsxwing] Address comments d255b6e [zsxwing] Fix the negative event count issue (cherry picked from commit `4f16d3fe2e`) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>	2015-06-05 12:46:15 -07:00
Andrew Or	bfe74b34a6	[SPARK-7558] Demarcate tests in unit-tests.log (1.4) This includes the following commits: original: `9eb222c` hotfix1: `8c99793` hotfix2: `a4f2412` scalastyle check: `609c492` --- Original patch #6441 Branch-1.3 patch #6602 Author: Andrew Or <andrew@databricks.com> Closes #6598 from andrewor14/demarcate-tests-1.4 and squashes the following commits: 4c3c566 [Andrew Or] Merge branch 'branch-1.4' of github.com:apache/spark into demarcate-tests-1.4 e217b78 [Andrew Or] [SPARK-7558] Guard against direct uses of FunSuite / FunSuiteLike 46d4361 [Andrew Or] Various whitespace changes (minor) 3d9bf04 [Andrew Or] Make all test suites extend SparkFunSuite instead of FunSuite eaa520e [Andrew Or] Fix tests? b4d93de [Andrew Or] Fix tests 634a777 [Andrew Or] Fix log message a932e8d [Andrew Or] Fix manual things that cannot be covered through automation 8bc355d [Andrew Or] Add core tests as dependencies in all modules 75d361f [Andrew Or] Introduce base abstract class for all test suites	2015-06-03 20:46:44 -07:00
Patrick Wendell	ab713af564	Preparing development version 1.4.0-SNAPSHOT	2015-06-02 18:06:41 -07:00
Patrick Wendell	22596c534a	Preparing Spark release v1.4.0-rc4	2015-06-02 18:06:35 -07:00
Patrick Wendell	e3c35b217c	Preparing development version 1.4.0-SNAPSHOT	2015-06-02 17:01:15 -07:00
Patrick Wendell	a14fad11ef	Preparing Spark release v1.4.0-rc4	2015-06-02 17:01:10 -07:00
Patrick Wendell	92ccc5ba39	Preparing development version 1.4.0-SNAPSHOT	2015-06-02 14:02:19 -07:00
Patrick Wendell	d630f4d697	Preparing Spark release v1.4.0-rc4	2015-06-02 14:02:14 -07:00
Patrick Wendell	92a677891c	Preparing development version 1.4.0-SNAPSHOT	2015-06-02 08:41:15 -07:00
Patrick Wendell	48c506724a	Preparing Spark release v1.4.0-rc4	2015-06-02 08:41:10 -07:00
zsxwing	0d21a41be9	[SPARK-8025][Streaming]Add JavaDoc style deprecation for deprecated Streaming methods Scala `deprecated` annotation actually doesn't show up in JavaDoc. Author: zsxwing <zsxwing@gmail.com> Closes #6564 from zsxwing/SPARK-8025 and squashes the following commits: 2faa2bb [zsxwing] Add JavaDoc style deprecation for deprecated Streaming methods (cherry picked from commit `7f74bb3bc6`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-06-01 21:36:55 -07:00
Tathagata Das	a7c8b00b72	[SPARK-7958] [STREAMING] Handled exception in StreamingContext.start() to prevent leaking of actors StreamingContext.start() can throw exception because DStream.validateAtStart() fails (say, checkpoint directory not set for StateDStream). But by then JobScheduler, JobGenerator, and ReceiverTracker has already started, along with their actors. But those cannot be shutdown because the only way to do that is call StreamingContext.stop() which cannot be called as the context has not been marked as ACTIVE. The solution in this PR is to stop the internal scheduler if start throw exception, and mark the context as STOPPED. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #6559 from tdas/SPARK-7958 and squashes the following commits: 20b2ec1 [Tathagata Das] Added synchronized 790b617 [Tathagata Das] Handled exception in StreamingContext.start() (cherry picked from commit `2f9c7519d6`) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>	2015-06-01 20:05:11 -07:00
Reynold Xin	f63eab950b	[SPARK-3850] Trim trailing spaces for examples/streaming/yarn. Author: Reynold Xin <rxin@databricks.com> Closes #6530 from rxin/trim-whitespace-1 and squashes the following commits: 7b7b3a0 [Reynold Xin] Reset again. dc14597 [Reynold Xin] Reset scalastyle. cd556c4 [Reynold Xin] YARN, Kinesis, Flume. 4223fe1 [Reynold Xin] [SPARK-3850] Trim trailing spaces for examples/streaming. (cherry picked from commit `564bc11e98`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-31 00:48:29 -07:00
Patrick Wendell	e549874c33	Preparing development version 1.4.0-SNAPSHOT	2015-05-29 13:07:07 -07:00
Patrick Wendell	dd109a8746	Preparing Spark release v1.4.0-rc3	2015-05-29 13:06:59 -07:00
Patrick Wendell	c68abaa34e	Preparing development version 1.4.0-SNAPSHOT	2015-05-29 12:15:18 -07:00
Patrick Wendell	fb60503ff2	Preparing Spark release v1.4.0-rc3	2015-05-29 12:15:13 -07:00
Patrick Wendell	6bf5a42084	Preparing development version 1.4.0-SNAPSHOT	2015-05-28 23:40:27 -07:00
Patrick Wendell	f2796816be	Preparing Spark release v1.4.0-rc3	2015-05-28 23:40:22 -07:00
Patrick Wendell	119c93af9c	Preparing development version 1.4.0-SNAPSHOT	2015-05-28 22:57:31 -07:00
Patrick Wendell	2d97d7a0aa	Preparing Spark release v1.4.0-rc3	2015-05-28 22:57:26 -07:00
Patrick Wendell	e419821c3b	[HOTFIX] Minor style fix from last commit	2015-05-28 22:48:25 -07:00
Tathagata Das	7a52fdf25f	[SPARK-7931] [STREAMING] Do not restart receiver when stopped Attempts to restart the socket receiver when it is supposed to be stopped causes undesirable error messages. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #6483 from tdas/SPARK-7931 and squashes the following commits: 09aeee1 [Tathagata Das] Do not restart receiver when stopped	2015-05-28 22:48:23 -07:00
Reynold Xin	f4b135337c	[SPARK-7927] whitespace fixes for streaming. So we can enable a whitespace enforcement rule in the style checker to save code review time. Author: Reynold Xin <rxin@databricks.com> Closes #6475 from rxin/whitespace-streaming and squashes the following commits: 810dae4 [Reynold Xin] Fixed tests. 89068ad [Reynold Xin] [SPARK-7927] whitespace fixes for streaming. (cherry picked from commit `3af0b3136e`) Signed-off-by: Reynold Xin <rxin@databricks.com>	2015-05-28 17:55:29 -07:00
Patrick Wendell	7c342bdd93	Preparing development version 1.4.0-SNAPSHOT	2015-05-27 22:36:30 -07:00
Patrick Wendell	4983dfc878	Preparing Spark release v1.4.0-rc3	2015-05-27 22:36:23 -07:00
Patrick Wendell	947d700ec8	Preparing development version 1.4.0-SNAPSHOT	2015-05-23 20:13:05 -07:00
Patrick Wendell	03fb26a3e5	Preparing Spark release v1.4.0-rc2	2015-05-23 20:13:00 -07:00
Patrick Wendell	f2f74b9b1a	Preparing development version 1.4.1-SNAPSHOT	2015-05-23 14:59:37 -07:00
Patrick Wendell	0da7396990	Preparing Spark release v1.4.0-rc2-test	2015-05-23 14:59:31 -07:00
Patrick Wendell	8da8caab17	Preparing development version 1.4.1-SNAPSHOT	2015-05-23 14:46:27 -07:00
Patrick Wendell	8f50218f38	Preparing Spark release 1.4.0-rc2-test	2015-05-23 14:46:23 -07:00
zsxwing	ea9db50bc3	[SPARK-7777][Streaming] Handle the case when there is no block in a batch In the old implementation, if a batch has no block, `areWALRecordHandlesPresent` will be `true` and it will return `WriteAheadLogBackedBlockRDD`. This PR handles this case by returning `WriteAheadLogBackedBlockRDD` or `BlockRDD` according to the configuration. Author: zsxwing <zsxwing@gmail.com> Closes #6372 from zsxwing/SPARK-7777 and squashes the following commits: 788f895 [zsxwing] Handle the case when there is no block in a batch (cherry picked from commit `ad0badba14`) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>	2015-05-23 02:11:28 -07:00
Tathagata Das	b928db4fe3	[SPARK-7838] [STREAMING] Set scope for kinesis stream Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #6369 from tdas/SPARK-7838 and squashes the following commits: 87d1c7f [Tathagata Das] Addressed comment 37775d8 [Tathagata Das] set scope for kinesis stream (cherry picked from commit `baa89838cc`) Signed-off-by: Andrew Or <andrew@databricks.com>	2015-05-22 23:06:01 -07:00
Tathagata Das	a17a5cb302	[SPARK-7776] [STREAMING] Added shutdown hook to StreamingContext Shutdown hook to stop SparkContext was added recently. This results in ugly errors when a streaming application is terminated by ctrl-C. ``` Exception in thread "Thread-27" org.apache.spark.SparkException: Job cancelled because SparkContext was shut down at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:736) at org.apache.spark.scheduler.DAGScheduler$$anonfun$cleanUpAfterSchedulerStop$1.apply(DAGScheduler.scala:735) at scala.collection.mutable.HashSet.foreach(HashSet.scala:79) at org.apache.spark.scheduler.DAGScheduler.cleanUpAfterSchedulerStop(DAGScheduler.scala:735) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onStop(DAGScheduler.scala:1468) at org.apache.spark.util.EventLoop.stop(EventLoop.scala:84) at org.apache.spark.scheduler.DAGScheduler.stop(DAGScheduler.scala:1403) at org.apache.spark.SparkContext.stop(SparkContext.scala:1642) at org.apache.spark.SparkContext$$anonfun$3.apply$mcV$sp(SparkContext.scala:559) at org.apache.spark.util.SparkShutdownHook.run(Utils.scala:2266) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(Utils.scala:2236) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2236) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(Utils.scala:2236) at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1764) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(Utils.scala:2236) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2236) at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(Utils.scala:2236) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.util.SparkShutdownHookManager.runAll(Utils.scala:2236) at org.apache.spark.util.SparkShutdownHookManager$$anon$6.run(Utils.scala:2218) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) ``` This is because the Spark's shutdown hook stops the context, and the streaming jobs fail in the middle. The correct solution is to stop the streaming context before the spark context. This PR adds the shutdown hook to do so with a priority higher than the SparkContext's shutdown hooks priority. Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #6307 from tdas/SPARK-7776 and squashes the following commits: e3d5475 [Tathagata Das] Added conf to specify graceful shutdown 4c18652 [Tathagata Das] Added shutdown hook to StreamingContxt. (cherry picked from commit `d68ea24d60`) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>	2015-05-21 17:45:17 -07:00
Burak Yavuz	f08c6f3193	[SPARK-7745] Change asserts to requires for user input checks in Spark Streaming Assertions can be turned off. `require` throws an `IllegalArgumentException` which makes more sense when it's a user set variable. Author: Burak Yavuz <brkyvz@gmail.com> Closes #6271 from brkyvz/streaming-require and squashes the following commits: d249484 [Burak Yavuz] fix merge conflict 264adb8 [Burak Yavuz] addressed comments v1.0 6161350 [Burak Yavuz] fix tests 16aa766 [Burak Yavuz] changed more assertions to more meaningful errors afd923d [Burak Yavuz] changed some assertions to require (cherry picked from commit `1ee8eb431e`) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>	2015-05-21 00:31:16 -07:00
zsxwing	b6182ce891	[SPARK-7777] [STREAMING] Fix the flaky test in org.apache.spark.streaming.BasicOperationsSuite Just added a guard to make sure a batch has completed before moving to the next batch. Author: zsxwing <zsxwing@gmail.com> Closes #6306 from zsxwing/SPARK-7777 and squashes the following commits: ecee529 [zsxwing] Fix the failure message 58634fe [zsxwing] Fix the flaky test in org.apache.spark.streaming.BasicOperationsSuite (cherry picked from commit `895baf8f77`) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>	2015-05-20 19:56:10 -07:00
Patrick Wendell	9b37e32c55	Preparing development version 1.4.0-SNAPSHOT	2015-05-20 17:29:00 -07:00
Patrick Wendell	1e458e3553	Preparing Spark release rc-test	2015-05-20 17:28:55 -07:00
pwendell	8d66849862	Preparing development version 1.4.0-SNAPSHOT	2015-05-20 17:26:15 -07:00
pwendell	ae29aeaf8e	Preparing Spark release rc-test	2015-05-20 17:26:10 -07:00
jenkins	534c787b9f	Preparing development version 1.4.0-SNAPSHOT	2015-05-20 16:49:59 -07:00
jenkins	5f4d87f608	Preparing Spark release rc-test	2015-05-20 16:49:54 -07:00
Patrick Wendell	205ed15f29	Preparing development version 1.4.0-SNAPSHOT	2015-05-20 16:30:01 -07:00

1 2 3 4 5 ...

820 commits