Commit graph

2564 commits

Author SHA1 Message Date
Prashant Sharma 4e5b09664c Fixes corresponding to review feedback on pull request #479 2013-02-20 19:14:52 +05:30
Tathagata Das 991d3342fe Merge pull request #486 from ScrapCodes/akka-example-bug-fix
A bug fix to AkkaWordCount example.
2013-02-20 02:06:36 -08:00
Prashant Sharma 05dc385649 A bug fix post merge, following changes to AkkaUtils 2013-02-20 15:28:12 +05:30
Matei Zaharia 05bc02e80b Merge pull request #482 from woggling/shutdown-exceptions
Don't call System.exit on uncaught exceptions from shutdown hooks
2013-02-19 20:56:15 -08:00
haitao.yao 6a3d44c673 Merge branch 'mesos' 2013-02-20 10:23:58 +08:00
Charles Reiss 092c631fa8 Pull detection of being in a shutdown hook into utility function. 2013-02-19 17:49:55 -08:00
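The utility referred to here typically relies on a standard JVM trick: once shutdown has begun, registering a new shutdown hook throws IllegalStateException. A minimal sketch of that idea follows; the object and method names are illustrative, not necessarily the exact utility added in this commit.

```scala
// Sketch: detect whether the JVM is already shutting down.
// Illustrative names; not necessarily Spark's exact utility.
object ShutdownUtils {
  def inShutdown(): Boolean = {
    try {
      val hook = new Thread(new Runnable { def run(): Unit = {} })
      // addShutdownHook throws IllegalStateException once shutdown has started.
      Runtime.getRuntime.addShutdownHook(hook)
      Runtime.getRuntime.removeShutdownHook(hook)
      false
    } catch {
      case _: IllegalStateException => true
    }
  }
}
```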
Matei Zaharia 8a992226bd Merge pull request #484 from andyk/master
Fixes a broken link to the issue tracker in the documentation
2013-02-19 17:07:24 -08:00
Matei Zaharia a3e86b2b1f Merge pull request #483 from rxin/splitpruningrdd2
Added a method to create PartitionPruningRDD.
2013-02-19 17:07:00 -08:00
Andy Konwinski ecd137a72d Fixes link to issue tracker in documentation page "Contributing to Spark". 2013-02-19 16:58:02 -08:00
Reynold Xin 130f704baf Added a method to create PartitionPruningRDD. 2013-02-19 16:03:52 -08:00
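For context, a hedged sketch of how such a factory method is used: prune an RDD down to a subset of its partitions without evaluating the rest. Package names assume the pre-Apache `spark.*` layout of this era, and the exact signature of `create` may differ.

```scala
import spark.SparkContext
import spark.rdd.PartitionPruningRDD

val sc = new SparkContext("local", "pruning-example")
val data = sc.parallelize(1 to 100, 10)  // 10 partitions

// Keep only the first three partitions; tasks are never launched for the rest.
val pruned = PartitionPruningRDD.create(data, partitionIndex => partitionIndex < 3)
println(pruned.partitions.length)  // should report 3 retained partitions
```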
Charles Reiss d0588bd6d7 Catch/log errors deleting temp dirs 2013-02-19 13:04:06 -08:00
Charles Reiss 687581c3ec Paranoid uncaught exception handling for exceptions during shutdown 2013-02-19 13:03:02 -08:00
Tathagata Das b0565ae396 Merge pull request #481 from pwendell/stream-rdd-type-streaming
STREAMING-51: Add RDD type as a type parameter in JavaDStreamLike (streaming/ version)
2013-02-19 10:25:36 -08:00
Patrick Wendell 041c19e5f0 Small changes that were missing in merge 2013-02-19 08:44:20 -08:00
Patrick Wendell fed1122d74 Use RDD type for slice operator in Java.
This commit uses the RDD type in `slice`, making it available to both normal
and pair RDDs in Java. It also updates the signature for `slice` to match
changes in the Scala API.
2013-02-19 08:32:38 -08:00
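As a rough illustration of the Scala-side `slice` API that this Java change mirrors: `slice` returns the batch RDDs falling in a time range, keeping the element type. The helper name is mine, for illustration, and package names assume the pre-Apache `spark.*` layout of this era.

```scala
import spark.RDD
import spark.streaming.{DStream, Time}

// Shape of the Scala-side API the Java change mirrors: slice keeps the
// element type, returning one RDD per batch in the given time range.
def batchesBetween[T](stream: DStream[T], from: Time, to: Time): Seq[RDD[T]] =
  stream.slice(from, to)
```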
Patrick Wendell 35880de42e Use RDD type for transform operator in Java.
This is an improved implementation of the `transform` operator in Java.
The main difference is that this allows all four possible types of
transform functions:

1. JavaRDD -> JavaRDD
2. JavaRDD -> JavaPairRDD
3. JavaPairRDD -> JavaPairRDD
4. JavaPairRDD -> JavaRDD

whereas previously only (1) and (3) were possible.

Conflicts:

	streaming/src/test/java/spark/streaming/JavaAPISuite.java
2013-02-19 08:31:58 -08:00
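A sketch of the four transform shapes listed above, written against the Scala DStream API that this Java change mirrors (era `spark.*` packages; `import spark.SparkContext._` supplies the pair-RDD operations used inside the transforms):

```scala
import spark.SparkContext._
import spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext("local[2]", "transform-example", Seconds(1))
val words = ssc.socketTextStream("localhost", 9999).flatMap(_.split(" "))

val upper    = words.transform(rdd => rdd.map(_.toUpperCase))   // 1. RDD -> RDD
val keyed    = words.transform(rdd => rdd.map(w => (w, 1)))     // 2. RDD -> pair RDD
val counted  = keyed.transform(rdd => rdd.reduceByKey(_ + _))   // 3. pair RDD -> pair RDD
val keysOnly = counted.transform(rdd => rdd.map(_._1))          // 4. pair RDD -> RDD
```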
Patrick Wendell 9d49a6b03f Use RDD type for foreach operator in Java. 2013-02-19 08:30:32 -08:00
Nick Pentreath 8a281399f9 Streaming example using Twitter Algebird's Count Min Sketch monoid 2013-02-19 17:56:02 +02:00
Nick Pentreath d8ee184d95 Dependencies and refactoring for streaming HLL example, and using context.twitterStream method 2013-02-19 17:42:57 +02:00
Prashant Sharma 8d44480d84 Example demonstrating a ZeroMQ stream 2013-02-19 19:42:14 +05:30
Prashant Sharma f7d3e309cb ZeroMQ stream as receiver 2013-02-19 19:32:52 +05:30
Nick Pentreath 315ea069e8 Merge remote-tracking branch 'upstream/streaming' into streaming-eg-algebird
Conflicts:
	project/SparkBuild.scala
2013-02-19 13:58:05 +02:00
Nick Pentreath 015893f0e8 Adding streaming HyperLogLog example using Algebird 2013-02-19 13:21:33 +02:00
Tathagata Das 8b9c673fce Merge pull request #476 from tdas/streaming
Major modifications to fix driver fault-tolerance with file input stream
2013-02-19 03:07:10 -08:00
Tathagata Das 7e30c46aaf Added comment to the KafkaWordCount, given by Sean McNamara. 2013-02-19 03:05:44 -08:00
Tathagata Das 7851b34e97 Merge branch 'mesos-streaming' into streaming 2013-02-19 03:01:15 -08:00
Tathagata Das 9e82be1503 Merge branch 'streaming' into ScrapCodes-streaming-actor
Conflicts:
	docs/plugin-custom-receiver.md
	streaming/src/main/scala/spark/streaming/StreamingContext.scala
	streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
	streaming/src/main/scala/spark/streaming/dstream/PluggableInputDStream.scala
	streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
	streaming/src/test/scala/spark/streaming/InputStreamsSuite.scala
2013-02-19 02:48:50 -08:00
Matei Zaharia 03d847999e Merge pull request #477 from shivaram/ganglia-port-change
Ganglia port change
2013-02-18 20:25:48 -08:00
haitao.yao 7c129388fb Merge branch 'mesos' 2013-02-19 11:22:24 +08:00
Shivaram Venkataraman 6cba5a48b0 Print cluster url after setup completes 2013-02-18 18:30:36 -08:00
Shivaram Venkataraman e7cdf7a6a4 Print ganglia url after setup 2013-02-18 17:15:22 -08:00
Shivaram Venkataraman 03f45a18d5 Use port 5080 for httpd/ganglia 2013-02-18 16:56:01 -08:00
Tathagata Das 12ea14c211 Changed networkStream to socketStream, and pluggableNetworkStream to networkStream, as the way to create streams from an arbitrary network receiver. 2013-02-18 15:18:34 -08:00
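In other words, after this rename the TCP-socket helpers live under socket*Stream, while networkStream takes a custom network receiver. A minimal hedged sketch of the socket side (era `spark.*` packages; exact overloads in this era may differ):

```scala
import spark.streaming.{Seconds, StreamingContext}

val ssc = new StreamingContext("local[2]", "socket-example", Seconds(1))
// Text lines from a TCP socket (the generic networkStream name now refers
// to custom receivers).
val lines = ssc.socketTextStream("localhost", 9999)
lines.print()
ssc.start()
```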
Tathagata Das 6a6e6bda57 Merge branch 'streaming' into ScrapCode-streaming
Conflicts:
	streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
	streaming/src/main/scala/spark/streaming/dstream/NetworkInputDStream.scala
2013-02-18 13:26:12 -08:00
Tathagata Das 8ad561dc7d Added checkpointing and fault-tolerance semantics to the programming guide. Fixed the default checkpoint interval to be a multiple of the slide duration. Fixed the visibility of some classes and objects to clean up the docs. 2013-02-18 02:12:41 -08:00
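A hedged sketch of the semantics that guide update describes: a metadata checkpoint directory set on the context, plus a per-stream checkpoint interval chosen as a multiple of the stream's slide duration (era `spark.*` packages; the path is illustrative):

```scala
import spark.streaming.{Seconds, StreamingContext}
import spark.streaming.StreamingContext._

val ssc = new StreamingContext("local[2]", "checkpoint-example", Seconds(2))
ssc.checkpoint("/tmp/spark-streaming-checkpoint")  // illustrative; often an HDFS path

val counts = ssc.socketTextStream("localhost", 9999)
  .flatMap(_.split(" "))
  .map(w => (w, 1))
  .reduceByKeyAndWindow(_ + _, Seconds(30), Seconds(10))

// Checkpoint every 20 seconds, a multiple of the 10-second slide duration.
counts.checkpoint(Seconds(20))
```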
Matei Zaharia 7151e1e4c8 Rename "jobs" to "applications" in the standalone cluster 2013-02-17 23:23:08 -08:00
Matei Zaharia 06e5e6627f Renamed "splits" to "partitions" 2013-02-17 22:13:26 -08:00
Matei Zaharia 455d015076 Clean up EC2 script options a bit 2013-02-17 16:53:12 -08:00
Tathagata Das f98c7da23e Many changes to ensure better 2nd recovery if a 2nd failure happens while
recovering from the 1st failure
- Made the scheduler checkpoint after clearing old metadata, which
  ensures that a new checkpoint is written as soon as at least one batch
  gets computed while recovering from a failure. This ensures that if
  there is a 2nd failure while recovering from the 1st failure, the system
  starts the 2nd recovery from a newer checkpoint.
- Modified the checkpoint writer to write checkpoints in a separate thread.
- Added a check to make sure that compute for InputDStreams gets called
  only for strictly increasing times.
- Changed the implementation of slice to call getOrCompute on the parent DStream
  in time-increasing order.
- Added a test case for slice.
- Fixed the testGroupByKeyAndWindow test case in JavaAPISuite to verify
  results against the expected output in an order-independent manner.
2013-02-17 15:06:41 -08:00
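As a hedged sketch of the recovery flow these changes harden: the driver persists its DStream graph to the checkpoint directory, and a restarted driver rebuilds the context from that directory, so a 2nd failure during recovery resumes from the newest checkpoint rather than the original one. The recover-from-checkpoint constructor shown here is an assumption about this era's API (later Spark wraps the same pattern as StreamingContext.getOrCreate); the setup function and the restart check are illustrative.

```scala
import spark.streaming.{Seconds, StreamingContext}

val checkpointDir = "/tmp/spark-streaming-checkpoint"  // illustrative path

def createContext(): StreamingContext = {
  val ssc = new StreamingContext("local[2]", "recovery-example", Seconds(2))
  ssc.checkpoint(checkpointDir)
  // Illustrative job: the real graph would define file/network inputs here.
  ssc.socketTextStream("localhost", 9999).print()
  ssc
}

// First run builds a fresh context; a restarted driver recovers instead.
val restarting = sys.env.contains("RECOVERING")  // placeholder decision
val ssc = if (restarting) new StreamingContext(checkpointDir) else createContext()
ssc.start()
```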
Matei Zaharia 08e444df0e Change EC2 script to use 0.6 AMIs by default, for now 2013-02-17 14:01:48 -08:00
Matei Zaharia 2a907dceb3 Merge pull request #421 from shivaram/spark-ec2-change
Switch spark_ec2.py to use the new spark-ec2 scripts.
2013-02-17 13:48:43 -08:00
Matei Zaharia 340cc54e47 Merge pull request #471 from stephenh/parallelrdd
Move ParallelCollection into spark.rdd package.
2013-02-16 16:39:15 -08:00
Matei Zaharia 3260b6120e Merge pull request #470 from stephenh/morek
Make CoGroupedRDDs explicitly have the same key type.
2013-02-16 16:38:38 -08:00
Stephen Haberman 924f47dd11 Add RDD.subtract.
Instead of reusing the cogroup primitive, this adds a SubtractedRDD
that knows it only needs to keep rdd1's values (per split) in memory.
2013-02-16 13:38:42 -06:00
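A small usage sketch of the new operation (era `spark.*` packages):

```scala
import spark.SparkContext

val sc = new SparkContext("local", "subtract-example")
val rdd1 = sc.parallelize(Seq(1, 2, 3, 4, 5))
val rdd2 = sc.parallelize(Seq(2, 4))

// Elements of rdd1 not present in rdd2; per the commit, only rdd1's values
// are buffered per split, rather than full cogroup buffers for both sides.
val diff = rdd1.subtract(rdd2)
println(diff.collect().toSet)  // Set(1, 3, 5)
```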
Stephen Haberman e7713adb99 Move ParallelCollection into spark.rdd package. 2013-02-16 13:20:48 -06:00
Stephen Haberman ae2234687d Make CoGroupedRDDs explicitly have the same key type. 2013-02-16 13:10:31 -06:00
Matei Zaharia 9d979fb630 Merge pull request #469 from stephenh/samepartitionercombine
If combineByKey is using the same partitioner, skip the shuffle.
2013-02-16 10:07:42 -08:00
Stephen Haberman 4328873294 Add assertion about dependencies. 2013-02-16 01:16:40 -06:00
Stephen Haberman c34b8ad2c5 Avoid a shuffle if combineByKey is passed the same partitioner. 2013-02-16 00:54:03 -06:00
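A hedged sketch of the case this optimization targets: the pair RDD is already hash-partitioned with the same partitioner passed to combineByKey, so the combine can run without another shuffle (era `spark.*` packages):

```scala
import spark.{HashPartitioner, SparkContext}
import spark.SparkContext._

val sc = new SparkContext("local", "combine-example")
val part = new HashPartitioner(4)

// Already partitioned by `part`, so combineByKey below needs no shuffle.
val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3))).partitionBy(part)

val sums = pairs.combineByKey(
  (v: Int) => v,                  // createCombiner
  (c: Int, v: Int) => c + v,      // mergeValue
  (c1: Int, c2: Int) => c1 + c2,  // mergeCombiners
  part)                           // same partitioner as the parent RDD

println(sums.collect().toMap)  // Map(a -> 4, b -> 2)
```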
Stephen Haberman 4281e579c2 Update more javadocs. 2013-02-16 00:45:03 -06:00