ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Tathagata Das	3bcc6e5c03	Merge pull request #466 from pwendell/java-stream-transform STREAMING-50: Support transform workaround in JavaPairDStream	2013-02-14 21:30:55 -08:00
Tathagata Das	4b8402e900	Moved Java streaming examples to examples/src/main/java/spark/streaming/... and fixed logging in NetworkInputTracker to highlight errors when receiver deregisters/shuts down.	2013-02-14 18:10:37 -08:00
Tathagata Das	def8126d77	Added TwitterInputDStream from example to StreamingContext. Renamed example TwitterBasic to TwitterPopularTags.	2013-02-14 17:49:43 -08:00
Tathagata Das	2eacf22401	Removed countByKeyAndWindow on paired DStreams, and added countByValueAndWindow for all DStreams. Updated both scala and java API and testsuites.	2013-02-14 12:21:47 -08:00
Tathagata Das	03e8dc6861	Changes functions comments to make them more consistent.	2013-02-13 20:59:29 -08:00
Tathagata Das	12b020b668	Added filter functionality to reduceByKeyAndWindow with inverse. Consolidated reduceByKeyAndWindow's many functions into smaller number of functions with optional parameters.	2013-02-13 20:53:50 -08:00
Imran Rashid	893bad9089	use appid instead of frameworkid; simplify stupid condition	2013-02-13 20:30:21 -08:00
Matei Zaharia	e8663e0fe5	Merge pull request #461 from JoshRosen/fix/issue-tracker-link Update issue tracker link in contributing guide	2013-02-13 18:42:17 -08:00
Imran Rashid	8f18e7e863	include jobid in Executor commandline args	2013-02-13 13:05:13 -08:00
Tathagata Das	39addd3803	Changed scheduler and file input stream to fix bugs in the driver fault tolerance. Added MasterFailureTest to rigorously test master fault tolerance with file input stream.	2013-02-13 12:17:45 -08:00
Patrick Wendell	3f3e77f28b	STREAMING-50: Support transform workaround in JavaPairDStream This ports a useful workaround (the `transform` function) to JavaPairDStream. It is necessary to do things like sorting which are not supported yet in the core streaming API.	2013-02-12 14:02:32 -08:00
Matei Zaharia	fd7e414bd0	Merge pull request #464 from pwendell/java-type-fix SPARK-694: All references to [K, V] in JavaDStreamLike should be changed to [K2, V2]	2013-02-11 19:19:05 -08:00
Matei Zaharia	bfeed4725d	Merge pull request #465 from pwendell/java-sort-fix SPARK-696: sortByKey should use 'ascending' parameter	2013-02-11 18:23:12 -08:00
Patrick Wendell	21df6ffc13	SPARK-696: sortByKey should use 'ascending' parameter	2013-02-11 17:43:26 -08:00
Matei Zaharia	582d31dff9	Formatting fixes	2013-02-11 13:24:54 -08:00
Matei Zaharia	ea08537143	Fixed an exponential recursion that could happen with doCheckpoint due to lack of memoization	2013-02-11 13:23:50 -08:00
Josh Rosen	e9fb25426e	Remove hack workaround for SPARK-668. Renaming the type parameters solves this problem (see SPARK-694). I tried this fix earlier, but it didn't work because I didn't run `sbt/sbt clean` first.	2013-02-11 11:19:20 -08:00
Patrick Wendell	d09c36065c	Using tuple swap()	2013-02-11 10:45:45 -08:00
Patrick Wendell	04786d0739	small fix	2013-02-11 10:05:49 -08:00
Patrick Wendell	c65988bdc1	Fix for MapPartitions	2013-02-11 10:03:37 -08:00
Patrick Wendell	20cf770545	Fix for flatmap	2013-02-11 10:03:37 -08:00
Patrick Wendell	314d87a038	Indentation fix	2013-02-11 10:03:37 -08:00
Patrick Wendell	f0b68c623c	Initial cut at replacing K, V in Java files	2013-02-11 10:03:37 -08:00
Imran Rashid	e9f53ec0ea	undo change to onCompleteCallbacks	2013-02-11 09:36:49 -08:00
Matei Zaharia	da8afbc77e	Some bug and formatting fixes to FT Conflicts: core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala	2013-02-10 22:43:38 -08:00
root	1b47fa2752	Detect hard crashes of workers using a heartbeat mechanism. Also fixes some issues in the rest of the code with detecting workers this way. Conflicts: core/src/main/scala/spark/deploy/master/Master.scala core/src/main/scala/spark/deploy/worker/Worker.scala core/src/main/scala/spark/scheduler/cluster/SparkDeploySchedulerBackend.scala core/src/main/scala/spark/scheduler/cluster/StandaloneClusterMessage.scala core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala	2013-02-10 22:28:28 -08:00
Matei Zaharia	05d2e94838	Use a separate memory setting for standalone cluster daemons Conflicts: docs/_config.yml	2013-02-10 21:59:41 -08:00
Matei Zaharia	8c66c49962	Tweak web UI so that people don't get confused about master URL format Conflicts: core/src/main/twirl/spark/deploy/master/index.scala.html core/src/main/twirl/spark/deploy/worker/index.scala.html	2013-02-10 21:58:34 -08:00
Matei Zaharia	0b788b760b	Update Windows scripts to launch daemons with less RAM and fix a few other issues Conflicts: run2.cmd	2013-02-10 21:51:49 -08:00
Imran Rashid	d9461b15d3	cleanup a bunch of imports	2013-02-10 21:41:40 -08:00
Tathagata Das	fd90daf850	Fixed bugs in FileInputDStream and Scheduler that occasionally failed to reprocess old files after recovering from master failure. Completely modified spark.streaming.FailureTest to test multiple master failures using file input stream.	2013-02-10 19:48:42 -08:00
Tathagata Das	16baea62bc	Fixed bug in CheckpointRDD to prevent exception when the original RDD had zero splits.	2013-02-10 19:14:49 -08:00
Imran Rashid	383af599bb	SparkContext.addSparkListener; "std" listener in StatsReportListener	2013-02-10 14:19:37 -08:00
Imran Rashid	b7d9e24394	use TaskMetrics to gather all stats; lots of plumbing to get it all the way back to driver	2013-02-10 14:18:52 -08:00
Josh Rosen	131b56afd0	Update issue tracker link in contributing guide.	2013-02-10 13:28:31 -08:00
Matei Zaharia	b1d809913b	Merge pull request #460 from markhamstra/404 Fixed a 404 in 'Tuning Spark' -- missing '.html'	2013-02-10 13:01:09 -08:00
Mark Hamstra	4975dcdafc	Fixed a 404 -- missing '.html'	2013-02-10 12:55:47 -08:00
Stephen Haberman	680f42e6cd	Change defaultPartitioner to use upstream split size. Previously it used the SparkContext.defaultParallelism, which occasionally ended up being a very bad guess. Looking at upstream RDDs seems to make better use of the context. Also sorted the upstream RDDs by partition size first, as if we have a hugely-partitioned RDD and tiny-partitioned RDD, it is unlikely we want the resulting RDD to be tiny-partitioned.	2013-02-10 02:27:03 -06:00
Patrick Wendell	2ed791fd7f	Minor fixes	2013-02-09 22:00:38 -08:00
Patrick Wendell	1859c9f93c	Changing to use Timer based on code review	2013-02-09 21:55:17 -08:00
Matei Zaharia	ccb1ca4a23	Merge pull request #448 from squito/fetch_maxBytesInFlight add as many fetch requests as we can, subject to maxBytesInFlight	2013-02-09 18:15:18 -08:00
Matei Zaharia	76ac0ce6c0	Merge pull request #446 from pwendell/olap-example SPARK-678: Adding an example with an OLAP roll-up	2013-02-09 18:14:44 -08:00
Matei Zaharia	f750daa510	Merge pull request #452 from stephenh/misc Add RDD.coalesce, clean up some RDDs, other misc.	2013-02-09 18:12:56 -08:00
Stephen Haberman	4619ee0787	Move JavaRDDLike.coalesce into the right places.	2013-02-09 20:05:42 -06:00
Josh Rosen	fc5b2e8b83	Merge pull request #457 from markhamstra/commutative Add commutative requirement for 'reduce' to Python docstring.	2013-02-09 15:54:48 -08:00
Tathagata Das	99a5fc498a	Added an initial spark job to ensure worker nodes are initialized.	2013-02-09 15:18:05 -08:00
Stephen Haberman	921be76533	Use stubs instead of mocks for DAGSchedulerSuite.	2013-02-09 16:42:18 -06:00
Stephen Haberman	fb7599870f	Fix JavaRDDLike.coalesce return type.	2013-02-09 16:10:52 -06:00
Mark Hamstra	b7a1fb5c5d	Add commutative requirement for 'reduce' to Python docstring.	2013-02-09 12:14:11 -08:00
Matei Zaharia	51db4c1f30	Merge pull request #453 from markhamstra/commutative Change docs on 'reduce' since the merging of local reduces no longer pre...	2013-02-09 10:36:30 -08:00

2556 commits