ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Tathagata Das	2c87c853ba	Renamed examples	2012-10-22 15:31:19 -07:00
Thomas Dudziak	f595bb53d1	Tweaked run file to live more happily with typesafe's debian package	2012-10-22 13:11:05 -07:00
Matei Zaharia	0967e71a00	Bump up version to 0.7.0-SNAPSHOT for master branch	2012-10-22 11:49:42 -07:00
Matei Zaharia	902a608187	Update version to 0.6.1-SNAPSHOT to show this is in development	2012-10-22 11:43:57 -07:00
Josh Rosen	d4f2e5b0ef	Remove PYTHONPATH from SparkContext's executorEnvs. It makes more sense to pass it in the dictionary of environment variables that is used to construct PythonRDD.	2012-10-22 10:28:59 -07:00
Tathagata Das	d85c66636b	Added MapValueDStream, FlatMappedValuesDStream and CoGroupedDStream, and therefore DStream operations mapValue, flatMapValues, cogroup, and join. Also, added tests for DStream operations filter, glom, mapPartitions, groupByKey, mapValues, flatMapValues, cogroup, and join.	2012-10-21 17:40:08 -07:00
Tathagata Das	c4a2b6f636	Fixed some bugs in tests for forgetting RDDs, and made sure that use of manual clock leads to a zeroTime of 0 in the DStreams (more intuitive).	2012-10-21 10:41:25 -07:00
Matei Zaharia	1be335e8fa	Merge branch 'master' into dev	2012-10-21 00:05:02 -07:00
Matei Zaharia	15e95be2fd	Merge pull request #285 from tomdz/cdh4-dev Support for Hadoop 2 distributions such as cdh4	2012-10-20 23:35:01 -07:00
Matei Zaharia	6999724ce8	Fix a path in the web UI	2012-10-20 23:33:37 -07:00
Patrick Wendell	45430f2cb9	Merge pull request #290 from pwendell/dev Two trivial commits to test JIRA integration	2012-10-19 23:18:44 -07:00
Patrick Wendell	cd0936529b	SPARK-581 #resolve Removing whitespace to test JIRA	2012-10-19 23:17:44 -07:00
Patrick Wendell	d50028b345	Adding whitespace to test JIRA integration	2012-10-19 23:17:44 -07:00
Josh Rosen	c23bf1aff4	Add PySpark README and run scripts.	2012-10-20 00:22:27 +00:00
Tathagata Das	6d5eb4b40c	Added functionality to forget RDDs from DStreams.	2012-10-19 12:11:44 -07:00
Josh Rosen	52989c8a2c	Update Python API for v0.6.0 compatibility.	2012-10-19 10:24:49 -07:00
Josh Rosen	e21eb6e00d	Merge tag 'v0.6.0' into python-api	2012-10-19 09:44:32 -07:00
Matei Zaharia	bff5ceff53	Merge pull request #287 from rxin/startslave Use SPARK_MASTER_IP if it is set in start-slaves.sh.	2012-10-19 01:12:05 -07:00
Reynold Xin	f67bcbed07	Use SPARK_MASTER_IP if it is set in start-slaves.sh.	2012-10-19 01:08:23 -07:00
Thomas Dudziak	d9c2a89c57	Support for Hadoop 2 distributions such as cdh4	2012-10-18 16:08:54 -07:00
Josh Rosen	365a4c1e68	Allow EC2 script to stop/destroy cluster after master/slave failures.	2012-10-18 10:36:50 -07:00
Reynold Xin	4a3fb06ac2	Updated Kryo to 2.20.	2012-10-16 01:10:01 -07:00
Reynold Xin	3b97124604	Changed Spark version back to 0.6.0	2012-10-15 21:39:51 -07:00
Reynold Xin	63fae9bc23	Serialize accumulator updates in TaskResult for local mode.	2012-10-15 21:38:28 -07:00
Reynold Xin	9087a1abef	Changed version to 0.6.0-rxin.	2012-10-15 13:54:04 -07:00
Tathagata Das	b760d6426a	Minor modifications.	2012-10-15 12:26:44 -07:00
Matei Zaharia	388a111153	Fix sbt assembly's merge rules	2012-10-15 10:21:16 -07:00
Reynold Xin	42d20fa8da	Added a method to report slave memory status.	2012-10-14 22:30:53 -07:00
Tathagata Das	3f1aae5c71	Refactored DStreamSuiteBase to create CheckpointSuite- testsuite for testing checkpointing under different operations.	2012-10-14 21:39:30 -07:00
Matei Zaharia	63fe4e9d33	Merge pull request #279 from pwendell/dev Removing credentials line in build.	2012-10-14 19:36:41 -07:00
Patrick Wendell	629dd2691e	Removing credentials line in build.	2012-10-14 19:33:39 -07:00
Matei Zaharia	f8768da418	Comment out PGP stuff for publish-local to work	2012-10-14 17:37:21 -07:00
Matei Zaharia	1f06445b03	tweak	2012-10-14 12:04:58 -07:00
Matei Zaharia	4947bd0958	tweak	2012-10-14 12:02:58 -07:00
Matei Zaharia	6c766a9187	tweak	2012-10-14 12:02:32 -07:00
Matei Zaharia	8192fe0325	Merge branch 'dev' of github.com:mesos/spark into dev	2012-10-14 12:01:38 -07:00
Matei Zaharia	1c73d8974d	Update README	2012-10-14 12:00:25 -07:00
Matei Zaharia	7855bacd26	Merge pull request #278 from pwendell/quickstart-fix Adding dependency repos in quickstart example	2012-10-14 11:52:24 -07:00
Patrick Wendell	7a03a0e35d	Adding dependency repos in quickstart example	2012-10-14 11:48:24 -07:00
Matei Zaharia	64dbf8d372	Made ShuffleDependency automatically find a shuffle ID for itself	2012-10-14 10:00:22 -07:00
Matei Zaharia	64b52166ee	Changed default Hadoop version back to 0.20.205	2012-10-14 09:51:34 -07:00
Tathagata Das	b08708e6fc	Fixed bugs in the streaming testsuites.	2012-10-13 21:02:24 -07:00
Tathagata Das	e95ff45b53	Implemented checkpointing of StreamingContext and DStream graph.	2012-10-13 20:10:49 -07:00
Matei Zaharia	4be12d97ec	Some doc fixes, including showing version number in nav bar again	2012-10-13 19:05:11 -07:00
Matei Zaharia	19910c00c3	tweaks	2012-10-13 16:22:39 -07:00
Matei Zaharia	4a3e9cf69c	Document how to configure SPARK_MEM & co on a per-job basis	2012-10-13 16:20:25 -07:00
Matei Zaharia	ce6b5a3ee5	Uncomment Maven publishing stuff and set version to 0.6.0	2012-10-13 15:55:39 -07:00
Matei Zaharia	8815aeba0c	Take executor environment vars as an arguemnt to SparkContext	2012-10-13 15:31:11 -07:00
Josh Rosen	33cd3a0c12	Remove map-side combining from ShuffleMapTask. This separation of concerns simplifies the ShuffleDependency and ShuffledRDD interfaces. Map-side combining can be performed in a mapPartitions() call prior to shuffling the RDD. I don't anticipate this having much of a performance impact: in both approaches, each tuple is hashed twice: once in the bucket partitioning and once in the combiner's hashtable. The same steps are being performed, but in a different order and through one extra Iterator.	2012-10-13 14:59:20 -07:00
Josh Rosen	10bcd217d2	Remove mapSideCombine field from Aggregator. Instead, the presence or absense of a ShuffleDependency's aggregator will control whether map-side combining is performed.	2012-10-13 14:59:20 -07:00

19359 commits