ODIn/spark-instrumented-optimizer

Author	SHA1	Message	Date
Tathagata Das	72d2e1dd77	Fixed bug in Java transformWith, added more Java testcases for transform and transformWith, added missing variations of Java join and cogroup, updated various Scala and Java API docs.	2013-10-22 23:35:51 -07:00
Tathagata Das	0666498799	Updated TransformDStream to allow n-ary DStream transform. Added transformWith, leftOuterJoin and rightOuterJoin operations to DStream for Scala and Java APIs. Also added n-ary union and n-ary transform operations to StreamingContext for Scala and Java APIs.	2013-10-21 05:34:09 -07:00
Matei Zaharia	b5346064d6	Merge pull request #8 from vchekan/checkpoint-ttl-restore Serialize and restore spark.cleaner.ttl to savepoint In accordance to conversation in spark-dev maillist, preserve spark.cleaner.ttl parameter when serializing checkpoint.	2013-10-15 21:25:03 -07:00
Aaron Davidson	a395911138	Refactor BlockId into an actual type This is an unfortunately invasive change which converts all of our BlockId strings into actual BlockId types. Here are some advantages of doing this now: + Type safety + Code clarity - it's now obvious what the key of a shuffle or rdd block is, for instance. Additionally, appearing in tuple/map type signatures is a big readability bonus. A Seq[(String, BlockStatus)] is not very clear. Further, we can now use more Scala features, like matching on BlockId types. + Explicit usage - we can now formally tell where various BlockIds are being used (without doing string searches); this makes updating current BlockIds a much clearer process, and compiler-supported. (I'm looking at you, shuffle file consolidation.) + It will only get harder to make this change as time goes on. Since this touches a lot of files, it'd be best to either get this patch in quickly or throw it on the ground to avoid too many secondary merge conflicts.	2013-10-12 22:44:57 -07:00
Vadim Chekan	fbe40c5806	Serialize and restore spark.cleaner.ttl to savepoint	2013-09-20 12:13:48 -07:00
Matei Zaharia	0a8cc30921	Move some classes to more appropriate packages: * RDD, RDDFunctions -> org.apache.spark.rdd Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util * JavaSerializer, KryoSerializer -> org.apache.spark.serializer	2013-09-01 14:13:16 -07:00
Matei Zaharia	46eecd110a	Initial work to rename package to org.apache.spark	2013-09-01 14:13:13 -07:00
Matei Zaharia	5a6ac12840	Merge pull request #701 from ScrapCodes/documentation-suggestions Documentation suggestions for spark streaming.	2013-08-22 22:08:03 -07:00
Prashant Sharma	2bc348e92c	Linking custom receiver guide	2013-08-23 09:44:02 +05:30
Prashant Sharma	3049415e24	Corrections in documentation comment	2013-08-23 09:40:28 +05:30
Josh Rosen	d7f78b443b	Change scala.Option to Guava Optional in Java APIs.	2013-08-11 12:05:09 -07:00
Reynold Xin	c61843a69f	Changed other LZF uses to use the compression codec interface.	2013-07-31 10:32:13 -07:00
Matei Zaharia	af3c9d5042	Add Apache license headers and LICENSE and NOTICE files	2013-07-16 17:21:33 -07:00
Shivaram Venkataraman	3350ad0d7f	Catch RejectedExecution exception in Checkpoint handler.	2013-07-07 04:09:37 -07:00
Matei Zaharia	1ffadb2d9e	Merge remote-tracking branch 'pwendell/ui-updates' Conflicts: core/src/main/scala/spark/scheduler/DAGScheduler.scala core/src/main/scala/spark/util/AkkaUtils.scala pom.xml	2013-07-06 15:51:41 -07:00
Matei Zaharia	94871e4703	Merge pull request #655 from tgravescs/master Add support for running Spark on Yarn on a secure Hadoop Cluster	2013-07-06 15:26:19 -07:00
Tathagata Das	280418ac45	Reduced the number of Iterator to ArrayBuffer copies in NetworkReceiver.	2013-07-05 21:38:21 -07:00
Y.CORP.YAHOO.COM\tgraves	923cf92900	Rework from pull request. Removed --user option from Spark on Yarn Client, made the user of JAVA_HOME environment variable conditional on if its set, and created addCredentials in each of the SparkHadoopUtil classes to only add the credentials when the profile is hadoop2-yarn.	2013-07-02 21:18:59 -05:00
Matei Zaharia	4358acfe07	Initialize Twitter4J OAuth from system properties instead of prompting	2013-06-29 15:25:06 -07:00
Matei Zaharia	1667158544	Merge remote-tracking branch 'mrpotes/master'	2013-06-29 14:36:09 -07:00
Patrick Wendell	362d996c81	Handful of changes based on matei's review - Avoid exception when no tasks have finished for a stage - Adding DOCTYPE so css renders properly - Adding progress slider	2013-06-27 19:14:28 -07:00
James Phillpotts	366572edca	Include a default OAuth implementation, and update examples and JavaStreamingContext	2013-06-25 22:59:34 +01:00
Tathagata Das	c89af0a7f9	Merge branch 'master' into streaming Conflicts: .gitignore	2013-06-24 23:57:47 -07:00
Tathagata Das	48c7e373c6	Minor formatting fixes	2013-06-24 23:11:04 -07:00
Tathagata Das	1249e9153b	Merge pull request #572 from Reinvigorate/sm-block-interval Adding spark.streaming.blockInterval property	2013-06-24 21:46:33 -07:00
Tathagata Das	cfcda95f86	Merge pull request #571 from Reinvigorate/sm-kafka-serializers Surfacing decoders on KafkaInputDStream	2013-06-24 21:44:50 -07:00
James Phillpotts	8955787a59	Twitter API v1 is retired - username/password auth no longer possible	2013-06-24 09:15:17 +01:00
James Phillpotts	93a1643405	Allow other twitter authorizations than username/password	2013-06-21 14:21:52 +01:00
Thomas Graves	75d78c7ac9	Add support for Spark on Yarn on a secure Hadoop cluster	2013-06-19 11:18:42 -05:00
seanm	f25282def5	fixing kafkaStream Java API and adding test	2013-05-10 17:34:28 -06:00
seanm	3632980b1b	fixing indentation	2013-05-10 15:54:26 -06:00
seanm	b95c1bdbba	count() now uses a transform instead of ConstantInputDStream	2013-05-10 12:47:24 -06:00
Reynold Xin	90577ada69	Merge branch 'shuffle-performance-fix-0.7' of github.com:shane-huang/spark into shufflemerge Conflicts: core/src/main/scala/spark/storage/BlockManager.scala core/src/main/scala/spark/storage/DiskStore.scala project/SparkBuild.scala	2013-05-07 15:56:19 -07:00
Mridul Muralidharan	430c531464	Remove debug statements	2013-04-29 00:24:30 +05:30
Mridul Muralidharan	3a89a76b87	Make log message more descriptive to aid in debugging	2013-04-29 00:04:12 +05:30
Mridul Muralidharan	7fa6978a1e	Allow CheckpointWriter pending tasks to finish	2013-04-28 23:08:10 +05:30
Mridul Muralidharan	afee902443	Attempt to fix streaming test failures after yarn branch merge	2013-04-28 22:26:45 +05:30
seanm	7e56e99573	Surfacing decoders on KafkaInputDStream	2013-04-16 17:17:16 -06:00
seanm	ab0f834dbb	adding spark.streaming.blockInterval property	2013-04-16 11:57:05 -06:00
seanm	b42d68c8ce	fixing Spark Streaming count() so that 0 will be emitted when there is nothing to count	2013-04-15 12:54:55 -06:00
shane-huang	df47b40b76	Shuffle Performance fix: Use netty embeded OIO file server instead of ConnectionManager Shuffle Performance Optimization: do not send 0-byte block requests to reduce network messages change reference from io.Source to scala.io.Source to avoid looking into io.netty package Signed-off-by: shane-huang <shengsheng.huang@intel.com>	2013-04-07 14:37:12 +08:00
seanm	329ef34c2e	fixing autooffset.reset behavior when set to 'largest'	2013-03-26 23:56:15 -06:00
Holden Karau	1f5381119f	method first in trait IterableLike is deprecated: use `head' instead	2013-03-24 19:19:40 -07:00
seanm	d61978d0ab	keeping JavaStreamingContext in sync with StreamingContext + adding comments for better clarity	2013-03-15 23:36:52 -06:00
seanm	33fa1e7e4a	removing dependency on ZookeeperConsumerConnector + purging last relic of kafka reliability that never solidified (ie- setOffsets)	2013-03-15 00:10:13 -06:00
seanm	d069283211	fixing memory leak in kafka MessageHandler	2013-03-14 23:45:33 -06:00
seanm	cfa8e769a8	KafkaInputDStream improvements. Allows more Kafka configurability	2013-03-14 23:45:19 -06:00
Stephen Haberman	0cf320485d	Forgot equals.	2013-03-12 00:05:35 -05:00
Stephen Haberman	9e68f48625	More quickly call close in HadoopRDD. This also refactors out the common "gotNext" iterator pattern into a shared utility class.	2013-03-11 23:59:17 -05:00
Matei Zaharia	4223be3aa4	Merge pull request #503 from pwendell/bug-fix createNewSparkContext should use sparkHome/jars/environment.	2013-02-25 19:43:05 -08:00


286 commits