Commit graph

337 commits

Author SHA1 Message Date
Reynold Xin 4e44d65b5e Exclusion rules for Maven build files. 2013-10-19 12:35:55 -07:00
Matei Zaharia b5346064d6 Merge pull request #8 from vchekan/checkpoint-ttl-restore
Serialize and restore spark.cleaner.ttl to savepoint

In accordance to conversation in spark-dev maillist, preserve spark.cleaner.ttl parameter when serializing checkpoint.
2013-10-15 21:25:03 -07:00
Aaron Davidson a395911138 Refactor BlockId into an actual type
This is an unfortunately invasive change which converts all of our BlockId
strings into actual BlockId types. Here are some advantages of doing this now:

+ Type safety

+ Code clarity - it's now obvious what the key of a shuffle or rdd block is,
  for instance. Additionally, appearing in tuple/map type signatures is a big
  readability bonus. A Seq[(String, BlockStatus)] is not very clear.
  Further, we can now use more Scala features, like matching on BlockId types.

+ Explicit usage - we can now formally tell where various BlockIds are being used
  (without doing string searches); this makes updating current BlockIds a much
  clearer process, and compiler-supported.
  (I'm looking at you, shuffle file consolidation.)

+ It will only get harder to make this change as time goes on.

Since this touches a lot of files, it'd be best to either get this patch
in quickly or throw it on the ground to avoid too many secondary merge conflicts.
2013-10-12 22:44:57 -07:00
Patrick Wendell aa9fb84994 Merging build changes in from 0.8 2013-10-05 22:07:00 -07:00
Patrick Wendell 6079721fa1 Update build version in master 2013-09-24 11:41:51 -07:00
Vadim Chekan fbe40c5806 Serialize and restore spark.cleaner.ttl to savepoint 2013-09-20 12:13:48 -07:00
Matei Zaharia 0a8cc30921 Move some classes to more appropriate packages:
* RDD, *RDDFunctions -> org.apache.spark.rdd
* Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
* JavaSerializer, KryoSerializer -> org.apache.spark.serializer
2013-09-01 14:13:16 -07:00
Matei Zaharia 5701eb92c7 Fix some URLs 2013-09-01 14:13:16 -07:00
Matei Zaharia 46eecd110a Initial work to rename package to org.apache.spark 2013-09-01 14:13:13 -07:00
Matei Zaharia 5a6ac12840 Merge pull request #701 from ScrapCodes/documentation-suggestions
Documentation suggestions for spark streaming.
2013-08-22 22:08:03 -07:00
Prashant Sharma 2bc348e92c Linking custom receiver guide 2013-08-23 09:44:02 +05:30
Prashant Sharma 3049415e24 Corrections in documentation comment 2013-08-23 09:40:28 +05:30
Jey Kottalam 23f4622aff Remove redundant dependencies from POMs 2013-08-18 18:53:57 -07:00
Jey Kottalam ad580b94d5 Maven build now also works with YARN 2013-08-16 13:50:12 -07:00
Jey Kottalam 11b42a84db Maven build now works with CDH hadoop-2.0.0-mr1 2013-08-16 13:50:12 -07:00
Jey Kottalam 353fab2440 Initial changes to make Maven build agnostic of hadoop version 2013-08-16 13:50:12 -07:00
Josh Rosen d7f78b443b Change scala.Option to Guava Optional in Java APIs. 2013-08-11 12:05:09 -07:00
Reynold Xin c61843a69f Changed other LZF uses to use the compression codec interface. 2013-07-31 10:32:13 -07:00
Matei Zaharia af3c9d5042 Add Apache license headers and LICENSE and NOTICE files 2013-07-16 17:21:33 -07:00
Matei Zaharia 7dcda9ae74 Merge pull request #688 from markhamstra/scalaDependencies
Fixed SPARK-795 with explicit dependencies
2013-07-08 23:24:23 -07:00
Mark Hamstra 0b39d66f3f pom cleanup 2013-07-08 16:07:09 -07:00
Mark Hamstra afdaf430bd Explicit dependencies for scala-library and scalap to prevent 2.9.2 vs. 2.9.3 problems 2013-07-08 15:40:50 -07:00
Shivaram Venkataraman 3350ad0d7f Catch RejectedExecution exception in Checkpoint handler. 2013-07-07 04:09:37 -07:00
Matei Zaharia 1ffadb2d9e Merge remote-tracking branch 'pwendell/ui-updates'
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	pom.xml
2013-07-06 15:51:41 -07:00
Matei Zaharia 94871e4703 Merge pull request #655 from tgravescs/master
Add support for running Spark on Yarn on a secure Hadoop Cluster
2013-07-06 15:26:19 -07:00
Tathagata Das 280418ac45 Reduced the number of Iterator to ArrayBuffer copies in NetworkReceiver. 2013-07-05 21:38:21 -07:00
Y.CORP.YAHOO.COM\tgraves 923cf92900 Rework from pull request. Removed --user option from Spark on Yarn Client, made the user of JAVA_HOME environment
variable conditional on if its set, and created addCredentials in each of the SparkHadoopUtil classes
to only add the credentials when the profile is hadoop2-yarn.
2013-07-02 21:18:59 -05:00
Matei Zaharia 4358acfe07 Initialize Twitter4J OAuth from system properties instead of prompting 2013-06-29 15:25:06 -07:00
Matei Zaharia 1667158544 Merge remote-tracking branch 'mrpotes/master' 2013-06-29 14:36:09 -07:00
Patrick Wendell 362d996c81 Handful of changes based on matei's review
- Avoid exception when no tasks have finished for a stage
- Adding DOCTYPE so css renders properly
- Adding progress slider
2013-06-27 19:14:28 -07:00
James Phillpotts 366572edca Include a default OAuth implementation, and update examples and JavaStreamingContext 2013-06-25 22:59:34 +01:00
Tathagata Das c89af0a7f9 Merge branch 'master' into streaming
Conflicts:
	.gitignore
2013-06-24 23:57:47 -07:00
Tathagata Das 48c7e373c6 Minor formatting fixes 2013-06-24 23:11:04 -07:00
Tathagata Das 1249e9153b Merge pull request #572 from Reinvigorate/sm-block-interval
Adding spark.streaming.blockInterval property
2013-06-24 21:46:33 -07:00
Tathagata Das cfcda95f86 Merge pull request #571 from Reinvigorate/sm-kafka-serializers
Surfacing decoders on KafkaInputDStream
2013-06-24 21:44:50 -07:00
James Phillpotts 8955787a59 Twitter API v1 is retired - username/password auth no longer possible 2013-06-24 09:15:17 +01:00
James Phillpotts 93a1643405 Allow other twitter authorizations than username/password 2013-06-21 14:21:52 +01:00
Thomas Graves 75d78c7ac9 Add support for Spark on Yarn on a secure Hadoop cluster 2013-06-19 11:18:42 -05:00
Jey Kottalam e7982c798e Exclude old versions of Netty from Maven-based build 2013-05-18 21:24:58 -07:00
seanm f25282def5 fixing kafkaStream Java API and adding test 2013-05-10 17:34:28 -06:00
seanm 3632980b1b fixing indentation 2013-05-10 15:54:26 -06:00
seanm b95c1bdbba count() now uses a transform instead of ConstantInputDStream 2013-05-10 12:47:24 -06:00
seanm d761e7359d adding kafkaStream API tests 2013-05-10 12:05:10 -06:00
Reynold Xin 90577ada69 Merge branch 'shuffle-performance-fix-0.7' of github.com:shane-huang/spark into shufflemerge
Conflicts:
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/DiskStore.scala
	project/SparkBuild.scala
2013-05-07 15:56:19 -07:00
Mridul Muralidharan 430c531464 Remove debug statements 2013-04-29 00:24:30 +05:30
Mridul Muralidharan 3a89a76b87 Make log message more descriptive to aid in debugging 2013-04-29 00:04:12 +05:30
Mridul Muralidharan 7fa6978a1e Allow CheckpointWriter pending tasks to finish 2013-04-28 23:08:10 +05:30
Mridul Muralidharan afee902443 Attempt to fix streaming test failures after yarn branch merge 2013-04-28 22:26:45 +05:30
Mridul Muralidharan dd515ca3ee Attempt at fixing merge conflict 2013-04-24 09:24:17 +05:30
seanm 7e56e99573 Surfacing decoders on KafkaInputDStream 2013-04-16 17:17:16 -06:00