Tathagata Das
72d2e1dd77
Fixed bug in Java transformWith, added more Java testcases for transform and transformWith, added missing variations of Java join and cogroup, updated various Scala and Java API docs.
2013-10-22 23:35:51 -07:00
Tathagata Das
0666498799
Updated TransformDStream to allow n-ary DStream transform. Added transformWith, leftOuterJoin and rightOuterJoin operations to DStream for Scala and Java APIs. Also added n-ary union and n-ary transform operations to StreamingContext for Scala and Java APIs.
2013-10-21 05:34:09 -07:00
Matei Zaharia
b5346064d6
Merge pull request #8 from vchekan/checkpoint-ttl-restore
...
Serialize and restore spark.cleaner.ttl to savepoint
In accordance with the conversation on the spark-dev mailing list, preserve the spark.cleaner.ttl parameter when serializing a checkpoint.
2013-10-15 21:25:03 -07:00
Aaron Davidson
a395911138
Refactor BlockId into an actual type
...
This is an unfortunately invasive change which converts all of our BlockId
strings into actual BlockId types. Here are some advantages of doing this now:
+ Type safety
+ Code clarity - it's now obvious what the key of a shuffle or rdd block is,
for instance. Additionally, appearing in tuple/map type signatures is a big
readability bonus. A Seq[(String, BlockStatus)] is not very clear.
Further, we can now use more Scala features, like matching on BlockId types.
+ Explicit usage - we can now formally tell where various BlockIds are being used
(without doing string searches); this makes updating current BlockIds a much
clearer process, and compiler-supported.
(I'm looking at you, shuffle file consolidation.)
+ It will only get harder to make this change as time goes on.
Since this touches a lot of files, it'd be best to either get this patch
in quickly or throw it on the ground to avoid too many secondary merge conflicts.
2013-10-12 22:44:57 -07:00
Vadim Chekan
fbe40c5806
Serialize and restore spark.cleaner.ttl to savepoint
2013-09-20 12:13:48 -07:00
Matei Zaharia
0a8cc30921
Move some classes to more appropriate packages:
...
* RDD, *RDDFunctions -> org.apache.spark.rdd
* Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
* JavaSerializer, KryoSerializer -> org.apache.spark.serializer
2013-09-01 14:13:16 -07:00
Matei Zaharia
46eecd110a
Initial work to rename package to org.apache.spark
2013-09-01 14:13:13 -07:00
Matei Zaharia
5a6ac12840
Merge pull request #701 from ScrapCodes/documentation-suggestions
...
Documentation suggestions for spark streaming.
2013-08-22 22:08:03 -07:00
Prashant Sharma
2bc348e92c
Linking custom receiver guide
2013-08-23 09:44:02 +05:30
Prashant Sharma
3049415e24
Corrections in documentation comment
2013-08-23 09:40:28 +05:30
Josh Rosen
d7f78b443b
Change scala.Option to Guava Optional in Java APIs.
2013-08-11 12:05:09 -07:00
Reynold Xin
c61843a69f
Changed other LZF uses to use the compression codec interface.
2013-07-31 10:32:13 -07:00
Matei Zaharia
af3c9d5042
Add Apache license headers and LICENSE and NOTICE files
2013-07-16 17:21:33 -07:00
Shivaram Venkataraman
3350ad0d7f
Catch RejectedExecution exception in Checkpoint handler.
2013-07-07 04:09:37 -07:00
Matei Zaharia
1ffadb2d9e
Merge remote-tracking branch 'pwendell/ui-updates'
...
Conflicts:
core/src/main/scala/spark/scheduler/DAGScheduler.scala
core/src/main/scala/spark/util/AkkaUtils.scala
pom.xml
2013-07-06 15:51:41 -07:00
Matei Zaharia
94871e4703
Merge pull request #655 from tgravescs/master
...
Add support for running Spark on Yarn on a secure Hadoop Cluster
2013-07-06 15:26:19 -07:00
Tathagata Das
280418ac45
Reduced the number of Iterator to ArrayBuffer copies in NetworkReceiver.
2013-07-05 21:38:21 -07:00
Y.CORP.YAHOO.COM\tgraves
923cf92900
Rework from pull request. Removed --user option from Spark on Yarn Client, made the use of JAVA_HOME environment
...
variable conditional on whether it is set, and created addCredentials in each of the SparkHadoopUtil classes
to only add the credentials when the profile is hadoop2-yarn.
2013-07-02 21:18:59 -05:00
Matei Zaharia
4358acfe07
Initialize Twitter4J OAuth from system properties instead of prompting
2013-06-29 15:25:06 -07:00
Matei Zaharia
1667158544
Merge remote-tracking branch 'mrpotes/master'
2013-06-29 14:36:09 -07:00
Patrick Wendell
362d996c81
Handful of changes based on Matei's review
...
- Avoid exception when no tasks have finished for a stage
- Adding DOCTYPE so css renders properly
- Adding progress slider
2013-06-27 19:14:28 -07:00
James Phillpotts
366572edca
Include a default OAuth implementation, and update examples and JavaStreamingContext
2013-06-25 22:59:34 +01:00
Tathagata Das
c89af0a7f9
Merge branch 'master' into streaming
...
Conflicts:
.gitignore
2013-06-24 23:57:47 -07:00
Tathagata Das
48c7e373c6
Minor formatting fixes
2013-06-24 23:11:04 -07:00
Tathagata Das
1249e9153b
Merge pull request #572 from Reinvigorate/sm-block-interval
...
Adding spark.streaming.blockInterval property
2013-06-24 21:46:33 -07:00
Tathagata Das
cfcda95f86
Merge pull request #571 from Reinvigorate/sm-kafka-serializers
...
Surfacing decoders on KafkaInputDStream
2013-06-24 21:44:50 -07:00
James Phillpotts
8955787a59
Twitter API v1 is retired - username/password auth no longer possible
2013-06-24 09:15:17 +01:00
James Phillpotts
93a1643405
Allow other twitter authorizations than username/password
2013-06-21 14:21:52 +01:00
Thomas Graves
75d78c7ac9
Add support for Spark on Yarn on a secure Hadoop cluster
2013-06-19 11:18:42 -05:00
seanm
f25282def5
fixing kafkaStream Java API and adding test
2013-05-10 17:34:28 -06:00
seanm
3632980b1b
fixing indentation
2013-05-10 15:54:26 -06:00
seanm
b95c1bdbba
count() now uses a transform instead of ConstantInputDStream
2013-05-10 12:47:24 -06:00
Reynold Xin
90577ada69
Merge branch 'shuffle-performance-fix-0.7' of github.com:shane-huang/spark into shufflemerge
...
Conflicts:
core/src/main/scala/spark/storage/BlockManager.scala
core/src/main/scala/spark/storage/DiskStore.scala
project/SparkBuild.scala
2013-05-07 15:56:19 -07:00
Mridul Muralidharan
430c531464
Remove debug statements
2013-04-29 00:24:30 +05:30
Mridul Muralidharan
3a89a76b87
Make log message more descriptive to aid in debugging
2013-04-29 00:04:12 +05:30
Mridul Muralidharan
7fa6978a1e
Allow CheckpointWriter pending tasks to finish
2013-04-28 23:08:10 +05:30
Mridul Muralidharan
afee902443
Attempt to fix streaming test failures after yarn branch merge
2013-04-28 22:26:45 +05:30
seanm
7e56e99573
Surfacing decoders on KafkaInputDStream
2013-04-16 17:17:16 -06:00
seanm
ab0f834dbb
adding spark.streaming.blockInterval property
2013-04-16 11:57:05 -06:00
seanm
b42d68c8ce
fixing Spark Streaming count() so that 0 will be emitted when there is nothing to count
2013-04-15 12:54:55 -06:00
shane-huang
df47b40b76
Shuffle Performance fix: Use netty embedded OIO file server instead of ConnectionManager
...
Shuffle Performance Optimization: do not send 0-byte block requests to reduce network messages
change reference from io.Source to scala.io.Source to avoid looking into io.netty package
Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-04-07 14:37:12 +08:00
seanm
329ef34c2e
fixing autooffset.reset behavior when set to 'largest'
2013-03-26 23:56:15 -06:00
Holden Karau
1f5381119f
method first in trait IterableLike is deprecated: use `head' instead
2013-03-24 19:19:40 -07:00
seanm
d61978d0ab
keeping JavaStreamingContext in sync with StreamingContext + adding comments for better clarity
2013-03-15 23:36:52 -06:00
seanm
33fa1e7e4a
removing dependency on ZookeeperConsumerConnector + purging last relic of kafka reliability that never solidified (i.e., setOffsets)
2013-03-15 00:10:13 -06:00
seanm
d069283211
fixing memory leak in kafka MessageHandler
2013-03-14 23:45:33 -06:00
seanm
cfa8e769a8
KafkaInputDStream improvements. Allows more Kafka configurability
2013-03-14 23:45:19 -06:00
Stephen Haberman
0cf320485d
Forgot equals.
2013-03-12 00:05:35 -05:00
Stephen Haberman
9e68f48625
More quickly call close in HadoopRDD.
...
This also refactors out the common "gotNext" iterator pattern into
a shared utility class.
2013-03-11 23:59:17 -05:00
Matei Zaharia
4223be3aa4
Merge pull request #503 from pwendell/bug-fix
...
createNewSparkContext should use sparkHome/jars/environment.
2013-02-25 19:43:05 -08:00