Commit graph

284 commits

Author SHA1 Message Date
Mridul Muralidharan 430c531464 Remove debug statements 2013-04-29 00:24:30 +05:30
Mridul Muralidharan 3a89a76b87 Make log message more descriptive to aid in debugging 2013-04-29 00:04:12 +05:30
Mridul Muralidharan 7fa6978a1e Allow CheckpointWriter pending tasks to finish 2013-04-28 23:08:10 +05:30
Mridul Muralidharan afee902443 Attempt to fix streaming test failures after yarn branch merge 2013-04-28 22:26:45 +05:30
Mridul Muralidharan dd515ca3ee Attempt at fixing merge conflict 2013-04-24 09:24:17 +05:30
Matei Zaharia 65caa8f711 Merge remote-tracking branch 'jey/bump-development-version-to-0.8.0'
Conflicts:
	docs/_config.yml
	project/SparkBuild.scala
2013-04-08 12:43:17 -04:00
Mridul Muralidharan 6798a09df8 Add support for building against hadoop2-yarn: add a new Maven profile for it 2013-04-07 17:47:38 +05:30
Jey Kottalam bc8ba222ff Bump development version to 0.8.0 2013-03-28 15:42:01 -07:00
Jey Kottalam b569b3f200 Move streaming test initialization into 'before' blocks 2013-03-28 15:08:41 -07:00
Holden Karau 1f5381119f method first in trait IterableLike is deprecated: use `head' instead 2013-03-24 19:19:40 -07:00
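The fix called for by that deprecation warning is mechanical; a minimal sketch, with an illustrative collection, of the deprecated call and its replacement:

    val batch: Iterable[Int] = Seq(1, 2, 3)

    // Deprecated: IterableLike.first
    // val oldest = batch.first

    // Replacement: head returns the first element (and throws on an empty collection)
    val oldest = batch.head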
Mikhail Bautin 7fd2708eda Add a log4j compile dependency to fix build in IntelliJ
Also rename parent project to spark-parent (otherwise it shows up as
"parent" in IntelliJ, which is very confusing).
2013-03-15 11:41:51 -07:00
Stephen Haberman 0cf320485d Forgot equals. 2013-03-12 00:05:35 -05:00
Stephen Haberman 9e68f48625 More quickly call close in HadoopRDD.
This also refactors out the common "gotNext" iterator pattern into
a shared utility class.
2013-03-11 23:59:17 -05:00
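A rough sketch of the "gotNext" iterator pattern mentioned above (the class name is illustrative, not necessarily the utility class Spark adopted): compute the next element lazily, remember whether one is buffered, and invoke a close hook exactly once when the source is exhausted.

    // Illustrative sketch: subclasses implement getNext() and signal exhaustion
    // via `finished`; close() runs exactly once when the source is drained.
    abstract class NextIteratorSketch[T] extends Iterator[T] {
      private var gotNext = false    // is nextValue currently buffered?
      private var finished = false   // has the underlying source been exhausted?
      private var closed = false
      protected var nextValue: T = _

      /** Compute the next value, setting `finished = true` when there is none. */
      protected def getNext(): T
      /** Release underlying resources (e.g. a Hadoop RecordReader). */
      protected def close(): Unit

      private def closeIfNeeded(): Unit = {
        if (!closed) { closed = true; close() }
      }

      override def hasNext: Boolean = {
        if (!finished && !gotNext) {
          nextValue = getNext()
          if (finished) closeIfNeeded()
          gotNext = true
        }
        !finished
      }

      override def next(): T = {
        if (!hasNext) throw new NoSuchElementException("End of stream")
        gotNext = false
        nextValue
      }
    }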
Mark Hamstra b409073102 Instead of failing to bind to a fixed, already-in-use port, let the OS choose an available port for TestServer. 2013-03-01 15:05:07 -08:00
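The usual way to avoid a fixed, already-in-use port is to bind to port 0 and read back whichever port the OS chose; a minimal sketch of that idea, not the actual TestServer code:

    import java.net.ServerSocket

    // Bind to port 0 so the OS assigns any free ephemeral port,
    // avoiding "address already in use" failures on a fixed port.
    val serverSocket = new ServerSocket(0)
    val boundPort = serverSocket.getLocalPort  // the port the OS actually chose
    println("TestServer-style socket listening on port " + boundPort)
    serverSocket.close()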
Mark Hamstra 8b06b359da Bump version to 0.7.1-SNAPSHOT in the subproject POMs to keep the Maven build building. 2013-02-28 23:34:34 -08:00
Matei Zaharia 4223be3aa4 Merge pull request #503 from pwendell/bug-fix
createNewSparkContext should use sparkHome/jars/environment.
2013-02-25 19:43:05 -08:00
Patrick Wendell 284ba90958 createNewSparkContext should use sparkHome/jars/environment.
This fixes a bug introduced by Matei's recent change.
2013-02-25 19:40:52 -08:00
Matei Zaharia 5d7b591cfe Pass a code JAR to SparkContext in our examples. Fixes SPARK-594. 2013-02-25 19:34:32 -08:00
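A sketch of the pattern this commit applies to the examples, assuming the 0.7-era constructor SparkContext(master, jobName, sparkHome, jars) and the SPARK_EXAMPLES_JAR environment variable used by the examples of that period:

    import spark.SparkContext

    // Sketch: hand the built examples JAR to SparkContext so workers can load
    // the example's classes. Assumes the 0.7-era four-argument constructor.
    val sc = new SparkContext(
      "local[2]",                              // master URL
      "ExampleJob",                            // job name shown in the UI
      System.getenv("SPARK_HOME"),             // Spark installation on the cluster
      Seq(System.getenv("SPARK_EXAMPLES_JAR")) // JAR containing this example's classes
    )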
Tathagata Das bc4a6eb850 Changed Flume test to use the same port as other tests, so that it can be controlled centrally. 2013-02-25 18:04:21 -08:00
Matei Zaharia 4d480ec59e Fixed something that was reported as a compile error in ScalaDoc.
For some reason, ScalaDoc complained about no such constructor for
StreamingContext; it doesn't seem like an actual Scala error but it
prevented sbt publish from working because docs weren't built.
2013-02-25 15:53:43 -08:00
Matei Zaharia 490f056cdd Allow passing sparkHome and JARs to StreamingContext constructor
Also warns if spark.cleaner.ttl is not set in the version where you pass
your own SparkContext.
2013-02-25 15:13:30 -08:00
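The spark.cleaner.ttl warning described above amounts to a property check at construction time; a hedged sketch (configuration was read from Java system properties in that era, and the message text here is illustrative):

    // Sketch of the kind of check described above: warn when metadata cleaning
    // is disabled for a long-running streaming job built on an existing SparkContext.
    def warnIfCleanerTtlUnset(): Unit = {
      if (System.getProperty("spark.cleaner.ttl") == null) {
        System.err.println(
          "WARN: spark.cleaner.ttl is not set; old metadata will accumulate " +
          "in a long-running streaming application.")
      }
    }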
Tathagata Das 5ab37be983 Fixed class paths and dependencies based on Matei's comments. 2013-02-24 16:24:52 -08:00
Tathagata Das 28f8b721f6 Added back the initial spark job before starting streaming receivers 2013-02-24 13:01:54 -08:00
Tathagata Das 68c7934b1a Fixed missing dependencies in streaming/pom.xml 2013-02-24 11:51:45 -08:00
Tathagata Das b4eb24de96 Updated streaming programming guide with Java API info, and comments from Patrick. 2013-02-23 23:59:45 -08:00
Tathagata Das 41285eaae3 Fixed differences in APIs of StreamingContext and JavaStreamingContext. Changed rawNetworkStream to rawSocketStream, and added twitter, actor, and zeroMQ streams to JavaStreamingContext. Also added them to JavaAPISuite. 2013-02-23 16:25:07 -08:00
Tathagata Das d8cee52d52 Merge branch 'mesos-streaming' into streaming 2013-02-22 18:25:34 -08:00
Tathagata Das cfa65ebff1 Merge pull request #480 from MLnick/streaming-eg-algebird
[Streaming] Examples using Twitter's Algebird library
2013-02-22 12:29:04 -08:00
Tathagata Das 688e62718f Merge pull request #479 from ScrapCodes/zeromq-streaming
Zeromq streaming
2013-02-22 12:17:17 -08:00
Tathagata Das 208edaac1b Fixed condition in InputDStream isTimeValid. 2013-02-21 15:22:26 -08:00
Nick Pentreath 16d456742e Merge remote-tracking branch 'upstream/streaming' into streaming-eg-algebird 2013-02-21 09:33:08 +02:00
Tathagata Das 972fe7714f Merge branch 'mesos-streaming' into streaming
Conflicts:
	streaming/src/test/java/spark/streaming/JavaAPISuite.java
2013-02-20 11:06:01 -08:00
Tathagata Das fb9956256d Merge branch 'mesos-master' into streaming
Conflicts:
	core/src/main/scala/spark/rdd/CheckpointRDD.scala
	streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
2013-02-20 09:01:29 -08:00
Prashant Sharma 4e5b09664c Fixes corresponding to review feedback on pull request #479 2013-02-20 19:14:52 +05:30
Patrick Wendell 041c19e5f0 Small changes that were missing in merge 2013-02-19 08:44:20 -08:00
Patrick Wendell fed1122d74 Use RDD type for slice operator in Java.
This commit uses the RDD type in `slice`, making it available to both normal
and pair RDDs in Java. It also updates the signature for `slice` to match
changes in the Scala API.
2013-02-19 08:32:38 -08:00
Patrick Wendell 35880de42e Use RDD type for transform operator in Java.
This is an improved implementation of the `transform` operator in Java.
The main difference is that this allows all four possible types of
transform functions (a sketch follows this commit):

1. JavaRDD -> JavaRDD
2. JavaRDD -> JavaPairRDD
3. JavaPairRDD -> JavaPairRDD
4. JavaPairRDD -> JavaRDD

whereas previously only (1) and (3) were possible.

Conflicts:

	streaming/src/test/java/spark/streaming/JavaAPISuite.java
2013-02-19 08:31:58 -08:00
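In the Scala API one generic transform signature covers all four shapes listed above, since both plain and pair streams take RDD-to-RDD functions; the Java overloads added here mirror that. A conceptual Scala sketch using the era's spark.RDD package (type and value names are illustrative, not the actual JavaDStream signatures):

    import spark.RDD

    object TransformShapes {
      // Roughly, the Scala DStream API exposes one generic signature:
      //   def transform[U](transformFunc: RDD[T] => RDD[U]): DStream[U]
      // The four Java-facing shapes correspond to these function types:
      type Plain  = RDD[String]
      type Paired = RDD[(String, Int)]

      val plainToPlain:   Plain  => Plain  = rdd => rdd.map(_.trim)             // (1) JavaRDD -> JavaRDD
      val plainToPaired:  Plain  => Paired = rdd => rdd.map(s => (s, s.length)) // (2) JavaRDD -> JavaPairRDD
      val pairedToPaired: Paired => Paired = rdd => rdd.filter(_._2 > 0)        // (3) JavaPairRDD -> JavaPairRDD
      val pairedToPlain:  Paired => Plain  = rdd => rdd.map(_._1)               // (4) JavaPairRDD -> JavaRDD
    }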
Patrick Wendell 9d49a6b03f Use RDD type for foreach operator in Java. 2013-02-19 08:30:32 -08:00
Nick Pentreath d8ee184d95 Dependencies and refactoring for streaming HLL example, and using context.twitterStream method 2013-02-19 17:42:57 +02:00
Prashant Sharma 8d44480d84 Example demonstrating a ZeroMQ stream 2013-02-19 19:42:14 +05:30
Prashant Sharma f7d3e309cb ZeroMQ stream as receiver 2013-02-19 19:32:52 +05:30
Tathagata Das 9e82be1503 Merge branch 'streaming' into ScrapCodes-streaming-actor
Conflicts:
	docs/plugin-custom-receiver.md
	streaming/src/main/scala/spark/streaming/StreamingContext.scala
	streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
	streaming/src/main/scala/spark/streaming/dstream/PluggableInputDStream.scala
	streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
	streaming/src/test/scala/spark/streaming/InputStreamsSuite.scala
2013-02-19 02:48:50 -08:00
Tathagata Das 12ea14c211 Changed networkStream to socketStream, and pluggableNetworkStream to networkStream, as a way to create streams from arbitrary network receivers. 2013-02-18 15:18:34 -08:00
Tathagata Das 6a6e6bda57 Merge branch 'streaming' into ScrapCode-streaming
Conflicts:
	streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
	streaming/src/main/scala/spark/streaming/dstream/NetworkInputDStream.scala
2013-02-18 13:26:12 -08:00
Tathagata Das 8ad561dc7d Added checkpointing and fault-tolerance semantics to the programming guide. Fixed the default checkpoint interval to be a multiple of the slide duration. Fixed the visibility of some classes and objects to clean up the docs. 2013-02-18 02:12:41 -08:00
Matei Zaharia 7151e1e4c8 Rename "jobs" to "applications" in the standalone cluster 2013-02-17 23:23:08 -08:00
Tathagata Das f98c7da23e Many changes to ensure better 2nd recovery if a 2nd failure happens while
recovering from the 1st failure
- Made the scheduler checkpoint after clearing old metadata, which
  ensures that a new checkpoint is written as soon as at least one batch
  gets computed while recovering from a failure. This ensures that if
  there is a 2nd failure while recovering from the 1st failure, the system
  starts the 2nd recovery from a newer checkpoint.
- Modified Checkpoint writer to write checkpoint in a different thread.
- Added a check to make sure that compute for InputDStreams gets called
  only for strictly increasing times (see the sketch after this commit).
- Changed implementation of slice to call getOrCompute on parent DStream
  in time-increasing order.
- Added testcase to test slice.
- Fixed testGroupByKeyAndWindow testcase in JavaAPISuite to verify
  results with expected output in an order-independent manner.
2013-02-17 15:06:41 -08:00
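The strictly-increasing-times check from the bullets above can be pictured as a small guard; this is an illustration of the idea, not the actual InputDStream.isTimeValid implementation (for simplicity it measures alignment from time zero rather than from the stream's start time):

    // Illustrative guard: only allow compute() for times that are aligned with
    // the slide duration and strictly greater than the last accepted time.
    class TimeValidator(slideDurationMs: Long) {
      private var lastValidTimeMs = Long.MinValue

      def isTimeValid(timeMs: Long): Boolean = {
        val aligned    = timeMs % slideDurationMs == 0
        val increasing = timeMs > lastValidTimeMs   // reject repeated or out-of-order times
        if (aligned && increasing) {
          lastValidTimeMs = timeMs
          true
        } else {
          false
        }
      }
    }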
Stephen Haberman ae2234687d Make CoGroupedRDDs explicitly have the same key type. 2013-02-16 13:10:31 -06:00
Tathagata Das ddcb976b0d Made MasterFailureTest more robust. 2013-02-15 06:54:47 +00:00
Tathagata Das 4b8402e900 Moved Java streaming examples to examples/src/main/java/spark/streaming/... and fixed logging in NetworkInputTracker to highlight errors when receiver deregisters/shuts down. 2013-02-14 18:10:37 -08:00