Commit graph

396 commits

Author SHA1 Message Date
Prashant Sharma 6fcfefcb27 Few more fixes to tests broken during merge 2013-09-10 10:57:47 +05:30
Prashant Sharma 4106ae9fbf Merged with master 2013-09-06 17:53:01 +05:30
Matei Zaharia 0a8cc30921 Move some classes to more appropriate packages:
* RDD, *RDDFunctions -> org.apache.spark.rdd
* Utils, ClosureCleaner, SizeEstimator -> org.apache.spark.util
* JavaSerializer, KryoSerializer -> org.apache.spark.serializer
2013-09-01 14:13:16 -07:00
Matei Zaharia 5701eb92c7 Fix some URLs 2013-09-01 14:13:16 -07:00
Matei Zaharia 46eecd110a Initial work to rename package to org.apache.spark 2013-09-01 14:13:13 -07:00
Matei Zaharia 5a6ac12840 Merge pull request #701 from ScrapCodes/documentation-suggestions
Documentation suggestions for spark streaming.
2013-08-22 22:08:03 -07:00
Prashant Sharma 2bc348e92c Linking custom receiver guide 2013-08-23 09:44:02 +05:30
Prashant Sharma 3049415e24 Corrections in documentation comment 2013-08-23 09:40:28 +05:30
Jey Kottalam 23f4622aff Remove redundant dependencies from POMs 2013-08-18 18:53:57 -07:00
Jey Kottalam ad580b94d5 Maven build now also works with YARN 2013-08-16 13:50:12 -07:00
Jey Kottalam 11b42a84db Maven build now works with CDH hadoop-2.0.0-mr1 2013-08-16 13:50:12 -07:00
Jey Kottalam 353fab2440 Initial changes to make Maven build agnostic of hadoop version 2013-08-16 13:50:12 -07:00
Josh Rosen d7f78b443b Change scala.Option to Guava Optional in Java APIs. 2013-08-11 12:05:09 -07:00
Reynold Xin c61843a69f Changed other LZF uses to use the compression codec interface. 2013-07-31 10:32:13 -07:00
Matei Zaharia af3c9d5042 Add Apache license headers and LICENSE and NOTICE files 2013-07-16 17:21:33 -07:00
Prashant Sharma 119c98c1be code formatting, The warning related to scope exit and enter is not worth fixing as it only affects debugging scopes and nothing else. 2013-07-16 15:01:33 +05:30
Prashant Sharma 55da6e9504 Fixed warning erasure -> runtimeClass 2013-07-16 14:37:08 +05:30
Prashant Sharma ff14f38f3d Fixed warning Throwables 2013-07-16 14:34:56 +05:30
Prashant Sharma 63addd93a8 Fixed warning ClassManifest -> ClassTag 2013-07-16 14:09:52 +05:30
Prashant Sharma e86d5dbaad Merge branch 'master' into master-merge
Conflicts:
	README.md
	core/pom.xml
	core/src/main/scala/spark/deploy/JsonProtocol.scala
	core/src/main/scala/spark/deploy/LocalSparkCluster.scala
	core/src/main/scala/spark/deploy/master/Master.scala
	core/src/main/scala/spark/deploy/master/MasterWebUI.scala
	core/src/main/scala/spark/deploy/worker/Worker.scala
	core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala
	core/src/main/scala/spark/storage/BlockManagerUI.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	pom.xml
	project/SparkBuild.scala
	streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
2013-07-12 14:49:16 +05:30
Matei Zaharia 7dcda9ae74 Merge pull request #688 from markhamstra/scalaDependencies
Fixed SPARK-795 with explicit dependencies
2013-07-08 23:24:23 -07:00
Mark Hamstra 0b39d66f3f pom cleanup 2013-07-08 16:07:09 -07:00
Mark Hamstra afdaf430bd Explicit dependencies for scala-library and scalap to prevent 2.9.2 vs. 2.9.3 problems 2013-07-08 15:40:50 -07:00
Shivaram Venkataraman 3350ad0d7f Catch RejectedExecution exception in Checkpoint handler. 2013-07-07 04:09:37 -07:00
Matei Zaharia 1ffadb2d9e Merge remote-tracking branch 'pwendell/ui-updates'
Conflicts:
	core/src/main/scala/spark/scheduler/DAGScheduler.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	pom.xml
2013-07-06 15:51:41 -07:00
Matei Zaharia 94871e4703 Merge pull request #655 from tgravescs/master
Add support for running Spark on Yarn on a secure Hadoop Cluster
2013-07-06 15:26:19 -07:00
Tathagata Das 280418ac45 Reduced the number of Iterator to ArrayBuffer copies in NetworkReceiver. 2013-07-05 21:38:21 -07:00
Prashant Sharma a5f1f6a907 Merge branch 'master' into master-merge
Conflicts:
	core/pom.xml
	core/src/main/scala/spark/MapOutputTracker.scala
	core/src/main/scala/spark/RDD.scala
	core/src/main/scala/spark/RDDCheckpointData.scala
	core/src/main/scala/spark/SparkContext.scala
	core/src/main/scala/spark/Utils.scala
	core/src/main/scala/spark/api/python/PythonRDD.scala
	core/src/main/scala/spark/deploy/client/Client.scala
	core/src/main/scala/spark/deploy/master/MasterWebUI.scala
	core/src/main/scala/spark/deploy/worker/Worker.scala
	core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala
	core/src/main/scala/spark/rdd/BlockRDD.scala
	core/src/main/scala/spark/rdd/ZippedRDD.scala
	core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/BlockManagerMaster.scala
	core/src/main/scala/spark/storage/BlockManagerMasterActor.scala
	core/src/main/scala/spark/storage/BlockManagerUI.scala
	core/src/main/scala/spark/util/AkkaUtils.scala
	core/src/test/scala/spark/SizeEstimatorSuite.scala
	pom.xml
	project/SparkBuild.scala
	repl/src/main/scala/spark/repl/SparkILoop.scala
	repl/src/test/scala/spark/repl/ReplSuite.scala
	streaming/src/main/scala/spark/streaming/StreamingContext.scala
	streaming/src/main/scala/spark/streaming/api/java/JavaStreamingContext.scala
	streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
	streaming/src/main/scala/spark/streaming/util/MasterFailureTest.scala
2013-07-03 11:43:26 +05:30
Y.CORP.YAHOO.COM\tgraves 923cf92900 Rework from pull request. Removed --user option from Spark on Yarn Client, made the user of JAVA_HOME environment
variable conditional on if its set, and created addCredentials in each of the SparkHadoopUtil classes
to only add the credentials when the profile is hadoop2-yarn.
2013-07-02 21:18:59 -05:00
Matei Zaharia 4358acfe07 Initialize Twitter4J OAuth from system properties instead of prompting 2013-06-29 15:25:06 -07:00
Matei Zaharia 1667158544 Merge remote-tracking branch 'mrpotes/master' 2013-06-29 14:36:09 -07:00
Patrick Wendell 362d996c81 Handful of changes based on matei's review
- Avoid exception when no tasks have finished for a stage
- Adding DOCTYPE so css renders properly
- Adding progress slider
2013-06-27 19:14:28 -07:00
James Phillpotts 366572edca Include a default OAuth implementation, and update examples and JavaStreamingContext 2013-06-25 22:59:34 +01:00
Tathagata Das c89af0a7f9 Merge branch 'master' into streaming
Conflicts:
	.gitignore
2013-06-24 23:57:47 -07:00
Tathagata Das 48c7e373c6 Minor formatting fixes 2013-06-24 23:11:04 -07:00
Tathagata Das 1249e9153b Merge pull request #572 from Reinvigorate/sm-block-interval
Adding spark.streaming.blockInterval property
2013-06-24 21:46:33 -07:00
Tathagata Das cfcda95f86 Merge pull request #571 from Reinvigorate/sm-kafka-serializers
Surfacing decoders on KafkaInputDStream
2013-06-24 21:44:50 -07:00
James Phillpotts 8955787a59 Twitter API v1 is retired - username/password auth no longer possible 2013-06-24 09:15:17 +01:00
James Phillpotts 93a1643405 Allow other twitter authorizations than username/password 2013-06-21 14:21:52 +01:00
Thomas Graves 75d78c7ac9 Add support for Spark on Yarn on a secure Hadoop cluster 2013-06-19 11:18:42 -05:00
Jey Kottalam e7982c798e Exclude old versions of Netty from Maven-based build 2013-05-18 21:24:58 -07:00
seanm f25282def5 fixing kafkaStream Java API and adding test 2013-05-10 17:34:28 -06:00
seanm 3632980b1b fixing indentation 2013-05-10 15:54:26 -06:00
seanm b95c1bdbba count() now uses a transform instead of ConstantInputDStream 2013-05-10 12:47:24 -06:00
seanm d761e7359d adding kafkaStream API tests 2013-05-10 12:05:10 -06:00
Reynold Xin 90577ada69 Merge branch 'shuffle-performance-fix-0.7' of github.com:shane-huang/spark into shufflemerge
Conflicts:
	core/src/main/scala/spark/storage/BlockManager.scala
	core/src/main/scala/spark/storage/DiskStore.scala
	project/SparkBuild.scala
2013-05-07 15:56:19 -07:00
Prashant Sharma 4041a2689e Updated to latest stable scala 2.10.1 and akka 2.1.2 2013-05-01 11:35:35 +05:30
Prashant Sharma 24bbf318b3 Fixied other warnings 2013-04-29 19:56:28 +05:30
Prashant Sharma d3518f57cd Fixed warning: erasure -> runtimeClass 2013-04-29 18:14:25 +05:30
Prashant Sharma 8f3ac240cb Fixed Warning: ClassManifest -> ClassTag 2013-04-29 16:39:13 +05:30
Prashant Sharma 4b4a36ea7d Fixed pom.xml with updated dependencies. 2013-04-29 12:55:43 +05:30
Mridul Muralidharan 430c531464 Remove debug statements 2013-04-29 00:24:30 +05:30
Mridul Muralidharan 3a89a76b87 Make log message more descriptive to aid in debugging 2013-04-29 00:04:12 +05:30
Mridul Muralidharan 7fa6978a1e Allow CheckpointWriter pending tasks to finish 2013-04-28 23:08:10 +05:30
Mridul Muralidharan afee902443 Attempt to fix streaming test failures after yarn branch merge 2013-04-28 22:26:45 +05:30
Prashant Sharma bb4102b0ee Fixed breaking tests in streaming checkpoint suite. Changed RichInt to Int as it is final and not serializable 2013-04-25 14:38:01 +05:30
Prashant Sharma ad88f083a6 scala 2.10 and master merge 2013-04-24 18:08:26 +05:30
Mridul Muralidharan dd515ca3ee Attempt at fixing merge conflict 2013-04-24 09:24:17 +05:30
seanm 7e56e99573 Surfacing decoders on KafkaInputDStream 2013-04-16 17:17:16 -06:00
seanm ab0f834dbb adding spark.streaming.blockInterval property 2013-04-16 11:57:05 -06:00
seanm b42d68c8ce fixing Spark Streaming count() so that 0 will be emitted when there is nothing to count 2013-04-15 12:54:55 -06:00
Matei Zaharia 65caa8f711 Merge remote-tracking branch 'jey/bump-development-version-to-0.8.0'
Conflicts:
	docs/_config.yml
	project/SparkBuild.scala
2013-04-08 12:43:17 -04:00
Mridul Muralidharan 6798a09df8 Add support for building against hadoop2-yarn : adding new maven profile for it 2013-04-07 17:47:38 +05:30
shane-huang df47b40b76 Shuffle Performance fix: Use netty embeded OIO file server instead of ConnectionManager
Shuffle Performance Optimization: do not send 0-byte block requests to reduce network messages
change reference from io.Source to scala.io.Source to avoid looking into io.netty package

Signed-off-by: shane-huang <shengsheng.huang@intel.com>
2013-04-07 14:37:12 +08:00
Jey Kottalam bc8ba222ff Bump development version to 0.8.0 2013-03-28 15:42:01 -07:00
Jey Kottalam b569b3f200 Move streaming test initialization into 'before' blocks 2013-03-28 15:08:41 -07:00
seanm 329ef34c2e fixing autooffset.reset behavior when set to 'largest' 2013-03-26 23:56:15 -06:00
Holden Karau 1f5381119f method first in trait IterableLike is deprecated: use `head' instead 2013-03-24 19:19:40 -07:00
seanm d61978d0ab keeping JavaStreamingContext in sync with StreamingContext + adding comments for better clarity 2013-03-15 23:36:52 -06:00
Mikhail Bautin 7fd2708eda Add a log4j compile dependency to fix build in IntelliJ
Also rename parent project to spark-parent (otherwise it shows up as
"parent" in IntelliJ, which is very confusing).
2013-03-15 11:41:51 -07:00
seanm 33fa1e7e4a removing dependency on ZookeeperConsumerConnector + purging last relic of kafka reliability that never solidified (ie- setOffsets) 2013-03-15 00:10:13 -06:00
seanm d069283211 fixing memory leak in kafka MessageHandler 2013-03-14 23:45:33 -06:00
seanm cfa8e769a8 KafkaInputDStream improvements. Allows more Kafka configurability 2013-03-14 23:45:19 -06:00
Stephen Haberman 0cf320485d Forgot equals. 2013-03-12 00:05:35 -05:00
Stephen Haberman 9e68f48625 More quickly call close in HadoopRDD.
This also refactors out the common "gotNext" iterator pattern into
a shared utility class.
2013-03-11 23:59:17 -05:00
Mark Hamstra b409073102 Instead of failing to bind to a fixed, already-in-use port, let the OS choose an available port for TestServer. 2013-03-01 15:05:07 -08:00
Mark Hamstra 8b06b359da bump version to 0.7.1-SNAPSHOT in the subproject poms to keep the maven build building. 2013-02-28 23:34:34 -08:00
Matei Zaharia 4223be3aa4 Merge pull request #503 from pwendell/bug-fix
createNewSparkContext should use sparkHome/jars/environment.
2013-02-25 19:43:05 -08:00
Patrick Wendell 284ba90958 createNewSparkContext should use sparkHome/jars/environment.
This fixes a bug introduced by Matei's recent change.
2013-02-25 19:40:52 -08:00
Matei Zaharia 5d7b591cfe Pass a code JAR to SparkContext in our examples. Fixes SPARK-594. 2013-02-25 19:34:32 -08:00
Tathagata Das bc4a6eb850 Changed Flume test to use the same port as other tests, so that can be controlled centrally. 2013-02-25 18:04:21 -08:00
Matei Zaharia 4d480ec59e Fixed something that was reported as a compile error in ScalaDoc.
For some reason, ScalaDoc complained about no such constructor for
StreamingContext; it doesn't seem like an actual Scala error but it
prevented sbt publish and from working because docs weren't built.
2013-02-25 15:53:43 -08:00
Matei Zaharia 490f056cdd Allow passing sparkHome and JARs to StreamingContext constructor
Also warns if spark.cleaner.ttl is not set in the version where you pass
your own SparkContext.
2013-02-25 15:13:30 -08:00
Tathagata Das 5ab37be983 Fixed class paths and dependencies based on Matei's comments. 2013-02-24 16:24:52 -08:00
Tathagata Das 28f8b721f6 Added back the initial spark job before starting streaming receivers 2013-02-24 13:01:54 -08:00
Tathagata Das 68c7934b1a Fixed missing dependencies in streaming/pom.xml 2013-02-24 11:51:45 -08:00
Tathagata Das b4eb24de96 Updated streaming programming guide with Java API info, and comments from Patrick. 2013-02-23 23:59:45 -08:00
Tathagata Das 41285eaae3 Fixed differences in APIs of StreamingContext and JavaStreamingContext. Change rawNetworkStream to rawSocketStream, and added twitter, actor, zeroMQ streams to JavaStreamingContext. Also added them to JavaAPISuite. 2013-02-23 16:25:07 -08:00
Tathagata Das d8cee52d52 Merge branch 'mesos-streaming' into streaming 2013-02-22 18:25:34 -08:00
Tathagata Das cfa65ebff1 Merge pull request #480 from MLnick/streaming-eg-algebird
[Streaming] Examples using Twitter's Algebird library
2013-02-22 12:29:04 -08:00
Tathagata Das 688e62718f Merge pull request #479 from ScrapCodes/zeromq-streaming
Zeromq streaming
2013-02-22 12:17:17 -08:00
Tathagata Das 208edaac1b Fixed condition in InputDStream isTimeValid. 2013-02-21 15:22:26 -08:00
Nick Pentreath 16d456742e Merge remote-tracking branch 'upstream/streaming' into streaming-eg-algebird 2013-02-21 09:33:08 +02:00
Tathagata Das 972fe7714f Merge branch 'mesos-streaming' into streaming
Conflicts:
	streaming/src/test/java/spark/streaming/JavaAPISuite.java
2013-02-20 11:06:01 -08:00
Tathagata Das fb9956256d Merge branch 'mesos-master' into streaming
Conflicts:
	core/src/main/scala/spark/rdd/CheckpointRDD.scala
	streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
2013-02-20 09:01:29 -08:00
Prashant Sharma 4e5b09664c fixes corresponding to review feedback at pull request #479 2013-02-20 19:14:52 +05:30
Patrick Wendell 041c19e5f0 Small changes that were missing in merge 2013-02-19 08:44:20 -08:00
Patrick Wendell fed1122d74 Use RDD type for slice operator in Java.
This commit uses the RDD type in `slice`, making it available to both normal
and pair RDD's in java. It also updates the signature for `slice` to match
changes in the Scala API.
2013-02-19 08:32:38 -08:00
Patrick Wendell 35880de42e Use RDD type for transform operator in Java.
This is an improved implementation of the `transform` operator in Java.
The main difference is that this allows all four possible types of
transform functions

1. JavaRDD -> JavaRDD
2. JavaRDD -> JavaPairRDD
3. JavaPairRDD -> JavaPairRDD
4. JavaPairRDD -> JavaRDD

whereas previously only (1) and (3) were possible.

Conflicts:

	streaming/src/test/java/spark/streaming/JavaAPISuite.java
2013-02-19 08:31:58 -08:00
Patrick Wendell 9d49a6b03f Use RDD type for foreach operator in Java. 2013-02-19 08:30:32 -08:00