Jey Kottalam
23f4622aff
Remove redundant dependencies from POMs
2013-08-18 18:53:57 -07:00
Jey Kottalam
c1e547bb7f
Updates to repl and example POMs to match SBT build
2013-08-16 13:50:12 -07:00
Jey Kottalam
ad580b94d5
Maven build now also works with YARN
2013-08-16 13:50:12 -07:00
Jey Kottalam
9dd15fe700
Don't mark hadoop-client as 'provided'
2013-08-16 13:50:12 -07:00
Jey Kottalam
11b42a84db
Maven build now works with CDH hadoop-2.0.0-mr1
2013-08-16 13:50:12 -07:00
Jey Kottalam
353fab2440
Initial changes to make Maven build agnostic of hadoop version
2013-08-16 13:50:12 -07:00
Jey Kottalam
4f43fd791a
make SparkHadoopUtil a member of SparkEnv
2013-08-15 16:50:37 -07:00
Evan Sparks
ff9ebfabb4
Merge pull request #762 from shivaram/sgd-cleanup
...
Refactor SGD options into a new class.
2013-08-11 10:52:55 -07:00
Alexander Pivovarov
2d97cc46af
Fixed path to JavaALS.java and JavaKMeans.java, fixed hadoop2-yarn profile
2013-08-10 23:04:50 -07:00
Matei Zaharia
4c4f769187
Optimize Scala PageRank to use reduceByKey
2013-08-10 18:09:54 -07:00
Matei Zaharia
06e4f2a8f2
Merge pull request #789 from MLnick/master
...
Adding Scala version of PageRank example
2013-08-10 18:06:23 -07:00
Matei Zaharia
cd247ba5bb
Merge pull request #786 from shivaram/mllib-java
...
Java fixes, tests and examples for ALS, KMeans
2013-08-09 20:41:13 -07:00
Matei Zaharia
06303a62e5
Optimize JavaPageRank to use reduceByKey instead of groupByKey
2013-08-08 18:50:00 -07:00
Shivaram Venkataraman
2812e72200
Add setters for optimizer, gradient in SGD.
...
Also remove java-specific constructor for LabeledPoint.
2013-08-08 16:24:31 -07:00
Shivaram Venkataraman
e1a209f791
Remove Java-specific constructor for Rating.
...
The Scala constructor works with native Java types. Modify examples
to match this.
2013-08-08 14:36:02 -07:00
Nick Pentreath
c4eea875ac
Style changes as per Matei's comments
2013-08-08 12:40:37 +02:00
Nick Pentreath
cce758b893
Adding Scala version of PageRank example
2013-08-07 16:38:52 +02:00
Shivaram Venkataraman
338b7a7455
Merge branch 'master' of git://github.com/mesos/spark into sgd-cleanup
...
Conflicts:
mllib/src/main/scala/spark/mllib/util/MLUtils.scala
2013-08-06 21:21:55 -07:00
Shivaram Venkataraman
7db69d56f2
Refactor GLM algorithms and add Java tests
...
This change adds Java examples and unit tests for all GLM algorithms
to make sure the MLlib interface works from Java. Changes include:
- Introduce LabeledPoint and avoid using Doubles in train arguments
- Rename train to run in class methods
- Make the optimizer a member variable of GLM to make sure the builder
pattern works
2013-08-06 17:23:22 -07:00
Shivaram Venkataraman
471fbadd0c
Java examples, tests for KMeans and ALS
...
- Changes ALS to accept RDD[Rating] instead of (Int, Int, Double) making it
easier to call from Java
- Renames class methods from `train` to `run` to enable static methods to be
called from Java.
- Add unit tests which check if both static / class methods can be called.
- Also add examples which port the main() function in ALS, KMeans to the
examples project.
Couple of minor changes to existing code:
- Add a toJavaRDD method in RDD to convert scala RDD to java RDD easily
- Work around a bug where using double[] from Java leads to a class cast exception in
KMeans init
2013-08-06 15:43:46 -07:00
stayhf
882baee489
Got rid of unnecessary map function
2013-08-06 21:34:39 +00:00
stayhf
326a7a82e0
changes as reviewer requested
2013-08-06 21:03:24 +00:00
stayhf
98fd62605d
Updated code with reviewer's suggestions
2013-08-05 00:30:28 +00:00
stayhf
a682637301
Simple PageRank algorithm implementation in Java for SPARK-760
2013-08-03 06:01:16 +00:00
Matei Zaharia
af3c9d5042
Add Apache license headers and LICENSE and NOTICE files
2013-07-16 17:21:33 -07:00
Prashant Sharma
e86d5dbaad
Merge branch 'master' into master-merge
...
Conflicts:
README.md
core/pom.xml
core/src/main/scala/spark/deploy/JsonProtocol.scala
core/src/main/scala/spark/deploy/LocalSparkCluster.scala
core/src/main/scala/spark/deploy/master/Master.scala
core/src/main/scala/spark/deploy/master/MasterWebUI.scala
core/src/main/scala/spark/deploy/worker/Worker.scala
core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala
core/src/main/scala/spark/storage/BlockManagerUI.scala
core/src/main/scala/spark/util/AkkaUtils.scala
pom.xml
project/SparkBuild.scala
streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
2013-07-12 14:49:16 +05:30
Mark Hamstra
0b39d66f3f
pom cleanup
2013-07-08 16:07:09 -07:00
Mark Hamstra
afdaf430bd
Explicit dependencies for scala-library and scalap to prevent 2.9.2 vs. 2.9.3 problems
2013-07-08 15:40:50 -07:00
Prashant Sharma
a5f1f6a907
Merge branch 'master' into master-merge
...
Conflicts:
core/pom.xml
core/src/main/scala/spark/MapOutputTracker.scala
core/src/main/scala/spark/RDD.scala
core/src/main/scala/spark/RDDCheckpointData.scala
core/src/main/scala/spark/SparkContext.scala
core/src/main/scala/spark/Utils.scala
core/src/main/scala/spark/api/python/PythonRDD.scala
core/src/main/scala/spark/deploy/client/Client.scala
core/src/main/scala/spark/deploy/master/MasterWebUI.scala
core/src/main/scala/spark/deploy/worker/Worker.scala
core/src/main/scala/spark/deploy/worker/WorkerWebUI.scala
core/src/main/scala/spark/rdd/BlockRDD.scala
core/src/main/scala/spark/rdd/ZippedRDD.scala
core/src/main/scala/spark/scheduler/cluster/StandaloneSchedulerBackend.scala
core/src/main/scala/spark/storage/BlockManager.scala
core/src/main/scala/spark/storage/BlockManagerMaster.scala
core/src/main/scala/spark/storage/BlockManagerMasterActor.scala
core/src/main/scala/spark/storage/BlockManagerUI.scala
core/src/main/scala/spark/util/AkkaUtils.scala
core/src/test/scala/spark/SizeEstimatorSuite.scala
pom.xml
project/SparkBuild.scala
repl/src/main/scala/spark/repl/SparkILoop.scala
repl/src/test/scala/spark/repl/ReplSuite.scala
streaming/src/main/scala/spark/streaming/StreamingContext.scala
streaming/src/main/scala/spark/streaming/api/java/JavaStreamingContext.scala
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/util/MasterFailureTest.scala
2013-07-03 11:43:26 +05:30
Konstantin Boudnik
6fdbc68f2c
Fixing missed hbase dependency in examples hadoop2-yarn profile
2013-07-01 17:45:07 -07:00
Matei Zaharia
ccfe953a4d
Merge pull request #577 from skumargithub/master
...
Example of cumulative counting using updateStateByKey
2013-06-29 17:57:53 -07:00
Matei Zaharia
1667158544
Merge remote-tracking branch 'mrpotes/master'
2013-06-29 14:36:09 -07:00
James Phillpotts
176193b1e8
Fix usage and parameter extraction
2013-06-25 23:06:15 +01:00
James Phillpotts
366572edca
Include a default OAuth implementation, and update examples and JavaStreamingContext
2013-06-25 22:59:34 +01:00
Tathagata Das
c89af0a7f9
Merge branch 'master' into streaming
...
Conflicts:
.gitignore
2013-06-24 23:57:47 -07:00
Matei Zaharia
dbfab49d2a
Merge remote-tracking branch 'milliondreams/casdemo'
...
Conflicts:
project/SparkBuild.scala
2013-06-18 14:55:31 +02:00
Matei Zaharia
b7794813b1
Fix run script on Windows for Scala 2.10
2013-06-15 09:37:13 -07:00
Rohit Rai
b5b12823fa
Fixing the style as per feedback
2013-06-13 14:05:46 +05:30
Rohit Rai
b104c7f5c7
Example to write the output to cassandra
2013-06-03 15:15:52 +05:30
Rohit Rai
56c64c4033
A better way to read column value if you are sure the column exists in every row.
2013-06-03 12:48:35 +05:30
Rohit Rai
6d8423fd1b
Adding deps to examples/pom.xml
...
Fixing exclusion in examples deps in SparkBuild.scala
2013-06-02 13:03:45 +05:30
Rohit Rai
81c2adc15c
Removing infix call
2013-06-02 12:51:15 +05:30
Rohit Rai
3be7bdcefd
Adding example to make Spark RDD from Cassandra
2013-06-01 19:32:17 +05:30
Ethan Jewett
3217d486f7
Add HBase dependency to examples POM
2013-05-20 19:41:38 -05:00
Ethan Jewett
ee6f6aa6cd
Add HBase example
2013-05-09 18:33:38 -05:00
Reynold Xin
012c9e5ab0
Revert "Merge pull request #596 from esjewett/master" because the
...
dependency on hbase introduces netty-3.2.2 which conflicts with
netty-3.5.3 already in Spark. This caused multiple test failures.
This reverts commit 0f1b7a06e1, reversing
changes made to aacca1b8a8.
2013-05-09 14:20:01 -07:00
Ethan Jewett
a3d5f92210
Switch to using SparkContext method to create RDD
2013-05-07 11:43:06 -05:00
unknown
cbf6a5ee1e
Removed unused code, clarified intent of the program, batch size to 1 second
2013-05-06 08:05:45 -06:00
Ethan Jewett
7cff7e7897
Fix indents and mention other configuration options
2013-05-04 14:56:55 -05:00
Ethan Jewett
9290f16430
Remove unnecessary column family config
2013-05-04 12:39:14 -05:00
Ethan Jewett
02e8cfa617
HBase example
2013-05-04 12:31:30 -05:00
unknown
1d54401d7e
Modified as per TD's suggestions
2013-04-30 23:01:32 -06:00
Prashant Sharma
8f3ac240cb
Fixed Warning: ClassManifest -> ClassTag
2013-04-29 16:39:13 +05:30
Prashant Sharma
4b4a36ea7d
Fixed pom.xml with updated dependencies.
2013-04-29 12:55:43 +05:30
Mridul Muralidharan
dd515ca3ee
Attempt at fixing merge conflict
2013-04-24 09:24:17 +05:30
unknown
0dc1e2d60f
Example of cumulative counting using updateStateByKey
2013-04-22 09:22:45 -06:00
Mridul Muralidharan
7acab3ab45
Fix review comments, add a new api to SparkHadoopUtil to create appropriate Configuration. Modify an example to show how to use SplitInfo
2013-04-22 08:01:13 +05:30
seanm
7e56e99573
Surfacing decoders on KafkaInputDStream
2013-04-16 17:17:16 -06:00
Andrew Ash
f1d8871ca1
Uniform whitespace across scala examples
2013-04-09 23:35:13 -04:00
Matei Zaharia
65caa8f711
Merge remote-tracking branch 'jey/bump-development-version-to-0.8.0'
...
Conflicts:
docs/_config.yml
project/SparkBuild.scala
2013-04-08 12:43:17 -04:00
Matei Zaharia
b362df39ea
Merge pull request #552 from MLnick/master
...
Bumping version for Twitter Algebird to latest
2013-04-07 17:17:52 -07:00
Mridul Muralidharan
6798a09df8
Add support for building against hadoop2-yarn : adding new maven profile for it
2013-04-07 17:47:38 +05:30
Nick Pentreath
0f54344fd8
Bumping Algebird version in examples now that it supports JDK 1.6
2013-04-03 13:15:34 +02:00
Erik van oosten
b5e60c3253
Corrected order of CountMinSketchMonoid arguments
2013-04-02 15:25:22 +03:00
Jey Kottalam
bc8ba222ff
Bump development version to 0.8.0
2013-03-28 15:42:01 -07:00
Matei Zaharia
ca4d083ec8
Merge pull request #528 from MLnick/java-examples
...
[SPARK-707] Adding Java versions of Pi, LogQuery and K-Means examples
2013-03-20 11:22:36 -07:00
Nick Pentreath
52398cc1a3
Java indentation 4 --> 2 spaces
2013-03-20 09:55:42 +02:00
Nick Pentreath
9fa47a2039
A few cosmetic changes for JavaKMeans
2013-03-19 15:31:03 +02:00
Nick Pentreath
568ddf7330
Adding Java K-Means example
2013-03-19 15:29:22 +02:00
Nick Pentreath
b990caeb80
Changes to more closely match line length limit style
2013-03-17 20:03:27 +02:00
Mikhail Bautin
7fd2708eda
Add a log4j compile dependency to fix build in IntelliJ
...
Also rename parent project to spark-parent (otherwise it shows up as
"parent" in IntelliJ, which is very confusing).
2013-03-15 11:41:51 -07:00
Nick Pentreath
13757b1198
Adding Java versions of Pi and LogQuery
2013-03-15 10:52:01 +02:00
Mark Hamstra
8b06b359da
bump version to 0.7.1-SNAPSHOT in the subproject poms to keep the maven build building.
2013-02-28 23:34:34 -08:00
Matei Zaharia
5d7b591cfe
Pass a code JAR to SparkContext in our examples. Fixes SPARK-594.
2013-02-25 19:34:32 -08:00
Matei Zaharia
6b87ef7c86
Fix compile error
2013-02-25 14:01:16 -08:00
Matei Zaharia
01bd136ba5
Use public method sparkContext instead of protected sc in streaming examples
2013-02-25 13:27:11 -08:00
Tathagata Das
f282bc4960
Changed Algebird from 0.1.9 to 0.1.8
2013-02-24 12:44:12 -08:00
Tathagata Das
c1a040db3a
Fixed bugs in examples.
2013-02-24 11:00:30 -08:00
Tathagata Das
41285eaae3
Fixed differences in APIs of StreamingContext and JavaStreamingContext. Change rawNetworkStream to rawSocketStream, and added twitter, actor, zeroMQ streams to JavaStreamingContext. Also added them to JavaAPISuite.
2013-02-23 16:25:07 -08:00
Tathagata Das
cfa65ebff1
Merge pull request #480 from MLnick/streaming-eg-algebird
...
[Streaming] Examples using Twitter's Algebird library
2013-02-22 12:29:04 -08:00
Tathagata Das
688e62718f
Merge pull request #479 from ScrapCodes/zeromq-streaming
...
Zeromq streaming
2013-02-22 12:17:17 -08:00
Nick Pentreath
d9bdae8cc2
Adding documentation for HLL and CMS examples. More efficient and clear use of the monoids.
2013-02-21 12:31:31 +02:00
Nick Pentreath
718474b9c6
Bumping Algebird to 0.1.9
2013-02-21 12:11:31 +02:00
Nick Pentreath
16d456742e
Merge remote-tracking branch 'upstream/streaming' into streaming-eg-algebird
2013-02-21 09:33:08 +02:00
Tathagata Das
972fe7714f
Merge branch 'mesos-streaming' into streaming
...
Conflicts:
streaming/src/test/java/spark/streaming/JavaAPISuite.java
2013-02-20 11:06:01 -08:00
Tathagata Das
fb9956256d
Merge branch 'mesos-master' into streaming
...
Conflicts:
core/src/main/scala/spark/rdd/CheckpointRDD.scala
streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
2013-02-20 09:01:29 -08:00
Prashant Sharma
4e5b09664c
fixes corresponding to review feedback at pull request #479
2013-02-20 19:14:52 +05:30
Prashant Sharma
05dc385649
A bug fix post merge, following changes to AkkaUtils
2013-02-20 15:28:12 +05:30
Nick Pentreath
8a281399f9
Streaming example using Twitter Algebird's Count Min Sketch monoid
2013-02-19 17:56:02 +02:00
Nick Pentreath
d8ee184d95
Dependencies and refactoring for streaming HLL example, and using context.twitterStream method
2013-02-19 17:42:57 +02:00
Prashant Sharma
8d44480d84
example for demonstrating ZeroMQ stream
2013-02-19 19:42:14 +05:30
Nick Pentreath
315ea069e8
Merge remote-tracking branch 'upstream/streaming' into streaming-eg-algebird
...
Conflicts:
project/SparkBuild.scala
2013-02-19 13:58:05 +02:00
Nick Pentreath
015893f0e8
Adding streaming HyperLogLog example using Algebird
2013-02-19 13:21:33 +02:00
Tathagata Das
7e30c46aaf
Added comment to the KafkaWordCount, given by Sean McNamara.
2013-02-19 03:05:44 -08:00
Tathagata Das
9e82be1503
Merge branch 'streaming' into ScrapCodes-streaming-actor
...
Conflicts:
docs/plugin-custom-receiver.md
streaming/src/main/scala/spark/streaming/StreamingContext.scala
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/dstream/PluggableInputDStream.scala
streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
streaming/src/test/scala/spark/streaming/InputStreamsSuite.scala
2013-02-19 02:48:50 -08:00
Tathagata Das
12ea14c211
Renamed networkStream to socketStream, and pluggableNetworkStream to networkStream, as a way to create streams from an arbitrary network receiver.
2013-02-18 15:18:34 -08:00
Tathagata Das
6a6e6bda57
Merge branch 'streaming' into ScrapCode-streaming
...
Conflicts:
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/dstream/NetworkInputDStream.scala
2013-02-18 13:26:12 -08:00
Tathagata Das
4b8402e900
Moved Java streaming examples to examples/src/main/java/spark/streaming/... and fixed logging in NetworkInputTracker to highlight errors when receiver deregisters/shuts down.
2013-02-14 18:10:37 -08:00
Tathagata Das
def8126d77
Added TwitterInputDStream from example to StreamingContext. Renamed example TwitterBasic to TwitterPopularTags.
2013-02-14 17:49:43 -08:00
Tathagata Das
2eacf22401
Removed countByKeyAndWindow on paired DStreams, and added countByValueAndWindow for all DStreams. Updated both scala and java API and testsuites.
2013-02-14 12:21:47 -08:00