Tathagata Das
c89af0a7f9
Merge branch 'master' into streaming
...
Conflicts:
.gitignore
2013-06-24 23:57:47 -07:00
Matei Zaharia
dbfab49d2a
Merge remote-tracking branch 'milliondreams/casdemo'
...
Conflicts:
project/SparkBuild.scala
2013-06-18 14:55:31 +02:00
Rohit Rai
b5b12823fa
Fixing the style as per feedback
2013-06-13 14:05:46 +05:30
Rohit Rai
b104c7f5c7
Example to write the output to cassandra
2013-06-03 15:15:52 +05:30
Rohit Rai
56c64c4033
A better way to read column value if you are sure the column exists in every row.
2013-06-03 12:48:35 +05:30
Rohit Rai
81c2adc15c
Removing infix call
2013-06-02 12:51:15 +05:30
Rohit Rai
3be7bdcefd
Adding example to make Spark RDD from Cassandra
2013-06-01 19:32:17 +05:30
Ethan Jewett
ee6f6aa6cd
Add HBase example
2013-05-09 18:33:38 -05:00
Reynold Xin
012c9e5ab0
Revert "Merge pull request #596 from esjewett/master" because the
...
dependency on hbase introduces netty-3.2.2 which conflicts with
netty-3.5.3 already in Spark. This caused multiple test failures.
This reverts commit 0f1b7a06e1, reversing changes made to aacca1b8a8.
2013-05-09 14:20:01 -07:00
Ethan Jewett
a3d5f92210
Switch to using SparkContext method to create RDD
2013-05-07 11:43:06 -05:00
unknown
cbf6a5ee1e
Removed unused code, clarified intent of the program, batch size to 1 second
2013-05-06 08:05:45 -06:00
Ethan Jewett
7cff7e7897
Fix indents and mention other configuration options
2013-05-04 14:56:55 -05:00
Ethan Jewett
9290f16430
Remove unnecessary column family config
2013-05-04 12:39:14 -05:00
Ethan Jewett
02e8cfa617
HBase example
2013-05-04 12:31:30 -05:00
unknown
1d54401d7e
Modified as per TD's suggestions
2013-04-30 23:01:32 -06:00
Mridul Muralidharan
dd515ca3ee
Attempt at fixing merge conflict
2013-04-24 09:24:17 +05:30
unknown
0dc1e2d60f
Example of cumulative counting using updateStateByKey
2013-04-22 09:22:45 -06:00
Mridul Muralidharan
7acab3ab45
Fix review comments, add a new api to SparkHadoopUtil to create appropriate Configuration. Modify an example to show how to use SplitInfo
2013-04-22 08:01:13 +05:30
seanm
7e56e99573
Surfacing decoders on KafkaInputDStream
2013-04-16 17:17:16 -06:00
Andrew Ash
f1d8871ca1
Uniform whitespace across scala examples
2013-04-09 23:35:13 -04:00
Erik van oosten
b5e60c3253
Corrected order of CountMinSketchMonoid arguments
2013-04-02 15:25:22 +03:00
Nick Pentreath
52398cc1a3
Java indentation 4 --> 2 spaces
2013-03-20 09:55:42 +02:00
Nick Pentreath
9fa47a2039
A few cosmetic changes for JavaKMeans
2013-03-19 15:31:03 +02:00
Nick Pentreath
568ddf7330
Adding Java K-Means example
2013-03-19 15:29:22 +02:00
Nick Pentreath
b990caeb80
Changes to more closely match line length limit style
2013-03-17 20:03:27 +02:00
Nick Pentreath
13757b1198
Adding Java versions of Pi and LogQuery
2013-03-15 10:52:01 +02:00
Matei Zaharia
5d7b591cfe
Pass a code JAR to SparkContext in our examples. Fixes SPARK-594.
2013-02-25 19:34:32 -08:00
Matei Zaharia
6b87ef7c86
Fix compile error
2013-02-25 14:01:16 -08:00
Matei Zaharia
01bd136ba5
Use public method sparkContext instead of protected sc in streaming examples
2013-02-25 13:27:11 -08:00
Tathagata Das
c1a040db3a
Fixed bugs in examples.
2013-02-24 11:00:30 -08:00
Tathagata Das
41285eaae3
Fixed differences in APIs of StreamingContext and JavaStreamingContext. Change rawNetworkStream to rawSocketStream, and added twitter, actor, zeroMQ streams to JavaStreamingContext. Also added them to JavaAPISuite.
2013-02-23 16:25:07 -08:00
Tathagata Das
cfa65ebff1
Merge pull request #480 from MLnick/streaming-eg-algebird
...
[Streaming] Examples using Twitter's Algebird library
2013-02-22 12:29:04 -08:00
Tathagata Das
688e62718f
Merge pull request #479 from ScrapCodes/zeromq-streaming
...
ZeroMQ streaming
2013-02-22 12:17:17 -08:00
Nick Pentreath
d9bdae8cc2
Adding documentation for HLL and CMS examples. More efficient and clear use of the monoids.
2013-02-21 12:31:31 +02:00
Nick Pentreath
16d456742e
Merge remote-tracking branch 'upstream/streaming' into streaming-eg-algebird
2013-02-21 09:33:08 +02:00
Tathagata Das
972fe7714f
Merge branch 'mesos-streaming' into streaming
...
Conflicts:
streaming/src/test/java/spark/streaming/JavaAPISuite.java
2013-02-20 11:06:01 -08:00
Tathagata Das
fb9956256d
Merge branch 'mesos-master' into streaming
...
Conflicts:
core/src/main/scala/spark/rdd/CheckpointRDD.scala
streaming/src/main/scala/spark/streaming/dstream/ReducedWindowedDStream.scala
2013-02-20 09:01:29 -08:00
Prashant Sharma
4e5b09664c
fixes corresponding to review feedback at pull request #479
2013-02-20 19:14:52 +05:30
Prashant Sharma
05dc385649
A bug fix post merge, following changes to AkkaUtils
2013-02-20 15:28:12 +05:30
Nick Pentreath
8a281399f9
Streaming example using Twitter Algebird's Count Min Sketch monoid
2013-02-19 17:56:02 +02:00
Nick Pentreath
d8ee184d95
Dependencies and refactoring for streaming HLL example, and using context.twitterStream method
2013-02-19 17:42:57 +02:00
Prashant Sharma
8d44480d84
example for demonstrating ZeroMQ stream
2013-02-19 19:42:14 +05:30
Nick Pentreath
315ea069e8
Merge remote-tracking branch 'upstream/streaming' into streaming-eg-algebird
...
Conflicts:
project/SparkBuild.scala
2013-02-19 13:58:05 +02:00
Nick Pentreath
015893f0e8
Adding streaming HyperLogLog example using Algebird
2013-02-19 13:21:33 +02:00
Tathagata Das
7e30c46aaf
Added comment to the KafkaWordCount, given by Sean McNamara.
2013-02-19 03:05:44 -08:00
Tathagata Das
9e82be1503
Merge branch 'streaming' into ScrapCodes-streaming-actor
...
Conflicts:
docs/plugin-custom-receiver.md
streaming/src/main/scala/spark/streaming/StreamingContext.scala
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/dstream/PluggableInputDStream.scala
streaming/src/main/scala/spark/streaming/receivers/ActorReceiver.scala
streaming/src/test/scala/spark/streaming/InputStreamsSuite.scala
2013-02-19 02:48:50 -08:00
Tathagata Das
12ea14c211
Changed networkStream to socketStream and pluggableNetworkStream to become networkStream as a way to create streams from arbitrary network receiver.
2013-02-18 15:18:34 -08:00
Tathagata Das
6a6e6bda57
Merge branch 'streaming' into ScrapCode-streaming
...
Conflicts:
streaming/src/main/scala/spark/streaming/dstream/KafkaInputDStream.scala
streaming/src/main/scala/spark/streaming/dstream/NetworkInputDStream.scala
2013-02-18 13:26:12 -08:00
Tathagata Das
4b8402e900
Moved Java streaming examples to examples/src/main/java/spark/streaming/... and fixed logging in NetworkInputTracker to highlight errors when receiver deregisters/shuts down.
2013-02-14 18:10:37 -08:00
Tathagata Das
def8126d77
Added TwitterInputDStream from example to StreamingContext. Renamed example TwitterBasic to TwitterPopularTags.
2013-02-14 17:49:43 -08:00
Tathagata Das
2eacf22401
Removed countByKeyAndWindow on paired DStreams, and added countByValueAndWindow for all DStreams. Updated both scala and java API and testsuites.
2013-02-14 12:21:47 -08:00
Prashant Sharma
291dd47c7f
Taking FeederActor out as a separate program
2013-02-08 14:34:07 +05:30
Tathagata Das
12300758cc
Merge pull request #372 from Reinvigorate/sm-kafka
...
Removing offset management code that is non-existent in kafka 0.7.0+
2013-02-07 12:41:07 -08:00
Patrick Wendell
dab81a8511
Fixing to match Spark styleguide
2013-02-05 20:57:04 -08:00
Patrick Wendell
cc37601ecb
Adding an example with an OLAP roll-up
2013-02-04 14:18:11 -08:00
Prashant Sharma
4496bf197b
Improved document comment in example
2013-01-25 14:34:38 +05:30
Prashant Sharma
d17065c4b5
actor as receiver
2013-01-22 13:28:29 +05:30
Prashant Sharma
43bfd7bb21
Changed method name of createReceiver to getReceiver as it is not intended to be a factory.
2013-01-21 11:39:30 +05:30
Matei Zaharia
86057ec7c8
Merge branch 'master' into streaming
...
Conflicts:
core/src/main/scala/spark/api/python/PythonRDD.scala
2013-01-20 12:47:55 -08:00
Matei Zaharia
2a8c2a6790
Minor formatting fixes
2013-01-20 10:24:53 -08:00
Tathagata Das
4f8fe58b25
Merge branch 'mesos-streaming' into streaming
...
Conflicts:
core/src/main/scala/spark/api/java/JavaRDDLike.scala
core/src/main/scala/spark/api/java/JavaSparkContext.scala
core/src/test/scala/spark/JavaAPISuite.java
2013-01-20 01:13:56 -08:00
Prashant Sharma
56b9bd197c
Plug in actor as stream receiver API
2013-01-19 22:04:07 +05:30
Prashant Sharma
bb6ab92e31
Changed method name of createReceiver to getReceiver as it is not intended to be a factory.
2013-01-19 22:04:07 +05:30
seanm
d3064fe707
kafkaStream API cleanup. A quorum of zookeepers can now be specified
2013-01-18 21:34:29 -07:00
Patrick Wendell
12b72b3e73
NetworkWordCount example
2013-01-17 22:37:56 -08:00
Patrick Wendell
e0165bf714
Adding queueStream and some slight refactoring
2013-01-17 21:25:49 -08:00
Patrick Wendell
6fba7683c2
Small doc fix
2013-01-17 18:46:24 -08:00
Nick Pentreath
a5ba7a9f32
Use only one update function and pass in transpose of ratings matrix where appropriate
2013-01-17 16:21:00 +02:00
Nick Pentreath
a512df551f
Fixed index error missing first argument
2013-01-17 16:05:27 +02:00
Nick Pentreath
42fbef3c2a
Adding default command line args to SparkALS
2013-01-17 15:54:59 +02:00
Tathagata Das
cd1521cfdb
Merge branch 'master' into streaming
...
Conflicts:
core/src/main/scala/spark/rdd/CoGroupedRDD.scala
core/src/main/scala/spark/rdd/FilteredRDD.scala
docs/_layouts/global.html
docs/index.md
run
2013-01-15 12:08:51 -08:00
Patrick Wendell
d182a57cae
Two changes:
...
- Updating countByX() types based on bug fix
- Porting new documentation to Java
2013-01-14 10:03:55 -08:00
Patrick Wendell
3461cd99b7
Flume example and bug fix
2013-01-14 09:42:36 -08:00
Tathagata Das
0a2e333341
Removed stream id from the constructor of NetworkReceiver to make it easier for PluggableNetworkInputDStream.
2013-01-13 16:18:39 -08:00
Eric Zhang
ba06e9c97c
Update examples/src/main/scala/spark/examples/LocalLR.scala
...
fix spelling mistake
2013-01-13 15:33:11 +08:00
Patrick Wendell
6c502e3793
Making the Twitter example distributed.
...
This adds a distributed (receiver-based) implementation of the
Twitter dstream. It also changes the example to perform a
distributed sort rather than collecting the dataset at one node.
2013-01-07 22:01:11 -08:00
Tathagata Das
8c1b872512
Moved Twitter example to where the other examples are.
2013-01-07 17:48:10 -08:00
Tathagata Das
237bac36e9
Renamed examples and added documentation.
2013-01-07 14:37:21 -08:00
Tathagata Das
af8738dfb5
Moved Spark Streaming examples to examples sub-project.
2013-01-06 19:31:54 -08:00
root
acf8272324
Fix K-means example a little
2012-11-10 23:07:21 -08:00
Matei Zaharia
8d7b77bcb5
Some doc and usability improvements:
...
- Added a StorageLevels class for easy access to StorageLevel constants
in Java
- Added doc comments on Function classes in Java
- Updated Accumulator and HadoopWriter docs slightly
2012-10-12 17:53:20 -07:00
Mosharaf Chowdhury
119e50c7b9
Conflict fixed
2012-10-02 22:25:39 -07:00
Matei Zaharia
56c90485fd
More updates to documentation
2012-09-25 19:31:07 -07:00
Mosharaf Chowdhury
3883532545
Bug fix. Fixed log messages. Updated BroadcastTest example to have iterations.
2012-08-30 21:43:00 -07:00
Josh Rosen
566feafe1d
Cache points in SparkLR example.
2012-08-26 15:24:43 -07:00
Matei Zaharia
6ae3c375a9
Renamed apply() to call() in Java API and allowed it to throw Exceptions
2012-08-12 23:10:19 +02:00
Imran Rashid
edc6972f8e
move Vector class into core and spark.util package
2012-07-28 20:15:42 -07:00
Josh Rosen
2a60c998cc
Remove StringOps.split() from Java WordCount.
2012-07-25 10:13:06 -07:00
Josh Rosen
6a78e88237
Minor cleanup and optimizations in Java API.
...
- Add override keywords.
- Cache RDDs and counts in TC example.
- Clean up JavaRDDLike's abstract methods.
2012-07-24 09:47:00 -07:00
Josh Rosen
460da878fc
Improve Java API examples
...
- Replace JavaLR example with JavaHdfsLR example.
- Use anonymous classes in JavaWordCount; add options.
- Remove @Override annotations.
2012-07-22 14:40:39 -07:00
Josh Rosen
01dce3f569
Add Java API
...
Add distinct() method to RDD.
Fix bug in DoubleRDDFunctions.
2012-07-18 17:34:29 -07:00
Matei Zaharia
28fed4ce3b
Add System.exit(0) at the end of all the example programs.
2012-06-05 23:31:28 -07:00
haoyuan
651932e703
Format the code as coding style agreed by Matei/TD/Haoyuan
2012-02-09 13:26:23 -08:00
Matei Zaharia
100e800782
Some fixes to the examples (mostly to use functional API)
2012-01-31 00:33:18 -08:00
Matei Zaharia
fabcc82528
Merge pull request #103 from edisontung/master
...
Made improvements to takeSample. Also changed SparkLocalKMeans to SparkKMeans
2012-01-13 19:20:03 -08:00
Matei Zaharia
3034fc0d91
Merge commit 'ad4ebff42c1b738746b2b9ecfbb041b6d06e3e16'
2011-12-14 18:19:43 +01:00
Matei Zaharia
72c4839c5f
Fixed LocalFileLR to deal with a change in Scala IO sources
...
(you can no longer iterate over a Source multiple times).
2011-12-01 13:52:12 -08:00
Edison Tung
42f8847a21
Revert de01b6deaaee1b43321e0aac330f4a98c0ea61c6^..HEAD
2011-12-01 13:43:25 -08:00
Edison Tung
e1c814be4c
Renamed SparkLocalKMeans to SparkKMeans
2011-12-01 13:34:03 -08:00
Edison Tung
3b9d9de583
Added KMeans examples
...
LocalKMeans runs locally with a randomly generated dataset.
SparkLocalKMeans takes an input file and runs KMeans on it.
2011-11-21 16:37:58 -08:00