### What changes were proposed in this pull request?
Change the link to the Scala API document.
```
$ git grep "#org.apache.spark.package"
docs/_layouts/global.html: <li><a href="api/scala/index.html#org.apache.spark.package">Scala</a></li>
docs/index.md:* [Spark Scala API (Scaladoc)](api/scala/index.html#org.apache.spark.package)
docs/rdd-programming-guide.md:[Scala](api/scala/#org.apache.spark.package), [Java](api/java/), [Python](api/python/) and [R](api/R/).
```
### Why are the changes needed?
The home page link for Scala API document is incorrect after upgrade to 3.0
### Does this PR introduce any user-facing change?
Document UI change only.
### How was this patch tested?
Local test, attach screenshots below:
Before:
![image](https://user-images.githubusercontent.com/4833765/74335713-c2385300-4dd7-11ea-95d8-f5a3639d2578.png)
After:
![image](https://user-images.githubusercontent.com/4833765/74335727-cbc1bb00-4dd7-11ea-89d9-4dcc1310e679.png)
Closes#27549 from xuanyuanking/scala-doc.
Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Sean Owen <srowen@gmail.com>
## What changes were proposed in this pull request?
Add AL2 license to metadata of all .md files.
This seemed to be the tidiest way as it will get ignored by .md renderers and other tools. Attempts to write them as markdown comments revealed that there is no such standard thing.
## How was this patch tested?
Doc build
Closes#24243 from srowen/SPARK-26918.
Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
## What changes were proposed in this pull request?
Remove Kafka 0.8 integration
## How was this patch tested?
Existing tests, build scripts
Closes#22703 from srowen/SPARK-25705.
Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
## What changes were proposed in this pull request?
Put Kafka 0.8 support behind a kafka-0-8 profile.
## How was this patch tested?
Existing tests, but, until PR builder and Jenkins configs are updated the effect here is to not build or test Kafka 0.8 support at all.
Author: Sean Owen <sowen@cloudera.com>
Closes#19134 from srowen/SPARK-21893.
## What changes were proposed in this pull request?
Doc for the Kafka 0.10 integration
## How was this patch tested?
Scala code examples were taken from my example repo, so hopefully they compile.
Author: cody koeninger <cody@koeninger.org>
Closes#14385 from koeninger/SPARK-16312.
## What changes were proposed in this pull request?
Fixed broken java code examples in streaming documentation
Attn: tdas
Author: Matthew Wise <matthew.rs.wise@gmail.com>
Closes#13388 from mawise/fix_docs_java_streaming_example.
## What changes were proposed in this pull request?
Renaming the streaming-kafka artifact to include kafka version, in anticipation of needing a different artifact for later kafka versions
## How was this patch tested?
Unit tests
Author: cody koeninger <cody@koeninger.org>
Closes#12946 from koeninger/SPARK-15085.
This PR added instructions to get flume assembly jar for Python users in the flume integration page like Kafka doc.
Author: Shixiong Zhu <shixiong@databricks.com>
Closes#10746 from zsxwing/flume-doc.
With the merge of [SPARK-8337](https://issues.apache.org/jira/browse/SPARK-8337), now the Python API has the same functionalities compared to Scala/Java, so here changing the description to make it more precise.
zsxwing tdas , please review, thanks a lot.
Author: jerryshao <sshao@hortonworks.com>
Closes#10246 from jerryshao/direct-kafka-doc-update.
tdas koeninger
This updates the Spark Streaming + Kafka Integration Guide doc with a working method to access the offsets of a `KafkaRDD` through Python.
Author: Nick Evans <me@nicolasevans.org>
Closes#9289 from manygrams/update_kafka_direct_python_docs.
Author: Mahmoud Lababidi <lababidi@gmail.com>
Closes#8076 from lababidi/master and squashes the following commits:
af4553b [Mahmoud Lababidi] Fixed AtmoicReference<> Example
Author: cody koeninger <cody@koeninger.org>
Closes#6863 from koeninger/SPARK-8390 and squashes the following commits:
26a06bd [cody koeninger] Merge branch 'master' into SPARK-8390
3744492 [cody koeninger] [Streaming][Kafka][SPARK-8390] doc changes per TD, test to make sure approach shown in docs actually compiles + runs
b108c9d [cody koeninger] [Streaming][Kafka][SPARK-8390] further doc fixes, clean up spacing
bb4336b [cody koeninger] [Streaming][Kafka][SPARK-8390] fix docs related to HasOffsetRanges, cleanup
3f3c57a [cody koeninger] [Streaming][Kafka][SPARK-8389] Example of getting offset ranges out of the existing java direct stream api
- Kinesis API updated
- Kafka version updated, and Python API for Direct Kafka added
- Added SQLContext.getOrCreate()
- Added information on how to get partitionId in foreachRDD
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes#6781 from tdas/SPARK-7284 and squashes the following commits:
aac7be0 [Tathagata Das] Added information on how to get partition id
a66ec22 [Tathagata Das] Complete the line incomplete line,
a92ca39 [Tathagata Das] Updated streaming documentation
I noticed there is a duplicated description about WAL.
```
To ensure zero-data loss, you have to additionally enable Write Ahead Logs in Spark Streaming. To ensure zero data loss, enable the Write Ahead Logs (introduced in Spark 1.2).
```
Let's remove the duplication.
I don't file this issue in JIRA because it's minor.
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
Closes#6719 from sarutak/remove-multiple-description and squashes the following commits:
cc9bb21 [Kousuke Saruta] Removed duplicated description about WAL
Fixed the broken links (Examples) in the documentation.
Author: Akhil Das <akhld@darktech.ca>
Closes#6666 from akhld/patch-2 and squashes the following commits:
2228b83 [Akhil Das] Update streaming-kafka-integration.md
Updates to the documentation are as follows:
- Added information on Kafka Direct API and Kafka Python API
- Added joins to the main streaming guide
- Improved details on the fault-tolerance semantics
Generated docs located here
http://people.apache.org/~tdas/spark-1.3.0-temp-docs/streaming-programming-guide.html#fault-tolerance-semantics
More things to add:
- Configuration for Kafka receive rate
- May be add concurrentJobs
Author: Tathagata Das <tathagata.das1565@gmail.com>
Closes#4956 from tdas/streaming-guide-update-1.3 and squashes the following commits:
819408c [Tathagata Das] Minor fixes.
debe484 [Tathagata Das] Added DataFrames and MLlib
380cf8d [Tathagata Das] Fix link
04167a6 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into streaming-guide-update-1.3
0b77486 [Tathagata Das] Updates based on Josh's comments.
86c4c2a [Tathagata Das] Updated streaming guides
82de92a [Tathagata Das] Add Kafka to Python api docs
- Also fixed java link
Author: Jongyoul Lee <jongyoul@gmail.com>
Closes#4172 from jongyoul/SPARK-FIXDOC and squashes the following commits:
6be03e5 [Jongyoul Lee] [SPARK-5058] Part 2. Typos and broken URL - Also fixed java link
Updated the broken link pointing to the KafkaWordCount example to the correct one.
Author: sigmoidanalytics <mayur@sigmoidanalytics.com>
Closes#3877 from sigmoidanalytics/patch-1 and squashes the following commits:
3e19b31 [sigmoidanalytics] Updated broken links
Changed projrect to project :)
Author: Akhil Das <akhld@darktech.ca>
Closes#3876 from akhld/patch-1 and squashes the following commits:
e0cf9ef [Akhil Das] Fixed typos in streaming-kafka-integration.md
Important updates to the streaming programming guide
- Make the fault-tolerance properties easier to understand, with information about write ahead logs
- Update the information about deploying the spark streaming app with information about Driver HA
- Update Receiver guide to discuss reliable vs unreliable receivers.
Author: Tathagata Das <tathagata.das1565@gmail.com>
Author: Josh Rosen <joshrosen@databricks.com>
Author: Josh Rosen <rosenville@gmail.com>
Closes#3653 from tdas/streaming-doc-update-1.2 and squashes the following commits:
f53154a [Tathagata Das] Addressed Josh's comments.
ce299e4 [Tathagata Das] Minor update.
ca19078 [Tathagata Das] Minor change
f746951 [Tathagata Das] Mentioned performance problem with WAL
7787209 [Tathagata Das] Merge branch 'streaming-doc-update-1.2' of github.com:tdas/spark into streaming-doc-update-1.2
2184729 [Tathagata Das] Updated Kafka and Flume guides with reliability information.
2f3178c [Tathagata Das] Added more information about writing reliable receivers in the custom receiver guide.
91aa5aa [Tathagata Das] Improved API Docs menu
5707581 [Tathagata Das] Added Pythn API badge
b9c8c24 [Tathagata Das] Merge pull request #26 from JoshRosen/streaming-programming-guide
b8c8382 [Josh Rosen] minor fixes
a4ef126 [Josh Rosen] Restructure parts of the fault-tolerance section to read a bit nicer when skipping over the headings
65f66cd [Josh Rosen] Fix broken link to fault-tolerance semantics section.
f015397 [Josh Rosen] Minor grammar / pluralization fixes.
3019f3a [Josh Rosen] Fix minor Markdown formatting issues
aa8bb87 [Tathagata Das] Small update.
195852c [Tathagata Das] Updated based on Josh's comments, updated receiver reliability and deploying section, and also updated configuration.
17b99fb [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into streaming-doc-update-1.2
a0217c0 [Tathagata Das] Changed Deploying menu layout
67fcffc [Tathagata Das] Added cluster mode + supervise example to submitting application guide.
e45453b [Tathagata Das] Update streaming guide, added deploying section.
192c7a7 [Tathagata Das] Added more info about Python API, and rewrote the checkpointing section.
Updated the main streaming programming guide, and also added source-specific guides for Kafka, Flume, Kinesis.
Author: Tathagata Das <tathagata.das1565@gmail.com>
Author: Jacek Laskowski <jacek@japila.pl>
Closes#2254 from tdas/streaming-doc-fix and squashes the following commits:
e45c6d7 [Jacek Laskowski] More fixes from an old PR
5125316 [Tathagata Das] Fixed links
dc02f26 [Tathagata Das] Refactored streaming kinesis guide and made many other changes.
acbc3e3 [Tathagata Das] Fixed links between streaming guides.
cb7007f [Tathagata Das] Added Streaming + Flume integration guide.
9bd9407 [Tathagata Das] Updated streaming programming guide with additional information from SPARK-2419.