[MINOR][SS][DOCS] Update doc for streaming deduplication
### What changes were proposed in this pull request?
This patch fixes an error in the description of streaming deduplication in the Structured Streaming guide, and also updates the list of unsupported operations.
### Why are the changes needed?
Update the user document.
### Does this PR introduce _any_ user-facing change?
No. It's a doc only change.
### How was this patch tested?
Doc only change.
Closes #33801 from viirya/minor-ss-deduplication.
Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
(cherry picked from commit 5876e04de2)
Signed-off-by: Liang-Chi Hsieh <viirya@gmail.com>
parent 45c4b751f3
commit 212a21ee4f
@@ -1820,6 +1820,8 @@ Some of them are as follows.
 
 - Distinct operations on streaming Datasets are not supported.
 
+- Deduplication operation is not supported after aggregation on a streaming Datasets.
+
 - Sorting operations are supported on streaming Datasets only after an aggregation and in Complete Output Mode.
 
 - Few types of outer joins on streaming Datasets are not supported. See the
@@ -3464,7 +3466,7 @@ the effect of the change is not well-defined. For all of them:
 
 - *Streaming aggregation*: For example, `sdf.groupBy("a").agg(...)`. Any change in number or type of grouping keys or aggregates is not allowed.
 
-- *Streaming deduplication*: For example, `sdf.dropDuplicates("a")`. Any change in number or type of grouping keys or aggregates is not allowed.
+- *Streaming deduplication*: For example, `sdf.dropDuplicates("a")`. Any change in number or type of deduplicating columns is not allowed.
 
 - *Stream-stream join*: For example, `sdf1.join(sdf2, ...)` (i.e. both inputs are generated with `sparkSession.readStream`). Changes
 in the schema or equi-joining columns are not allowed. Changes in join type (outer or inner) are not allowed. Other changes in the join condition are ill-defined.
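The corrected deduplication item says that the number and type of deduplicating columns must not change between restarts of a query. The plain-Python sketch below only models that stated rule to make it concrete. It is not Spark's own check: the function name `dedup_columns_compatible` and the `(name, type)` column representation are hypothetical, and ignoring column names is an assumption of this sketch.

```python
from typing import Sequence, Tuple

# Hypothetical representation: each deduplicating column is a (name, type) pair.
Column = Tuple[str, str]


def dedup_columns_compatible(old: Sequence[Column], new: Sequence[Column]) -> bool:
    """Return True if a restarted query keeps the same number and types of
    deduplicating columns as the query that wrote the checkpoint.

    Illustration only: this models the documented rule, not Spark's own check.
    """
    if len(old) != len(new):
        return False  # the number of deduplicating columns changed
    # Compare types position by position; names are ignored in this sketch.
    return all(o_type == n_type for (_, o_type), (_, n_type) in zip(old, new))


before = [("a", "string")]
print(dedup_columns_compatible(before, [("a", "string")]))                # True
print(dedup_columns_compatible(before, [("a", "string"), ("b", "int")]))  # False
print(dedup_columns_compatible(before, [("a", "int")]))                   # False
```

Under these assumptions, restarting `sdf.dropDuplicates("a")` as a query that deduplicates on two columns, or on a column of a different type, falls into the "not allowed" case described by the corrected line.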