spark-instrumented-optimizer/python/pyspark/sql
Liang-Chi Hsieh 3aa933b162 [SPARK-36465][SS] Dynamic gap duration in session window
### What changes were proposed in this pull request?

This patch adds support for a dynamic gap duration in session windows.

### Why are the changes needed?

The gap duration of a session window is currently a static value. To support more complex use cases, it is better to support a dynamic gap duration, which is determined by looking at the current data. For example, in our use case, the gap may differ depending on a certain column in the input rows.
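
To illustrate the difference between a static and a dynamic gap, here is a minimal pure-Python sketch. It is not Spark's implementation and does not use the PySpark API. The `Event` class, the `kind` field, the `gap_for` rule, and the `sessionize` helper are all hypothetical names introduced for illustration. The sketch assumes each event opens a window `[ts, ts + gap)` and that overlapping windows are merged into one session.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    ts: int    # event time in seconds (hypothetical unit)
    kind: str  # hypothetical column that decides the gap


def gap_for(event: Event) -> int:
    """Hypothetical dynamic rule: clicks close a session sooner than views."""
    return 5 if event.kind == "click" else 10


def sessionize(
    events: List[Event], gap_fn: Callable[[Event], int]
) -> List[Tuple[int, int, int]]:
    """Return (start, end, count) per session, merging overlapping windows."""
    sessions: List[Tuple[int, int, int]] = []
    for e in sorted(events, key=lambda ev: ev.ts):
        start, end = e.ts, e.ts + gap_fn(e)
        if sessions and start < sessions[-1][1]:
            # The event falls inside the open session: extend it.
            prev_start, prev_end, count = sessions[-1]
            sessions[-1] = (prev_start, max(prev_end, end), count + 1)
        else:
            # No overlap: start a new session.
            sessions.append((start, end, 1))
    return sessions


events = [Event(0, "click"), Event(4, "view"), Event(13, "click"), Event(30, "view")]

static_sessions = sessionize(events, lambda e: 5)   # same gap for every row
dynamic_sessions = sessionize(events, gap_for)      # gap depends on the row

print(static_sessions)   # [(0, 9, 2), (13, 18, 1), (30, 35, 1)]
print(dynamic_sessions)  # [(0, 18, 3), (30, 40, 1)]
```

With the static gap, the event at `ts=13` starts a new session. With the dynamic gap, the earlier `view` event keeps the session open until `ts=14`, so the same event is merged into the first session.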

### Does this PR introduce _any_ user-facing change?

Yes, users can specify a dynamic gap duration.

### How was this patch tested?

Modified existing tests and added a new test.

Closes #33691 from viirya/dynamic-session-window-gap.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
(cherry picked from commit 8b8d91cf64)
Signed-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>
2021-08-16 11:06:16 +09:00
avro [SPARK-34300][PYSPARK][DOCS][MINOR] Fix some typos and syntax issues in docstrings and output of dev/lint-python 2021-02-02 09:30:50 +09:00
pandas [SPARK-36146][PYTHON][INFRA][TESTS] Upgrade Python version from 3.6 to 3.9 in GitHub Actions' linter/docs 2021-07-16 11:41:53 +09:00
tests [SPARK-36224][SQL] Use Void as the type name of NullType 2021-08-02 23:20:11 +08:00
__init__.py [SPARK-32138] Drop Python 2.7, 3.4 and 3.5 2020-07-14 11:22:44 +09:00
__init__.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
_typing.pyi [SPARK-36211][PYTHON] Correct typing of udf return value 2021-07-27 09:09:11 +02:00
catalog.py [SPARK-33730][PYTHON] Standardize warning types 2021-01-18 09:32:55 +09:00
catalog.pyi [SPARK-35019][PYTHON][SQL] Fix type hints mismatches in pyspark.sql.* 2021-04-13 11:21:13 +09:00
column.py [SPARK-36160][PYTHON][DOCS] Clarifying documentation for pyspark sql/column 2021-07-16 21:41:57 +09:00
column.pyi [SPARK-34630][PYTHON][SQL] Added typehint for pyspark.sql.Column.contains 2021-03-24 15:21:19 +01:00
conf.py [SPARK-32138] Drop Python 2.7, 3.4 and 3.5 2020-07-14 11:22:44 +09:00
conf.pyi [SPARK-35019][PYTHON][SQL] Fix type hints mismatches in pyspark.sql.* 2021-04-13 11:21:13 +09:00
context.py [MINOR][DOCS] Avoid some python docs where first sentence has "e.g." or similar 2021-05-12 10:38:59 +09:00
context.pyi [SPARK-35019][PYTHON][SQL] Fix type hints mismatches in pyspark.sql.* 2021-04-13 11:21:13 +09:00
dataframe.py [SPARK-36161][PYTHON] Add type check on dropDuplicates pyspark function 2021-07-29 19:11:57 +09:00
dataframe.pyi [SPARK-35605][PYTHON] Move to_pandas_on_spark to the Spark DataFrame 2021-06-28 11:47:09 +09:00
functions.py [SPARK-36465][SS] Dynamic gap duration in session window 2021-08-16 11:06:16 +09:00
functions.pyi [SPARK-36465][SS] Dynamic gap duration in session window 2021-08-16 11:06:16 +09:00
group.py [SPARK-36226][PYTHON][DOCS] Improve python docstring links to other classes 2021-07-23 19:18:00 +09:00
group.pyi [SPARK-32714][PYTHON] Initial pyspark-stubs port 2020-09-24 14:15:36 +09:00
readwriter.py [SPARK-36181][PYTHON] Update pyspark sql readwriter documentation 2021-07-19 19:51:24 +09:00
readwriter.pyi [SPARK-33566][CORE][SQL][SS][PYTHON] Make unescapedQuoteHandling option configurable when read CSV 2020-11-27 15:47:39 +09:00
session.py [SPARK-36226][PYTHON][DOCS] Improve python docstring links to other classes 2021-07-23 19:18:00 +09:00
session.pyi [SPARK-33457][PYTHON] Adjust mypy configuration 2020-11-25 09:27:04 +09:00
streaming.py [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page 2021-06-01 10:58:49 +09:00
streaming.pyi [SPARK-33836][SS][PYTHON] Expose DataStreamReader.table and DataStreamWriter.toTable 2020-12-21 19:42:59 +09:00
types.py [SPARK-36224][SQL] Use Void as the type name of NullType 2021-08-02 23:20:11 +08:00
types.pyi [SPARK-33457][PYTHON] Adjust mypy configuration 2020-11-25 09:27:04 +09:00
udf.py [SPARK-34408][PYTHON] Refactor spark.udf.register to share the same path to generate UDF instance 2021-02-11 10:57:02 +09:00
udf.pyi [SPARK-33457][PYTHON] Adjust mypy configuration 2020-11-25 09:27:04 +09:00
utils.py Spelling r common dev mlib external project streaming resource managers python 2020-11-27 10:22:45 -06:00
window.py [SPARK-33250][PYTHON][DOCS] Migration to NumPy documentation style in SQL (pyspark.sql.*) 2020-11-03 10:00:49 +09:00
window.pyi [SPARK-33250][PYTHON][DOCS] Migration to NumPy documentation style in SQL (pyspark.sql.*) 2020-11-03 10:00:49 +09:00