[SPARK-32338][SQL][PYSPARK][FOLLOW-UP] Update slice to accept Column for start and length
### What changes were proposed in this pull request?

This is a follow-up of #29138, which overloaded the `slice` function in Scala to accept `Column` for `start` and `length`. This PR updates the equivalent Python function to accept `Column` as well.

### Why are the changes needed?

Now that the Scala version accepts `Column`, the Python version should also accept it.

### Does this PR introduce _any_ user-facing change?

Yes, PySpark users will also be able to pass `Column` objects to the `start` and `length` parameters of the `slice` function.

### How was this patch tested?

Added tests.

Closes #29195 from ueshin/issues/SPARK-32338/slice.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
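As an illustration of the new behavior, here is a minimal sketch (not taken from the patch itself; the session setup and toy DataFrame are made up for the example) showing `slice` called with both plain integers and `Column` arguments:

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import slice, lit, size

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([([1, 2, 3],), ([4, 5],)], ['x'])

# Plain integers keep working as before.
df.select(slice(df.x, 2, 2).alias("sliced")).show()

# After this change, Column arguments are accepted too,
# e.g. a start or length computed per row.
df.select(slice(df.x, lit(2), size(df.x) - 1).alias("sliced")).show()
```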
parent f8d29d371c
commit 7b66882c9d
```diff
@@ -2068,7 +2068,11 @@ def slice(x, start, length):
     [Row(sliced=[2, 3]), Row(sliced=[5])]
     """
     sc = SparkContext._active_spark_context
-    return Column(sc._jvm.functions.slice(_to_java_column(x), start, length))
+    return Column(sc._jvm.functions.slice(
+        _to_java_column(x),
+        start._jc if isinstance(start, Column) else start,
+        length._jc if isinstance(length, Column) else length
+    ))
 
 
 @since(2.4)
```
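The `isinstance` checks follow a common PySpark idiom: a `Column` wraps a Java column object (`_jc`) that must be unwrapped before crossing the Py4J boundary, while plain Python values pass through unchanged. A minimal sketch of the idiom, using a hypothetical helper name:

```python
from pyspark.sql.column import Column

def _column_or_literal(value):
    # Hypothetical helper illustrating the idiom used in the patch:
    # unwrap a Column to its underlying Java column, and let Py4J
    # convert plain Python values (e.g. ints) to JVM primitives.
    return value._jc if isinstance(value, Column) else value
```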
```diff
@@ -292,6 +292,16 @@ class FunctionsTests(ReusedSQLTestCase):
         for result in results:
             self.assertEqual(result[0], '')
 
+    def test_slice(self):
+        from pyspark.sql.functions import slice, lit
+
+        df = self.spark.createDataFrame([([1, 2, 3],), ([4, 5],)], ['x'])
+
+        self.assertEquals(
+            df.select(slice(df.x, 2, 2).alias("sliced")).collect(),
+            df.select(slice(df.x, lit(2), lit(2)).alias("sliced")).collect(),
+        )
+
     def test_array_repeat(self):
         from pyspark.sql.functions import array_repeat, lit
 
```
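The added test asserts that calling `slice` with plain Python ints and with the equivalent `lit()` Columns collects to identical results, covering both accepted argument types.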