spark-instrumented-optimizer/python/pyspark/sql
HyukjinKwon fda0e6e48d [SPARK-29240][PYTHON] Pass Py4J column instance to support PySpark column in element_at function
### What changes were proposed in this pull request?

This PR makes `element_at` in PySpark able to take PySpark `Column` instances.

### Why are the changes needed?

To match the Scala side. This appears to have been the intended behavior, but it did not work because of a bug.

### Does this PR introduce any user-facing change?

Yes. See below:

```python
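from pyspark.sql import functions as F

# Assumes an active SparkSession bound to `spark` (e.g. in the pyspark shell).
x = spark.createDataFrame([([1, 2, 3], 1), ([4, 5, 6], 2), ([7, 8, 9], 3)], ['list', 'num'])
# Pass a Column (not a Python int) as the index argument.
x.withColumn('aa', F.element_at('list', x.num.cast('int'))).show()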
```

Before:

```
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.../spark/python/pyspark/sql/functions.py", line 2059, in element_at
    return Column(sc._jvm.functions.element_at(_to_java_column(col), extraction))
  File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1277, in __call__
  File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1241, in _build_args
  File "/.../spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1228, in _get_args
  File "/.../forked/spark/python/lib/py4j-0.10.8.1-src.zip/py4j/java_collections.py", line 500, in convert
  File "/.../spark/python/pyspark/sql/column.py", line 344, in __iter__
    raise TypeError("Column is not iterable")
TypeError: Column is not iterable
```
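The traceback suggests why the call failed: the `Column` passed as `extraction` reached Py4J as a plain Python object, Py4J tried to convert it as a collection by iterating over it, and `Column.__iter__` raises `TypeError`. A minimal pure-Python sketch of that failure mode follows; `FakeColumn`, `convert_as_list`, and `try_convert` are hypothetical stand-ins for illustration, not PySpark or Py4J code.

```python
class FakeColumn:
    """Hypothetical stand-in for pyspark.sql.Column (illustration only)."""

    def __iter__(self):
        # Mirrors the error message shown in the traceback above.
        raise TypeError("Column is not iterable")


def convert_as_list(obj):
    """Hypothetical stand-in for a collection converter that iterates its argument."""
    return [item for item in obj]


def try_convert(obj):
    """Return the converted list, or the error message if iteration fails."""
    try:
        return convert_as_list(obj)
    except TypeError as exc:
        return str(exc)


print(try_convert([1, 2, 3]))     # a real list converts without error
print(try_convert(FakeColumn()))  # the column stand-in fails on iteration
```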

After:

```
+---------+---+---+
|     list|num| aa|
+---------+---+---+
|[1, 2, 3]|  1|  1|
|[4, 5, 6]|  2|  5|
|[7, 8, 9]|  3|  9|
+---------+---+---+
```
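To read the `aa` column: for arrays, `element_at` uses 1-based indexing, so each row returns the element of `list` at position `num`. The sketch below reproduces the table's values in plain Python; `element_at_py` is a hypothetical helper for illustration, not Spark's implementation, and it only handles positive indexes.

```python
def element_at_py(values, index):
    """Hypothetical 1-based lookup over a Python list (illustration only)."""
    if index < 1 or index > len(values):
        # Simplification of this sketch; it does not try to mirror
        # Spark's handling of out-of-range or negative indexes.
        return None
    return values[index - 1]


# Same rows as the DataFrame in the example above.
rows = [([1, 2, 3], 1), ([4, 5, 6], 2), ([7, 8, 9], 3)]
result = [element_at_py(lst, num) for lst, num in rows]
print(result)  # [1, 5, 9]
```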

### How was this patch tested?

Manually tested with literals, Python native types, and PySpark columns.

Closes #25950 from HyukjinKwon/SPARK-29240.

Authored-by: HyukjinKwon <gurwls223@apache.org>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
2019-09-27 11:04:55 -07:00

| Name | Last commit | Date |
|------|-------------|------|
| `avro` | [SPARK-28698][SQL] Support user-specified output schema in to_avro | 2019-08-13 20:52:16 +08:00 |
| `tests` | [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator | 2019-09-20 09:59:31 -07:00 |
| `__init__.py` | [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3 | 2019-09-09 10:19:40 -05:00 |
| `catalog.py` | [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3 | 2019-09-09 10:19:40 -05:00 |
| `cogroup.py` | [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs | 2019-09-17 17:13:50 -07:00 |
| `column.py` | [SPARK-28031][PYSPARK][TEST] Improve doctest on over function of Column | 2019-06-13 11:04:41 +09:00 |
| `conf.py` | [SPARK-23698][PYTHON] Resolve undefined names in Python 3 | 2018-08-22 10:06:59 -07:00 |
| `context.py` | [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3 | 2019-09-09 10:19:40 -05:00 |
| `dataframe.py` | [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator | 2019-09-20 09:59:31 -07:00 |
| `functions.py` | [SPARK-29240][PYTHON] Pass Py4J column instance to support PySpark column in element_at function | 2019-09-27 11:04:55 -07:00 |
| `group.py` | [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs | 2019-09-17 17:13:50 -07:00 |
| `readwriter.py` | [SPARK-28977][DOCS][SQL] Fix DataFrameReader.json docs to doc that partition column can be numeric, date or timestamp type | 2019-09-05 18:32:45 +09:00 |
| `session.py` | [SPARK-27995][PYTHON] Note the difference between str of Python 2 and 3 at Arrow optimized | 2019-06-11 18:43:59 +09:00 |
| `streaming.py` | [SPARK-28651][SS] Force the schema of Streaming file source to be nullable | 2019-08-09 18:54:55 +09:00 |
| `types.py` | [SPARK-29041][PYTHON] Allows createDataFrame to accept bytes as binary type | 2019-09-12 08:52:25 +09:00 |
| `udf.py` | [SPARK-27463][PYTHON] Support Dataframe Cogroup via Pandas UDFs | 2019-09-17 17:13:50 -07:00 |
| `utils.py` | [SPARK-21045][PYTHON] Allow non-ascii string as an exception message from python execution in Python 2 | 2019-09-21 08:09:19 +09:00 |
| `window.py` | [SPARK-28855][CORE][ML][SQL][STREAMING] Remove outdated usages of Experimental, Evolving annotations | 2019-09-01 10:15:00 -05:00 |