spark-instrumented-optimizer/python/pyspark/sql/tests
Terry Kim 3175f4bf1b [SPARK-29664][PYTHON][SQL] Column.getItem behavior is not consistent with Scala
### What changes were proposed in this pull request?

This PR changes PySpark's `Column.getItem` so that it calls `Column.getItem` on the Scala side instead of `Column.apply`.

### Why are the changes needed?

The current behavior is not consistent with that of Scala.

In PySpark:
```Python
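from pyspark.sql.functions import col, create_map, lit

# Assumes an active SparkSession named `spark`.
df = spark.range(2)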
map_col = create_map(lit(0), lit(100), lit(1), lit(200))
df.withColumn("mapped", map_col.getItem(col('id'))).show()
# +---+------+
# | id|mapped|
# +---+------+
# |  0|   100|
# |  1|   200|
# +---+------+
```
In Scala:
```Scala
val df = spark.range(2)
val map_col = map(lit(0), lit(100), lit(1), lit(200))
// The following getItem results in the following exception, which is the right behavior:
// java.lang.RuntimeException: Unsupported literal type class org.apache.spark.sql.Column id
//  at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:78)
//  at org.apache.spark.sql.Column.getItem(Column.scala:856)
//  ... 49 elided
df.withColumn("mapped", map_col.getItem(col("id"))).show
```

### Does this PR introduce any user-facing change?

Yes. A user who wants to pass a `Column` object to `getItem` now needs to use the indexing operator instead to get the previous behavior.

```Python
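from pyspark.sql.functions import col, create_map, lit

# Assumes an active SparkSession named `spark`.
df = spark.range(2)
map_col = create_map(lit(0), lit(100), lit(1), lit(200))
df.withColumn("mapped", map_col[col('id')]).show()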
# +---+------+
# | id|mapped|
# +---+------+
# |  0|   100|
# |  1|   200|
# +---+------+
```
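
As a rough illustration of the dispatch change, here is a toy model in plain Python. It is not PySpark's actual source: `FakeJvmColumn` and `PyColumn` are hypothetical stand-ins for the JVM-side and Python-side column objects, and the return values are placeholders. It only sketches which JVM method each Python entry point is assumed to forward to after this change.

```python
class FakeJvmColumn:
    """Hypothetical stand-in for the Scala-side Column; not a real Spark class."""

    def apply(self, key):
        # Modeled on Column.apply: a Column key is accepted as an extraction expression.
        return ("extract", key)

    def getItem(self, key):
        # Modeled on Column.getItem: the key is treated as a literal,
        # so passing a Column is rejected (compare the Scala stack trace above).
        if isinstance(key, PyColumn):
            raise RuntimeError("Unsupported literal type: Column")
        return ("get_item", key)


class PyColumn:
    """Hypothetical Python-side wrapper that forwards to the JVM stand-in."""

    def __init__(self, name):
        self.name = name
        self._jc = FakeJvmColumn()

    def getItem(self, key):
        # After this change: forward to getItem rather than apply.
        return self._jc.getItem(key)

    def __getitem__(self, key):
        # The indexing operator is assumed to keep going through apply.
        return self._jc.apply(key)
```

In this sketch, `PyColumn("m").getItem(0)` and `PyColumn("m")[PyColumn("id")]` both succeed, while `PyColumn("m").getItem(PyColumn("id"))` raises, mirroring the Scala behavior described above.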

### How was this patch tested?

Existing tests.

Closes #26351 from imback82/spark-29664.

Authored-by: Terry Kim <yuminkim@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>
2019-11-01 12:25:48 +09:00
__init__.py [SPARK-26032][PYTHON] Break large sql/tests.py files into smaller files 2018-11-14 14:51:11 +08:00
test_arrow.py [SPARK-28881][PYTHON][TESTS][FOLLOW-UP] Use SparkSession(SparkContext(...)) to prevent for Spark conf to affect other tests 2019-08-28 10:39:21 +09:00
test_catalog.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_column.py [SPARK-29664][PYTHON][SQL] Column.getItem behavior is not consistent with Scala 2019-11-01 12:25:48 +09:00
test_conf.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_context.py [SPARK-28980][CORE][SQL][STREAMING][MLLIB] Remove most items deprecated in Spark 2.2.0 or earlier, for Spark 3 2019-09-09 10:19:40 -05:00
test_dataframe.py [SPARK-27659][PYTHON] Allow PySpark to prefetch during toLocalIterator 2019-09-20 09:59:31 -07:00
test_datasources.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_functions.py [SPARK-28153][PYTHON] Use AtomicReference at InputFileBlockHolder (to support input_file_name with Python UDF) 2019-07-31 22:40:01 +08:00
test_group.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_pandas_udf.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_pandas_udf_cogrouped_map.py [SPARK-27463][PYTHON][FOLLOW-UP] Miscellaneous documentation and code cleanup of cogroup pandas UDF 2019-09-30 22:25:35 +09:00
test_pandas_udf_grouped_agg.py [SPARK-28422][SQL][PYTHON] GROUPED_AGG pandas_udf should work without group by clause 2019-08-14 00:32:33 +09:00
test_pandas_udf_grouped_map.py [SPARK-29402][PYTHON][TESTS] Added tests for grouped map pandas_udf with window 2019-10-11 16:19:13 -07:00
test_pandas_udf_iter.py [SPARK-28198][PYTHON][FOLLOW-UP] Rename mapPartitionsInPandas to mapInPandas with a separate evaluation type 2019-07-05 09:22:41 +09:00
test_pandas_udf_scalar.py [SPARK-28998][SQL] reorganize the packages of DS v2 interfaces/classes 2019-09-12 19:59:34 +08:00
test_pandas_udf_window.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_readwriter.py [SPARK-28411][PYTHON][SQL] InsertInto with overwrite is not honored 2019-07-18 13:37:59 +09:00
test_serde.py [SPARK-29041][PYTHON] Allows createDataFrame to accept bytes as binary type 2019-09-12 08:52:25 +09:00
test_session.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_streaming.py [SPARK-28130][PYTHON] Print pretty messages for skipped tests when xmlrunner is available in PySpark 2019-06-24 09:58:17 +09:00
test_types.py [SPARK-28454][PYTHON] Validate LongType in createDataFrame(verifySchema=True) 2019-08-08 11:47:25 +09:00
test_udf.py [SPARK-28998][SQL] reorganize the packages of DS v2 interfaces/classes 2019-09-12 19:59:34 +08:00
test_utils.py [SPARK-19926][PYSPARK] make captured exception from JVM side user friendly 2019-09-18 23:32:10 +09:00