spark-instrumented-optimizer/python/pyspark/sql
Tibor Csögör eec1a3c286 [SPARK-23299][SQL][PYSPARK] Fix __repr__ behaviour for Rows
This PR is meant to replace #20503, which lay dormant for a while. The solution in the original PR is still valid, so this is just that patch rebased onto the current master.

Original summary follows.

## What changes were proposed in this pull request?

Fix `__repr__` behaviour for Rows.

`Row.__repr__` assumes every value is a string when column names are missing.
Examples:

```
>>> from pyspark.sql.types import Row
>>> Row("Alice", "11")
<Row(Alice, 11)>

>>> Row(name="Alice", age=11)
Row(age=11, name='Alice')

>>> Row("Alice", 11)
<snip stack trace>
TypeError: sequence item 1: expected string, int found
```

This is because `Row()`, when called without column names, assumes every value is a string.
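The failure mode can be reproduced without Spark. The following is a minimal sketch, not the actual PySpark code: `MiniRow` and `buggy_repr` are hypothetical names introduced only for illustration, and the sketch assumes the original `__repr__` joined the raw values with `str.join`, which would explain the `TypeError` above. Formatting each value with `repr()` first avoids the error.

```python
class MiniRow(tuple):
    """Hypothetical stand-in for a Row created without column names."""

    def __new__(cls, *values):
        # Store the positional values as the tuple's items.
        return super().__new__(cls, values)

    def buggy_repr(self):
        # Assumed shape of the old behaviour: str.join requires every
        # item to be a string, so any non-string value raises TypeError.
        return "<Row(%s)>" % ", ".join(self)

    def __repr__(self):
        # Convert each value with repr() before joining, so values of
        # any type can be displayed.
        return "<Row(%s)>" % ", ".join(repr(value) for value in self)


if __name__ == "__main__":
    print(MiniRow("Alice", "11").buggy_repr())  # all strings: works
    try:
        MiniRow("Alice", 11).buggy_repr()
    except TypeError as exc:
        print("TypeError:", exc)  # mixed types: join fails
    print(repr(MiniRow("Alice", 11)))  # repr-based formatting: works
```

With the `repr()`-based formatting, string values are shown quoted, which also distinguishes the string `'11'` from the integer `11` in the output.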

## How was this patch tested?

Manually tested and a unit test was added to `python/pyspark/sql/tests/test_types.py`.

Closes #24448 from tbcs/SPARK-23299.

Lead-authored-by: Tibor Csögör <tibi@tiborius.net>
Co-authored-by: Shashwat Anand <me@shashwat.me>
Signed-off-by: Holden Karau <holden@pigscanfly.ca>
2019-05-06 10:00:49 -07:00

| Name | Last commit | Date |
| --- | --- | --- |
| `avro` | [SPARK-26856][PYSPARK][FOLLOWUP] Fix UT failure due to wrong patterns for Kinesis assembly | 2019-04-02 14:52:56 +09:00 |
| `tests` | [SPARK-23299][SQL][PYSPARK] Fix `__repr__` behaviour for Rows | 2019-05-06 10:00:49 -07:00 |
| `__init__.py` | [SPARK-22369][PYTHON][DOCS] Exposes catalog API documentation in PySpark | 2017-11-02 15:22:52 +01:00 |
| `catalog.py` | [SPARK-24665][PYSPARK][FOLLOWUP] Use SQLConf in PySpark to manage all sql configs | 2018-08-17 10:18:08 +08:00 |
| `column.py` | [SPARK-23847][PYTHON][SQL] Add asc_nulls_first, asc_nulls_last to PySpark | 2018-04-08 12:09:06 +08:00 |
| `conf.py` | [SPARK-23698][PYTHON] Resolve undefined names in Python 3 | 2018-08-22 10:06:59 -07:00 |
| `context.py` | [SPARK-26640][CORE][ML][SQL][STREAMING][PYSPARK] Code cleanup from lgtm.com analysis | 2019-01-17 19:40:39 -06:00 |
| `dataframe.py` | [SPARK-27276][PYTHON][SQL] Increase minimum version of pyarrow to 0.12.1 and remove prior workarounds | 2019-04-22 19:30:31 +09:00 |
| `functions.py` | [SPARK-23619][DOCS] Add output description for some generator expressions / functions | 2019-04-27 10:30:12 +09:00 |
| `group.py` | [SPARK-24722][SQL] pivot() with Column type argument | 2018-08-04 14:17:32 +08:00 |
| `readwriter.py` | [MINOR][DOC][SQL] Remove out-of-date doc about ORC in DataFrameReader and Writer | 2019-04-03 09:11:09 -07:00 |
| `session.py` | [SPARK-27276][PYTHON][SQL] Increase minimum version of pyarrow to 0.12.1 and remove prior workarounds | 2019-04-22 19:30:31 +09:00 |
| `streaming.py` | [SPARK-23014][SS] Fully remove V1 memory sink. | 2019-04-29 09:44:23 -07:00 |
| `types.py` | [SPARK-23299][SQL][PYSPARK] Fix `__repr__` behaviour for Rows | 2019-05-06 10:00:49 -07:00 |
| `udf.py` | [SPARK-23836][PYTHON] Add support for StructType return in Scalar Pandas UDF | 2019-03-07 08:52:24 -08:00 |
| `utils.py` | [SPARK-23014][SS] Fully remove V1 memory sink. | 2019-04-29 09:44:23 -07:00 |
| `window.py` | [SPARK-26860][PYSPARK][SPARKR] Fix for RangeBetween and RowsBetween docs to be in sync with spark documentation | 2019-03-11 08:53:09 -05:00 |