[SPARK-30941][PYSPARK] Add a note to asDict to document its behavior when there are duplicate fields

### What changes were proposed in this pull request?

This PR adds a note to the `Row.asDict` docstring describing its behavior when a row contains duplicate field names.

### Why are the changes needed?

When a row contains duplicate fields, `asDict` and `__getitem__` behave differently: each returns one of the duplicate fields, but not necessarily the same one. We should document this so that users are explicitly aware of the difference.
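The difference can be illustrated without a Spark session. The sketch below uses a hypothetical `MiniRow` class (not the real `pyspark.sql.Row`) and assumes that lookup by name returns the first matching field while dict conversion keeps the last one. That is one plausible way the two results can diverge; the note itself only promises that they may differ.

```python
class MiniRow(tuple):
    """Hypothetical stand-in for a row type; NOT pyspark.sql.Row."""

    def __new__(cls, fields, values):
        row = super().__new__(cls, values)
        row._fields = list(fields)  # field names, duplicates allowed
        return row

    def __getitem__(self, item):
        if isinstance(item, (int, slice)):
            return super().__getitem__(item)
        # Assumption: name lookup uses list.index, which returns the FIRST match.
        return super().__getitem__(self._fields.index(item))

    def as_dict(self):
        # Assumption: dict(zip(...)) is used, so the LAST duplicate overwrites earlier ones.
        return dict(zip(self._fields, self))


# A row shaped like the result of joining two tables that both have an "id" column.
row = MiniRow(["id", "name", "id"], [1, "Alice", 2])

by_name = row["id"]      # first "id" field
as_dict = row.as_dict()  # last "id" field wins
print(by_name, as_dict["id"])
```

With this stand-in, the two access paths disagree on `id`, which is why renaming or aliasing duplicate columns before a join is the safer choice when the result is converted to a dict.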

### Does this PR introduce any user-facing change?

No. This is a documentation-only change.

### How was this patch tested?

Existing tests.

Closes #27853 from viirya/SPARK-30941.

Authored-by: Liang-Chi Hsieh <viirya@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
Liang-Chi Hsieh 2020-03-09 11:06:45 -07:00 committed by Dongjoon Hyun
parent b6b0343e3e
commit d21aab403a


@@ -1528,6 +1528,12 @@ class Row(tuple):
:param recursive: turns the nested Rows to dict (default: False).
.. note:: If a row contains duplicate field names, e.g., the rows of a join
between two :class:`DataFrame` that both have the fields of same names,
one of the duplicate fields will be selected by ``asDict``. ``__getitem__``
will also return one of the duplicate fields, however returned value might
be different to ``asDict``.
>>> Row(name="Alice", age=11).asDict() == {'name': 'Alice', 'age': 11}
True
>>> row = Row(key=1, value=Row(name='a', age=2))