spark-instrumented-optimizer/python/pyspark
0x0FFF 6cd98c1878 [SPARK-10417] [SQL] Iterating through Column results in infinite loop
A `pyspark.sql.column.Column` object has a `__getitem__` method, which makes it iterable in Python. The method exists for the case where the column holds a list or dict, so that you can access a particular element of it through the DataFrame API. The ability to iterate over a `Column` is only a side effect, and it can confuse people who are getting familiar with Spark DataFrames (you might iterate this way over a Pandas DataFrame, for instance).

Issue reproduction:
```
df = sqlContext.jsonRDD(sc.parallelize(['{"name": "El Magnifico"}']))
for i in df["name"]:
    print(i)  # loops forever: iterating over a Column never ends
```
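
The mechanism can be sketched without Spark. When a class defines `__getitem__` but not `__iter__`, Python falls back to calling `__getitem__` with 0, 1, 2, ... until `IndexError` is raised; an object whose indexing always succeeds therefore iterates forever. The classes below are hypothetical stand-ins, not the real `Column`, and the `__iter__` guard is one plausible way to reject iteration, shown here as an assumption rather than as the exact patch:

```python
from itertools import islice


class LazyColumn:
    """Hypothetical stand-in for a Column: indexing builds a new expression."""

    def __init__(self, expr):
        self.expr = expr

    def __getitem__(self, key):
        # Never raises IndexError, so the fallback iteration protocol never stops.
        return LazyColumn("%s[%r]" % (self.expr, key))


class GuardedColumn(LazyColumn):
    """Same stand-in, but iteration is rejected explicitly."""

    def __iter__(self):
        raise TypeError("Column is not iterable")


# Without the guard, iteration yields name[0], name[1], ... without end;
# islice caps it at three items so the demonstration terminates.
first_three = [c.expr for c in islice(LazyColumn("name"), 3)]

# With the guard, the loop fails immediately instead of hanging.
try:
    for _ in GuardedColumn("name"):
        pass
    guard_error = None
except TypeError as exc:
    guard_error = str(exc)
```

Defining `__iter__` takes precedence over the `__getitem__` fallback, so element access such as `col["key"]` keeps working while a `for` loop fails fast with a clear error.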

Author: 0x0FFF <programmerag@gmail.com>

Closes #8574 from 0x0FFF/SPARK-10417.
2015-09-02 13:36:36 -07:00
..
ml [SPARK-9679] [ML] [PYSPARK] Add Python API for Stop Words Remover 2015-09-01 10:48:57 -07:00
mllib [SPARK-9805] [MLLIB] [PYTHON] [STREAMING] Added _eventually for ml streaming pyspark tests 2015-08-15 18:48:20 -07:00
sql [SPARK-10417] [SQL] Iterating through Column results in infinite loop 2015-09-02 13:36:36 -07:00
streaming [SPARK-10168] [STREAMING] Fix the issue that maven publishes wrong artifact jars 2015-08-24 12:38:01 -07:00
__init__.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
accumulators.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
broadcast.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
cloudpickle.py [SPARK-9116] [SQL] [PYSPARK] support Python only UDT in __main__ 2015-07-29 22:30:49 -07:00
conf.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
context.py [MINOR] [SQL] Fix sphinx warnings in PySpark SQL 2015-08-20 10:05:31 -07:00
daemon.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
files.py [SPARK-3309] [PySpark] Put all public API in __all__ 2014-09-03 11:49:45 -07:00
heapq3.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
java_gateway.py [SPARK-9700] Pick default page size more intelligently. 2015-08-06 23:18:29 -07:00
join.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
profiler.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
rdd.py [SPARK-9828] [PYSPARK] Mutable values should not be default arguments 2015-08-14 12:46:05 -07:00
rddsampler.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
resultiterable.py [SPARK-3074] [PySpark] support groupByKey() with single huge key 2015-04-09 17:07:23 -07:00
serializers.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
shell.py [SPARK-9270] [PYSPARK] allow --name option in pyspark 2015-07-24 11:56:55 -07:00
shuffle.py [SPARK-9116] [SQL] [PYSPARK] support Python only UDT in __main__ 2015-07-29 22:30:49 -07:00
statcounter.py [SPARK-9828] [PYSPARK] Mutable values should not be default arguments 2015-08-14 12:46:05 -07:00
status.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
storagelevel.py [SPARK-3417] Use new-style classes in PySpark 2014-09-08 15:45:36 -07:00
tests.py [SPARK-9244] Increase some memory defaults 2015-07-22 15:28:09 -07:00
traceback_utils.py [SPARK-1087] Move python traceback utilities into new traceback_utils.py file. 2014-09-15 19:28:17 -07:00
worker.py [SPARK-8976] [PYSPARK] fix open mode in python3 2015-08-13 17:33:37 -07:00