spark-instrumented-optimizer

0x0FFF 6cd98c1878 [SPARK-10417] [SQL] Iterating through Column results in infinite loop	2015-09-02 13:36:36 -07:00

The `pyspark.sql.column.Column` object has a `__getitem__` method, which makes it iterable in Python. The method exists to handle the case where the column is a list or dict, so that you can access a specific element of it in the DataFrame API. The ability to iterate over it is just a side effect, and it can confuse people who are getting familiar with Spark DataFrames (you might iterate this way over a Pandas DataFrame, for instance).

Issue reproduction:

```
df = sqlContext.jsonRDD(sc.parallelize(['{"name": "El Magnifico"}']))
for i in df["name"]:
    print i
```

Author: 0x0FFF <programmerag@gmail.com>

Closes #8574 from 0x0FFF/SPARK-10417.
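The mechanism behind the infinite loop can be shown without Spark. The sketch below is only an illustration: `LazyColumn` and `GuardedColumn` are hypothetical stand-ins, not the PySpark classes, and the `__iter__` guard is one common way to block the fallback, assumed here rather than taken from the patch. When a class defines `__getitem__` but not `__iter__`, Python falls back to calling `obj[0]`, `obj[1]`, ... until an `IndexError` is raised. A `__getitem__` that builds a new expression never raises, so the loop never ends.

```python
from itertools import islice


class LazyColumn:
    """Hypothetical stand-in for a column expression (not pyspark's Column)."""

    def __init__(self, name):
        self.name = name

    def __getitem__(self, key):
        # Builds a new expression instead of looking up data,
        # so it never raises IndexError for any integer key.
        return LazyColumn("%s[%r]" % (self.name, key))


class GuardedColumn(LazyColumn):
    """Same stand-in, with iteration explicitly disabled (assumed guard)."""

    def __iter__(self):
        raise TypeError("Column is not iterable")


# islice caps the otherwise endless sequence-protocol iteration at 3 items.
first_three = [c.name for c in islice(LazyColumn("name"), 3)]
print(first_three)

# With an explicit __iter__, the attempt fails immediately instead of looping.
try:
    iter(GuardedColumn("name"))
    guarded_error = None
except TypeError as exc:
    guarded_error = str(exc)
print(guarded_error)
```

Indexing with `[]` still works on the guarded class, so element access on list or dict columns is unaffected by a guard of this kind.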
__init__.py	[SPARK-8060] Improve DataFrame Python test coverage and documentation.	2015-06-03 00:23:34 -07:00
column.py	[SPARK-10417] [SQL] Iterating through Column results in infinite loop	2015-09-02 13:36:36 -07:00
context.py	[SPARK-9942] [PYSPARK] [SQL] ignore exceptions while try to import pandas	2015-08-13 14:03:55 -07:00
dataframe.py	[SPARK-9613] [CORE] Ban use of JavaConversions and migrate all existing uses to JavaConverters	2015-08-25 12:33:13 +01:00
functions.py	[DOCS] [SQL] [PYSPARK] Fix typo in ntile function	2015-08-19 09:42:41 +01:00
group.py	[SPARK-8770][SQL] Create BinaryOperator abstract class.	2015-07-01 21:14:13 -07:00
readwriter.py	[SPARK-9964] [PYSPARK] [SQL] PySpark DataFrameReader accept RDD of String for JSON	2015-08-26 22:19:11 -07:00
tests.py	[SPARK-10417] [SQL] Iterating through Column results in infinite loop	2015-09-02 13:36:36 -07:00
types.py	[SPARK-10392] [SQL] Pyspark - Wrong DateType support on JDBC connection	2015-09-01 14:58:49 -07:00
utils.py	[SPARK-9166][SQL][PYSPARK] Capture and hide IllegalArgumentException in Python API	2015-07-19 00:32:56 -07:00
window.py	[SPARK-9978] [PYSPARK] [SQL] fix Window.orderBy and doc of ntile()	2015-08-14 13:55:29 -07:00