Also added desc/asc function for constructing sorting expressions more conveniently. And added a small fix to lift alias out of cast expression.
Author: Reynold Xin <rxin@databricks.com>
Closes#4752 from rxin/SPARK-5985 and squashes the following commits:
aeda5ae [Reynold Xin] Added Experimental flag to ColumnName.
047ad03 [Reynold Xin] Lift alias out of cast.
c9cf17c [Reynold Xin] [SPARK-5985][SQL] DataFrame sortBy -> orderBy in Python.
1. Column is no longer a DataFrame to simplify class hierarchy.
2. Don't use varargs on abstract methods (see Scala compiler bug SI-9013).
Author: Reynold Xin <rxin@databricks.com>
Closes#4686 from rxin/SPARK-5904 and squashes the following commits:
fd9b199 [Reynold Xin] Fixed Python tests.
df25cef [Reynold Xin] Non final.
5221530 [Reynold Xin] [SPARK-5904][SQL] DataFrame API fixes.
The `int` is 64-bit on 64-bit machine (very common now), we should infer it as LongType for it in Spark SQL.
Also, LongType in SQL will come back as `int`.
Author: Davies Liu <davies@databricks.com>
Closes#4666 from davies/long and squashes the following commits:
6bc6cc4 [Davies Liu] infer int as LongType
Also add tests for distinct()
Author: Davies Liu <davies@databricks.com>
Closes#4667 from davies/repartition and squashes the following commits:
79059fd [Davies Liu] add test
cb4915e [Davies Liu] fix repartition
Author: Davies Liu <davies@databricks.com>
Closes#4658 from davies/explain and squashes the following commits:
db87ea2 [Davies Liu] output explain in Python
1. added explain()
2. add isLocal()
3. do not call show() in __repl__
4. add foreach() and foreachPartition()
5. add distinct()
6. fix functions.col()/column()/lit()
7. fix unit tests in sql/functions.py
8. fix unicode in showString()
Author: Davies Liu <davies@databricks.com>
Closes#4645 from davies/df6 and squashes the following commits:
6b46a2c [Davies Liu] fix DataFrame Python API
Author: Yin Huai <yhuai@databricks.com>
Closes#4542 from yhuai/moveSaveMode and squashes the following commits:
65a4425 [Yin Huai] Move SaveMode to sql package.
1. DataFrame.renameColumn
2. DataFrame.show() and _repr_
3. Use simpleString() rather than jsonValue in DataFrame.dtypes
4. createDataFrame from local Python data, including pandas.DataFrame
Author: Davies Liu <davies@databricks.com>
Closes#4528 from davies/df3 and squashes the following commits:
014acea [Davies Liu] fix typo
6ba526e [Davies Liu] fix tests
46f5f95 [Davies Liu] address comments
6cbc154 [Davies Liu] dataframe.show() and improve dtypes
6f94f25 [Davies Liu] create DataFrame from local Python data
Author: Michael Armbrust <michael@databricks.com>
Closes#4436 from marmbrus/dfToString and squashes the following commits:
8a3c35f [Michael Armbrust] Merge remote-tracking branch 'origin/master' into dfToString
b72a81b [Michael Armbrust] add toString