spark-instrumented-optimizer/python/pyspark/sql
Reynold Xin 8a9d2348e9 [SPARK-7324] [SQL] DataFrame.dropDuplicates
This should also close https://github.com/apache/spark/pull/5870

Author: Reynold Xin <rxin@databricks.com>

Closes #6066 from rxin/dropDups and squashes the following commits:

130692f [Reynold Xin] [SPARK-7324][SQL] DataFrame.dropDuplicates

(cherry picked from commit b6bf4f76c7)
Signed-off-by: Michael Armbrust <michael@databricks.com>
2015-05-11 19:15:32 -07:00
__init__.py   2015-05-01 13:29:17 -07:00  [SPARK-7240][SQL] Single pass covariance calculation for dataframes
_types.py     2015-05-11 18:07:19 -07:00  [SPARK-7462][SQL] Update documentation for retaining grouping columns in DataFrames.
context.py    2015-04-21 00:08:18 -07:00  [SPARK-6949] [SQL] [PySpark] Support Date/Timestamp in Column expression
dataframe.py  2015-05-11 19:15:32 -07:00  [SPARK-7324] [SQL] DataFrame.dropDuplicates
functions.py  2015-05-07 10:58:47 -07:00  [SPARK-7118] [Python] Add the coalesce Spark SQL function available in PySpark
tests.py      2015-05-08 11:49:49 -07:00  [SPARK-7133] [SQL] Implement struct, array, and map field accessor