spark-instrumented-optimizer/python/pyspark/sql
Liang-Chi Hsieh 5c78be7a51 [SPARK-5799][SQL] Compute aggregation function on specified numeric columns
Compute aggregation function on specified numeric columns. For example:

    val df = Seq(("a", 1, 0, "b"), ("b", 2, 4, "c"), ("a", 2, 3, "d")).toDataFrame("key", "value1", "value2", "rest")
    df.groupBy("key").min("value2")
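
For readers without a Spark session at hand, the intended semantics of the example above can be sketched in plain Python. This is only an illustration of "aggregate the named columns per group", not Spark's implementation; the helper `group_min` and the constants `COLUMNS` and `ROWS` are hypothetical names introduced here.

```python
from collections import defaultdict

# Rows mirror the Scala example above: (key, value1, value2, rest).
COLUMNS = ("key", "value1", "value2", "rest")
ROWS = [("a", 1, 0, "b"), ("b", 2, 4, "c"), ("a", 2, 3, "d")]


def group_min(rows, columns, key, *targets):
    """Hypothetical helper: per-group minimum of the named columns only."""
    key_idx = columns.index(key)
    target_idx = [columns.index(name) for name in targets]

    # Bucket the rows by the value of the grouping column.
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_idx]].append(row)

    # Take the minimum of each requested column within each bucket;
    # columns that were not requested (e.g. "rest") are ignored.
    return {
        group_key: tuple(min(r[i] for r in members) for i in target_idx)
        for group_key, members in groups.items()
    }


# Analogue of df.groupBy("key").min("value2")
result = group_min(ROWS, COLUMNS, "key", "value2")
print(result)
```

Passing several names, for instance `group_min(ROWS, COLUMNS, "key", "value1", "value2")`, aggregates each of them independently, which corresponds to the varargs form mentioned in the squashed commits.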

Author: Liang-Chi Hsieh <viirya@gmail.com>

Closes #4592 from viirya/specific_cols_agg and squashes the following commits:

9446896 [Liang-Chi Hsieh] For comments.
314c4cd [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into specific_cols_agg
353fad7 [Liang-Chi Hsieh] For python unit tests.
54ed0c4 [Liang-Chi Hsieh] Address comments.
b079e6b [Liang-Chi Hsieh] Remove duplicate codes.
55100fb [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into specific_cols_agg
880c2ac [Liang-Chi Hsieh] Fix Python style checks.
4c63a01 [Liang-Chi Hsieh] Fix pyspark.
b1a24fc [Liang-Chi Hsieh] Address comments.
2592f29 [Liang-Chi Hsieh] Merge remote-tracking branch 'upstream/master' into specific_cols_agg
27069c3 [Liang-Chi Hsieh] Combine functions and add varargs annotation.
371a3f7 [Liang-Chi Hsieh] Compute aggregation function on specified numeric columns.
2015-02-16 10:06:11 -08:00
__init__.py [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames 2015-02-13 23:03:22 -08:00
context.py [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames 2015-02-13 23:03:22 -08:00
dataframe.py [SPARK-5799][SQL] Compute aggregation function on specified numeric columns 2015-02-16 10:06:11 -08:00
functions.py [SPARK-5799][SQL] Compute aggregation function on specified numeric columns 2015-02-16 10:06:11 -08:00
tests.py [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames 2015-02-13 23:03:22 -08:00
types.py [SPARK-5677] [SPARK-5734] [SQL] [PySpark] Python DataFrame API remaining tasks 2015-02-11 12:13:16 -08:00