spark-instrumented-optimizer/python/pyspark
Reynold Xin 0a4844f90a [SPARK-7462] By default retain group by columns in aggregate
Updated Java, Scala, Python, and R.

Author: Reynold Xin <rxin@databricks.com>
Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>

Closes #5996 from rxin/groupby-retain and squashes the following commits:

aac7119 [Reynold Xin] Merge branch 'groupby-retain' of github.com:rxin/spark into groupby-retain
f6858f6 [Reynold Xin] Merge branch 'master' into groupby-retain
5f923c0 [Reynold Xin] Merge pull request #15 from shivaram/sparkr-groupby-retrain
c1de670 [Shivaram Venkataraman] Revert workaround in SparkR to retain grouped cols. Based on reverting code added in commit 9a6be746ef
b8b87e1 [Reynold Xin] Fixed DataFrameJoinSuite.
d910141 [Reynold Xin] Updated rest of the files
1e6e666 [Reynold Xin] [SPARK-7462] By default retain group by columns in aggregate
2015-05-11 11:35:16 -07:00
ml [SPARK-7427] [PYSPARK] Make sharedParams match in Scala, Python 2015-05-10 19:18:32 -07:00
mllib [SPARK-6092] [MLLIB] Add RankingMetrics in PySpark/MLlib 2015-05-11 09:14:20 -07:00
sql [SPARK-7462] By default retain group by columns in aggregate 2015-05-11 11:35:16 -07:00
streaming [SPARK-2808][Streaming][Kafka] update kafka to 0.8.2 2015-05-01 17:54:56 -07:00
__init__.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
accumulators.py [SPARK-6661] Python type errors should print type, not object 2015-04-20 10:44:09 -07:00
broadcast.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
cloudpickle.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
conf.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
context.py [SPARK-3444] Provide an easy way to change log level 2015-05-01 18:02:51 -07:00
daemon.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
files.py [SPARK-3309] [PySpark] Put all public API in __all__ 2014-09-03 11:49:45 -07:00
heapq3.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
java_gateway.py [SPARK-6949] [SQL] [PySpark] Support Date/Timestamp in Column expression 2015-04-21 00:08:18 -07:00
join.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
profiler.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
rdd.py [SPARK-7438] [SPARK CORE] Fixed validation of relativeSD in countApproxDistinct 2015-05-09 10:03:15 +01:00
rddsampler.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
resultiterable.py [SPARK-3074] [PySpark] support groupByKey() with single huge key 2015-04-09 17:07:23 -07:00
serializers.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
shell.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
shuffle.py [SPARK-6953] [PySpark] speed up python tests 2015-04-21 17:49:55 -07:00
statcounter.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
status.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
storagelevel.py [SPARK-3417] Use new-style classes in PySpark 2014-09-08 15:45:36 -07:00
tests.py [SPARK-7438] [SPARK CORE] Fixed validation of relativeSD in countApproxDistinct 2015-05-09 10:03:15 +01:00
traceback_utils.py [SPARK-1087] Move python traceback utilities into new traceback_utils.py file. 2014-09-15 19:28:17 -07:00
worker.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00