spark-instrumented-optimizer/python/pyspark
Yin Huai 5ab9fcfb01 [SPARK-8532] [SQL] In Python's DataFrameWriter, save/saveAsTable/json/parquet/jdbc always override mode
https://issues.apache.org/jira/browse/SPARK-8532

This PR makes two changes. First, it fixes a bug where the save actions (i.e. `save/saveAsTable/json/parquet/jdbc`) always overrode the save mode with their own default value, even when one had been set via `mode()`. Second, it adds an input argument `partitionBy` to `save/saveAsTable/parquet`.
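As a rough illustration of the intended behavior after this change (the `sqlContext`, output paths, and the `id` partition column below are placeholders, not part of the patch):

```python
# Illustrative PySpark 1.4-era usage; assumes an existing SQLContext `sqlContext`.
df = sqlContext.range(0, 10)

# With the fix, a mode set on the writer is respected instead of being
# overridden back to the default ("error") by the save action.
df.write.mode("append").parquet("/tmp/example_output")

# New optional partitionBy argument on save/saveAsTable/parquet.
df.write.parquet("/tmp/example_partitioned", mode="overwrite", partitionBy="id")
```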

Author: Yin Huai <yhuai@databricks.com>

Closes #6937 from yhuai/SPARK-8532 and squashes the following commits:

f972d5d [Yin Huai] davies's comment.
d37abd2 [Yin Huai] style.
d21290a [Yin Huai] Python doc.
889eb25 [Yin Huai] Minor refactoring and add partitionBy to save, saveAsTable, and parquet.
7fbc24b [Yin Huai] Use None instead of "error" as the default value of mode since JVM-side already uses "error" as the default value.
d696dff [Yin Huai] Python style.
88eb6c4 [Yin Huai] If mode is "error", do not call mode method.
c40c461 [Yin Huai] Regression test.
2015-06-22 13:51:23 -07:00
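Commits 7fbc24b and 88eb6c4 above describe the core of the Python-side fix; the sketch below paraphrases that guard (assuming, as in PySpark's `DataFrameWriter`, that the writer wraps a JVM writer in `self._jwrite`) rather than reproducing the exact patch:

```python
# Paraphrased sketch of the guard described in commits 7fbc24b/88eb6c4:
# the Python default for mode is None, and the JVM writer's own default
# ("error") is only replaced when the caller explicitly passes a mode.
def mode(self, saveMode):
    if saveMode is not None:
        self._jwrite = self._jwrite.mode(saveMode)
    return self
```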
ml [SPARK-8468] [ML] Take the negative of some metrics in RegressionEvaluator to get correct cross validation 2015-06-20 13:01:59 -07:00
mllib [SPARK-8511] [PYSPARK] Modify a test to remove a saved model in regression.py 2015-06-22 11:53:11 -07:00
sql [SPARK-8532] [SQL] In Python's DataFrameWriter, save/saveAsTable/json/parquet/jdbc always override mode 2015-06-22 13:51:23 -07:00
streaming [SPARK-8444] [STREAMING] Adding Python streaming example for queueStream 2015-06-19 00:07:53 -07:00
__init__.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
accumulators.py [SPARK-7899] [PYSPARK] Fix Python 3 pyspark/sql/types module conflict 2015-05-29 14:13:44 -07:00
broadcast.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
cloudpickle.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
conf.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
context.py [SPARK-8373] [PYSPARK] Add emptyRDD to pyspark and fix the issue when calling sum on an empty RDD 2015-06-17 13:59:39 -07:00
daemon.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
files.py [SPARK-3309] [PySpark] Put all public API in __all__ 2014-09-03 11:49:45 -07:00
heapq3.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
java_gateway.py [SPARK-6949] [SQL] [PySpark] Support Date/Timestamp in Column expression 2015-04-21 00:08:18 -07:00
join.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
profiler.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
rdd.py [SPARK-8373] [PYSPARK] Add emptyRDD to pyspark and fix the issue when calling sum on an empty RDD 2015-06-17 13:59:39 -07:00
rddsampler.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
resultiterable.py [SPARK-3074] [PySpark] support groupByKey() with single huge key 2015-04-09 17:07:23 -07:00
serializers.py [SPARK-8339] [PYSPARK] integer division for python 3 2015-06-19 00:12:20 -07:00
shell.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
shuffle.py [SPARK-8202] [PYSPARK] fix infinite loop during external sort in PySpark 2015-06-18 13:45:58 -07:00
statcounter.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
status.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
storagelevel.py [SPARK-3417] Use new-style classes in PySpark 2014-09-08 15:45:36 -07:00
tests.py [SPARK-8202] [PYSPARK] fix infinite loop during external sort in PySpark 2015-06-18 13:45:58 -07:00
traceback_utils.py [SPARK-1087] Move python traceback utilities into new traceback_utils.py file. 2014-09-15 19:28:17 -07:00
worker.py [SPARK-6216] [PYSPARK] check python version of worker with driver 2015-05-18 12:55:13 -07:00