spark-instrumented-optimizer

Bryan Cutler ed075e1ff6 [SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10.0	2018-08-14 17:13:38 -07:00

## What changes were proposed in this pull request?

Upgrade Apache Arrow to 0.10.0

Version 0.10.0 has a number of bug fixes and improvements with the following pertaining directly to usage in Spark:
 * Allow for adding BinaryType support ARROW-2141
 * Bug fix related to array serialization ARROW-1973
 * Python2 str will be made into an Arrow string instead of bytes ARROW-2101
 * Python bytearrays are supported in as input to pyarrow ARROW-2141
 * Java has common interface for reset to cleanup complex vectors in Spark ArrowWriter ARROW-1962
 * Cleanup pyarrow type equality checks ARROW-2423
 * ArrowStreamWriter should not hold references to ArrowBlocks ARROW-2632, ARROW-2645
 * Improved low level handling of messages for RecordBatch ARROW-2704

## How was this patch tested?

existing tests

Author: Bryan Cutler <cutlerb@gmail.com>

Closes #21939 from BryanCutler/arrow-upgrade-010.
ml	[SPARK-25090][ML] Enforce implicit type coercion in ParamGridBuilder	2018-08-13 09:11:37 +08:00
mllib	Fix typos detected by github.com/client9/misspell	2018-08-11 21:23:36 -05:00
sql	[SPARK-24391][SQL] Support arrays of any types by from_json	2018-08-13 20:13:09 +08:00
streaming	Fix typos detected by github.com/client9/misspell	2018-08-11 21:23:36 -05:00
__init__.py	[SPARK-23328][PYTHON] Disallow default value None in na.replace/replace when 'to_replace' is not a dictionary	2018-02-09 14:21:10 +08:00
_globals.py	[SPARK-23328][PYTHON] Disallow default value None in na.replace/replace when 'to_replace' is not a dictionary	2018-02-09 14:21:10 +08:00
accumulators.py	[BUILD] Fix lint-python.	2018-08-03 03:18:46 +09:00
broadcast.py	[SPARK-23522][PYTHON] always use sys.exit over builtin exit	2018-03-08 20:38:34 +09:00
cloudpickle.py	[SPARK-24303][PYTHON] Update cloudpickle to v0.4.4	2018-05-18 09:53:24 -07:00
conf.py	[SPARK-23522][PYTHON] always use sys.exit over builtin exit	2018-03-08 20:38:34 +09:00
context.py	Fix typos	2018-08-12 08:13:09 -05:00
daemon.py	[PYSPARK] Update py4j to version 0.10.7.	2018-05-09 10:47:35 -07:00
files.py	[SPARK-3309] [PySpark] Put all public API in __all__	2014-09-03 11:49:45 -07:00
find_spark_home.py	Fix typos detected by github.com/client9/misspell	2018-08-11 21:23:36 -05:00
heapq3.py	Fix typos detected by github.com/client9/misspell	2018-08-11 21:23:36 -05:00
java_gateway.py	[SPARK-24565][SS] Add API for in Structured Streaming for exposing output rows of each microbatch as a DataFrame	2018-06-19 13:56:51 -07:00
join.py	[SPARK-14202] [PYTHON] Use generator expression instead of list comp in python_full_outer_jo…	2016-03-28 14:51:36 -07:00
profiler.py	[SPARK-23522][PYTHON] always use sys.exit over builtin exit	2018-03-08 20:38:34 +09:00
rdd.py	Fix typos detected by github.com/client9/misspell	2018-08-11 21:23:36 -05:00
rddsampler.py	[SPARK-4897] [PySpark] Python 3 support	2015-04-16 16:20:57 -07:00
resultiterable.py	[SPARK-3074] [PySpark] support groupByKey() with single huge key	2015-04-09 17:07:23 -07:00
serializers.py	[SPARK-23874][SQL][PYTHON] Upgrade Apache Arrow to 0.10.0	2018-08-14 17:13:38 -07:00
shell.py	[SPARK-16451][REPL] Fail shell if SparkSession fails to start.	2018-06-05 08:29:29 +07:00
shuffle.py	[SPARK-23754][PYTHON] Re-raising StopIteration in client code	2018-05-30 18:11:33 +08:00
statcounter.py	[SPARK-6919] [PYSPARK] Add asDict method to StatCounter	2015-09-29 13:38:15 -07:00
status.py	[SPARK-4172] [PySpark] Progress API in Python	2015-02-17 13:36:43 -08:00
storagelevel.py	[SPARK-13992][CORE][PYSPARK][FOLLOWUP] Update OFF_HEAP semantics for Java api and Python api	2016-04-12 23:06:55 -07:00
taskcontext.py	[SPARK-24397][PYSPARK] Added TaskContext.getLocalProperty(key) in Python	2018-05-31 11:23:57 -07:00
tests.py	[SPARK-24396][SS][PYSPARK] Add Structured Streaming ForeachWriter for python	2018-06-15 12:56:39 -07:00
traceback_utils.py	[SPARK-1087] Move python traceback utilities into new traceback_utils.py file.	2014-09-15 19:28:17 -07:00
util.py	[SPARK-23754][PYTHON][FOLLOWUP] Move UDF stop iteration wrapping from driver to executor	2018-06-11 10:15:42 +08:00
version.py	[SPARK-23028] Bump master branch version to 2.4.0-SNAPSHOT	2018-01-13 00:37:59 +08:00
worker.py	[SPARK-24324][PYTHON] Pandas Grouped Map UDF should assign result columns by name	2018-06-24 09:28:46 +08:00