spark-instrumented-optimizer

Joseph K. Bradley 36da5e3234 [SPARK-14605][ML][PYTHON] Changed Python to use unicode UIDs for spark.ml Identifiable	2016-04-16 11:23:28 -07:00

## What changes were proposed in this pull request?

Python spark.ml Identifiable classes use UIDs of type str, but they should use unicode (in Python 2.x) to match Java. This could be a problem if someone created a class in Java with odd unicode characters, saved it, and loaded it in Python.

This PR: Use unicode everywhere in Python.

## How was this patch tested?

Updated persistence unit test to check uid type

Author: Joseph K. Bradley <joseph@databricks.com>

Closes #12368 from jkbradley/python-uid-unicode.

ml	[SPARK-14605][ML][PYTHON] Changed Python to use unicode UIDs for spark.ml Identifiable	2016-04-16 11:23:28 -07:00
mllib	[SPARK-14238][ML][MLLIB][PYSPARK] Add binary toggle Param to PySpark HashingTF in ML & MLlib	2016-04-14 21:53:32 +02:00
sql	[SPARK-14573][PYSPARK][BUILD] Fix PyDoc Makefile & highlighting issues	2016-04-14 09:42:15 +01:00
streaming	[SPARK-13579][BUILD] Stop building the main Spark assembly.	2016-04-04 16:52:22 -07:00
__init__.py	[SPARK-10380][SQL] Fix confusing documentation examples for astype/drop_duplicates.	2016-03-14 19:25:49 -07:00
accumulators.py	[SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod()	2015-06-26 08:12:22 -07:00
broadcast.py	[SPARK-14418][PYSPARK] fix unpersist of Broadcast in Python	2016-04-06 10:46:34 -07:00
cloudpickle.py	[SPARK-13697] [PYSPARK] Fix the missing module name of TransformFunctionSerializer.loads	2016-03-06 08:57:01 -08:00
conf.py	[SPARK-4897] [PySpark] Python 3 support	2015-04-16 16:20:57 -07:00
context.py	[SPARK-13687][PYTHON] Cleanup PySpark parallelize temporary files	2016-04-10 02:34:54 +01:00
daemon.py	[SPARK-4897] [PySpark] Python 3 support	2015-04-16 16:20:57 -07:00
files.py	[SPARK-3309] [PySpark] Put all public API in __all__	2014-09-03 11:49:45 -07:00
heapq3.py	[SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod()	2015-06-26 08:12:22 -07:00
java_gateway.py	[SPARK-9700] Pick default page size more intelligently.	2015-08-06 23:18:29 -07:00
join.py	[SPARK-14202] [PYTHON] Use generator expression instead of list comp in python_full_outer_jo…	2016-03-28 14:51:36 -07:00
profiler.py	[SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod()	2015-06-26 08:12:22 -07:00
rdd.py	[SPARK-14368][PYSPARK] Support python.spark.worker.memory with upper-case unit.	2016-04-05 12:19:20 +09:00
rddsampler.py	[SPARK-4897] [PySpark] Python 3 support	2015-04-16 16:20:57 -07:00
resultiterable.py	[SPARK-3074] [PySpark] support groupByKey() with single huge key	2015-04-09 17:07:23 -07:00
serializers.py	[SPARK-10542] [PYSPARK] fix serialize namedtuple	2015-09-14 19:46:34 -07:00
shell.py	[SPARK-12993][PYSPARK] Remove usage of ADD_FILES in pyspark	2016-01-26 14:58:39 -08:00
shuffle.py	[SPARK-10710] Remove ability to disable spilling in core and SQL	2015-09-19 21:40:21 -07:00
statcounter.py	[SPARK-6919] [PYSPARK] Add asDict method to StatCounter	2015-09-29 13:38:15 -07:00
status.py	[SPARK-4172] [PySpark] Progress API in Python	2015-02-17 13:36:43 -08:00
storagelevel.py	[SPARK-13992][CORE][PYSPARK][FOLLOWUP] Update OFF_HEAP semantics for Java api and Python api	2016-04-12 23:06:55 -07:00
tests.py	[SPARK-13687][PYTHON] Cleanup PySpark parallelize temporary files	2016-04-10 02:34:54 +01:00
traceback_utils.py	[SPARK-1087] Move python traceback utilities into new traceback_utils.py file.	2014-09-15 19:28:17 -07:00
worker.py	[SPARK-14267] [SQL] [PYSPARK] execute multiple Python UDFs within single batch	2016-03-31 16:40:20 -07:00