spark-instrumented-optimizer

zero323 779f0a84ea [SPARK-32933][PYTHON] Use keyword-only syntax for keyword_only methods

### What changes were proposed in this pull request?

This PR adjusts the signatures of methods decorated with `keyword_only` so that they use [Python 3 keyword-only syntax](https://www.python.org/dev/peps/pep-3102/).

__Note__: For the moment the goal is not to replace `keyword_only`. For justification see https://github.com/apache/spark/pull/29591#discussion_r489402579

### Why are the changes needed?

Right now it is not clear that `keyword_only` methods are indeed keyword only. This proposal addresses that.

In practice we could probably capture `locals` and drop `keyword_only` completely, i.e.:

```python
@keyword_only
def __init__(self, *, featuresCol="features"):
    ...
    kwargs = self._input_kwargs
    self.setParams(**kwargs)
```

could be replaced with

```python
def __init__(self, *, featuresCol="features"):
    kwargs = locals()
    del kwargs["self"]
    ...
    self.setParams(**kwargs)
```

### Does this PR introduce _any_ user-facing change?

Docstrings and inspect tools will now indicate that `keyword_only` methods expect only keyword arguments.

For example, the output for `LinearSVC` will change from

```
>>> from pyspark.ml.classification import LinearSVC
>>> ?LinearSVC.__init__
Signature:
LinearSVC.__init__(
    self,
    featuresCol='features',
    labelCol='label',
    predictionCol='prediction',
    maxIter=100,
    regParam=0.0,
    tol=1e-06,
    rawPredictionCol='rawPrediction',
    fitIntercept=True,
    standardization=True,
    threshold=0.0,
    weightCol=None,
    aggregationDepth=2,
)
Docstring: __init__(self, featuresCol="features", labelCol="label", predictionCol="prediction", maxIter=100, regParam=0.0, tol=1e-6, rawPredictionCol="rawPrediction", fitIntercept=True, standardization=True, threshold=0.0, weightCol=None, aggregationDepth=2):
File:      /path/to/python/pyspark/ml/classification.py
Type:      function
```

to

```
>>> from pyspark.ml.classification import LinearSVC
>>> ?LinearSVC.__init__
Signature:
LinearSVC.__init__(
    self,
    *,
    featuresCol='features',
    labelCol='label',
    predictionCol='prediction',
    maxIter=100,
    regParam=0.0,
    tol=1e-06,
    rawPredictionCol='rawPrediction',
    fitIntercept=True,
    standardization=True,
    threshold=0.0,
    weightCol=None,
    aggregationDepth=2,
    blockSize=1,
)
Docstring: __init__(self, \*, featuresCol="features", labelCol="label", predictionCol="prediction", maxIter=100, regParam=0.0, tol=1e-6, rawPredictionCol="rawPrediction", fitIntercept=True, standardization=True, threshold=0.0, weightCol=None, aggregationDepth=2, blockSize=1):
File:      ~/Workspace/spark/python/pyspark/ml/classification.py
Type:      function
```

### How was this patch tested?

Existing tests.

Closes #29799 from zero323/SPARK-32933.

Authored-by: zero323 <mszymkiewicz@gmail.com>
Signed-off-by: HyukjinKwon <gurwls223@apache.org>

2020-09-23 09:28:33 +09:00
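The two variants discussed in the commit message can be compared in a small self-contained sketch. It does not import PySpark: the `keyword_only` decorator below is a simplified stand-in written for illustration, and `DecoratedEstimator`, `LocalsEstimator`, and their `setParams` methods are hypothetical names, not classes from `pyspark.ml`.

```python
import functools
import inspect


def keyword_only(func):
    """Simplified stand-in for a keyword_only-style decorator (an assumption,
    not PySpark's actual code): reject positional arguments and store the
    keyword arguments the caller passed on the instance."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError("Method %s forces keyword arguments." % func.__name__)
        self._input_kwargs = kwargs
        return func(self, **kwargs)
    return wrapper


class DecoratedEstimator:
    """Hypothetical class: decorator plus keyword-only signature."""

    @keyword_only
    def __init__(self, *, featuresCol="features", maxIter=100):
        # Only the arguments the caller actually passed are stored here.
        kwargs = self._input_kwargs
        self.setParams(**kwargs)

    def setParams(self, **kwargs):
        self.params = dict(kwargs)


class LocalsEstimator:
    """Hypothetical class: the locals()-based alternative, no decorator."""

    def __init__(self, *, featuresCol="features", maxIter=100):
        # locals() is captured first, so it holds self and the parameters only.
        kwargs = locals()
        del kwargs["self"]
        self.setParams(**kwargs)

    def setParams(self, **kwargs):
        self.params = dict(kwargs)


if __name__ == "__main__":
    # Both signatures now advertise the keyword-only marker to inspect tools.
    print(inspect.signature(DecoratedEstimator.__init__))
    print(inspect.signature(LocalsEstimator.__init__))
    print(DecoratedEstimator(maxIter=5).params)
    print(LocalsEstimator(maxIter=5).params)
```

In this sketch the two approaches are not quite interchangeable: the decorator records only the arguments the caller supplied, while `locals()` also includes parameters left at their defaults. Both reject positional arguments with a `TypeError`.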
..
cloudpickle	[SPARK-32094][PYTHON] Update cloudpickle to v1.5.0	2020-07-17 11:49:18 +09:00
ml	[SPARK-32933][PYTHON] Use keyword-only syntax for keyword_only methods	2020-09-23 09:28:33 +09:00
mllib	[SPARK-32719][PYTHON] Add Flake8 check missing imports	2020-08-31 11:23:31 +09:00
resource	[SPARK-32319][PYSPARK] Disallow the use of unused imports	2020-08-08 08:51:57 -07:00
sql	[SPARK-32933][PYTHON] Use keyword-only syntax for keyword_only methods	2020-09-23 09:28:33 +09:00
streaming	[SPARK-32319][PYSPARK] Disallow the use of unused imports	2020-08-08 08:51:57 -07:00
testing	[SPARK-32319][PYSPARK] Disallow the use of unused imports	2020-08-08 08:51:57 -07:00
tests	[SPARK-32138][FOLLOW-UP] Drop obsolete StringIO import branching	2020-08-31 16:56:50 +09:00
__init__.py	[SPARK-32719][PYTHON] Add Flake8 check missing imports	2020-08-31 11:23:31 +09:00
_globals.py	[SPARK-23328][PYTHON] Disallow default value None in na.replace/replace when 'to_replace' is not a dictionary	2018-02-09 14:21:10 +08:00
accumulators.py	[SPARK-32138] Drop Python 2.7, 3.4 and 3.5	2020-07-14 11:22:44 +09:00
broadcast.py	[SPARK-32138] Drop Python 2.7, 3.4 and 3.5	2020-07-14 11:22:44 +09:00
conf.py	[SPARK-32138] Drop Python 2.7, 3.4 and 3.5	2020-07-14 11:22:44 +09:00
context.py	[SPARK-32160][CORE][PYSPARK][FOLLOWUP] Change the config name to switch allow/disallow SparkContext in executors	2020-08-04 12:45:06 +09:00
daemon.py	[SPARK-26175][PYTHON] Redirect the standard input of the forked child to devnull in daemon	2019-07-31 09:10:24 +09:00
files.py	[SPARK-28206][PYTHON] Remove the legacy Epydoc in PySpark API documentation	2019-07-05 10:08:22 -07:00
find_spark_home.py	[SPARK-29802][BUILD] Use python3 in build scripts	2020-07-19 11:02:37 +09:00
java_gateway.py	[SPARK-32138] Drop Python 2.7, 3.4 and 3.5	2020-07-14 11:22:44 +09:00
join.py	[SPARK-14202] [PYTHON] Use generator expression instead of list comp in python_full_outer_jo…	2016-03-28 14:51:36 -07:00
profiler.py	[SPARK-26640][CORE][ML][SQL][STREAMING][PYSPARK] Code cleanup from lgtm.com analysis	2019-01-17 19:40:39 -06:00
rdd.py	[SPARK-32319][PYSPARK] Disallow the use of unused imports	2020-08-08 08:51:57 -07:00
rddsampler.py	[SPARK-4897] [PySpark] Python 3 support	2015-04-16 16:20:57 -07:00
resultiterable.py	[SPARK-32138] Drop Python 2.7, 3.4 and 3.5	2020-07-14 11:22:44 +09:00
serializers.py	[SPARK-32138] Drop Python 2.7, 3.4 and 3.5	2020-07-14 11:22:44 +09:00
shell.py	[SPARK-32138] Drop Python 2.7, 3.4 and 3.5	2020-07-14 11:22:44 +09:00
shuffle.py	[SPARK-32435][PYTHON] Remove heapq3 port from Python 3	2020-07-27 20:10:13 +09:00
statcounter.py	[SPARK-6919] [PYSPARK] Add asDict method to StatCounter	2015-09-29 13:38:15 -07:00
status.py	[SPARK-4172] [PySpark] Progress API in Python	2015-02-17 13:36:43 -08:00
storagelevel.py	[SPARK-31448][PYTHON] Fix storage level used in persist() in dataframe.py	2020-09-15 08:41:22 -05:00
taskcontext.py	[SPARK-32138] Drop Python 2.7, 3.4 and 3.5	2020-07-14 11:22:44 +09:00
traceback_utils.py	[SPARK-1087] Move python traceback utilities into new traceback_utils.py file.	2014-09-15 19:28:17 -07:00
util.py	[SPARK-32010][PYTHON][CORE] Add InheritableThread for local properties and fixing a thread leak issue in pinned thread mode	2020-07-30 10:15:25 +09:00
version.py	[SPARK-30950][BUILD] Setting version to 3.1.0-SNAPSHOT	2020-02-25 19:44:31 -08:00
worker.py	[MINOR][PYTHON] Fix spacing in error message	2020-07-28 11:22:18 +09:00