spark-instrumented-optimizer/python/pyspark
noelsmith 7583681e6b [SPARK-10188] [PYSPARK] Pyspark CrossValidator with RMSE selects incorrect model
* Added isLargerBetter() method to Pyspark Evaluator to match the Scala version.
* JavaEvaluator delegates isLargerBetter() to underlying Scala object.
* Added check for isLargerBetter() in CrossValidator to determine whether to use argmin or argmax.
* Added test cases for where smaller is better (RMSE) and larger is better (R-Squared).

(This contribution is my original work and that I license the work to the project under Sparks' open source license)

Author: noelsmith <mail@noelsmith.com>

Closes #8399 from noel-smith/pyspark-rmse-xval-fix.
2015-08-27 23:59:30 -07:00
..
ml [SPARK-10188] [PYSPARK] Pyspark CrossValidator with RMSE selects incorrect model 2015-08-27 23:59:30 -07:00
mllib [SPARK-9805] [MLLIB] [PYTHON] [STREAMING] Added _eventually for ml streaming pyspark tests 2015-08-15 18:48:20 -07:00
sql [SPARK-9964] [PYSPARK] [SQL] PySpark DataFrameReader accept RDD of String for JSON 2015-08-26 22:19:11 -07:00
streaming [SPARK-10168] [STREAMING] Fix the issue that maven publishes wrong artifact jars 2015-08-24 12:38:01 -07:00
__init__.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
accumulators.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
broadcast.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
cloudpickle.py [SPARK-9116] [SQL] [PYSPARK] support Python only UDT in __main__ 2015-07-29 22:30:49 -07:00
conf.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
context.py [MINOR] [SQL] Fix sphinx warnings in PySpark SQL 2015-08-20 10:05:31 -07:00
daemon.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
files.py [SPARK-3309] [PySpark] Put all public API in __all__ 2014-09-03 11:49:45 -07:00
heapq3.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
java_gateway.py [SPARK-9700] Pick default page size more intelligently. 2015-08-06 23:18:29 -07:00
join.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
profiler.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
rdd.py [SPARK-9828] [PYSPARK] Mutable values should not be default arguments 2015-08-14 12:46:05 -07:00
rddsampler.py [SPARK-4897] [PySpark] Python 3 support 2015-04-16 16:20:57 -07:00
resultiterable.py [SPARK-3074] [PySpark] support groupByKey() with single huge key 2015-04-09 17:07:23 -07:00
serializers.py [SPARK-8652] [PYSPARK] Check return value for all uses of doctest.testmod() 2015-06-26 08:12:22 -07:00
shell.py [SPARK-9270] [PYSPARK] allow --name option in pyspark 2015-07-24 11:56:55 -07:00
shuffle.py [SPARK-9116] [SQL] [PYSPARK] support Python only UDT in __main__ 2015-07-29 22:30:49 -07:00
statcounter.py [SPARK-9828] [PYSPARK] Mutable values should not be default arguments 2015-08-14 12:46:05 -07:00
status.py [SPARK-4172] [PySpark] Progress API in Python 2015-02-17 13:36:43 -08:00
storagelevel.py [SPARK-3417] Use new-style classes in PySpark 2014-09-08 15:45:36 -07:00
tests.py [SPARK-9244] Increase some memory defaults 2015-07-22 15:28:09 -07:00
traceback_utils.py [SPARK-1087] Move python traceback utilities into new traceback_utils.py file. 2014-09-15 19:28:17 -07:00
worker.py [SPARK-8976] [PYSPARK] fix open mode in python3 2015-08-13 17:33:37 -07:00