Commit graph

10 commits

Author SHA1 Message Date
Sean Owen c2d0d700b5 [SPARK-26640][CORE][ML][SQL][STREAMING][PYSPARK] Code cleanup from lgtm.com analysis
## What changes were proposed in this pull request?

Misc code cleanup from lgtm.com analysis. See comments below for details.

## How was this patch tested?

Existing tests.

Closes #23571 from srowen/SPARK-26640.

Lead-authored-by: Sean Owen <sean.owen@databricks.com>
Co-authored-by: Hyukjin Kwon <gurwls223@apache.org>
Co-authored-by: Sean Owen <srowen@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2019-01-17 19:40:39 -06:00
Katrin Leinweber c5daccb1da [MINOR] Update all DOI links to preferred resolver
## What changes were proposed in this pull request?

The DOI foundation recommends [this new resolver](https://www.doi.org/doi_handbook/3_Resolution.html#3.8). Accordingly, this PR re`sed`s all static DOI links ;-)

## How was this patch tested?

It wasn't, since it seems as safe as a "[typo fix](https://spark.apache.org/contributing.html)".

In case any of the files is included from other projects, and should be updated there, please let me know.

Closes #23129 from katrinleinweber/resolve-DOIs-securely.

Authored-by: Katrin Leinweber <9948149+katrinleinweber@users.noreply.github.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2018-11-25 17:43:55 -06:00
Sean Owen 08c76b5d39 [SPARK-25238][PYTHON] lint-python: Fix W605 warnings for pycodestyle 2.4
(This change is a subset of the changes needed for the JIRA; see https://github.com/apache/spark/pull/22231)

## What changes were proposed in this pull request?

Use raw strings and simpler regex syntax consistently in Python, which also avoids warnings from pycodestyle about accidentally relying Python's non-escaping of non-reserved chars in normal strings. Also, fix a few long lines.

## How was this patch tested?

Existing tests, and some manual double-checking of the behavior of regexes in Python 2/3 to be sure.

Closes #22400 from srowen/SPARK-25238.2.

Authored-by: Sean Owen <sean.owen@databricks.com>
Signed-off-by: hyukjinkwon <gurwls223@apache.org>
2018-09-13 11:19:43 +08:00
Marco Gaido a3dccd24c2 [SPARK-10697][ML] Add lift to Association rules
## What changes were proposed in this pull request?

The PR adds the lift measure to Association rules.

## How was this patch tested?

existing and modified UTs

Closes #22236 from mgaido91/SPARK-10697.

Authored-by: Marco Gaido <marcogaido91@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
2018-09-01 18:07:58 -05:00
Yuhao Yang ebf33a333e [SAPRK-25011][ML] add prefix to __all__ in fpm.py
## What changes were proposed in this pull request?

jira: https://issues.apache.org/jira/browse/SPARK-25011

add prefix to __all__ in fpm.py

## How was this patch tested?

existing unit test.

Author: Yuhao Yang <yuhao.yang@intel.com>

Closes #21981 from hhbyyh/prefixall.
2018-08-03 15:02:41 +08:00
WeichenXu 90ae98d1ac [SPARK-24146][PYSPARK][ML] spark.ml parity for sequential pattern mining - PrefixSpan: Python API
## What changes were proposed in this pull request?

spark.ml parity for sequential pattern mining - PrefixSpan: Python API

## How was this patch tested?

doctests

Author: WeichenXu <weichen.xu@databricks.com>

Closes #21265 from WeichenXu123/prefix_span_py.
2018-05-31 06:53:10 -07:00
Bryan Cutler 39ee2acf96 [SPARK-23163][DOC][PYTHON] Sync ML Python API with Scala
## What changes were proposed in this pull request?

This syncs the ML Python API with Scala for differences found after the 2.3 QA audit.

## How was this patch tested?

NA

Author: Bryan Cutler <cutlerb@gmail.com>

Closes #20354 from BryanCutler/pyspark-ml-doc-sync-23163.
2018-01-25 01:48:11 -08:00
Yan Facai (颜发才) 139da116f1 [SPARK-20768][PYSPARK][ML] Expose numPartitions (expert) param of PySpark FPGrowth.
## What changes were proposed in this pull request?

Expose numPartitions (expert) param of PySpark FPGrowth.

## How was this patch tested?

+ [x] Pass all unit tests.

Author: Yan Facai (颜发才) <facai.yan@gmail.com>

Closes #18058 from facaiy/ENH/pyspark_fpg_add_num_partition.
2017-05-25 21:40:39 +08:00
Yanbo Liang 913a6bfe4b [SPARK-19281][FOLLOWUP][ML] Minor fix for PySpark FPGrowth.
## What changes were proposed in this pull request?
Follow-up for #17218, some minor fix for PySpark ```FPGrowth```.

## How was this patch tested?
Existing UT.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #18089 from yanboliang/spark-19281.
2017-05-25 20:15:15 +08:00
zero323 0bc8847aa2 [SPARK-19281][PYTHON][ML] spark.ml Python API for FPGrowth
## What changes were proposed in this pull request?

- Add `HasSupport` and `HasConfidence` `Params`.
- Add new module `pyspark.ml.fpm`.
- Add `FPGrowth` / `FPGrowthModel` wrappers.
- Provide tests for new features.

## How was this patch tested?

Unit tests.

Author: zero323 <zero323@users.noreply.github.com>

Closes #17218 from zero323/SPARK-19281.
2017-03-26 16:49:27 -07:00