[SPARK-35392][ML][PYTHON] Fix flaky tests in ml/clustering.py and ml/feature.py

### What changes were proposed in this pull request?

This PR removes the `summary.logLikelihood` check from the GMM doctest in ml/clustering.py because it is quite flaky. The check fails easily, e.g., if:
- the number of partitions changes;
- the way the sum of weights is computed changes;
- the underlying BLAS implementation changes.
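
The first two triggers in the list above are consistent with floating-point addition not being associative: regrouping the same terms can change the last digits of a sum. The sketch below is only an illustration of that effect in plain Python, not the GMM code itself; `sequential_sum` and `partitioned_sum` are hypothetical helpers, and the "partitions" are ordinary lists standing in for Spark partitions.

```python
def sequential_sum(values):
    """Add values strictly left to right (hypothetical helper)."""
    total = 0.0
    for v in values:
        total += v
    return total


def partitioned_sum(partitions):
    """Sum each partition first, then combine the partial sums.

    This mimics, under simplifying assumptions, how a distributed
    aggregation groups the additions by partition.
    """
    return sequential_sum([sequential_sum(p) for p in partitions])


# Same three numbers, grouped into one partition or two.
one_partition = partitioned_sum([[0.1, 0.2, 0.3]])
two_partitions = partitioned_sum([[0.1], [0.2, 0.3]])

print(one_partition)   # 0.6000000000000001
print(two_partitions)  # 0.6
```

The two results differ only in the last digit, but a doctest that pins many significant digits of a derived quantity, such as a log-likelihood, can fail on a difference of that size.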

It also uses a more permissive precision in the `Word2Vec` test case.
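
As a rough illustration of why fewer pinned digits make the doctest more tolerant, the sketch below compares a long and a short expected prefix against the same output using the standard library's `doctest.OutputChecker` with the `ELLIPSIS` option. The `got` string is a made-up value chosen for the example, not real `Word2Vec` output, and the use of `ELLIPSIS` here is an assumption about how the doctest is run.

```python
import doctest

checker = doctest.OutputChecker()

# Expected outputs: the long prefix pins 14 significant digits,
# the short prefix pins only 3.
want_long = "[0.09511678665876...\n"
want_short = "[0.0951...\n"

# Hypothetical actual output whose trailing digits drifted slightly.
got = "[0.09511678665912345, 0.2]\n"

# With ELLIPSIS, "..." in the expected text matches any substring.
long_ok = checker.check_output(want_long, got, doctest.ELLIPSIS)
short_ok = checker.check_output(want_short, got, doctest.ELLIPSIS)

print(long_ok)   # False: the digits after 0.095116786659 differ
print(short_ok)  # True: only the leading digits are compared
```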

### Why are the changes needed?

To recover the build and tests.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Existing test cases.

Closes #32533 from zhengruifeng/SPARK_35392_disable_flaky_gmm_test.

Lead-authored-by: Ruifeng Zheng <ruifengz@foxmail.com>
Co-authored-by: Hyukjin Kwon <gurwls223@gmail.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Ruifeng Zheng 2021-05-13 22:23:51 +09:00 committed by Hyukjin Kwon
parent b6d57b6b99
commit f7704ece40
2 changed files with 3 additions and 5 deletions


@@ -273,8 +273,6 @@ class GaussianMixture(JavaEstimator, _GaussianMixtureParams, JavaMLWritable, Jav
 3
 >>> summary.clusterSizes
 [2, 2, 2]
->>> summary.logLikelihood
-65.02945...
 >>> weights = model.weights
 >>> len(weights)
 3


@@ -4682,9 +4682,9 @@ class Word2Vec(JavaEstimator, _Word2VecParams, JavaMLReadable, JavaMLWritable):
 +----+--------------------+
 |word| vector|
 +----+--------------------+
-| a|[0.09511678665876...|
-| b|[-1.2028766870498...|
-| c|[0.30153277516365...|
+| a|[0.0951...
+| b|[-1.202...
+| c|[0.3015...
 +----+--------------------+
 ...
 >>> model.findSynonymsArray("a", 2)