[SPARK-35981][PYTHON][TEST] Use check_exact=False to loosen the check precision

### What changes were proposed in this pull request?

Pass `check_exact=False` to the `assert_eq` call in `StatsTest.test_cov_corr_meta`, because the exact value check there is too strict for floating-point results.

### Why are the changes needed?

In some environments, the precision of pandas' `DataFrame.corr` differs slightly, so the test `StatsTest.test_cov_corr_meta` fails:

```
AssertionError: DataFrame.iloc[:, 0] (column name="a") are different
DataFrame.iloc[:, 0] (column name="a") values are different (14.28571 %)
[index]: [a, b, c, d, e, f, g]
[left]:  [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0]
[right]: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 4.807406715958909e-17]
```
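The failure above comes down to one value that is zero on one side and a tiny rounding residue on the other. The following is a minimal, standard-library-only sketch of why an exact comparison fails while a tolerance-based one passes. The helper `columns_match` and its tolerance values are hypothetical and chosen for illustration; they are not the implementation behind `assert_eq` or pandas.

```python
import math

# Values copied from the failing column "a" in the error output above.
left = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0]
right = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 4.807406715958909e-17]


def columns_match(xs, ys, check_exact, rel_tol=1e-5, abs_tol=1e-8):
    """Hypothetical helper: compare two float columns exactly or within a tolerance."""
    if len(xs) != len(ys):
        return False
    if check_exact:
        # Exact mode: any difference in the floats counts as a mismatch.
        return all(x == y for x, y in zip(xs, ys))
    # Tolerant mode: abs_tol matters here because one side is exactly 0.0,
    # where a purely relative tolerance would never accept a nonzero value.
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(xs, ys)
    )


exact_ok = columns_match(left, right, check_exact=True)
approx_ok = columns_match(left, right, check_exact=False)

# Share of positions that differ under exact comparison (1 of 7 values).
mismatch_pct = 100 * sum(x != y for x, y in zip(left, right)) / len(left)
```

Under these assumptions only the last of the seven values differs, which corresponds to the 14.28571 % reported in the assertion message.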

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Modified tests should still pass.

Closes #33179 from ueshin/issuse/SPARK-35981/corr.

Authored-by: Takuya UESHIN <ueshin@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
Takuya UESHIN 2021-07-02 17:58:10 +09:00 committed by Hyukjin Kwon
parent 0c9c8ff569
commit 77696448db


@@ -283,7 +283,7 @@ class StatsTest(PandasOnSparkTestCase, SQLTestUtils):
             index=pd.Index([1, 2, 3], name="myindex"),
         )
         psdf = ps.from_pandas(pdf)
-        self.assert_eq(psdf.corr(), pdf.corr())
+        self.assert_eq(psdf.corr(), pdf.corr(), check_exact=False)
 
     def test_stats_on_boolean_dataframe(self):
         pdf = pd.DataFrame({"A": [True, False, True], "B": [False, False, True]})