[SPARK-30124][MLLIB] unnecessary persist in PythonMLLibAPI.scala
### What changes were proposed in this pull request?
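Removed the unnecessary `persist` (and the matching `unpersist`) from `PythonMLLibAPI.scala`, and added a `breezeData.unpersist()` call in `GaussianMixture.scala` before the model is returned.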
### Why are the changes needed?
The persist in `PythonMLLibAPI.scala` is unnecessary because `run()` of `gmmAlg` caches the data itself later on.
710ddab39e/mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala (L167-L171)
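The reasoning can be illustrated with a minimal sketch. This is not the actual `GaussianMixture` code: `RedundantPersistSketch`, its `run()`, and `demo()` are hypothetical stand-ins, and the example assumes Spark is on the classpath and runs in local mode. It only shows the pattern, where the algorithm caches the RDD it iterates over, so a second persist at the call site adds nothing.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.storage.StorageLevel

object RedundantPersistSketch {

  // Hypothetical stand-in for gmmAlg.run(): it derives a working RDD,
  // caches that, iterates over it, and releases it when done.
  def run(data: RDD[Array[Double]]): Double = {
    val working = data.map(_.sum).cache()
    try {
      var total = 0.0
      for (_ <- 1 to 3) total += working.sum() // each pass reuses the cached RDD
      total
    } finally {
      working.unpersist()
    }
  }

  // Returns the result plus the storage level of the input RDD as seen by the caller.
  def demo(): (Double, StorageLevel) = {
    val conf = new SparkConf().setMaster("local[2]").setAppName("redundant-persist-sketch")
    val sc = new SparkContext(conf)
    try {
      val data = sc.parallelize(Seq(Array(1.0, 2.0), Array(3.0, 4.0)))
      // No data.persist(...) here: run() caches what it iterates over.
      (run(data), data.getStorageLevel)
    } finally {
      sc.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (result, level) = demo()
    println(s"result=$result, input cached by caller: ${level != StorageLevel.NONE}")
  }
}
```

In this sketch the caller never persists `data`, yet every iteration inside `run()` still reads from the cached working copy, which is the situation the change relies on.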
### Does this PR introduce any user-facing change?
No
### How was this patch tested?
Manually
Closes #26758 from amanomer/improperPersist.
Authored-by: Aman Omer <amanomer1996@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
parent 35bab33984
commit 5892bbf447
PythonMLLibAPI.scala
@@ -407,11 +407,7 @@ private[python] class PythonMLLibAPI extends Serializable {
 
     if (seed != null) gmmAlg.setSeed(seed)
 
-    try {
-      new GaussianMixtureModelWrapper(gmmAlg.run(data.rdd.persist(StorageLevel.MEMORY_AND_DISK)))
-    } finally {
-      data.rdd.unpersist()
-    }
+    new GaussianMixtureModelWrapper(gmmAlg.run(data.rdd))
   }
 
   /**

GaussianMixture.scala
@@ -234,6 +234,7 @@ class GaussianMixture private (
       iter += 1
       compute.destroy()
     }
+    breezeData.unpersist()
 
     new GaussianMixtureModel(weights, gaussians)
   }