[SPARK-26006][MLLIB] unpersist 'dataInternalRepr' in the PrefixSpan

## What changes were proposed in this pull request?
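In MLlib's `PrefixSpan`, the `run` method caches the internal RDD `dataInternalRepr` but never releases it, so the RDD remains in the cache after `run` completes.
This patch unpersists `dataInternalRepr` at the end of `run`. If the input RDD is cached, the resulting `freqSequences` RDD is first persisted at the same storage level and materialized.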

## How was this patch tested?
Existing tests
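
For reviewers who want to check the behavior by hand, here is a minimal sketch of one way to see which RDDs are still marked as persisted after `run`. It is not part of this patch or of the existing test suite. The object name `PrefixSpanCacheCheck`, the helper `cachedRddCount`, the sample sequences, and the local master setting are all assumptions made for illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.mllib.fpm.PrefixSpan
import org.apache.spark.sql.SparkSession

// Hypothetical helper object, not part of Spark or of this patch.
object PrefixSpanCacheCheck {

  // Runs PrefixSpan on a small, uncached input and returns how many RDDs
  // the SparkContext still lists as persisted afterwards.
  def cachedRddCount(sc: SparkContext): Int = {
    // Illustrative input; deliberately not cached.
    val sequences = sc.parallelize(Seq(
      Array(Array(1, 2), Array(3)),
      Array(Array(1), Array(3, 2), Array(1, 2)),
      Array(Array(1, 2), Array(5)),
      Array(Array(6))
    ), 2)

    val model = new PrefixSpan()
      .setMinSupport(0.5)
      .setMaxPatternLength(5)
      .run(sequences)

    // Force evaluation of the returned RDD before inspecting the cache.
    model.freqSequences.count()

    // getPersistentRDDs returns the RDDs currently marked as persisted.
    sc.getPersistentRDDs.size
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]") // assumption: local run for a quick check
      .appName("PrefixSpanCacheCheck")
      .getOrCreate()
    try {
      val n = cachedRddCount(spark.sparkContext)
      println(s"RDDs still marked as persisted after run: $n")
    } finally {
      spark.stop()
    }
  }
}
```

Comparing the printed count before and after applying the patch should show whether `dataInternalRepr` is still held in the cache.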

Closes #23016 from shahidki31/SPARK-26006.

Authored-by: Shahid <shahidki31@gmail.com>
Signed-off-by: Sean Owen <sean.owen@databricks.com>
Shahid 2018-11-17 09:43:33 -06:00 committed by Sean Owen
parent ed46ac9f47
commit e557c53c59


@@ -174,6 +174,13 @@ class PrefixSpan private (
    val freqSequences = results.map { case (seq: Array[Int], count: Long) =>
      new FreqSequence(toPublicRepr(seq), count)
    }
    // Cache the final RDD to the same storage level as input
    if (data.getStorageLevel != StorageLevel.NONE) {
      freqSequences.persist(data.getStorageLevel)
      freqSequences.count()
    }
    dataInternalRepr.unpersist(false)
    new PrefixSpanModel(freqSequences)
  }
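
The hunk above follows a general pattern: cache an intermediate RDD while it is reused, materialize the result if the caller expects it cached, then release the intermediate. The sketch below shows that pattern in isolation, assuming a local Spark session. `UnpersistIntermediateSketch` and `distanceFromMaxEven` are hypothetical names invented for this illustration and are unrelated to the actual `PrefixSpan` internals.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical example object, not part of Spark.
object UnpersistIntermediateSketch {

  // For every even input value, returns its distance from the largest even value.
  // Assumes the input contains at least one even number (max() fails on an empty RDD).
  def distanceFromMaxEven(data: RDD[Int]): RDD[Int] = {
    // The intermediate RDD is used twice below, so it is cached.
    val intermediate = data.filter(_ % 2 == 0).persist(StorageLevel.MEMORY_AND_DISK)

    val maxValue = intermediate.max()              // first use (an action)
    val result = intermediate.map(maxValue - _)    // second use (lazy)

    // Cache the result at the same storage level as the input and
    // materialize it while the intermediate is still cached.
    if (data.getStorageLevel != StorageLevel.NONE) {
      result.persist(data.getStorageLevel)
      result.count()
    }

    // Release the intermediate without waiting for the blocks to be removed.
    intermediate.unpersist(blocking = false)
    result
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]") // assumption: local run for illustration
      .appName("UnpersistIntermediateSketch")
      .getOrCreate()
    try {
      val sc = spark.sparkContext
      val out = distanceFromMaxEven(sc.parallelize(1 to 10, 2))
      println(out.collect().mkString(", "))
      println(s"Persisted RDDs left: ${sc.getPersistentRDDs.size}")
    } finally {
      spark.stop()
    }
  }
}
```

If the input is not cached, the result stays lazy and is recomputed from its lineage when an action runs, so dropping the intermediate does not affect correctness, only where the recomputation cost falls.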